Category Archives: testing

December 28, 2022

What’s a National Merit Scholar?

If you know anything at all about the PSAT, then the current conservative media hysteria charging the elite Virginia high school with “withholding announcement” of National Merit awards makes no sense at all. Before I heard about this brouhaha, I knew 90% of everything about the PSAT. Figuring out how to explain that this hysteria is absurd required me to learn the other 10%, which I’ll now share with you, dear readers.

The PSAT is formally known as the PSAT National Merit Scholarship Qualifying Test, or PSAT/NMSQT. Back in the dark ages, from 1959 to some point in the 1990s, the test had a legitimate function as an initiation to the SAT proper. Juniors took the PSAT and seniors took the SAT in October, which was SAT Day. By the time I began tutoring in 2002, “SAT Day” had already moved back to March of junior year, per College Board recommendation. Before the pandemic, competitive students began taking the SAT much earlier in their junior year. So as a practice test, the PSAT long ago lost its utility.

The real value of the PSAT comes from its association with the National Merit awards. Originally, the Scholarship Qualifying Test was a different entity that identified top students as semifinalists, who then “confirmed” their SQT score with an SAT. But tests are expensive to write, so in 1971 the two organizations joined forces and the PSAT became the qualifying test.

National Merit qualification is the PSAT’s reason for existence these days. For students, qualification permits recognition that has otherwise been erased from the modern era–and by modern, I don’t mean the current post-Floyd “tests are bad” phase but going back to the 80s or even earlier. All you need for National Merit recognition is a really high PSAT test score that puts you in the top 1% of all testers. The SAT has no equivalent. There’s no official SAT 1600 Club, no “Top 1% Score” label students can include in their CV.

So a major point of the PSAT–arguably its only real value–is to identify the top 1% and give them bragging rights. Since Asian immigrants and months or years of prep broke the tests, the labels aren’t as impactful as they once were. But to Asians, they are particularly important because the scores can’t be gamed by state discrimination. While states with high Asian populations have higher cut scores, the scores themselves are the only way to play. No one is culling out Asians or making them meet a higher standard. The single standard is so essential to the National Merit qualification that, rather than change this, a national recognition category was defined for the lower scoring ethnicities.

If, as I claim, that’s the primary feature of the PSAT, one might wonder why other kids not in the top 1% would bother taking the PSAT at all, given the wide range of SAT test prep and the complete lack of value the PSAT has in their lives. Hard truth: Most kids are only taking the PSAT to provide a decent-sized mountain for the winners to sit atop.

To keep the PSAT tradition alive despite the fact that the nearly three hour test has little benefit for the other 96%, the College Board gives complete control to high schools. Students don’t register for the PSAT with the College Board (as they do for the SAT). High schools administer and own the PSAT. They decide what day to run the test (Tuesday or Saturday). They decide if the test will be limited to their students or if they will sell seats to kids from other schools. They decide whether or not they will require their students to take it. The scores and notifications are sent to the schools, not the students–these are precisely the circumstances that created the TJ hysteria currently in the news. While the SAT registration fee increases almost every year, the PSAT is just $18–very affordable to states who might want to pick up the tab.

In short, we still have the PSAT because the College Board uses the National Merit awards to increase its cachet and in exchange gives control and affordability to high schools, who have various reasons for wanting their student population to take college admissions tests, from bragging rights about National Merit to ensuring they aren’t missing bright unmotivated kids to... whatever, I haven’t gamed out all the advantages. Otherwise, it’d be long gone.

The essential category achievement in National Merit ranking is “semifinalist”.

National Merit Semi-finalists are, roughly, those receiving the top 1% of PSAT scores. Designation isn’t an exact science, because the semifinalists are apportioned by state. Different states have tighter (California, Virginia) or looser (South Dakota) cut scores, but it’s basically the top 1%.

Most semifinalists go on to be finalists. Not all do–for example, I was a semi-finalist whose school didn’t even bother to tell me there was paperwork to apply for the next step because my GPA was a 3.3 and I had something like 4 Ds so didn’t have a shot at finalist. But most.

The scholarships themselves aren’t all that big a deal. There are three categories of scholarships: NM corporate, NM university, and National Merit itself. Awards are far more subjective. The corporate scholarships are usually limited to students whose parents are employees, or living in a particular region. The university scholarships aren’t even offered by most schools but are used by less prestigious schools to offer full-rides to smart kids if they commit. The NM-sponsored scholarships are for $2500. So not that much money and–crucially–determined in late spring, long after college offers have been made. No application gloss factor.

So for all practical purposes, semi-finalists are the ball-game. They’re declared in September, and are a pretty reliable indicator that the student was in the 99th percentile for his state.

Prior to the TJ story, that’s what I knew.

The TJ story has various iterations but makes this charge:

Last fall, along with about 1.5 million US high school juniors, the Yashar teen took the PSAT, which determines whether a student qualifies as a prestigious National Merit scholar. When it came time to submit his college applications this fall, he didn’t have a National Merit honor to report — but it wasn’t because he hadn’t earned the award. The National Merit Scholarship Corporation, a nonprofit based in Evanston, Illinois, had recognized him as a Commended Student in the top 3 percent nationwide — one of about 50,000 students earning that distinction. Principals usually celebrate National Merit scholars…

To those of us generally familiar with the PSAT, there are two unfamiliar terms that immediately jump out: Commended Student and National Merit scholar.

The other fact that jumps out is the 3%. Remember, semi-finalist is top 1%. So Commended Student is way downstream from the only important category. Basically, a participation trophy. I thought the term might be relatively new, but I found mention of it going back 30 years, so it’s not new–just unimportant. In fact, I think my own son must have qualified and DAMN NO ONE TOLD ME EITHER.

Then I dug into “National Merit Scholar” and learned that it is the formal term for kids who make it all the way through to a scholarship from the organization. No one uses the term “National Merit Scholar” for anything less. And as I said, the actual scholarship winners aren’t as big a deal as the semi-finalists, which is why I hadn’t heard the term. (On the Advanced Placement side, AP Scholar has far more significance.) Here’s a University of South Florida campus celebrating the presence of eight National Merit Scholars on campus. USF Petersburg gives each student a full ride on tuition and board in exchange for the student openly committing to the school. This is not a list of semifinalists, even, as can be seen by the declaration of major.

So again: Semi-finalist status is coin of the realm. Commended is also-ran, Finalist is status too late to matter, Scholar is more status too late, but also some money.

So that’s nearly 1300 words on the National Merit and PSAT.

In Part II, I’ll explain why l’affaire TJPSAT is not just nothing, but a really embarrassing nothing. Hint: everyone retweeting this story either genuinely thinks, or is perpetuating the lie, that “commended” is “scholar”.

31 Comments | tags: asian immigrants, merit, Thomas Jefferson High School of Science and Technology, war on woke | posted in College Admissions, testing

September 18, 2021

False Positives

By educationrealist

I quit writing about tests. And test prep. Five, six, years ago? I still taught test prep until this year, always giving in to my old employer’s pleas to teach his Saturday classes. But I largely quit the SAT after the last changes, focusing on the ACT. I still love tests, still enjoy coaching kids for the big day.

Explaining why has been a task I’ve avoided for several years, as the doubt is hard to put into words.

It was an APUSH review course, the last one I taught, I think. Class hadn’t started for the day, but one of my five students was sitting there highlighting notes. She was a tiny little thing, perky and eager but not intellectually remarkable and it was March of what would have been her junior year.

“This is my last test prep course. I’ve taken the SAT for the fourth time, took AP Calculus BC last year, and I’m all done.”

“Yay! How’d you do on the SAT?”

“2400,” she said, casually. “I got 2000 the first time, but I spent the whole summer in two prep courses, plus over Christmas.”

Boom.

Like I said, she was….ordinary. Bright, sure. But her APUSH essays were predictable, regurgitating the key points she’d read in the prep material–pedestrian grammar, too many commas. Her lexile level was unimpressive. Nothing terrible. I gave her some tips.

This girl had placed in the 99th percentile for the SAT but couldn’t write a grammatically complex sentence, much less an interesting one. Couldn’t come up with interesting ways to use data (graphs, statistics). Couldn’t accurately use the words she’d memorized and didn’t understand their nuance in reading text.

She was a false positive.

I’ve known a lot of high scoring students of every ethnicity over the years–and by high scoring, I mean 1400-1600 on the 1600 SAT, and 2200-2400 in the 10 years with the three tests. 5s on all AP tests, 700+ on all Subject tests. Until that conversation, I would have said kids who had high test scores were without exception tremendously impressive kids: usually creative, solid to great writing, opinionated, spotted patterns, knew history, knew the underlying theory of anything that interested them. I could see the difference, I’d say, between these kids and those slightly lower on the score scale–the 1200s, the kids who were well rounded with solid skills who were sometimes as impressive, sometimes not, sometimes a swot, sometimes a bright kid who didn’t see much point in striving.

Every time saying it, though, I’d push back memories of a few kids who’d casually mentioned a 5 score, or a 1600 or 2400, that took me aback. That particular kid who didn’t seem all that remarkable for such a high score. But in all these cases, I was only relying on gut instinct and besides, disappointingly high IQ folks exist. For every Stephen Hawking there’s a Ron Hoeflin. Or a Marilyn vos Savant, telling us whether or not larks are happy. Surely the test would sometimes capture intellect that just wasn’t there in the creative original ways I looked for. Or hey, maybe some of those kids were stretching the truth.

But here, I had my own experience of her work and her scores were easily confirmable, as my employer kept track (her name was on the “2400 list”, the length of which was another shock to my prior understanding). She got a perfect score despite being a banal teen who couldn’t write or think in ways worthy of that score.

Since that first real awareness, I’ve met other kids with top 1% test scores who are similarly…unimpressive. 98+ percentile SAT scores, eight 5 AP scores, and a 4.5 GPA with no intellectual depth, no ability to make connections, or even to use their knowledge to do anything but pick the correct letter on the multiple choice test or regurgitate the correct answer for a teacher. Some I could confirm their high scores, others I just trusted my gut, now that I’d validated instinct. These are kids with certainly decent brains, but not unusually so. No shame in that. But no originality, not even the kind I’d expect from their actual abilities. No interest in anything but achieving high scores, without any interest in what that meant.

It probably won’t come as a shock to learn that all the kids with scores much higher than demonstrated ability were born somewhere in east Asia, that they all spent months and months learning how to take the test, taking practice tests, endlessly prepping.

The inverse doesn’t hold. I know dozens, possibly hundreds, of exceptional Asian immigrants with extraordinary brains and the requisite intellectual depth and heft I would expect from their profile of perfect SAT scores and AP Honors status. But when I am shocked at a test score that is much higher than demonstrated ability, the owner of that score is Chinese or Korean of recent vintage.

I don’t know whether American kids (of any race) could achieve similar scores if they swotted away endlessly. Maybe some of them are. But my sample size of all races is pretty high, and I’ve not seen it. On the other hand, I’m certain that very few American kids would find this a worthwhile goal.

Brief aside: when I taught ELL, I had a kid who was supposedly 18. That’s what his birth certificate said, although there’s a lot of visa fraud in Chinese immigrants, so who knows. He didn’t look a day older than fourteen. And he had very little interest in speaking or learning English. Maybe he was just shy, like Taio, although I’d test him every so often by offering him chocolate or asking him about his beloved bike and he showed no sign of comprehension. But then he’d ace multiple choice reading passages. Without reading the passage. He had no idea what the words meant, but he’d pick the right A, B, or C, every time. I mentioned this to the senior ELL teacher, a Chinese American, and she snorted, “It’s in our genes.”

I don’t think she was kidding but the thing is, I don’t much care how it happens. If American kids are doing this, then it changes not a whit about my unhappiness. It’s not a skill I want to see transferred to the general teenage American population. (That said, the college admissions scandal makes it pretty clear that, as I’ve said many times, rich parents are buying or bribing their way in, not prepping. And unsurprisingly, it appears that Chinese parents were the biggest part of his business.)

Now, before everyone cites data that I probably know better than they do, let me dispatch with the obvious. Many people think test prep doesn’t work at all. That was never my opinion. When people asked me if test prep “worked”, I’d always say the same thing: depends on the kid. “Average score improvement” is a useless metric; some kids don’t improve, some improve a bit, some improve a huge amount. Why not pay to see if your kid improves a lot? But I also felt strongly that test prep couldn’t distort measured ability to beyond actual ability, and I no longer believe that.

But I didn’t believe what critics at the time said, that test prep worked…..too well. I didn’t believe that false positives were a real problem. And the terrible thing–at least to me–is that I still believe normal test prep is a good thing. Distortion of ability, however, is not.

As the push to de-emphasize tests came, as test-entry high schools came under attack, as colleges turn to grades only–a change I find horrifying–I could no longer join the opposition because the opposition focused their fire almost exclusively on their dismay at the end of meritocracy and the concomitant discrimination against Asian immigrants. I oppose the discrimination, but I no longer really believe the tests we have reliably reveal merit to a granular degree. The changes I want to see in the admissions process would almost certainly reduce Asian headcount not by design, but by acknowledging that specific test scores aren’t as important.

I have other topics I’ve been holding off discussing:

why I support an end to test-based high schools in their current form
why we still need tests
how the SAT changes made all this worse
how the emphasis on grades for the past 20 years has exacerbated this insanity
why we need to stop using hard work as a proxy for merit

But I needed to try, at least, to express how my feelings have changed. This is a start. It’s probably badly written, but as you all know, I’ve been trying to write more even if the thoughts aren’t fully baked, so bear with me.

12 Comments | tags: asian immigrants, meritocracy, SAT, test prep | posted in College Admissions, philosophy, testing

December 30, 2019

Bush/Obama Ed Reform: Zenith

By educationrealist

(This is part 2 of my brief (hahahah) history of the rise and fall of modern education reform. This part is longer because much more happened. Unlike the events in part 1, I experienced the Obama reforms as a teacher, having graduated from ed school the year of his inauguration. I began blogging the year he was re-elected.)

Bipartisan Achievements

Barack Obama won the presidency in 2008 while simultaneously blasting NCLB and praising charters and merit pay for teachers. In practice, he and Secretary of Education Arne Duncan kept giving reformers everything they wanted–although in fairness, reformers got increasingly nervous about their gifts as his presidency matured.

Ironically, given the general sympathy that the Obama administration had for education reform, a new version of the ESEA was impossible throughout most of the Obama presidency. This proved to be an extremely significant limitation. Arne Duncan and Obama, rather than force states to live with the unpopular mandates, invited the states to submit waivers asking to be exempt from the penalties. This gave the Obama administration considerable power to force states to adopt policies the federal government wanted. Conservatives were unnerved by what most would consider a violation of Section 438 of the General Education Provisions Act banning any federal control over state educational choices.

Bribing the States, round I: Race to the Top, Waivers

First up was Race to the Top, enacted as part of the economic stimulus plan of 2009, in which over $3 billion was set aside for rewards to competitive bids. Compared to the moon shot by Arne Duncan, the competition demanded compliance with most key aspects of education reform. Of the 500 points awarded, 313 of them (63%) were for teacher effectiveness (138 points), adopting “common core” standards (70 points), supporting the growth of “high quality” charters (55 points) and intervention into low-performing schools (50 points). Schools that didn’t promise to fulfill ed reformers’ wildest dreams didn’t stand much of a chance. From the link above: “Between 2001 and 2008, states on average enacted about 10 percent of reform policies. Between 2009 and 2014, however, they had enacted 68 percent. And during this later period, adoption rates increased every single year.”

Around 2010, it became possible to observe two developments that were in fact completely foreseeable to everyone back in 2001, when NCLB was signed.

First, NCLB allowed states to define proficiency and then penalized schools that didn’t meet that definition. That might not have been a problem except for the second development: no matter how easy the tests got, 100% proficiency never happened. And the gaps were the usual ones.

But now 2014 was squarely in sight and getting closer, and schools well outside the usual urban dystopias were getting hammered into program improvement.

Since a new ESEA was still politically impossible, the Obama administration began offering “waivers” from the consequences of extended failure to meet NCLB, in exchange for setting their own higher, more honest standards for student success:

States must adopt college and career ready standards
Schools must be held accountable
Teacher and principal evaluation systems

Some education reformers (the conservatives) were concerned about the quid pro quo nature of the waiver requirements. Other education reformers (the neoliberals) pishtoshed those concerns, saying (much as they said later about immigration) that Congressional gridlock made the waivers and demands logical and reasonable. A typical debate had Andrew Rotherham, a neoliberal reformer from the Clinton administration, rationalizing the Obama waivers: “This dysfunction matters because when NCLB was passed in 2001, no one involved imagined the law would run for at least a decade without a congressional overhaul.” (Translated: good god, no one took that nonsense about 100% proficiency seriously, we expected to modify it before then!)

Obama announced the waivers in February, 2012, and by July of that year 26 states had waivers, with another 9 awaiting approval. A year later, all but seven states had waivers. Jerry Brown and the California team flatly refused to intervene in “failing schools” or evaluate teachers by test results and never got a waiver (although a few districts applied separately and got one).

While we refer to the testing consortiums (consortia?) as the Common Core tests, I was surprised to learn that the original competition for the grants was part of Race to the Top. Arne Duncan announced the winners, PARCC, which had 26 states signing on, and SBAC, which had 33 (some states joined both), in 2010.

The tests, almost more than the standards, excited education reformers. No more would individual states be able to dumb down their tests to reach NCLB standards. All the states would be held to the same standard.

But it wasn’t federal mandates, of course. No, no. This was all voluntary!

Bribing the States, round II: Common Core

The Common Core initiative was originally the brainchild of Janet Napolitano when she was heading up the National Governors Association, documented in 2007’s Benchmarking for Success: Ensuring US Students Receive a World-Class Education (note: it’s kind of amazing how hard this document is to find. All the links to it reference the NGA doc, but that’s been deleted. I think this is the only existing online copy). She convened a group, and they came up with a set of five action items, three of which you can see reiterated above in the Obama waiver, because they were basically copied.

But it would never have gone anywhere had not Gene Wilhoit (head of the school superintendent organization) and David Coleman, described in the link ahead as “emerging evangelical of standards” but actually little more than an ex-McKinsey guy with an assessment display (display, not design) startup, gone to see Bill Gates, whose enthusiasm should have been a big neon light of warning, given his track record. Gates funded the development of standards. Coleman used the money to “found” Student Achievement Partners and hire Jason Zimba, an ex-business partner who now worked for Coleman’s mother (or, rather, was a professor at Bennington College, where Coleman’s mom was president). Zimba, Phil Daro, and William McCallum wrote the math standards. Coleman and Susan Pimentel wrote the ELA standards. The original Benchmarking report stated that the standards would be based on the American Diploma Project, but for reasons I don’t understand and might be interesting for someone else to explore, Coleman and crew rewrote a lot of it.

As the history shows, education reformer groups–those involved with accountability and choice–weren’t directly involved in the birth of Common Core, although it’s also clear from the verbiage in the Benchmarking report that education reform initiatives like teacher value-added measurement, charters, and school takeovers were very much in political parlance at that time, and very much bipartisan.

But education reformer groups loved the Common Core because they saw it as a way to bail them out of the two serious failures of NCLB described above. As Rick Hess observed in a five-year retrospective of Common Core, “The problem with that is if you had hard tests or hard standards you made your schools look bad. So there was a real, kind of perverse incentive baked into NCLB [to make the tests easier]”. Hilariously, Michael Petrilli, who was in the Bush administration and was a key bureaucrat in the passage, has often said he disagreed with the 100% proficiency goal but “his boss” forced it on him. So now that NCLB was in a bind, the ed reformers were all for Common Core bailing them out.

The waiver process is often blamed for the rapid adoption, but in fact every state but Alaska, Texas, Nebraska, and Virginia had adopted Common Core standards by 2012, and all of those but Wyoming had done so long before Obama announced the waivers. Apart from the conservatives’ “in principle” objections, the original hullabaloo over heavy-handed federal interference was teachers’ outrage at a president–a Democrat, no less–using money to bribe states into evaluating teachers by their students’ test scores.

Regardless, states eagerly adopted the Common Core standards and in 2012, all seemed right in the world of education reform.

Governance

Technically, all of the above was the Obama Administration’s bribes to the states to change their governance. These are just some specific cases or other items of interest.

Tennessee won the Race to the Top, getting $500 million to enact First to the Top. The effort was initiated by Governor Phil Bredesen, a Democrat, and carried through by Bill Haslam, a Republican. Tennessee’s application promised two things of note. First, it would take its existing, longstanding teacher evaluation system (TVAAS) and use it as a formal evaluation tool, responsible for 35% of teacher evaluations. Then, in order to intervene in “failing” schools, it set up a state-run district, the Achievement School District, creating a new district as opposed to the state taking over an existing one. The lowest performing schools were simply placed in that district. The stated goal of the ASD was to take schools from the bottom 5% and “vault” them to the top 25%. In 2011, Haslam appointed Kevin Huffman, ex-TFA teacher and executive, as well as Michelle Rhee’s ex-husband, as Commissioner of Education. The first ASD superintendent was Chris Barbic, former TFA teacher and founder of Yes Prep, another charter system in Houston.

Mark Zuckerberg went on Oprah in 2010 and, with great fanfare, donated $100 million to Newark, New Jersey schools. Chris Christie appointed Cami Anderson, an alumna of TFA management, as superintendent of the district in May 2011. A year later, Anderson signed a contract with the Newark Teachers Union giving bonus pay for higher test scores or teaching math and science (although teachers could choose to be paid traditionally). The pot was sweetened with a lot of back pay which, to put it mildly, was not what Zuckerberg wanted the money to be spent on.

Michelle Rhee got a lot of attention, bragging of giving DC schools a “clean sweep”, dumping all the “bad” teachers and administrators who didn’t get test scores up. Eva Moskowitz was dumping students who didn’t get test scores up. Joel Klein left his NYC post in 2011; Bloomberg’s pick of Cathy Black, a woman with no teaching or administrative experience, was extremely unpopular. Bloomberg gave up on Black after four months and appointed Dennis Walcott, who was accepted at face value as an improvement. School turnaround consultant Paul Vallas ran the Louisiana Recovery District (mostly New Orleans Schools) for 4 years.

Education reform generally became more popular in Democratic circles, given Obama’s strong support. Steven Brill’s article The Rubber Room called attention to NYC’s practice of housing teachers who’d been removed from the classroom but couldn’t actually be fired. Waiting for Superman, a documentary promoting choice and blasting unions and tenure, opened to universal praise by media, politicians, and other thought leaders. In 2010, Obama openly supported the dismissal of a Rhode Island high school’s entire staff, saying, “our kids get only one chance at an education, and we need to get it right.”

All this criticism kept building. 2012 was a nadir year in terms of establishment discourse about public school teachers, although their reputation among the public seemed largely unchanged. It became increasingly popular to attack teacher tenure, again by both Democrats and Republicans, and certainly in the generally left of center media. Many states had agreed to evaluate teachers by test scores and both major unions had signed onto the Common Core standards, although teachers themselves were very doubtful. A preponderance of politicians and academics were more than willing to agree that teacher quality needed to improve, that tenure might be problematic, and that teachers should be judged at least in part on test scores. The Chicago Teachers Union went on strike, pitting union president Karen Lewis against Rahm Emanuel, and media sympathies were entirely with Rahm. Governor Scott Walker ended collective bargaining for public workers (except cops and firefighters!).

One major setback: DC’s 2010 election, in which black voters booted Adrian Fenty, the media-popular mayor, largely because they wanted to get rid of Michelle Rhee, who stepped down the day after the election. Her successor, Kaya Henderson, kept firing teachers, but she’s black, which might have made a difference. Rhee immediately announced a new organization, Students First, and let Richard Whitmire write an admiring biography.

Standards

In 2008, California made algebra I the “test of record” for eighth graders, meaning that 8th graders would take an algebra end of course test or the schools would receive a penalty towards adequate yearly progress.

High school exit exams mostly held constant; this 2008 Edweek article actually says that fewer than half of the states required exams, but that may be because of lawsuits. California, for example, was sued constantly about the use of the CAHSEE in the early 2000s.

Charter Growth, Choice, TFA

Just one state, Washington, authorized charters during the Obama administration. Absolute growth was still slow through 2011, but then recovered from 2012 to 2017. As a percentage, though, the decline from 2001 to 2011 was steep, slowed slightly but still declined through 2017. By 2012, charter advocates began pushing the suburban progressive charter, realizing that growth would continue to slow if they couldn’t disengage white folks from their beloved public schools. Suburban charters were (and are) popular with whites in racially diverse areas, particularly in the south; for example, Wake County charter schools were 62% white in 2012.

When the 2007-2008 meltdown hit, TFA recruitment soared ever higher as elite grads sought shelter from a horrible job market. Relay Graduate School began in 2011, basically providing a teaching credential for new hires of inner city charters.

In 2010, Douglas County (major Colorado suburb) began a highly contested investigation into a voucher program, one that would give public money for all private schools, including religious ones. The school board ultimately supported a move forward, despite a split community.

And that’s the end of the very nearly straightforward rise of education reform. It’s impossible to cover every major development, but I really tried to look at advances in every major area.

I’m going to call 2012 as the peak of the era, for reasons I’ll go through in the next post. It’s not that all progress stopped. It took four more years before education reformers even began to consider how badly they’d been beaten. But most of them would realize that they were now fighting significant opposition that they couldn’t easily dismiss.

Something I’ve mentioned before: it’s amazing that Republican media folk, as opposed to education reformers and even politicians, still talk like it’s 2008-2012. There’s really no understanding in the pundit world how badly they’ve been beaten.

Next Up: Bush/Obama Ed Reform: Core Meltdown Came

14 Comments | tags: algebra one, Arne Duncan, Barack Obama, Common Core, David Coleman, jason zimba, Michelle Rhee, nclb waiver, race to the top, waiting for superman | posted in charters, policy, politics, reform, testing, unions

April 29, 2019

Getting Smarter, or Getting Better at Using Smarts?

By educationrealist

Influence of young adult cognitive ability and additional education on later-life cognition

Or, as Stuart Ritchie says:

Cool new PNAS paper about potential educational effects on IQ.

Previous work: control for age-11 IQ, still find edu-IQ correlation later in life.

This paper: control for age-20 IQ, correlation is gone. Suggests education has limited influence after age 20.

IQ measurement doesn’t interest me much, but IQ development or change over time does, for ego-driven reasons. As long-time readers know, I have a very high IQ (I qualified for and participated in a research study for 3SD+), but my spatial abilities are very weak and I was stymied in advanced math (past algebra 1) until, in my early 40s, I learned how to compensate using logic. I also was late to learn how I learned; my brain won’t acquire new information unless it’s tagged with all sorts of meta-data. Learning new concepts was so laborious that in my teens, I simply assumed I was incapable of learning; not until I was faced with job-related challenges did I learn how I learned. My verbal skills are extraordinarily high, although it’s hard for me to compare to others because my particular combination of smarts would have required a more thorough classical education, which I don’t have. I read 1000 wpm and can acquire extraordinary amounts of information through inference, which of course can sometimes lead me astray.

So. In 2001, I took the GRE and got 790V, 640Q, 690A. It was the last time the Analytical section was included. 640 quant was the 65th percentile that year, 690 analytical was in the 85th percentile–some logic games are brutally spatial. Anything over 700 Verbal is in the 99th percentile. I was very proud of that quant score.

In 2008, I took the GRE and got 780V, 800Q. (I’m still annoyed by the 780; if I’d focused more, I might have gotten a double 800.) 800 Q is just the top 4% in any given year, but it’s probably more accurate to call it a top 10% score.

According to this GRE IQ estimator, my original GRE V+Q of 1430 is 99.452nd percentile, and my second GRE V+Q of 1580 is 99.993. But (forgive me IQ estimators), any IQ based on combining V+Q makes no sense, because an 800V-640Q is considerably more difficult to achieve than a 640V-800Q, so how can they be identical, IQ-wise? Plus, the decimal point specificity is just goofy.

Looking at my quant score alone, in seven years from the age of 39 to 46, I jumped from just above average in math to pretty close to 2SD.

Did I get smarter, or did I simply learn how to use my existing intelligence?

The quant section of the old GRE was extremely g-loaded. I used to tutor for the test and ran into dozens of people who’d majored in college in math, knew more calculus and all that nonsense about vectors and matrix determinants and ordered ring fields than I’ll ever know, yet scored in the high 600s. Which is not to say that it wasn’t relatively easy, just that lots of smart people would occasionally miss questions because they were more about g than math competency. The new GRE combines both. I can’t find an online GRE practice test, but I did the problems on the ETS site, and I think I’d still get in the top 10%.

The GRE Math Subject test is what used to be called an “achievement” test. God, testing lingo has changed so dramatically in so little time. g is involved in the sense that a certain level of intelligence is required to learn the material. But a 120 IQ who’s taken calculus and number theory would outscore a 145+ IQ who has not.

I got 13 right out of 56. A 390. I wonder how many people get an 800 on the GRE General Quant and a 390 on the GRE Math? That’s a terrific illustration, really, of the difference between achievement and aptitude. I knew none of the number theory, only some of the stats, none of the integral questions, but all of the limit and derivative questions, and random other stuff.

Every single one of the questions I answered correctly was using math I’ve learned in the past fifteen years. Had I taken the test in 2005, I would have gotten zero correct. I took AP Calculus as a senior, remember none of it. All the math I know today is from my tutoring days or my time in teaching.

Did I get smarter, or did I simply learn how to use my existing intelligence? Here, it seems clearer than in the first case. The GRE Quant (old form) is definitely an aptitude test, which makes my big score jump odd. But acquiring new knowledge isn’t the same as having a higher IQ. Right? (asking seriously). I could do much better on this test if I studied up on integrals and 3-dimensional systems. Hold that thought.

To contrast, I took the GRE English Literature Subject Test, all 230 questions. For me, this test is diametrically opposite the GRE Math Subject test. The latter requires actual math knowledge. But the English Lit test is about 70% interpretation, 10% terminology (literary terms) and 20% content knowledge (knowing the plots of Ben Jonson’s plays, or familiarity with Matthew Arnold’s poems). I missed 56 questions, scoring a 650, in the 86th percentile (although I’ve always distrusted the scoring on English lit tests). Two of the misses were analysis and both were careless errors I’d never make in a real test. All the other missed problems were content knowledge–not anything I’d forgotten, but things I’d never learned. My English degree wasn’t terribly rigorous, but what I learned thirty-five years ago, I remembered. I recognized Shakespeare’s writing in a sonnet I’d never read before–ditto Donne and Milton. I even guessed my way through Derrida and Foucault. But William Caxton, Nikki Giovanni–eh. Never heard of them. I read a lot of Faulkner short stories, but avoided his novels. And so on.

Most of the high difficulty questions (less than 30% answered correctly) were literary analysis, and I nailed them. The only hard questions I missed were three content knowledge (obscure authors) and one grammar question. The rest were in the 45-65% range, which is typical when the test is covering a broad range of material and no one knows everything. I think I could probably learn my way to a 700, but higher than that would require more interest in literature than I have.

So the GRE Math subject test requires specific knowledge, while the GRE Literature subject test allows people with high aptitude to do very well, even if their specific literature content knowledge is weak.

There aren’t many forty-something folks taking GRE Subject tests, but doesn’t it seem likely that it’s more common for someone to develop math content knowledge later in life than it is to suddenly develop excellent reading skills? Which suggests that reading comprehension, verbal ability, is more hard-wired to cognitive ability than math is. That might explain why math test scores have improved more than reading scores, generally. For all the wailing about math achievement, we do better at teaching students to improve their math abilities than we do at making them better readers.

From the study abstract: “Education does improve cognitive ability”

It does? That seems backwards to me. Cognitive ability improves educability. If all we had to do was educate people to make them smarter, I wouldn’t have this blog.

Does education actually make people smarter, or does it just teach them how to use their existing intelligence?

I have no answers, so I’ll stop here.

10 Comments | tags: cognitive ability, GRE Literature, GRE Math, Stuart Richie | posted in philosophy, testing

September 28, 2017

Killing My Own Snakes

By educationrealist

When I was hired to teach at Southeastern in May, 1979, the Academic Dean at the time gave me only two pieces of advice: “Make your own way,” and “Kill your own snakes.”-Steven Fettke

One of the most valuable pieces of advice I received, from two different teachers in two different years (student teaching, first year), was that a new teacher had to know what “quiet” is. If kids wouldn’t shut up, then kick them out until finally, the teacher experiences….silence. Without that baseline, a new teacher has no gauge to assess the ambient classroom noise.

I began teaching as a better than average classroom manager, and somewhat shrugged this wisdom off until I got the advice the second time after five particularly troublesome geometry students wouldn’t shut up during an entire lesson. So the next day, I warned them once and then tossed one then another off to the office. After two were gone, the other three realized I was serious and shut up, after growling a bit about unfairness. Turning back to the board, I suddenly heard…..silence. Utter, attentive, silence. And from that point on, I knew what silence was, and what to expect when I demanded it.

As a mentor, I always advise new teachers to err on the side of excess with disruptive students. If they have an entire class out of control, ask for help. If they have a few students misbehaving, toss them out after a warning. Screw fair. Get silence. Know what it sounds like.

New teachers are often fearful of sending students out. They worry that administrators will judge them. They’re right to worry. Administrators often notice. At my last job, the volume of my referrals was a constant source of tension. In really poorly managed schools, the admins refuse to accept students and send them back. (Note: leave that school.)

This is where mentors come in. Mentors can, and should, give balance to new teachers. My induction mentor’s support and acknowledgement of my unimaginably disruptive students finally forced administrators to take action. If the teacher is weak, by all means help shore up the crumbles. But in the meantime, encourage the teacher to boot students who disrupt teaching time. I get impatient with people who bleat that removing kids from the class is depriving them of education. All students deserve an education. Students who are determined to prevent that can step outside.

In my experience, novice teachers stuck with unusually unruly students will improve their management skills if given the opportunity to remove the disruptors. As time goes on, these teachers will improve their handling of rambunctious students. Part of that improvement involves knowing what silence sounds like.

So new teachers should not try to kill all their snakes, particularly given the likelihood that they’ll have the toughest students.

I assume most teachers kill their own snakes after the first few years. But I’m often amazed at what senior teachers will tolerate. Sample statements, followed by my (usually unspoken) response.

“I’m teaching an Algebra 10-12 class, and the kids start packing up their stuff with fifteen minutes to the bell. Does that ever happen to you? What do you do to prevent that?”

I tell them to unpack their damn books and get back to work. Right now. And if they don’t start moving right away, oh my goodness, pop quiz.

“I’ve been having so much trouble with kids using cell phones constantly in class, not paying attention at all. What do you do?”

I take their damn cellphones away, giving myself extra points if I can swipe it from under their nose without signaling intent. Students who can’t keep off their phones lose them until the end of the day instead of the end of class. And they don’t dare complain, because I can always hand it over to the administrators, whose penalties are far more stringent.

“I have these two kids who constantly talk to each other, but when I try to separate them, they insist on sitting together. It’s so frustrating.”

Why the hell do you give them a choice? Tell them where to sit. In fact, tell everyone where to sit.

“I tell the kids not to bring food to the class, but what do you do when they’ve just bought lunch?”

You take the lunch away and tell them they can enjoy it cold later.

“I’ve tried taking away phones/telling them where to sit/taking their lunch but they refuse to give it over, and I don’t know what to do.”

You call and have them removed from the class.

“What? For something so minor?”

Listen well, little teachlings. Defiance of a teacher is not minor. It’s one of the few snakes that even experienced teachers should hand off to an administrator if they can’t convince the student to comply. Give the kid a chance to walk back. Offer alternatives. Draw a line, though, and if the line gets crossed, have the kid removed for the day.

And of course, logistics get in the way sometimes. More than once, I’ve picked up the phone to call for a supervisor to come take a defiant kid away–and no one answers the damn phone. So I have to call another number. Sometimes no one answers. All that drama and then….man, turning back around to face the class really sucks.

But well over half the time, simply picking up the phone has results, and the defiant one says something like “Well, you want me to give up my lunch AND my drink! No way!” and I say quickly, “No. Just the lunch. I insist on the lunch!” which leads to “Oh, I thought you wanted my drink, too. OK, have my lunch. BUT I KEEP MY DRINK!”

Other times, the troublesome kid smirks. “Ha, ha, you can’t catch me, copper!” Shrug. Just shrug. And then later, call again, after the smirker has forgotten all about it, and have him pulled from the room, protesting. Don’t gloat. Just go on with the lesson like this is no big deal.

So you might be reading all this saying, wow, Ed’s a tyrant. Which is hysterical, because I’m one of the loosest teachers you’ll ever run into. Remember, I don’t assign homework. My kids sit in groups. I have a non-existent detention rate, the lowest in the school. I rarely give an F grade. To my considerable pride, I’ve gotten the coolest of the Student Nominations three years running (best story teller, most unpredictable, most dramatic). My classes are noisy and boisterous affairs. In many ways, my classroom environment is a progressive’s dream, the kind of place that Ed Boland dreamed of having before he realized he hated students.

I have five rules, handwritten seven years ago on still bright yellow poster paper. Students should avoid:

arguing with the ref (me)
eating, drinking, or grooming
setting objects airborne
travelling without consent
incessant yammering

But bottom line, do what I tell you. My lines are very clearly marked, albeit occasionally negotiable. Just pay close attention to when I say “when”. As I tell my kids every year at syllabus time: in order for “all this”–school, teaching, classroom environment–to work, I have to be in charge. Students have to obey my direct orders.

I realize that many teachers feel that schools already exert a great deal of control over student lives. They feel that rules about eating, phones, and seating are an unfair imposition. These same teachers often feel that “consequences” must be “deserved”, that their restrictions on those who have made bad choices are somehow more reasonable.

Shrug. I’m not saying there’s only one way. Other teachers can make their own choices. Me, I avoid morality plays. I don’t talk about what students deserve or earn, simply about what helps me teach and others learn. I handle even cheating as a pragmatic issue, not a value judgment.

From students’ perspective, their least favorite of my management techniques is my yelling, specifically calling out or putting a student on blast. They prefer teachers who rebuke quietly and in private. But they also agree that when you aren’t the one being called out, it’s fun to watch me rant.

As I invariably mention when going through the syllabus, the only action a student can take to earn a permanent black mark is deliberate cruelty to another student. I will punish that and I’m much better at being mean.

Note that I prohibit being mean to other students. Nowhere in my rules is it verboten to be mean to me, the teacher.

At least once a year, I (usually inadvertently) get a student furious, and the exchange goes something like this:

Student: “F*** YOU!!!!”

Me, unfussed and occasionally confused: “Sit down.”

Student: “NO!!! You F******* *****! F*** YOU!! F*** OFF”

Me: “Sit down.”

Student, walking to the door: “NO WAY. EAT SH**. I’m OUT! YOU #*@#W%@#W%!”

Me: “DO NOT WALK OUT THAT DOOR!”

Student: “WHY NOT?”

Me: “BECAUSE UP TO NOW, YOU HAVEN’T DONE ANYTHING WRONG!”

This usually stops the student for a minute or so, giving me a chance to calm things down. In every case, after a brief talk with a fascinated class watching on, the student sits back down and everyone gets back to work. Show’s over.

Which is not to say I let students take nasty potshots at me. Like I said, I’m much better at being mean than your average adolescent. But I don’t demand respectful behavior, and don’t get upset at rudeness. This will not come as a shock to people who know me online.

Look. Teaching is very much an expression of personality. Mine is a teacher-centered classroom. But nowhere is it written that teacher-centered classrooms must be ruthlessly controlled environments of churchlike stillness. My classroom is, like me, loud and often disorderly, friendly, sarcastic. It sometimes changes on a dime. But its purpose is always there, driving things along, moving everyone forward.

New teachers: does your classroom environment reflect your personality, your values? Experienced teachers: are you setting rules that matter? Are you sure?

7 Comments | tags: classroom discipline, classroom management, detention, responsibility center, student behavior, teaching philosophy | posted in engagement, philosophy, testing

March 31, 2017

The Challenge of Black Students and AP Courses

By educationrealist

When the bell rings at Wheaton North High School, a river of white students flows into Advanced Placement classrooms. A trickle of brown and black students joins them. —The Challenge of Creating Schools That Work for Everybody, Catherine Gewertz

Gewertz’s piece is one of a million or so outlining the earnest efforts of suburban schools to increase their black and Hispanic student representation in AP classes. And indeed, these efforts are real and neverending. I have been in two separate schools that have been mandated in no uncertain terms to get numbers up.

But the data does not suggest underrepresentation. I’m going to focus on African American representation for a few reasons. Until recently, the College Board split up Hispanic scores into three categories, none of them useful, and it’s a real hassle to combine them. Moreover, the Hispanic category has an ace in the hole known as the Spanish Language test. Whenever you see someone boasting of great Hispanic AP scores, ask how well they did in non-language courses. (Foreign language study has largely disappeared as a competitive endeavor in the US. It’s just a way for Hispanic students to get one good test score, and Chinese students to add one to their arsenal.)

College Board data goes back twenty years, so I built a simple table:

[Image: blkaptable]

I eliminated foreign language tests and those that didn’t exist back in 1997. It’s pretty obvious from the table that the mean scores for each test have declined in almost every case:

[Image]

While the population for each test has increased, it’s been lopsided.

[Image: blkapgrowthbytest]

It’s not hard to see the pattern behind the increases. The high-growth courses are one-offs with no prerequisites. It’s hard to convince kids to take these courses year after year–even harder to convince suburban teachers to lower their standards for that long. So put the kids in US History, Government–hey, it’s short, too!– and Statistics, which technically requires Algebra II, but not really.

The next three charts show data that isn’t often compiled for witnesses. I’m not good at presenting data, so there might be better means of presenting this. But the message is clear enough.

First, here’s the breakdown behind the test growth. I took the growth in each score category (5 high, 1 low) and determined its percentage of the overall growth.

[Image: blkapscoredistributiongrowth]

See all that blue? Most of the growth has been taken up by students getting the lowest possible score. Across the academic test spectrum, black student growth in 5s and 4s is anemic compared to the robust explosion of failing 1s and 2s. Unsurprisingly, the tests that require a two to three year commitment have the best performance. Calc AB has real growth in high scores–but, alas, even bigger growth in low scores. Calc BC is the strongest performance. English Lang & Comp has something approaching a normal distribution of scores, even.
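For anyone who wants to repeat the calculation with real College Board numbers, here is a minimal sketch of the "share of growth" arithmetic described above. The counts are made up purely for illustration and are not College Board data, and the function name `growth_share_by_score` is my own label, not anything from the original spreadsheet.

```python
def growth_share_by_score(counts_start, counts_end):
    """Return each score category's share of the overall growth, in percent.

    Both arguments map an AP score (1-5) to the number of tests
    that received that score in the given year.
    """
    growth = {score: counts_end[score] - counts_start[score]
              for score in counts_start}
    total_growth = sum(growth.values())
    if total_growth == 0:
        raise ValueError("no overall growth to apportion")
    return {score: 100 * g / total_growth for score, g in growth.items()}


# Hypothetical counts for a single test (illustration only).
counts_1997 = {1: 100, 2: 150, 3: 120, 4: 60, 5: 20}
counts_2016 = {1: 600, 2: 450, 3: 240, 4: 120, 5: 40}

shares = growth_share_by_score(counts_1997, counts_2016)
for score in sorted(shares):
    print(f"score {score}: {shares[score]:.1f}% of growth")
```

With these invented numbers, half of all the added tests land in the 1 category and only 8% in the 4s and 5s combined, which is the kind of lopsidedness the chart shows.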

Here you can see the total scores by test and category. Calc BC and European History, two of the tests with the smallest growth, have the best distributions. Only four tests have the most scores in the 1 category; most have 2 as their modal score.

[Image: blkap1997]

The same chart in 2016 is pretty brutally slanted. Eight tests now fail most students with a one, just four have a two. Worst is the dramatic drop in threes. In 1997, test percentages with 3 scores ranged from 10-38%. In 2016, they range from 10-20%. Meanwhile, the 4s and 5s are all well below 10%, with the cheery exception of Calculus BC.

[Image: blkap2016]

Jay Mathews’ relentless and generally harmful push of Advanced Placement has been going strong since the 80s, even if the Challenge Index only began in 1998. So 1997’s results include a decade of “AP push”. But the last 20 years have been even worse, as Jay, Newsweek, and the Washington Post all hawked the Index as a quality signifier: America’s Best High Schools! Suddenly, low-achieving, high-minority schools had a way to bring themselves some pride–just put their kids in AP classes.

As I wrote a couple years ago, this effort wasn’t evenly distributed. High achieving, diverse suburban high schools couldn’t just dump uninterested, low-achieving students (of any race) into a class filled with actually qualified students (of any race). Low achieving schools, on the other hand, had nothing to lose. Just dub a class “Advanced Placement” and put some kids in it. Most states cover AP costs, often using federal Title I dollars, so it’s a cheap way to get some air time.

African American AP test scores don’t represent a homogeneous population, and you can see that in the numbers. Black students genuinely committed to academic achievement in a school with equally committed peers and qualified teachers are probably best reflected in the Calculus BC scores, as BC requires about four years of successful math. Black students dumped in APUSH and AP Government are the recourse of diverse suburban schools not rich enough to ignore bureaucratic pressure to up their AP diversity. They are taking promising students with low motivation and putting them in AP classes. This annoys the hell out of the parents and kids who genuinely want the rigorous course, and quite often angers the “promising” students, who are known to fail the class and refuse to take the test. The explosion of 1s across the board comes from the low-achieving urban schools who want to make the Challenge Index and don’t have any need to keep the standards high.

Remember, each test costs $85, and test fees are covered by taxpayers for students who can’t afford them. Consider all the students being forced, in many cases, to take classes they have no interest in. Those smaller increases in passing scores are purchased with considerable wasted time and taxpayer expense.

But none of this should be news. Let’s talk about the real challenge of black students and AP scores and methods to fix the abuses.

First, schools and students should be actively restricted from using the AP grade “boost” for fraudulent purposes. The grades should be linked to the test scores without exception. Students who receive 4s and 5s get an A, even if the teacher wants to give a B¹. Students who get a 3 receive a B, even if the teacher wants to give an A². Students who get a 2 receive a C. Students who get a 1 or who don’t take the test get a D–which, remember, will be bumped to a C for GPA purposes. This sort of grade link, first suggested by Saul Geiser (although I’ve extended it to the actual high school grade), would dramatically reduce abuse not only by predominantly minority schools, but also by all students gaming the AP system to get inflated GPAs. That should reduce a lot of the blue in this picture:

[Image: blkapscoredistributiongrowth]
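The grade link proposed above is simple enough to write down as a lookup. This is only a sketch of the rule as stated in this post. The 4-point scale and the one-letter AP bump (D counts as C, B counts as A) are assumptions about how a school weights AP grades, and the names `linked_grade` and `weighted_points` are mine.

```python
# Assumed unweighted 4-point scale; individual schools may differ.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}


def linked_grade(ap_score):
    """Map an AP test score to the course grade under the proposed link.

    ap_score is an int from 1 to 5, or None if the student skipped the test.
    """
    if ap_score is None or ap_score == 1:
        return "D"
    if ap_score == 2:
        return "C"
    if ap_score == 3:
        return "B"
    if ap_score in (4, 5):
        return "A"
    raise ValueError(f"not a valid AP score: {ap_score!r}")


def weighted_points(grade):
    """GPA points after the assumed one-letter AP bump."""
    return GRADE_POINTS[grade] + 1


for score in (5, 4, 3, 2, 1, None):
    grade = linked_grade(score)
    print(score, grade, weighted_points(grade))
```

Under these assumptions a student who skips the test still ends up with C-level GPA points, which is the point of the "remember, will be bumped" aside: the link removes the inflated A, not the weighting itself.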

Then we should ask a simple question: how can we bump those yellows to greys? That is, how can we get the students who demonstrated enough competence to score a 2 on the AP test to get enough motivation and learning to score a 3?

I’ve worked in test prep for years with underachieving blacks and Hispanics, and now teach a lot of the kids not strong enough or not motivated enough to take AP classes. My school is under a great deal of pressure to get more low income, under-represented minorities in these classes as well (and my school administration is entirely non-white, as a data point). A couple years ago, I taught a US History course that resulted in four kids being “tagged” for an advanced placement class the next year–that is, they did so well in my class, having previously shown no talent or motivation, that they were put in AP Government the next year. I kept in touch with one, who got an A in the class and passed the test.

My advice to my own principal, which I would repeat to the principal in Gewertz’s piece, is to create a class full of the promising but unmotivated students, separate from the motivated students. Give them a teacher who will be rigorous but low key, who won’t give much homework, who will focus on skill improvement in class. (ahem. I’m raising my hand.) Focus on getting the kids to pass the test. If they pass, they will get a guaranteed B in the class, which will count as an A for GPA purposes. (Even if the College Board doesn’t change the rules, schools can guarantee this policy.)

This strategy would work for advanced placement classes in English, history, government, probably economics. It could work for statistics. Getting unmotivated kids to pass AP Calculus may be more difficult, as it would involve using the strategy consistently for 3 years with no test to guarantee a grade.

The challenge of increasing the abilities and college-readiness of promising but not strongly motivated students (of any race) lies in understanding their motives. Teachers need to give their first loyalty to the students, not the content. Traditional AP teachers are reluctant to do this, and I don’t think they should be required to change. But traditional AP teachers are, perhaps, not the best teachers for this endeavor.

In order for this proposal to get any serious attention, however, reporters would have to stop pretending that talented black students aren’t taking AP courses. The data simply doesn’t support that charge. We are putting too many black students into AP courses. Too many of them are completely unfit, with remedial-level skills that high schools aren’t allowed to address. Much of the growth of Advanced Placement has relied on this fraud–and again, not just for black students.

It’s what we do with the kids in the middle, the skeptics, the uncertain ones, the ones who dearly want to be proven wrong about their own skills, that will help us improve these dismal statistics.

¹I can’t even begin to tell you how many teachers in suburban districts do this.
²The same teachers who give students with 4s and 5s Bs are also prone to giving As to kids who got 3s. But of course, this is also the habit of teachers in low achieving urban districts. Consider this 2006 story celebrating the first two kids ever to pass the AP English test, and wonder how many of the students got As notwithstanding.

17 Comments | tags: Advanced Placement, Catherine Gewertz, College Board, testing | posted in policy, politics, testing

May 20, 2016

The Many Failings of Value-Added Modeling

By educationrealist

Scott Alexander reviews the research on value-added models measuring teacher quality¹. While Scott’s overview is perfectly fine, any such effort is akin to a circa 1692 overview of the research literature on alchemy. Quantifying teacher quality will, I believe, be understood in those terms soon enough.

High School VAM is Impossible

I have many objections to the whole notion of modeling what value a teacher adds, but top of the idiocy heap is how little attention is paid to the fact that VAM is only even possible with elementary school teachers. First, reading and basic math are the primary learning objectives of years 1-5. Second, elementary schools think of reading and math ability in terms of grade level. Finally, elementary teachers or their schools have considerable leeway in allocating instruction time by subject.

Now, go to high school (of which middle school is, as always, a pale imitation with similar issues). We don’t evaluate student reading skills by grade level, but rather “proficiency”. We don’t say “this 12th grader reads at the 10th grade level”. We have 12th graders who read at the 8th grade level, of course. We have 12th graders who read at the third grade level. But we don’t acknowledge this in our test scores, and so high school tests can’t measure reading progress. Which is good, because high school teachers aren’t tasked with reading instruction, so we wouldn’t expect students to make much progress. What’s that? Why don’t we teach reading instruction in high school, if kids can’t read at high school level, you ask? Because we aren’t allowed to. High school students with remedial level skills have to wait until college acknowledges their lack of skills.

And that’s reading, where at least we have a fighting shot of measuring progress, even though the tests don’t currently measure it–if we had yearly tests, which of course we don’t. Common Core ended yearly high school tests in most states. In math, it’s impossible because we pass most kids (regardless of ability) into the next class the next year, so there’s no “progress”, unless we measure kids at the beginning and end of the year, which introduces more tests and, of course, would show that the vast majority of students entering, say, algebra 2 don’t in fact understand algebra 1. Would the end of year tests measure whether or not the students had learned algebra 1, or algebra 2?

Nor can high schools legally just allocate more time to reading and math instruction, although they can put low-scoring kids in double block instruction, which is a bad, bad thing.

Scope Creep

Most teachers at all levels don’t teach tested subjects and frankly, no one really cares about teacher quality and test scores in anything other than math or reading; we just pretend on everything else. Which leads to a question that proponents answer implicitly by picking one and ignoring the other: do we measure teacher quality to improve student outcomes or to spend government dollars effectively?

If the first, then what research do we have that art teachers, music teachers, gym teachers, or, god save us, special education teachers improve student outcomes? (answer: none.) If the second, then what evidence do we have that the additional cost of testing in all these additional topics, as well as the additional cost of defending the additional lawsuits that will inevitably arise as these teachers attack the tests as invalid, will be less strain on the government coffers than the cost of the purportedly inadequate teachers? What research do we have that any such tests on non-academic subjects are valid even as measures of knowledge, much less evidence of teacher validity?

None, of course. Which is why you see lawsuits by elective teachers pointing out it’s a tad unfair to be judged on the progress of students they’ve never actually met, much less taught. While many of those lawsuits get overturned as unfair but not unconstitutional, the idiocy of these efforts played no small part in why the newest version of the federal ESEA, the ESSA, killed the student growth measure (SGM) requirement.

So while proponents might argue that math and English score growth have some relationship to teacher quality in those subjects, they can’t really argue for testing all subjects. Sure, people can pretend (a la Common Core) that history and science teachers have an impact on reading skills, but we have no mechanism to, and are years away from, changing instruction and testing in these topics to require reading content and measuring the impact of that specific instruction in that specific topic. And again, that’s just reading. Not math, where it’s easy enough to test students on their understanding of math in science and history, but very difficult to tangle out where that instruction came from. Of course, this is only an issue after elementary school. See point one.

Abandoning false gods

For the past 20 years or so, school policy has been about addressing “preparation”, which explains the obsession with elementary school. Originally, the push for school improvement began in high school. Few people realize or acknowledge these days that the Nation at Risk, that polemic seen as groundbreaking by education reformers but kind of, um, duh? by any regular people who take the time to read it, was entirely focused on high school, as can be ascertained by a simple perusal of its findings and recommendations. Stop coddling kids with easy classes, make them take college prep courses! That’s the ticket. It’s the easy courses, the low high school standards that cause the problem. Put all kids in harder classes. And so we did, with pretty disastrous results through the 80s. Many schools began tracking, but Jeannie Oakes and disparate impact lawsuits put an end to that.

I’m not sure when the obsession with elementary school began because I wasn’t paying close attention to ed policy during the 90s. But at some point in the early 90s, it began to register that putting low-skilled kids in advanced high school classes was perhaps not the best idea, leading to either fraud or a lot of failing grades, depending on school demographics. And so, it finally dawned on education reformers that many high school students weren’t “academically prepared” to manage the challenging courses that they had in mind. Thus the dialogue turned to preparing “underserved” students for high school. Enter KIPP and all the other “no excuses” charters which, as I’ve mentioned many times, focus almost entirely on elementary school students.

In the early days of KIPP, the scores seemed miraculous. People were bragging that KIPP completely closed the achievement gap back then, rather than the more measured “slight improvement controlling for race and SES” that you hear today. Ed reformers began pushing for all kids to be academically prepared, that is hey! Let’s make sure no child is left behind! And so the law, which led to an ever increasing push for earlier reading and math instruction, because hey, if we can just be sure that all kids are academically prepared for challenging work by high school, all our problems will be fixed.

Except, alas, they weren’t. I believe that the country is nearing the end of its faith in the false god of elementary school test scores, the belief that the achievement gap in high school is caused simply by not sufficiently challenging black and Hispanic kids in elementary school. Two decades of increasing elementary scores to the point that they appear to have topped out, with nary a budge in high school scores, has given pause. Likewise, Rocketship, KIPP, and Success Academy have all faced questions about how their high-scoring students do in high school and college.

As I’ve said many times, high school is brutally hard compared to elementary school. The recent attempt to genuinely shove difficulty down earlier in the curriculum went over so well that the new federal law gave a whole bunch of education rights back to the states as an apology. Kidding. Kind of.

And so, back to VAM….Remember VAM? This is an essay about VAM. Well, all the objections I pointed out above–the problems with high school, the problems with specific subject teachers–were mostly waved away early on, because come on, folks, if we fix elementary school and improve instruction there, everything will fall into place! Miracles will happen. Cats will sleep with dogs. Just like the NCLB problem with 100% above average was waved away because hey, by then, the improvements will be sooooo wonderful that we won’t have to worry about the pesky statistical impossibilities.

I am not sure, but it seems likely that the fed’s relaxed attitude towards test scores has something to do with the abandonment of this false idol, which leads inevitably to the reluctant realization that perhaps The Nation At Risk was wrong, perhaps something else is involved with academic achievement besides simply plopping kids in the right classes. I offer in support the fact that Jerry Brown, governor of California, has remained almost entirely unscathed for shrugging off the achievement gap, saying hey, life’s a meritocracy. Who’s going to be a waiter if everyone’s “elevated” into some important job? Which makes me wonder if Jerry reads my blog.

So if teachers don’t make any difference and VAM is pointless, how come any yutz can’t become a teacher?

No one, ever, has argued that teachers don’t make any difference. What they do say is that individual teacher qualities make very little difference in student test scores and/or student academic outcomes, and the differences aren’t predictable or measurable.

If I may quote myself:

Teaching, like math, isn’t aspirin. It’s not medicine. It’s not a cure. It is an art enhanced by skills appropriate to the situation and medium, that will achieve all outcomes including success and failure based on complex interactions between the teachers and their audience. Treat it as a medicine, mandate a particular course of treatment, and hundreds of thousands of teachers will simply refuse to comply because it won’t cure the challenges and opportunities they face.

And like any art, teaching is not a profession that yields to market justice. Van Gogh died penniless. Bruces Dern and Davison are better actors than Chrisses Hemsworth and Evans, although their paychecks would never know it. Teaching, like art and acting, runs the range from velvet Elvis paint by numbers to Renoir, from Fast and Furious to Short Cuts. There are teaching superstars, and journeyman teachers, and the occasional lousy teacher who keeps working despite this–just as Rob Schneider still finds work, despite being so bad that Roger Ebert wrote a book about it.

Unlike art and acting, teaching is a government job. So while actors will get paid lots of money to pretend to be teachers, the job itself will never lead to the upside achieved by the private sector, despite the many stories about famous Korean tutors. Upside, practicing our craft won’t usually lead to poverty, except perhaps in North Carolina.

Most teachers understand this. It’s the outside world and the occasional short-termers who want teachers to be rewarded for excellence. Most teachers don’t support merit pay and vehemently oppose “student growth measures”.

The country appears to be moving towards a teacher shortage. I anticipate all talk of VAM to vanish. But if you want to improve teacher quality beyond its current much-better-than-it’s-credited condition, I suggest we consider limiting the scope of public education. Four of these five education policy proposals will do just that.

**************************************************************************
¹ I was writing this up in the comments section of Scott Alexander’s commentary on teacher VAM research, when I remembered I was behind on my post quota. What the heck. I’m turning this into a post. It’s a long answer, but not as long-winded as Scott Alexander, the one blogger who makes me feel brusque.

33 Comments | tags: Bill Bennett, Checker Finn, Nation at Risk, No Child Left Behind, Value-add, VAM | posted in politics, reform, testing

October 4, 2015

The Prima Donna Rock Star Tester Treatment

By educationrealist

I met with her the first time last Sunday a week before the SAT, mother looking on, and the conversation went something like this.

“I want to specialize in one test. Which one should I take?”

“Yeah, okay, back up a bit. You took SAT test prep over the summer, right?”

“Yeah, but I knew everything they told me. It didn’t help.”

“What’s your course load?” (she goes to a 50% Asian school.)

“I’m taking a history honors class now, but it’s my first. Precalc for math.”

“And your GPA? What colleges are you considering?”

Shrug. “3.8 or so. Colleges, I have no idea. But what I want to know is, should I specialize in the ACT or the SAT? And should I take the old one or the new one?”

“Do you have a target SAT score?”

“2000. What’s the equivalent in ACT? But I really think I should take the old SAT and be done.”

“Your last practice test was a 1400.” She winced. “Even if all colleges take the old SAT for 2016 admissions–something I find unlikely despite assurances to the contrary–I’m not sure how you can find the time to focus on improvement between now and January, the last sitting of the old test. Besides, why the hurry?”

She waved dismissively. “I want to be done with all this. I hate the SAT. Maybe I should specialize in the ACT. I don’t want to learn the new SAT.”

“Yeah, we’re back to this whole ‘pick a test’ thing. Let’s discuss something touchier. Are you frustrated by the difference between your school performance and your test performance?”

She got very still. “Yes.”

“When I see an academic profile significantly higher than a test score, the student usually mentions it first. I’ve met many kids, a lot of them girls, with a profile like yours. They’ll tell me that they really just want to improve, to get their score into a respectable range, and that they haven’t had good luck with test prep so far. I didn’t hear any of that from you. Instead it’s ‘gotta pick a test’, ‘need a 2000’ despite no college plans, without any acknowledgment of what must be a very disappointing practice history.”

I said all this as delicately as possible, but she was already surreptitiously wiping away tears.

“I don’t see your mom behind this. You’re causing your own pressure but are also very resistant to making more effort or exploring options.”

She started nodding before I finished, and her mom handed her a Kleenex. “I just think I’m wasting my time.”

“So let’s start there. Do you have trouble with school tests? No? How about your state tests? So it’s not a general testing problem, just big standardized tests. Is it nerves?”

She laughed, sadly. “No. My big problem is motivation.”

I snorfed involuntarily, and she looked up in shock. “Sorry. I’m not at all laughing at you. Just the idea that the kid I see in front of me barking orders like an executive suffers from motivation problems.”

The mother demurred here. “Well, her GPA is only a 3.8.”

“Forgive me, but you’re Chinese and prone to distortion on this point.” They’re American enough to laugh. “I see an articulate, bright, driven girl who appears to have an intellect that I would put conservatively three or four hundred points above this practice score. You are using that intellect in school. I don’t see an obvious motivation issue.”

“No, not in school. Not studying. When I’m testing–you know, like the practice tests? I lose all motivation.”

Well, hey now.

“Tell me if any of this is familiar: The test begins and you’re working away, feeling good. Then you run into a problem that you don’t know how to solve and suddenly, as you try to figure the problem out, everything seems pointless. You give up, make a guess, go on to the next problem. Except now you aren’t sure what to do with this one, either. Suddenly, nothing matters. You simply stop caring. I see by your face that I’m not off-base.”

“How did you know?”

“I’ve seen it before. I describe it as a sort of stress reaction.¹”

“I’m not nervous at all.”

“You should be so lucky. Jitters don’t usually affect performance. You get bored by stress. What happens, best I can tell after hearing many students describe the feeling, is that your brain shuts down to avoid feeling stress.”

My first case was a short, slight blond boy back before the SAT changes, so before 2005. I was going through his practice test explaining the missed problems, and he’d finish my sentences. That is, he knew how to do many of the problems he’d gotten incorrect on the test.

So why the high error count, I asked.

It was after I got bored, he replied. Once the boredom hit, he’d start to randomly bubble. I was aghast. He may as well have told me he sucked dead chickens’ eyeballs for candy, so incomprehensible was his behavior.

“So what you have to start doing, have to understand, is that you are a testing prima donna.”

“A prima donna?”

“You know how movie stars always order off-menu? Because they’re just too special for the pre-arranged menu that the rest of us use. Or the ballerinas or opera stars who simply refuse to be rushed, because they are artists. Or rock stars, the kind who make huge demands for their hotel rooms sometimes—Van Halen famously demanded brown M&Ms be removed from the candy bowl (yes, I know they had another reason, but her parents are never going to let her listen to Van Halen, so I’m safe). You need to be a prima donna rock star tester.”

“How?”

“Take two SAT sections daily, from the blue book. Use deadly serious test conditions. No music. No interruptions. No stopping the clock. No lying on the floor or on your bed. Sit at a table, door shut, start the timer.”

“That’s not even an hour.”

“And when the timer starts, I want you to take two minutes, at least, to go through the test and cherrypick. Circle the problems you’ll deign to do.”

“Um. What?”

“In math, pick and choose your problems. Circle the good ones. ‘This one, I shall do. This one, pah!’ Spit upon it. If you don’t instantly vibe to the question, avert your eyes and scratch an X next to that problem, which clearly must be for peasants and other little people. Can you do that?”

She giggled. “Really? What about reading?”

“Skip anything with long paragraphs that looks less desirable than root canal. You like sentence completions?”

“Yes!”

“Do them first, then evaluate each reading passage to determine whether or not Her Majesty–that’s you–is interested. Which part of the writing section do you like best, the paragraph at the end?”

“How do you know this?”

“Do those six questions at the end first. Then go back to the front. The second–I mean the second—you find a long sentence you can’t instantly decipher, that question OFFENDS you. Turn up your nose. Move on.”

“So that’s all I want for the week. Two sections. Vary the subject. Every night. Take them like a rock star looking at candy bowls to make sure there are no…oh, look there’s a brown M&M. Skip it.”

“But I might only want to do four or five questions a section.”

“Great. Do those. Then, oh, hey. You’ve still got 20 minutes to kill. What’ll help pass the time? Let’s look at the other questions to see if they hold any interest. You are a movie star stuck in Podunk, in search of decent dim sum.”

“But the whole thing is a lie. The problems I can’t do aren’t stupid.”

“Sure, but we need to fake out your psyche. You have a fragile testing temperament that must be coddled and swathed in protective coating.”

The mom was a bit stunned, but accepting. “So none of the strategies she learned in test prep?”

“Mom, they didn’t work anyway. But what if I don’t have enough time to go back and do the problems that bored me?”

“Then you will have spent a whole test section working on problems you can do. How is that worse?”

“But if I try to read the long passages, I know I will get bored.”

“Well, I have some ideas for that later, but for now, read the passages that meet with your approval, and do the questions. Then for the rest, amuse yourself with the peasant passages. Do the vocabulary questions. The ones with line numbers. Don’t read them if they bore you. Normally, you understand, I wouldn’t suggest this.”

“So practice that all week. Eat pizza, chocolate, noodles, sesame balls with red bean paste, whatever your favorite food is Friday night. Saturday, have a good breakfast and visualize rejecting all those peasant problems.”

“What if I get bored anyway?”

“That’s a very real possibility. At the first moment you identify boredom, put your pencil down. Take a breath. Remind yourself that while it’s scary, this boredom is a valuable opportunity to practice dealing with it. That it only feels like boredom. Do not give up. Do not let yourself randomly bubble. If you feel done and can’t fight off the boredom, put your head down and take a nap. Otherwise, go back to the test and look for test questions that pique your curiosity.”

“But you said I didn’t have to read the passages.”

“Sure. But don’t randomly bubble, or give up. Estimate. Eliminate known wrong answers. Guess based on the context. But if you can’t kick off the boredom and feel hopeless, take a rest until the next section.”

“And here’s the important part: under no conditions are you to worry about your score. You’re not there for the score. You’re there to practice being a rock star who picks and chooses her projects. We’ll do scores later, if you like.”

“That’s okay. I don’t think I’m going to improve now, so at least I might know why.”

“It’s helpful just to know what the problem is,” her mother agreed.

They actually smiled as I left, both noticeably less anxious than they were when I arrived.

Note: she’s a junior, and has no reason whatsoever to take the SAT in October. I tried to talk the mom out of that, but she was determined to keep the date. Ideally, I wouldn’t send a student to try out this method on a live test, but that was the only option.

Will it work, this refusal to tolerate brown M&Ms and uninviting questions? Typically, yes, although since I’ve cut back on tutoring I haven’t run into the prima donna tester in several years. The cases I remember always saw an instant boost of 100-150 points the first time they took the test in rock star mode. In every case, they were also mentally exhausted afterwards. They’d never worked the entire test before, having mentally checked out. Prima donnas are fixable. The ones who go into a fugue state, not so much. Fortunately, that’s even rarer.

I started to make a larger point, but it’s too complicated and, since returning this August I’ve vowed to post more. I had too many ideas piling up that just weren’t…perfect, and so I kept putting them off, even though each idea had more than enough for a post. Time for me to limit scope and bite off achievable chunks. Otherwise I’ll think I’m bored and don’t care when really I’m stressed out….hey. Good thing I don’t get like this for tests.

So don’t read too much into this beyond an interesting behavior that I’ve learned to treat. Don’t apply it to policy. Do I think some people underperform their abilities on tests? Yes, I do. Do I think that tests can be gamed by people whose essential intelligence is high on mimicry and memory, giving the impression of skills they don’t actually have? Yes, I do. Do I think tests are mostly accurate? Yes, for most people. It’s a big ol’ world out there. Many cases exist simultaneously.

Meanwhile, I hope all you testers out there did well yesterday. And if you know any fragile testing temperaments, give this strategy a try.

**********************************
¹ While writing this piece, I googled and learned that researchers call it stress, too.

9 Comments | tags: ACT, new SAT, test stress, testing strategy | posted in College Admissions, testing

April 16, 2015

Evaluating the New PSAT: Math

By educationrealist

Well, after the high drama of writing, the math section is pretty tame. Except the whole oh, my god, are they serious? part. Caveat: I’m assuming that the SAT is still a harder version of the PSAT, and that this is a representative test.

Metric	Old SAT	Old PSAT	ACT	New PSAT
Questions	54 (44 MC, 10 grid)	38 (28 MC, 10 grid)	60 MC	48 (40 MC, 8 grid)
Sections	1: 20 q, 25 m; 2: 18 q, 25 m; 3: 16 q, 20 m	1: 20 q, 25 m; 2: 18 q, 25 m	1: 60 q, 60 m	NC: 17 q, 25 m; Calc: 31 q, 45 m
MPQ	1: 1.25 mpq; 2: 1.39 mpq; 3: 1.25 mpq	1: 1.25 mpq; 2: 1.39 mpq	1 mpq	NC: 1.47 mpq; Calc: 1.45 mpq
Category	Number Operations; Algebra & Functions; Geometry & Measurement; Data & Statistics	Same	Pre-algebra; Algebra (elem & intermed.); Geometry (coord & plane); Trigonometry	1) Heart of Algebra; 2) Passport to Advanced Math; 3) Probability & Data Analysis; 4) Additional Topics in Math

It’s going to take me a while to fully process the math section. For my first go-round, I thought I’d point out the instant takeaways, and then discuss the math questions that are going to make any SAT expert sit up and take notice.

Format
The SAT and PSAT always gave an average of 1.25 minutes for multiple choice question sections. On the 18 question section that has 10 grid-ins, giving 1.25 minutes for the 8 multiple choice questions leaves 1.5 minutes for each grid in.

That same conversion doesn’t work on the new PSAT. However, both sections have exactly 4 grid-ins, which makes a nifty linear system. Here you go, boys and girls, check my work.

The math section that doesn’t allow a calculator has 13 multiple choice questions and 4 grid-ins, and a time limit of 25 minutes. The calculator math section has 27 multiple choice questions and 4 grid-ins, and a time limit of 45 minutes.

13x + 4y = 1500
27x + 4y = 2700

Flip them around and subtract for
14x = 1200
x = 85.714 seconds, or 1.42857 minutes. Let’s round it up to 1.43.
y = 96.428 seconds, or 1.607 minutes, which I shall round down to 1.6 minutes.

If–and this is a big if–the test is using a fixed average time for multiple choice and another for grid-ins, then each multiple choice question is getting a 14.4% boost in time, and each grid-in a 7% boost. But the test may be using an entirely different parameter.
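For anyone who wants to check the work without a pencil, here is a minimal Python sketch of the same linear system, solved with Cramer's rule in exact fractions. The function name solve_2x2 and the variable names are made up for illustration, and the whole thing rests on the same big "if": one fixed average time for multiple choice and another for grid-ins.

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("system has no unique solution")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y


# Section times in seconds: 25 minutes = 1500, 45 minutes = 2700.
# x = seconds per multiple choice question, y = seconds per grid-in.
mc_seconds, grid_seconds = solve_2x2(13, 4, 1500, 27, 4, 2700)

mc_minutes = mc_seconds / 60
grid_minutes = grid_seconds / 60

# Old-format baselines from the text: 1.25 minutes per multiple choice
# question and 1.5 minutes per grid-in.
mc_boost = mc_minutes / Fraction(5, 4) - 1
grid_boost = grid_minutes / Fraction(3, 2) - 1

print(f"multiple choice: {float(mc_seconds):.3f} s = {float(mc_minutes):.5f} min")
print(f"grid-in:         {float(grid_seconds):.3f} s = {float(grid_minutes):.3f} min")
print(f"multiple choice boost: {float(mc_boost):.1%}")
print(f"grid-in boost:         {float(grid_boost):.1%}")
```

Run on the unrounded values, the boosts come out to one seventh and one fourteenth, or roughly 14.3% and 7.1%. The small gap from the 14.4% figure comes from rounding x to 1.43 before dividing.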

Question Organization

In the old SAT and ACT, the questions move from easier to more difficult. The SAT and PSAT difficulty level resets for the grid-in questions. The new PSAT does not organize the problems by difficulty. Easy problems (there are only 4) are more likely to be at the beginning, but they are interlaced with medium difficulty problems. I saw only two Hard problems in the non-calculator section, both near but not at the end. The Hard problems in the calculator section are tossed throughout the second half, with the first one showing up at 15. However, the coding is inexplicable, as I’ll discuss later.

As nearly everyone has mentioned, any evaluation of the questions in the new test doesn’t lead to an easy distinction between “no calc” and “calc”. I didn’t use a calculator more than two or three times at any point in the test. However, the College Board may have knowledge about what questions kids can game with a good calculator. I know that the SAT Math 2c test is a fifteen minute endeavor if you get a series of TI-84 programs. (Note: Not a 15 minute endeavor to get the programs, but a 15 minute endeavor to take the test. And get an 800. Which is my theory as to why the results are so skewed towards 800.) So there may be a good organizing principle behind this breakdown.

That said, I’m doubtful. The only trig question on the test is categorized as “hard”. But the question is simplicity itself if the student knows any right triangle trigonometry, which is taught in geometry. But for students who don’t know any trigonometry, will a calculator help? If the answer is “no”, then why is it in this section? Worse, what if the answer is “yes”? Do not underestimate the ability of people who turned the Math 2c into a 15 minute plug and play to come up with programs to automate checks for this sort of thing.

Categories

Geometry has disappeared. Not just from the categories, either. The geometry formula box has been expanded considerably.

There are only three plane geometry questions on the test. One was actually an algebra question using the perimeter formula. Another is a variation question using a trapezoid’s area. Interestingly, neither rectangle perimeter nor trapezoid formula were provided. (To reinforce an earlier point, both of these questions were in the calculator section. I don’t know why; they’re both pure algebra.)

The last geometry question really involves ratios; I simply picked the multiple choice answer that had 7 as a factor.

I could only find one coordinate geometry question, barely. Most of the other xy plane questions were analytic geometry, rather than the basic skills that you usually see regarding midpoint and distance–both of which were completely absent. Nothing on the Pythagorean Theorem, either. Freaky deaky weird.

When I wrote about the Common Core math standards, I mentioned that most of geometry had been pushed down into seventh and eighth grade. In theory, anyway. Apparently the College Board thinks that testing geometry will be too basic for a test on college-level math? Don’t know.

Don’t you love the categories? You can see which ones the makers cared about. Heart of Algebra. Passport to Advanced Math! Meanwhile, geometry and the one trig question are stuck under “Additional Topic in Math”. As opposed to the “Additional Topic in History”, I guess.

Degree of Difficulty

I worked the new PSAT test while sitting at a Starbucks. Missed three on the no-calculator section, but two of them were careless errors due to clatter and haste. In one case I flipped a negative in a problem I didn’t even bother to write down, in the other I missed a unit conversion (have I mentioned before how measurement issues are the obsessions of petty little minds?)

The one I actually missed was a function notation problem. I’m not fully versed in function algebra and I hadn’t really thought this one through. I think I’ve seen it before on the SAT Math 2c test, which I haven’t looked at in years. Takeaway— if I’m weak on that, so are a lot of kids. I didn’t miss any on the calculator section, and I rarely used a calculator.

But oh, my lord, the problems. They aren’t just difficult. The original, pre-2005 SAT had a lot of tough questions. But those questions relied on logic and intelligence—that is, they sought out aptitude. So a classic “diamond in the rough” who hadn’t had access to advanced math could still score quite well. Meanwhile, on both the pre and post 2005 tests, kids who weren’t terribly advanced in either ability or transcript faced a test that had plenty of familiar material, with or without coaching, because the bulk of the test is arithmetic, algebra I, and geometry.

The new PSAT and, presumably, the SAT, is impossible to do unless the student has taken and understood two years of algebra. Some will push back and say oh, don’t be silly, all the linear systems work is covered in algebra I. Yeah, but kids don’t really get it then. Not even many of the top students. You need two years of algebra even as a strong student, to be able to work these problems with the speed and confidence needed to get most of these answers in the time required.

And this is the PSAT, a test that students take at the beginning of their junior year (or sophomore, in many schools), so the College Board has created a test with material that most students won’t have covered by the time they are expected to take the test. As I mentioned earlier, California alone has nearly a quarter of a million sophomores and juniors in algebra and geometry. Will the new PSAT or the SAT be able to accurately assess their actual math knowledge?

Key point: The SAT and the ACT’s ability to reflect a full range of abilities is an unacknowledged attribute of these tests. Many colleges use these tests as placement proxies, including many, if not most or all, of the public university systems.

The difficulty level I see in this new PSAT makes me wonder what the hell the organization is up to. How can the test reveal anything meaningful about kids who a) haven’t yet taken algebra 2 or b) have taken algebra 2 but didn’t really understand it? And if David Coleman’s answer is “Those testers aren’t ready for college so they shouldn’t be taking the test” then I have deep doubts that David Coleman understands the market for college admissions tests.

Of course, it’s also possible that the SAT will yield the same range of scores and abilities despite being considerably harder. I don’t do psychometrics.

Examples:

Here’s the function question I missed. I think I get it now. I don’t generally cover this degree of complexity in Precalc, much less algebra 2. I suspect this type of question will be the sort covered in new SAT test prep courses.

These two are fairly complicated quadratic questions. The question on the left reveals that the SAT is moving into new territory; previously, SAT never expected testers to factor a quadratic unless a=1. Notice too how it uses the term “divisible by x” rather than the more common term, “x is a factor”. While all students know that “2 is a factor of 6” is the same as “6 is divisible by 2”, it’s not a completely intuitive leap to think of variable factors in the same way. That’s why we cover the concept–usually in late algebra 2, but much more likely in pre-calc. That’s when synthetic division/substitution is covered–as I write in that piece, I’m considered unusual for introducing “division” of this form so early in the math cycle.

The question on the right is a harder version of an SAT classic misdirection. The test question doesn’t appear to give enough information, until you realize it’s not asking you to identify the equation and solve for a, b, and c–just plug in the point and yield a new relationship between the variables. But these questions always used to show up in linear equations, not quadratics.

That’s the big news: the new PSAT is pushing quadratic fluency in a big way.

Here, the student is expected to find the factors of 1890:

This is a quadratic system. I don’t usually teach these until Pre-Calc, but then my algebra 2 classes are basically algebra one on steroids. I’m not alone in this.

No doubt there’s a way to game this problem with the answer choices that I’m missing, but to solve this in the forward fashion you either have to use the quadratic formula or, as I said, find all the factors of 1890, which is exactly what the answer document suggests. I know of no standardized test that requires knowledge of the quadratic formula. The old school GRE never did; the new one might (I don’t coach it anymore). The GMAT does not require knowledge of the quadratic formula. It’s possible that the CATs push a quadratic formula question to differentiate at the 800 level, but I’ve never heard of it. The ACT has not ever required knowledge of the quadratic formula. I’ve taught for Kaplan and other test prep companies, and the quadratic formula is not covered in most test prep curricula.
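To give a sense of how much work "find all the factors of 1890" is by hand, here is a minimal Python sketch that lists the factor pairs by trial division. The function name factor_pairs is hypothetical, and the sketch does not try to pick out which pair the PSAT question wants, only how many candidates a tester has to sift through.

```python
def factor_pairs(n):
    """Return all pairs (a, b) of positive integers with a <= b and a * b == n."""
    pairs = []
    a = 1
    # Only check divisors up to the square root; each one found gives its partner.
    while a * a <= n:
        if n % a == 0:
            pairs.append((a, n // a))
        a += 1
    return pairs


pairs = factor_pairs(1890)
print(len(pairs), "factor pairs of 1890")
for a, b in pairs:
    print(a, "x", b)
```

Since 1890 = 2 × 3³ × 5 × 7, it has 32 divisors and so 16 factor pairs, which is a lot to wade through on a timed test.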

Here’s one of the inexplicable difficulty codings I mentioned–this is coded as of Medium difficulty.

As big a deal as that is, this one’s even more of a shock: a quadratic and linear system.

The answer document suggests putting the quadratic into vertex form, then plugging in the point and solving for a. I solved it with a linear system. Either way, after solving the quadratic you find the equation of the line and set them equal to each other to solve. I am….stunned. Notice it’s not a multiple choice question, so no plug and play.

Then, a negative 16 problem–except it uses meters, not feet. That’s just plain mean.

Notice that the problem gives three complicated equations. However, those who know the basic algorithm (h(t)=-4.9t² + v₀t + s₀) can completely ignore the equations and solve a fairly easy problem. Those who don’t know the basic algorithm will have to figure out how to coordinate the equations to solve the problem, which is much more difficult. So this problem represents dramatically different levels of difficulty based on whether or not the student has been taught the algorithm. And in that case, the problem is quite straightforward, so should be coded as of Medium difficulty. But no, it’s tagged as Hard. As is this extremely simple graph interpretation problem. I’m confused.
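For readers who haven't seen the metric version of the "negative 16" model, here is a rough Python sketch of h(t) = -4.9t² + v₀t + s₀. The function names and the sample values for v₀ and s₀ are made up for illustration; they are not taken from the test question.

```python
G_HALF = 4.9  # half of g = 9.8 m/s^2, the metric counterpart of the 16 in -16t^2

def height(t, v0, s0):
    """Height in meters after t seconds: h(t) = -4.9t^2 + v0*t + s0."""
    return -G_HALF * t**2 + v0 * t + s0

def time_of_max_height(v0):
    """Vertex of the parabola: t = -b / (2a) = v0 / 9.8."""
    return v0 / (2 * G_HALF)

# Hypothetical launch: 19.6 m/s upward from 1.5 m above the ground.
v0, s0 = 19.6, 1.5
t_peak = time_of_max_height(v0)
print(t_peak, height(t_peak, v0, s0))
```

A student who has been taught this model only needs the initial velocity and starting height, which is why the problem is so much easier for them.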

Recall: if the College Board keeps the traditional practice, the SAT will be more difficult.

So this piece is long enough. I have some thoughts–rather, questions–on what on earth the College Board’s intentions are, but that’s for another post.

tl;dr Testers will get a little more time to work much harder problems. Geometry has disappeared almost entirely. Quadratics beefed up to the point of requiring a steroids test. Inexplicable “calc/no calc” categorization. College Board didn’t rip off the ACT math section. If the new PSAT is any indication, I do not see how the SAT can be used by the same population for the same purpose unless the CB does very clever things with the grading scale.

42 Comments | tags: 2016 SAT, College Board, David Coleman, new PSAT, new SAT | posted in College Admissions, testing

April 12, 2015

Evaluating the New PSAT: Reading and Writing

By educationrealist

The College Board has released a new practice PSAT, which gives us a lot of info on the new SAT. This essay focuses on the reading and writing sections.

As I predicted in my essay on the SAT’s competitive advantage, the College Board has released a test that has much in common with the ACT. I did not predict that the homage would go so far as test plagiarism.

This is a pretty technical piece, but not in the psychometric sense. I’m writing this as a long-time coach of the SAT and, more importantly, the ACT, trying to convey the changes as I see them from that viewpoint.

For comparison, I used these two sample ACTs, this practice SAT (old version), and this old PSAT.

Reading

The old SAT had a reading word count of about 2800 words, broken up into eight passages. Four passages were very short, just 100 words each. The longest was 800 words. The PSAT reading count was around 2000 words in six passages. This word count is reading passages only; the SAT has 19 sentence completions to the PSAT’s 13.

So SAT testers had 70 minutes to complete 19 sentence completions and 47 questions over eight passages of 2800 words total. PSAT testers had 50 minutes to complete 13 sentence completions and 27 questions over six passages of 2000 words total.

The ACT has always had 4 passages averaging 750 words, giving the tester 35 minutes to complete 40 questions (ten for each passage). No sentence completions.

Comparisons are difficult, but if you figure about 45 seconds per sentence completion, you can deduct that from the total time and come up with two rough metrics comparing reading passages only: minutes per question and words per question (on average, how many words is the tester reading to answer the questions).

Metric	Old SAT	Old PSAT	ACT	New PSAT
Word Count	2800	2000	3000	3200
Passage Count	8	6	4	5
Passage Length	100-850	100-850	750	500-800
MPQ	1.18	1.49	0.88	1.27
WPQ	59.57	74.07	75	69.21
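The two metrics can be recomputed from the minutes, question counts, and word counts given above. This is a minimal sketch assuming the 45-second allowance per sentence completion; the function and variable names are hypothetical. The new PSAT is left out because its question count and time limit are not stated in this piece.

```python
SECONDS_PER_COMPLETION = 45  # the rough allowance assumed in the text

def reading_metrics(minutes, completions, passage_questions, words):
    """Return (minutes per question, words per question) for passage questions only."""
    passage_minutes = minutes - completions * SECONDS_PER_COMPLETION / 60
    return passage_minutes / passage_questions, words / passage_questions

# (total minutes, sentence completions, passage questions, passage words)
sections = {
    "Old SAT": (70, 19, 47, 2800),
    "Old PSAT": (50, 13, 27, 2000),
    "ACT": (35, 0, 40, 3000),
}
for name, args in sections.items():
    mpq, wpq = reading_metrics(*args)
    print(f"{name}: {mpq:.2f} min/question, {wpq:.2f} words/question")
```

For the ACT, which has no sentence completions, the pace is simply 35/40 = 0.875 minutes per question.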

I’ve read a lot of assertions that the new SAT reading text is more complex, but my brief Lexile analysis on random passages in the same category (humanities, science) showed the same range of difficulty and sentence lengths for old SAT, current ACT, and old and new PSAT. Someone with more time and tools than I have should do an in-depth analysis.

Question types are much the same as the old format: inference, function, vocabulary in context, main idea. The new PSAT requires the occasional figure analysis, which the College Board will undoubtedly flaunt as unprecedented. However, the College Board doesn’t have an entire Science section, which is where the ACT assesses a reader’s ability to evaluate data and text.

Sentence completions are gone, completely. In passage length and overall reading demands, the new PSAT is remarkably similar in structure and word length to the ACT. This suggests that the SAT is going to be even longer? I don’t see how, given the time constraints.

tl;dr: The new PSAT reading section looks very similar to the current ACT reading test in structure and reading demands. The paired passage and the question types are the only holdover from the old SAT/PSAT structure. The only new feature is actually a cobbled up homage to the ACT science test in the form of occasional table or graph analysis.

Writing

I am so flummoxed by the overt plagiarism in this section that I seriously wonder if the test I have isn’t a fake, designed to flush out leaks within the College Board. This can’t be serious.

The old PSAT/SAT format consisted of three question types: Sentence Improvements, Identifying Sentence Errors, and Paragraph Improvements. The first two question types presented a single sentence. In the first case, the student would identify a correct (or improved) version or say that the given version was best (option A). In the ISEs, the student had to read the sentence cold with no alternatives and indicate which, if any, underlined word or phrase was erroneous (much, much more difficult; option E was no change). In Paragraph Improvements, the reader had to answer grammar or rhetoric questions about a given passage. All questions had five options.

The ACT English section is five passages running down the left hand side of the page, with underlined words or phrases. As the tester goes along, he or she stops at each underlined section and looks to the right for a question. Some questions are simple grammar checks. Others ask about logic or writing choices—is the right transition used, is the passage redundant, what would provide the most relevant detail. Each passage has 15 questions, for a total of 75 questions in 45 minutes (9 minutes per passage, or 36 seconds per question). The tester has four choices and the “No Change” option is always A.

The new PSAT/SAT Writing/Language section is four passages running down the left hand side of the page, with underlined words or phrases. As the tester goes along, he or she stops at each underlined section and looks to the right for a question. Some questions are simple grammar checks. Others ask about logic or writing choices—is the right transition used, is the passage redundant, what would provide the most relevant detail. Each passage has 11 questions, for a total of 44 questions in 35 minutes (about 8.75 minutes per passage or 47 seconds a question). The tester has four choices and the “No Change” option is always A.

Oh, did I forget? Sometimes the tester has to analyze a graph.
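The pacing figures for the two tests follow directly from the question counts and time limits quoted above. A quick sketch, with a hypothetical helper name:

```python
def pace(questions, minutes, passages):
    """Return (minutes per passage, seconds per question)."""
    return minutes / passages, minutes * 60 / questions

# (questions, minutes, passages) as described in the text
formats = {
    "ACT English": (75, 45, 5),
    "New PSAT Writing/Language": (44, 35, 4),
}
for name, args in formats.items():
    per_passage, per_question = pace(*args)
    print(f"{name}: {per_passage:.2f} min/passage, {per_question:.1f} sec/question")
```

So the new format copies the layout while giving the tester roughly a third more time per question.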

The College Board appears to have simply stolen not only the structure, but various common question types that the ACT has used for years—as long as I’ve been coaching the test, which is coming on for twelve years this May.

I’ll give some samples, but this isn’t a random thing. The entire look and feel of the ACT English test has been copied wholesale—I’ll add “in my opinion” but don’t know how anyone could see this differently.

Writing Objective:

Best Conclusion: ACT vs. New PSAT
Add Best Detail: ACT vs. New PSAT
Best Transitioning for Paragraph: ACT vs. New PSAT

Style and Logic:

Logical Placement: ACT vs. New PSAT
Check for Conciseness/Redundancy: ACT vs. New PSAT
Check for Formal Style: ACT vs. New PSAT
Logically Correct Transition (cause, contrast, continuation): ACT vs. New PSAT
Writer Choice: ACT vs. New PSAT

Grammar/Punctuation:

Dash vs. Semicolon: ACT vs. New PSAT
The apostrophe/plural check: ACT vs. New PSAT

tl;dr: The College Board ripped off the ACT English test. I don’t really understand copyright law, much less plagiarism. But if the American College Test company is not considering legal action, I’d love to know why.

The PSAT reading and writing sections don’t ramp up dramatically in difficulty. Timing, yes. But the vocabulary load appears to be similar.

The College Board and the poorly informed reporters will make much of the data analysis questions, but I hope to see any such claims addressed in the context of the ACT’s considerably more challenging data analysis section. The ACT should change the name; the “Science” section only uses science contexts to test data analysis. All the College Board has done is add a few questions and figures. Weak tea compared to the ACT.

As I predicted, the College Board has definitely chosen to make the test more difficult for gaming. I’ve been slowly untangling the process by which someone who can barely speak English is able to get a high SAT verbal and writing score, and what little I know suggests that all the current methods will have to be tossed. Moving to longer passages with less time will reward strong readers, not people who are deciphering every word and comparing it to a memory bank. And the sentence completions, which I quite liked, were likely being gamed by non-English speakers.

In writing, leaving the plagiarism issue aside for more knowledgeable folk, the move to passage-based writing tests will reward English speakers with lower ability levels and should hurt anyone with no English skills trying to game the test. That can only be a good thing.

Of course, that brings up my larger business question that I addressed in the competitive advantage piece: given that Asians show a strong preference for the SAT over the ACT, why would Coleman decide to kill the golden goose? But I’ll put big picture considerations aside for now.

Here’s my evaluation of the math section.

15 Comments | tags: ACT, College Board, PSAT, reading test, writing test | posted in College Admissions, testing
