Category Archives: testing

Killing My Own Snakes

When I was hired to teach at Southeastern in May, 1979, the Academic Dean at the time gave me only two pieces of advice: “Make your own way,” and “Kill your own snakes.” -Steven Fettke

One of the most valuable pieces of advice I received, from two different teachers in two different years (student teaching, first year), was that a new teacher had to know what “quiet” is.  If kids wouldn’t shut up, then kick them out until finally, the teacher experiences….silence. Without that baseline, a new teacher has no gauge to assess the ambient classroom noise.

I began teaching as a better than average classroom manager, and somewhat shrugged this wisdom off until I got the advice the second time after five particularly troublesome geometry students wouldn’t shut up during an entire lesson. So the next day, I warned them once and then tossed one then another off to the office. After two were gone, the other three realized I was serious and shut up, after growling a bit about unfairness. Turning back to the board, I suddenly heard…..silence. Utter, attentive, silence. And from that point on, I knew what silence was, and what to expect when I demanded it.

As a mentor, I always advise new teachers to err on the side of excess with disruptive students. If they have an entire class out of control, ask for help. If they have a few students misbehaving, toss them out after a warning. Screw fair. Get silence. Know what it sounds like.

New teachers are often fearful of  sending students out. They worry that administrators will judge them. They’re right to worry. Administrators often notice. At my last job, the volume of my referrals was  a constant source of tension.  In really poorly managed schools, the admins refuse to accept students and send them back. (Note: leave that school.)

This is where mentors come in. Mentors can, and should, give balance to new teachers. My induction mentor’s support and acknowledgement of my unimaginably disruptive students finally forced administrators to take action. If the teacher is weak, by all means help shore up the crumbles. But in the meantime, encourage the teacher to boot students who disrupt teaching time. I get impatient with people who bleat that removing kids from the class is depriving them of education. All students deserve an education. Students who are determined to prevent that can step outside.

In my experience, novice teachers stuck with unusually unruly students will improve their management skills if given the opportunity to remove the disruptors. As time goes on, these teachers will improve their handling of rambunctious students. Part of that improvement involves knowing what silence sounds like.

So new teachers should not try to kill all their snakes, particularly given the likelihood that they’ll have the toughest students.

I assume most teachers kill their own snakes after the first few years. But I’m often amazed at what senior teachers will tolerate. Sample statements, followed by my (usually unspoken) response.

“I’m teaching an Algebra 10-12 class, and the kids start packing up their stuff with fifteen minutes to the bell. Does that ever happen to you? What do you do to prevent that?”

I tell them to unpack their damn books and get back to work. Right now. And if they don’t start moving right away, oh my goodness, pop quiz.

“I’ve been having so much trouble with kids using cell phones constantly in class, not paying attention at all. What do you do?”

I take their damn cellphones away, giving myself extra points if I can swipe it from under their nose without signaling intent. Students who can’t keep off their phones lose them until the end of the day instead of the end of class. And they don’t dare complain, because I can always hand it over to the administrators, whose penalties are far more stringent.

“I have these two kids who constantly talk to each other, but when I try to separate them, they insist on sitting together. It’s so frustrating.”

Why the hell do you give them a choice? Tell them where to sit. In fact, tell everyone where to sit.

“I tell the kids not to bring food to the class, but what do you do when they’ve just bought lunch?”

You take the lunch away and tell them they can enjoy it cold later.

“I’ve tried taking away phones/telling them where to sit/taking their lunch but they refuse to give it over, and I don’t know what to do.”

You call and have them removed from the class.

“What? For something so minor?”

Listen well, little teachlings. Defiance of a teacher is not minor. It’s one of the few snakes that even experienced teachers should hand off to an administrator if they can’t convince the student to comply. Give the kid a chance to walk back. Offer alternatives. Draw a line, though, and if the line gets crossed, have the kid removed for the day.

And of course, logistics get in the way sometimes. More than once, I’ve picked up the phone to call for a supervisor to come take a defiant kid away–and no one answers the damn phone. So I have to call another number. Sometimes no one answers. All that drama and then….man, turning back around to face the class really sucks.

But well over half the time, simply picking up the phone has results, and the defiant one says something like “Well, you want me to give up my lunch AND my drink! No way!” and I say quickly, “No. Just the lunch. I insist on the lunch!” which leads to “Oh, I thought you wanted my drink, too. OK, have my lunch. BUT I KEEP MY DRINK!”

Other times,  the troublesome kid smirks. “Ha, ha, you can’t catch me, copper!” Shrug. Just shrug. And then later, call again, after the smirker has forgotten all about it, and have him pulled from the room, protesting. Don’t gloat. Just go on with the lesson like this is no big deal.

 

So you might be reading all this saying, wow, Ed’s a tyrant. Which is hysterical, because I’m one of the loosest teachers you’ll ever run into. Remember, I don’t assign homework. My kids sit in groups. I have a non-existent detention rate, the lowest in the school. I rarely give an F grade.  To my considerable pride, I’ve gotten the coolest of the Student Nominations three years running (best story teller, most unpredictable, most dramatic).  My classes are noisy and boisterous affairs. In many ways, my classroom environment is a progressive’s dream, the kind of place that Ed Boland dreamed of having before he realized he hated students.

I have five rules, handwritten seven years ago on still bright yellow poster paper. Students should avoid:

  1. arguing with the ref (me)
  2. eating, drinking, or grooming
  3. setting objects airborne
  4. travelling without consent
  5. incessant yammering

But bottom line, do what I tell you.  My lines are very clearly marked, albeit occasionally negotiable. Just pay close attention to when I say “when”. As  I tell my kids every year at syllabus time: in order for “all this”–school, teaching, classroom environment–to work, I have to be in charge. Students have to obey my direct orders.

I realize that many teachers feel that schools already exert a great deal of control over student lives. They feel that rules about eating, phones, and seating are an unfair imposition. These same teachers often feel that “consequences” must be “deserved”, that their restrictions on those who have made bad choices are somehow more reasonable.

Shrug. I’m not saying there’s only one way. Other teachers can make their own choices. Me, I avoid morality plays. I don’t talk about what students deserve or earn, simply about what helps me teach and others learn.  I handle even cheating as a pragmatic issue, not a value judgment.

From students’ perspective, their least favorite of my management techniques is my yelling, specifically calling out or putting a student on blast. They prefer teachers who rebuke quietly and in private. But they also agree that when you aren’t the one being called out, it’s fun to watch me rant.

As I invariably mention when going through the syllabus, the only action a student can take to earn a permanent black mark is deliberate cruelty to another student. I will punish that and I’m much better at being mean.

Note that I prohibit being mean to other students.  Nowhere in my rules is it verboten to be mean to me, the teacher.

At least once a year, I (usually inadvertently) get a student furious, and the exchange goes something like this:

Student: “F*** YOU!!!!”

Me, unfussed and occasionally confused: “Sit down.”

Student: “NO!!! You F******* *****! F*** YOU!! F*** OFF”

Me: “Sit down.”

Student, walking to the door: “NO WAY. EAT SH**. I’m OUT! YOU #*@#W%@#W%!”

Me: “DO NOT WALK OUT THAT DOOR!”

Student: “WHY NOT?”

Me: “BECAUSE UP TO NOW, YOU HAVEN’T DONE ANYTHING WRONG!”

This usually stops the student for a minute or so, giving me a chance to calm things down. In every case, after a brief talk with a fascinated class watching on, the student sits back down and everyone gets back to work. Show’s over.

Which is not to say I let students take nasty potshots at me. Like I said, I’m much better at being mean than your average adolescent. But I don’t demand respectful behavior, and don’t get upset at rudeness.  This will not come as a shock to people who know me online.

Look. Teaching is very much an expression of personality.  Mine is a teacher-centered classroom. But nowhere is it written that teacher-centered classrooms must be ruthlessly controlled environments of churchlike stillness.  My classroom is, like me, loud and often disorderly, friendly, sarcastic. It sometimes changes on a dime. But its purpose is always there, driving things along, moving everyone forward.

New teachers: does your classroom environment reflect your personality, your values? Experienced teachers: are you setting rules that matter? Are you sure?

 


The Challenge of Black Students and Advanced Placement

When the bell rings at Wheaton North High School, a river of white students flows into Advanced Placement classrooms. A trickle of brown and black students joins them. —The Challenge of Creating Schools That Work for Everybody, Catherine Gewertz

Gewertz’s piece is one of a million or so outlining the earnest efforts of suburban schools to increase their  black and Hispanic student representation in AP classes. And indeed, these efforts are real and neverending. I have been in two separate schools that have been mandated in no uncertain terms to get numbers up.

But the data does not suggest underrepresentation. I’m going to focus on African American representation for a few reasons. Until recently, the College Board split up Hispanic scores into three categories, none of them useful, and it’s a real hassle to combine them. Moreover, the Hispanic category has an ace in the hole known as the Spanish Language test. Whenever you see someone boasting of great Hispanic AP scores, ask how well they did in non-language courses. (Foreign language study has largely disappeared as a competitive endeavor in the US. It’s just a way for Hispanic students to get one good test score, and Chinese students to add one to their arsenal.)

College Board data goes back twenty years, so I built a simple table:

blkaptable

I eliminated foreign language tests and those that didn’t exist back in 1997. It’s pretty obvious from the table that the mean scores for each test have declined in almost every case:

blkapmeanscorechg


While the population for each test has increased, the growth has been lopsided.

blkapgrowthbytest

It’s not hard to see the pattern behind the increases. The high-growth courses are one-offs with no prerequisites. It’s hard to convince kids to take these courses year after year–even harder to convince suburban teachers to lower their standards for that long. So put the kids in US History, Government–hey, it’s short, too!– and Statistics, which technically requires Algebra II, but not really.

The next three charts show data that isn’t often compiled for witnesses. I’m not good at presenting data, so there might be better means of presenting this. But the message is clear enough.

First,  here’s the breakdown behind the test growth. I took the growth in each score category (5 high, 1 low) and determined its percentage of the overall growth.
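The calculation described above can be sketched in a few lines. This is only an illustration: the function name and the counts are hypothetical, made up to show the arithmetic, and are not College Board figures.

```python
def growth_share_by_score(counts_start, counts_end):
    """Return each score category's share of the overall growth.

    Both arguments map an AP score (1-5) to the number of exams
    that earned that score in the given year.
    """
    growth = {score: counts_end[score] - counts_start[score]
              for score in counts_start}
    total_growth = sum(growth.values())
    if total_growth == 0:
        raise ValueError("no overall growth to apportion")
    return {score: growth[score] / total_growth for score in growth}


# Hypothetical counts for a single test (assumed, not real data).
example_1997 = {1: 100, 2: 150, 3: 120, 4: 60, 5: 20}
example_2016 = {1: 900, 2: 550, 3: 320, 4: 160, 5: 70}

shares = growth_share_by_score(example_1997, example_2016)
for score in sorted(shares):
    print(f"score {score}: {shares[score]:.0%} of growth")
```

With these invented numbers, the 1s account for just over half of all growth, which is the kind of pattern the chart is meant to expose.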

blkapscoredistributiongrowth

See all that blue? Most of the growth has been taken up by students getting the lowest possible score. Across the academic test spectrum, black student growth in 5s and 4s is anemic compared to the robust explosion of failing 1s and 2s. Unsurprisingly, the tests that require a two to three year commitment have the best performance. Calc AB has real growth in high scores–but, alas, even bigger growth in low scores. Calc BC is the strongest performance. English Lang & Comp has something approaching a normal distribution of scores, even.

Here you can see the total scores by test and category. Calc BC and European History, two of the tests with the smallest growth, have the best distributions. Only four tests have the most scores in the 1 category; most have 2 as their modal score.

blkap1997

The same chart in 2016 is pretty brutally slanted. Eight tests now fail most students with a one, just four have a two. Worst is the dramatic drop in threes. In 1997, test percentages with 3 scores ranged from 10-38%. In 2016, they range from 10-20%. Meanwhile, the 4s and 5s are all well below 10%, with the cheery exception of Calculus BC.

blkap2016

Jay Mathews’ relentless and generally harmful push of Advanced Placement has been going strong since the 80s, even if the Challenge Index only began in 1998. So 1997’s results include a decade of “AP push”. But the last 20 years have been even worse, as Jay, Newsweek, and the Washington Post all hawked the Index as a quality signifier: America’s Best High Schools! Suddenly, low-achieving, high-minority schools had a way to bring themselves some pride–just put their kids in AP classes.

As I wrote a couple years ago, this effort wasn’t evenly distributed. High achieving, diverse suburban high schools couldn’t just dump uninterested, low-achieving students (of any race) into a class filled with actually qualified students (of any race). Low achieving schools, on the other hand, had nothing to lose. Just dub a class “Advanced Placement” and put some kids in it. Most states cover AP costs, often using federal Title I dollars, so it’s a cheap way to get some air time.

African American AP test scores don’t represent a homogeneous population, and you can see that in the numbers.  Black students genuinely committed to academic achievement in a school with equally committed peers and qualified teachers are probably best reflected in the Calculus BC scores, as BC requires about four years of successful math. Black students dumped in APUSH and AP Government  are the recourse of diverse suburban schools not rich enough to ignore bureaucratic pressure to up their AP diversity.  They are taking promising students with low motivation and putting them in AP classes. This annoys the hell out of the parents and kids who genuinely want the rigorous course, and quite often angers the “promising” students, who are known to fail the class and refuse to take the test. The explosion of 1s across the board comes from the low-achieving urban schools who want to make the Challenge Index and don’t have any need to keep the standards high.

Remember each test costs $85 and test fees are waived by taxpayers for students who can’t afford them.  Consider all the students being forced, in many cases, to take classes they have no interest in.  Those smaller increases in passing scores are purchased with considerable wasted time and taxpayer expense.

But none of this should be news. Let’s talk about the real challenge of black students and AP scores and methods to fix the abuses.

First, schools and students should be actively restricted from using the AP grade “boost” for fraudulent purposes. The grades should be linked to the test scores without exception. Students who receive 4s and 5s get an A, even if the teacher wants to give a B[1]. Students who get a 3 receive a B, even if the teacher wants to give an A[2]. Students who get a 2 receive a C. Students who get a 1 or who don’t take the test get a D–which, remember, will be bumped to a C for GPA purposes. This sort of grade link, first suggested by Saul Geiser (although I’ve extended it to the actual high school grade) would dramatically reduce abuse not only by predominantly minority schools, but also by all students gaming the AP system to get inflated GPAs. That should reduce a lot of the blue in this picture:

blkapscoredistributiongrowth
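The grade link proposed above is simple enough to write down as a lookup. The sketch below is an illustration only: the function names are hypothetical, and the 4-point scale with a one-point AP bonus is an assumption about how the GPA “bump” works, chosen because it matches the post’s statement that a D is bumped to a C.

```python
# Assumed unweighted 4-point scale; not stated in the post.
LETTER_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1}


def linked_grade(ap_score):
    """Map an AP exam score (1-5, or None if the test was skipped)
    to the course letter grade under the proposed grade link."""
    if ap_score is None or ap_score == 1:
        return "D"
    if ap_score == 2:
        return "C"
    if ap_score == 3:
        return "B"
    if ap_score in (4, 5):
        return "A"
    raise ValueError(f"not a valid AP score: {ap_score!r}")


def weighted_gpa_points(ap_score, ap_bonus=1):
    """GPA points after the AP bump (assumed to be one full letter)."""
    return LETTER_POINTS[linked_grade(ap_score)] + ap_bonus


for score in (5, 3, 2, 1, None):
    print(score, linked_grade(score), weighted_gpa_points(score))
```

Under these assumptions a 3 on the test yields a B that counts as an A for GPA purposes, and a skipped test yields a D that counts as a C.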

Then we should ask a simple question: how can we bump those yellows to greys? That is, how can we get the students who demonstrated enough competence to score a 2 on the AP test to get enough motivation and learning to score a 3?

I’ve worked in test prep for years with underachieving blacks and Hispanics, and I now teach a lot of the kids not strong enough or not motivated enough to take AP classes. My school is under a great deal of pressure to get more low income, under-represented minorities in these classes as well (and my school administration is entirely non-white, as a data point). A couple years ago, I taught a US History course that resulted in four kids being “tagged” for an advanced placement class the next year–that is, they did so well in my class, having previously shown no talent or motivation, that they were put in AP Government the next year. I kept in touch with one, who got an A in the class and passed the test.

My advice to my own principal, which I would repeat to the principal in Gewertz’s piece, is to create a class full of the promising but unmotivated students, separate from the motivated students. Give them a teacher who will be rigorous but low key, who won’t give much homework, who will focus on skill improvement in class. (ahem. I’m raising my hand.) Focus on getting the kids to pass the test. If they pass, they will get a guaranteed B in the class, which will count as an A for GPA purposes. (Even if the College Board doesn’t change the rules, schools can guarantee this policy.)

This strategy would work for advanced placement classes in English, history, government, probably economics.  It could work for statistics. Getting unmotivated kids to pass AP Calculus may be more difficult, as it would involve using the strategy consistently for 3 years with no test to guarantee a grade.

The challenge of increasing the abilities and college-readiness of promising but not strongly motivated students (of any race) lies in understanding their motives. Teachers need to give their first loyalty to the students, not the content. Traditional AP teachers are reluctant to do this, and I don’t think they should be required to change. But traditional AP teachers are, perhaps, not the best teachers for this endeavor.

In order for this proposal to get any serious attention, however, reporters would have to stop pretending that talented black students aren’t taking AP courses. The data simply doesn’t support that charge. We are putting too many black students into AP courses. Too many of them are completely unfit, have remedial level skills that high schools aren’t allowed to address. Much of the growth of Advanced Placement has relied on this fraud–and again, not just for black students.

It’s what we do with the kids in the middle, the skeptics, the uncertain ones, the ones who dearly want to be proven wrong about their own skills, that will help us improve these dismal statistics.

[1] I can’t even begin to tell you how many teachers in suburban districts do this.
[2] The same teachers who give students with 4s and 5s Bs are also prone to giving As to kids who got 3s. But of course, this is also the habit of teachers in low achieving urban districts. Consider this 2006 story celebrating the first two kids ever to pass the AP English test, and wonder how many of the students got As notwithstanding.


The Many Failings of Value-Added Modeling

Scott Alexander reviews the research on value-added models measuring teacher quality[1]. While Scott’s overview is perfectly fine, any such effort is akin to a circa 1692 overview of the research literature on alchemy. Quantifying teacher quality will, I believe, be understood in those terms soon enough.

High School VAM is Impossible

I have many objections to the whole notion of modeling what value a teacher adds, but top of the idiocy heap is how little attention is paid to the fact that VAM is only even possible with elementary school teachers. First, reading and basic math are the primary learning objectives of years 1-5. Second, elementary schools think of reading and math ability in terms of grade level. Finally, elementary teachers or their schools have considerable leeway in allocating instruction time by subject.

Now, go to high school (of which middle school is, as always, a pale imitation with similar issues). We don’t evaluate student reading skills by grade level, but rather “proficiency”. We don’t say “this 12th grader reads at the 10th grade level”. We have 12th graders who read at the 8th grade level, of course. We have 12th graders who read at the third grade level. But we don’t acknowledge this in our test scores, and so high school tests can’t measure reading progress. Which is good, because high school teachers aren’t tasked with reading instruction, so we wouldn’t expect students to make much progress. What’s that? Why don’t we teach reading instruction in high school, if kids can’t read at high school level, you ask? Because we aren’t allowed to. High school students with remedial level skills have to wait until college acknowledges their lack of skills.

And that’s reading, where at least we have a fighting shot of measuring progress, even though the tests don’t currently measure it–if we had yearly tests, which of course we don’t. Common Core ended yearly high school tests in most states. In math, it’s impossible because we pass most kids (regardless of ability) into the next class the next year, so there’s no “progress”, unless we measure kids at the beginning and end of the year, which introduces more tests and, of course, would show that the vast majority of students entering, say, algebra 2 don’t in fact understand algebra 1. Would the end of year tests measure whether or not the students had learned algebra 1, or algebra 2?

Nor can high schools legally just allocate more time to reading and math instruction, although they can put low-scoring kids in double block instruction, which is a bad, bad thing.

Scope Creep

Most teachers at all levels don’t teach tested subjects and, frankly, no one really cares about teacher quality and test scores in anything other than math or reading; people just pretend on everything else. Which leads to a question that proponents answer implicitly by picking one and ignoring the other: do we measure teacher quality to improve student outcomes or to spend government dollars effectively?

If the first, then what research do we have that art teachers, music teachers, gym teachers, or, god save us, special education teachers improve student outcomes? (answer: none.) If the second, then what evidence do we have that the additional cost of testing in all these additional topics, as well as the additional cost of defending the additional lawsuits that will inevitably arise as these teachers attack the tests as invalid, will be less strain on the government coffers than the cost of the purportedly inadequate teachers? What research do we have that any such tests on non-academic subjects are valid even as measures of knowledge, much less evidence of teacher validity?

None, of course. Which is why you see lawsuits by elective teachers pointing out it’s a tad unfair to be judged on the progress of students they’ve never actually met, much less taught. While many of those lawsuits get overturned as unfair but not unconstitutional, the idiocy of these efforts played no small part in why the newest version of the federal ESEA, the ESSA, killed the student growth measure (SGM) requirement.

So while proponents might argue that math and English score growth have some relationship to teacher quality in those subjects, they can’t really argue for testing all subjects. Sure, people can pretend (a la Common Core) that history and science teachers have an impact on reading skills, but we have no mechanism to, and are years away from, changing instruction and testing in these topics to require reading content and measuring the impact of that specific instruction in that specific topic. And again, that’s just reading. Not math, where it’s easy enough to test students on their understanding of math in science and history, but very difficult to untangle where that instruction came from. Of course, this is only an issue after elementary school. See point one.

Abandoning false gods

For the past 20 years or so, school policy has been about addressing “preparation”, which explains the obsession with elementary school. Originally, the push for school improvement began in high school. Few people realize or acknowledge these days that A Nation at Risk, that polemic seen as groundbreaking by education reformers but kind of, um, duh? by any regular people who take the time to read it, was entirely focused on high school, as can be ascertained by a simple perusal of its findings and recommendations. Stop coddling kids with easy classes, make them take college prep courses! That’s the ticket. It’s the easy courses, the low high school standards that cause the problem. Put all kids in harder classes. And so we did, with pretty disastrous results through the 80s. Many schools began tracking, but Jeannie Oakes and disparate impact lawsuits put an end to that.

I’m not sure when the obsession with elementary school began because I wasn’t paying close attention to ed policy during the 90s. But at some point in the early 90s, it began to register that putting low-skilled kids in advanced high school classes was perhaps not the best idea, leading to either fraud or a lot of failing grades, depending on school demographics. And so, it finally dawned on education reformers that many high school students weren’t “academically prepared” to manage the challenging courses that they had in mind. Thus the dialogue turned to preparing “underserved” students for high school. Enter KIPP and all the other “no excuses” charters which, as I’ve mentioned many times, focus almost entirely on elementary school students.

In the early days of KIPP, the scores seemed miraculous. People were bragging that KIPP completely closed the achievement gap back then, rather than the more measured “slight improvement controlling for race and SES” that you hear today. Ed reformers began pushing for all kids to be academically prepared, that is hey! Let’s make sure no child is left behind! And so the law, which led to an ever increasing push for earlier reading and math instruction, because hey, if we can just be sure that all kids are academically prepared for challenging work by high school, all our problems will be fixed.

Except, alas, they weren’t. I believe that the country is nearing the end of its faith in the false god of elementary school test scores, the belief that the achievement gap in high school is caused simply by not sufficiently challenging black and Hispanic kids in elementary school. Two decades of increasing elementary scores to the point that they appear to have topped out, with nary a budge in high school scores has given pause. Likewise, Rocketship, KIPP, and Success Academy have all faced questions about how their high-scoring students do in high school and college.

As I’ve said many times, high school is brutally hard compared to elementary school. The recent attempt to genuinely shove difficulty down earlier in the curriculum went over so well that the new federal law gave a whole bunch of education rights back to the states as an apology. Kidding. Kind of.

And so, back to VAM….Remember VAM? This is an essay about VAM. Well, all the objections I pointed out above–the problems with high school, the problems with specific subject teachers–were mostly waved away early on, because come on, folks, if we fix elementary school and improve instruction there, everything will fall into place! Miracles will happen. Cats will sleep with dogs. Just like the NCLB problem with 100% above average was waved away because hey, by then, the improvements will be sooooo wonderful that we won’t have to worry about the pesky statistical impossibilities.

I am not sure, but it seems likely that the feds’ relaxed attitude towards test scores has something to do with the abandonment of this false idol, which leads inevitably to the reluctant realization that perhaps A Nation at Risk was wrong, perhaps something else is involved with academic achievement besides simply plopping kids in the right classes. I offer in support the fact that Jerry Brown, governor of California, has remained almost entirely unscathed for shrugging off the achievement gap, saying hey, life’s a meritocracy. Who’s going to be a waiter if everyone’s “elevated” into some important job? Which makes me wonder if Jerry reads my blog.

So if teachers don’t make any difference and VAM is pointless, how come any yutz can’t become a teacher?

No one, ever, has argued that teachers don’t make any difference. What they do say is that individual teacher qualities make very little difference in student test scores and/or student academic outcomes, and the differences aren’t predictable or measurable.

If I may quote myself:

Teaching, like math, isn’t aspirin. It’s not medicine. It’s not a cure. It is an art enhanced by skills appropriate to the situation and medium, that will achieve all outcomes including success and failure based on complex interactions between the teachers and their audience. Treat it as a medicine, mandate a particular course of treatment, and hundreds of thousands of teachers will simply refuse to comply because it won’t cure the challenges and opportunities they face.

And like any art, teaching is not a profession that yields to market justice. Van Gogh died penniless. Bruces Dern and Davison are better actors than Chrisses Hemsworth and Evans, although their paychecks would never know it. Teaching, like art and acting, runs the range from velvet Elvis paint by numbers to Renoir, from Fast and Furious to Short Cuts. There are teaching superstars, and journeyman teachers, and the occasional lousy teacher who keeps working despite this–just as Rob Schneider still finds work, despite being so bad that Roger Ebert wrote a book about it.

Unlike art and acting, teaching is a government job. So while actors will get paid lots of money to pretend to be teachers, the job itself will never lead to the upside achieved by the private sector, despite the many stories about famous Korean tutors. Upside, practicing our craft won’t usually lead to poverty, except perhaps in North Carolina.

Most teachers understand this. It’s the outside world and the occasional short-termers who want teachers to be rewarded for excellence. Most teachers don’t support merit pay and vehemently oppose “student growth measures”.

The country appears to be moving towards a teacher shortage. I anticipate all talk of VAM to vanish. But if you want to improve teacher quality beyond its current much-better-than-it’s-credited condition, I suggest we consider limiting the scope of public education. Four of these five education policy proposals will do just that.

**************************************************************************
[1] I was writing this up in the comments section of Scott Alexander’s commentary on teacher VAM research, when I remembered I was behind on my post quota. What the heck. I’m turning this into a post. It’s a long answer, but not as long-winded as Scott Alexander, the one blogger who makes me feel brusque.


The Prima Donna Rock Star Tester Treatment

I met with her the first time last Sunday a week before the SAT, mother looking on, and the conversation went something like this.

“I want to specialize in one test. Which one should I take?”

“Yeah, okay, back up a bit. You took SAT test prep over the summer, right?”

“Yeah, but I knew everything they told me. It didn’t help.”

“What’s your course load?” (she goes to a 50% Asian school.)

“I’m taking a history honors class now, but it’s my first. Precalc for math.”

“And your GPA? What colleges are you considering?”

Shrug. “3.8 or so. Colleges, I have no idea. But what I want to know is, should I specialize in the ACT or the SAT? And should I take the old one or the new one?”

“Do you have a target SAT score?”

“2000. What’s the equivalent in ACT? But I really think I should take the old SAT and be done.”

“Your last practice test was a 1400.” She winced. “Even if all colleges take the old SAT for 2016 admissions–something I find unlikely despite assurances to the contrary–I’m not sure how you can find the time to focus on improvement between now and January, the last sitting of the old test. Besides, why the hurry?”

She waved dismissively. “I want to be done with all this. I hate the SAT. Maybe I should specialize in the ACT. I don’t want to learn the new SAT.”

“Yeah, we’re back to this whole ‘pick a test’ thing. Let’s discuss something touchier. Are you frustrated by the difference between your school performance and your test performance?”

She got very still. “Yes.”

“When I see an academic profile significantly higher than a test score, the student usually mentions it first. I’ve met many kids, a lot of them girls, with a profile like yours. They’ll tell me that they really just want to improve, to get their score into a respectable range, and that they haven’t had good luck with test prep so far. I didn’t hear any of that from you. Instead it’s ‘gotta pick a test’, ‘need a 2000’ despite no college plans, without any acknowledgment of what must be a very disappointing practice history.”

I said all this as delicately as possible, but she was already surreptitiously wiping away tears.

“I don’t see your mom behind this. You’re causing your own pressure but are also very resistant to making more effort or exploring options.”

She started nodding before I finished, and her mom handed her a Kleenex. “I just think I’m wasting my time.”

“So let’s start there. Do you have trouble with school tests? No? How about your state tests? So it’s not a general testing problem, just big standardized tests. Is it nerves?”

She laughed, sadly. “No. My big problem is motivation.”

I snorfed involuntarily, and she looked up in shock. “Sorry. I’m not at all laughing at you. Just the idea that the kid I see in front of me barking orders like an executive suffers from motivation problems.”

The mother demurred here. “Well, her GPA is only a 3.8.”

“Forgive me, but you’re Chinese and prone to distortion on this point.” They’re American enough to laugh. “I see an articulate, bright, driven girl who appears to have an intellect that I would put conservatively three or four hundred points above this practice score. You are using that intellect in school. I don’t see an obvious motivation issue.”

“No, not in school. Not studying. When I’m testing–you know, like the practice tests? I lose all motivation.”

Well, hey now.

“Tell me if any of this is familiar: The test begins and you’re working away, feeling good. Then you run into a problem that you don’t know how to solve and suddenly, as you try to figure the problem out, everything seems pointless. You give up, make a guess, go on to the next problem. Except now you aren’t sure what to do with this one, either. Suddenly, nothing matters. You simply stop caring. I see by your face that I’m not off-base.”

“How did you know?”

“I’ve seen it before. I describe it as a sort of stress reaction.”1

“I’m not nervous at all.”

“You should be so lucky. Jitters don’t usually affect performance. You get bored by stress. What happens, best I can tell after hearing many students describe the feeling, is that your brain shuts down to avoid feeling stress.”

My first case was a short, slight blond boy back before the SAT changes, so before 2005. I was going through his practice test explaining the missed problems, and he’d finish my sentences. That is, he knew how to do many of the problems he’d gotten incorrect on the test.

So why the high error count, I asked.

It was after I got bored, he replied. Once the boredom hit, he’d start to randomly bubble. I was aghast. He may as well have told me he sucked dead chickens’ eyeballs for candy, so incomprehensible was his behavior.

“So what you have to start doing, have to understand, is that you are a testing prima donna.”

“A prima donna?”

“You know how movie stars always order off-menu? Because they’re just too special for the pre-arranged menu that the rest of us use. Or the ballerinas or opera stars who simply refuse to be rushed, because they are artists. Or rock stars, the kind who make huge demands for their hotel rooms sometimes—Van Halen famously demanded brown M&Ms be removed from the candy bowl (yes, I know they had another reason, but her parents are never going to let her listen to Van Halen, so I’m safe). You need to be a prima donna rock star tester.”

“How?”

“Take two SAT sections daily, from the blue book. Use deadly serious test conditions. No music. No interruptions. No stopping the clock. No laying on the floor or on your bed. Sit at a table, door shut, start the timer.”

“That’s not even an hour.”

“And when the timer starts, I want you to take two minutes, at least, to go through the test and cherrypick. Circle the problems you’ll deign to do.”

“Um. What?”

“In math, pick and choose your problems. Circle the good ones. ‘This one, I shall do. This one, pah!’ Spit upon it. If you don’t instantly vibe to the question, avert your eyes and scratch an X next to that problem, which clearly must be for peasants and other little people. Can you do that?”

She giggled. “Really? What about reading?”

“Skip anything with long paragraphs that looks less desirable than root canal. You like sentence completions?”

“Yes!”

“Do them first, then evaluate each reading passage to determine whether or not Her Majesty–that’s you–is interested. Which part of the writing section do you like best, the paragraph at the end?”

“How do you know this?”

“Do those six questions at the end first. Then go back to the front. The second–I mean the second—you find a long sentence you can’t instantly decipher, that question OFFENDS you. Turn up your nose. Move on.”

“So that’s all I want for the week. Two sections. Vary the subject. Every night. Take them like a rock star looking at candy bowls to make sure there are no…oh, look there’s a brown M&M. Skip it.”

“But I might only want to do four or five questions a section.”

“Great. Do those. Then, oh, hey. You’ve still got 20 minutes to kill. What’ll help pass the time? Let’s look at the other questions to see if they hold any interest. You are a movie star stuck in Podunk, in search of decent dim sum.”

“But the whole thing is a lie. The problems I can’t do aren’t stupid.”

“Sure, but we need to fake out your psyche. You have a fragile testing temperament that must be coddled and swathed in protective coating.”

The mom was a bit stunned, but accepting. “So none of the strategies she learned in test prep?”

“Mom, they didn’t work anyway. But what if I don’t have enough time to go back and do the problems that bored me?”

“Then you will have spent a whole test section working on problems you can do. How is that worse?”

“But if I try to read the long passages, I know I will get bored.”

“Well, I have some ideas for that later, but for now, read the passages that meet with your approval, and do the questions. Then for the rest, amuse yourself with the peasant passages. Do the vocabulary questions. The ones with line numbers. Don’t read them if they bore you. Normally, you understand, I wouldn’t suggest this.”

“So practice that all week. Eat pizza, chocolate, noodles, sesame balls with red bean paste, whatever your favorite food is Friday night. Saturday, have a good breakfast and visualize rejecting all those peasant problems.”

“What if I get bored anyway?”

“That’s a very real possibility. At the first moment you identify boredom, put your pencil down. Take a breath. Remind yourself that while it’s scary, this boredom is a valuable opportunity to practice dealing with it. That it only feels like boredom. Do not give up. Do not let yourself randomly bubble. If you feel done and can’t fight off the boredom, put your head down and take a nap. Otherwise, go back to the test and look for test questions that pique your curiosity.”

“But you said I didn’t have to read the passages.”

“Sure. But don’t randomly bubble, or give up. Estimate. Eliminate known wrong answers. Guess based on the context. But if you can’t kick off the boredom and feel hopeless, take a rest until the next section.”

“And here’s the important part: under no conditions are you to worry about your score. You’re not there for the score. You’re there to practice being a rock star who picks and chooses her projects. We’ll do scores later, if you like.”

“That’s okay. I don’t think I’m going to improve now, so at least I might know why.”

“It’s helpful just to know what the problem is,” her mother agreed.

They actually smiled as I left, both noticeably less anxious than they were when I arrived.

Note: she’s a junior, and has no reason whatsoever to take the SAT in October. I tried to talk the mom out of that, but she was determined to keep the date. Ideally, I wouldn’t send a student to try out this method on a live test, but that was the only option.

Will it work, this refusal to tolerate brown M&Ms and uninviting questions? Typically, yes, although since I’ve cut back on tutoring I haven’t run into the prima donna tester in several years. The cases I remember always saw an instant boost of 100-150 points the first time they took the test in rock star mode. In every case, they were also mentally exhausted afterwards. They’d never worked the entire test before, having mentally checked out. Prima donnas are fixable. The ones who go into a fugue state, not so much. Fortunately, that’s even rarer.

I started to make a larger point, but it’s too complicated and, since returning this August I’ve vowed to post more. I had too many ideas piling up that just weren’t…perfect, and so I kept putting them off, even though each idea had more than enough for a post. Time for me to limit scope and bite off achievable chunks. Otherwise I’ll think I’m bored and don’t care when really I’m stressed out….hey. Good thing I don’t get like this for tests.

So don’t read too much into this beyond an interesting behavior that I’ve learned to treat. Don’t apply it to policy. Do I think some people underperform their abilities on tests? Yes, I do. Do I think that tests can be gamed by people whose essential intelligence is high on mimicry and memory, giving the impression of skills they don’t actually have? Yes, I do. Do I think tests are mostly accurate? Yes, for most people. It’s a big ol’ world out there. Many cases exist simultaneously.

Meanwhile, I hope all you testers out there did well yesterday. And if you know any fragile testing temperaments, give this strategy a try.

**********************************
1 While writing this piece, I googled and learned that researchers call it stress, too.


Evaluating the New PSAT: Math

Well, after the high drama of writing, the math section is pretty tame. Except the whole oh, my god, are they serious? part. Caveat: I’m assuming that the SAT is still a harder version of the PSAT, and that this is a representative test.

| Metric | Old SAT | Old PSAT | ACT | New PSAT |
|---|---|---|---|---|
| Questions | 54 (44 MC, 10 grid) | 38 (28 MC, 10 grid) | 60 MC | 48 (40 MC, 8 grid) |
| Sections | 1: 20 q, 25 m; 2: 18 q, 25 m; 3: 16 q, 20 m | 1: 20 q, 25 m; 2: 18 q, 25 m | 1: 60 q, 60 m | NC: 17 q, 25 m; Calc: 31 q, 45 m |
| MPQ | 1: 1.25 mpq; 2: 1.38 mpq; 3: 1.25 mpq | 1: 1.25 mpq; 2: 1.38 mpq | 1 mpq | NC: 1.47 mpq; Calc: 1.45 mpq |
| Category | Number Operations; Algebra & Functions; Geometry & Measurement; Data & Statistics | Same | Pre-algebra; Algebra (elem & intermed.); Geometry (coord & plane); Trigonometry | 1) Heart of Algebra; 2) Passport to Advanced Math; 3) Probability & 4) Data Analysis; Additional Topics in math |

It’s going to take me a while to fully process the math section. For my first go-round, I thought I’d point out the instant takeaways, and then discuss the math questions that are going to make any SAT expert sit up and take notice.

Format
The SAT and PSAT always gave an average of 1.25 minutes per question on the all multiple choice sections. On the 18 question section that has 10 grid-ins, giving 1.25 minutes for each of the 8 multiple choice questions leaves 1.5 minutes for each grid-in.

That same conversion doesn’t work on the new PSAT. However, both sections have exactly 4 grid-ins, which makes a nifty linear system. Here you go, boys and girls, check my work.

The math section that doesn’t allow a calculator has 13 multiple choice questions and 4 grid-ins, and a time limit of 25 minutes. The calculator math section has 27 multiple choice questions and 4 grid-ins, and a time limit of 45 minutes.

13x + 4y = 1500
27x + 4y = 2700

Flip them around and subtract for
14x = 1200
x = 85.714 seconds, or 1.42857 minutes. Let’s round it up to 1.43 minutes.
y = 96.428 seconds, or 1.607 minutes, which I shall round down to 1.6 minutes.

If–and this is a big if–the test is using a fixed average time for multiple choice and another for grid-ins, then each multiple choice question is getting a 14.4% boost in time, and each grid-in a 7% boost. But the test may be using an entirely different parameter.
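For anyone who does want to check my work, here is a quick sketch of that system in Python. It rests on the same big "if" as above: one fixed time per multiple choice question and another per grid-in. The function and variable names are made up for illustration, and the 1.25 and 1.5 minute baselines are the old-format figures from earlier in this section.

```python
from fractions import Fraction

def solve_timing(mc_a, grid_a, secs_a, mc_b, grid_b, secs_b):
    """Solve mc*x + grid*y = secs for two sections using Cramer's rule.

    x is seconds per multiple choice question, y is seconds per grid-in.
    """
    det = mc_a * grid_b - mc_b * grid_a
    if det == 0:
        raise ValueError("the two sections do not pin down a unique answer")
    x = Fraction(secs_a * grid_b - secs_b * grid_a, det)
    y = Fraction(mc_a * secs_b - mc_b * secs_a, det)
    return x, y

# No-calculator section: 13 MC + 4 grid-ins in 25 minutes.
# Calculator section: 27 MC + 4 grid-ins in 45 minutes.
x, y = solve_timing(13, 4, 25 * 60, 27, 4, 45 * 60)

mc_minutes = float(x) / 60
grid_minutes = float(y) / 60

# Assumed old-format baselines: 1.25 minutes per MC, 1.5 minutes per grid-in.
mc_boost = mc_minutes / 1.25 - 1
grid_boost = grid_minutes / 1.5 - 1

print(f"MC: {float(x):.3f} s ({mc_minutes:.3f} min), boost {mc_boost:.1%}")
print(f"Grid-in: {float(y):.3f} s ({grid_minutes:.3f} min), boost {grid_boost:.1%}")
```

Worked without rounding, the multiple choice boost comes out a hair lower than the figure I got from the rounded 1.43 minutes, and the grid-in boost is still roughly 7%.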

Question Organization

In the old SAT and ACT, the questions move from easier to more difficult. The SAT and PSAT difficulty level resets for the grid-in questions. The new PSAT does not organize the problems by difficulty. Easy problems (there are only 4) are more likely to be at the beginning, but they are interlaced with medium difficulty problems. I saw only two Hard problems in the non-calculator section, both near but not at the end. The Hard problems in the calculator section are tossed throughout the second half, with the first one showing up at 15. However, the coding is inexplicable, as I’ll discuss later.

As nearly everyone has mentioned, any evaluation of the questions in the new test doesn’t lead to an easy distinction between “no calc” and “calc”. I didn’t use a calculator more than two or three times at any point in the test. However, the College Board may have knowledge about what questions kids can game with a good calculator. I know that the SAT Math 2c test is a fifteen minute endeavor if you get a series of TI-84 programs. (Note: Not a 15 minute endeavor to get the programs, but a 15 minute endeavor to take the test. And get an 800. Which is my theory as to why the results are so skewed towards 800.) So there may be a good organizing principle behind this breakdown.

That said, I’m doubtful. The only trig question on the test is categorized as “hard”. But the question is simplicity itself if the student knows any right triangle trigonometry, which is taught in geometry. But for students who don’t know any trigonometry, will a calculator help? If the answer is “no”, then why is it in this section? Worse, what if the answer is “yes”? Do not underestimate the ability of people who turned the Math 2c into a 15 minute plug and play to come up with programs to automate checks for this sort of thing.

Categories

Geometry has disappeared. Not just from the categories, either. The geometry formula box has been expanded considerably.

There are only three plane geometry questions on the test. One was actually an algebra question using the perimeter formula. Another is a variation question using a trapezoid’s area. Interestingly, neither the rectangle perimeter formula nor the trapezoid area formula was provided. (To reinforce an earlier point, both of these questions were in the calculator section. I don’t know why; they’re both pure algebra.)

The last geometry question really involves ratios; I simply picked the multiple choice answer that had 7 as a factor.

I could only find one coordinate geometry question, barely. Most of the other xy plane questions were analytic geometry, rather than the basic skills that you usually see regarding midpoint and distance–both of which were completely absent. Nothing on the Pythagorean Theorem, either. Freaky deaky weird.

When I wrote about the Common Core math standards, I mentioned that most of geometry had been pushed down into seventh and eighth grade. In theory, anyway. Apparently the College Board thinks that testing geometry will be too basic for a test on college-level math? Don’t know.

Don’t you love the categories? You can see which ones the makers cared about. Heart of Algebra. Passport to Advanced Math! Meanwhile, geometry and the one trig question are stuck under “Additional Topic in Math”. As opposed to the “Additional Topic in History”, I guess.

Degree of Difficulty

I worked the new PSAT test while sitting at a Starbucks. Missed three on the no-calculator section, but two of them were careless errors due to clatter and haste. In one case I flipped a negative in a problem I didn’t even bother to write down, in the other I missed a unit conversion (have I mentioned before how measurement issues are the obsessions of petty little minds?)

The one I actually missed was a function notation problem. I’m not fully versed in function algebra and I hadn’t really thought this one through. I think I’ve seen it before on the SAT Math 2c test, which I haven’t looked at in years. Takeaway— if I’m weak on that, so are a lot of kids. I didn’t miss any on the calculator section, and I rarely used a calculator.

But oh, my lord, the problems. They aren’t just difficult. The original, pre-2005 SAT had a lot of tough questions. But those questions relied on logic and intelligence—that is, they sought out aptitude. So a classic “diamond in the rough” who hadn’t had access to advanced math could still score quite well. Meanwhile, on both the pre and post 2005 tests, kids who weren’t terribly advanced in either ability or transcript faced a test that had plenty of familiar material, with or without coaching, because the bulk of the test is arithmetic, algebra I, and geometry.

The new PSAT and, presumably, the SAT, is impossible to do unless the student has taken and understood two years of algebra. Some will push back and say oh, don’t be silly, all the linear systems work is covered in algebra I. Yeah, but kids don’t really get it then. Not even many of the top students. You need two years of algebra even as a strong student, to be able to work these problems with the speed and confidence needed to get most of these answers in the time required.

And this is the PSAT, a test that students take at the beginning of their junior year (or sophomore, in many schools), so the College Board has created a test with material that most students won’t have covered by the time they are expected to take the test. As I mentioned earlier, California alone has nearly a quarter of a million sophomores and juniors in algebra and geometry. Will the new PSAT or the SAT be able to accurately assess their actual math knowledge?

Key point: The SAT and the ACT’s ability to reflect a full range of abilities is an unacknowledged attribute of these tests. Many colleges use these tests as placement proxies, including many, if not most or all, of the public university systems.

The difficulty level I see in this new PSAT makes me wonder what the hell the organization is up to. How can the test reveal anything meaningful about kids who a) haven’t yet taken algebra 2 or b) have taken algebra 2 but didn’t really understand it? And if David Coleman’s answer is “Those testers aren’t ready for college so they shouldn’t be taking the test” then I have deep doubts that David Coleman understands the market for college admissions tests.

Of course, it’s also possible that the SAT will yield the same range of scores and abilities despite being considerably harder. I don’t do psychometrics.

Examples:

[Image: newpsatmath10]

Here’s the function question I missed. I think I get it now. I don’t generally cover this degree of complexity in Precalc, much less algebra 2. I suspect this type of question will be the sort covered in new SAT test prep courses.

[Image: mathnocalcquads]

These two are fairly complicated quadratic questions. The question on the left reveals that the SAT is moving into new territory; previously, SAT never expected testers to factor a quadratic unless a=1. Notice too how it uses the term “divisible by x” rather than the more common term, “x is a factor”. While all students know that “2 is a factor of 6” is the same as “6 is divisible by 2”, it’s not a completely intuitive leap to think of variable factors in the same way. That’s why we cover the concept–usually in late algebra 2, but much more likely in pre-calc. That’s when synthetic division/substitution is covered–as I write in that piece, I’m considered unusual for introducing “division” of this form so early in the math cycle.

The question on the right is a harder version of an SAT classic misdirection. The test question doesn’t appear to give enough information, until you realize it’s not asking you to identify the equation and solve for a, b, and c–just plug in the point and yield a new relationship between the variables. But these questions always used to show up in linear equations, not quadratics.

That’s the big news: the new PSAT is pushing quadratic fluency in a big way.

Here, the student is expected to find the factors of 1890:

[Image: newpsatperimeter]

This is a quadratic system. I don’t usually teach these until Pre-Calc, but then my algebra 2 classes are basically algebra one on steroids. I’m not alone in this.

No doubt there’s a way to game this problem with the answer choices that I’m missing, but to solve this in the forward fashion you either have to use the quadratic formula or, as I said, find all the factors of 1890, which is exactly what the answer document suggests. I know of no standardized test that requires knowledge of the quadratic formula. The old school GRE never did; the new one might (I don’t coach it anymore). The GMAT does not require knowledge of the quadratic formula. It’s possible that the CATs push a quadratic formula question to differentiate at the 800 level, but I’ve never heard of it. The ACT has not ever required knowledge of the quadratic formula. I’ve taught for Kaplan and other test prep companies, and the quadratic formula is not covered in most test prep curricula.
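To give a sense of what "find all the factors of 1890" asks of a tester, here is a brute-force sketch in Python that lists every factor pair. The question itself isn't reproduced here, so the code makes no attempt to pick out the pair the problem wants, and `factor_pairs` is just a name chosen for illustration, not anything from the answer document.

```python
def factor_pairs(n):
    """Return every (a, b) with a <= b and a * b == n."""
    pairs = []
    a = 1
    while a * a <= n:
        if n % a == 0:
            pairs.append((a, n // a))
        a += 1
    return pairs

pairs = factor_pairs(1890)
for a, b in pairs:
    print(a, "x", b)
print(len(pairs), "factor pairs")
```

A computer runs through the list instantly. A sixteen-year-old doing the same search by hand against the clock is another matter, which is why a Medium coding for that question looks so odd.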

Here’s one of the inexplicable difficulty codings I mentioned–this is coded as of Medium difficulty.

As big a deal as that is, this one’s even more of a shock: a quadratic and linear system.

[Image: newpsatsystemlineparabola]

The answer document suggests putting the quadratic into vertex form, then plugging in the point and solving for a. I solved it with a linear system. Either way, after solving the quadratic you find the equation of the line and set them equal to each other to solve. I am….stunned. Notice it’s not a multiple choice question, so no plug and play.

Then, a negative 16 problem–except it uses meters, not feet. That’s just plain mean.
[Image: newpsatmathneg16]

Notice that the problem gives three complicated equations. However, those who know the basic algorithm (h(t) = -4.9t² + v₀t + s₀) can completely ignore the equations and solve a fairly easy problem. Those who don’t know the basic algorithm will have to figure out how to coordinate the equations to solve the problem, which is much more difficult. So this problem represents dramatically different levels of difficulty based on whether or not the student has been taught the algorithm. And in that case, the problem is quite straightforward, so should be coded as of Medium difficulty. But no, it’s tagged as Hard. As is this extremely simple graph interpretation problem. I’m confused.
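For readers who haven't met the metric version, here is a minimal sketch of that basic algorithm in Python. The launch speed and starting height in the example are invented for illustration, since the actual problem isn't reproduced here, and the helper names are hypothetical. It assumes a starting height of zero or more.

```python
import math

HALF_G = 4.9  # half of 9.8 m/s^2; the metric counterpart of the 16 used with feet

def height(t, v0, s0):
    """Height in meters after t seconds: -4.9t^2 + v0*t + s0."""
    return -HALF_G * t * t + v0 * t + s0

def time_to_ground(v0, s0):
    """Positive time at which the height is zero, via the quadratic formula."""
    disc = v0 * v0 + 4 * HALF_G * s0
    return (v0 + math.sqrt(disc)) / (2 * HALF_G)

# Invented example values: launched from the ground at 19.6 m/s.
v0, s0 = 19.6, 0.0
landing = time_to_ground(v0, s0)
peak = height(landing / 2, v0, s0)
print(f"lands after {landing:.1f} s, peak height {peak:.1f} m")
```

A student who knows this form reads the initial velocity and starting height straight off the problem and is done in a couple of lines.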

Recall: if the College Board keeps the traditional practice, the SAT will be more difficult.

So this piece is long enough. I have some thoughts–rather, questions–on what on earth the College Board’s intentions are, but that’s for another post.

tl;dr Testers will get a little more time to work much harder problems. Geometry has disappeared almost entirely. Quadratics beefed up to the point of requiring a steroids test. Inexplicable “calc/no calc” categorization. College Board didn’t rip off the ACT math section. If the new PSAT is any indication, I do not see how the SAT can be used by the same population for the same purpose unless the CB does very clever things with the grading scale.


Evaluating the New PSAT: Reading and Writing

The College Board has released a new practice PSAT, which gives us a lot of info on the new SAT. This essay focuses on the reading and writing sections.

As I predicted in my essay on the SAT’s competitive advantage, the College Board has released a test that has much in common with the ACT. I did not predict that the homage would go so far as test plagiarism.

This is a pretty technical piece, but not in the psychometric sense. I’m writing this as a long-time coach of the SAT and, more importantly, the ACT, trying to convey the changes as I see them from that viewpoint.

For comparison, I used these two sample ACT, this practice SAT (old version), and this old PSAT.

Reading

The old SAT had a reading word count of about 2800 words, broken up into eight passages. Four passages were very short, just 100 words each. The longest was 800 words. The PSAT reading count was around 2000 words in six passages. This word count is reading passages only; the SAT has 19 sentence completions to the PSAT’s 13.

So SAT testers had 70 minutes to complete 19 sentence completions and 47 questions over eight passages of 2800 words total. PSAT testers had 50 minutes to complete 13 sentence completions and 27 questions over six passages of 2000 words total.

The ACT has always had 4 passages averaging 750 words, giving the tester 35 minutes to complete 40 questions (ten for each passage). No sentence completions.

Comparisons are difficult, but if you figure about 45 seconds per sentence completion, you can deduct that from the total time and come up with two rough metrics comparing reading passages only: minutes per question and words per question (on average, how many words is the tester reading to answer the questions).

| Metric | Old SAT | Old PSAT | ACT | New PSAT |
|---|---|---|---|---|
| Word Count | 2800 | 2000 | 3000 | 3200 |
| Passage Count | 8 | 6 | 4 | 5 |
| Passage Length | 100-850 | 100-850 | 750 | 500-800 |
| MPQ | 1.18 | 1.49 | 0.88 | 1.27 |
| WPQ | 59.57 | 74.07 | 75 | 69.21 |
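The two rough metrics can be recomputed from the counts given above. This is a sketch in Python under the same assumption of 45 seconds per sentence completion. It covers only the old SAT, old PSAT, and ACT, because the new PSAT's question count and time limit aren't stated in this piece. The function and variable names are invented for illustration.

```python
SENTENCE_COMPLETION_MIN = 0.75  # assumed allowance: 45 seconds per sentence completion

def reading_metrics(total_min, sentence_completions, passage_questions, words):
    """Return (minutes per question, words per question) for passage questions only."""
    passage_min = total_min - sentence_completions * SENTENCE_COMPLETION_MIN
    return passage_min / passage_questions, words / passage_questions

# (total minutes, sentence completions, passage questions, passage words)
tests = {
    "Old SAT": (70, 19, 47, 2800),
    "Old PSAT": (50, 13, 27, 2000),
    "ACT": (35, 0, 40, 3000),
}

results = {name: reading_metrics(*args) for name, args in tests.items()}
for name, (mpq, wpq) in results.items():
    print(f"{name}: {mpq:.3f} min/question, {wpq:.2f} words/question")
```

By this arithmetic the ACT works out to 0.875 minutes per question, since 35 minutes spread over 40 questions is under a minute each.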

I’ve read a lot of assertions that the new SAT reading text is more complex, but my brief Lexile analysis on random passages in the same category (humanities, science) showed the same range of difficulty and sentence lengths for old SAT, current ACT, and old and new PSAT. Someone with more time and tools than I have should do an in-depth analysis.

Question types are much the same as the old format: inference, function, vocabulary in context, main idea. The new PSAT requires the occasional figure analysis, which the College Board will undoubtedly flaunt as unprecedented. However, the College Board doesn’t have an entire Science section, which is where the ACT assesses a reader’s ability to evaluate data and text.

Sentence completions are gone, completely. In passage length and overall reading demands, the new PSAT is remarkably similar in structure and word length to the ACT. This suggests that the SAT is going to be even longer? I don’t see how, given the time constraints.

tl;dr: The new PSAT reading section looks very similar to the current ACT reading test in structure and reading demands. The paired passage and the question types are the only holdovers from the old SAT/PSAT structure. The only new feature is actually a cobbled up homage to the ACT science test in the form of occasional table or graph analysis.

Writing

I am so flummoxed by the overt plagiarism in this section that I seriously wonder if the test I have isn’t a fake, designed to flush out leaks within the College Board. This can’t be serious.

The old PSAT/SAT format consisted of three question types: Sentence Improvements, Identifying Sentence Error, and Paragraph Improvements. The first two question types presented a single sentence. In the first case, the student would identify a correct (or improved) version or say that the given version was best (option A). In the ISEs, the student had to read the sentence cold with no alternatives and indicate which if any underlined word or phrase was erroneous (much, much more difficult, option E was no change). In Paragraph Improvements, the reader had to answer grammar or rhetoric questions about a given passage. All questions had five options.

The ACT English section is five passages running down the left hand side of the page, with underlined words or phrases. As the tester goes along, he or she stops at each underlined section and looks to the right for a question. Some questions are simple grammar checks. Others ask about logic or writing choices—is the right transition used, is the passage redundant, what would provide the most relevant detail. Each passage has 15 questions, for a total of 75 questions in 45 minutes (9 minutes per passage, or 36 seconds per question). The tester has four choices and the “No Change” option is always A.

The new PSAT/SAT Writing/Language section is four passages running down the left hand side of the page, with underlined words or phrases. As the tester goes along, he or she stops at each underlined section and looks to the right for a question. Some questions are simple grammar checks. Others ask about logic or writing choices—is the right transition used, is the passage redundant, what would provide the most relevant detail. Each passage has 11 questions, for a total of 44 questions in 35 minutes (about 8.75 minutes per passage or 47 seconds a question). The tester has four choices and the “No Change” option is always A.

Oh, did I forget? Sometimes the tester has to analyze a graph.

The College Board appears to have simply stolen not only the structure, but various common question types that the ACT has used for years—as long as I’ve been coaching the test, which is coming on for twelve years this May.

I’ll give some samples, but this isn’t a random thing. The entire look and feel of the ACT English test has been copied wholesale—I’ll add “in my opinion” but don’t know how anyone could see this differently.

Writing Objective:

Style and Logic:

Grammar/Punctuation:

tl;dr: The College Board ripped off the ACT English test. I don’t really understand copyright law, much less plagiarism. But if the American College Test company is not considering legal action, I’d love to know why.

The PSAT reading and writing sections don’t ramp up dramatically in difficulty. Timing, yes. But the vocabulary load appears to be similar.

The College Board and the poorly informed reporters will make much of the data analysis questions, but I hope to see any such claims addressed in the context of the ACT’s considerably more challenging data analysis section. The ACT should change the name; the “Science” section only uses science contexts to test data analysis. All the College Board has done is add a few questions and figures. Weak tea compared to the ACT.

As I predicted, the College Board has definitely chosen to make the test more difficult for gaming. I’ve been slowly untangling the process by which someone who can barely speak English is able to get a high SAT verbal and writing score, and what little I know suggests that all the current methods will have to be tossed. Moving to longer passages with less time will reward strong readers, not people who are deciphering every word and comparing it to a memory bank. And the sentence completions, which I quite liked, were likely being gamed by non-English speakers.

In writing, leaving the plagiarism issue aside for more knowledgeable folk, the move to passage-based writing tests will reward English speakers with lower ability levels and should hurt anyone with no English skills trying to game the test. That can only be a good thing.

Of course, that brings up my larger business question that I addressed in the competitive advantage piece: given that Asians show a strong preference for the SAT over the ACT, why would Coleman decide to kill the golden goose? But I’ll put big picture considerations aside for now.

Here’s my evaluation of the math section.


What You Probably Don’t Know About the Gaokao

I didn’t intend to write about the gaokao, or Brook Larmer’s profile of 18-year-old Yang and his family inside a Chinese test prep factory. I just started out googling, as is my wont, to find out more information than the article provides. I certainly did that.

The novice might find Larmer’s article emotionally draining. Anyone with even a rudimentary understanding of Chinese academic culture will notice a huge, gaping hole.

I noticed the hole, which led me to an observation, which led me to a better understanding of how the gaokao works, which is almost exactly the opposite of its presentation in the American press.

The hole: In a story dedicated to students preparing for the National Higher Education Entrance Examination (aka the gaokao) Larmer never once mentions cheating. This would be a problematic oversight in any event, but given the last anecdote, the omission strains credulity.

When Larmer returned to the town for his second visit, the day before the gaokao, Yang’s scores, which had been dropping, had not improved. As a result, Yang had kicked out his mom and brought his grandfather to live with him in Maotanchang for the last few weeks of prep. While Larmer drove into town with Yang’s parents, the grandfather refused to let Larmer accompany the family to the test site. Grandpa was afraid the family might “get in trouble” for talking to a reporter, according to “someone”.

Yang does exceptionally well, given his fears—“his scores far surpassed his recent practice tests”. Sadly, his friend Cao tanks because he “had a panic attack”.

Yang’s scores were considerably beyond what his recent performance had predicted. Yet it apparently never once occurred to Larmer that perhaps Yang and Grandpa prudently got the New York Times reporter out of the way before they arranged a fix. Maybe Yang wanted more aid than could be provided with “‘brain-rejuvenating’ tea”, or Gramps didn’t want Larmer to see Yang wired up for sound, or Gramps had really put in some money and paid for a double.

Yang’s performance might have been entirely unaided, of course. But any article about the gaokao should address cheating, even with Gramps banning access.

When I realized that Larmer hadn’t mentioned cheating, I read the piece again, thinking I must have missed it. Nope. But that second readthrough led to an observation.

I got curious—just curious, nothing skeptical at this point—about the school’s gender restriction on teachers. Was that just for cram schools? What was the gender distribution of Chinese teachers?

I couldn’t find anything. No confirmation that the teachers were all male, no comprehensive source on cram schools, no readily available data on Maotanchang. I couldn’t find anything at all about the school’s business practices online. So I went back to Larmer’s piece to look for a source for that fact, and found nothing.

And so, the observation: In his description of the school’s interior and practices, Larmer doesn’t mention interviews with school representatives, other journalism, or a Big Book of Facts on Chinese Cram Schools.

The earliest detailed description of Maotanchang online appears to be this August 2013 article in China Youth Daily, a Beijing paper, which created quite a furor in China and was largely ignored here because we can’t read Chinese. Rachel Lu, senior editor at Foreign Policy magazine, restated some key points for those folks who don’t read Chinese, which is nice of her, because what idiot would copy and paste the Chinese piece into Google Translate?

Yeah, well, I’m an idiot. I won’t bore people with the extended version, but a lot of the details that Larmer didn’t seem to personally witness show up in the Chinese story: same school official quoting management theory, teachers using bullhorns, Maotanchang’s 1939 origins, bus license plates ending in 8, burning incense at the town’s sacred tree, teacher dismissals for low scores.

The excitement over the China Youth Daily article generated more interest, like Exam Boot Camp, also written in August 2013, happily in English, which profiled a female student and her mother who provide data points like higher prices for lower scoring students, lack of electrical outlets, and surveillance cameras in the classroom.

Am I accusing Larmer of lifting tidbits from these other stories? Well, I’d like to know where he got the information.

Leave that aside, though, because reading through these stories looking for sources led me to all sorts of “new things” to learn about the gaokao. These “new things” are readily available online; in fact, anyone can find most of the information in the Wikipedia entry. But you will rarely read these not-in-fact new things, but well-established facts, explicitly laid out by any major media outlet (although now that I know, I can see hints). I don’t know why. I can’t even begin to see how any reporter wouldn’t trumpet these facts to the world, narrative or no.

China’s supposedly meritocratic test is a fraud.

To begin with, Larmer, like just about any other reporter discussing the gaokao, describes it as a “grueling test, which is administered every June over two or three days (depending on the province), is the lone criterion for admission to Chinese universities.”

Wrong. The test score is, technically, the sole criterion for admission. But in China, the test score and the test performance aren’t the same thing.

Testers get additional points literally added to their scores for a number of attributes. China’s 55 ethnic minorities (non-Han) get a boost of up to 30 points, although the specific number varies by province. Athletic and musical certifications appear to be in flux, but still give some students more points, even though the list of certification sports was culled from 70 to 17. Children whose parents died in the military and Chinese living overseas get extra points, and recently the government announced point boosts for morality.

Remember when the University of Michigan used to give students 20 points if they were black, and 12 points if they had a perfect SAT score? Well, imagine those points were just added into the SAT/ACT score. That’s what the Chinese do.

But even after the extra points are allotted, test scores aren’t relevant until the tester’s residence has been factored in. Larmer: “The university quota system also skews sharply against rural students, who are allocated far fewer admissions spots than their urban peers.”

I first understood this to mean that colleges used the same cut scores for everyone, but just accepted fewer rural students, without grasping the implications: city kids have lower cut scores than rural kids.

Take Xu Peng, the only Maotanchang student to make the cutoff score for Tsinghua, where the “minimum score for students from Anhui province taking the science exam was 641.”

Two years earlier, the cutoff score for Tsinghua for a Beijing student was somewhere under 584.

Rachel Lu again: “the lowest qualifying score for a Beijing-based test-taker may be vastly lower than the score required from a student taking the examination in Henan or Jiangsu [rural provinces].”
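To make the mechanics concrete, here is a minimal sketch of how bonus points and province-specific cutoffs interact. The 641 cutoff for Anhui science students at Tsinghua and the 30-point ceiling on the ethnic-minority bonus come from the sources above. The Beijing cutoff is known only as “somewhere under 584”, and from two years earlier, so 584 below is a stand-in upper bound. The function name and the sample raw score of 620 are hypothetical, and the sketch assumes the two provinces’ scores are on the same scale.

```python
# Illustrative only: province cutoffs for one university (Tsinghua, science track).
# 641 is the Anhui figure quoted above; 584 is an upper bound standing in for
# the Beijing cutoff, which the source gives only as "somewhere under 584".
CUTOFFS = {"Anhui": 641, "Beijing": 584}

def is_admitted(raw_score: int, bonus_points: int, province: str) -> bool:
    """Compare the reported score (raw performance plus bonus) to the province's cutoff."""
    reported_score = raw_score + bonus_points
    return reported_score >= CUTOFFS[province]

# The same hypothetical raw performance, 620, in two provinces.
print(is_admitted(620, 0, "Beijing"))   # True
print(is_admitted(620, 0, "Anhui"))     # False
# A 30-point bonus (the maximum cited for ethnic minorities) changes the outcome.
print(is_admitted(620, 30, "Anhui"))    # True
```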

A joke goes:
[Image: gaokaojoke]

Of course, don’t make the mistake, as I did, of thinking the cut scores mean the same thing for each student.

Curious about the nature of the studying/memorization the students do (another vague area for Larmer’s piece), I tried to find more information on the gaokao content. The actual gaokao essay questions are usually published each year and they’re….well, insane.

When I finally did find an actual math question:

[Image: beijingmathtrans]

it seemed surprisingly easy, and then I realized that it was only for the Beijing test:

[Image: beijingmatheasy]

Then I went back to the essay questions and it sank in: the essay questions differed by city.

The gaokao isn’t the same test in every province. Many provinces develop their own custom test and just call it the gaokao.


[Image: diffgaokaos]

At which point, I threw up my hands and mentally howled at Larmer, my current proxy for the mainstream American press: you didn’t think this worth mentioning? Or didn’t you know?

If all this is true, then the wealthier province universities use a lower cut score for their residents. But just to be sure, some provinces make an easier test for their residents, so that the rural kids are taking a harder test on which they have to get a higher score. Please, please, please tell me I’m misunderstanding this.

Consider Larmer’s story again in light of this new information. Larmer can’t say definitively who had the best performance without ascertaining whether Yang or Cao got extra points. Yang and Cao might both have outscored many students who were admitted to top-tier universities. Cao may or may not have “panicked”, and may not have even done poorly, in an absolute sense. None of this context is provided.

In my last story about Chinese academic fraud, I pointed out that so much money was involved that few people have any incentive to fix the corruption. All the people bellyaching about the American test prep industry should pause for a moment to think about the size of the gaokao enterprise. The original China Youth Daily story focused on Maotanchang’s economic transformation, something Larmer also mentions. Parents are paying small fortunes for tutoring, for cheating devices, for impersonators, for bribes for certificates. All of these services have their own inventory supply chains and personnel. Turn the gaokao into a meritocratic test and what happens to a small but non-trivial chunk of the Chinese economy?

But I’m just stunned at how much worse the Chinese fraud is than I’d ever imagined.

Sure, well-connected parents could probably bribe their kids into college. Sure, urban kids who had better schools that operated longer with educated teachers would likely learn more than those stuck with “substitutes”. Sure, the content was probably absurd and had little relationship to actual knowledge. Sure, the tests were little more than a memory capacity game, with students memorizing essays as well as facts that had no real meaning to them. Without question the testers were engaging in rampant cheating.

But not once had I considered that the test difficulty varied by province, that some kids got affirmative action or athletic points added directly to their score, and worst of all, that a kid from Outer Nowhere who scored a 650 would have no chance at a college that accepted a kid from Beijing with a 500.

Once again, I am distressed to realize that my cynical skepticism has been woefully inadequate to the occasion.

The gaokao isn’t a meritocracy. Millions of kids who live in the wrong province are getting screwed by a test whose great claim to fame is that it will reward applicants strictly by merit. And of course, the more kids who apply to college, the more cut scores and test difficulty will increase–but only for those students from those wrong provinces. Meanwhile, the kids from the “right” provinces have a (relatively) easy time.

In this context, the 2013 gaokao cheating riot takes on a whole new light. If you really want to feel sad, consider the possibility that Yang’s friend, Cao, now working as a migrant, might have scored higher on a harder test than a rich kid in Shanghai.

By the way, could someone alert Ron Unz?

*Note: in the comments, someone who understands this is (bizarrely, to me) fussed over my use of the “rural/urban” paradigm. I was using the same construct that Brook Larmer and others have. The commenter seems to think it makes a difference. My point is simpler, and I don’t think obscured for non-Chinese readers. But I caution anyone that I’m utterly unfamiliar with Chinese geography.


The SAT is Corrupt. No One Wants to Know.

“We got a recycled test, BTW. US March 2014.”

This was posted on the College Confidential site, very early in the morning on December 6, the test date for the international SAT.

Did you get it?

Get what?

I mean how do you know it was a recycled Marhc test? Do you have the March Us test?

Oh, no. I just typed in one of the math questions from today’s test and the March US 2014 forum popped right up.

And of course, the March 2014 test thread has all the answers spelled out. The kids (assuming it’s kids) build a Google doc in which they compile all the questions and answers.

This is a pattern that goes on for every SAT, both domestic and international. The kids clearly are using technology during the test. They acknowledge storing answers on their calculators, but don’t explain what allows them to remember all the sentence completions, reading questions and even whole passages verbatim, much less post their entire essay online. Presumably, they are using their phones to capture the images?

They create a Google doc, in which they recreate as many of the questions as can be remembered (in many cases, all) and then they chew over the answers. By the end of the collaboration, they have largely recreated the test. They used to post links to the document openly on request. But recently the College Confidential moderators, aware that their site is being exposed as a cheating venue, have cracked down on requests for the link, while banning anyone who links to the document.

So floating out there somewhere in the Internet are copies of the actual test, which many hagwons put out (and then pull down because hey, no sense letting people have them for free), as well as the results of concentrated braindumping by hundreds of testers.

For international students, “studying for the SAT” doesn’t mean increasing math and vocabulary skills, but rather memorizing the answers of as many tests as possible.

And those are just the kids that aren’t paying for the answers.

The wealthy but not super-rich parents who want a more structured approach pay cram schools–be they hagwons, jukus or buxiban–to provide kids with all the recycled tests and have them memorize every question. No, not learn the subject. Memorize. As described here, cram schools provide a “key king”, a compilation of all the answer sequences for sections, using all the potential international tests. They know which ones will be recycled because the CB “withholds” these tests.

Of course, the super-rich parents don’t want to fuss their kids with all that memorizing. Cram schools have obtained copies of all the potential international tests by paying testers to photograph them. Then they pay someone to take the SAT in the earliest time zone for the International, and disseminate the news via text to all the testers. They just copy the answers from the pictures. Using phones. Which they have told the proctors they don’t have, of course.

I don’t know exactly how all this works—for example, are the cram schools offering tiered pricing for key kings vs. phoned in answers? Do different cram schools have different offerings? I’ve read through the documented process provided by Bob Schaeffer of FairTest (a guy I don’t often agree with), and it seems very credible. He’s also provided a transcript of an offer to provide answers to the test. Valerie Strauss got on the record accounts of this process from two international administrators, Ffiona Rees and Joachim Ekstrom.

Every so often Alexander Russo complains that Valerie Strauss shouldn’t do straight education reporting, given her open advocacy against reform.

Great. So where’s all the other hard reporting on this topic? The New York Times, whose public editor Margaret Sullivan just encouraged it “to enlighten citizens, hold powerful people and institutions accountable and maybe even make the world a better place”, bleeds for the poor Korean and Chinese testers anxious for their scores and concerned they’ll be tarred with the same brush. Everyone else just spits out the College Board press release–if they mention it at all. While most news outlets reported the October cancellation, few other than Strauss reported that the November and December international test scores were delayed as well.

At the same time Strauss reported the College Board is stonewalling any inquiries as to how many kids were cheating, how many scores were cancelled, or what it was doing to prevent further corruption, an actual Post “reporter”, Anna Fifield, regurgitates a promotional ad for a Korean SAT equivalent coach.*

Well, you can understand why. The millionaire Korean test prep coach-called-a-teacher story is one of the woefully underreported stories of the 21st century. I mean, we only had one promo put out by the Wall Street Journal the year before, and another glowing testimonial from CBS a few months later (even mentioning the tops in performance, bottom in happiness poll). But really, only one or two a year of these stories have been coming out since 2005.

So you can see why the Post felt another story on a Korean test prep instructor making millions required immediate exposure, if not anything approaching investigation or reporting.

These stories are catnip to reporters who get all their education facts from The Big Book Of Middlebrow Education Shibboleths. First, unlike our cookie cutter teacher tenure system, Korean teachers work in a real meritocracy where kids and their parents reward excellence with cash. Take that, teachers!

Then, unlike American moms and dads, Korean parents care about their kids and put billions into their education. Take that, parents!

And oy, the faith Anna shows in her subjects. Cha is a “top-ranked math teacher” who “says” he earns a “cool $8 million last year.” Cha says he’s been teaching for 20 years, but refuses to give his age and there’s no mention of the topic or school he attended for his PhD, or if he ever got one. But he’s got a really popular video, so he must be great!

Some outlets are less adulatory. The Financial Times points out that the Korean government is cracking down on hagwon fees and operating hours, and preventing them from pre-teaching topics. Megastudy, the company in the 2005 story linked above, just went up for sale because of those government changes. Michael Horn of the Christensen Institute is doing no small part to alert people to the madness of the Korean system. The New York Times, despite its tears for the Korean and Chinese testers, has done its fair share to report on the endemic cheating in Chinese college applications.

But when it comes to the College Board and the SAT, everyone seems to be hands off the international market. At what point will it occur to reporters to seriously investigate whether a large chunk of the money spent on cram schools is not for instruction, but for “prior knowledge” cheating? When will they ask the Korean cram school instructors if they are fronts for an organized criminal conspiracy, if the money they get is not for tutoring, but for efficient delivery of test answers on test day? And how many of those test days are run by the College Board?

People think “well, sure, there’s some cheating, but so what? Some kids cheat.” Yeah, like I’d be writing this if it were a few dozen, or even a few hundred kids. The number of Asian immigrants cheating on major tests in this country is in the high hundreds a year. Maybe more. In China and Korea? I suspect it’s beyond our comprehension, us ethical ‘murricans.

One of the depressing things about the past three years is that I started looking into things more closely. I never really trusted the media, mind you, but I did assume that journalists skewed stories because of bias. I fondly imagined, silly me, that journalists wanted to investigate real wrongdoing. Yes. Laugh at my foolish innocence.

Consider what would be disrupted if public American pressure forced the College Board to end endemic international student cheating. First, the CB would lose millions but weep no tears, it’s a non-profit company. hahahahah! Yeah, that makes me laugh, too.

But public universities increasingly rely on international student fees and the pretense that they are qualified to do college work. After all, the thinking goes, we accept a lot of Americans who aren’t prepared for college work—may as well take in some kids who pay full freight. Private schools, too, appreciate the well-heeled Chinese students who don’t expect tuition discounts.

So suppose public pressure forces the College Board to use brand new tests for the overseas market, require all international testing to be done at US international schools, use different tests at different locations. The College Board might decide that the international market profits weren’t worth the hassle for other than US students living abroad (as indeed, the ACT seems to have done for years). Either way, a crackdown on testing security would seriously compromise Chinese and Korean students’ ability to lie about their college readiness and English skills.

A wide swath of public universities would either have to forgo those delightful international fees or simply waive the SAT requirement, but without those inflated test scores it will be tough to justify letting in these kids over the huge chunk of white and Asian Americans who are actually qualified. No foreign students, more begging for money from state legislatures. Private universities would have a difficult time bragging about their elite international students without the SAT scores to back things up.

Plus, hell, we changed the source country for zombies because we didn’t want to piss off China. Three years ago, the College Board wanted to open up mainland China as a market. 95% of the SAT testers in Hong Kong are Chinese. Stop all that money flowing around? People are going to be annoyed.

At this point, I start to feel too conspiratorial, and go back to figuring that reporters just don’t care. I’ve got a lot of respect for education policy reporters—the Edweek reporters are excellent on most topics—and most reporters do a good job some of the time.

But the SAT is basically corrupt in the international market. I’ve already written about test and grade corruption among recent Asian immigrants over here, particularly in regards to the Advanced Placement tests and grades.

Yet no one seems to really care. Sure, people disapprove of the SAT, but for all the wrong reasons: it’s racist, it’s nothing more than an income test, it reinforces privilege, it has no relationship to actual ability. None of these proffered reasons for hating the SAT have any relationship to reality. But that the SAT is this huge money funnel, taking money from states and parents and shoveling it directly or indirectly into the College Board, universities, and the companies who have essentially broken the test? Eh. Whatever.

The people who are hurt by this: middle and lower middle class whites and Asian Americans. So naturally, who gives a damn?

enlighten citizens, hold powerful people and institutions accountable and maybe even make the world a better place

Sigh. Happy New Year.

*****************************
*In the comments, an actual SAT prep coach making millions–no, really, he assures us, millions!–simply by being a fabulous coach with stupendous methods is insulted that I insinuated that the Washington Post story was on an SAT prep coach, rather than the Korean equivalent of the SAT. I knew that, but at one point referred to the guy as an SAT prep coach. I fixed the text.


Advanced Placement Test Preferences: Asians and Whites

I just finished my AP US History survey course, and a glorious time it was. But I will save the specifics of my three to four hour lectures, and whether or not this is a good way to teach history for another post. I will also, hopefully, weigh in some time on what value add I think I bring to history. (If you’re curious, in public school I taught history of Elizabethan theater and a truly awesome 50s science fiction film course, in which students were to analyze the movie’s foreign policy approach by Walter Russell Mead’s paradigm.)

I always end my AP class by discussing the students’ course selections for next year. APUSH is a junior course, and I have about ten kids in this particular class, and the conversation is always the same.

“What are you taking?”

“Calc BC, AP Physics C, AP Bio, AP Gov.”

“You?”

“AP Chem, AP Stats, AP Psych,…”

“So you’ve already taken BC?”

“Yeah, just took the test. Piece of cake. I’m taking intro to MVC.”

“What about AP English?”

All the heads shake. “God, no. Way too hard.”

One kid says “I’m taking AP Gov, I heard it’s easy.”

“I’m taking Macro Econ, one of our teachers has all the info you need to pass the tests.”

I laugh. “Jesus. Embrace the stereotype.”

They all get it and laugh, shamefacedly.

“Who’s taking AP English this year?” Two hands rose. “AP English next year?” No hands.

“So here’s what I don’t understand. You are all trying to get into college, and the reason you are taking these tough classes is to make yourself look good for colleges.”

“Sure.”

“And I see only Chinese, Korean, and Indian Americans in front of me, all either FOB or citizens with parents who lived most of their lives in China, Korea, or India. Moreover, as I imagine you’ve heard, and certainly your parents have heard, universities often engage in some form of discrimination against Asians.”

“Wow,” one of the students laugh-gasped. “I never thought I’d hear an American admit that.”

“An American, or a white person?”

“They aren’t the same?”

“You born here?” Pause, as I see that datapoint register. Yes. She’s an American. (We’ll leave aside the fact that they don’t consider blacks and Hispanics American, either. I’ve written about this before; it’s still weird to see.)

“Anyway. All of you avoid classes that involve reading literature or written analysis because they would be too difficult.”

“Well, yeah.”

“So the stereotype is all wrong.”

“What, the stereotype that says we’re good at science and math?”

“No, the stereotype that says you work hard, that you take on challenges.”

“Oooooh, SNAP.”

I smiled, too. “Look, there’s a serious point here. You’re a college admissions officer, reading through approximately 16 billion Asian resumes that all read exactly the same: 4.2 GPA, BC calculus as a sophomore (with the occasional underachiever waiting until junior year), several AP science courses, APUSH for those of you who can string a sentence together, AP Chinese for those of you lucky enough to win the language lottery, and so on. What’s going to stand out? Not one more STEM course.”

“Yeah, but I hate reading.”

“You think the universities don’t know that? Oh, look, one more Asian kid who’s a machine at math and can memorize all the facts in AP Bio but uses Cliff notes for Hamlet. College admissions is a numbers game anyway, and I’m not pretending anything is going to make a huge difference, but…”

“My dad says colleges are reducing Asians born here…American Asians [score!] for Chinese and Koreans.”

“Your dad’s right. So given all the work you’re putting in clearly to just get that last inch of consideration, may I suggest that the path to differentiation lies in showing the admissions reviewer that you take on challenges in all subjects, as opposed to taking classes you know you’ll get an A in.”

***********************************************

I was going to just post this little anecdote, but then I got to wondering just how prevalent the behavior is: is it exclusive to my little corner of the country, or are the recent Asian immigrants showing up in national data?

One of the problems with AP data is that you simply can’t make too many assumptions. For example, much has been written about the fact that the mode AP score for blacks is 1. Not only do most blacks fail the AP test, people wail, but they fail it completely! Twice as many blacks get a failing score as get a passing score! Our teachers are failing black children!

Yeah, no. The black AP population is a combination of at least three different groups. First, the group of genuinely qualified, academically prepared black students. Small group, I know, but each year hundreds of African American students take and pass the BC Calculus test, many with a score of 5 (however, 1 is still the mode for BC Calc). Second, the group of average or higher ability blacks with relatively little interest in academic success, who have nonetheless been put in AP classes by desperate suburban school officials who are under fire from the feds for their “opportunity gap” numbers. These are kids who could, with good teaching, achieve a respectable “3” on a number of tests, and probably do.

The problem, alas, is that a teacher can focus on getting middle achievers over the hump, or on challenging a bunch of smart kids. Can’t do both in the same room, not easily and probably not at all. Thus bringing in more marginal black students and coaxing them to a three occasionally has a depressing effect on suburban AP scores, as the top white kids aren’t being taught at the top of their ability. But I digress.

The third group, and it’s huge, comes from low income urban and charter schools gaming the GPA and Jay Mathews Challenge Index. These are kids who are barely literate, often aren’t even taught the course material, but boy, by golly if they get the butts in the seats they’ll show up on Jay’s list somewhere. All at taxpayer expense.

While the AP test results disaggregate Mexicans and Puerto Ricans from the rest of Hispanics, Mexican performance has the same conflation of three groups as black results do, and is equally useless. The Hispanic mode score is also one.

Asian scores aren’t disaggregated, but the Big Three (Chinese, Koreans, and Indians) dominate.

So are Asians showing a preference for science and math over the humanities AP tests?

AP testing populations by race–mostly. It would have been a huge hassle to add up all the URM categories, so I just subtracted whites and Asians from the total. So “Decline to state” is categorized as a URM, when it’s probably mostly white. I checked a couple values, it wasn’t a big difference. These are the top 20 tests by popularity, in order from left to right1.

[Image: 2013aptablebyrace]

The visual display is useful: look for big green, little blue, or a relatively high number of URMs, fewer Asians. See? Asians live the stereotype. Don’t assume that blacks and Hispanics are drawn to the Humanities courses; it’s just easier for schools to shove unprepared kids into English, Geography, and History classes than into science and math courses. Fewer prerequisites.

Here’s the same data in table form. I added one column, Asians as a percentage of the Asian/white total, to clear away the URM noise. Then I highlighted the tests for each column that were more than one average deviation away from the mean, both higher and lower (I used average deviation because I don’t want outliers emphasized. Just wanted to show spread.) I bolded any values that were more than two average deviations away from mean.
[Image: 2013aptablebyrace]
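For anyone who wants to reproduce the highlighting, here is a rough sketch of the average-deviation rule described above: flag values more than one average (mean absolute) deviation from the mean, and bold those more than two away. The function names are mine and the percentages in the example are made up for illustration, not taken from the table.

```python
from statistics import mean

def average_deviation(values):
    """Mean absolute deviation from the mean."""
    center = mean(values)
    return mean([abs(v - center) for v in values])

def classify(values):
    """Label each value 'bold' (> 2 average deviations from the mean),
    'highlight' (> 1 average deviation), or 'plain'."""
    center = mean(values)
    spread = average_deviation(values)
    labels = []
    for v in values:
        distance = abs(v - center)
        if distance > 2 * spread:
            labels.append("bold")
        elif distance > spread:
            labels.append("highlight")
        else:
            labels.append("plain")
    return labels

# Hypothetical Asian share of testers (percent) on six tests.
shares = [10, 14, 16, 18, 22, 40]
print(classify(shares))
# ['highlight', 'plain', 'plain', 'plain', 'plain', 'bold']
```

Average deviation weights every point by its plain distance from the mean, so a single extreme test moves the threshold less than it would with standard deviation, which squares the distances.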

Whites are the most tightly clustered, URMs next. Asians tilt strongly towards and against.

There’s a lot more to explore here, and I hope to do that soon. But for now, I wanted to stay focused on Asian vs. white preferences. So I next compared the top 20 Asian test preferences to those of whites. (Actually, I did 22 for Asians because I thought #22 was revealing.)

AP totals include many multiple testers, so I took the number of testers for any given test as a percentage of the total for that race. This is not a perfect measure, for obvious reasons. Or maybe not so obvious. Say, for example, that an entirely different group of Asians take the English Lit test than take the Calc AB test, but the white students have a significant overlap. In that case, the percentage of testers would be saying something entirely different about each group than if both Asians and whites had overlapping testers.

However, in either case, it would be revealing. If more whites than Asians took both math and English tests, or if one group of Asians took math tests and another group took English (or the same were true of whites), the percentages are still showing a preference. I think. I’m sure there’s a way to describe this more technically, but it’s late, the school year’s almost over, so put the correct text in comments and I’ll change it.
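The measure itself is simple enough to sketch. This is an illustration of the calculation as described above, with hypothetical counts and a made-up function name. The denominator is the group's total number of exams, not its number of students, which is exactly why multiple testers muddy the interpretation.

```python
def share_of_group_total(test_counts):
    """Return each test's count as a percentage of the group's total exams.

    test_counts maps test name -> number of exams taken by one group.
    The total counts exams, so a student sitting three tests is counted three times.
    """
    total = sum(test_counts.values())
    return {test: 100.0 * n / total for test, n in test_counts.items()}

# Hypothetical counts, NOT the 2013 AP figures.
white_counts = {"English Lit": 300, "Calc AB": 150, "US History": 50}
asian_counts = {"English Lit": 50, "Calc AB": 100, "US History": 50}

white_share = share_of_group_total(white_counts)
asian_share = share_of_group_total(asian_counts)

for test in white_counts:
    print(f"{test:12s} white {white_share[test]:5.1f}%  Asian {asian_share[test]:5.1f}%")
```

Nothing in these percentages tells you whether the same students are behind two different tests. That would take student-level data, which the totals don't provide.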

Anyway.

whitestop20ap

asiantop22

And here it is graphically, ranked again by test popularity. The blue and green columns are the percentage of white or Asian testers taking that test. The graph above was percent of each test population that was white/Asian/URM. These columns show the percent of white or Asian population taking that particular test (the blue column “% of total” in the tables immediately above). The line graph is the percent of each group that scored a 5 on that test.

apasianwhitepref

(You notice something weird? Spanish is the tenth most popular test–but it barely makes the top 20 for either whites or Asians. How could that be? Who on earth is taking all those Spanish tests?)

So again, I want to write more about these results but I thought I’d put them out there and let people chew on them. Here’s a few preliminary observations:

  • Whites appear to be the utility players, good in a number of subjects and not expressing huge preferences. They are stretching more into STEM than Asians stretch into writing.
  • Asians appear to be avoiding writing-intensive tests relative to whites, no matter how you interpret the data.
  • Asians tend to choose tests that are more likely to yield high scores, and avoid tests that give out fewer 5s. Until recently, AP Bio doled out 5 scores like candy; they clearly changed scoring in some significant way this year (without announcing it, I guess). Environmental Science, which has a deservedly crappy rep, is actually pretty hard to get a high score on, so Asians avoid.
  • The real difference between Asians and whites in both preferences and scores is in the science tests, not math. Asians have higher scores in all tests—and while that’s probably a reflection of cognitive ability, you really can’t understand the difference in preparation and grinding until you see it—but the real gaps are in the sciences. AP Science courses are, in my opinion, pretty horrible to begin with. Yes. It’s the subject I don’t teach. Bias alert.

TL;DR: Asians across the land reflect the same biases. They may or may not be working hard, but they appear to be avoiding subjects that are more difficult for them, and don’t yield as high a score. This may also be why they avoid the ACT. Or not.

More on this later. Let me know what you think and of course, point out any errors.

1I actually did this work from the bottom up. So in the first chart, which was actually the last one I did, there are only 19 tests. Guess which one I left off, and why. The other charts all have 20 tests.


Finding the Bad Old Days

Michael Petrilli wrote an extremely aggravating article suggesting we tell unqualified kids they aren’t ready for college and should go to CTE instead, and then a much improved follow-up that acknowledges the racial reality of his idea.

In his first piece, Petrilli only mentions race once:

PetrilliCTEquote3

This is a common trope in articles on tracking, a nod to “the bad old days” right after the end of segregation, that time immediately after Brown and ending sometime in the late 70s, or when Jeannie Oakes excoriated the practice in Keeping Track.

In the bad old days, the story goes, evil school districts, eager to keep angry racist white parents from fleeing, sought a means of maintaining segregation despite the Supreme Court decision and the Civil Rights Act. So they pretended to institute ability grouping and curriculum tracks, but in reality, they used race. That way the district could minimize white flight and still pretend to educate the poor and the brown. That’s why so many brown kids were in the low ability classes, and that’s why so many lawsuits happened, because of the evil racist/classist methods of rich whites keeping the little brown people down.

The bad old days are a touchstone for anyone proposing an educational sorting mechanism. So you have people like Petrilli, advocating a return to tracking, who tell us the bad old days are a thing of the past: yeah, we used to track by race and income, pretending to use ability, but we’ve progressed. Districts pretended to use IQ, but they were really using culturally biased tests to commit second-order segregation. Today, we understand that all races and all incomes can achieve. Districts don’t have to distort reality. The bad old days are behind us, and we can group by ability, secure in the knowledge that we aren’t discriminating by race.

Before ed school, I accepted the existence of the bad old days, but then I noticed that every reading asserted discrimination but didn’t back it up with data. Since ed school, I’d occasionally randomly google on the point, looking for research that established discriminatory tracking back in the 60s and 70s. And so the Petrilli article got me googling and thinking again. (What, buy books? Pay for research? Cmon, I’m a teacher on a budget. If it’s damning, the web has it.)

I first reviewed Jeannie Oakes, reaffirming that Oakes holds tracking itself, properly applied, as the operative sin. Discriminatory tracking isn’t a main element of Oakes’ argument, although she points out that “some research” suggests it occurred. Oakes’ third assumption, that tracking is largely made on valid decisions (page 4), is accepted at face value. So the grande dame of the anti-tracking movement has completely neglected to mention the bad old days, which, at that time, would have been contemporary.

On I move to Roslyn Mickelson, who does charge Charlotte-Mecklenburg schools with discriminatory tracking.

mickelson5

In Capacchione v Charlotte-Mecklenburg, Judge Robert Potter eviscerates her expert testimony, finding faults with her credibility, her accuracy, and her logic.

Bottom line, however, Mickelson’s research shows that high achieving scorers in year one are not consistently placed in high achieving classes six years later. While both whites and blacks with high scores end up in low tracks and vice versa, more whites get high placement than blacks. But generally, her data shows something I’ve documented before, that achievement falls off each year because school gets harder.

Both whites and blacks experience the falloff, even though Mickelson seems to think that the pattern should be linear. The achievement scale simply gets larger as kids move up in grade levels, and fewer blacks make the top tier. This is consistent with cognitive realities.

There might be a smoking gun in research. But I couldn’t find it.

Then I suddenly realized duh, what about case law? If districts were tracking by race, there’d be a lawsuit.

I started with three legal articles that discussed tracking case law: 1, 2 and 3. They were all useful, but all failed to mention a significant case in which the district routinely used different standards or sorted directly by race or zip code.

From these articles, I determined that Hobson v. Hansen was the original tracking case, and that the McNeal standard was for many years (and may still be) the test for ability grouping.

So I created a reading list of cases from the late 60s to the early 90s:

Only two of these cases involved schools directly accused of using race to sort students. In Johnson v. Jackson, the schools were forced to integrate in the middle of a school year. The black kids were ported over to white schools and the classes kept intact. The court ordered them to fix this. From first integration order to the fix order: 4 months.

The second case, Rockford, was decided in the early 90s, and the judge directly accuses the district of intentionally using race to ability group. However, Jeannie Oakes was the expert witness, and the judge drank every bit of Koolaid she had to offer and licked the glass. Oakes is presented as an expert witness, with no mention that she’s an anti-tracking advocate. Her testimony appears to be little more than readings from her book and some data analysis.

The proof of “intentional racism” was pretty weak and largely identical to Mickelson’s described above. Major difference: the judge accepted it.

Leaving aside these two cases, I couldn’t find any case in which the district was found to misuse the results of the test, either by using different racial standards or ignoring the tests entirely. The tests themselves were the issue.

In the south, school systems that weren’t “unitary” (that is, were previously segregated districts) couldn’t use ability testing. Since blacks would have lower scores based on past racial discrimination, the use of tests was discriminatory, an intent to segregate.

For school systems that were found to be unitary, ability testing isn’t in and of itself invalid and racial imbalance isn’t a problem (see Starkville case for example).

In all these cases, I couldn’t find a district that was tracking by race. They were guilty of tracking by test. Everyone knew the tests would reveal that blacks would have lower ability on average, and therefore ability grouping was by definition invalid in previously segregated schools. This was an era in which judges said “The court also finds that a Negro student in a predominantly Negro school gets a formal education inferior to the academic education he would receive, and which white students receive, in a school which is integrated or predominantly white.” (Hobson)

Once the system was declared unitary, or where unitary status was never an issue, the record is mixed. When judges did accept the test results as valid, they ruled in favor of the school districts (Starkville, Hannon). In Pase v Hannon, the judge actually reviewed the test questions himself and determined they were unbiased with few exceptions, all of which were far above the IQ level in question.

In California, on the other hand, where de jure segregation wasn’t an issue*, the mere existence of racial imbalance was still a problem (Pasadena, Riles). In Riles, Judge Robert Peckham banned all IQ testing of blacks in California for educational purposes. He later extended the ruling even if black parents requested testing, but later withdrew that order. Peckham’s reasoning is much like the other judges who believed in cultural bias:

Even if it is assumed that black children have a 15 percent higher incidence of mild mental retardation than white children, there is still less than a one in a million chance that a color-blind system would have produced this disproportionate enrollment. If it is assumed that black children have a 50 percent greater incidence of this type of mental retardation, there is still less than a one in 100,000 chance that the enrollment could be so skewed towards black children.

Notice the reasoning: of course it’s not possible that blacks have a 50% greater incidence of an IQ below 75. Except it’s worse than that.

This image is from The Bell Curve (borrowed from here) reflecting the frequency of black/white IQ distribution:

BCFreqblkwhiteIQ

As many blacks as whites populate the sub 75 IQ space, but the population distribution being what it is, blacks are far more likely to have low IQs.

When Charles Murray researched this for The Bell Curve:

In the NLSY-79 cohort, 16.8 percent of the black sample scored below 75, using the conversion of AFQT scores reported in the appendix of TBC and applying sample weights. The comparable figure for non-Latino whites was 2.2 percent. In the NLSY-97 cohort, the comparable figures were 13.8 percent for blacks and 2.7 percent for non-Latino whites.

(Charles Murray, personal communication)

So at the time of Peckham’s decision, blacks didn’t have a 50% higher chance of an IQ below 75, but rather a several hundred percent higher chance, a chance that is still in the triple digits today.1 Peckham couldn’t even begin to envision such a possibility, and so no IQ testing for blacks in California.
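The arithmetic behind "several hundred percent" is worth spelling out. This sketch uses only the four percentages Murray reports above. The function name is mine, and "percent higher" is taken to mean the ratio of the two rates minus one.

```python
def percent_higher(rate_a, rate_b):
    """How much higher rate_a is than rate_b, as a percentage of rate_b."""
    return (rate_a / rate_b - 1.0) * 100.0

# Percent of each sample scoring below 75, as quoted from Murray above.
nlsy79 = percent_higher(16.8, 2.2)   # black vs. non-Latino white, NLSY-79
nlsy97 = percent_higher(13.8, 2.7)   # black vs. non-Latino white, NLSY-97

print(f"NLSY-79: {nlsy79:.0f}% higher")
print(f"NLSY-97: {nlsy97:.0f}% higher")
```

On these figures the gap works out to roughly 660 percent in the 1979 cohort and roughly 410 percent in the 1997 cohort, against the 50 percent Peckham treated as the outer bound of his hypothetical.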

(As for the lower frequency of blacks in the “trainable” mentally retarded division, as it was called then, an interesting but rarely discussed fact: Low IQ blacks are often higher functioning than low IQ whites. They are less likely to be organically retarded, and more likely to be capable of independent living. This despite the fact that their IQ tests and academic outcomes are identical. Arthur Jensen discovered this phenomenon, and I highly recommend that article; it’s fascinating. I wonder if the difference is somehow related to crystallized vs. fluid intelligence, but haven’t read up enough on it.)

So there it is. Obviously, if I missed a key case in which a major district was found to have deliberately tracked kids by race, please let me know.

But despite extensive efforts, I couldn’t find the bad old days of discriminatory sorting. What I found, instead, was a judicial rejection of IQ and other ability tests, coupled with an inability to conceive of the actual distribution patterns of cognitive ability.

Please understand my limited objective. Many Southern districts did everything they could to avoid integration. See, for example, US v Tunica, where the school tried to assign students based on test scores, but was denied because of the achievement testing ban and required to reassign students and teachers to achieve integration. The teachers refused assignment to integrated schools and resigned, white parents withdrew their kids, then the white schools set up shop at local churches, classes largely intact. Money? Not an issue. They used taxpayer dollars, since the district paid the teachers who resigned and the kids took all their school books with them.

But believe it or not, there’s no mention that the district was only pretending to use test scores, actually assigning students by race. And this is a place where I’d expect to find it. Opposition to integration, absolutely. Achievement testing used as a way to minimize racially mixed classes? Sure.

In many other cases, schools or districts instituted tracking as a genuine attempt to educate a much wider range of abilities, or even had a tracking system in place before integration.

The inconvenient realities of cognitive ability distribution being what they are, the test scores would be depressingly indifferent to intent.

Then there’s the messy middle, the one that Mickelson probably found in Charlotte and Oakes found in Rockford and anyone looking at my classrooms would find as well. All tracked classrooms are going to have inconsistencies, whether the schools use tests, teacher recommendations, or student choice. The honors classes fill up or a teacher suddenly dies or all sorts of other unforeseen situations mean some kids get moved around, and it’s a safe bet high income parents bitch more about wrong assignments than poor parents. Go through each high score in a “regular” class and each low score in a tracked class, and each one of those test scores will have a story, one that usually doesn’t involve race or malign intent. The story occasionally does involve bad teachers or district bureaucracy, but not as often as you might think.

Teacher recommendations are supposed to mitigate the testing achievement gap, but teachers are moralists, particularly in math, as I’ve written before. It doesn’t surprise me that a new study shows that, controlling for performance, blacks are less likely to be assigned to algebra as 8th graders by teacher recommendation. I can’t tell you the number of bright Hispanic and black kids I’ve run into (as well as a huge number of white boys, including my son) who don’t bother with homework and have great test scores. So their GPA is 2.7, but their test scores are higher than the kids who got As–and the teacher recommendations.

Parents: some parents insist that their kids need to be in the top group to be challenged. Others feel that their kids do better when they feel secure, able to manage the challenge. Then there are the parents who don’t give a damn about their kids’ abilities but don’t want them in a noisy classroom with kids who don’t give a damn about education. White and Asian parents are disproportionately represented in the first group, black and Hispanic parents take up more than their share in the second, and all parents of all races worry about the last.

So let’s stop using teacher recommendation, stop allowing parents or students to ask for different placement. Test scores are destiny.

But test scores today still reflect the same reality that the judges assumed, back then, could only be caused by racism or bias.

The tests haven’t changed. The kids haven’t changed much.

The judges are another story.

Richard Posner, in a much-quoted 1997 decision on an appeal of People Who Care v Rockford, did what he has done before–made my point with much greater efficiency:

Tracking is a controversial educational policy, although just grouping students by age, something no one questions, is a form of “tracking.” Lawyers and judges are not competent to resolve the controversy. The conceit that they are belongs to a myth of the legal profession’s omnicompetence that was exploded long ago. To abolish tracking is to say to bright kids, whether white or black, that they have to go at a slower pace than they’re capable of; it is to say to the parents of the brighter kids that their children don’t really belong in the public school system; and it is to say to the slower kids, of whatever race, that they may have difficulty keeping up, because the brighter kids may force the pace of the class. …

Tracking might be adopted in order to segregate the races. The well-known correlation between race and academic performance makes tracking, even when implemented in accordance with strictly objective criteria, a pretty effective segregator. If tracking were adopted for this purpose, then enjoining tracking would be a proper as well as the natural remedy for this form of intentional discrimination, at least if there were no compelling evidence that it improves the academic performance of minority children and if the possible benefits to the better students and the social interest in retaining them in the public schools were given little weight. The general view is that tracking does not benefit minority students…although there is evidence that some of them do benefit… All this is neither here nor there. The plaintiffs’ argument is not that the school district adopted tracking way back when in order to segregate the schools. It is that it misused tracking, twisting the criteria to achieve greater segregation than objective tracking alone would have done. The school district should be enjoined from doing this not, on this record, enjoined from tracking.

The Charlotte-Mecklenburg case mentioned above cited Posner’s reasoning. The third of my case law articles discusses Holton v Thomasville II, which doesn’t mention Posner but does say that racial imbalance in ability grouping isn’t of itself evidence of discrimination, and points out that the time for judicial interference in educational decisions is probably over:

holtoncase

Most districts ended tracking out of fear of lawsuits. It may be time for parents to demand more honors classes, test the limits.

So what does this have to do with Petrilli? Well, less than it once did, now that Petrilli has acknowledged the profound racial implications of his suggestion.

But if the bad old days of racial tracking never really existed, then Petrilli can’t pretend things will be better. Yes, we must stop devaluing college degrees, stop fooling kids who have interest but no ability into taking on massive loans that they can never pay off. And with luck even Petrilli will eventually realize as well that we have to stop forcing kids with neither interest nor ability to sit in four years of “college preparation” courses feeling useless.

So what comes next? Well, that’s the question, isn’t it?

*************************
*Commenter Mark Roulo points out that California did commit de jure segregation against Hispanics and was ordered to stop in Mendez v. Westminster. See comments for my response.

1See Steve Sailer’s comment for why black IQs might have been biased against lower IQ blacks and the 97 data more representative.