(I was writing my final article on this era when I realized I hadn’t really focused completely on the history of Value Added Metrics (VAM) in my original coverage of the Obama years. I am saying this because VAM sprites both pro and con are holding me at gunpoint demanding I write an article all about them.)
In 2009, The New Teacher Project’s The Widget Effect declared that schools treated all teachers as interchangeable units, didn’t bother to train new teachers, refused to fire tenured teachers, and worse, gave all teachers high ratings. 99% of teachers got ratings of Proficient or higher! The shame!
Mind you, none of these are new declarations, but this paper initiated the argument that allowed Obama and Duncan (as I wrote here) to demand that states evaluate teachers with student achievement, and that achievement must be test scores. Thus, one of the requirements for a Duncan “waiver” from No Child Left Behind school “program improvement penalties”, which by now were affecting over half of all schools, was that the state must begin evaluating teacher effectiveness using data–just another word for VAM.
Put another way, Obama and Duncan allowed states to escape schoolwide accountability for student test scores by forcing them to agree to teacher accountability for student test scores.
In 2009, 10 states required evaluation to include student achievement metrics. By 2015, 43 states required value-added metrics for evaluation. Most courts agreed that the usually hasty and poorly thought-through implementation plans were absurd and unfair, but declined to step in. There were some notable exceptions, as you’ll see. (Note: I wrote a longer opinion of VAM that includes more info.)
From 1% Ineffective to…..?
By now, no one should be surprised to learn that these efforts were a spectacular failure, although rarely reported in just those terms. But by 2019, only 34 states required VAM, and most states still requiring it on paper had watered down the impact by dramatically reducing the VAM component, making VAM optional, removing the yearly requirement for teacher evaluations, or allowing schools to design their own metrics.
In the definitive evaluation, Harvard researchers studied 24 states that implemented value-added metrics and learned that principals refused to give teachers bad ratings. In fact, principals would rate teachers lower in confidential ratings than in formal ones, although in either method the average score was a positive evaluation. When asked, principals said that they felt mean giving the bad results (which suggests they didn’t agree with them). Moreover, many principals worried that if they gave a bad review, the teachers might leave–or worse, force the principal to begin firing procedures. Either way, the principal might end up forced to hire a teacher no better or possibly worse.
Brief aside: Hey, that should sound familiar to long-time readers. As I wrote seven years ago: “…most principals don’t fire teachers often because it’s incredibly hard to find new ones.” Or as I put it on Twitter back when it allowed only 140 characters, “Hiring, not firing, is the pain point.”
So the Obama administration required an evaluation method that would identify bad teachers for firing or training, and principals are worried that the teachers might leave or get fired. That’s….kind of a problem.
Overall, the Harvard study found that only two of the 24 states gave more than 1% of teachers unsatisfactory ratings.
If you do the math, 100% – 1% = 99%, which is exactly what The Widget Effect found, so that was a whole bunch of money and energy spent for no results.
The study’s outlier was New Mexico, which forced principals to weight VAM as 50% of the overall evaluation score, courtesy of Hanna Skandera, a committed reform education secretary appointed by a popular Republican governor. As a result, over 1 in 4 teachers were rated unsatisfactory.
But! A 2015 court decision prevented any terminations based on the evaluation system, and the case got delayed until it was irrelevant. In 2017, Governor Martinez agreed to a compromise on the evaluation methodology, increasing permitted absences to six and dropping VAM from 50% to 35%. New Mexico also completed its shift from a purple to blue state, and in 2018 all the Democratic gubernatorial candidates promised they would end the evaluation system. The winner, Michelle Lujan Grisham, wasted no time. On January 3, 2019, a perky one-page announcement declared that VAM was ended, absences wouldn’t count on evaluations, and just for good measure she ended PARCC.
So in the one state where principals couldn’t juke the stats to keep teachers they didn’t want to fire, the courts stepped in, the Republican governor backed down, and the new Democratic governor rendered the whole fuss moot.
California had always been a VAM outlier, as Governor Jerry Brown steadfastly refused the waiver bribes. Students Matter, an organization founded by tech entrepreneur David Welch, engaged in a two-pronged attempt to force California into evaluation compliance–first by suing to end teacher tenure (Vergara) and then by forcing evaluation by student test scores (Doe vs. Antioch). Triumphalists hailed the original 2014 Vergara decision that overturned the protections of teacher tenure, and even the more cautiously optimistic believed that the California appeals court might overturn the decision, but the friendlier California Supreme Court would side with the plaintiffs and end tenure. The appeals court did overturn, and the CA Supreme Court….declined to review, letting the appellate ruling stand.
Welch and Students Matter likewise tried to force California schools to read the state’s 1971 Stull Act as requiring teachers to be evaluated by test scores. That failed, too. No appeal.
“Experts” often talk about forcing education in America to follow market-based principles. But in the VAM failure, the principals are following those principles! (hyuk.) As I’ve also written many times, there is, in fact, a teacher shortage. But at the same time, even the confidential evaluations demonstrate that the vast majority of teachers are doing good work by their managers’ estimation.
As a teacher, I would be interested in learning whether I had an impact on my students’ scores. I’d be more interested, really, in whether my teaching methods were helping all students equally, or if there were useful skews. Were my weakest students, the ones who really weren’t qualified for the math I was teaching, being harmed, unlearning some of the earlier skills that could have been reinforced? Was my practice of challenging the strongest students with integrated problem solving and cumulative applications of material keeping them in the game compared to other students whose teachers taught more, faster, tested only on new material, and gave out practice tests?
But the idea that any teachers other than, perhaps, reading teachers in elementary school could be accurately assessed on their performance by student learning is just absurd.
Any teacher could have told you that. Many teachers did tell the politicians and lobbyists and billionaires that. But teachers are the peasants and plebes of the cognitive elite, so the country had to waste billions only to get right back to where we started. Worse: they still haven’t learned.
(I swear I began this article as the final one in the series until I realized VAM was pulling focus. I really do have that one almost done. Happy New Year.)