Tag Archives: Value-added evaluations

Bush/Obama Ed Reform: Victory over Value Add

(I was writing my final article on this era when I realized I hadn’t really focused completely on the history of Value Added Metrics (VAM) in my original coverage of the Obama years. I am saying this because VAM sprites both pro and con are holding me at gunpoint demanding I write an article all about them.)

In 2009, The New Teacher Project’s The Widget Effect declared that schools treated all teachers as interchangeable units, didn’t bother to train new teachers, refused to fire tenured teachers, and worse, gave all teachers high ratings.  99% of teachers got ratings of Proficient or higher! The shame!

Mind you, none of these are new declarations, but this paper initiated the argument that allowed Obama and Duncan (as I wrote here) to demand that states evaluate teachers with student achievement, and that achievement must be test scores. Thus, one of the requirements for a Duncan “waiver” from No Child Left Behind school “program improvement” penalties, which by now were affecting over half of all schools, was that the state must begin evaluating teacher effectiveness using data–just another word for VAM.

Put another way, Obama and Duncan allowed states to escape schoolwide accountability for student test scores by forcing them to agree to teacher accountability for student test scores.

In 2009, 10 states required evaluation to include student achievement metrics. By 2015, 43 states required value-added metrics for evaluation. Most courts agreed that the usually hasty and poorly thought through implementation plans were absurd and unfair, but declined to step in. There were some notable exceptions, as you’ll see. (Note: I wrote a longer opinion of VAM that includes more info.)

From 1% Ineffective to…?

By now, no one should be surprised to learn that these efforts were a spectacular failure, although it was rarely reported in just those terms. By 2019, only 34 states still required VAM, and most of those requiring it on paper had watered down the impact by dramatically reducing the VAM component, making VAM optional, removing the yearly requirement for teacher evaluations, or allowing schools to design their own metrics.

In the definitive evaluation, Harvard researchers studied 24 states that implemented value-added metrics and learned that principals refused to give teachers bad ratings. In fact, principals would rate teachers lower in confidential ratings than in formal ones, although by either method the average score was a positive evaluation. When asked, principals said that they felt mean giving the bad results (which suggests they didn’t agree with them). Moreover, many principals worried that if they gave a bad review, the teachers might leave–or worse, force the principal to begin firing procedures. Either way, the principal might end up forced to hire a replacement no better and possibly worse.

Brief aside: Hey, that should sound familiar to long-time readers. As I wrote seven years ago: “…most principals don’t fire teachers often because it’s incredibly hard to find new ones.” Or as I put it on Twitter back when it allowed only 140 characters, “Hiring, not firing, is the pain point.”

So the Obama administration required an evaluation method that would identify bad teachers for firing or training, and principals are worried that those teachers might leave or get fired. That’s… kind of a problem.

Overall, the Harvard study found that only two of the 24 states gave more than 1% of teachers unsatisfactory ratings.

If you do the math, 100% – 1% = 99%, which is exactly what The Widget Effect found. That was a whole bunch of money and energy spent for no results.

New Mexico

The study’s outlier was New Mexico, which forced principals to weight VAM as 50% of the overall evaluation score, courtesy of Hanna Skandera, a committed education reformer appointed secretary of education by a popular Republican governor. As a result, over 1 in 4 teachers were rated unsatisfactory.

But! A 2015 court decision prevented any terminations based on the evaluation system, and the case got delayed until it was irrelevant. In 2017, Governor Martinez agreed to a compromise on the evaluation methodology, increasing permitted absences to six and dropping VAM from 50% to 35%. New Mexico also completed its shift from a purple to a blue state, and in 2018 all the Democratic gubernatorial candidates promised they would end the evaluation system. The winner, Michelle Lujan Grisham, wasted no time. On January 3, 2019, a perky one-page announcement declared that VAM was ended, absences wouldn’t count on evaluations, and, just for good measure, she ended PARCC.

So in the one state where principals couldn’t juke the stats to keep teachers they didn’t want to fire, the courts stepped in, the Republican governor backed down, and the new Democratic governor rendered the whole fuss moot.

California

California had always been a VAM outlier, as Governor Jerry Brown steadfastly refused the waiver bribes. Students Matter, an organization founded by tech entrepreneur David Welch, engaged in a two-pronged attempt to force California into evaluation compliance–first by suing to end teacher tenure (Vergara) and then by forcing evaluation by student test scores (Doe vs. Antioch). Triumphalists hailed the original 2014 Vergara decision that overturned the protections of teacher tenure, and even the more cautiously optimistic believed that while the California appeals court might overturn the decision, the friendlier California Supreme Court would side with the plaintiffs and end tenure. The appeals court did overturn, and the CA Supreme Court…declined to review, letting the appellate ruling stand.

Welch and Students Matter likewise tried to force California schools to read the state’s 1971 Stull Act as requiring teachers to be evaluated by test scores. That failed, too. No appeal.

Upshot

“Experts” often talk about forcing education in America to follow market-based principles. But in the VAM failure, the principals are following those principles! (hyuk.) As I’ve also written many times, there is, in fact, a teacher shortage. But at the same time, even the confidential evaluations demonstrate that the vast majority of teachers are doing good work in their managers’ estimation.

As a teacher, I would be interested in learning whether I had an impact on my students’ scores. I’d be more interested, really, in whether my teaching methods were helping all students equally, or if there were useful skews. Were my weakest students, the ones who really weren’t qualified for the math I was teaching, being harmed, unlearning some of the earlier skills that could have been reinforced? Was my practice of challenging the strongest students with integrated problem solving and cumulative applications of material keeping them in the game compared to other students whose teachers moved faster, tested only on new material, and gave out practice tests?

But the idea that any teachers other than, perhaps, reading teachers in elementary school could be accurately assessed on their performance by student learning is just absurd.

Any teacher could have told you that. Many teachers did tell the politicians and lobbyists and billionaires that. But teachers are the peasants and plebes of the cognitive elite, so the country had to waste billions only to get right back to where we started. Worse: they still haven’t learned.

(I swear I began this article as the final one in the series until I realized VAM was pulling focus. I really do have that one almost done. Happy New Year.)


Algebra 1 Growth in Geometry and Algebra II

Last September, I wrote about my classes and the pre-algebra/Algebra 1 assessment results.

My school covers a year of instruction in a semester, so we just finished the first “year” of courses. I start with new students and four preps on Monday. Last week, I gave them the same assessment to see if they’d improved.

Unfortunately, the hard drive on my school computer got wiped in a re-imaging. This shouldn’t have been a problem, because I shouldn’t have had any data on the hard drive, except I never got put on the network. Happily, I use Dropbox for all my curriculum development, so an entire year’s worth of intellectual property wasn’t obliterated. I only lost the original assessment results, which I had accidentally stored on the school hard drive. I should have entered the scores in the school grading system (with a 0 weight, since they don’t count towards the grade) but only did that for geometry, the only class I can directly compare results with.

My Algebra II class, though, was incredibly stable. I only lost three students, one of whom got a perfect score—which the only new addition to the class also got, so balance maintained. The other two students who left got around 10-15 wrong, so were squarely in the average at the time. I feel pretty comfortable that the original scores didn’t change substantially. My geometry class did have some major additions and removals, but since I had their scores I could recalculate.

|              | Mean                | Median | Mode | Range |
|--------------|---------------------|--------|------|-------|
| Original     | just above 10       | 9.5    | 7    | 22    |
| Recalculated | just below 10 (9.8) | 8      | 7    | 22    |

I didn’t have the Math Support scores, and enough students didn’t take the second test that comparisons would be pointless.

One confession: Two Algebra II students, the weakest two in the class, who did no work, scored 23 and 24 wrong, 11 more than the next lowest score. Their scores added an entire point to the average wrong and increased the range by 14 points, and you know, I just said bye and dropped them rather than let them distort the results for the other 32 kids. (I don’t remember exactly, but the original A2 tests had five or six 20+ wrong scores.)
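If you want to see how hard a couple of extreme scores yank on summary statistics, a few lines of Python make the point. The scores below are hypothetical, invented for illustration; they are not the actual class data, just 32 typical wrong-answer counts plus two outliers in the low twenties:

```python
from statistics import mean

# Hypothetical wrong-answer counts for illustration only (not the real class data):
# 32 "typical" scores plus two outliers (23 and 24 wrong).
typical = [12, 11, 10, 9, 8] * 6 + [7, 6]   # 32 scores
outliers = [23, 24]
all_scores = typical + outliers

print(f"With outliers:    mean={mean(all_scores):.1f}, "
      f"range={max(all_scores) - min(all_scores)}")
print(f"Without outliers: mean={mean(typical):.1f}, "
      f"range={max(typical) - min(typical)}")
```

Even in this made-up set, two scores out of 34 move the mean by nearly a point and triple the range, which is why dropping them gives a truer picture of the other 32 kids.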

So here’s the original September graph and the new graph of January:

[Images: September and January algebra assessment score distributions]

The geometry class was bimodal: 0 and 10. Excel refused to acknowledge this and I wasn’t sure how to force it. The 10s, as a group, were pretty consistent—only one of them improved by more than a point. The perfect scores ranged from 8 wrong to 2 wrong on the first test.
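For anyone else fighting Excel on this: its MODE function returns only one mode, but Python’s `statistics.multimode` (3.8+) reports all of them. A quick sketch with made-up scores (again, not the actual class data), bimodal at 0 and 10:

```python
from statistics import multimode

# Hypothetical wrong-answer counts for illustration (not the real class data):
# four students at 0 wrong, four at 10 wrong, a scattering in between.
scores = [0, 0, 0, 0, 1, 2, 3, 5, 7, 10, 10, 10, 10, 12, 15]

print(multimode(scores))  # prints [0, 10] -- both modes reported
```

(In Excel, the array function MODE.MULT is the equivalent workaround.)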

[Image: geometry and algebra class growth chart]

In short, they learned a lot of first year algebra, and that’s because I spent quite a bit of time teaching them first year algebra. In Algebra II, I did it with data modeling, which was a much more sophisticated approach than what they’d had before, but it was still first year algebra. In geometry, I minimize certain standards (proofs, circles, solid shapes) in favor of applied geometry problems with lots of algebra.

And for all that improvement, a still distressing number of students answered x² + 12 when asked what the product of (x+3) and (x+4) was, including two students who got an A in the class. I beat this into their heads, and STILL some of them forget that.
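For the record, the middle term they keep dropping:

```latex
(x+3)(x+4) = x^2 + 4x + 3x + 12 = x^2 + 7x + 12
```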

Some folks are going to draw exactly the wrong impression. “See?” these misguided souls will say, nodding wisely. “Our kids just aren’t being taught properly in early grades. Better standards, better teachers, this problem’s fixed! Until then, this poor teacher has to make up the slack.” In short, these poor fools still believe in the myth that they’ve never been taught.

When in fact, they were taught. Including by me—and I don’t mean the “hey, by the way, don’t forget the middle term in binomial multiplication”, but “you are clubbing orphan seals and making baby Jesus cry when you forget the middle term” while banging myself on the head with a whiteboard. And some of them just forgot anyway.

I don’t know how my kids will do on their state tests, but it’s safe to say that the geometry and second year algebra I exposed them to was considerably less than it would have been had their assessment scores at the beginning of class been the ones they got at the end of class. And because no one wants to acknowledge the huge deficit half or more of each class has in advanced high school math, high schools won’t be able to teach the kids the skills they need in the classes they need—namely, prealgebra for a year, “first year” algebra for two years, and then maybe some geometry and second year algebra. If they do okay on the earlier stuff.

Instead, high schools are forced to pretend that transcripts reflect reality, that all kids in geometry classes are capable of passing a pre-algebra test, much less an Algebra 1 test. Meanwhile, reformers won’t know that I improved my kids’ basic algebra skills while still teaching them a lot of geometry/algebra II, because the tests they’ll insist on judging me with will assume a) that the kids had mastered the earlier material or b) that I could just catch them up quickly because, after all, the only problem was that the kids’ earlier teachers had never taught them.