Tag Archives: Value-added evaluations

The Same Thing Over and Over: Yglesias Edition

(with apologies to Rick Hess, who means exactly the opposite of me when he says it.)

Matt Yglesias is a liberal I’ve followed for years. He’s become more temperate since signing the Harper’s letter, now that he’s realized how insane the progressive left has become. But if you want a representative sample of why Democrats turned away from neoliberalism, Yglesias is your guy. In his recent two-part article that’s ostensibly about critical race theory, he rehashes the nostrums he’s been pushing his entire pundit life. Naturally, Twitter moderates were ecstatic.

If I wrote an angry takedown every time an ed reformer preached nonsense–well, I’d write more, so maybe I should. But Yglesias, despite making a few concessions I was happy to see, shocked me with his implicit….lies? misrepresentations?…ignorance? not sure which.

mattysin1

I really, really wish that people with megaphones could be reasonable about unions. It’s fine to hate them. It’s stupid to think they have much in the way of influence. It’s worse to pretend that education reform proposals failed because teacher unions prevented them. Most egregious of all is to pretend that charter school expansion and merit pay for teachers haven’t been tried and rejected.

But this is just normal, ordinary middlebrow pabulum. This passage is shocking in its naivete, ignorance, or dishonesty–take your pick.

mattysin9

I mean, my god. We have not yet been able to persuasively demonstrate through test scores that particulates, healthy meals, or even air conditioning in the summer have any impact on student achievement, particularly not at a granular level. We’ve spent billions on free lunch programs and air conditioning and a host of other environmental adjustments. The achievement gap endures. The students tossed the healthy food in the trash.

As for standardized tests, surely the past thirty years of debate should have informed him that no, everyone does NOT agree that we can assess competence with a test.  Why else are colleges so insistent on committing affirmative action? Why did so many of them seize the opportunity of the George Floyd moment of righteousness to use  GPAs instead of SAT/ACT scores? Why are so many black activists angry about the “achievement gap”? A significant chunk of the institutional left believes those scores are lies–or at least unpleasant ephemera that can safely be ignored.

Most egregious: Measuring teacher impact via student achievement “would be uncontroversial”?  Doesn’t this sound like he thinks VAM would be this terrific, obvious improvement if only policy makers could stand up to unions and put this sucker in place?

The Obama administration wasn’t just “open” to value add–it mandated some form of student performance metrics for any state trying to qualify for Race to the Top funding. Forty-three states complied with a strict form of value add by 2014. Twenty-three states mandated student performance metrics for teacher tenure decisions. Teachers unions sued endlessly to stop these mandates, and lost, time and again. (Once more, with feeling: unions have no influence on their own. They win when a major player agrees with them: districts, parents, or politicians.)

The entire rationale for VAM was first popularized in “The Widget Effect”, an article that argued for more differentiation in teacher evaluations, since 99% of teachers got a good review. But data revealed that three years of VAM resulted in….99% of teachers getting a good review. When states didn’t water down the test component, principals simply juked the stats.

Research isn’t the conclusive slam dunk that  Yglesias’s “uncontroversial” implies, either: 

  • RAND: VAM are not absolute indicators of teacher effectiveness and are imprecise.
  • American Statistical Association: VAM measures correlation not causation, can change substantially based on model used, and show that teachers affect from 1-14% of variability in student test scores.
  • For a complete review, pro and con, of the research, Scott Alexander does his typical deep dive into VAM and finds it wanting, as does the great (and MIA) Spotted Toad.

By 2016, ESSA, the most recent version of the Elementary and Secondary Education Act, had removed all the evaluation mandates. Twenty-three states no longer required VAM in evaluations; another fifteen did but left it up to districts to determine implementation. Public opinion, always split, declined:

surveyvamtrend

New Mexico, the one state that genuinely gave bad reviews and fired teachers for insufficient value-add, unwound the entire program with a new administration that explicitly promised to undo the policies wrought by the previous governor Susana Martinez and her reform darling ed chief, Hanna Skandera.

Yglesias’s representation of VAM as an uncontroversial implementation blocked only by those unreasonable unions is absurd. States, desperate for federal funding, implemented a wide range of value-added metrics that infuriated teachers. Public approval dropped, principals showed by their behavior that they didn’t agree with the results, and research is at best equivocal.

 Yglesias’s casual offhand shilling for charters is at least anodyne, if not original. But a close read reveals an interesting bias. While conservative education reformers emphasize parent choice, read closely and it’s clear Matt would cheerfully override parents and voters if they don’t agree with him. 

Shot #1:

mattysin7

Chaser #1:

mattysin3

So school closures were horrible, “real anger” was unleashed–but only by white parents. Meanwhile, black (and Hispanic and Asian) parents were, er, “less annoyed” (translated: while as many as 3 in 4 white parents wanted schools open, 1 in 2 or fewer non-whites preferred remote education, findings that have been consistent throughout the pandemic). Yglesias is saying, explicitly, that we should not give parents a choice. (That said, he at least acknowledged the racial difference in preferences, which almost no one else mentions, so props for that.)

Shot and Chaser #2:

mattysin10

Democrats have responded to their voters’ preferences by moving away from charters. Voters have rejected charters (he doesn’t mention that California, another previously strong charter state, has flipped a law that banned districts from considering the financial impact of charters–and this is after charter growth in California and the nation had stalled). These are all deep blue states, previously supportive of charters. Yglesias doesn’t have much interest in voter opinion–unless, of course, it agrees with his.

This is all chaser

mattysin8

Everyone who pushes for mayoral control of schools is arguing against voter control. All those school board recalls? Yglesias thinks they’re a bad idea–or at least, they’re a bad idea where non-white and poor parents might make decisions he doesn’t like. Fine for the angry white parents in Loudoun to recall their school board, but where it really matters, where achievement is low, let’s put school control in the hands of an executive. Once again, all this choice is fine unless Yglesias thinks he knows better.

The money quote that everyone’s been retweeting nearly made my head explode.

mattysin6

Oh, hey, integration, choice, curriculum, merit pay, healthy food and higher teacher standards!! Damn. Here we could have knocked out decades of achievement gaps if we’d just known about these obvious policy changes and put them into action.

Oh, wait, we did. We tried. They failed. The achievement gap has been stable for years. NAEP scores stalled and then dropped after 20 years of the reformers running the table. And the kids dumped the healthy food in the trash. What now?

More money won’t help. School choice won’t help. Firing teachers won’t help.

Maybe education policy should start by realizing schools are doing a pretty good job, given the idiotic constraints imposed on them by people who don’t understand the limits of education. Maybe we should change some laws, drop others, and ensure we spend money on our neediest citizens (ask yourself how much Title I funding is going to Afghan refugees and border asylum claimants?). None of these failures mean that teachers don’t matter or that we can’t improve schools. But we have to understand what “improvement” means. Most of the people screaming for better schools won’t approve.

I try not to be depressed by the regular evidence that the vast majority of people with megaphones don’t understand education. But it really was horrifying how many people approved of Yglesias’s recipe for improving schools, how few of them seem to understand what has already been tried and failed or tried and rejected dozens of times in the past fifty years. And hell, I needed something to write about. I’m stalled on three other pieces.

But in the interest of comity:

mattysin5

This, finally, is correct. 

Note: I’ve written two articles on Value Add, one of which goes through the obvious logical failings,  the other outlining the voter political rejection mentioned here.

Also, I don’t spend enough time praising Freddie deBoer, who is writing fantastic reality about education from the left. He might be a socialist or a Marxist or whatever, but he’s much more of a realist than anyone with a similar audience and mainstream politics. I particularly liked his article on college admissions (which led to one of the pieces I’m stuck on) and on resisting blank slate thinking.

Just a reminder that when I’m trying to write something, I do the second draft instead of the tenth to get anything out at all.

 


Bush/Obama Ed Reform: Victory over Value Add

(I was writing my final article on this era when I realized I hadn’t really focused completely on the history of Value Added Metrics (VAM) in my original coverage of the Obama years. I am saying this because VAM sprites both pro and con are holding me at gunpoint demanding I write an article all about them.)

In 2009, The New Teacher Project’s The Widget Effect declared that schools treated all teachers as interchangeable units, didn’t bother to train new teachers, refused to fire tenured teachers, and worse, gave all teachers high ratings.  99% of teachers got ratings of Proficient or higher! The shame!

Mind you, none of these are new declarations, but this paper initiated the argument that allowed Obama and Duncan (as I wrote here) to demand that states evaluate teachers with student achievement, and that achievement must be test scores. Thus, one of the requirements for a Duncan “waiver” from No Child Left Behind school “program improvement penalties”, which by now were affecting over half of all schools, was that the state must begin evaluating teacher effectiveness using data–just another word for VAM.

Put another way, Obama and Duncan allowed states to escape schoolwide accountability for student test scores by forcing them to agree to teacher accountability for student test scores.

In 2009, 10 states required evaluation to include student achievement metrics. By 2015, 43 states required value-added metrics for evaluation. Most courts agreed that the usually hasty and poorly thought through implementation plans were absurd and unfair, but declined to step in. There were some notable exceptions, as you’ll see. (Note: I wrote a longer opinion of VAM that includes more info.)

From 1% Ineffective to…..?

By now, no one should be surprised to learn that these efforts were a spectacular failure, although rarely reported in just those terms. But by 2019, only 34 states still required value-added metrics, and most of those had watered down the impact by dramatically reducing the VAM component, making VAM optional, removing the yearly requirement for teacher evaluations, or allowing schools to design their own metrics.

In the definitive evaluation, Harvard researchers studied 24 states that implemented value-added metrics and learned that principals refused to give teachers bad ratings. In fact, principals would rate teachers lower in confidential ratings than in formal ones, although in either method the average score was a positive evaluation.  When asked, principals said that they felt mean giving the bad results (which suggests they didn’t agree with them). Moreover, many principals worried that if they gave a bad review, the teachers might leave–or worse, force the principal to begin firing procedures. Either way, the principal might end up forced to hire a teacher no better or possibly worse.

Brief aside: Hey, that should sound familiar to long-time readers. As I wrote seven years ago: “…most principals don’t fire teachers often because it’s incredibly hard to find new ones.” Or as I put it on Twitter back when it allowed only 140 characters, “Hiring, not firing, is the pain point.”

So the Obama administration required an evaluation method that would identify bad teachers for firing or training, and principals are worried that the teachers might leave or get fired. That’s….kind of a problem. 

Overall, the Harvard study found that only two of the 24 states gave more than 1% of teachers unsatisfactory ratings.

If you do the math, 100% – 1% = 99%, which is exactly what The Widget Effect found, so that was a whole bunch of money and energy spent for no results.

New Mexico

The study’s outlier was New Mexico, which forced principals to weight VAM as 50% of the overall evaluation score, courtesy of Hanna Skandera, a committed reform education secretary appointed by a popular Republican governor. As a result, over 1 in 4 teachers were rated unsatisfactory.

But! A 2015 court decision prevented any terminations based on the evaluation system, and the case got delayed until it was irrelevant. In 2017, Governor Martinez agreed to a compromise on the evaluation methodology, increasing permitted absences to six and dropping VAM from 50% to 35%. New Mexico also completed its shift from a purple to blue state, and in 2018 all the Democratic gubernatorial candidates promised they would end the evaluation system. The winner, Michelle Lujan Grisham, wasted no time. On January 3, 2019, a perky one-page announcement declared that VAM was ended, absences wouldn’t count on evaluations, and just for good measure she ended PARCC.

So in the one state in which principals couldn’t juke the stats to keep teachers they didn’t want to fire, the courts stepped in, the Republican governor backed down, and the new Democratic governor rendered the whole fuss moot.

California

California had always been a VAM outlier, as governor Jerry Brown steadfastly refused the waiver bribes. Students Matter, an organization founded by tech entrepreneur David Welch, engaged in a two-pronged attempt to force California into evaluation compliance–first by suing to end teacher tenure (Vergara) and then by forcing evaluation by student test scores (Doe vs. Antioch). Triumphalists hailed the original 2014 Vergara decision that overturned the protections of teacher tenure, and even the more cautiously optimistic believed that the California appeals court might overturn the decision, but the friendlier California Supreme Court would side with the plaintiffs and end tenure. The appeals court did overturn, and the CA Supreme Court….declined to review, letting the appellate ruling stand.

Welch and Students Matter likewise tried to force California schools to read its 1971 Stull Act as requiring teachers to be evaluated by test scores. That failed, too.  No appeal.

Upshot

“Experts” often talk about forcing education in America to follow market-based principles. But in the VAM failure, the principals are following those principles! (hyuk.) As I’ve also written many times, there is, in fact, a teacher shortage. But at the same time, even the confidential evaluations demonstrate that the vast majority of teachers are doing good work by their managers’ estimation.

As a teacher, I would be interested in learning whether I had an impact on my students’ scores. I’d be more interested, really, in whether my teaching methods were helping all students equally, or if there were useful skews. Were my weakest students, the ones who really weren’t qualified for the math I was teaching, being harmed, unlearning some of the earlier skills that could have been reinforced? Was my practice of challenging the strongest students with integrated problem solving and cumulative applications of material keeping them in the game compared to other students whose teachers taught more, faster, tested only on new material, and gave out practice tests?

But the idea that any teachers other than, perhaps, reading teachers in elementary school could be accurately assessed on their performance by student learning is just absurd.

Any teacher could have told you that. Many teachers did tell the politicians and lobbyists and billionaires that. But teachers are the peasants and plebes of the cognitive elite, so the country had to waste billions only to get right back to where we started. Worse: they still haven’t learned.

(I swear I began this article as the final one in the series until I realized VAM was pulling focus. I really do have that one almost done. Happy New Year.)

Next up–and Finally! Bush/Obama Ed Reform: It All Came Tumbling Down


Algebra 1 Growth in Geometry and Algebra II

Last September, I wrote about my classes and the pre-algebra/Algebra 1 assessment results.

My school covers a year of instruction in a semester, so we just finished the first “year” of courses. I start with new students and four preps on Monday. Last week, I gave them the same assessment to see if they’d improved.

Unfortunately, the hard drive on my school computer got wiped in a re-imaging. This shouldn’t have been a problem, because I shouldn’t have had any data on the hard drive, except I never got put on the network. Happily, I use Dropbox for all my curriculum development, so an entire year’s worth of intellectual property wasn’t obliterated. I only lost the original assessment results, which I had accidentally stored on the school hard drive. I should have entered the scores in the school grading system (with a 0 weight, since they don’t count towards the grade) but only did that for geometry, the only class I can directly compare results with.

My algebra II class, though, was incredibly stable. I only lost three students, one of whom got a perfect score—which the only new addition to the class also got, so balance maintained. The other two students who left got around 10-15 wrong, so were squarely in the average at the time. I feel pretty comfortable that the original scores didn’t change substantially. My geometry class did have some major additions and removals, but since I had their scores I could recalculate.

               Mean                  Median   Mode   Range
Original       just above 10         9.5      7      22
Recalculated   just below 10 (9.8)   8        7      22

I didn’t have the Math Support scores, and enough students didn’t take the second test that comparisons would be pointless.

One confession: Two Algebra II students, the weakest two in the class, who did no work, scored 23 and 24 wrong, which was 11 more than the next lowest score. Their scores added an entire point to the average wrong, increased the range by 14 points, and you know, I just said bye and stopped them from distorting the results of the other 32 kids. (I don’t remember exactly, but the original A2 tests had five or six 20+ wrong scores.)
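A minimal sketch of how two very low scores can move the summary numbers. The scores below are invented for illustration and are not the actual class results; `summarize` is a hypothetical helper, and the only thing borrowed from the post is the pair of outliers (23 and 24 wrong) and the class size of 32.

```python
from statistics import mean, median


def summarize(wrong_counts):
    """Return mean, median, and range for a list of wrong-answer counts."""
    return {
        "mean": mean(wrong_counts),
        "median": median(wrong_counts),
        "range": max(wrong_counts) - min(wrong_counts),
    }


# Hypothetical data: 32 students, none with more than 12 wrong.
typical = [0, 2, 4, 6, 8, 10, 12, 3] * 4

# The two outlier scores mentioned in the post.
outliers = [23, 24]

without_outliers = summarize(typical)
with_outliers = summarize(typical + outliers)

# How far the two extra scores pull the average and stretch the range.
mean_shift = with_outliers["mean"] - without_outliers["mean"]
range_shift = with_outliers["range"] - without_outliers["range"]

print(f"mean shift: {mean_shift:.2f}, range shift: {range_shift}")
```

With made-up numbers like these, two students out of 34 are enough to push the average wrong up by about a point and roughly double the range, which is the kind of distortion described above.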

So here’s the original September graph and the new graph of January:

AlgtestAlgAssessyrend

The geometry class was bimodal: 0 and 10. Excel refused to acknowledge this and I wasn’t sure how to force it. The 10s, as a group, were pretty consistent—only one of them improved by more than a point. The perfect scores ranged from 8 wrong to 2 wrong on the first test.
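For what it's worth, a bimodal result is easy to surface outside of Excel. This is a small sketch using Python's standard `statistics.multimode`, which returns every value tied for most frequent. The score list is invented for illustration and is not the actual geometry class data.

```python
from statistics import multimode

# Hypothetical wrong-answer counts: 0 and 10 each appear three times,
# every other value appears once.
scores = [0, 0, 0, 10, 10, 10, 1, 2, 5, 9, 11]

# multimode returns all most-common values, in order of first appearance.
modes = multimode(scores)

print(modes)
```

A single-valued mode function has to pick one answer; `multimode` reports both peaks, so a 0-and-10 split shows up as two modes rather than one.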

geoalgclassgrowth

In short, they learned a lot of first year algebra, and that’s because I spent quite a bit of time teaching them first year algebra. In Algebra II, I did it with data modeling, which was a much more sophisticated approach than what they’d had before, but it was still first year algebra. In geometry, I minimize certain standards (proofs, circles, solid shapes) in favor of applied geometry problems with lots of algebra.

And for all that improvement, a still distressing number of students answered x^2 + 12 when asked what the product of (x+3) and (x+4) was, including two students who got an A in the class. I beat this into their heads, and STILL some of them forget that.

Some folks are going to draw exactly the wrong impression. “See?” these misguided souls will say, nodding wisely. “Our kids just aren’t being taught properly in early grades. Better standards, better teachers, this problem’s fixed! Until then, this poor teacher has to make up the slack.” In short, these poor fools still believe in the myth that they’ve never been taught.
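For the record, here is the expansion written out term by term, so the missing middle term is visible:

```latex
\begin{aligned}
(x+3)(x+4) &= x \cdot x + x \cdot 4 + 3 \cdot x + 3 \cdot 4 \\
           &= x^2 + 4x + 3x + 12 \\
           &= x^2 + 7x + 12
\end{aligned}
```

The wrong answer keeps only the first and last products and drops the two cross terms, 4x and 3x, that combine into 7x.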

When in fact, they were taught. Including by me—and I don’t mean the “hey, by the way, don’t forget the middle term in binomial multiplication”, but “you are clubbing orphan seals and making baby Jesus cry when you forget the middle term” while banging myself on the head with a whiteboard. And some of them just forgot anyway.

I don’t know how my kids will do on their state tests, but it’s safe to say that the geometry and second year algebra I exposed them to was considerably less than it would have been had their assessment scores at the beginning of class been the ones they got at the end of class. And because no one wants to acknowledge the huge deficit half or more of each class has in advanced high school math, high schools won’t be able to teach the kids the skills they need in the classes they need—namely, prealgebra for a year, “first year” algebra for two years, and then maybe some geometry and second year algebra. If they do okay on the earlier stuff.

Instead, high schools are forced to pretend that transcripts reflect reality, that all kids in geometry classes are capable of passing a pre-algebra test, much less an algebra one test. Meanwhile, reformers won’t know that I improved my kids’ basic algebra skills whilst still teaching them a lot of geometry/algebra II, because the tests they’ll insist on judging me with will assume a) that the kids had that earlier material mastered or b) that I could just catch them up quickly because after all, the only problem was the kids’ earlier teachers had never taught them.