Category Archives: testing

NAEP TUDA Scores—Detroit isn’t Boston

So everyone is a-twitter over NAEP TUDA (Trial Urban District Assessment) scores. For those who aren’t familiar with The Nation’s Report Card, the “gold standard” of academic achievement metrics, it samples performance rather than testing every student. For most of its history, NAEP only provided data at the state level. But some years ago, NAEP began sampling at the district level, first by invitation and then by accepting some volunteers.

I don’t know that anyone has ever stated this directly, but the cities selected suggest that NAEP and its owners are awfully interested in better tracking “urban” achievement, and by “urban” I mean black or Hispanic.

I’m not a big fan of NAEP but everyone else is, so I try to read up, which is how I came across Andy Smarick‘s condemnation of Detroit, Milwaukee, and Cleveland: “we should all hang our heads in shame if we don’t dramatically intervene in these districts.”

Yeah, yeah. But I was pleased that Smarick presented total black proficiency, rather than overall proficiency levels. Alas, my takeaway was all wrong: where Smarick saw grounds for a federal takeover, I was largely encouraged. Once you control for race, Detroit looks a lot better. Bad, sure, but only a seventh as bad as Boston.

So I tweeted this to Andy Smarick, but told him that he couldn’t really wring his hands until he sorted for race AND poverty.

He responded “you’re wrong. I sorted by race and Detroit still looks appalling.”

He just scooted right by the second attribute, didn’t he?

Once I’d pointed this out, I got curious about the impact that poverty had on black test scores. Ironic, really, given my never-ending emphasis on low ability, as opposed to low income. But hey, I never said low income doesn’t matter, particularly when evaluating an economically diverse group.

But I began to wonder: how much does poverty matter, once you control for race? For that matter, how do you find the poverty levels for a school district?

Well, it’s been a while since I did data. I like other people to do it and then pick holes. But I was curious, and so went off and did data.

Seventeen days later, I emerged, blinking, with an answer to the second question, at least.

It’s hard to know how to describe what I did during those days, much less put it into an essay. I don’t want to attempt any sophisticated analysis—I’m not a social scientist, and I’m not trying to establish anything certain about the impact of poverty on test scores, an area that’s been studied by people with far better grades than I ever managed. But at the same time, I don’t think most of the educational policy folk dig down into poverty or race statistics at the district level. So it seemed like it might be worthwhile to describe what I did, and what the data looks like. If nothing else, the layperson might not know what’s involved.

If my experience is any guide, it’s hard finding poverty rates for children by race. You can get children in poverty, race in poverty, but not children by race in poverty. And then it appears to be impossible to find enrolled children in a school district—not just who live in it, which is tough enough—by poverty. And then, of course, poverty by enrollment by race.

First, I looked up the poverty data here (can’t provide direct links to each city).

But this is overall poverty by race, not child poverty by race, and it’s not at the district level, which is particularly important for some of the county data. However, I’m grateful to that site because it led me to American Community Survey Factfinder, which organizes data by all kinds of geographic entities—including school districts—and all kinds of topics–including poverty—on all sorts of groups and individuals—including race. Not that this is news to data geeks, which I am not, so I had to wander around for a while before I stumbled on it.

Anyway. I ran report 1701 for the districts in question. If I understand googledocs, you can save yourself the trouble of running it yourself. But since the report is hard to read, I’ll translate. Here are the overall district black poverty rates for the NAEP testing regions:

[Table: overall district black poverty rates for the NAEP TUDA districts (ACS)]

Again, these are for the districts, not the cities.

(Am I the only one who’s surprised at how relatively low the poverty rates are for New York and DC? Call me naïve for not realizing that the Post and the Times are provincial papers. Here I thought they focused on their local schools because of their inordinately high poverty rates, not their convenient locations. Kidding. Kind of.)

But these rates are for all blacks in the district, not black children. Happily, the ACS also provides data on poverty by age and race, although you have to add and divide in order to get a rate. But I did that so you don’t have to–although lord knows, my attention to detail isn’t great so it should probably be double or triple checked. So here, for each district, are the poverty rates for black children from 5-17:

[Table: ACS poverty rates for black children 5-17, by NAEP TUDA district]
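For anyone curious what the “add and divide” step actually looks like, here’s a minimal sketch. The age bands and counts below are placeholders standing in for the ACS table values, not the real figures behind the table above.

```python
# Minimal sketch of the "add and divide" step: ACS tables report counts of
# black children below poverty and total black children by age band, so the
# 5-17 poverty rate is the summed poor counts divided by the summed totals.
# These age bands and counts are placeholders, not real data.
poor_black_children = {"5": 900, "6-11": 5200, "12-17": 4700}
all_black_children = {"5": 2100, "6-11": 12400, "12-17": 11800}

poverty_rate = sum(poor_black_children.values()) / sum(all_black_children.values())
print(f"Black child (5-17) poverty rate: {poverty_rate:.1%}")
```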

In both cases, Boston and New York have poverty rates a little over half those of the cities with the highest poverty rates—and isn’t it coincidental that the four cities with the lowest black NAEP scores have the highest black poverty rates? Weird how that works.

But the NAEP scores and the district data don’t include charter or private schools in the zone, and this affects enrollment rates differently from district to district. So back to ACS to find data on age and gender, and more combining and calculating, with the same caveats about my lamentable attention to detail. This gave me the total number of school-age kids in the district. Then I had to find the actual district enrollment data, most of which is in another census report (relevant page here) for the largest school districts. For the smaller districts, I just went to the district website.
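The arithmetic itself is trivial once the counts are in hand. Here’s a quick sketch using the Cleveland figures quoted a bit further down; the two inputs are the only thing that matters.

```python
# Enrollment rate = kids actually enrolled in district schools divided by
# 5-17 year olds living in the district (per ACS). The two numbers below are
# the Cleveland figures discussed later in the post.
acs_school_age_residents = 67284   # ACS: 5-17 year olds living in the district
district_enrollment = 40871        # district-reported enrollment

enrollment_rate = district_enrollment / acs_school_age_residents
print(f"Share of resident school-age kids enrolled in the district: {enrollment_rate:.1%}")
```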

Results:

[Table: district enrollment as a share of resident school-age children, NAEP TUDA districts]

Another caveat–some of these data points are from different years so again, some fuzziness. All within the last three or four years, though.

So this leads into another interesting question: the districts don’t report poverty anywhere I can find (although I think some of them have the data as part of their Title I metrics) and in any event, they never report it by race. I have the number and percent of poor black children in the region, but how many of them attend district schools?

So to take Cleveland, for example, the total 5-17 district population was 67,284. But the enrolled population was 40,871, or 60.7% of the district population.

According to ACS, 22,445 poor black children age 5-17 live in the district, and I want an approximation of the black and overall poverty rates for the district schools. How do I apportion poverty? I do not know the actual poverty rate for the district’s black kids. I saw three possibilities:

  1. I could use the black child poverty rate for the residents of the Cleveland district (ACS ratio of poor black children to ACS total black children). That would assume (I think) that the poor black children were evenly distributed over district and non-district schools.
  2. I could take the enrollment rate and multiply it by the poor black children in ACS—and then use that to calculate the percentage of poor kids among the blacks enrolled.
  3. I could assign all the black children in poverty (according to ACS) to the black children enrolled in the district (using district given percentage of black children enrolled).

Well, the middle method is way too complicated and hurts my head. Plus, it didn’t really seem all that different from the first method; both assume poor black kids would be just as likely to attend a charter or private school as they would their local district school. The third method assumes the opposite—that kids in poverty would never attend private or charter schools. This method would probably overstate the poverty rates.
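Here’s a rough sketch of what methods 1 and 3 amount to for a single district. The ACS poor-child count and the enrollment figure are the Cleveland numbers above; the total black child count and the black share of enrollment are placeholders I made up purely for illustration, since the post doesn’t quote them.

```python
# Method 1 vs. method 3 for one district. Two inputs are real Cleveland
# figures from the post; the other two are made-up placeholders.
acs_poor_black_children = 22445     # ACS: poor black 5-17 residents (Cleveland)
acs_total_black_children = 38000    # placeholder, NOT a real figure
district_enrollment = 40871         # district enrollment (Cleveland)
black_share_of_enrollment = 0.65    # placeholder, NOT a real figure

# Method 1: use the residential poverty rate, i.e. assume poor black kids are
# spread across district, charter, and private schools at the same rate.
method_1_rate = acs_poor_black_children / acs_total_black_children

# Method 3: assign every poor black child in the ACS count to the district's
# black enrollment. Overstates poverty when many kids attend non-district schools.
enrolled_black_children = district_enrollment * black_share_of_enrollment
method_3_rate = min(acs_poor_black_children / enrolled_black_children, 1.0)

print(f"Method 1 (even distribution): {method_1_rate:.1%}")
print(f"Method 3 (all poor kids in district schools): {method_3_rate:.1%}")
```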

So here are poverty levels calculated by methods 1 and 3–ACS vs assigning all the poor black students to the district. In most cases, the differences were minor. I highlight the districts that have greater than 10 percentage points difference.

[Table: black child poverty rates by district, method 1 (ACS rate) vs. method 3 (all poor black children assigned to the district)]

Again, is it just a coincidence that the schools with the lowest enrollment rates and the widest range of potential poverty rates have some of the lowest NAEP scores?

Finally, after all this massaging, I had some data to run regression analysis on. But I want to do that in a later post. Here, I want to focus on the fact that gathering this data was ridiculously complicated and required a fair amount of manual entry and calculations.

If I didn’t take the long way round, I suspect this effort is why researchers use the National School Lunch Program (“free and reduced lunch”) as a poverty proxy.

The problem is that the poverty proxy sucks, and we need to stop using it.

Schools and districts have noticed that researchers use National School Lunch enrollment numbers as a proxy for poverty, and it’s also a primary criterion for Title I allocations. So it’s hard not to wonder about Boston’s motives when the district decides to give all kids free lunches regardless of income level, and whether it’s really about “awkward socio-economic divides” and “invasive questions”. The higher the average income of a district’s “poor” kids, the easier it is to game the NCLB requirements, for example.

Others use the poverty proxy to compare academic outcomes and argue for their preferred policy, particularly on the reform side of things. For example, charter school research uses the proxy when “proving” they do a “great job educating poor kids” when in fact they might just be skimming the not-quite-as-poor kids and patting themselves on the back. We can’t really tell. And of course, the NAEP uses the poverty proxy as well, and then everyone uses it to compare the performance of “poor” kids. See, for example, this analysis by Jill Barshay, highlighted by Alexander Russo (with Paul Bruno chiming in to object to FRL as poverty proxy). Bruce Baker does a lot of work with this.

To see exactly how untrustworthy the “poverty proxy” is, consider the NAEP TUDA results broken down by participation in the NSLP.

[Table: NAEP TUDA black scores by National School Lunch Program eligibility]

Look at all the cities that have no scores for blacks who aren’t eligible for free or reduced lunch: Boston, Cleveland, Dallas, Fresno, Hillsborough County, Los Angeles, Philadelphia, and San Diego. These cities apparently have no blacks with income levels higher than 185% of poverty. Detroit can drum up non-poor blacks, but Hillsborough County, Boston, Dallas, and Philadelphia can’t? That seems highly unlikely, given the poverty levels outlined above. Far more likely that the near-universal poverty proxy includes a whole bunch of kids who aren’t actually poor.

In any event, the feds, after giving free lunches to everyone, decided that NSLP participation levels are pretty meaningless for deciding income levels “…because many schools now automatically enroll everyone”.

I find this news slightly cheering, as it suggests that I’m not the only one having a hard time identifying the actually poor. Surely this article would have mentioned any easier source?

So. If someone can come back and say “Ed, you moron. This is all in a table, which I will now conveniently link in to show you how thoroughly you wasted seventeen days”, I will feel silly, but less cynical about education policy wonks hyping their notions. Maybe they do know more than I do. But it’s at least pretty likely that no one is looking at actual district poverty rates by race when fulminating about academic achievement, because what I did wasn’t easy.

Andy Smarick, at any rate, wasn’t paying any attention to poverty rates. And he should be. Because Detroit isn’t Boston.

This post is long enough, so I’ll save my actual analysis data for a later post. Not too much later, I hope, since I put a whole bunch of work into it.


Algebra 1 Growth in Geometry and Algebra II, Spring 2013

This is part of an ongoing series on my Algebra II and Geometry classes. By definition, students in these classes should have some level of competence in Algebra I. I’ve been tracking their progress on an algebra I pre-assessment test. The test assesses student ability to evaluate and substitute, use PEMDAS, solve simple equations, operate with negative integers, and combine like terms. It tiptoes into first semester algebra—linear equations, simple systems, basic quadratic factoring—but the bulk of the 50 questions involve pre-algebra. While I used the test at my last school, I only thought of tracking student progress this year.

My school is on a full-block schedule, which means we teach a year’s content in a semester, then repeat the whole cycle with another group of students. A usual teacher schedule is three daily 90-minute classes, with a fourth period prep. I taught one algebra II and one geometry class first semester (the third class prepared low ability students for a math graduation test); their results are here.

So in round two, I taught two Algebra 2 courses and one Geometry 10-12 (as well as a precalc class not part of this analysis). My first geometry class was freshmen only. In my last school, only freshmen who scored advanced or proficient on their 8th grade algebra test were put into geometry, while the rest took another year of algebra. In this school, all a kid has to do is pass algebra to be put into geometry, but we offer both honors and regular geometry. So my first semester class, Geometry 9, was filled with well-behaved kids with extremely poor algebra skills, as well as a quarter or so of kids who had stronger skills but weren’t interested in taking honors.

I was originally expecting my Geometry 10-12 class to be extremely low ability and so wasn’t surprised to see they had a lower average incoming score. However, the class contained 6 kids who had taken Honors Geometry as freshmen—and failed. Why? They didn’t do their homework. “Plus, proofs. Hated proofs. Boring,” said one. These kids knew the entire geometry fact base, whether or not they grokked proofs, which they will never use again. I can’t figure out how to look up their state test scores yet, but I’m betting they got basic or higher in geometry last year. But because they were put into Honors, they have to take geometry twice. Couldn’t they have been given a C in regular geometry and moved on?

But I digress. Remember that I focus on number wrong, not number right, so a decrease is good.

[Table: Algebra I pre/post assessment results, Spring 2013 Geometry and Algebra II classes]

Again, I offer up as evidence that my students may or may not have learned geometry and second year algebra, but they know a whole lot more basic algebra than they did when they entered my class. Fortunately, my test scores weren’t obliterated this semester, so I have individual student progress to offer.

I wasn’t sure the best way to do this, so I did a scatter plot with data labels to easily show student before/after scores. The data labels aren’t reliably above or below the point, but you shouldn’t have to guess which label belongs to which point.

So in case you’re like me and have a horrible time reading these graphs: scores far to the right on the x-axis are those who did poorly the first time, and scores low on the y-axis are those who did well the second time. So the upper right corner holds the students who were weak at both beginning and end, and the lower left corner holds the strong students who did well on both.
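For anyone who’d rather build this kind of labeled before/after plot in code instead of a spreadsheet, here’s a rough sketch. The student data is made up, and the label offsets are just a guess at what keeps labels readable.

```python
# Labeled before/after scatter: x = questions wrong on the first test,
# y = questions wrong on the second, one label per student. Data is made up.
import matplotlib.pyplot as plt

before = [22, 15, 10, 30, 5, 18]   # wrong answers, first test (illustrative)
after = [14, 12, 3, 28, 5, 9]      # wrong answers, second test (illustrative)
labels = ["S1", "S2", "S3", "S4", "S5", "S6"]

fig, ax = plt.subplots()
ax.scatter(before, after)
for x, y, name in zip(before, after, labels):
    ax.annotate(name, (x, y), textcoords="offset points", xytext=(4, 4))
ax.set_xlabel("Wrong on pre-assessment")
ax.set_ylabel("Wrong on post-assessment")
ax.set_title("Algebra I pre/post, wrong answers per student")
plt.show()
```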

Geometry first. Thirty-one students took both tests.

[Chart: Spring 2013 Geometry, individual pre/post assessment scores]

Four students saw no improvement, another four actually got more wrong, although just 1 or 2 more. Another 3 students saw just one point improvement. But notice that through the middle range, almost all the students saw enormous improvement: twelve students, over a third, got from five to sixteen more correct answers, that is, improved from 10% to over 30%.

Now Algebra 2. Forty-eight students took both tests; I had more testers at the end than at the beginning; about ten students started a few days late.

[Chart: Spring 2013 Algebra II, individual pre/post assessment scores]

Seven got exactly the same score both times, but only three declined (one of them a surprising 5 points—she was a good student. Must not have been feeling well). Eighteen (also a third) saw improvements of 5 to 16 points.

The average improvement was larger for the Algebra 2 classes than the Geometry classes, but not by much. Odd, considering that I’m actually teaching algebra, directly covering some of the topics in the test. In another sense, not so surprising, given that I am actually tasked to teach an entirely different topic in both cases. I ain’t teaching to this test. Still, I am puzzled that my algebra II students consistently show similar progress to my geometry students, even though they are soaked in the subject and my geometry students aren’t (although they are taught far more algebra than is usual for a geometry class).

I have two possible answers. Algebra 2 is insanely complex compared to geometry, particularly given I teach a very slimmed-down version of geometry. The kids have more to keep track of. This may lead to greater confusion and difficulty retaining what they’ve learned.

The other possibility is one I am reminded of by a beer-drinking buddy, a serious mathematician who also teaches math: namely, that I’m a kickass geometry teacher. He bases this assertion on a few short observations of my classes and extensive discussions, fueled by many tankards of ale, of my methods and conceptual approaches (e.g.: Real-life coordinate Geometry, Geometry: Starting Off, Teaching Geometry, Teaching Congruence or Are You Happy, Professor Wu?, Kicking Off Triangles, Teaching Trig).

This possibility is a tad painful to contemplate. Fully half the classes I’ve taught in my four years of teaching—twelve out of twenty-four—have been some form of Algebra, either actual Algebra I or Algebra I pretending to be Algebra II. I spend hours thinking about teaching algebra, about making it more understandable, and I believe I’ve had some success (see my various posts on modeling).

Six of those 24 classes have been geometry. Now, I spend time thinking about geometry, too, but not nearly as much, and here’s the terrible truth: when I come up with a new method to teach geometry, whether it be an explanation or a model, it works for a whole lot longer than my methods in algebra.

For example, I have used all the old standbys for identifying slope direction, as well as devising a few of my own, and the kids are STILL doing the mental equivalent of tossing a coin to determine if it’s positive or negative. But when I teach my kids how to find the opposite and adjacent legs of an angle (see “teaching Trig” above), the kids are still remembering it months later.

It is to weep.

I comfort myself with a few thoughts. First, it’s kind of cool being a kickass geometry teacher, if that is my fate. It’s a fun class that I can sculpt to my own design, unlike algebra, which has a billion moving parts everyone needs again.

Second, my algebra II kids say without exception that they understand more algebra than they ever did in the past, that they are willing to try when before they just gave up. Even the top kids who should be in a different class tell me they’ve learned more concepts than before, when they tended to just plug and play. My algebra 2 kids are often taking math placement tests as they go off to college, and I track their results. Few of them are ending up in more than one class out of the hunt, which would be my goal for them, and the best are placing out of remediation altogether. So I am doing something right.

And suddenly, I am reminded of my year teaching all algebra, all the time, and the results. My results looked mediocre, yet the school had a stunningly successful year based on algebra growth in Hispanic and ELL students—and I taught the most algebra students and the most of those particular categories.

Maybe what I get is what growth looks like for the bottom 75% of the ability/incentive curve.

Eh. I’ll keep mulling that one. And, as always, spend countless hours trying to think up conceptual and procedural explanations that stick.

I almost titled this post “Why Merit Pay and Value Added Assessment Won’t Work, Part IA” because if you are paying attention, that conclusion is obvious. But after starting a rant, I decided to leave it for another post.

Also glaringly on display to anyone not ignorant, willfully obtuse, or deliberately lying: Common Core standards are irrelevant. I’d be cynically neutral on them because hell, I’m not going to change what I do, except the tests will cost a fortune, so go forth ye Tea Partiers, ye anti-test progressives, and kill them standards daid.


Why Merit Pay and Value Added Assessment Won’t Work, Part I

The year I taught Algebra I, I did a lot of data collection, some of which I discussed in an earlier post. Since I’ve been away from that school for a while, I thought it’d be a good time to finish the discussion.

I’m not a super stats person. I’m not even a mathematician. To the extent I know math, it’s applied math, with the application being “high school math problems”. This is not meant to be a statistically sound analysis, comparing Treatment A to Treatment B. But it does reveal some interesting big picture information.

This data wasn’t just sitting around. A genuine DBA could have probably whipped up the report in a few hours. I know enough SQL to get what I want, but not enough to get it quickly. I had to run reports for both years, figure out how to get the right fields, link tables, blah blah blah. I’m more comfortable with Excel than SQL, so I dumped both years to Excel files and then linked them with student id. Unfortunately, the state data did not include the subject name of each test. So I could get 2010 and 2011 math scores, but it took me a while to figure out how to get the 2010 test taken—and that was a big deal, because some of the kids whose transcripts said algebra had, in fact, taken the pre-algebra (general math) test. Not that I’m bitter, or anything.
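For the curious, here’s roughly what that linking step looks like in pandas rather than Excel. The file names and column names are assumptions for illustration, not the district’s actual export format.

```python
# Sketch of linking two years of test scores by student id and keeping only
# kids who actually took the algebra test in 2010. File and column names are
# illustrative assumptions, not the real export format.
import pandas as pd

scores_2010 = pd.read_excel("scores_2010.xlsx")   # student_id, test_taken, score
scores_2011 = pd.read_excel("scores_2011.xlsx")   # student_id, score

both_years = scores_2010.merge(
    scores_2011, on="student_id", suffixes=("_2010", "_2011")
)

# Transcripts said "algebra," but some kids actually sat the general math
# (pre-algebra) test, so filter on the test actually taken, not the course name.
algebra_both_years = both_years[both_years["test_taken"] == "Algebra I"].copy()
algebra_both_years["change"] = (
    algebra_both_years["score_2011"] - algebra_both_years["score_2010"]
)
print(algebra_both_years[["student_id", "score_2010", "score_2011", "change"]].head())
```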

Teachers can’t get this data easily. I haven’t yet figured out how to get the data for my current school, or if it’s even possible. I don’t know what my kids’ incoming scores are, and I still haven’t figured out how my kids did on their graduation tests.

So the data you’re about to see is not something teachers or the general public generally has access to.

At my last school, in the 2010-11 school year, four teachers taught algebra to all but 25 of over 400 students. I had the previous year’s test scores for about 75% of the kids, 90% of whom had taken algebra the year before, the other 10% or so having taken pre-algebra. This is a slightly modified version of my original graph; I put in translations of the scores and percentages.

[Chart: 2010 incoming test score distribution by algebra teacher]

You should definitely read the original post to see all the issues, but the main takeaway is this: Teacher 4 has a noticeably stronger population than the other three teachers, with over 40% of her class having scored Basic or Higher the year before, usually in Algebra. I’m Teacher 3, with by far the lowest average incoming scores.

The graph includes students for whom I had 2010 school year math scores in any subject. Each teacher has from 8-12 pre-algebra student scores included in their averages. Some pre-algebra kids are very strong; they just hadn’t been put in algebra as 8th graders due to an oversight. Most are extremely weak. Teachers are assessed on the growth of kids repeating algebra as well as the kids who are taking it for the first time. Again, 80% of the kids in our classes had taken algebra once; 10-20% had taken it twice (our sophomores and juniors).

Remember that at the time of these counts, I had 125 students. Two of the other teachers (T1 and T4) had just under 100, the third (T2) had 85 or so. The kids not in the counts didn’t have 2010 test scores. Our state reports student growth for those with previous years’ scores and ignores the rest. The reports imply, however, that the growth is for all students. Thanks, reports! In my case, three or four of my strongest students were missing 2010 scores, but the bulk of my students without scores were below average.

So how’d we do?

I limited the main comparison to the 230 students who took algebra both years, had scores for both years, and had one of the four teachers.

[Table: algebra score improvement by teacher, students with scores both years]

Here is the growth for the pre-algebra and algebra intervention students—the pre-algebra kids are not part of the above scores, but the algebra intervention group is a sub-group of them. These are tiny groups, but illustrative:

[Table: score improvement, pre-algebra and algebra intervention students]

The individual teacher category gains/slides/pushes are above; here they are in total:
[Table: performance-category change matrix, all teachers combined]

(Arrrggh, I just realized I left off the years. Vertical is 2010, horizontal is 2011.)

Of the 230 students who took algebra two years in a row, the point gain/loss categories went like this:

Score change > +50 points: 57
Score change < -20 points: 27
Score change between -20 and +50 points: 146
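Tallying those buckets from a list of per-student point changes is a one-liner apiece; here’s a sketch with made-up changes:

```python
# Gain/slide/push buckets from per-student point changes (made-up data).
changes = [62, -25, 10, 55, -5, 0, 120, -40, 33]

gains = sum(1 for c in changes if c > 50)     # gained more than 50 points
slides = sum(1 for c in changes if c < -20)   # dropped more than 20 points
pushes = len(changes) - gains - slides        # everything in between

print(f"Gains: {gains}, Slides: {slides}, Pushes: {pushes}")
```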

Why the Slice and Dice?

As I wrote in the original post, Teacher 1 and I were positive that Teacher 4 had a much stronger student population than we did—and the data supports that belief. Consequently I suspected that no matter how I sliced the data, Teacher 4 would have the best numbers. But I wanted a much better idea of how I’d done, based on the student population.

Because one unshakeable fact kept niggling at me: our school had a tremendous year in 2010-2011, based largely on our algebra scores. We knew this all throughout the year—benchmark tests, graduation tests—and our end of year tests confirmed it, giving us a huge boost in the metrics that principals and districts cared about. And I’d taught far more algebra students than any other teacher. Yet my numbers based on the district report looked mediocre or worse. I wanted to square that circle.

The district reports the data on the right. We were never given average score increase. A kid who had a big bump in average score was irrelevant if he or she didn’t change categories, while a kid who increased 5 points from the top of one category to the bottom of another was a big win. All that mattered were category bumps. From this perspective, my scores look terrible.

I wanted to know about the data on the left. For example, Teacher 1 had far better “gain” category numbers than I did. But we had the same mean improvement overall, of 5%, with comparable increases in each category. Broken down further, Teacher 4’s spectacular numbers are accompanied by a huge standard deviation—she improved some kids a lot. The other three teachers might not have had as dramatic a percentage increase, but the kids moved up more consistently. In three cases, the average score declined, but was accompanied by a big increase in standard deviation, suggesting many of the kids in that category improved a bit, while a few had huge drops. Teacher 2 and I had much tighter achievement numbers—I may have moved my students less far, but I moved a lot of them a little bit. None of this is to argue for one teacher’s superiority over another.

Of course, once I broke the data down by initial ability, group size became relevant, but I don’t have the overall numbers for each teacher and each category to calculate a confidence interval, so I settled for a minimum group size. I like 10. Eleven of the 18 categories hit that mark.

How many kids have scores for both years?

The 2011 scores for our school show that just over 400 students took the algebra test. My fall 2010 graph above shows 307 students with 2010 scores (in any subject) who began the year. Kick in another 25 for the teacher I didn’t include and we had about 330 kids with 2010 scores. My results show 230 kids with algebra scores for both years, and the missing teacher had 18, making 248. Another 19 kids had pre-algebra scores for the first year, although the state’s reports wouldn’t have cared about that. So 257 of the kids had scores for both years, or about 63% of the students tested.

Notice that I had the biggest fall-off in student count. I think five of my kids were expelled before the tests, and another four or so left for alternative campuses. I remember that two went back to Mexico; one moved to his grandparents’ in Iowa. Three of my intervention students were so disruptive during the tests that they were ejected, so their test results were not scored (the next year our school had a better method of dealing with disruptive students). Many of the rest finished the year and took the tests, but they left the district over the summer (not sure if they are included in the state reports, but I couldn’t get their data). In actual student counts, I went from 125 to 95 by year-end.

What about the teachers?

Teacher 1: TFA, early-mid 20s, Asian, first year teacher. She had a first-class honors master’s degree in Economics from one of the top ten universities in Europe. She did her two years, then left teaching and is now doing analytics for a fashion firm in a city where “fashion firm” is a big deal. She was the best TFAer I’ve met, and an excellent new teacher.

Teacher 2: About 60. White. A 20-year teacher who started in English, took time off to be a mom, then came back and got a supplemental math credential. She is only qualified to teach algebra. She is the prototype for the Teacher A I described in my last post, an algebra specialist widely regarded as one of the finest teachers in the district, a regard I find completely warranted.

Teacher 3: Me. 48 at the time, white. Second career, second year teacher, English major originally but a 15-year techie. Went to one of the top-rated ed schools in the country.

Teacher 4: Asian, mid-late 30s. Math degree from a solid local university, teaches both advanced math and algebra. She became the department head the next year. The reason her classes are top-loaded with good students: the parents request her. Very much the favorite of administration and district officials.

And so, a Title I school, predominantly Hispanic population (my classes were 80% Hispanic), teachers that run the full gamut of desirability—second career techie from a good ed school, experienced pro math major, experienced pro without demonstrated higher math ability, top-tier recent college grad.

Where was the improvement? Case 1: Educational Policy Objectives

So what is “improvement”? Well, there’s a bunch of different answers. There’s “significant” improvement as researchers would define it. Can’t answer that with this data. But then, that’s not really the point. Our entire educational policy is premised on proficiency. So what improvement does it take to reach “proficiency”, or at least to change categories entirely?

Some context: In our state, fifty points is usually enough to move a student from the bottom of one category to the bottom of another. So a student who was at the tip top of Below Basic could increase 51 points and make it to the bottom of Proficient, which would be a bump of two categories. An increase of 50 points is, roughly, a 17% increase. Getting from the bottom of Far Below Basic to Below Basic requires an increase of 70%, but since the kids were all taking Algebra for the second time, the boost needed to get them from FBB to BB was a more reasonable 15-20%. To get from the top of the Far Below Basic category to Proficient—the goal that we are supposed to aim for—would require a 32% improvement. Improving from top of Basic to bottom of Advanced requires a 23% improvement.

Given that context, only two of the teachers in one category each moved the needle enough to even think about those kind of gains—and both categories had 6-8 students. Looking at categories with at least ten students, none of the teachers had average gains that would achieve our educational policy goals. In fact, from that perspective, the teachers are all doing roughly the same.

I looked up our state reports. Our total population scoring Proficient or Advanced increased 1%.

Then there’s this chart again:

[Table: performance-category change matrix, all teachers combined]

32 students moved from “not proficient” to “proficient/advanced”. 9 students moved from “proficient” to “advanced”. I’ll throw them in. 18% of our students were improved to the extent that, officially, 100% are supposed to achieve.

So educational policy-wise, not so good.

Where was the improvement? Case 2: Absolute Improvement

How about at the individual level? The chart helps with that, too:

[Table: performance-category change matrix, all teachers combined]

Only 18 students were “double gainers,” moving up two categories instead of one. Twelve of those students belonged to Teacher 4; four belonged to Teacher 1, while Teacher 2 and I had one each (although I had two more that just missed by under 3 points). Teachers 1, 2, and 3 had one “double slider” each, who dropped two categories.

(I interviewed all the teachers on the double gainers; in all cases, the gains were unique to the students. The teachers all shrugged—who knew why this student improved? It wasn’t some brilliant aha moment unique to that teacher’s methods, nor was it due to the teacher’s inspiring belief and/or enthusiasm. Two of the three echoed my own opinion: the students’ cognitive abilities had just developed over the past year. Or maybe for some reason they’d blown off the test the year before. I taught two of the three “double sliders”—one was mine, one I taught the following year in geometry, so I had the opportunity to ask them about their scores. Both said “Oh, yeah, I totally blew off the test.”)

So a quarter of the students had gains sufficient to move from the middle of one category to the middle of another. The largest improvement was 170 points, with about 10 students seeing >100 point improvement. The largest decline was 169 points, with 2 students seeing over 100 point decline. Another oddity: only one of these two students was a “double slider”. The other two “double sliders” had less than 100 point declines. My double slider had a 60 point decline; my largest point decline was 89 points, but only dropped one category.

However, the primary takeaway from our data is that 63% of the students forced to take algebra twice were, score-wise if not category-wise, a “push”. They dropped or gained slightly, may have moved from the bottom of one category to the middle of the same, or maybe from the top of one category to the bottom of another.

One might argue that we wasted a year of their lives.

State reports say our average algebra score from 2010 to 2011 nudged up half a point.

So it’s hard to find evidence that we made much of a difference to student achievement as a whole.

I know this is a long post, so I’ll remind the reader that all of the students in my study have already taken algebra once. Chew on that for a while, will you?

Where was the improvement? Case 3: Achievement Gap

I had found no answer to my conundrum in my above numbers, although I had found some comfort. Broken down by category, it’s clear I’m in the hunt. But the breakdown doesn’t explain how we had such a stupendous year.

But when I thought of comparing our state scores from year to year, I got a hint. The other way that schools can achieve educational policy objectives is by closing the achievement gap.

All of this data comes from the state reports for our school, and since I don’t want to discuss who I am on this blog, I can’t provide links. You’ll have to take my word for it—but then, this entire post is based on data that no one else has, so I guess the whole post involves taking my word for it.

2010-11 Change
Overall:      +0.5
Whites:       -7.2
Hispanics:    +4
EcDis Hisp:   -1
ELL:          +7

Wow. Whites dropped by seven points, Hispanics overall increased by 4, and non-native speakers (almost entirely Hispanic and economically disadvantaged), increased by 7 points.

So clearly, when our administrator was talking about our great year, she was talking about our cleverness in depressing white scores whilst boosting Hispanics.

Don’t read too much into the decline. For example, I personally booted 12 students, most of them white, out of my algebra classes because they’d scored advanced or proficient in algebra the previous year. Why on earth would they be taking the subject again? No other teacher did this, but I know that these students told their friends that they could get out of repeating Algebra I simply by demanding to be put in geometry. So it’s quite possible that much of the loss is due to fewer white advanced or proficient students taking algebra in the first place.

So who was teaching Hispanics and English Language Learners? While I can’t run reports anymore, I did have my original file of 2010 scores. So this data is incoming students with 2010 scores, not the final 2011 students. Also, in the file I had, the ED and ELL overlap was 100%, and I didn’t care about white or black EDs for this count. Disadvantaged non-ELL Asians in algebra is a tiny number (hell, even with ELL). So I kept ED out of it.

            Hisp   ELL
Teacher 1    30     21
Teacher 2    32     38
Teacher 3    48     37
Teacher 4    39     12

Well, now. While Teacher 4 has a hefty number of Hispanics, very few of them are poor or ELLs. Teacher 2 seems to have Asian ELLs in addition to Hispanic ELLs. I have a whole bunch of Hispanics, most of them poor and ELL.

So I had the most mediocre numbers, but we had a great year for Hispanic and ELL scores, and I had the most Hispanic and ELL students. So maybe I was inadvertently responsible for depressing white scores by booting all those kids to geometry, but I had to have something to do with raising scores.

Or did I? Matthew DiCarlo is always warning against confusing year-to-year score comparisons, which are a cross-section of data at a point in time, with comparisons of student progress at two different points in time. In fact, he would probably say that I don’t have a conundrum, that it’s quite possible for me to have been a crappy teacher who had minimal impact on student achievement compared point to point, while the school’s “cross-section” data, which doesn’t compare students directly, could have some other reason for the dramatic changes.

Fair enough. In that case, we didn’t have a great year, right? It was just random happenstance.

This essay is long enough. So I’ll leave it to anyone interested to explain why this data shows that merit pay and value added scores are pointless. I’m not sure when I’ll get back to it, as I’ve got grades to do.


Spring 2013: These students aren’t really prepared, either.

I’m teaching Geometry and Algebra II again, so I gave the same assessment and got these results, with the beginning scores from the previous semester:

[Table: Algebra I pre-assessment results, Spring 2013 classes compared with Fall 2012]

I’m teaching two algebra II classes, but their numbers were pretty close to identical—one class had the larger range and a lower mode—so I combined them.

The geometry averages are significantly lower than the fall freshmen-only class, which isn’t surprising. Kids who move on to geometry from 8th grade algebra are more likely to be stronger math students, although (key plot point) in many schools, the difference between moving on and staying back in algebra comes down to behavior, not math ability. At my last school, kids who didn’t score Proficient or Advanced had to take Algebra in 9th grade. I’d have included Basic kids in the “move-on” list as well. But sophomores who not only can’t factor or graph a line, but struggle with simple substitution ought not to be in second year algebra. They should repeat algebra I freshman year, go on to geometry, and then take algebra II in junior year—at which point, they’d still be very weak in algebra, of course, but some would have benefited from that second year of first year.

Wait, what was my point? Oh, yeah–this geometry class is 10-12, so the students took one or more years of high school algebra. Some of them will have just goofed around and flunked algebra despite perfectly adequate to good skills, but a good number will also be genuinely weak at math.

On the other hand, a number of them really enjoyed my first activity: visualizing intersecting planes, graphing 3-D points. I got far more samples from this class. I’ll put those in another post, also the precalc assessment.

I don’t know if my readers (I have an audience! whoo!) understand my intent in publishing these assessment results. In no way am I complaining about my students.

My point in a huge nutshell: how can math teachers be assessed on “value-added” when the testing instrument will not measure what the students needed to learn? Last semester, my students made tremendous gains in first year algebra knowledge. They also learned geometry and second year algebra, but over half my students in both classes will test Below Basic or Far Below Basic–just as they did the year before. My evaluation will faithfully record that my students made no progress—that they tested BB or FBB the year before, and test the same (or worse) now. I will get no credit for the huge gains they made in pre-algebra and algebra competency, because educational policy doesn’t recognize the existence of kids taking second year algebra despite being barely functional in pre-algebra.

The reformers’ response:

1) These kids just had bad teachers who didn’t teach them anything, and in the Brave New World of Reform, these bad teachers won’t be able to ruin students’ lives;

2) These bad teachers just shuffled students who hadn’t learned onto the next class, and in the Brave New World of Reform, kids who can’t do the work won’t pass the class.

My response:

1) Well, truthfully, I think this response is moronic. But more politely, this answer requires willful belief in a delusional myth.

2) Fail 50-60% of kids who are forced to take math classes against their will? Seriously? This answer requires a willful refusal to think things through. Most high schools require a student to take and pass three years of math for graduation. Fail a kid just once, and the margin for error disappears. Fail twice and the kid can’t graduate. And in many states, the sequence must start with algebra—pre-algebra at best. So we are supposed to teach all students, regardless of ability, three years of increasingly abstract math and fail them if they don’t achieve basic proficiency. If, god save us, the country was ever stupid enough to go down this reformer path, the resulting bloodbath would end the policy in a year. We’re not talking the occasional malcontent, but over half of a graduating class in some schools—overwhelmingly, this policy impacts black and Hispanic students. But it’s okay. We’re just doing it for their own good, right? Await the disparate impact lawsuits—or, more likely, federal investigation and oversight.

Reformers faithfully hold out this hope: bad teachers are creating lazy students who could do the work but just don’t want to. Oh, yeah, and if we catch them in elementary school, they’ll be fine in high school.

It is to weep.

Hey, under 1000 words!


Algebra 1 Growth in Geometry and Algebra II

Last September, I wrote about my classes and the pre-algebra/Algebra 1 assessment results.

My school covers a year of instruction in a semester, so we just finished the first “year” of courses. I start with new students and four preps on Monday. Last week, I gave them the same assessment to see if they’d improved.

Unfortunately, the hard drive on my school computer got wiped in a re-imaging. This shouldn’t have been a problem, because I shouldn’t have had any data on the hard drive, except I never got put on the network. Happily, I use Dropbox for all my curriculum development, so an entire year’s worth of intellectual property wasn’t obliterated. I only lost the original assessment results, which I had accidentally stored on the school hard drive. I should have entered the scores in the school grading system (with a 0 weight, since they don’t count towards the grade) but only did that for geometry, the only class I can directly compare results with.

My algebra II class, though, was incredibly stable. I only lost three students, one of whom got a perfect score—which the only new addition to the class also got, so balance maintained. The other two students who left got around 10-15 wrong, so were squarely in the average at the time. I feel pretty comfortable that the original scores didn’t change substantially. My geometry class did have some major additions and removals, but since I had their scores I could recalculate.

               Mean                  Median   Mode   Range
Original       just above 10         9.5      7      22
Recalculated   just below 10 (9.8)   8        7      22

I didn’t have the Math Support scores, and enough students didn’t take the second test that comparisons would be pointless.

One confession: Two Algebra II students, the weakest two in the class, who did no work, scored 23 and 24 wrong, which was 11 more than the next lowest score. Their scores added an entire point to the average wrong and increased the range by 14 points, and you know, I just said bye and stopped them from distorting the results for the other 32 kids. (I don’t remember exactly, but the original A2 tests had five or six 20+ wrong scores.)

So here’s the original September graph and the new graph of January:

[Charts: September and January Algebra I assessment score distributions]

The geometry class was bimodal: 0 and 10. Excel refused to acknowledge this and I wasn’t sure how to force it. The 10s, as a group, were pretty consistent—only one of them improved by more than a point. The perfect scores ranged from 8 wrong to 2 wrong on the first test.

[Chart: individual growth, Geometry and Algebra II classes]

In short, they learned a lot of first year algebra, and that’s because I spent quite a bit of time teaching them first year algebra. In Algebra II, I did it with data modeling, which was a much more sophisticated approach than what they’d had before, but it was still first year algebra. In geometry, I minimize certain standards (proofs, circles, solid shapes) in favor of applied geometry problems with lots of algebra.

And for all that improvement, a still distressing number of students answered x² + 12 when asked what the product of (x+3) and (x+4) was, including two students who got an A in the class. I beat this into their heads, and STILL some of them forget that.

Some folks are going to draw exactly the wrong impression. “See?” these misguided souls will say, nodding wisely. “Our kids just aren’t being taught properly in early grades. Better standards, better teachers, this problem’s fixed! Until then, this poor teacher has to make up the slack.” In short, these poor fools still believe in the myth that they’ve never been taught.

When in fact, they were taught. Including by me—and I don’t mean the “hey, by the way, don’t forget the middle term in binomial multiplication”, but “you are clubbing orphan seals and making baby Jesus cry when you forget the middle term” while banging myself on the head with a whiteboard. And some of them just forgot anyway.

I don’t know how my kids will do on their state tests, but it’s safe to say that the geometry and second year algebra I exposed them to was considerably less than it would have been had their assessment scores at the beginning of class been the ones they got at the end of class. And because no one wants to acknowledge the huge deficit half or more of each class has in advanced high school math, high schools won’t be able to teach the kids the skills they need in the classes they need—namely, prealgebra for a year, “first year” algebra for two years, and then maybe some geometry and second year algebra. If they do okay on the earlier stuff.

Instead, high schools are forced to pretend that transcripts reflect reality, that all kids in geometry classes are capable of passing a pre-algebra test, much less an algebra one test. Meanwhile, reformers won’t know that I improved my kids’ basic algebra skills whilst still teaching them a lot of geometry/algebra II, because the tests they’ll insist on judging me with will assume a) that the kids had that earlier material mastered or b) that I could just catch them up quickly because after all, the only problem was the kids’ earlier teachers had never taught them.


Teaching Students with Utilitarian Spectacles

In my last post, commenter AllaninPortland said, of my Math Support students, “Their brains are wired a little too literally for modern life.”

James Flynn, of the Flynn Effect:

A century ago, people mostly used their minds to manipulate the concrete world for advantage. They wore what I call “utilitarian spectacles.” Our minds now tend toward logical analysis of abstract symbols—what I call “scientific spectacles.” Today we tend to classify things rather than to be obsessed with their differences. We take the hypothetical seriously and easily discern symbolic relationships.

Yesterday I gave my math support kids a handout on single step equations similar to the one in the link.

“Oh, I know how to do this,” said Dewayne. “Just subtract six from both sides.”

“You could do that,” I said. “But here’s what I want people to try. I want everyone to read the first equation as a sentence. What is it saying?”

“Some number added to six gets fourteen,” came from Andy.

“Excellent!”

“You mean, you don’t want us to subtract, add, do things to get x by itself?” asked Jose.

“That’s called ‘isolation’. You are ‘isolating’ x, getting it all by itself as you put it. Who knows how to do that?” Over half the class raised their hands. “Great. You can do that if you want to, but I’d like you to try seeing each equation just as Andy described it. Put the equation you see into words. This will help make it real, and will often give you the answer right away. For example, what number do I add to six to get 14?”

“Eight.” chorused most of the room.

“There you go. Now, remember, what did I say a fraction was?”

“Division.”

“So instead of saying ‘x over 5’, you’re going to say….”

“X divided by 5,” came back a number of students.

“Off you go.”

This worked for most of the students, but one student, Gerry, sat at the back of the room drawing, as he often does. After watching him do no work for 10 minutes, I called him up front. (Normally, I am wandering the room, but every so often I call them up for conversations instead.)

“So you aren’t working.”

“Yeah. I can’t do this.”

“Remember yesterday, when we were doing those PEMDAS problems? You were on fire!”

“Yeah, but it didn’t have the letters in it. I can do math when it doesn’t have letters. And yesterday, when you showed us how to just draw pictures for the word problems? That was cool. I think I can do those now.”

“You need to look at these problems from a different part of your brain.”

“A different what?”

“This is a really, really easy problem. Way easier than the math problems you solved in your head yesterday. But you don’t see this as the same kind of problem, so we have to fool your brain.”

“How do we do that?”

“Read the first problem aloud.”

“X + 6 = 14. This is when you have to do stuff to both sides, right? I can’t do that.”

“Read it again. But instead of saying x, say ‘what’.”

“Say ‘what’?”

“Yep.”

“You crazy.”

“Definitely. Try it.”

“What plus 6 = 14? 8.”

“There you go.”

He was sitting in one of my wheeled chairs, pushing it back and forth with his feet. This stopped him cold.

“Eight’s the answer? Holy sh**.”

“Try another. Without the language.”

“What minus 3 = 7. That’s nine…no, 10. Ten? Really? No f**k….no way.”

“And this one?”

“Oh, that’s a fraction. I can’t do those.”

“What did I tell you fractions were?”

“Division. Oh. What divided by 5 is 9? Forty five? No way?”

“So. I want to see you do this whole handout, 1-26, and every time you see an x, call it ‘what’. Remember to sketch out subtraction questions on a number line and think about direction.”

“Okay. Man, I can’t believe this.”

Fifteen minutes later, Gerry was done with the entire set. Only three minor errors, all involving negative numbers.

“I feel like a math genius,” he said with a wry grin.

I sat down next to him. “It’s like I said. We have to ask your brain a different question. So instead of tuning me out, next time I come up with some goofy idea using pictures or tiles or different words, give it a shot. And tell me if it works to give your brain the right question. Some of my ideas will work, some won’t. And some things, we won’t be able to fool your brain to answer a different way. But you know a lot more math than you think you do. You just have to figure out how to ask the question in a way your brain understands.”

Back to Flynn:

A greater pool of those capable of understanding abstractions, more contact with people who enjoy playing with ideas, the enhancement of leisure—all of these developments have benefited society. And they have come about without upgrading the human brain genetically or physiologically. Our mental abilities have grown, simply enough, through a wider acquaintance with the world’s possibilities.

But not everyone is capable of understanding abstractions to the same degree. Some people do better learning the names of capitals and Presidents and the planets in the solar system. They’d learn confidence and competence through interesting, concrete math word problems and situations, and enjoy reading and writing about specific historic events, news, or scientific inventions that helped society. Instead, we shovel them into algebra, chemistry, and literature analysis and make them feel stupid.

Students’ names have been changed. They are all awesome kids. Do not say mean things about them in the comments, which I can control, or other blogs, which I cannot.


The Sinister Assumption Fueling KIPP Skeptics?

Stuart Buck on KIPP critics:

It’s unwitting, to be sure; most of the critics haven’t thought through the logical implications of what they’re saying, and they would sincerely deny being racist in their thoughts or intentions. But even granting their personal good will, what they are saying is full of racially problematic implications. These KIPP critics are effectively saying that poor minority children are incapable of genuinely learning anything more than they already do. If poor minority children seem to be learning more, it can’t really be true; there must be some more sinister explanation for what’s going on.
…..
Now here’s the key point: If selection and attrition is what explains KIPP’s good results, then that logically means that several hundred extra hours a year being instructed in reading, math, music, art, etc. do NOT explain KIPP’s good results. But wait a minute: what does that really mean?
….
Nothing less than this: several hundred hours a year instructing kids doesn’t actually make much difference. Recall that KIPP’s critics say that if KIPP’s students seem to be learning more, it must be an artifact of how KIPP selects kids and then pushes out the low-performers. In saying that, KIPP’s critics are implying, however unwittingly, that no amount of effort or study could possibly get poor urban minorities to learn anything more.

Okay, let me be clear that I am not speaking for any other KIPP critic. While I don’t talk much about KIPP, I am certainly one who thinks their results are due to attrition, creaming, and the benefits that accrue from a homogenous and motivated population.

But yeah. In a nutshell, I’m saying this:

IF you take low ability kids (of any race or income) and IF you select for motivation in the parents, at least, and IF you remove the misbehaving or otherwise highly dysfunctional kids who don’t share their parents’ motivation, and IF you enforce strict behavioral indoctrination in middle class mores and IF you give them hundreds of hours more education a year and IF they are in middle school and IF they are simply being asked to catch up with the material that middle to high ability kids learned fairly effortlessly—that is, elementary reading and math skills…..

…then they will have slightly better test scores than similarly motivated low ability kids stuck in classes with the misbehavers and highly dysfunctional kids and fewer hours of seat time and less behavioral indoctrination into middle class mores, but their underlying abilities will still be weak and just as far behind their higher ability peers as they were before KIPP.

As I’ve written before, improving elementary school or middle school scores is a false god when it comes to improving actual high school outcomes. Children who need tons of hours to get up to grade level fundamentally differ from those reading at or above grade level from kindergarten on, and this difference matters increasingly as school gets harder. High school isn’t the linear progression through increasing difficulty that occurs in grades K-8, but a much different and far more difficult animal, now that we make everyone take college prep classes. There’s no evidence that KIPP students are learning more or closing the gap in high school, and call me cynical, but I’m really, really sure we’d be hearing about it if they were. KIPP is not transforming low ability kids into high ability kids, or even mid-level ability kids.

I am comfortable asserting that hours and hours of additional education time do nothing to change underlying ability. I’m not a racist, nor am I a nihilist who believes outcomes are set from birth. I do, however, hold the view that academic outcomes are determined in large part by cognitive ability. Scores are low in high poverty, high minority schools primarily because the students’ abilities are low to begin with, not because they enter school with a fixable deficit that just needs time to fill, and not because they fall behind thanks to poor teachers or misbehaving peers.

That doesn’t mean we can’t improve outcomes, particularly in high school, when we do a great deal of harm by trying to teach kids what they can’t learn and refusing to teach them what they can learn. And it doesn’t mean we couldn’t tremendously improve elementary school outcomes in numbers, if not individual demonstrated ability, by allowing public schools to do what KIPP does—namely, limit classes to motivated kids of similar ability.

Paul Bruno, another KIPP skeptic (whose views in no way should be confused with mine), thinks it’s wrong to dismiss KIPP achievements, because they show that public schools for low income kids simply need much more money. I disagree. What KIPP “success” shows is the importance of well-behaved, homogeneous classes.

So here’s my preferred takeaway from KIPP and other successful charter schools:

Since it’s evident that much of these schools’ success stories come from their ability to control and limit the population, why are we still hamstringing public schools? Here’s a thought: how about KIPP schools take those really, really tough kids and only those kids? Misbehave too often in public schools and off you go to a KIPP bootcamp, where they will drill you with slogans and do their best to indoctrinate you into middle class behavior and after a while you’ll behave because please, god, anything to get back to the nicer public schools! You could also create KIPP schools for special ed kids–put the special ed kids with cognitive issues and learning disabilities in their own, smaller schools. Meanwhile, public schools could extend the school day a bit, help the kids catch up as much as possible while still making school fun. While the average test score might not improve much, this approach would keep a lot of kids engaged in school through elementary school instead of lost, bored, or acting out in chaotic classes disrupted by a few unmanageable or extremely low ability kids.

See, that would scale a lot better. Instead, we set up small schools for what is actually the majority of all low income students—reasonably well-behaved, of low to middle ability and, with no one around to lead them astray, willing to give school a shot. Only a few kids get into these schools, while the rest of them are stuck in schools where just a few misbehavers make class impossible and really low ability kids take up a lot of additional teacher time. Crazy, that’s what it is. But what I just laid out is completely unworkable from an ideological standpoint, and as I explained in an earlier post, school policy is set by ideology and politics, not educational validity. To say nothing of the fact that KIPP doesn’t want to teach “those” kids.

Anyway. The reality is that yes, a low ability kid, regardless of income or race, will not, on average, become a high or mid ability kid simply because he spends a lot of seat time working his butt off in a KIPP school. Sorry Stuart.


SAT Prep for the Ultra-Rich, And Everyone Else

Whenever I read about SAT tutors charging in the hundreds of dollars, I’m curious. I know they exist, but I also know that I’m pretty damn good, and I’m not charging three figures per hour (close, though!). So I always read them closely to see if, in fact, these test prep tutors are super fab in a way that I’m not.

At the heart of all test prep stories lies the reporter’s implicit rebuke: See what rich people are doing for their kids? See the disadvantage that the regular folks operate under? You can’t afford those rates! You’re stuck with Kaplan or cheaper, cut-rate tutors! And that’s if you’re white. Blacks and Hispanics can’t even get that much. Privilege. It sucks.

And so the stories emphasize the cost of the tutors rather than offering any clear-eyed assessment of what, exactly, these tutors are doing to justify an hourly rate usually reserved for low-end lawyers. Never mind that these stories are always about the SAT, when the ACT is taken by just as many kids. The stories serve up propaganda more than they provide an accurate picture of test prep.

I’ve written before about the persistence of test prep delusions. Reality, summarized: blacks and Hispanics use test prep more than whites, Asians use it more than anyone. Rich parents are better off buying their kids’ way into college than obsessing about the last few points. Test prep doesn’t artificially inflate ability.

So what, in fact, is the difference between Lisa Rattray, test prep coach charging $300/hour; me, charging just short of 3 figures; and a class at Kaplan/Princeton/other SAT test prep schools?

Nothing much. Test prep coaches can work for a company or on their own. The only difference is their own preferences for customer acquisition. Tutors and instructors with a low risk tolerance just sign on with a company. Independent operators, comfortable with generating their own business, pick their markets based on that same tolerance. My customers sit comfortably in the high income bracket, say $500K to $5 million in yearly income, although I’ve worked with a couple of Fortune 500 families. Lisa Rattray and Joshua Brown, the featured tutors, clearly work with families a couple of notches up the income ladder from mine.

None of this has anything to do with quality of instruction. Test prep is a sales and marketing game. The research is clear: most kids improve at least a little, quite a few kids improve a lot, a very few kids stay put or, heaven forfend, get worse.

Obviously, instructor quality influences results a bit, but it only rarely moves a kid from one category (mild improvement) to another (major improvement). Remember, all test prep instructors have high test scores, and they’re all excellent at understanding how the test works. So they make career decisions based on their tolerance for sales and marketing, not the quality of their services. I know of some amazingly god-awful tutors who charge more than I do, having learned of them from their furious ex-clients who assumed a relationship between price and quality. These tutors have websites and business cards, offer their own prepared test materials, see students in their rented space, and often accept credit card deposits. I have none of these accoutrements, show up at my clients’ houses, usually but not always on time, and take checks. Every so often I get a client who whips out a wad of bills and pays me $500 in cash, which I find a tad unnerving.

I’m just as good now as I was at Kaplan (in fact, I privately tutored my own students while at Kaplan, tutoring theirs), but Kaplan only paid me $24/hour while charging about $125/hour for my services. Kaplan would (at least when I worked there) boost a teacher’s hourly rate to $50/hour if they got 80% or more “perfect” customer ratings. Instructors who convinced their students to respond to the online survey and give them excellent ratings got more money. This was independent of actual improvement. A customer who didn’t improve at all but felt reassured and valued by her instructor could give straight 5s (or 1s, whichever the highest rating was). A customer who saw a 300 point improvement might not fill in the survey at all. Kaplan’s research showed that customers who gave their instructors perfect ratings generated awesome word of mouth, and that was worth rewarding. Nothing else was. Asian cram schools pay instructors based on the students who sign up, with a premium for those who sign up specifically for that instructor. See? Sales and marketing.

Test prep companies, long castigated as the luxury option of the wealthy, have been the first choice of the middle class for a decade or more. For the reasons I’ve outlined, any parent can find excellent instructors in all the test prep companies: Kaplan, Princeton Review, Asian cram schools. They won’t brag about it, though, because these companies are about the brand. Kaplan doesn’t want word getting out that Joe Dokes is a great Kaplan instructor; it wants everyone to be happy with Kaplan. No one is “Princeton Review’s star tutor” for very long, because Princeton doesn’t like it, and at that point even the most risk-averse instructor probably has enough word of mouth fame to go independent.

I’ve often advised my students to consider a class. The structure helps. Some of my kids don’t do any work unless I’m there, so what I end up doing is sitting there playing Spider on my Android on my client’s dime while the kid works problems, rather than reviewing a bunch of work to move forward. I’m pretty sure Lisa and Joshua would celebrate this, going to the parent and pointing out how much they are helping. I have better things to do and other clients to see. So I tell the parents to fork out an extra thousand for a class, make sure the kid goes, and then we review the completed work. The student gets more hours, more focus, and, usually, higher scores, regardless of the quality of the second instructor.

I’m not saying Lisa and Joshua are wrong, mercenary, or irresponsible. They just play to a different clientele, and a huge chunk of their ability to do so rests on their desire to sell an image. That’s fine. That’s just not me. Besides, Josh forks out $15K of his profit for a rental each summer. Lisa gets constant text messages from anxious parents. Also not me.

So you’re a white, middle class or higher parent with a teenager, worried about SAT scores. What do you do? Here are some guidelines. Recognize that GPA or parental income smacks down test scores without breaking a sweat. If Johnny doesn’t have a GPA of 3.8 or higher, elite universities are out of the question unless his parents are alumni or rich/connected enough to make it worth the school’s while.

If Sally qualifies on GPA, has a top-tier transcript (5 or more AP classes), and wants to go to a top 10 school, her test scores should be 700 or higher per section. If they’re already at that point, don’t waste your time, money, or stress. Beyond that, the deciding factors aren’t scores but other intangibles, including the possibility that the admissions directors toss a pile of applications in the air and see which ones travel the farthest.

If Jesse is looking for a top 20 or 30 school, the GPA/transcript requirements are the same, but looking at the CDS of these schools, realistically a 650 or higher per section will do the trick. It might be worth boosting the test scores to low 700s, but if Jesse is a terrible tester, then don’t break the bank. One of the schools will probably come through.

If Sammy has a lower GPA (3.3 to 3.8) but excellent test scores (high 600s or higher per section), then look to the schools in the middle–say, from 40 to 60. It’s actually worth spending money to maximize Sammy’s scores, because these mid-tier schools often get a lot of high effort hard workers with mediocre test scores. Not only will Sammy look good, but he might get some money. (By the way, if you’ve got a Sammy whose grades are much lower than his abilities, you should still push him into the hardest classes, even if he and the counselors cavil. If your Sammy is like most of them, he’s going to get Bs and Cs regardless, so he may as well get them in AP classes and get some college credit from the AP tests. And the transcript will signal better, as well.)

The biggest bang for the test prep buck lies not in making kids competitive for admissions, but in helping them test out of remediation at local universities. So if Austin has a 3.0 GPA and works hard but tests poorly, find out the SAT cut score at his university. If he’s not above that point, spend the money to get him there, and emphasize the importance of this effort to his college goals.

If your kid is already testing at 650 or higher, either send her to an Asian cram school (she will likely be the only white kid there, but the instruction will be excellent) or invest in a tutor. The average white-kid class at Kaplan or Princeton might have an instructor who can fine-tune for her issues, but probably won’t.

Otherwise, start with a class and supplement with a tutor if you can afford it. Ask around for good instructors, or ask the test prep company how long the instructor has been teaching. Turnover in test prep instructors is something like 75%; the 25% who stay long term do so because they’re good. As for the tutor, I hope I’ve convinced everyone that price isn’t an issue in determining quality. I would ask around for someone like me, because our ability to get a high rate without the sales and marketing suggests we must be, in fact, pretty good. And there’s always someone like me around. Otherwise, I’d go with the private tutoring options at a test prep company, with interviews.
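If it helps to see all those guidelines in one place, here’s a rough sketch of the decision logic, with the thresholds lifted from the paragraphs above (the “high 600s” cutoff is rendered as 670, which is my gloss, not a magic number). It’s a toy summary, not an admissions algorithm; real applicants have transcripts, hooks, and money that no three-variable function captures.

```python
def rough_sat_strategy(gpa, per_section_score, tests_well=True):
    """Toy summary of the guidelines above; the thresholds come from the post,
    and everything else about a real applicant is ignored."""
    if gpa >= 3.8:
        if per_section_score >= 700:
            return "Top-10 range: scores are fine, stop stressing about them"
        if per_section_score >= 650:
            return "Top 20-30 range: maybe nudge scores toward the low 700s"
        return "Prep until each section clears 650, then revisit the list"
    if 3.3 <= gpa < 3.8 and per_section_score >= 670:
        return "Aim at schools ranked roughly 40-60; maximize scores, chase merit money"
    if gpa >= 3.0 and not tests_well:
        return "Spend the prep money on clearing the local university's remediation cut score"
    return "Focus on the remediation cut score and course rigor, not rankings"

print(rough_sat_strategy(3.9, 710))                    # Sally
print(rough_sat_strategy(3.5, 690))                    # Sammy
print(rough_sat_strategy(3.1, 480, tests_well=False))  # Austin
```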

As I said, these rules are for middle class or higher white kids. Only 6% of blacks and Hispanics get above 600 on any section of the SAT–in fact, the emphasis on GPA came about in large part to bypass the unpleasant reality of the score gap. Only around 300 black students score higher than 700 on two sections of the SAT. That’s barely enough blacks for one top ten school. The rules are very different for them. The main reason for blacks and Hispanics to take test prep is to get their scores above the remediation number. Middle class or higher Asians face much higher standards because universities know their (or their parents’) dedication to getting good grades and good test scores is more than a tad unnatural and probably overstates their value to the campus. Athletes and artists of note play by different rules. Poor whites and poor Asians have it really, really tough.

What this means, of course, is that the kids in the Hamptons are probably already scoring 700 or higher per section and are, consequently, wasting their time. But what the hell, they’re doing the economy some good. Or maybe some of them are Asian.

Note: I wrote this focusing on the SAT but it all applies to the ACT as well, and the ACT is a much better test. I wrote about the ACT here.


The false god of elementary school test scores

Rocketship Academy wants to go national. Rocketship is a hybrid charter school chain that focuses solely on getting low income Hispanic elementary school students to proficiency. (Note: Larry Cuban has some excellent observations from his visit to a Rocketship Academy.)

First things first: I’ve checked the numbers every way I can think of, and Rocketship’s numbers are solid. They don’t have huge attrition problems that I can see. They are, in fact, getting 60% or higher proficiency in most test categories, and the bulk of their students are Hispanic, many of them not proficient in English. Of course, that brings up an interesting question–if they are proficient on the ELA tests, why aren’t they considered proficient in English? But I digress.

The larger point is this: getting high test scores on California’s elementary school math tests ain’t all that much to get worked up about. Here’s some data from the 2011 California Standards test in math:

I used two standards, because the NCLB obsession with “Proficient and higher” is, to me, moronic. I prefer Basic or higher. The blue line is the percentage of all California students in grades 2-9 scoring Basic or higher in General Math; the red is the percentage of the same students scoring Proficient or higher.

So it gets a bit tricky here, because after 6th grade, the point of entry to algebra varies. To simplify slightly, I’m ignoring the seventh grade algebra track (call it the “accelerated advanced” path), which covers about 40,000 students this year and fewer in previous years.

I combined 8th and 9th grade students in General Math and attached that result to the red and blue lines.

Then I separated two groups–the ones who took algebra in 8th grade, and the ones who took algebra later than that. The first group consists of those who entered algebra in 8th grade, passed it, and continued on the “average advanced” course path, culminating with Calculus senior year. The second group consists of those who took algebra for the first, second, or third time in high school and then continued on. For each group, I calculated percentages for Basic or higher and Proficient or higher.
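For anyone who wants to reproduce this, the arithmetic is trivial once you have the band counts; the fiddly part is deciding which students belong in which slice. Here’s a minimal sketch of the tabulation, with made-up counts standing in for the real 2011 CST research-file numbers (the function and data names are mine, not the state’s):

```python
# Minimal sketch of the tabulation described above. The counts are invented
# placeholders; the real numbers come from California's CST research files.

BANDS = ["Far Below Basic", "Below Basic", "Basic", "Proficient", "Advanced"]

def pct_at_or_above(counts, floor):
    """Percent of students scoring at or above the given performance band."""
    total = sum(counts.values())
    hits = sum(n for band, n in counts.items()
               if BANDS.index(band) >= BANDS.index(floor))
    return 100.0 * hits / total

# One slice: 8th and 9th graders still in General Math (band counts summed
# across the two grades before computing percentages).
general_math_8_9 = {"Far Below Basic": 12000, "Below Basic": 30000,
                    "Basic": 52000, "Proficient": 38000, "Advanced": 9000}

print("Basic or higher:      %.1f%%" % pct_at_or_above(general_math_8_9, "Basic"))
print("Proficient or higher: %.1f%%" % pct_at_or_above(general_math_8_9, "Proficient"))
```

Each line on the chart is just this calculation run over a different slice of students (all students in grades 2-6, the 8th-grade algebra track, the late-algebra track, and so on).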

Notes:

  1. Through grade 6, the scores represent all students. In grades 7 and 8/9, general math scores reflect only those students who haven’t moved on to algebra. That’s probably why the proficiency levels drop to 50% and lower for the last two groups. In other words, the green and purple lines represent the advanced track students–most, but not all, of the strong algebra and higher math students. The turquoise and orange lines represent the weaker students taking algebra and higher.
  2. Roughly 80% of all students test at Basic or higher from second through sixth grade.
  3. Over 70% of the strong students test at Basic or higher from algebra through “summative math” (taken for all subjects after Algebra 2).
  4. The percentage of students testing proficient from second through sixth grade starts at 65%, rises slightly, and then drops steadily.
  5. In no course do more than 50% of the strong students in algebra and higher achieve a score of Proficient or higher.
  6. In no course do more than 50% of the weaker students in algebra or higher achieve a score of Basic or higher.

So the chart reveals that California second through sixth graders, high and low ability alike, posted higher Basic-or-higher and Proficient-or-higher rates on their tested subject than even the strongest high school math students did.

I used 2011 scores, and I may have made a minor error here or there, but the falloff has shown up in the scores for several years now, and it’s easy enough to check.

What could cause this? Why are California’s elementary school students doing so phenomenally well, and then falling apart when they get to high school? Let’s go through the usual culprits.

California’s high school math teachers suck.–Well, in that case, there’s not much point in demanding higher standards for math teachers, because California’s high school math teachers have had to pass a rigorous content knowledge test for over 20 years. California’s elementary school teachers have to pass a much easier test–which is much harder than anything they had to pass before 2001. In other words, try again.

The teachers aren’t covering the fundamentals! So when the students get to algebra, they aren’t prepared.–But hang on. Elementary school kids, the ones being taught the fundamentals, are getting good test scores. What evidence do you have that they aren’t being taught properly?

Well, they’re only getting good test scores because the tests are too easy!—dingdingding! This is a distinct possibility. Perhaps the elementary tests aren’t challenging enough. Having looked at the tests, I’m a big believer in this one. I think California’s elementary math tests, through seventh grade, are far less challenging to the tested elementary school population than the general math and specific subject tests are to the older kids. (On the other hand, the NAEP scores show this same dropoff.)

However, while that might explain the disparity between the slower track math student achievement and elementary school, it doesn’t adequately address why the students in the “average advanced” track aren’t achieving more than 50% proficiency, does it?

Trigonometry is harder than memorizing math facts–We should take to heart the Wise Words of Barbie. Math achievement will fall off as the courses get more challenging. Students who excelled at their times tables and easily grasped fractions might still struggle with complex numbers or combinatorics.

So if you ask me—and no one does. Hell, no one has even really noticed the fall-off—it’s a combination of test design and subject difficulty.

Whatever the reason, the test score falloff has enormous implications for those who are banking on Rocketship Academy, KIPP, and all those other “proven” charters that focus exclusively on elementary school children.

Elementary school test scores are false gods. We have no evidence that kids who had to work longer school days simply to achieve proficiency in fifth grade reading and math will be, er, “shovel ready” for algebra and Hamlet. KIPP’s College Completion Report made no mention of its college students’ SAT scores, or indeed any mention of demonstrated ability (e.g., AP tests), and color me a cynic, but I’m thinking they’d have mentioned both if the numbers were anything other than dismal.

So let’s assume that those Rocketship scores are solid (and I do). So what? How will they do in high school? Where’s the follow through? Everyone is banking on the belief that we can “catch them early”. Get kids competent and engaged while they are young, and it all falls into place.

Fine. Just let me know when the test scores back up that lovely vision.

Added in January 2014: Well, hey now. Growing Pains for Rocketship’s Blended-Learning Juggernaut.

Alas, it seems that Rocketship’s scores are declining, their model doesn’t scale, they are making decisions based on cost rather than learning outcomes and, my FAVORITE part:

Lynn Liao, Rocketship’s chief programs officer, said the organization has also received troubling feedback on how students educated under the original blended learning model fare in middle school.

“Anecdotal reports were coming in that our students were strongly proficient, knew the basics, and they were good rule-followers,” Ms. Liao said. “But getting more independence and discretion over time, they struggled with that a lot more.”

That graven image gets you every time, doesn’t it?


Black teachers, teacher quality, and education reform, revisited

In the interest of focus, I left a few things off my original post.

First, and most importantly: teaching is insanely complicated and non-linear. For me, it’s a free-form high-wire act on a daily basis, interspersed with bouts of intense mental activity outside of class as I try to figure out the best way to explain complicated subjects to kids who don’t want to be there and often don’t have the requisite skills to master the material. But for others, it’s a highly structured daily routine in which each lesson is planned out months in advance. I know many math teachers who have each problem worked out before they teach it (in many cases problems they’d done for years); I often make up my problems as I write them on the board, so I can carefully calibrate the difficulty level based on the students in that class. I know history teachers who have tremendous difficulty lecturing without index cards and can’t casually lecture on any topic in their curriculum without preparation; I can go from complete unfamiliarity to ready to talk in 2 hours.

Many non-teachers visualize their ideal teacher as someone more like me, except a lot younger–smart, fluid, an expert on anything a student might want to inquire about, and particularly an expert in the subject taught. Highly educated elites, in particular, have romantic notions of their little snowflakes being taught by a bright young graduate of Harvard or some other top 50 school who wants to help the next generation be as enamored of learning as she is.

Given the tough times I have finding jobs, I encourage all those who hold these romantic notions to find the evidence that teacher IQ and general intelligence are dispositive in successful teaching. (It must be said, however, that principals don’t like teachers who are too smart. Or too old.)

Much as I’d like a world that makes it easier for me to find a job, though, I think the reality is much less comforting to those who want “smarter” teachers. Certainly, research has provided little comfort. When the best news available tells us that teachers in the 95th percentile get very small improvements over the very worst bottom-dwellers, and any improvement at all is considered great news because it’s so hard to find any teacher quality criteria that show any increase at all….well, it’s just possible that being smart ain’t all that.

I really understand the intuitive belief that smarter teachers are better teachers, because I used to hold the same notion when I was a suburban parent. But that belief was shaken even before I became a teacher and realized that teachers weren’t complete morons and, furthermore, that the demonstrated knowledge requirements for teachers took a sharp increase after NCLB, particularly for elementary school teachers. Huge. So big that, if teacher competency were a factor, we should have seen some improvement in teacher outcomes. Instead, recent research shows, again, that experience boosts performance a bit and new teachers are still weakest of all. I’m open to having my mind changed on that one, because no research has specifically tested this point. But again, the boosts in demonstrated ability were huge in many states, and shouldn’t we have seen some improvement in performance?

But what we got for sure were far fewer black and Hispanic teachers.

Notice that the fraud ring involved existing teachers, teachers who were probably caught in the NCLB net and forced to re-qualify for existing positions. As always, ETS explains this with a helpful graphic (I’ve combined text and image from page 16. ETS has excellent data. That’s why everyone trying to push an educational policy ignores it.)

In 2001, NCLB required teachers to be fully credentialed, forcing many existing teachers with emergency credentials to pass a Praxis test. Many black teachers couldn’t. Hence, the fraud. This wasn’t simply a case of wannabe teachers, but actual teachers, teachers with jobs. What if they were considered competent teachers?

There’s a really interesting study idea: test the outcomes of the teachers who committed fraud, compare them to white and black teachers who passed the test legally. What if they do well?

I’m not fuming at the double standard revealed by the reformers who scream about “unqualified teachers” but duck and cover when black teachers are found to have committed fraud. I am, however, annoyed that reformers are constantly promoting a lie, or at least a fantasy unsupported by research, and they don’t even have the balls to hold consistently to that position. Instead, they wilt and run away from the Clarence Mumford case. They never seem to commit, exactly, to the qualifications teachers should have, and how the current tests fall short.

Why? Because failure to commit to a line in the sand allows them to skate on two points. First, the minute they draw that line, they will be ferociously questioned about the impact their standards will have on black and Hispanic teachers. Race is an area that reformers are absolutely determined to avoid, unless it’s an opportunity to call teachers racist for the uneven performance results, of course.

Furthermore, the minute they draw that competency line, they will be forced to confront the fact that, as I’ve said with some frequency, research doesn’t support their claims. It’s hard to argue for changes that will further obliterate the population of black and Hispanic teachers when you can’t prove that being smart makes much difference, and when the teacher’s race itself does seem to matter.

So instead, reformers prate endlessly about incompetent, mediocre teachers (always white ones, of course), talking cheaply about improving standards while fleeing in terror from the tiniest suggestion that raising teacher test scores will disparately impact black and Hispanic teachers.

It’s too bad, because if they stood up for their beliefs, we could have a meaningful discussion about where, exactly, the line is for teacher competency. One reasonable interpretation of the research thus far is that we are well above the line needed. Bad news for reformers, if so.

I keep wondering about one other possibility, which might explain why a low or failing score on a basic skills test could nonetheless belong to an effective elementary school teacher.

Maybe they’re underperforming.

My years in test prep have shown me a number of oddities, including more than a few African American kids (by far the smallest percentage of my clientele) who have a terrible time reading and thinking “on their feet” (that is, on a general skills test) and can’t think abstractly at all, yet have very strong demonstrated abilities that they’ve internalized.

Two different African American girls have had exceptional writing skills with a strong demonstrated vocabulary (both got perfect scores on their essay), but struggled to break 500 on the SAT verbal. One of them got a 4 on the AP US History test. More than one boy (including some Hispanics) can do relatively complex math word problems beautifully but, given a simple equation, can’t isolate x. I had one kid who could not solve 4x - 3 = 21, but if you asked him what number you could multiply by 4 and subtract 3, and get 21, he’d say “6” before I’d worked it out myself.

I’ve had more than a couple kids ask me why they were being tested on history when they’d never studied it in school. It took me a while to realize they thought the questions on the reading passage were questions they were supposed to know offhand. Their reading scores shot up (from say, the 3rd percentile to the 35th or 40th) when I explained that the big chunk of text on the right had the answers to the questions. They had no idea. And although they did better, they still complained that they wanted to be tested on “what they know” rather than “learn new stuff on the test”. Yes, that sort of thinking is completely alien to me and yes, it’s still pretty common.

In other words, I wonder whether the gap between crystallized and fluid intelligence affects test scores on the bottom half of the bell curve. That might explain why people with relatively low measured skills have difficulty demonstrating their knowledge on a test, yet can be effective classroom teachers for young kids.

So I’m not as ready as I was five years ago to say that people who can’t pass a basic skills test don’t, in fact, have sufficient mastery of basic skills.

I’m not excusing the fraud. But it infuriates me that everyone’s ignoring it, because the conversations it would kick up are conversations we need to have—and, of course, they’re conversations we’re afraid to have.

