Value-added assessments

I posted about value-added assessments when the front page story in the New York Times came out early this year. In recent weeks I’ve come across a couple of interesting commentaries on these scores.

At the Washington Post, Jay Mathews wrote a column titled “Devaluing value-added assessments.” I read it closely, but couldn’t work out what Mathews thinks is wrong with these scores. He begins by saying he will relate “the best argument against value-added I have seen in some time.” The argument comes from Wiggins, whom Mathews quotes throughout.

Point #1:

“I have seen this sham firsthand over many years,” Wiggins writes. “Lots of so-called good N.J. and N.Y. suburban districts are truly awful when you look firsthand (as I have for three decades) at the pedagogy, assignments and local assessments; but those kids outscore the kids from Trenton and New York City, even though both city systems have a number of outstanding schools and teachers.”

I don’t get this. Don’t value-added scores only measure changes within a single district? Aren’t we only using them to assess teachers within districts?

Point #2:

Also, Wiggins wrote, valid research on value-added exposes “hidden truths,” such as “it IS true that models accurately predict over a three-year period, performance at the extremes. Thus, the really effective teachers stay so and the really ineffective ones are really ineffective.”

I don’t understand this at all. What is the hidden truth here exactly? That teachers matter?

Point #3:

Schools with high test scores discover through value-added analysis that they need more than that. One outstanding prep school, Wiggins said, gave a professionally designed test of critical thinking to freshmen and seniors. There was no improvement. Similar results have come from colleges giving the Collegiate Learning Assessment of analytical skills, given to freshmen and seniors.

Huh? It sounds like Mathews is saying here that value-added scores help schools identify bigger problems. Isn’t that a good thing?

Point #4:

Our mistake was thinking this valuable long-term research tool would work as a one-year teacher rating system. “It becomes like a sick game of telephone: What starts out as a reasonable idea, when whispered down the line to people who don’t really get the details — or don’t want to get them — becomes an abomination,” Wiggins wrote. “By looking at individual teachers, over only one year (instead of the minimum three years as the psychometricians and VAM [valued-added model] designers stress), we now demand more from the tests than can be obtained with sufficient precision.”

I’m not sure what to make of this. The critique seems to be that the VA measure uses only one year of change per teacher. That would be a problem if true, but I’m not sure it is. Even if it is, the paper by Chetty et al. (the subject of the NYT article linked above) offers evidence that VA measures are unbiased measures of teacher quality.
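To make the one-year versus three-year concern concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: the function names, the spread of true teacher effects, and the size of the yearly noise are invented, and none of it comes from Chetty et al. or from any real value-added model. It only shows the general statistical point that an estimate can be unbiased and still noisy, and that averaging several years shrinks the noise.

```python
import random
import statistics


def simulate_estimates(n_teachers, n_years, effect_sd=1.0, noise_sd=2.0, seed=0):
    """Simulate unbiased but noisy yearly scores and average them per teacher.

    effect_sd and noise_sd are made-up illustrative values, not estimates
    taken from any study.
    """
    rng = random.Random(seed)
    # Each teacher has a fixed "true" effect that does not change over time.
    true_effects = [rng.gauss(0, effect_sd) for _ in range(n_teachers)]
    estimates = []
    for effect in true_effects:
        # Each year's score is the true effect plus independent noise.
        yearly = [effect + rng.gauss(0, noise_sd) for _ in range(n_years)]
        estimates.append(statistics.fmean(yearly))
    return true_effects, estimates


def rmse(true_effects, estimates):
    """Root mean squared error of the estimates against the true effects."""
    squared_errors = [(t - e) ** 2 for t, e in zip(true_effects, estimates)]
    return statistics.fmean(squared_errors) ** 0.5


if __name__ == "__main__":
    for years in (1, 3):
        truth, est = simulate_estimates(n_teachers=2000, n_years=years)
        print(f"{years}-year average: RMSE = {rmse(truth, est):.2f}")
```

Under these assumptions the typical error of a three-year average is smaller than that of a single year by a factor of about the square root of three. Whether real one-year VA scores are too noisy to rate individual teachers depends on the actual noise level, which this sketch does not address.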

A second commentary comes from Andrew Gelman’s blog. This is more of a technical discussion about whether VA measures make the right modeling assumptions.
