If assessments are diagnoses, what are the prescriptions?

I happen to like statistics. I appreciate qualitative observations, too– data of all sorts can be deeply illuminating. But I also believe that the most important part of interpreting them is understanding what they do and don’t measure. And in terms of policy, it’s important to consider what one will do with the data once collected, organized, analyzed, and interpreted. What do the data tell us that we didn’t know before? Now that we have this knowledge, how will we apply it to achieve the desired change?

In an eloquent, impassioned open letter to President Obama, Education Secretary Arne Duncan, Bill Gates and other billionaires pouring investments into business-driven education reforms (revised version at Washington Post), elementary teacher and literacy coach Peggy Robertson argues that all these standardized tests don’t give her more information than what she already knew from observing her students directly. She also argues that the money that would go toward administering all these tests would be better spent on basic resources such as stocking school libraries with books for the students and reducing poverty.

She doesn’t go so far as to question the current most-talked-about proposals for using those test data: performance-based pay, tenure, and firing decisions. But I will. I can think of a much more immediate and important use for the streams of data many are proposing on educational outcomes and processes: Use them to improve teachers’ professional development, not just to evaluate, reward and punish them.

Simply put, teachers deserve formative assessment too.

The value-added wave is a tsunami

Edweek ran an article earlier this week in which economist Douglas N. Harris attempts to encourage economists and educators to get along.

He unfortunately lost me in the 3rd paragraph.

Drawing on student-level achievement data across years, linked to individual teachers, statistical techniques can be used to estimate how much each teacher contributed to student scores—the value-added measure of teacher performance. These measures in turn can be given to teachers and school leaders to inform professional development and curriculum decisions, or to make arguably higher-stakes decisions about performance pay, tenure, and dismissal.

Emphasis mine.

Economists and their education reform allies frequently make this claim, but it is not true, at least not yet. Value-added measures are based on standardized-test scores and neither currently provide information an educator can actually use to make professional development or curriculum decisions. When the scores are released, administrators and teachers receive a composite score and a handful of subscores for each student. In math, these subscores might be for topics like “Number and Operation Sense” and “Geometry”.

It does not do an educator any good to know last year’s students struggled with a topic as broad as “Number and Operation Sense”. Which numbers? Integers? Decimals? Did the students have problems with basic place value? Which operations? The non-commutative ones? Or did they have specific problems with regrouping and carrying? In what way are the students struggling? What errors are they making? What misconceptions might these errors point to? None of this information is contained in a score report. So, as an educator faced with test scores low in “Number and Operation Sense” (and which might be low in other areas as well), where do you start? Do you throw out the entire curriculum? If not, how do you know which parts of it need to be re-examined?

People trained in education recognize a difference between formative assessment—information collected for the purpose of improving instruction and student learning, and summative assessment—information collected to determine whether a student or other entity has reached a desired endpoint. Standardized tests are summative assessments—bad scores on them are like knowing that your football team keeps losing its games. This information is not sufficient for helping the team improve.

Why do economists see the issue so differently?

An economist myself, let me try to explain. Economists tend to think like well-meaning business people. They focus more on bottom-line results than processes and pedagogy, care more about preparing students for the workplace than the ballot box or art museum, and worry more about U.S. economic competitiveness. Economists also focus on the role financial incentives play in organizations, more so than the other myriad factors affecting human behavior. From this perspective, if we can get rid of ineffective teachers and provide financial incentives for the remainder to improve, then students will have higher test scores, yielding more productive workers and a more competitive U.S. economy.

This logic makes educators and education scholars cringe: Do economists not see that drill-and-kill has replaced rich, inquiry-based learning? Do they really think test preparation is the solution to the nation’s economic prosperity? Economists do partly recognize these concerns, as the quotations from the recent reports suggest. But they also see the motivation and goals of human behavior somewhat differently from the way most educators do.

This false dichotomy makes me cringe. As a trained education research scientist who is no stranger to statistical models, value-added is not ready for prime time because its primary input—standardized test scores—is deeply flawed. In science and statistics, if you put garbage data into your model, you will get garbage conclusions out. It has nothing to do with valuing art over economic competitiveness, and everything to do with the integrity of the science.

The divide between economists and others might be more productive if any of the reports provided specific recommendations. For example, creating better student assessments and combining value-added with classroom assessments are musts.

Thank you. Here where I start agreeing—if only that had been the central point of the article. I don’t dismiss value-added modeling as a technique, but I do not believe we have anything resembling good measures of teaching and learning.

We also have to avoid letting the tail wag the dog: Some states and districts are trying to expand testing to nontested grades and subjects, and to change test instruments so the scores more clearly reflect student growth for value-added calculations. This thinking is exactly backwards.

I agree completely, but that won’t stop states and districts from desperately trying to game the system. Since economists focus so much on financial incentives, this should be easy for them to understand: when the penalty for having low standardized test scores (or low value-added scores) is losing your funding, you will do whatever will get those scores up fastest. In most cases, that is changing the rules by which the scores are computed. Welcome to Campbell’s law.

%d bloggers like this: