Not long ago I read a disturbing story about teacher evaluations. In essence, almost all the teachers were given the highest rating. And as you might expect, the scores of the students in that particular school system didn't reflect that they were being taught by world-class educators.
So when I saw the headline "High Achieving Teacher Sues State Over 'Evaluation' Labeling Her Ineffective," I immediately condemned this teacher. My own confirmation bias led me to the conclusion that any teacher who could get rated as ineffective in a system that is largely rigged in teachers' favor must be truly awful. But I was wrong. See the Washington Post story below:
"Sheri G. Lederman has been teaching for 17 years as a fourth-grade teacher in New York’s Great Neck Public School district. Her students consistently outperform state averages on math and English standardized tests, and Thomas Dolan, the superintendent of Great Neck schools, signed an affidavit saying “her record is flawless” and that “she is highly regarded as an educator.”
Yet somehow, when Lederman received her 2013-14 evaluation, which is based in part on student standardized test scores, she was rated as “ineffective.” Now she has sued state officials over the method they used to make this determination in an action that could affect New York’s controversial teacher evaluation system.
How is it that a teacher known for excellence could be rated “ineffective”?
The convoluted statistical model that the state uses to evaluate how much a teacher “contributed” to students’ test scores awarded her only one out of 20 possible points. These ratings affect a teacher’s reputation and at some point are supposed to be used to determine a teacher’s pay and even job status.
The evaluation method, known as value-added modeling, or VAM, purports to be able to predict through a complicated computer model how students with similar characteristics are supposed to perform on the exams — and how much growth they are supposed to show over time — and then rate teachers on how much their students compare to the theoretical students. New York is just one of the many states where VAM is one of the chief components used to evaluate teachers.
If it sounds as if it doesn’t make a lot of sense, that’s because it doesn’t. Testing experts have for years now been warning school reformers that efforts to evaluate teachers using VAM are not reliable or valid. But reformers, including Education Secretary Arne Duncan, have embraced the method as a “data-driven” evaluation solution championed by some economists. Earlier this year, the American Statistical Association issued a report slamming the use of VAM for teacher evaluation, saying in part:
* VAMs are generally based on standardized test scores and do not directly measure potential teacher contributions toward other student outcomes.
* VAMs typically measure correlation, not causation: Effects – positive or negative – attributed to a teacher may actually be caused by other factors that are not captured in the model.
Lederman filed her lawsuit against New York State Education Commissioner John King Jr., Assistant Commissioner Candace H. Shyer and the Office of State Assessment of the New York State Education Department, challenging the rationality of the VAM model being used to evaluate her and, by extension, other teachers. The suit alleges that the New York State Growth Measures “actually punishes excellence in education through a statistical black box which no rational educator or fact finder could see as fair, accurate or reliable.”
The lawsuit shows that Lederman’s students traditionally perform much higher on math and English Language Arts standardized tests than average fourth-grade classes in the state. In 2012-13, 68.75 percent of her students met or exceeded state standards in both English and math. She was labeled “effective” that year. In 2013-14, her students’ test results were very similar but she was rated “ineffective.” The lawsuit says:
This simply makes no sense, both as a matter of statistics and as a matter of rating teachers based upon slight changes in student performance from year to year.
Superintendent Dolan supported Lederman, saying in an affidavit:
As superintendent of the GNPS, I have personally known Dr. Lederman for approximately 4 years. I have had the opportunity to meet with her personally. I have also reviewed her record of teaching, particularly the performance of her students on New York State assessment tests. I can personally attest that she is highly regarded as an educator by the administration of GNPS. Her classroom observations have consistently identified her as an exceptional educator. She is widely regarded in the GNPS as someone who brings out the best in her students. She has taught for seventeen (17) years in the GNPS and her record is flawless.
Sharon Fougner, the principal at Elizabeth M. Baker Elementary School, where Lederman teaches, signed an affidavit saying that she believes the awarding of 1 out of 20 possible points to Lederman under VAM is “arbitrary and capricious” and agreed with Dolan that Lederman is an excellent teacher.
Still, the state of New York says she is “ineffective,” and offers a teacher no way to appeal the result.
The lawsuit will be worth watching because it is taking on the entire notion of VAM. If VAM were to fall in New York, more legal challenges would be likely in other states."
Using a data-driven methodology in performance evaluations is not a bad idea. In fact, it's a great idea if done right. But my lying eyes tell me that the adopters of value-added modeling, with all of its complicated algorithms, seem to have outsmarted themselves in trying to solve this problem. Had they been guided by the problem-solving principle known as Occam's Razor, which was formulated some 700 years ago, they would have proceeded to more complex theories and solutions only after their simpler ones proved insufficient. This teacher's students are clearly outperforming by the simpler measures. Why discard that scoring methodology and replace it with something that is clearly flawed?
I can still remember my Statistics 101 teacher explaining correlation and warning that it didn't equal causation. He used the example of eating ice cream and rainfall and made the point that while the two events might be correlated, one did not cause the other. He said something like, 'It might rain every time you eat ice cream, but that doesn't mean your eating ice cream makes it rain.' I know the architects of VAM understand this. I don't know why they ignored it.
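For anyone who wants to see the ice cream point in numbers, here is a minimal Python sketch. Everything in it is invented for illustration: the hidden factor I call `humidity`, the noise levels, and the sample size are all my assumptions, and it has nothing to do with how New York's actual model is built. It only shows how two things can move together because a third thing drives both of them.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical hidden factor: how hot and humid each day is, from 0 to 1.
humidity = [rng.random() for _ in range(5000)]

# Both quantities depend on humidity plus their own independent noise.
# Neither one appears in the formula for the other.
ice_cream = [h + rng.gauss(0, 0.2) for h in humidity]
rainfall = [h + rng.gauss(0, 0.2) for h in humidity]

# Correlation across all days.
overall = pearson(ice_cream, rainfall)

# Correlation when the hidden factor is held roughly constant:
# keep only the days whose humidity falls in a narrow band.
band = [(i, r) for h, i, r in zip(humidity, ice_cream, rainfall)
        if 0.45 <= h <= 0.55]
within = pearson([i for i, _ in band], [r for _, r in band])

print(f"all days: {overall:.2f}, similar-humidity days: {within:.2f}")
```

Across all days the two series are strongly correlated, but among days with similar humidity the relationship largely disappears. If a model never sees the hidden factor, it has no way to tell those two situations apart, which is the warning in the American Statistical Association's statement quoted above.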
So again, I say I'd hire this teacher. And I'd fire anybody who endorsed VAM as a solution.