Approaching Education Data the Nate Silver Way

My girlfriend’s very hospitable and generous family gave me some great gifts for the holidays when I stayed with them in upstate New York. As I rocked my new Teach For America T-shirt in the Rochester airport on Christmas Eve, my cursory overview of Nate Silver’s new book, The Signal and the Noise, inspired me to write this post.

While most people probably know Silver for his election predictions and designation in 2009 as one of the world’s 100 Most Influential People, Silver has been my baseball stat guru for considerably longer than he’s been doing political analysis. In one of my favorite books of all time, Baseball Between the Numbers, Silver penned a brilliant examination of clutch hitting that I still quote at least four or five times a year. I have generally found Silver’s arguments compelling not just because of his statistical brilliance, but also because of his high standards for data collection and analysis, evident in the following passage from the introduction of his book:

“The numbers have no way of speaking for themselves. We speak for them. We imbue them with meaning…[W]e may construe them in self-serving ways that are detached from their objective reality…Before we demand more of our data, we need to demand more of ourselves.”

In few fields are Silver’s words as relevant as education. While the phrase “data-driven” has become ubiquitous in discussions of school reform and high-quality instruction, most people discussing education have very little understanding of what the statistics actually say. As I’ve written before, many studies that reformers reference to push their policy agendas are methodologically unsound, and many more have findings very different from the summaries that make it into the news.

It’s hard to know how many reformers just don’t understand statistics, how many fall victim to confirmation bias, and how many intentionally mislead people. But no matter the reason for their errors, those of us who care about student outcomes have a responsibility to identify statistical misinterpretation and manipulation and correct them. Policy changes based on bad data and shoddy analyses won’t help (and will quite possibly harm) low-income students.

Fortunately, I believe one simple practice can help us identify truth in education research: read the full text of education research articles.

Yes, reading the full text of academic research papers can be time-consuming and mind-numbingly dull at times, but it is vitally important if you want to understand research findings. Sound bites on education studies rarely provide accurate information. In a Facebook comment following my most recent post about TFA, a former classmate of mine referenced a 2011 study by Raj Chetty to argue that we can’t blame the achievement gap on poverty. “If you leave a low value-added teacher in your school for 10 years, rather than replacing him with an average teacher, you are hypothetically talking about $2.5 million in lost income,” claims one of the co-authors of the study in a New York Times article. Sounds impressive. Look under the hood, however, and we find that, even assuming the study’s methodology is foolproof (it isn’t), the actual evidence can at best show an average difference of $182 in the annual salaries of 28-year-olds.

As I’ve mentioned before, there’s also a poor statistical basis for linking students’ standardized test results to teacher evaluation systems. Summaries of otherwise useful results can give readers the wrong impression when they gloss over or omit this fact, a point underscored by a recent article describing an analysis of IMPACT (the D.C. Public Schools teacher evaluation system). The full text of the study provides strong evidence that the success of D.C.’s system thus far has been achieved despite a lack of variation in standardized test score results among teachers in different effectiveness categories. Instead, the successes of the D.C. evaluation system are driven by programs teachers unions frequently support, programs like robust and meaningful classroom observations that more accurately measure teacher effectiveness.

Policymakers have misled the public with PISA data as well. In a recent interview with MSNBC’s Chris Hayes, Michelle Rhee made the oft-repeated claim that U.S. schools are failing because American students, in aggregate, score lower on international tests than their peers in other countries. Yet, as Hayes pointed out, it is abundantly clear from a more thorough analysis that poverty explains the PISA results much better than school quality, not least because poor U.S. students have been doing better on international tests than poor students elsewhere for several years.

I would, in general, recommend skepticism when reading articles on education, but I’d recommend skepticism in particular when someone offers a statistic suggesting that school-related changes can close the achievement gap. Education research’s only clear conclusion right now is that poverty explains the majority of student outcomes. The full text of Chetty’s most recent study defending value-added models acknowledges that “differences in teacher quality are not the primary reason that high SES students currently do much better than their low SES peers” and that “differences in [kindergarten through eighth grade] teacher quality account for only…7% of the test score differences” between low- and high-income schools. In fact, that more recent study performs a hypothetical experiment in which the lowest-performing low-income students receive the “best” teachers and the highest-performing affluent students receive the “worst” teachers from kindergarten through eighth grade and concludes that the affluent students would still outperform the poor students on average (albeit by a much smaller margin). Hayes made the same point to Rhee that I made in my last post: because student achievement is influenced significantly more by poverty than by schools, discussions about how to meet our students’ needs must address income inequality in addition to evidence-based school reforms. We can’t be advocates for poor students and exclude policies that address poverty from our recommendations.

When deciding which school-based recommendations to make, we must remember that writers and policymakers all too often misunderstand education research. Many reformers selectively highlight decontextualized research that supports their already-formed opinions. Our students, on the other hand, depend on us to combat misleading claims by doing our due diligence, unveiling erroneous interpretations, and ensuring that sound data and accurate statistical analyses drive decision-making. They rely on us to adopt Nate Silver’s approach to baseball statistics: continuously ask questions, keep an open mind about potential answers, and conduct thorough statistical analyses to better understand reality. They rely on us to distinguish statistical significance from real-world relevance. As Silver writes about data in the information age more generally, education research “will produce progress – eventually. How quickly it does, and whether we regress in the meantime, depends on us.”

Update: Gary Rubinstein and Bruce Baker (thanks for the heads up, Demian Godon) have similar orientations to education research – while we don’t always agree, I appreciate their approach to statistical analysis.

Update 2 (6/8/14): Matthew Di Carlo is an excellent read for anyone interested in thoughtful analysis of educational issues.

Update 3 (7/8/14): The Raj Chetty study linked above seems to have been modified – the pieces I quoted have disappeared. Not sure when that happened, or why, but I’d love to hear an explanation from the authors and see a link to the original.

7 responses to “Approaching Education Data the Nate Silver Way”

Nita Spielberg

December 26, 2013

A very well-researched article and well presented. Thanks! Nita Spielberg

Jon Zaid

December 30, 2013

Hey Ben, do you believe that the effectiveness of a teacher is ultimately unquantifiable? Unlike baseball, you’ll never get thousands of data points or ways to measure widget production.

1. Ben Spielberg
  
  January 1, 2014
  
  Though measurement is imperfect, I think pretty much anything can be measured well. Good teaching has been hard to quantify so far but research on doing so is still relatively new. I’m confident we’ll eventually find some metrics that give us reliable and valid information.
  
John Thompson

February 1, 2014

Ben,
I now know I need to follow you. I also need to send you my more detailed critiques of Chetty and Vergara, but this is handy and short.
http://www.huffingtonpost.com/john-thompson/why-a-nice-philanthropist_b_4598322.html
I was a legal historian before the Hoova set of the Crips took over my neighborhood, I got attached to the suffering kids, and I became an inner-city teacher.
