Imagine I observe two poker players playing two tournaments each. During their first tournaments, Player A makes $1200 and Player B loses $800. During her second tournament, Player A pockets another $1000. Player B, on the other hand, loses $1100 more during her second tournament. Would it be a good decision for me to sit down at a table and model my play after Player A?
For many people the answer to this question – no – is counterintuitive. I watched Player A and Player B play two tournaments each and their results were very different – haven’t I seen enough to conclude that Player A is the better poker player? Yet poker involves a considerable amount of luck and there are numerous possible short- and longer-term outcomes for skilled and unskilled players. As Nate Silver writes in The Signal and the Noise, I could monitor each player’s winnings during a year of their full-time play and still not know whether either of them was any good at poker. It would be fully plausible for a “very good limit hold ’em player” to “have lost $35,000” during that time. Instead of focusing on the desired outcome of their play – making money – I should mimic the player who uses strategies that will, over time, increase the likelihood of future winnings. As Silver writes,
When we play poker, we control our decision-making process but not how the cards come down. If you correctly detect an opponent’s bluff, but he gets a lucky card and wins the hand anyway, you should be pleased rather than angry, because you played the hand as well as you could. The irony is that by being less focused on your results, you may achieve better ones.
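To see why two tournaments tell me so little, here is a rough sketch in Python. Every number in it is a hypothetical I made up for illustration, not a figure from Silver's book: I assume a skilled player who averages +$100 per tournament, an unskilled player who averages -$100, and results that are independent and normally distributed with a $1000 standard deviation. Real tournament payouts are far more lopsided than a normal curve, so treat this as a simplification of the idea rather than a model of poker.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical numbers chosen for illustration only (not from Silver's book).
SKILLED_MEAN = 100.0     # assumed average profit per tournament, strong player
UNSKILLED_MEAN = -100.0  # assumed average profit per tournament, weak player
RESULT_SD = 1000.0       # assumed standard deviation of a single result


def prob_weaker_player_ahead(n_tournaments: int) -> float:
    """Probability the weaker player has the higher total after n tournaments.

    Assumes independent, normally distributed results, so the gap in totals
    (skilled minus unskilled) is normal with mean n * (mean gap) and
    standard deviation RESULT_SD * sqrt(2 * n).
    """
    gap_mean = n_tournaments * (SKILLED_MEAN - UNSKILLED_MEAN)
    gap_sd = RESULT_SD * sqrt(2 * n_tournaments)
    # The weaker player is ahead exactly when the gap is below zero.
    return NormalDist(gap_mean, gap_sd).cdf(0.0)


for n in (2, 20, 200):
    print(f"{n:>3} tournaments: {prob_weaker_player_ahead(n):.2f}")
```

Under these assumed numbers, the weaker player comes out ahead about 42% of the time after two tournaments, about 26% of the time after twenty, and still about 2% of the time after two hundred. The outcome only becomes a trustworthy signal of skill once the sample is very large.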
As Silver recommends for poker and Teach For America recommends to corps members, we should always focus on our “locus of control.” For example, I have frequently criticized Barack Obama for his approach to the Affordable Care Act. While I am unhappy that the health care bill did not include a public option, I couldn’t blame Obama if he had actually tried to pass such a bill and failed because of an obstinate Congress. My critique lies instead with the President’s deceptive work against a more progressive bill – while politicians don’t always control policy outcomes, they do control their actions. As another example, college applicants should not judge their success on whether or not colleges accept them. They should evaluate themselves on what they control – the work they put into high school and their applications. Likewise, great football coaches recognize that they should judge their teams not on their won-loss records, but on each player’s successful execution of assigned responsibilities. Smart decisions and strong performance do not always beget good results; the more factors that lie between our actions and the desired outcome, the less the outcome can tell us about the quality of those actions.
Most education reformers and policymakers, unfortunately, still fail to recognize this basic tenet of probabilistic reasoning, a fact underscored in recent conversations between Jack Schneider (a current professor and one of the best high school teachers I’ve ever had) and Michelle Rhee. We implement teacher and school accountability metrics that focus heavily on student outcomes without realizing that this approach is invalid. As the American Statistical Association’s (ASA’s) recent statement on value-added modeling (VAM) clearly states, “teachers account for about 1% to 14% of the variability in [student] test scores” and “[e]ffects – positive or negative – attributed to a teacher may actually be caused by other factors that are not captured in the model.” Paul Bruno astutely notes that the ASA’s statement is an indictment of the way VAM is used, not the idea of VAM itself, yet little correlation currently exists between VAM results and effective teaching. As I’ve mentioned before, research on both student and teacher incentives suggests that rewards and consequences based on outcomes don’t work. When we use student outcome data to assign credit or blame to educators, we may close good schools, demoralize and dismiss good teachers, and ultimately undermine the likelihood of achieving the student outcomes we want.
Better policy would focus on school and teacher inputs. For example, we should agree on a set of clear and specific best teaching practices (with the caveat that they’d have to be sufficiently flexible to allow for different teaching styles) on which to base teacher evaluations. Similarly, college counselors should provide college applicants with guidance about the components of good applications. Football coaches should likewise focus on their players’ decision-making and execution of blocking, tackling, route-running, and other techniques.
Outcomes are incredibly important to monitor and consider when selecting effective inputs, of course. In a process called Bayesian analysis, mathematicians use outcomes to continually update their assessments of whether or not a strategy is working, and we should do the same. If we observe little correlation between successful implementation of our identified best teaching practices and student growth for five consecutive years, for instance, we may want to revisit our definition of best practices. A college counselor whose top students are consistently rejected from Ivy League schools should begin to reconsider the advice he gives his students on their applications. Relatedly, if a football team suffers through losing season after losing season despite players’ successful completion of their assigned responsibilities, the team should probably overhaul its strategy.
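For readers who want to see what that kind of updating looks like, here is a minimal sketch in Python of Bayes' rule applied to the five-year example. The probabilities are assumptions I invented for illustration: I start out 80% confident that our chosen practices are effective, and I suppose that a disappointing year happens 40% of the time when the practices really are effective (because so many other factors affect student growth) and 70% of the time when they are not.

```python
def bayes_update(prior: float, p_obs_if_true: float, p_obs_if_false: float) -> float:
    """Return the updated probability of a hypothesis after one observation.

    prior           -- belief in the hypothesis before the observation
    p_obs_if_true   -- chance of the observation if the hypothesis is true
    p_obs_if_false  -- chance of the observation if the hypothesis is false
    """
    numerator = prior * p_obs_if_true
    evidence = numerator + (1.0 - prior) * p_obs_if_false
    return numerator / evidence


# Hypothetical inputs, invented for illustration only.
belief = 0.8                       # assumed starting confidence in the practices
P_BAD_YEAR_IF_EFFECTIVE = 0.4      # assumed: bad years happen even with good practices
P_BAD_YEAR_IF_INEFFECTIVE = 0.7    # assumed: bad years are likelier with poor practices

history = [belief]
for year in range(1, 6):           # five consecutive disappointing years
    belief = bayes_update(belief, P_BAD_YEAR_IF_EFFECTIVE, P_BAD_YEAR_IF_INEFFECTIVE)
    history.append(belief)
    print(f"after year {year}: {belief:.2f}")
```

With these assumed numbers, one disappointing year only moves my confidence from 80% to about 70%, which is no reason to abandon the practices. Five in a row bring it down to roughly 20%, which is the point at which revisiting the definition of best practices makes sense.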
The current use of student outcome data to make high-stakes decisions in education, however, flies in the face of these principles. Until we shift our measures of school and teacher performance from student outputs to school and teacher inputs, we will unfortunately continue to make bad policy decisions that simultaneously alienate educators and undermine the very outcomes we are trying to achieve.
Update: A version of this piece appeared in Valerie Strauss’s column in The Washington Post on Sunday, May 25.