
How Accusations of “Negativity” and “Divisiveness” Stifle Debate

by Yvonne Slosarski and Nathan Luecking

To all the leftist organizers out there: How many times have you been called “negative”? How often have those in power accused you of being “divisive”?

If your organizing experience is anything like ours, you may be nodding your head in agreement. It’s mid-October of an election year, which means that left-leaning candidates all over the country are facing accusations of “negativity.” In DC, our city, Elissa Silverman – one of the most left-leaning representatives in DC government – was called “the most divisive politician in the city” by her developer-backed opponents.

As volunteers for Emily Gasoi’s campaign for DC State Board of Education in Ward 1, we are often accused of “going negative” by Gasoi’s opponents. Given our research, professional, and organizing experiences, we recognize this tactic for what it is – an attempt to squash legitimate disagreement.

The accusation of “negativity” or “divisiveness” tends to function in three main ways.

1) It minimizes legitimate dissent to the status quo.

The call for “civility” has historically tended to silence people who dissent from the status quo. What counts as “civil” tends to support the existing power structure and celebrate what our political morality demands that we condemn.

In DC’s Ward 1, the call for “positivity” is similarly being used to shut down challengers to corporate education reform.

Gasoi’s opponent, Jason Andrean, is a Capital One Executive for Government Contracting. He also was a board member of Democrats for Education Reform (DFER), a market-based education reform organization started by hedge-fund managers. DFER advocates against teachers’ unions and for high-stakes testing and charter schools as the primary ways forward in education. Gasoi’s opponent also chairs the board of Achievement Prep, a high-stakes-testing charter school in DC that has been cited for excessive punitive measures, poor educational outcomes, and high teacher turnover.

Gasoi is running for the Ward 1 seat, in part, to challenge the corporate education reform model of DFER. She knows that the finance industry has too much power in education policy and that market approaches have re-segregated schools, lessened “deep learning” for minoritized students, and denied power to the people closest to classrooms – teachers, families, and students.

But corporate education reform is the status quo in DC, so pointing out Andrean’s connections to DFER and the banking industry – and his lack of education experience – is considered an “attack” by his campaign, which wrote the following in a recent email:

Throughout this race, one of my opponents has attacked my motives and has suggested that only someone with a doctorate deserves to represent the families of Ward One. She’s even gone so far as to attack my supporters and those who believe that ALL voices have value as we work to fix what’s broken in our public education system.

Aside from inaccurately portraying Gasoi’s claims, this email suggests that there is no room for criticizing corporate education reform. But how can we be “positive” about it when the stakes are so high for our students?

2) It obscures meaningful differences.

Organizations and candidates have meaningful differences in priorities and experiences. In a neoliberal environment, “positivity” rhetoric draws on an empty notion of individual equality to suggest that all experiences are somehow the same.

Returning to Ward 1, Andrean wrote the following in a Medium piece about his candidacy:

Since embarking on this journey my opponent, Ms. Gasoi, has made it her mission to lambast my character and discredit my education experience — which she deems inferior to her own. I don’t come to this race with an Ed.D. in education policy or having spent time as a classroom teacher, but like the majority of families that look like mine, I want my lived experience to be valued and represented on the State Board of Education. My opponent often tells others that I’m a ‘banker with no education experience’ when out on the campaign trail. The reality is that we all have an ‘education experience’ and that’s why I’m running for the SBOE… [O]ur leaders should reject the notion that there’s only one type of representative we should be electing to serve our kids and families.

When Andrean writes, “we all have an ‘education experience,’” he minimizes a very important difference between him and Gasoi. Unlike him, Gasoi has devoted her entire professional life to public education. That’s part of why her priorities, unlike his, are aligned with what’s best for students in DC.

3) It takes the conflict out of politics, to ensure that the powerful win.

Civility rhetoric presumes a shared interest between groups that—structurally—are in conflict. When one group is up because another is down, conflict must be brought to the forefront, and those in power may label such disruption “negative.”

In the Ward 1 School Board race, Andrean and his supporters have consistently shied away from his policy priorities, instead uplifting their “positivity” on the campaign trail and on social media.

Instead of debating policy priorities, he hails himself as the “positive” candidate, thus shutting down debate over consequential policies. As Chantal Mouffe and Ernesto Laclau argued, the status quo is always just one version of the world and conflict is an inherent part of “the political.” Forced positivity cuts off debate over decisions that matter. And with no real conflict, the powerful—who often benefit from inertia—win.

Of course, quashing legitimate and consequential debate is a serious problem for voters trying to choose between candidates. Are you supposed to choose a representative based on how abstractly positive they are? What if they gut public services with a smile on their face?

This rhetoric of “positivity” seriously obscures the real-life consequences of policies that should legitimately be challenged.

Andrean, who has been the Chairman of the Board at Achievement Prep Public Charter School in Ward 8 of Washington, DC since 2016, has a troubling track record.  Under his leadership, Achievement Prep has fostered a culture of punitive discipline, favored behavior management over classroom instruction, and responded inadequately to teacher concerns. DC voters who care about student outcomes and emotional well-being need to know this history.

In a 2018 Qualitative Site Review of Achievement Prep’s Elementary Campus, the DC Public Charter School Board observers noted that “Academic expectations and rigor were low across the campus. Class time was mostly devoted to managing behavior to keep students safe and compliant.” As the rest of DC moves towards a trauma-informed approach to discipline focused on restorative practices, Achievement Prep continues to embrace an archaic, punitive, zero-tolerance approach to behavior management. This is evidenced by Achievement Prep’s suspension rate, which is twice the city average. In addition, consequences for misbehavior are imposed inconsistently from one student to the next. The site review reported:

Students screamed and called one another hurtful names and hit each other without consequence, while other students engaged in the same behavior received consequences inconsistently…In one observation an adult dragged a student by the hand out of the classroom when he went into crisis.

There was also a highly publicized incident in which a six-year-old girl suffered a concussion after a substitute from a privately contracted company dragged her across the floor.

In another incident in the spring of 2018, an Achievement Prep teacher was sexually assaulted by a visitor on school property. In response, school leadership put the teacher on involuntary unpaid leave for the remainder of the year, costing the teacher nearly $3,000 in wages. Although Achievement Prep staff organized to demand safer working conditions, school leadership has not responded to their call. This lack of concern may help explain why, of the 51 reviews posted by former employees on glassdoor.com, only 6% recommend working at the school. (Note that teacher working conditions and student success are linked, as evidenced by this study in the American Journal of Education.)

glassdoor.com reviews of Achievement Prep

Given these issues at Achievement Prep, it’s not surprising that concerns about student discipline, teacher recruitment, and management led to the rejection of Andrean’s 2015 application for a different charter school. Similar concerns drive our opposition to his candidacy and to corporate education reform more generally, and it would be irresponsible not to tell the truth about his record. When we know the potential consequences of his winning the election, “keeping it positive” would be the lowest of lows.

Yvonne Slosarski has a Ph.D. in Rhetoric & Political Culture. She is an organizer and researcher on movements for economic justice, a Humanities professor, and the associate director of an honors program at the University of Maryland.  

Nathan Luecking is a School Social Worker in the District of Columbia. He is a school mental health advocate and sits on a city-wide Coordinating Council for school mental health.

 


Filed under 2018 Elections, Philosophy, US Political System

Social Justice Unionism, Education On Tap Style

I recently discussed why teachers unions are important agents of social justice on “Education On Tap,” a Teach For America (TFA) podcast created by Aaron French. I really enjoyed the conversation – French goes above and beyond his promise to make the show “a little bit of fun” – and appreciated TFA’s continued (though still very young) efforts to deconstruct myths about organized labor and education reform ideas.


A couple of additional details on two of the topics we discussed:

1) How we refer to education stakeholders: We often use the phrase “reformer” to describe people on one “side” of the education debate. As Nick Kilstein explains, we typically think “reformers” do the following:

1. Support market forces including choice and competition as a mechanism to improve all schools. This is usually done through vouchers and charter schools.

2. Support business practices including evaluation, promotion and merit pay to motivate and attract teachers

3. Hold that teachers and schools should be accountable for student achievement, usually measured by standardized testing

4. Support alternate paths to the classroom through programs like Teach For America

5. Affiliate themselves with no-excuses charter schools

However, neither French nor I (nor Kilstein) are crazy about this term, for a few reasons. First, the group of people we call “reformers” sometimes have drastically different views on these topics. For example, opinions about the appropriateness of suspending students vary widely among people who support the rapid expansion of charter schools. Because “reformers” don’t hold monolithic views, it doesn’t make a ton of sense to lump them all into the same category.

Second, using the term “reformers” erroneously suggests that only a certain group of people support school improvements. However, teachers in unions and other critics of typical education reform efforts fight for school reforms themselves; they just have a different (and, on balance, more evidence-based and theoretically sound) perspective about which reforms we should pursue on behalf of students in low-income communities. Despite misleading claims to the contrary, very few people actually support the “status quo” in education. Though the word has become associated with negative imagery for a lot of education stakeholders, nearly everyone is a “reformer” to some extent.

Third, the use of a term like “reformers” reinforces the notion that there are two polarized “sides” in education debates, the “reformers” and their opponents. As I discussed with French, I believe the “sides” are much less in opposition than they sometimes appear to be, and that most people in education are in general agreement on the vast majority of issues. The more we can deconstruct the notion of “sides,” the better.

That said, I don’t have a great solution to either the first or third problems (for the second, I’d recommend that we use clunkier phrases like “proponents of market-driven reforms to education” and “advocates for a comprehensive social justice approach to education policy” when we can). Categories can be useful for brevity’s sake, and as is evident below, it’s hard to construct an argument while avoiding categorization altogether. Still, I think it’s worth reflecting on our naming conventions as we endeavor to be more nuanced.

2) Why unions are power-balancing advocates for low-income kids: French explained during our discussion that many people believe the San Jose Teachers Association (SJTA, the local union for which I served as an Executive Board member from 2012 to 2014) to be an atypically progressive union. In reality (and I believe French agrees), the vast majority of unions, including national teachers unions like the National Education Association (NEA) and American Federation of Teachers (AFT), are some of the most power-balancing institutions out there.

Recent research by Martin Gilens confirms this fact: unions consistently advocate on behalf of less advantaged populations on a wide range of social justice issues. They serve as an important counterbalance to wealthy interests and exploitative policies, and have made extremely important gains for working Americans throughout their history. It’s probably not a coincidence that the steep decline in unionization over the past thirty years has coincided with a steep increase in earnings, income, and wealth inequality.

That doesn’t mean unions can’t be wrong on certain issues. We should absolutely condemn the behavior of police unions that defend racist positions, for example, and demand that they be held accountable and change. Teachers unions shouldn’t be immune from criticism, either, and it’s imperative that we confront them when we believe their positions are misguided. Not all teachers unions have realized their potential as social justice unions just yet, and while I firmly believe that a different approach from the education community would help more of them do so, organized labor must also proactively analyze and revise practices that don’t fit its mission.

Yet we must also remember that teachers unions have very strong track records on behalf of low- and moderate-income families, and more credibility as advocates for low-income kids than many of the people and organizations who malign unions. Even if you think certain teachers unions are wrong about aspects of education policy, it’s completely inaccurate to argue that their existence harms low-income kids. The empirical evidence is clear (much clearer, in general, than the evidence about education policy ideas) that teachers unions are a major net positive for low-income populations.

There’s joint responsibility to change the tone of education conversations, and union members must avoid becoming reflexively defensive when confronted with criticism. We do ourselves and our students a disservice when we react by ignoring people outright or slinging insults right back; instead, we should try to understand the legitimate elements of critiques, address them, and educate people on where they’re wrong and how to have more productive dialogue.

At the same time, union members and leaders are understandably offended when proponents of market-driven reforms (making an attempt!) imply that union opposition to these reforms is borne of laziness, selfishness, and/or incompetence. Everyone needs to remember that teachers in unions, who are directly student-facing and who will actually implement education reform ideas, typically have good ideas about what students need, and that both private and public sector unions are important advocates for low-income people in general. While there is some shared responsibility, the tone of the debate cannot change until proponents of market-driven reforms acknowledge these facts. The sooner anti-union messaging becomes a thing of education reform conversations past, the sooner we can collaboratively develop great policies for students.

A big thank you to French for having me on the show, and hope you enjoy the podcast!


Filed under Education, Labor

The Political Lens: What Global Warming and Wright v. New York Have in Common

During the 2003-2004 school year, my chemistry teacher told my class that global warming wasn’t occurring.  I believed her.  When I attended New Jersey’s Governor’s School of International Studies in the summer of 2005, a professor told me the opposite – the evidence for global warming, and for the human contribution to it, was virtually incontrovertible.  Confused about what to think, I began to research the issue.  I also reached out to some of my other former teachers to ask for their input.

Three things became immediately clear.  First, most popular articles about global warming contained more empty rhetoric than useful information.  The mainstream media, as it far too frequently does, focused not on the truth but on grandstanding and a false sense of balance.  Second, I didn’t know enough climate science to look through a given study’s results and determine their legitimacy.  Third, I didn’t have to – a different approach could tell me everything I needed to know about each study’s likely veracity.

Global warming research falls into two categories: research by legitimate scientists and “research” funded by big energy interests.  Legitimate scientists, who have no economic incentive to lie, conclude that global warming is a manmade crisis deserving our immediate action.  The few studies that suggest otherwise are normally sponsored by organizations like Exxon and the American Petroleum Institute, interest groups with billions of dollars invested in the activity responsible for global warming.

As with global warming, knowledge of the individual and organizational incentives behind opposing “sides” of any debate provides us with critical information.  This “political lens,” though not completely foolproof, reminds us that certain claims deserve a larger dose of skepticism than others.  The agendas behind a movement are especially important to consider when we lack in-depth knowledge of a particular issue.

In education policy debates, “reformers” far too often selectively and inaccurately apply the political lens or dismiss its importance.  That dynamic surfaced after Stephen Colbert interviewed former CNN anchor Campbell Brown on July 31. Brown’s organization, Partnership for Educational Justice, had filed Wright v. New York three days before the interview.  Wright, modeled after Vergara v. California, challenges several aspects of teacher employment law.

A small group of teachers, parents, and grassroots organizers showed up to protest Brown’s appearance on the show.  Colbert, responding to the protesters and the Twitter hashtag #questions4campbell, asked Brown about her organization’s funding sources.  Brown refused to disclose her donors.  Amidst the criticism that followed, various stakeholders have rushed to Brown’s defense.  They continue to argue that a focus on Brown’s donors and political affiliations is a “desperate effort to distract from the real conversation” about teacher employment law.

The truth of the matter, however, is that educators would love to focus on substantive conversations about teacher employment law.  Teacher “tenure” and dismissal and layoff procedures, though they are intended to protect both student and teacher access to a positive, productive educational experience, don’t always work as intended.  Unions recognize this problem and recommend legislative improvements that simultaneously address issues with the execution of the laws and preserve their important components.  We also frequently discuss the laws on their merits.  Additionally, student advocates would love to see reformers, unions, and legislators engaged in substantive conversations about how to unite behind and fight for causes that matter considerably more for the lives of low-income students: in-school causes like funding equity and improved teacher support and out-of-school causes like the living wage and immigrant rights.

Unfortunately, pro-Wright propaganda, featured much more prominently in the mainstream media than legitimate arguments for the defense, often drowns out these “real conversations.”  No teacher has a job for life, competent school districts can and do dismiss bad teachers, and there is absolutely no evidence that teacher employment law causes inequities between low-income and high-income schools***, yet relatively large swaths of the American public have bought Brown’s misleading narrative and harbor severe misconceptions about the statutes and their effects.  Brown isn’t leading her crusade with a rigorous analysis of the facts and sound logical argument; instead, she “addresses” the lawsuit’s substantive critiques by ignoring inconvenient statistics and logic and implying that disagreement indicates a disregard for the well-being of children.  It’s hard for the public to understand the nuances of education law and research when Wright supporters prominently and erroneously equate opposition to the lawsuit with the defense of horrible teachers.

Thus while education law and research is arguably less complicated than the science behind global warming, the political lens is equally important to consider in this debate.  It’s theoretically possible that the unions who defend teacher employment law do so to protect teachers who call students names and sleep in class.  And it’s theoretically possible that Campbell Brown and her unnamed donors care more about the lives of low-income kids than do the unionized teachers who work with them every day.  It’s also theoretically possible that Exxon produces more honest research about global warming than does the entire scientific community.  But these theoretical possibilities are all extremely unlikely.

Instead, it’s significantly more likely that Campbell Brown’s donors, like the people who funded Vergara v. California, actively exacerbate economic inequality.  That Wright v. New York and Vergara conveniently allow them to undermine organized labor and distract us from the ways their business and political activities harm the families of the very same low-income students they purport to help.  That teachers in unions care deeply about delivering an excellent education to their students, and that their opposition to the lawsuit stems from its negative narrative, erroneous claims and premises, and failure to provide solutions to the actual causes of teacher quality issues.  In other words, looking through our political lens reminds us that there are literally billions more “adult interests” in support of Wright v. New York than in its defense.

Educators must continue to clarify facts about teacher employment law and support responsible reforms.  Most proponents of challenges to the statutes are well-intentioned, and a focus on agendas alone would not do the issues justice.  It is also entirely legitimate, however, to call attention to the profit and political motives behind lawsuits like Wright and Vergara.  Knowledge of donors and allies helps us understand why, when unions and Campbell Brown present conflicting information about the law’s intent and effects, Campbell Brown’s claims warrant significantly more suspicion.


***While the plaintiffs in Wright, unlike those in Vergara, do not erroneously contend in their complaint that the laws cause inequities between low- and high-income schools, the idea that low-income students are disproportionately impacted by bad teachers was mentioned by Brown in her appearance on The Colbert Report and still surfaces in discussions of the lawsuit.

Update: A version of this post appeared on The Huffington Post on October 2.


Filed under Education, Labor, Philosophy

On Education and Poverty, and How We Talk About Them (Part 3b)

StudentsFirst Vice President Eric Lerum and I recently began a debate about approaches to teacher evaluation.  During Part 2 of that debate, the conversation touched on the relationship between anti-poverty work and education reform.  We resume that conversation below.

Here were the relevant parts of our original exchange, in case you missed it:

Lerum: The larger point that is made repeatedly is that because outside factors play a larger overall role in impacting student achievement, we should not focus on teacher effectiveness and instead solve for these other factors. This is a key disconnect in the education reform debate. Reformers believe that focusing on things like teacher quality and focusing on improving circumstances for children outside of school need not be mutually exclusive. Teacher quality is still very important, as Shankerblog notes. Improving teacher quality and then doing everything we can to ensure students have access to great teachers does not conflict at all with efforts to eliminate poverty. In fact, I would view them as complementary. But critics of these reforms use this argument to say that one should come before the other – that because these other things play larger roles, we should focus our efforts there. That is misguided, I think – we can do both simultaneously. And as importantly in terms of the debate, no reformer that I know suggests that we should only focus on teacher quality or choice or whatever at the expense or exclusion of something else, like poverty reduction or improving health care.

Spielberg: I believe you discuss [a] very important question…Given that student outcomes are primarily determined by factors unrelated to teaching quality, can and should people still work on improving teacher effectiveness?

Yes!  While teaching quality accounts for, at most, a small percentage of the opportunity gap, teacher effectiveness is still very important.  Your characterization of reform critics is a common misconception; everyone I’ve ever spoken with believes we can work on addressing poverty and improving schools simultaneously.  Especially since we decided to have this conversation to talk about how to measure teacher performance, I’m not sure why you think I’d argue that “we should not focus on teacher effectiveness.”  I am critiquing the quality of some of StudentsFirst’s recommendations – they are unlikely to improve teacher effectiveness and have serious negative consequences – not the topic of reform itself.  I recommend we pursue policy solutions more likely to improve our schools.

Critics of reform do have a legitimate issue with the way education reformers discuss poverty, however.  Education research’s clearest conclusion is that poverty explains inequality significantly better than school-related factors.  Reformers often pay lip-service to the importance of poverty and then erroneously imply an equivalence between the impact of anti-poverty initiatives and education reforms.  They suggest that there’s far more class mobility in the United States than actually exists.  This suggestion harms low-income students.

As an example, consider the controversy that surrounded New York mayor Bill de Blasio several months ago.  De Blasio was a huge proponent of measures to reduce income inequality, helped reform stop-and-frisk laws that unfairly targeted minorities, had fought to institute universal pre-K, and had shown himself in nearly every other arena to fight for underprivileged populations.  While it would have been perfectly reasonable for StudentsFirst to disagree with him about the three charter co-locations (out of seventeen) that he rejected, StudentsFirst’s insinuation that de Blasio’s position was “down with good schools” was dishonest, especially since a comprehensive assessment of de Blasio’s policies would have indisputably given him high marks on helping low-income students.  At the same time, StudentsFirst aligns itself with corporate philanthropists and politicians, like the Waltons and Chris Christie, who actively exploit the poor and undermine anti-poverty efforts.  This alignment allows wealthy interests to masquerade as advocates for low-income students while they work behind the scenes to deprive poor students of basic services.  Critics argue that organizations like StudentsFirst have chosen the wrong allies and enemies.

I wholeheartedly agree that anti-poverty initiatives and smart education reforms are complementary.  I’d just like to see StudentsFirst speak honestly about the relative impact of both.  I’d also love to see you hold donors and politicians accountable for their overall impact on students in low-income communities.  Then reformers and critics of reform alike could stop accusing each other of pursuing “adult interests” and focus instead on the important work of improving our schools.

Lerum: So I’m beginning to understand where some of the miscommunication is coming from. You speak a lot about how you view StudentsFirst’s (and other reformers’) discussion of poverty from the perspective of what you expect us to talk about, rather than from the perspective of our stated objectives. That is, what you deem as “lip service” is merely an acknowledgement of something that is not our primary focus. There are many folks in education reform – I have a few on my team – who could spend hours talking about poverty reduction and could very easily work in another field that more traditionally aligns with what you think of as efforts geared toward reducing poverty. But the route we’re taking is one where reducing poverty, achieving social justice, lifting the long-term opportunities for our country – they all intersect. And therefore what we focus on at StudentsFirst are the policy levers – what we think of as levers for reform or change. For example, creating the conditions for other reforms to flourish or for educators and school leaders to use their resources more wisely (fiscal transparency, structuring smarter compensation systems, creating more school-level autonomy) are levers, whereas something like instituting a STEM program or increasing funding for social and mental health services would be specific programs or initiatives. Both are great for kids. Both are needed in order to ultimately reduce poverty. But we’re squarely focused on the former, while critics seem to be expecting we would focus on the latter. This disconnect is made worse though because critics seem to believe that an approach that involves initiatives is the only way to combat poverty. There’s a lack of appreciation and understanding of what’s intended by reform efforts that target levers.

Spielberg: I actually wasn’t talking about the distinction between levers and initiatives; I was talking about accurate messaging and political activity.

My two critiques from above (rephrased and with my questions for you added) were:

1) StudentsFirst leaders and board members frequently suggest that education can improve the lives of low-income kids as much or more than alleviating poverty.  That suggestion is demonstrably false.  You could say the following, but don’t: “Research is clear that school-related factors cannot fix the achievement gap, but it’s also clear that schools make a difference.  They seem to account for about 20% of student achievement, and our organization believes we can maximize the impact of this 20% with an intense focus on certain policy levers.  We fully support other organizations that work on the anti-poverty efforts that are most important for low-income kids.”  Why won’t you speak honestly about the limitations and relative importance of the reforms you push when compared with other efforts?

2) Relatedly, StudentsFirst supports politicians (besides just Chris Christie, who I discussed above) who substantially harm some of the neediest kids: your preferred candidates have rejected the Medicaid expansion, slashed education spending, tried to prevent immigrants from enrolling in school, and actively discriminated against LGBT youth (though you finally withdrew support for your 2013 “education reformer of the year” after intense public pressure).  StudentsFirst says on your website that the candidates you support “have demonstrated a commitment to policies that prioritize student interests;” I find this assertion at best myopic, and at worst deliberately misleading.  How can you reconcile StudentsFirst’s candidate support with the fact that, on the whole, many of these candidates cause significant harm to low-income and minority students?

I appreciate, as you mentioned in a comment on Part 2 of this conversation, that you “created a school-based mental health program and piloted a half-dozen evidence-based mental/social/emotional health programs” in DC, and I’d love to talk more about the other issues you raised in your response, but I think your thoughts on the above points and questions are most relevant to typical reformer critiques.

Lerum: On the policy discussion, I would just end with this then. Saying it's demonstrably false that education can solve the achievement gap only works if you base that claim on the education system we have now. To say that today's education system cannot and has not solved the problem of poverty or the problem of the achievement gap thus far is correct. It's also correct that in 60 years we haven't solved the problem of segregation. But I got into this work because, like every reformer I know, I believe completely that we can do better than this. We don't even know what's possible because we haven't actually tried. We've never run a public school system at scale completely differently. We're not very good at breaking the mold of a model that hasn't worked. But there are reasons – an increasing body of research – to believe that if we do, we just might get somewhere. That's a theory of change. You can disagree with it. But you do not have the evidence that it won't work, because everything that's been tried or done thus far has been done within some confines or under some of the restraints of the existing system. There are many limitations, that's true. I think we've done a pretty good job of talking about those limitations through our advocacy work.

I would also add that there's little evidence that other approaches championed as counters to reform will have a tremendous impact on kids either. I would love to see this "comprehensive assessment of de Blasio's policies" that you spoke of earlier – but it doesn't exist. Rather, there is simply a different theory of change – that certain other levers, be they class size or overall funding, will have a greater impact than the reforms we're advocating for. What we need is a way to model, using rigorous research, what the potential impact of various reforms would be. That doesn't exist right now either. But what I'm trying to get you to agree to here is this: attacking one side for having an unproven theory while not acknowledging that the anti-reform side isn't exactly operating with a track record of success seems to me disingenuous, but more importantly, it allows opponents to occupy this space wherein they own the debate on what's good for solving poverty, what the right approach is to combat social ills, and so on. And I just believe that way of thinking hasn't gotten us very far and doesn't advance social change.

As to the political issues you raise – we consistently say that we will support public officials who support the policies we believe are right for kids. I understand you have issues with our agenda – but there's nothing inconsistent about a single-issue organization supporting candidates who support and will advocate for its issues. That almost always means we will, as an organization, support candidates with whom I may not agree on a personal level when it comes to any number of other issues. But that is not unique to StudentsFirst, and I do not think it is reasonable to expect us to answer for their stances on other issues or to ask them to change those stances. The issues we prioritize are those on our policy agenda and we work to stick with that approach, as do countless other organizations in other fields.

Spielberg: While I would agree with you (and said in Part 1 of our conversation) that the research on many in-school reforms is mixed, the suggestion that you seem to be making – that school-based reforms alone could potentially solve the opportunity gap – is contradicted by existing research and logic.  Research has never attributed more than one-third of the variation in student outcomes to school-based factors, we know that “children from rich and poor families score very differently on school readiness tests when they enter kindergarten,” and there is even “some evidence that achievement gaps between high- and low-income students actually narrow during the nine-month school year, but they widen again in the summer months.”  Though I suppose it’s theoretically possible that these studies are wrong, that could be said about almost anything, and the findings you link about teacher attrition and charters in no way support that conclusion.  Our knowledge about the disadvantages of growing up in poverty and the past several decades of research suggest that this theoretical possibility is negligible, which is why I called that statement demonstrably false.

I certainly understand the sentiment behind what you’re saying – we are in agreement that we haven’t yet maximized education’s contribution to anti-poverty efforts, and I think it’s important to remember and highlight that fact – but all the evidence points to a relatively low upper bound on what education reforms alone can accomplish.  Recognizing that anti-poverty work matters more than schools does not preclude us from arguing that what happens in schools is very important for low-income kids.
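The upper-bound reasoning above can be reduced to a back-of-envelope calculation. The one-third share is the ceiling cited in the surrounding text; the 30-point gap is an invented, purely illustrative number.

```python
# Back-of-envelope sketch of the "upper bound" argument. The one-third
# share is the ceiling cited in the surrounding text; the 30-point gap
# is an invented, purely illustrative figure.
school_share = 1 / 3        # at most ~33% of outcome variation is school-based
opportunity_gap = 30.0      # hypothetical achievement gap, in test-score points

# Even a perfect education system could close at most the school-based share.
max_school_closure = school_share * opportunity_gap   # ~10 points
remaining_gap = opportunity_gap - max_school_closure  # ~20 points left to
                                                      # anti-poverty efforts
```

On these numbers, even maximally effective school reform leaves roughly two-thirds of the hypothetical gap to out-of-school interventions, which is the "relatively low upper bound" the paragraph above describes.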

I really appreciate having had this conversation and want to thank you again for going back-and-forth with me, but I believe we’re at a bit of an impasse.  My questions deal with the out-of-school factors, like having access to health care, that very clearly matter for low-income students, and I don’t think your response addresses the issues I raised.  You’re absolutely right that StudentsFirst isn’t alone in narrowing its policy focus, but the fact that other organizations also do so doesn’t qualify as a defense of that approach.  Talking about what’s “right for kids” means considering more than just education policy.

As I’ve pointed out to critics of typical reform efforts before, I think it would be reasonable for reform organizations to focus their professional advocacy on school-based approaches to the opportunity gap if you did two things:

1) Acknowledge that the best school-based reforms imaginable, while important, would likely only be able to solve 20% to, at most, 33% of the problem.

2) Avoid undermining the anti-poverty work that can address a larger percentage of the opportunity gap.

I don’t believe that StudentsFirst currently does those two things, but I will leave it up to our readers to decide which arguments they find more compelling.


Cooks, Chefs, and Teachers: A Long-Form Debate on Evaluation (Part 3a)

StudentsFirst Vice President Eric Lerum and I have been debating teacher evaluation approaches since my blog post about why evaluating teachers based on student test scores is misguided and counterproductive.  Our conversation began to touch on the relationship between anti-poverty activism and education reform conversations, a topic we plan to continue discussing.  First, however, we wanted to focus back on our evaluation debate.  Eric originally compared teachers to cooks, and while I noted that cooks have considerably more control over the outcomes of their work than do teachers, we fleshed that analogy out and continue discussing its applicability to teaching below.

Click here to read Part 1 of the conversation.

Click here to read Part 2 of the conversation.

Lerum: I love the analogy you use for this simple reason – I don't think we're as interested in figuring out whether the cook is an "excellent recipe-follower" as we are in whether the cook makes food that tastes delicious. And since we're talking about the evaluation systems themselves – and not the consequences attached (which, by and large, most jurisdictions are not using) – this really matters. The evaluation instrument may reveal that the cook is not an "excellent recipe follower" – a point you gloss over, but an important one. It could certainly identify those cooks who need to work on their recipe-following skills. That's helpful in creating better cooks.

But taking your hypothetical that it identifies someone who can follow a recipe well and executes our strategies, but then the outcome is still bad – that is also important information. It could cause us to re-evaluate the recipe, the meal choice, certain techniques, even the assessment instrument itself (do the people tasting the food know what good food tastes like?). But all of those would be useful and significant pieces of information that we would not get if we weren’t starting with an evaluation framework that includes outcomes measures.

You clearly make the assumption that nobody would question the evaluation instrument or anything else – if we had this result for multiple cooks, we would just keep going with it and assume it’s the cooks and nothing else. But that’s an unreasonable assumption that I think is founded on a lack of trust and respect for the intentions underlying the evaluation. What we’re focused on is identifying, improving, rewarding, and making decisions based on performance. And we want accurate measures for doing so – nobody is interested in models that do not work. That’s why you constantly see the earliest adopters of such models making improvements as they go.

Also, to clarify, we do not advocate for the “use of standardized test scores as a defined percentage of teacher evaluations.” I assume you probably didn’t mean that literally, but I think it’s important for readers to understand the difference as it’s a common and oft-repeated misconception among critics of reform. We advocate for use of measures of student growth – big difference from just using the scores alone. It doesn’t make any sense to evaluate teachers based on the test scores themselves – there needs to be some measure (such as VAM) of how much students learn over time (their growth), but that is not a single snapshot based on any one test.
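The growth-versus-raw-scores distinction drawn above can be sketched with invented numbers. Real growth models such as VAM are far more elaborate (they predict each student's expected score from many covariates and credit the residual), but the basic contrast looks like this:

```python
# Illustrative sketch with invented scores: why a growth measure and a
# raw-score measure can rank the same two teachers in opposite orders.
def average(xs):
    return sum(xs) / len(xs)

# (pre-test, post-test) scores for each student, hypothetical data
teacher_a = [(20, 45), (25, 50), (30, 60)]   # students start far behind
teacher_b = [(70, 80), (75, 85), (80, 88)]   # students start far ahead

raw_a = average([post for _, post in teacher_a])             # ~51.7
raw_b = average([post for _, post in teacher_b])             # ~84.3
growth_a = average([post - pre for pre, post in teacher_a])  # ~26.7
growth_b = average([post - pre for pre, post in teacher_b])  # ~9.3

# Judged on raw post-test scores, Teacher B looks far better; judged
# on growth, Teacher A does.
```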

I appreciate your recommendation regarding the use of even growth data based on assessments, but again, your recommendation is based on your opinion and I respectfully disagree, as do many researchers and respected analysts (also see here and here – getting at some of the issues you raise as concerns, but proposing different solutions). To go back to your analogy, nobody is interested in going to a restaurant run by really good recipe-followers. They want to go where the food tastes good. Period. Likewise, no parent wants to send her child to a classroom taught by a teacher who creates and executes the best lesson-planning. They want to send their child to a classroom in which she will learn. Outcomes are always part of the equation. Figuring out the best way to measure them may always have some inherent issues with subjectivity or variability, but I believe removing outcomes from the overall evaluation itself betrays to some degree the initial purpose.

Spielberg: I think there’s some confusion here about what I’m advocating for and critiquing.  I’d like to reiterate what I have consistently argued in this exchange – that student outcomes should be a part of the teacher evaluation process in two ways:

1) We should evaluate how well teachers gather data on student achievement, analyze the data, and use the data to reflect on and improve their future instruction.

2) We should examine the correlation between the effective execution of teacher practices and student outcome results.  We should then use the results of this examination to revise our instructional practices as needed.

I have never critiqued the fact that you care about student outcomes and believe they should factor heavily into our thinking – on this point we agree (I’ve never met anyone who works in education who doesn’t).  We also agree that it is better to measure student growth on standardized test scores, as value added modeling (VAM) attempts to do, than to look at absolute scores on standardized tests (I apologize if my earlier wording about StudentsFirst’s position was unclear – I haven’t heard anyone speak in favor of the use of absolute scores in quite some time and assumed everyone reading this exchange would know what I meant).  Furthermore, the “useful and significant pieces of information” you talk about above are all captured in the evaluation framework I recommend.

My issue has always been with the specific way you want to factor student outcomes into evaluation systems.  StudentsFirst supports making teachers’ VAM results a defined percentage of a teacher’s “score” during the evaluation process, do you not?  You highlight places, like DC and Tennessee, that use VAM results in this fashion.  Whether or not this practice is likely to achieve its desired effect is not really a matter of opinion; it’s a matter of mathematical theory and empirical research.  I’ve laid out why StudentsFirst’s approach is inconsistent with the theory and research in earlier parts of our conversation and none of the work you link above refutes that argument.  As you mention, both Matt Di Carlo and Douglas Harris, the authors of the four pieces you linked, identify issues with the typical uses of VAM similar to the ones I discuss.  Their main defense of VAM is only to suggest that other methods of evaluation are similarly problematic; Harris discusses a “lack of reliability in essentially all measures” and Di Carlo notes that “alternative measures are also noisy.”  There is, however, more recent evidence from MET that multiple, full-period classroom observations by multiple evaluators are significantly more reliable than VAM results.  While Di Carlo and Harris do have slightly different opinions than me about the role of value added, Di Carlo’s writing and Harris’s suggestion for evaluation on the whole seem far closer to what I’m advocating than to StudentsFirst’s recommendations, and I’d be very interested to hear their thoughts on this conversation.
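The reliability point here – that averaging multiple observations can beat a single noisy measure – follows from basic sampling behavior. A small simulation with invented noise levels (not MET's actual data) illustrates it:

```python
# Hypothetical simulation: averaging several equally noisy reads of the
# same underlying teaching quality roughly halves the error for four
# reads. Noise levels are invented for illustration.
import random
random.seed(0)

def mean_abs_error(noise_sd, n_measures, trials=10000):
    """Average |estimate - truth| when the estimate averages n noisy reads."""
    true_quality = 0.0
    total = 0.0
    for _ in range(trials):
        reads = [random.gauss(true_quality, noise_sd) for _ in range(n_measures)]
        total += abs(sum(reads) / n_measures - true_quality)
    return total / trials

one_noisy_measure = mean_abs_error(noise_sd=1.0, n_measures=1)
four_observations = mean_abs_error(noise_sd=1.0, n_measures=4)
# Error shrinks with the square root of the number of reads, so four
# observations of equal noise yield roughly half the error of one.
```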

That said, I like your focus above on what parents want, and I think it’s a worthwhile exercise to look at the purposes of evaluation systems and how our respective proposals meet the desires and needs of different stakeholders.  I believe evaluation systems have three primary purposes: providing information, facilitating support, and creating incentives.

1) Providing Information – You wrote the following:

…nobody is interested in going to a restaurant run by really good recipe-followers. They want to go where the food tastes good. Period. Likewise, no parent wants to send her child to a classroom taught by a teacher who creates and executes the best lesson-planning. They want to send their child to a classroom in which she will learn.

The first thing I’d note is that this juxtaposition doesn’t make very much sense; students taught by teachers who create and execute the best lesson-planning will most likely learn quite a bit (assuming that the teachers who are great lesson planners are at least decent at other aspects of good teaching). In addition, restaurants run by really good recipe-followers, if the recipes are good, will probably produce good-tasting food.  Good outputs are expected when inputs are well-chosen and executed effectively.

The cooking analogy is a bit problematic here because, in the example you give, the taste of the food is both the ultimately desired outcome and the metric by which you propose to assess the cook’s output.  In the educational setting, the metric – VAM, in the case of our debate – is not the same as the desired output.  In fact, VAM results are a relatively weak proxy for only a subset of the outcomes we care about for kids (those related to academic growth).  To construct a more appropriate analogy for judging a teacher on VAM results, let’s consider a chef who works in a restaurant where we want to eat dinner.  We are interested, ultimately, in the overall dining experience we will have at the restaurant. A measurement tool parallel to VAM, one that gives us a potentially useful but very limited picture of only one aspect of the experience other diners had, could be other diners’ assessments of the smell of the chef’s previous meals.

This analogy is more appropriate because the degree to which different diners value different aspects of a dining experience is highly variable.  All diners likely care to some extent about a combination of the food selection, the sustainability of their meal, the food’s taste, the atmosphere, the service, and the price.  Some, however, might value a beautiful, romantic environment over the taste of their entrees, while others may care about service above all else.  Likewise, some parents may care most about a classroom that fosters kindness, some may prioritize the development of critical thinking skills, and others may hold content knowledge in the highest esteem.

Were I to eat at a restaurant, I’d certainly get some information from knowing other diners’ assessments of previous meals’ smells.  Smell and taste are definitely correlated and I tend to value taste above other considerations when I’m considering a restaurant.  Yet it’s possible that other diners like different kinds of food than me, or that their senses of smell were affected by the weather or allergies when they dined there.  Some food, even though it smells bad, tastes quite good (and vice versa).  If I didn’t look deeper and really analyze what caused the smell ratings, I could very easily choose a sub-optimal restaurant.

What I’d really want to know would be answers to the following questions: what kind of food does the chef plan to make?  Does he source it sustainably?  Is it prepared to order?  Is the wait-staff attentive?  What’s the decor like?  The lighting?  Does the chef accommodate special requests?  How does the chef solicit feedback from his guests, and does he, when necessary, modify his practices in response to the feedback?  If diners could get information on the execution in each of these areas, they would be much better positioned to figure out whether they would enjoy the dining experience than if they focused on other diners’ smell ratings.  A chef who did all of these things well and who used Bayesian analysis to add, drop, and refine menu items and restaurant practices over time would almost certainly maximize the likelihood that future guests would leave satisfied.  A chef with great smell ratings might maximize that probability, but he also might not.

The exact same reasoning applies to the classroom experience.  Good VAM results might indicate a classroom that would provide a learning experience appropriate for a given student, but they might not.  Though I will again note that you don’t advocate for judging teachers solely on VAM, VAM scores tend to be what people focus on when they’re a defined percentage of evaluations.  That focus, again, does not provide very good information.  Whether parents value character development, inspiration, skill building, content mastery, or any other aspect of their children’s educational experience, they would get the best information by concentrating on teacher actions. If a parent knows a teacher’s skill – at establishing a positive classroom environment, at lesson planning, at lesson delivery, at using formative assessment to monitor student progress and adapt instruction, at helping students outside of class, etc. – that parent will be much more informed about the likelihood that a child will learn in a teacher’s class than if that parent focuses attention on the teacher’s VAM results.

2) Facilitating support – A chef with bad smell ratings might not be a very good chef.  But if that’s the case, any system that addressed the questions above – that assessed the chef’s skill at choosing recipes, sourcing great ingredients, making food to order, training his wait-staff, decorating his restaurant, responding to guest feedback, etc. – should also give him poor marks.  Bad results that truly signify bad performance, as opposed to reflecting bad luck or circumstances outside of the chef’s control, are the result of a bad input.  The key idea here is that, if we judge chefs on input execution but monitor outputs to make sure the inputs are comprehensive and accurate, judging chefs on their smell ratings won’t give us any additional information about which chefs need support.

More importantly, making smell ratings a defined percentage of a chef’s evaluation would not help a struggling chef improve his performance.  No matter the other components of his evaluation, he is likely to concentrate primarily on the smell ratings, feel like a failure, and have difficulty focusing on areas in which he can improve.  If we instead show the chef that, despite training the waitstaff well, he is having trouble selecting the best ingredients, we give him an actionable item to consider.  “Try these approaches to selecting new ingredients” is much easier to follow and much less demoralizing a directive than “raise your smell ratings.”

I think the parallel here is pretty clear – if we define and measure appropriate teaching inputs and use outcomes in Bayesian analysis to constantly revise those inputs, making VAM a defined percentage of an evaluation provides no new information about which teachers need support.  Especially because VAM formulas are complex statistical models that aren’t easily understood, the defined-percentage approach also focuses the evaluation away from actionable improvement items and towards the assignment of credit and blame.

3) Creating Incentives – Finally, a third goal of evaluation systems is related to workforce incentives.  First, we often wish to reward and retain high-performers and, in the instances in which support fails, exit consistently low-performers.  For retention and dismissal to improve overall workforce quality, we must base these decisions on accurate performance measures.

I don’t think the incomplete information provided by VAM results and smell ratings needs rehashing here; the argument is the same as above.  We are going to retain a higher percentage of chefs and teachers who are actually excellent if our evaluation systems focus on what they control than if our incentives focus on outputs over which they have limited impact.

Of particular concern to me, however, are the incentives teachers have for working with the highest-need populations.  Even efforts that take great pains to “level the playing field” between teachers with different student populations result in significantly better VAM results for teachers and schools that work with more privileged students.  Research strongly suggests that teachers who work in low-income communities could substantially improve their VAM scores by moving to classrooms with more affluent populations (and keeping their teaching quality constant).  When we make VAM results a defined percentage of an evaluation, we provide incentives for teachers who work with the highest-need populations to leave.  The type of evaluation I’m proposing, if we execute it properly, would eliminate this perverse incentive.
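The perverse incentive described here reduces to a toy calculation (all numbers invented): if the evaluation model fails to adjust for an out-of-school factor, an identical teacher scores differently depending on the classroom.

```python
# Hypothetical illustration of the incentive problem: a growth measure
# that omits an out-of-school factor attributes that factor's effect to
# the teacher. All effect sizes are invented.
def observed_growth(teacher_effect, out_of_school_effect):
    """Growth credited to the teacher when context isn't adjusted for."""
    return teacher_effect + out_of_school_effect

same_teacher_effect = 5.0   # identical teaching quality in both settings
affluent_score = observed_growth(same_teacher_effect, out_of_school_effect=2.0)
low_income_score = observed_growth(same_teacher_effect, out_of_school_effect=-2.0)

# The identical teacher "scores" higher in the affluent classroom purely
# as an artifact of unadjusted context – an incentive to move toward
# more privileged student populations.
```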

Again, I want to reiterate that I support constantly monitoring student outcomes; we should evaluate teachers on their ability to modify instruction in response to student outcomes, and we should also use outcomes to continuously refine our list of great teaching inputs.  But we rely on evaluation systems to provide accurate and comprehensive information, to help struggling employees improve, and to provide appropriate incentives.  VAM can help us think about good teaching practices, but StudentsFirst’s proposed use of VAM does not help us accomplish the goals of teacher evaluation.

Part 3b – in which we return to our discussion about the relationship between anti-poverty work and education reform – will follow soon!

Update (8/21/14) – Matt Barnum alerted me to the fact that the article I linked above about efforts to “level the playing field” when looking at VAM results actually does provide evidence that “two-step VAM” can eliminate the bias against low-income schools.  That’s exciting because, assuming the results are replicable and accurate, this particular VAM method would eliminate one of the incentive concerns I discussed.  However, while Educators 4 Excellence (Barnum’s organization) advocates for the use of this method, I don’t believe states currently use it (if you know of a state that does, please feel free to let me know).  The significant other issues with VAM would also still exist even with the use of the two-step version.


Eric Lerum and I Debate Teacher Evaluation and the Role of Anti-Poverty Work (Part 2)

StudentsFirst Vice President Eric Lerum and I recently began debating the use of standardized test scores in high stakes decision-making.  I argued in a recent blog post that we should instead evaluate teachers on what they directly control – their actions.  Our conversation, which began to touch on additional interesting topics, is continued below.

Click here to read Part 1 of the conversation.

Lerum: To finish the outcomes discussion – measuring teachers by the actions they take is itself measuring an input. What do we learn from evaluating how hard a teacher tries? And is that enough to evaluate teacher performance? Shouldn’t performance be at least somewhat related to the results the teacher gets, independent of how hard she tries? If I put in lots of hours learning how to cook, assembling the perfect recipes, buying the best ingredients, and then even more hours in the kitchen – but the meal I prepare doesn’t taste good and nobody likes it, am I a good cook?

Regarding your use of probability theory and VAM – the problem I have with your analysis there is that VAM is not used to raise student achievement. So using it – even improperly – should not have a direct effect on student achievement. What VAM is used for is determining a teacher’s impact on student achievement, and thereby identifying which teachers are more likely to raise student achievement based on their past ability to do so. So even if you want to apply probability theory and even if you’re right, at best what you’re saying is that we’re unlikely to be able to use it to identify those teachers accurately on an ongoing basis. The larger point that is made repeatedly is that because outside factors play a larger overall role in impacting student achievement, we should not focus on teacher effectiveness and instead solve for these other factors. This is a key disconnect in the education reform debate. Reformers believe that focusing on things like teacher quality and focusing on improving circumstances for children outside of school need not be mutually exclusive. Teacher quality is still very important, as Shankerblog notes. Improving teacher quality and then doing everything we can to ensure students have access to great teachers does not conflict at all with efforts to eliminate poverty. In fact, I would view them as complementary. But critics of these reforms use this argument to say that one should come before the other – that because these other things play larger roles, we should focus our efforts there. That is misguided, I think – we can do both simultaneously. And as importantly in terms of the debate, no reformer that I know suggests that we should only focus on teacher quality or choice or whatever at the expense or exclusion of something else, like poverty reduction or improving health care.

If you’re interested in catching up on class size research, I highly recommend the paper published by Matt Chingos at Brookings, found here with follow-up here. To be clear about my position on class size, however: I’m not against smaller class sizes. If school leaders determine that is an effective way to improve instruction and student achievement in their school, they should utilize that approach. But it’s not the best approach for every school, every class, every teacher, or every child. And thus, state policy should reflect that. Mandating class size limits or restrictions makes no sense. It ties the hands of administrators who may choose to staff their schools differently and use their resources differently. It hinders innovation for educators who may want to teach larger classes in order to configure their classrooms differently, leverage technology or team teaching, etc. Why not instead leave decisions about staffing to school leaders and their educators?

The performance framework for San Jose seems pretty straightforward. I’m curious how you measure #2 (whether teachers know the subjects) – are those through rigorous content exams or some other kind of check?

I think a solid evaluation system would include measures using indicators like these. But you would also need actual student learning/growth data to validate whether those things are working – as you say, “student outcome results should take care of themselves.” You need a measure to confirm that.

I honestly think my short response to all of this would be that there’s nothing in the policies we advocate for that prevents what you’re talking about. And we advocate for meaningful evaluations being used for feedback and professional development – those are critical elements of bills we try to move in states. But as a state-level policy advocacy organization, we don’t advocate for specific models or types of evaluations. We believe certain elements need to be there, but we wouldn’t advocate for states to adopt the San Jose model or any other specifically – that’s just not what policy advocacy is. So I think there’s just general confusion about that – an assumption that because you don’t hear us calling for a model with the components you’re looking for, we must not support it. In fact, we’re focused on policy at a level higher than the district level, and the design and implementation of programs isn’t in our wheelhouse.

Spielberg: I believe you discuss three very important questions, each one of which deserves some attention:

1) Given that student outcomes are primarily determined by factors unrelated to teaching quality, can and should people still work on improving teacher effectiveness?

Yes!  While teaching quality accounts for, at most, a small percentage of the opportunity gap, teacher effectiveness is still very important.  Your characterization of reform critics is a common misconception; everyone I’ve ever spoken with believes we can work on addressing poverty and improving schools simultaneously.  Especially since we decided to have this conversation to talk about how to measure teacher performance, I’m not sure why you think I’d argue that “we should not focus on teacher effectiveness.”  I am critiquing the quality of some of StudentsFirst’s recommendations – they are unlikely to improve teacher effectiveness and have serious negative consequences – not the topic of reform itself.  I recommend we pursue policy solutions more likely to improve our schools.

Critics of reform do have a legitimate issue with the way education reformers discuss poverty, however.  Education research’s clearest conclusion is that poverty explains inequality significantly better than school-related factors.  Reformers often pay lip-service to the importance of poverty and then erroneously imply an equivalence between the impact of anti-poverty initiatives and education reforms.  They suggest that there’s far more class mobility in the United States than actually exists.  This suggestion harms low-income students.

As an example, consider the controversy that surrounded New York City Mayor Bill de Blasio several months ago.  De Blasio was a huge proponent of measures to reduce income inequality, had helped reform stop-and-frisk laws that unfairly targeted minorities, had fought to institute universal pre-K, and had shown himself in nearly every other arena to be a fighter for underprivileged populations.  While it would have been perfectly reasonable for StudentsFirst to disagree with him about the three charter co-locations (out of seventeen) that he rejected, StudentsFirst’s insinuation that de Blasio’s position was “down with good schools” was dishonest, especially since a comprehensive assessment of de Blasio’s policies would have indisputably given him high marks on helping low-income students.  At the same time, StudentsFirst aligns itself with corporate philanthropists and politicians, like the Waltons and Chris Christie, who actively exploit the poor and undermine anti-poverty efforts.  This alignment allows wealthy interests to masquerade as advocates for low-income students while they work behind the scenes to deprive poor students of basic services.  Critics argue that organizations like StudentsFirst have chosen the wrong allies and enemies.

I wholeheartedly agree that anti-poverty initiatives and smart education reforms are complementary.  I’d just like to see StudentsFirst speak honestly about the relative impact of both.  I’d also love to see you hold donors and politicians accountable for their overall impact on students in low-income communities.  Then reformers and critics of reform alike could stop accusing each other of pursuing “adult interests” and focus instead on the important work of improving our schools.

2) How can we use student outcome data to evaluate whether an input-based teacher evaluation system has identified the right teaching inputs?

This concept was the one we originally set out to discuss.  I’d love to focus on it in subsequent posts if that works for you (though I’d love to revisit the other topics in a different conversation if you’re interested).

I’m glad we agree that “a solid evaluation system would include [teacher input-based] measures…like [the ones used in San Jose Unified].”  I also completely agree with you that we need to use student outcome data “to validate whether those things are working.”  That’s exactly the use of student outcome data I recommend.  Though cooks probably have a lot more control over outcomes than teachers, we can use your cooking analogy to discuss how Bayesian analysis works.

We’d need to first estimate the probability that a given input – let’s say, following a specific recipe – is the best path to a desired outcome (a meal that tastes delicious).  This probability is called our “prior.”  Let’s then assume that the situation you describe occurs – a cook follows the recipe perfectly and the food turns out poorly.  We’d need to estimate two additional probabilities. First, we’d need to know the probability the food would have turned out badly if our original prediction was correct and the recipe was a good one.  Second, we’d need the probability that the food would have turned out poorly if our original prediction was incorrect and the recipe was actually a bad one.  Once we had those estimates, there’s a very simple formula we could use to give us an updated probability that the input – the recipe – is a good one.  Were this probability sufficiently low, we would throw out the recipe and pick a new one for the next meal.  We would, however, identify the cook as an excellent recipe-follower.
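The update described above can be sketched in a few lines of Python.  All of the numbers below are illustrative assumptions, not measured values – they just show how a bad meal shifts our confidence in the recipe rather than in the cook:

```python
def bayes_update(prior, p_bad_meal_if_good_recipe, p_bad_meal_if_bad_recipe):
    """Return P(recipe is good | the meal turned out badly) via Bayes' rule."""
    # Total probability of a bad meal under either hypothesis about the recipe
    p_bad_meal = (p_bad_meal_if_good_recipe * prior
                  + p_bad_meal_if_bad_recipe * (1 - prior))
    return p_bad_meal_if_good_recipe * prior / p_bad_meal

# Assumed numbers: we start 80% confident the recipe is good; a good recipe
# rarely produces a bad meal (10% of the time), a bad recipe usually does (70%).
posterior = bayes_update(0.80, 0.10, 0.70)
print(round(posterior, 3))  # confidence in the recipe falls from 0.80 to about 0.364
```

One bad meal doesn’t condemn the recipe outright, but if the posterior dropped low enough after a few meals, we’d swap recipes – while still rating the cook on how faithfully he followed the one we gave him.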

This approach has several advantages over the alternative (evaluating the cook primarily on the taste of the food).  Most obviously, it accurately captures the cook’s performance.  The cook clearly did an excellent job doing what both you and he thought was a good idea – following this specific recipe – and can therefore be expected to do a good job following other recipes in the future.  If we punished him, we’d be sending the message that his actual performance matters less than having good luck, and if we fired him, we’d be depriving ourselves of a potentially great cook.  Additionally, it’s not the cook’s fault that we picked the wrong cooking strategy, so it’s unethical to punish him for doing everything we asked him to do.

Just as importantly, this approach would help us identify the strategies most likely to lead to better meals in the long run.  We might not catch the problem with the recipe if we incorrectly attribute the meal’s taste to the cook’s performance – we might end up continuously hiring and firing a bunch of great cooks before we realize that the recipe is bad.  If we instead focus on the cook’s locus of control – following the recipe – and use Bayesian analysis, we will more quickly discover the best recipes and retain more cooks with recipe-following skills.  Judging cooks on their ability to execute inputs and using outcomes to evaluate the validity of the inputs would, over time, increase the quality of our meals.

Let’s now imagine the analogous situation for teachers.  Suppose a school adopts blended learning as its instructional framework, and suppose a teacher executes the school’s blended learning model perfectly.  However, the teacher’s value added (VAM) results aren’t particularly high.  Should we punish the teacher?  The answer, quite clearly, is no; unless the teacher was bad at something we forgot to identify as an effective teaching practice, none of the explanations for the low scores have anything to do with the teacher’s performance.  Just as with cooking, we might not catch a real problem with a given teaching approach if we incorrectly attribute outcome data to a teacher’s performance – we might end up continuously hiring and firing a bunch of great teachers based on random error, a problem with an instructional framework, or a problem with VAM methodology.

The improper use of student outcome data in high-stakes decision-making has negative consequences for students precisely because of this incorrect attribution.  Making VAM a defined percentage of teacher evaluations leads to employment decisions based on inaccurate perceptions of teacher quality.  Typical VAM usage also makes it harder for us to identify successful teaching practices.  If we instead focus on teachers’ locus of control – effective execution of teacher practices – and use Bayesian analysis, we will more quickly discover the best teaching strategies and retain more teachers who can execute teaching strategies effectively.  Judging teachers on their ability to execute inputs and using outcomes to evaluate the validity of the inputs would, over time, increase the likelihood of student success.

3) As “a state-level policy advocacy organization,” what is the scope of StudentsFirst’s work?

You wrote that StudentsFirst “[doesn’t] advocate for specific models or types of evaluations” but believes “certain elements need to be there.”  One of the elements you recommend is “evaluating teachers based on evidence of student results.”  This recommendation has translated into your support for the use of standardized test scores as a defined percentage of teacher evaluations.  I was not recommending that you ask states to adopt San Jose Unified’s evaluation framework (as an aside, the component you ask about deals mostly with planning and, among other things, uses lesson plans, teacher-created materials, and assessments as evidence) or that you recommend across-the-board class size reduction (thanks for clarifying your position on that, by the way – I look forward to reading the pieces you linked).  Instead, since probability theory and research suggest it isn’t likely to improve teacher performance, I recommend that StudentsFirst discontinue its push to make standardized test scores a percentage of evaluations.  You could instead advocate for evaluation systems that clearly define good teacher practices, hold teachers accountable for implementing good practices, and use student outcomes in Bayesian analysis to evaluate the validity of the defined practices.  This approach would increase the likelihood of achieving your stated organizational goals.

Thanks again for engaging in such an in-depth conversation.  I think more superficial correspondence often misses the nuance in these issues, and I am excited that you and I are getting the opportunity to both identify common ground and discuss our concerns.

Click here to read Part 3a of the conversation, which focuses back on the evaluation debate.

Click here to read Part 3b of the conversation, which focuses on how reformers and other educators talk about poverty.


StudentsFirst Vice President Eric Lerum and I Debate Accountability Measures (Part 1)

After my blog post on the problem with outcome-oriented teacher evaluations and school accountability measures, StudentsFirst Vice President Eric Lerum and I exchanged a few tweets about student outcomes and school inputs and decided to debate teacher and school accountability more thoroughly.  We had a lengthy email conversation we agreed to share, the first part of which is below.

Spielberg: In my last post, I highlighted why both probability theory and empirical research suggest we should stop using student outcome data to evaluate teachers and schools.  Using value added modeling (VAM) as a percentage of an evaluation actually reduces the likelihood of better future student outcomes because VAM results have more to do with random error and outside-of-school factors than they have to do with teaching effectiveness.

I agree with some of your arguments about evaluation; for example, evaluations should definitely use multiple measures of performance.  I also appreciate your opposition to making student test score results the sole determinant of a teacher’s evaluation.  However, you insist that measures like VAM constitute a fairly large percentage of teacher evaluations despite several clear drawbacks; not only do they fail to reliably capture a teacher’s contribution to student performance, but they also narrow our conception of what teachers and schools should do and distract policymakers and educators from conversations about specific practices they might adopt.  Why don’t you instead focus on defining and implementing best practices effectively?  Most educators have similar ideas about what good schools and effective teaching look like, and a focus on the successful implementation of appropriately-defined inputs is the most likely path to better student outcomes in the long run.

Lerum: There’s nothing in the research or the link you cite above that supports a conclusion that using VAM “actually reduces the likelihood of better future student outcomes” – that’s simply an incorrect conclusion to come to. Numerous researchers have concluded that using VAM is reasonable and a helpful component of better teacher evaluations (also see MET). Even Shankerblog doesn’t go so far as to suggest using VAM could reduce chances of greater student success.

Some of your concerns with VAM deal with the uncertainty built within it. But that’s true for any measure. Yet VAM is one of the few measures (if not the only one) that has actually been shown to allow one to control for many of the outside factors you suggest could unfairly prejudice a teacher’s rating.

What VAM does tell us – with greater reliability than other measures – is whether a teacher is likely to get higher student achievement with a particular group of students. I would argue that’s a valuable piece of information to have if the goal is to identify which teachers are getting results and which teachers need development.

To suggest that districts & schools that are focusing on implementing new evaluation systems like those we support are not focusing on “defining and implementing best practices effectively” misses a whole lot of evidence to the contrary. What we’re seeing in DC, Tennessee, Harrison County, CO, and countless other places is that these conversations are happening, and with a renewed vigor because educators are working with more data and a stronger framework than ever before.

Back to your original post and my issues with it, however – focusing on inputs is not a new approach. It’s the one we have tried for decades. More pay for earning a Masters degree. Class size restrictions and staffing ratios. Providing funding that can only be used for certain programs. The list goes on and on.

Spielberg: I don’t think anyone thinks we should evaluate teachers on the number and type of degrees they hold, or that we should evaluate schools on how much specialized funding they allocate – I can see why you were concerned if you thought that’s what I recommended.  My proposal is to evaluate teachers on the actions they take in pursuit of student outcomes and is something I’m excited to discuss with you.

However, I think it’s important first to discuss my statement about VAM usage more thoroughly because the sound bites and conclusions drawn in and from many of the pieces you link are inconsistent with the actual research findings.  For example, if you read the entirety of the report that spawned the first article you link, you’ll notice that there’s a very low correlation between teacher value added scores in consecutive years.  I’m passionate about accurate statistical analyses – my background is in mathematical and computational sciences – and I try to read the full text of education research instead of press releases because, as I’ve written before, “our students…depend on us to [ensure] that sound data and accurate statistical analyses drive decision-making. They rely on us to…continuously ask questions, keep an open mind about potential answers, and conduct thorough statistical analyses to better understand reality.  They rely on us to distinguish statistical significance from real-world relevance.”  When we implement evaluation systems based on misunderstandings of research, we not only alienate people who do their jobs well, but we also make bad employment decisions.

My original statement, which you only quoted part of in your response, was the following: “Using value added modeling (VAM) as a percentage of an evaluation actually reduces the likelihood of better future student outcomes because VAM results have more to do with random error and outside-of-school factors than they have to do with teaching effectiveness.”  This statement is, in fact, accurate.  The following are well-established facts in support of this claim:

– As I explained in my post, probability theory is extremely clear that decision-making based on results yields lower probabilities of future positive results when compared to decision-making based on factors people completely control.

– In-school factors have never been shown to explain more than about one-third of the opportunity gap.  As mentioned in the Shanker Blog post I linked above, estimates of teacher impact on the differences in student test scores are generally in the ballpark of 10% to 15% (the American Statistical Association says it ranges from 1% to 14%).  Teachers have an appreciable impact, but teachers do not have even majority control over VAM scores.

Research on both student and teacher incentives is consistent with what we’d expect from the bullet points above – researchers agree that systems that judge performance based on factors over which people have only limited control (in nearly any field) fail to reliably improve performance and future outcomes.

Those two bullet points, the strong research that corroborates the theory, and the existence of an alternative evaluation framework that judges teachers on factors they completely control (which I will talk more about below) would essentially prove my statement even if recent studies hadn’t also indicated that VAM scores correlate poorly with other measures of teacher effectiveness.  In addition, principal Ted Appel astutely notes that, “even when school systems use test scores as ‘only a part’ of a holistic evaluation, it infects the entire process as it becomes the piece [that] is most easily and simplistically viewed by the public and media. The result is a perverse incentive to find the easiest route to better outcome scores, often at the expense of the students most in need of great teaching input.”

I also think it’s important to mention that the research on the efficacy of class size reduction, which you seem to oppose, is at worst comparable to the research on the accuracy of VAM results.  I haven’t read many of the class size studies conducted in the last few years yet (this one is on my reading list) and thus can’t speak at this time to whether the benefits they find are legitimate, but even Eric Hanushek acknowledges that “there are likely to be situations…where small classes could be very beneficial for student achievement” in his argument that class size reduction isn’t worth the cost.  It’s intellectually inconsistent to argue simultaneously that class size reduction doesn’t help students and that making VAM a percentage of evaluations does, especially when (as the writeup you linked on Tennessee reminds us) a large number of teachers in some systems that use VAM have been getting evaluated on the test scores of students they don’t even teach.

None of that is to say that the pieces you link are devoid of value.  There’s some research that indicates VAM could be a useful tool, and I’ve actually defended VAM when people confuse VAM as a concept with the specific usage of VAM you recommend.  Though student outcome data shouldn’t be used as a percentage of evaluations, there’s a strong theoretical and research basis for using student outcomes in two other ways in an input-based evaluation process.  The new teacher evaluation system that San Jose Unified School District (SJUSD) and the San Jose Teachers Association (SJTA) have begun to implement can illustrate what I mean by an input-based evaluation system that uses student outcome data differently and that is more likely to lead to improved student outcomes in the long run.

The Teacher Quality Panel in SJUSD has defined the following five standards of teacher practice:

1) Teachers create and maintain effective environments for student learning.

2) Teachers know the subjects they teach and how to organize the subject matter for student learning.

3) Teachers design high-quality learning experiences and present them effectively.

4) Teachers continually assess student progress, analyze the results, and adapt instruction to promote student achievement.

5) Teachers continuously improve and develop as professional educators.

Note that the fourth standard gives us one of the two important uses of student outcome data – it should drive reflection during a cycle of inquiry.  These standards are based on observable teacher inputs, and there’s plenty of direct evidence evaluators can gather about whether teachers are executing these tasks effectively.  The beautiful thing about a system like this is that, if we have defined the elements of each standard correctly, the student outcome results should take care of themselves in the long run.

However, there is still the possibility that we haven’t defined the elements of each standard correctly.  As a concrete example, SJTA and SJUSD believe Explicit Direct Instruction (EDI) has value as an instructional framework, and someone who executes EDI effectively would certainly do well on standard 3.  However, the idea that successful implementation of EDI will lead to better student outcomes in the long run is a prediction, not a fact.  That’s where the second usage of student outcome data comes in – as I mentioned in my previous post, we should use student outcome results to conduct Bayesian analysis and figure out if our inputs are actually the correct ones.  Let me know if you want me to go into detail about how that process works.  Bayesian analysis is really cool (probability is my favorite branch of mathematics, if you haven’t guessed), and it will help us decide, over time, which practices to continue and which ones to reconsider.
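To sketch what this second usage might look like in practice, here is a hypothetical sequence of semester-by-semester updates to a district’s confidence in an instructional framework.  Every number is an assumption for illustration – the point is the mechanism, not the values:

```python
def update(prior, p_hit_if_good, p_hit_if_bad):
    """Update P(framework is good) after observing one semester's result."""
    evidence = p_hit_if_good * prior + p_hit_if_bad * (1 - prior)
    return p_hit_if_good * prior / evidence

# Assumed: we start 70% confident the framework helps, well-executed
# classrooms hit their growth targets 60% of the time if it's good and
# 30% of the time if it isn't.
belief = 0.70
for hit_target in [True, False, False, True, False]:  # five semesters of data
    if hit_target:
        belief = update(belief, 0.60, 0.30)
    else:
        belief = update(belief, 1 - 0.60, 1 - 0.30)  # miss probabilities
print(round(belief, 2))  # belief drifts down from 0.70
```

If the posterior fell below some agreed-upon threshold, the Teacher Quality Panel would revisit the framework – while teachers who executed it faithfully would still be rated on that execution, just like the recipe-following cook.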

I certainly want to acknowledge that many components of systems like IMPACT are excellent ones; increasing the frequency and validity of classroom observations is a really important step, for instance, in executing an input-based model effectively.  We definitely need well-trained evaluators and calibration on what great execution of given best practices looks like.  When I wrote that I’d like to see StudentsFirst “focus on defining and implementing best practices effectively,” I meant that I’d like to see you make these ideas your emphasis.  Conducting evaluations on this sort of input-based criteria would make professional development and support significantly more relevant.  It would help reverse the teach-to-the-test phenomenon and focus on real learning.  It would make feedback more actionable.  It would also help make teachers and unions feel supported and respected instead of attacked, and it would enable us to collaboratively identify both great teaching and classrooms that need support.  Most importantly, using these kinds of input-based metrics is more likely than the current approach to achieve long-run positive outcomes for our students.

Part 2 of the conversation, posted on August 11, can be found here.


Vergara v. California Panel Discussion with Leadership for Educational Equity

Leadership for Educational Equity (LEE), Teach For America’s (TFA’s) partner organization that focuses on alumni leadership development, held an online panel for members interested in learning more about Vergara v. California on June 26.  I was excited to receive an invitation to speak on the panel – I enjoyed talking to LEE members about how teachers unions benefit low-income students at an earlier event and appreciate LEE’s recent efforts to include organized labor in their work.  LEE received over 100 RSVPs from TFA corps members and alumni who tuned in to hear our discussion of the case.

LEE Panel Vergara

USC Professor of Education & Policy Katharine Strunk, Georgetown Professor of Law Eloise Pasachoff, and former Assistant Secretary of Civil Rights for the US Department of Education Russlynn Ali joined me for an engaging hour-long session.  Each of the panelists had ample time to make opening and closing remarks and to respond to each other’s points.  You can listen to the full audio for yourself below, but I also wanted to summarize two points I made at the end of the session:

1) It’s important to read the full text of education research articles because the findings are frequently misconstrued.  As I mentioned during my initial remarks, there’s a pretty strong research basis behind the idea that teachers are the most important in-school factor related to student success (though it’s important to remember that in-school factors, taken together, seem to account for only about 20% of student achievement results).  Nobody disagrees that teacher quality varies, either – it’s clear that low-income students sometimes have teachers who aren’t as high-quality as we would like.  Additionally, there’s broad consensus that improving teacher quality and addressing inequities between low-income and high-income schools are both important objectives.  The research does not suggest, however (and the plaintiffs did not show at trial), that there is a causal link between teacher employment law and either teacher quality issues or inequities between low-income and high-income schools.  There’s plenty of rhetoric about how employment law causes inequity but no actual evidence supporting that claim.  The other panelists and I unfortunately didn’t have enough time to engage in substantive conversations about the validity of the research we discussed, but I hope we have the opportunity to do so in the future.

2) Most union members and most people working within reform organizations have the same goals and should be working together.  We should therefore consider our rhetoric carefully.  Instead of insinuating that the unions who defend teacher employment law care more about protecting bad teachers than helping students, reformers could ask unions how more sensible reforms could make sure the execution of the laws aligns with the ethical, student-oriented theory.  Reformers could then signal their support for organized labor and work with unions to address the real root causes of teacher quality issues and inequities between schools.  The other panelists indicated their belief in reasonable due process protections, improved teacher evaluation and support, and equitable school funding, and kids would benefit if reformers and unions united behind these causes and pursued them with the same vigor with which some have jumped on the Vergara bandwagon.

You can hear more of my thoughts beginning about 22 minutes and 30 seconds into the clip, though I’d encourage you to listen to the whole thing if you have the time.  I’d also love to discuss the case more in the comments with anyone interested.  Hope you enjoy the panel!

Note: An earlier version of this post called LEE “Teach For America’s alumni organization.”  The reference has been changed to reflect that, while LEE focuses on leadership development for TFA alumni, they are an independent organization.

Update (7/19/14): The following sentence was modified to clarify that addressing teacher quality issues and addressing inequities between low-income and high-income schools are distinct tasks: “Additionally, there’s broad consensus that improving teacher quality and addressing inequities between low-income and high-income schools are both important objectives.”  The original sentence read: “Additionally, there’s broad consensus that improving teacher quality and addressing inequities between low-income and high-income schools is important.”


Informed Student Advocates Pursue Reforms that, Unlike Vergara v. California, Actually Address Inequity

Judge Rolf Treu just ruled in favor of Students Matter in Vergara v. California, deeming teacher permanent status (commonly called “tenure”), due process protections for teachers with permanent status, and seniority-based layoffs unconstitutional.  Treu’s opinion unfortunately reflects a misunderstanding of education research and teacher employment law’s effects.  His decision also erodes labor protections without increasing the likelihood of an excellent education for students in low-income communities.

Reformer excitement about the ruling demonstrates how successfully the plaintiffs have conflated teacher employment law with the existence of ineffective teachers.  Informed advocates for low-income students and communities, on the other hand, are deeply disappointed because both ethical considerations and a thorough analysis of the case indicate the error in Treu’s findings.

The California Teachers Association (CTA) plans to appeal the decision and higher courts will hopefully see through the plaintiffs’ weak case.  No matter the appeal’s outcome, Treu’s opinion raises two issues considerably more significant for low-income students than teacher dismissal and layoff procedures:

1) Teacher evaluation and support practices: Treu wrote that 18+ months of employment is not “nearly enough time for an informed decision to be made regarding the decision of tenure,” arguing that administrator fear of permanent status deprives “teachers of an adequate opportunity to establish their competence.”  He wants “to have the tenure decision made after” California teachers finish BTSA, an induction program teachers must complete to clear their credentials, and he suggests a timeline of three to five years.

Treu is correct that some ineffective teachers are currently retained and some good teachers are currently dismissed under California’s system, but he’s wrong about the primary reason why.  Instead, inadequate approaches to teacher evaluation and a lack of quality teacher support have long hindered the development and retention of excellent teachers.  Nearly two years is far longer than a supervisor should need to evaluate teacher performance and potential for growth if evaluation systems provide frequent opportunities for meaningful feedback and support about specific teacher practices.

Unions and many reform organizations actually agree about the goals of teacher evaluation.  The New Teacher Project (TNTP), for example, believes that “the core purpose of evaluation must be maximizing teacher growth and effectiveness, not just documenting poor performance as a prelude to dismissal.”  Similarly, CTA believes that “the purpose of an effective teacher development and evaluation system is to inform, instruct and improve teaching and learning; to provide educators with meaningful feedback on areas of strength and where improvement is needed; and to ensure fair and evidence-based employment decisions.”  Though reformer support for the use of standardized test score results as a percentage of teacher evaluations may decrease teaching quality and detract from student learning, TNTP and CTA also agree about many areas in which evaluation practices need improvement: the training administrators receive on how to give meaningful feedback, the quality of professional growth plans and professional development opportunities, and the frequency and length of classroom observations.

Extending new teachers’ probationary periods indefinitely will not address the underlying causes of the problem Treu identifies.  In fact, the argument that two years isn’t “nearly enough time” implicitly grants license for administrative incompetence and practices that inadequately address new teachers’ professional needs.  Education stakeholders committed to developing and identifying great new teachers should instead pour their time, money, and energy into aligning evaluation and support systems with their goals.  San Jose Unified School District (SJUSD) and the San Jose Teachers Association (SJTA), for example, have invested in administrator training, evaluative consulting teachers with content-area teaching expertise, evaluation documents that more accurately define effective teaching and require narrative feedback, a Teacher Quality Panel consisting of both teacher and administrator members, and non-evaluative instructional coaching support.

2) School funding: Treu’s ruling erroneously considers Vergara v. California part of a historical record of education-related court cases including Brown v. Board of Education, Serrano v. Priest, and Butt v. California.  These three cases, unlike Vergara, dealt with undebatable and direct inequities in access to educational opportunity for low-income and minority students: segregated schools (Brown), inequitable access to school funding (Serrano), and inequitable access to a full school year (Butt).  Treu fails to note that, despite the Serrano case and the advent of California’s new Local Control Funding Formula (LCFF), major inequities in education funding persist in California today.

In 2012-2013, for example, SJUSD received approximately $9,000 per pupil in revenue.  During the same year, Palo Alto Unified School District (PAUSD) received about 60% more money per pupil, approximately $14,500.  While California guarantees a certain amount of annual funding called a “revenue limit” to every school district in the state, some districts, like PAUSD, bring in property tax revenues that exceed the revenue limit.  These “basic aid” districts keep their excess property tax revenue and often pass parcel taxes that further increase the funding discrepancy between lower-income districts and their higher-income basic aid counterparts.

More funding is not a panacea for low-income schools – how districts spend their money determines its return – but research is clear that funding matters a great deal.  Politicians who cut education-related spending for poor communities often cite a 33-year-old study by Eric Hanushek to oppose equitable school funding, yet even Hanushek himself cautiously supports it.  Asked in a 2006 interview if “it’s a good idea to give very high-poverty districts more funding per pupil than an average district,” Hanushek responded: “I think so. I think you have to provide extra resources and help for kids who start at a lower point because of their backgrounds.”  It’s impossible to simultaneously support educational equity and justify the funding discrepancy between SJUSD and PAUSD.

One of the most important provisions of the LCFF – the supplemental funding it provides to districts that serve high numbers of English language learners, students from low-income families, and students from foster homes – moves California in the right direction.  However, basic aid districts that have long been able to afford better resources for students will continue to exist.  Based on the case history Treu cites, one could construct a very strong case that the existence of basic aid districts violates the Equal Protection Clause of the Fourteenth Amendment and the California Constitution.  Advocates for low-income students could also make an indirect equal protection case about Proposition 13’s effect on school funding disparities.  Unlike Vergara v. California, these cases could continue the tradition of Brown, Serrano, and Butt by remedying a clear instance of educational inequity.

Treu’s ruling also invites an analysis of the definition of appropriate due process.  The judge asserts that “[t]here is no question that teachers should be afforded reasonable due process when their dismissals are sought,” but he claims that current protections for teachers with permanent status constitute “uber due process.”  Treu proposes replacing teacher dismissal law with the rights guaranteed by the decision in Skelly v. State Personnel Board; because of Skelly, permanent employees facing dismissal must receive “notice of the proposed action, the reasons therefor, a copy of the charges and materials upon which the action is based, and the right to respond, either orally or in writing, to the authority initially imposing discipline.”

In essence, Skelly rights ensure that employers treat permanent employees with some semblance of courtesy and respect.  While Treu asserts that due process considerations are “entirely legitimate,” however, he forgets to mention that probationary teachers do not have Skelly rights; in California, probationary teachers can be non-reelected (fired) without cause.  Given current law, Treu’s argument is self-contradictory: he simultaneously contends that he believes in the concept of due process and that districts should be able to deprive people of it for three to five years.

Labor organizations support Skelly’s basic protections for all employees because of the extensive history of inappropriate employer practices and a belief in treating people fairly.  Due process protections should also include a requirement that administrators adequately support permanent teachers before attempting to dismiss them.  A support-first mindset is not only the most ethical approach, but it’s also important because, as Jack Schneider explains, “you don’t put…effective teacher[s] in every classroom by holding…sword[s] over their heads.  You do it by putting tools in their hands.”  Advocates for workers’ rights support streamlined dismissal processes for employees who are unwilling or unable to improve; the defendants in Vergara just know that society and schools benefit when employers are required to treat their employees like human beings.

Judge Treu accurately identifies a few key issues in his decision: administrators may struggle to identify quality teaching in fewer than two years, layoffs may deprive schools and students of stellar teachers, and teacher employment law may fail to grant teachers an appropriate amount of due process.  Unfortunately, Vergara v. California neither improves teacher evaluation and support practices nor rectifies the funding inequities that lead to layoffs and resource cutbacks in districts that serve low-income students.  The decision also ignores the complete lack of due process afforded to probationary teachers and fails to deliver a thoughtful recommendation about how to empower teachers to grow professionally.  Informed, honest student advocates who care more about “providing each child…with a basically equal opportunity to receive a quality education” than about destroying organized labor should therefore hope that an appeals court will reverse Treu’s decision.  In the meantime, they should begin work on reforms more likely to improve opportunities for low-income students.

Note: A version of this post appeared on The Huffington Post on June 13.


Filed under Education, Labor

The Problem with Outcome-Oriented Evaluations

Imagine I observe two poker players playing two tournaments each. During their first tournaments, Player A makes $1200 and Player B loses $800. During her second tournament, Player A pockets another $1000. Player B, on the other hand, loses $1100 more during her second tournament. Would it be a good decision for me to sit down at a table and model my play after Player A?

For many people the answer to this question – no – is counterintuitive. I watched Player A and Player B play two tournaments each and their results were very different – haven’t I seen enough to conclude that Player A is the better poker player? Yet poker involves a considerable amount of luck and there are numerous possible short- and longer-term outcomes for skilled and unskilled players. As Nate Silver writes in The Signal and the Noise, I could monitor each player’s winnings during a year of their full-time play and still not know whether either of them was any good at poker. It would be fully plausible for a “very good limit hold ‘em player” to “have lost $35,000” during that time. Instead of focusing on the desired outcome of their play – making money – I should mimic the player who uses strategies that will, over time, increase the likelihood of future winnings. As Silver writes,

When we play poker, we control our decision-making process but not how the cards come down. If you correctly detect an opponent’s bluff, but he gets a lucky card and wins the hand anyway, you should be pleased rather than angry, because you played the hand as well as you could. The irony is that by being less focused on your results, you may achieve better ones.
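Silver’s point – that short-run results say almost nothing about skill – can be sketched with a quick simulation.  Every number below (expected values, standard deviations, the normal model of tournament winnings) is an assumption chosen purely for illustration:

```python
import random

random.seed(42)

# Hypothetical players: per-tournament winnings are noisy draws around a
# true skill level (the expected value). These figures are assumptions,
# not data from the post.
SKILLED_EV, UNSKILLED_EV, SD = 100, -50, 1000

def total_winnings(ev, n_tournaments):
    """Simulated winnings over n tournaments for a player with the given EV."""
    return sum(random.gauss(ev, SD) for _ in range(n_tournaments))

trials = 100_000

# How often does the *unskilled* player out-earn the skilled one over a
# sample of just two tournaments each?
upsets = sum(
    total_winnings(UNSKILLED_EV, 2) > total_winnings(SKILLED_EV, 2)
    for _ in range(trials)
)
print(f"Unskilled player comes out ahead in {upsets / trials:.0%} of samples")
```

With these assumed numbers, the unskilled player out-earns the skilled one in roughly 44% of two-tournament samples – watching two tournaments reveals almost nothing about who plays better.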

As Silver recommends for poker and Teach For America recommends to corps members, we should always focus on our “locus of control.” For example, I have frequently criticized Barack Obama for his approach to the Affordable Care Act. While I am unhappy that the health care bill did not include a public option, I couldn’t blame Obama if he had actually tried to pass such a bill and failed because of an obstinate Congress. My critique lies instead with the President’s deceptive work against a more progressive bill – while politicians don’t always control policy outcomes, they do control their actions. As another example, college applicants should not judge their success on whether or not colleges accept them. They should evaluate themselves on what they control – the work they put into high school and their applications. Likewise, great football coaches recognize that they should judge their teams not on their won-loss records, but on each player’s successful execution of assigned responsibilities. Smart decisions and strong performance do not always beget good results; the more factors that stand between our actions and the desired outcome, the less predictive power the outcome can give us.

Most education reformers and policymakers, unfortunately, still fail to recognize this basic tenet of probabilistic reasoning, a fact underscored in recent conversations between Jack Schneider (a current professor and one of the best high school teachers I’ve ever had) and Michelle Rhee. We implement teacher and school accountability metrics that focus heavily on student outcomes without realizing that this approach is invalid. As the American Statistical Association’s (ASA’s) recent statement on value-added modeling (VAM) clearly states, “teachers account for about 1% to 14% of the variability in [student] test scores” and “[e]ffects – positive or negative – attributed to a teacher may actually be caused by other factors that are not captured in the model.” Paul Bruno astutely notes that the ASA’s statement is an indictment of the way VAM is used, not the idea of VAM itself, yet little correlation currently exists between VAM results and effective teaching. As I’ve mentioned before, research on both student and teacher incentives suggests that rewards and consequences based on outcomes don’t work. When we use student outcome data to assign credit or blame to educators, we may close good schools, demoralize and dismiss good teachers, and ultimately undermine the likelihood of achieving the student outcomes we want.

Better policy would focus on school and teacher inputs. For example, we should agree on a set of clear and specific best teaching practices (with the caveat that they’d have to be sufficiently flexible to allow for different teaching styles) on which to base teacher evaluations. Similarly, college counselors should provide college applicants with guidance about the components of good applications. Football coaches should likewise focus on their players’ decision-making and execution of blocking, tackling, route-running, and other techniques.

[Input/output graphic]

When we evaluate schools on student outcomes, we reward (and punish) them for factors they don’t directly control.  A more intelligent and fair approach would evaluate the actions schools take in pursuit of better student outcomes, not the outcomes themselves.

Outcomes remain incredibly important to monitor and consider when selecting effective inputs, of course. Through a process called Bayesian analysis, we can use outcomes to continually update our assessment of whether or not our strategies are working. If we observe little correlation between successful implementation of our identified best teaching practices and student growth for five consecutive years, for instance, we may want to revisit our definition of best practices. A college counselor whose top students are consistently rejected from Ivy League schools should begin to reconsider the advice he gives his students on their applications. Relatedly, if a football team suffers through losing season after losing season despite players’ successful completion of their assigned responsibilities, the team should probably overhaul its strategy.
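That kind of updating can be sketched with a simple Beta-Bernoulli model – one standard way to formalize Bayesian updating, though not one the post itself specifies.  The five yearly observations below are hypothetical (1 means implementing the identified best practices coincided with student growth that year):

```python
# Hypothetical record over five consecutive years: did successful
# implementation of our best-practices list coincide with student growth?
observations = [1, 0, 0, 0, 0]

# Uniform Beta(1, 1) prior: no initial opinion about the practices.
alpha, beta = 1, 1
for year, hit in enumerate(observations, start=1):
    alpha += hit          # successes observed so far (plus prior)
    beta += 1 - hit       # failures observed so far (plus prior)
    mean = alpha / (alpha + beta)
    print(f"After year {year}: P(practices work) estimate = {mean:.2f}")
```

Under these assumed observations, the estimate falls from 0.67 after a promising first year to about 0.29 after year five – a quantitative signal that it is time to revisit the definition of best practices.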

The current use of student outcome data to make high-stakes decisions in education, however, flies in the face of these principles. Until we shift our measures of school and teacher performance from student outputs to school and teacher inputs, we will unfortunately continue to make bad policy decisions that simultaneously alienate educators and undermine the very outcomes we are trying to achieve.

Update: A version of this piece appeared in Valerie Strauss’s column in The Washington Post on Sunday, May 25.


Filed under Education, Philosophy