Teacher Stories Episode 4 - Meddlesome Models with Peg Cagle

August 25, 2020

Transcript:

Welcome to episode four of Teacher Stories.

It’s been quite a while since our last episode and a lot has happened. I think the last time I was recording, it was pre-pandemic. And I think it’s likely that a lot of the listeners of this podcast are going to be facing a very difficult time, especially now as September is coming close. And a lot of people are probably being put into situations where they have to make some really hard decisions. Regardless of what situation you might find yourself in the coming months, I hope you’re staying safe and I hope you’re able to find some happiness in this difficult time.

So, on to the episode. It touches on some very big topics: standardized testing, evaluating teacher effectiveness, relationships between teachers and administration, relationships between teachers and parents, the portrayal of teaching within our culture and within the media.

So I found it really challenging to… make an episode out of all of this, because I wanted to pick a topic that was self contained and could have a satisfying ending. So I did my best and this partially explains why it’s been such a long time since the last episode.

The most exciting part of this episode for me was the person I was interviewing, Peg Cagle.

I recorded this during NCTM, the National Council of Teachers of Mathematics. Peg was volunteering, doing some monitoring of sessions. And that’s why you hear a little bit of people walking around and cheering in the background. Another friend of mine who was also attending NCTM, when he heard that I was going to be interviewing Peg, described her as “the teacher’s teacher”. And I think that is a very appropriate description. I’ll let Peg introduce herself, but I think something very impressive about her, from my point of view, is that she has an incredible resume and a ton of experience in all sorts of things. She is someone who could very easily be working in policy, could have her own consulting company, and do any number of things. But she consistently works with teachers, and she herself consistently returns to the classroom. And I think that’s just a really powerful message about how important she perceives this work to be.

So, without further ado, here’s Peg.

Interview

Denis: Okay, welcome to the podcast. To begin with, could you just introduce yourself and tell us a little bit about who you are?

Peg: Sure. My name is Peg Cagle. I am a classroom teacher currently in Los Angeles, in the San Fernando Valley. I currently teach high school at Reseda High School.

I very often say that I came to education late, but got pissed off early. I came in as a career changer having spent the first 15 years of my working life as an architect, but there were, honest to God, banner headlines in the LA Times, calling for the need for math teachers, and it was something I had already loved and thought, maybe this is someplace I can make even a bigger contribution than I can. What better legacy than the next generation of people, rather than just putting up another building someplace. So I switched careers and spent 17 years in the classrooms, stepped out to do a fellowship in DC, spent some time in higher-ed and found my way back to where I think the most important work is being done, which is in classrooms with kids.

Denis: How does your experience being an architect compare to your experience being a teacher, how would you compare those two professions?

Peg: I think they’re so profoundly different in so many ways, because every bit of being an architect has resonance in terms of it being a profession in terms of it being revered. As you might imagine, I get a very different reaction in parties when I used to tell people I’m an architect, especially as a woman, versus now when I tell them I’m a teacher and I teach mathematics. You can, I can’t even begin to paint the picture of how profoundly different those two reactions are, and I think that sort of summarizes the experience.

Denis: Okay. And, just so we can get to know you a little bit better, can you tell us something about yourself that doesn’t have anything to do with teaching or math or your professional life?

Peg: Oh my goodness. Something that has nothing to do with teaching or math or my professional life. Wow. I went to nine schools by the time I entered ninth grade and no, it was not because I got thrown out of the first eight. So

Denis: yeah, you moved around a lot.

Peg: I moved around a lot. Yeah.

Denis: Well, so we’re here to record a story. Can you give us a little bit of a setup? How long have you been teaching at that time? What sort of school were you at?

Peg: Okay, so this was in my 17th year of teaching. It actually came at a very funny moment because I had already decided that I was stepping out of the classroom and trying to look for a place with bigger levers, if you will, a place to make a bigger impact. And I was working in a magnet program within LA Unified at a middle school, a program for gifted and highly gifted students, teaching primarily high school curriculum to mathematically precocious middle school kids.

Denis: Yeah, so take us through, what happened.

Peg: Okay. So this was back in the era when No Child Left Behind had been passed. It had been written originally as a civil rights bill to try and really start to level the playing field, to really look at some of the persistent inequities that existed within the fields of education. And part of that was not just looking at what was going on with kids, but looking on what was going on at schools and looking at what was going on with teachers. And a component of that was this call for rating teachers and creating models for rating teachers. And, one of the really popular models at that time was pushed very heavily by a group of economists, who had used these sort of economic models in other arenas and to look at other types of businesses, and they really took it and overlayed it on top of education without any sort of recognition that the model had been built originally to measure soybean growth. So, okay. Yeah.

Denis: So what was your direct experience with this, with this model?

Peg: Absolutely. So, LA Unified decided to buy into this idea about value added, and the notion that somehow by looking at kids’ test scores, you could decide if the teacher had actually added any value to this child in terms of their education.

Denis: Okay. So, they would take your standardized test scores. They would do some number crunching, right, using the statistical model.

Peg: Exactly.

Denis: And, and what was the, what was the outcome?

Peg: So the outcome was you could come out in one of, I think, four categories. There was highly effective, effective, ineffective, and highly ineffective.

Denis: Did you know ahead of time that they were going to be rating you this way?

Peg: We had heard sort of that there was going to be these new metrics being used and that there was this call for value added measures, and I think a lot of us sort of thought, Oh, okay, so we’re going to give them. Imagine if we gave them the final on day one when they walked into our classrooms and then gave them the final on day 180 and I would be fine with saying, what has this child actually learned in this topic in your room, because we measured their knowledge on that topic on day one and day 180.

Without saying, wait a minute. So on the seventh grade test, they did this, and on the eighth grade test, they did that. And somehow we’re supposed to have shown growth means that you supposed to have made a C student into a B student and a B student into an A student. You know? So I don’t think that any of us, even those of us who were more typically aware, which would have included me, had any clue what these models really were or what these metrics were going to mean or how they were going to be turned into statements about who we were as teachers and what our competency level was.

Denis: So even beyond being measured in this way within the school, I think you mentioned that there was kind of a public release of information. Can you say a little bit more about that?

Peg: So one of the things that happened at the end of this. It’s always interesting to see when districts decide they need to be very, very transparent, and when they obfuscate like nobody’s business. And for whatever reason they decided that the way to truly, you know, honor the public and honor the kids that we were trying to serve was to publish these scores and turn all of this information over to the Los Angeles Times. And list upon list, page upon page of teachers’ names with their scores, with their ratings, were published in the spring of 2011.

And, it was devastating, and it was at the time they published all the elementary teacher scores. I don’t know if it was because secondary was lagging or just the way they rolled it out and there was a huge backlash for a variety of reasons. There was an instance of, I think it was a fourth grade teacher, but one of the teachers whose information had been published, who worked in an incredibly challenging school whose administration and parents had nothing but praise for him, so devastated by his ratings. He wound up committing suicide.

Denis: Oh my god, that’s awful.

Peg: And so they pulled back and they never did publish the secondary scores, but they did send them out to us and they did send them out to our schools.

Denis: And I actually did, I did a little bit of research on this. And you can still find these ratings. They have the school accountability project where basically you can look up, any school and they’ll give you like a sort of a value added rating of the school. So maybe they scaled it back from like identifying individual teachers, but they are definitely still ranking academic institutions in this very public way.

So, what was your reaction? Do you remember hearing about the report and what’s going on, finding out about your rating? What was your rating, by the way?

Peg: So, yeah, exactly. So at the time, because I was teaching in a secondary setting, we got rated based on the courses that we taught. And so I taught at the time, both geometry and algebra, and remember that this is in a middle school setting. So I had these very sort of mathematically advanced kids for the setting that I was working in. But, in geometry, I was rated as effective despite the fact that my kids outscored every other school in LA Unified, including the school that was highly gifted, and my school was mostly gifted kids. And in Algebra I was actually rated as highly ineffective at the exact same time.

Denis: Right. And this was an interesting contrast to other things that were going on.

Peg: Yeah. Yeah. So in 05, I won the presidential award for excellence in mathematics teaching for the state of California. In 2006, I had been named by Raytheon as a math hero. I had been selected by USA Today as part of, they used to do something, they called the All-USA teacher team of the top 10 teachers in the country, across all grades, all subjects. So yeah, I had a few things that were sort of, you know, messaging other than “God, you suck”.

Denis: How did you feel when you saw the ineffective rating?

Peg: Well, I was devastated. Of course. I was devastated because every one of us wants to be highly effective on behalf of our kids. Every single one of us is driven to give our students what they need and what they deserve, because that’s our job. But that’s also our calling and it’s our responsibility. It’s what’s been entrusted to us, and so I was very demoralized. It was really hard to go back into school with the same kind of enthusiasm.

Denis: It seems surprising to me actually, because you would think that someone with so many accolades and someone who was like very visibly recognized, consistently, you know. It might be easy to just shrug off. It’s like, Oh, those people, they don’t know what they’re talking about. It’s just bullshit. But that’s not what happens.

Peg: No, I mean, it still hits you because the reality is that they are using something. And I mean, I didn’t think of it at the time, but now, you know, when we think about how math is used as a weapon against individuals, whether we’re talking about some of the egregious inequities in ways which has been used as a tool of oppression. It was really used as a weapon against teachers.

It’s really hard not to look at pages of data and pages of analysis and think this has to mean something. Especially I think as a math, maybe as a math person, that hit even harder because I trusted mathematics. And so looking at those data was difficult. It was really difficult to maintain a sense of, “I know I’m doing good work because they have all these other pieces of evidence that tells me so”.

Denis: And despite that, here you’re looking at a number and there are certain elements of self doubt.

Peg: Yes, yes, yes, yes.

Denis: Do you remember any sort of reactions from your colleagues or any conversations that you’ve had with people during that time?

Peg: Well, I think it was another one of those things that serves to polarize and alienate and sort of balkanize faculty, because everybody was a little skittish about talking about it.

Like I said, the fact that they had released the elementary and so that had gone public, they hadn’t released the secondary, but we all knew that we had all gotten these emails and there were people who were kind of trying to test the waters because they kind of wanted to brag that they had been rated well, but then they didn’t quite want to, you know, alienate their, their colleagues.

So it was a very strange and tense environment to be coming to school in. And then I finally just started telling people. I got rated highly ineffective, but it was also a weird timing because by that point, I had actually just accepted an offer for a fellowship to go to Washington DC and serve as an Albert Einstein distinguished educator fellow.

And had further been not just selected for that, but then been selected to be what’s called a Hill fellow and had a place waiting for me as an education consultant in the US Senate.

Denis: So there must’ve been, quite a bit of relief when you owned up to your ineffective rating among the other faculty at the school.

Peg: I mean, it definitely was, and I think sort of one of the biggest things was that instantly my friend who was our union rep said, I want your data. I want your stuff. Because again, the notion that here, let’s really take a look at this people. We have all of this other evidence of quality work, and yet this rating says this, and the dissonance of that.

Denis: It seems so irresponsible to me to publish all of that information in isolation, right?

Because people reading that newspaper article don’t have access to any of this other evidence. I know it can be probably really upsetting for a parent to see something like that about your kid’s teacher.

So, you’ve looked at your data. Did you ever get any idea of why you may have been rated as ineffective?

Peg: Well, I did have some theories. One of them being that for algebra, a lot of my kids scored really, really well on the general math test, which is the test they would have taken immediately previous to the algebra test in terms of their years. And these were really strong kids who did really, really well.

And I think that probably the mean score for most of my students was maybe the 98 point something percentile. So they were coming in already, basically having maxed out on the test. So any drop was seen as a teacher deficit. Any drop by there. Right. And we know that algebra is a big step. It’s a big leap of abstraction.

And especially when we’re putting it down into lower grades with kids who have less, you know, sort of frontal cortex really fully engaged in that. That level of abstraction can be a huge cognitive load and there tends to be a sort of, if you will, slippage in scores. So the fact that those, that was the population about which I was scored as highly ineffective, mind you, I think at the end of that year, maybe my, my scores were in the 90th percentile.

All of those kids went on to do geometry the next year and pass. You know, they succeeded, and many of them are now engineers and doctorates, you know?

Denis: So it seemed like they were fine.

Peg: Yeah. I don’t honestly think I did egregious damage to any of them in terms of their academics or their algebra knowledge. Yeah.

Denis: So do you remember any conversations that you’ve had with your colleagues around that time?

Peg: So one of the biggest conversations was with a very dear friend and mentor and colleague of mine, Pam Mason, who for the last 10 years has been the director of Math for America in Los Angeles. And what we were talking about, this was shortly after the program had started and she was working with bringing new people into the profession. And the conversation was about the fact that, being 17 years in, having had the numerous things to bolster me and give me that perspective that I was doing good work on behalf of kids, it was still really, really hard to sit comfortably in the space with those reports when they came out. And what I kept sort of going back to was what damage this would do if you were a first or second year teacher and you were still getting your sea legs, if you will, still sorting it out, still deciding whether or not you were really going to invest and stay with this profession for the long haul.

And I truly, truly worried about how many people are we driving out right at a time when we’ve got this critical shortage and this looming shortage, which continues to really seriously be on the horizon. And it’s going to hit us one of these days is shortages for qualified STEM teachers, mathematics in particular.

And why would anybody stay when this is what they’re going to be confronted with?

Denis: Yeah, and this goes back to, sort of a comment earlier about being treated as an architect.

Peg: Yes.

Denis: Versus an educator.

Peg: Right.

Denis: Could you imagine a similar set of reports coming out about architects?

Peg: No, of course not. And the other thing is because of the fact of the general regard and the general sense within the culture.

I think any architect who saw something like this would feel completely empowered and completely appropriate in laughing it off.

And yet educators are made to be very, very vulnerable and we carry that with us all the time.

Denis: That’s really, I think that’s really good observation cause I’m definitely certain that something about the atmosphere of being a teacher and how you are treated as a teacher really makes people take this sort of thing to heart.

I’m wondering, did you have any interactions with your administration or with parents in the aftermath?

Peg: I don’t really remember having interactions with my administration. I had a good working relationship with my principal. He completely understood why I had made the decision of wanting to look for, as I say, larger levers. But because I think I had already made the commitment to step out. I didn’t, I didn’t invest deeply at the time in going to him and saying, okay, what are we going to do about this?

Which I in some ways regret, because whether or not it was about me, it would have made it better for those coming up behind me. Like I said, I did speak to my friend who also happened to be our union rep and said, hey, you want these? Cause I think they’re hilarious, you know, and sort of when you put it in perspective, and I could take those moments, didn’t mean I still didn’t go to that dark place.

But I could set it aside and say, Hey, you want this because this might be a powerful tool for you to use to, to come back. And, and, I talked to some friends who do policy work at the state level and say, so you want to, I’ve got a story for you. And shared it from those regards to try and push back.

LA Unified did come back and say that this was a first pass. Mind you, what resulted from that first pass were those scores that got published. But they went back then and played with some of the metrics. And I’m sure the listeners of your podcast must have heard of Desmos, but as I’ve said, it’s almost as if they took the little sliders and fiddled with things and people went from highly ineffective to highly effective and then back down to this.

And so we would get these sort of interim things through adjusting the metrics, and it felt like you were on the slide, like “wooo”.

Denis: Just going all over the place and, and that’s the thing that keeps surprising me about this is it’s almost like a public shaming. And it didn’t seem like the newspaper was really aware of the effect that they would have, it seems like they didn’t really take seriously the effect of what they were doing.

Peg: I don’t know where I would point my finger about the obscenity of making that information public in the way that they did. I think that the paper was absolutely complicit in it, but I think I would go first to the district in terms of what was the real motivation, because there’s so many times where journalists have sought to get information about the internal workings within our school district and have been completely rebuffed repeatedly. And yet in this case, it wasn’t just, “Oh, you want to know how our schools are doing and how our teachers are doing?” It was, “We will give you names and we will give you, you know, just this very, very fine detailed level,” completely stripping out any nuance, completely taking away any context. And I truly think it was for some purpose that I would say goes beyond just a lack of awareness. I think there was some intentionality behind it. I don’t think they meant for anybody to actually wind up harming themselves. I would never subscribe to that theory, but I definitely think that they had an agenda that this was serving.

Denis: Well. It certainly had a very divisive effect.

Peg: And at the same time they were incentivizing certain schools and they were incentivizing certain programs and there were, there were additional moneys attached to different ratings and all of that sort of stuff, which has also been pretty much debunked.

Denis: Yeah. So yeah, I want to go back on that and talk a little bit about sort of the reasoning behind these models, right? So the idea is that, by measuring student achievements year to year, you can infer what effects each teacher is having. And, you know, it seems sort of, like they’re coming from a good place, right?

And it seems like in practice it rarely ends up working out that way, that it is a positive thing. So, what are your thoughts on this approach of sort of scoring teachers and figuring out who’s doing a good job and who’s not?

Peg: If our goal is to actually have the best schools possible, then what we really need to be investing in is not trying to sort and rank and number, et cetera, but actually to elevate and to look at the particular circumstances of the work that teachers are doing within the context of the schools or within the context of the clusters within the neighborhoods and say, what is it that we can do to set up all the possible conditions in the best possible way to ensure that this teacher can do their best possible work for these kids on a daily basis. And then find places where, oh, and this teacher needs this additional information, or this teacher needs this additional extra resource.

That’s what value added really ought to be.

Denis: Looking back at this time, do you think about these events, you know, in your current work or as you moved on to other things?

Peg: Well, I mean, certainly one of the things that was really kind of just, again, sort of timing fluke was as I was stepping out of the district and this happened and then arriving in Washington DC. And one of the very first hearings that I went to was actually in a beautiful old building, on the Hill in the, on the Senate side in the Dirksen building.

And here is Ed Haertel, preeminent statistician, and Linda Darling-Hammond, who actually led President Obama’s transition team for education and is just an amazing scholar and fellow with the National Academies and all sorts of things. And they’re talking about value added measures being used for teaching as total junk science.

And it was just sort of put things into perspective in an interesting way. So, I mean. The conversation will always go on about such a thing as bad teachers. And I think we need to shift the narrative to good teaching and bad teaching and let go of this idea of bad teachers. Nobody gets into this work to do a bad job.

Let’s find ways to make all of us do the work better because the kids need that from us.

Denis: So, there might be some listeners who find themselves in a similar position where, I think right now it’s still extremely popular to use this mathematical modeling, statistical modeling to evaluate teacher performance.

If you are someone who had just received a rating, based on standardized test scores that you don’t understand and you’re feeling sort of dejected and you’re doubting yourself, what advice would you have for that person?

Peg: I think, do your research. Do your research. Look up the work that Linda Darling-Hammond has done, the work that Ed Haertel has done, but then reach out to your community. There are other people. You’re not the first person that’s had this done to you, because it really is done to you. And there are people who have fought back in a variety of ways, whether it’s by coming together and being vocal and blogging, but there are people who fought back, literally fought back by suing and have won.

So don’t think that you’re in it alone. You’re not. And for every person who elevates their voice, it’s somebody else further down the line who maybe isn’t going to have to fight this battle or isn’t going to have to fight it alone.

Denis: All right. Well I think that is a fantastic concluding message. Thank you very much for your time.

Closing

So I had a really great time talking to Peg and I want to thank her again for taking time out of her very, very busy schedule to squeeze me in for an interview.

After hearing the story, I became very interested in the LA Times publication and I spent a bunch of time reading the articles that they initially posted and kind of following up on the response from the community, the response from the schools and also from the research community.

So I think the place where it makes sense to get started is to understand what value added modeling is and where it came from.

So value added modeling, as many things tend to do, came out of an idea that really makes a lot of sense. Suppose you have two middle schools. At the first school, middle school A, students have 80% achievement on the state standardized assessment by the time they are leaving the 8th grade. At the second school, middle school B, students are getting 75% achievement. So the most naive thing to do would be to compare these achievement numbers directly. And then you could pretty comfortably declare that, yeah, it seems like middle school A is doing a better job than middle school B because they’re getting higher numbers.

But what if I told you that according to the same assessments, 5th graders that are slated to enter middle school A tend to have 90% achievement, and correspondingly 5th graders that are going to be entering middle school B tend to have 50% achievement? Now the picture looks a lot different. Middle school A is getting very high achieving students and they’re actually losing 10 percentage points over three years, and middle school B is getting students who are achieving at much lower rates and is making a dramatic improvement by bumping them up 25 percentage points.

So now we might say, clearly middle school B is doing a much better job. This is the sort of direct comparison that value added modeling is meant to address. Instead of comparing scores directly, first you make a prediction of what sort of scores you would expect students at each school to get, and then you would compare their actual scores to those predictions.
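To make the arithmetic concrete, here is a minimal Python sketch of the two-school example. It makes a big simplifying assumption: it treats each school's entering achievement rate as the "prediction" for its leaving rate, which is much cruder than a real value added model. The dictionary layout and the `gain` helper are just names I made up for illustration.

```python
# Achievement rates (percent) from the two-school example above.
schools = {
    "A": {"entering": 90, "leaving": 80},
    "B": {"entering": 50, "leaving": 75},
}


def gain(school):
    """Change in achievement, in percentage points, from entry to exit."""
    return school["leaving"] - school["entering"]


# Naive comparison: whichever school has the higher leaving rate "wins".
best_by_raw_score = max(schools, key=lambda name: schools[name]["leaving"])

# Growth comparison: compare each school to where its students started.
gains = {name: gain(school) for name, school in schools.items()}
best_by_growth = max(gains, key=gains.get)

print("Best by raw score:", best_by_raw_score)
print("Gains in percentage points:", gains)
print("Best by growth:", best_by_growth)
```

The two comparisons pick different schools, which is the whole motivation for looking at growth rather than raw scores.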

This is the approach that the LA Times took to arrive at their teacher rankings. They worked with a RAND Corporation economist named Richard Buddin. Buddin released a white paper describing the modeling approach he took to create the rankings. The prediction for each student was based on the student’s score from the previous year, whether the student was male or female, whether or not the student had ELL (English Language Learner) status, whether or not the student qualified for Title I programs, and whether the student started at LA Unified in first grade or kindergarten.

Based on these variables, Buddin generated a predicted score for each student, and then he compared it to the score that the student actually obtained in the teacher’s class. Averaging across all of the students for a given teacher, he was able to obtain the overall score for the teacher. The LA Times then used these scores to group teachers into categories. Each teacher was marked as either effective or ineffective. And then that categorization was published next to each teacher’s last name in their newspaper.
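The predict, compare, and average steps can be sketched in a few lines of Python. This is not Buddin's model: as a loud simplification, the prediction here uses only the prior-year score (a one-variable least squares line), and the records, teacher labels, and function names are all made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Made-up records: (teacher, prior-year score, current-year score).
records = [
    ("T1", 60, 68), ("T1", 70, 77), ("T1", 80, 86),
    ("T2", 60, 61), ("T2", 70, 70), ("T2", 80, 80),
]


def fit_line(xs, ys):
    """Ordinary least squares fit of ys on xs; returns (slope, intercept)."""
    mean_x, mean_y = mean(xs), mean(ys)
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return slope, mean_y - slope * mean_x


def teacher_scores(records):
    """Average (actual - predicted) score over each teacher's students."""
    priors = [prior for _, prior, _ in records]
    actuals = [actual for _, _, actual in records]
    slope, intercept = fit_line(priors, actuals)

    residuals = defaultdict(list)
    for teacher, prior, actual in records:
        predicted = intercept + slope * prior
        residuals[teacher].append(actual - predicted)
    return {teacher: mean(values) for teacher, values in residuals.items()}


scores = teacher_scores(records)
for teacher, score in sorted(scores.items()):
    print(teacher, round(score, 2))
```

Note that anything the prediction leaves out ends up in the residual, and so in the teacher's score. That is exactly the concern with a short list of predictor variables.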

But wait a minute. How accurately would you be able to predict a score of a student given just their gender, language learning status, Title I status and whether or not they attended kindergarten? Aren’t there other variables that matter too, like whether or not the student has behavioral issues or emotional issues? If the student is homeless and they get a lower score than predicted by the model, is that something that we can really blame on the teacher?

What if the school is in a really affluent neighborhood and SAT season is coming up and every one of the kids in the classroom is getting a personal tutor? Is that something that the teacher should get credited for? What about class composition? My partner is an elementary school teacher, and one year she had 28 students, two of whom had severe behavior problems and seven of whom had individualized education plans. Another year, at a different school, she had 16 students. There’s a huge difference between how much time and energy she could allot to each student between those two classrooms.

These are the kinds of things that aren’t taken into account by the LA Times model. Instead, everything is attributed to teacher effectiveness. So it’s no surprise that teachers might object to being evaluated in this way.

Proponents of value added modeling say that these are the sorts of things that happen for one student or another, but should average out given a sufficient amount of data. But in this situation, we’re talking about elementary teachers and data collected over only a handful of years. The LA Times included teachers with over 60 data points. That’s not a lot of data to draw such conclusions from. Besides, some factors can be consistent for a given teacher from year to year, and those never average out. For example, what schools they teach in, or whether or not the teacher is bilingual.
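Here is a rough simulation of the "averaging out" argument. It assumes something deliberately extreme: every simulated teacher is identical (a true effect of zero), and each student's score differs from the prediction only by random noise. The noise level of 15 points, the 200 teachers, and the function name are all assumptions I picked for illustration, not numbers from the LA Times analysis.

```python
import random
from statistics import mean, pstdev


def spread_of_teacher_averages(students_per_teacher, teachers=200,
                               noise_sd=15.0, seed=0):
    """Standard deviation of per-teacher average residuals when every
    teacher has the same true effect (zero) and only noise differs."""
    rng = random.Random(seed)
    averages = [
        mean(rng.gauss(0.0, noise_sd) for _ in range(students_per_teacher))
        for _ in range(teachers)
    ]
    return pstdev(averages)


spread_60 = spread_of_teacher_averages(60)
spread_600 = spread_of_teacher_averages(600)
print("Spread with 60 students per teacher:", round(spread_60, 2))
print("Spread with 600 students per teacher:", round(spread_600, 2))
```

Even though these simulated teachers are identical, their averages still differ from one another, and the differences shrink only slowly (roughly with the square root of the number of students). If you ranked them and labeled the bottom group "ineffective", the label would be pure chance.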

A bilingual teacher might be consistently assigned a higher fraction of ELL students, so they might have a much more challenging class year after year because of that. It was concerns like this that caused two researchers from the University of Colorado in Boulder, Derek Briggs and Ben Domingue, to conduct a follow-up study and publish their findings in a paper in 2011.

They criticize the Buddin paper on a number of points.

First, they note that Buddin didn’t include a sensitivity analysis in his paper. This is a common practice, and is a statistical technique that allows one to see how much the model is reporting something that’s real versus random fluctuations within the data.

Furthermore, they ran a bias analysis where they tried to predict a student’s previous scores based on which teacher’s classroom they ended up in, and they found that they were able to do this. This is strong evidence that students in LA Unified are not assigned to teachers’ classrooms randomly, and this severely impacts the conclusions of the model.

They also note the absence of some common variables in the analysis, like the composition of the class that the student is enrolled in. They found that including such variables in their analysis caused a large number of teachers to change their effectiveness rating.

I think this speaks to Peg’s experience when she was describing trying to understand these ratings, noticing that a lot of teachers were going between effective and ineffective when they were making small adjustments to the model.

The LA Times responded to the NEPC study with a headline, quote, “Separate Study Confirms Many Los Angeles Times Findings On Teacher Effectiveness”. This was quite a surprise to the study authors, who then had to follow up with a fact sheet explaining, line by line, how their analysis in fact disagreed strongly with the LA Times findings, and that their recommendation was to remove the article and the ratings from the LA Times.

This resulted in a dramatic back and forth in which the LA Times accused the NEPC study of being biased because the NEPC receives a lot of funding from teachers’ unions and organizations.

So my takeaway from all this is that value added modeling is a complex and delicate process. The LA Times analysis was extremely basic and it was irresponsible to present their results in the way that they did. Putting “ineffective” next to a teacher’s last name is a definitive verdict in what amounts to a public shaming of LA Unified elementary teachers. Such a presentation does not make clear the inherently imprecise nature of predictive modeling.

Towards the end of the interview, Peg mentioned the names of a couple of education researchers, Ed Haertel and Linda Darling-Hammond. I found reading their writings on value added models a stark contrast to the Buddin paper. Haertel and Darling-Hammond are ruthlessly conservative with the interpretations of their models. It’s clear that the models are estimates and the scope of what they can measure is limited.

Compare this to Buddin, who in his paper freely equates value added modeling results with teacher quality and teacher effectiveness. He also goes as far as to dismiss class sizes and teacher qualifications based on his model, and throws in a recommendation to base hiring and salary decisions on such models as well. There’s not a single word of caution or any acknowledgement that predicting student test scores might not say everything there is to know about a teacher’s effectiveness.

Especially at the elementary level, teachers are helping students develop social and emotional skills that allow them to navigate their lives inside and outside of school. Teachers are also shaping their students’ identities as artists, writers, scientists, and mathematicians. And this may not have a direct impact on the students’ test scores for years to come. But that doesn’t mean that it isn’t incredibly important.

So, what can you do when an evaluation like this is forced upon you? My first instinct when writing this episode was to try and challenge the model from a technical point of view. I was hoping that if I read the papers and dug into the math, I would be able to pinpoint some faulty assumption or technique and discredit the approach altogether.

I quickly learned that this isn’t a good angle of attack. There’s shorthand notation, assumptions, references, and precedents to unwind. I studied machine learning in graduate school and worked in web search professionally for two years. So I’m no stranger to applying models to data and interpreting their results, but getting to a definitive critique would entail an entire research project. And that’s only one relatively simple model applied to one set of data. I don’t have that kind of time and I’m sure teachers who find themselves in similar situations don’t either.

I think the exchange between the NEPC and the LA Times also illustrates how things can go wrong with this approach. Even if you do put in the time to do a research project, your critique can be misinterpreted, discredited or just ignored. If even the experts in this field can end up talking past each other, what hope is there for us mere amateurs?

Instead, I think there’s a better approach. Suppose that you were able to build a really good model. You took a lot of variables into account and reduced bias as much as possible. With this model, you can predict pretty accurately what grade a particular student should receive on their end of year assessment.

What should we do with this information? Assigning scores to individual teachers and publishing the results in the newspaper might cause public outrage among the members of the community and cause some teachers, especially those who scored well, to seek employment elsewhere. Maybe you’ll give bonuses to the teachers that do well, but that too might cause resentment. And it may lead to more teachers teaching to the test or trying to game the system in some other ways. What if you fire the worst performing teachers, but then find out that down the line teachers you fired ended up doing exceptionally well at other districts? You may find yourself in a situation where your scores are falling across the board, year after year as teacher morale is destroyed by your experiments.

The thing is that even with a good model, you still need to make decisions about what the model means and how its results should be used. And this is an entirely separate question that needs to be answered by policymakers and educators.

Unfortunately, this question is often skipped over. It’s taken for granted that a teacher’s ability to beat estimates on an annual standardized test summarizes their entire worth as an educator. Somehow, even when the models are questionable, as they currently are, the prescription is always to fire, shame or otherwise punish the teachers with the lowest scores.

But there’s nothing in the value added model or the surrounding research that states that this has to be the case. A district could just as easily decide to use the scores as an additional piece of information to confidentially communicate to the teacher, to help them reflect on their practice, or to target programs providing additional support.

And there’s another option. In my opinion, which is shared by many educators and researchers in this field, these models are not suitable for evaluating individual teachers, and it’s not worth the risk to try and use them this way.

Well, that is the end of the episode. Thank you again for listening. This has been the Teacher Stories podcast. You can find a transcript and discussion of this interview online at teacherstories.xyz. If you’re an educator and you have an interesting story to share, please get in touch with me. You can contact me at contact@teacherstories.xyz, and I’m always looking for interviews and I would love to record your story. So once again, thanks for listening. And until next time.
