In the following two posts, I ask teachers and administrators – especially in New York – to ponder this look at Teacher Effectiveness Ratings with an open mind. I think you will find the evidence and argument thought-provoking – worthy of further discussion and inquiry locally.
Can a chronically “failing” school be inhabited by 100% “effective” teachers? That’s the question that has caused Governor Cuomo to take a tough stance against the status quo of education in New York (for reasons known only to the Governor). His office recently released a grim document entitled The State of New York’s Failing Schools. The report rhetorically asks and tries to answer the opening question. I have no comment here on the politics or wisdom of this move by Gov. Cuomo. I’m interested, rather, in a dispassionate consideration of the larger question raised by the report: What is and what ought to be the relationship between teaching ratings and school quality measures?
The gist of the Governor’s Report. Here are the facts that frame the Governor’s report: In the 2013-2014 school year, the teacher evaluation system resulted in the following ratings for New York State:

  • 95.6 percent of teachers were rated Highly Effective and Effective
  • 3.7 percent of teachers were rated Developing
  • 0.7 percent of teachers were rated Ineffective

Yet, the report notes, the schools on the watch list are struggling with student achievement and have not shown much improvement over time:

  • ELA Proficiency 5.9% (vs. 31.4% statewide)
  • Math Proficiency 6.2% (vs. 35.8% statewide)
  • Graduation Rate 46.6% (vs. 76.4% statewide)

Here’s the key conclusion in the Governor’s report:

It is incongruous that 99% of teachers were rated effective, while only 35.8 % of our students are proficient in math and 31.4 % in English language arts. How can so many of our teachers be succeeding when so many of our students are struggling?

That is surely a reasonable question; it does seem incongruous on its face. We should all be willing to consider that question, regardless of our feelings about the Governor and the politics of reform. After the general case for pursuing this issue is made in the Cuomo report, each “failing” school is profiled in the Appendix. (The Governor’s Report calls these “failing” schools while the designation actually used by the Department of Education is “Priority” Schools.)
A closer look at the high school. Because I am greatly interested in high school reform, I decided to concentrate on that data. This also has the virtue of factoring out the tough and controversial new Common Core exams used in the lower grades, because the high school data are based on the widely-accepted Regents Exam results and graduation rates. The failing/priority determination was not made by the Governor’s Office. That designation is based on longstanding NYSED criteria for high schools: adequate growth in the English and Math Performance Index over two years, and a Graduation Rate of at least 60% with growth over two years. Strictly speaking, then, the designation “failing” in the Cuomo Report should read: “failed to show adequate improvement once targeted as an under-performing school.”
Below is a typical profile from the Governor’s report – one page for each “failing” school. As with many schools on the list, this high school has not made adequate progress over a ten-year period:

[Screenshot: sample school profile page from the Governor’s report]

Effectiveness Ratings – for the school. Alas, as you can see above, the Cuomo report inexplicably highlights only district teacher effectiveness scores, not school scores. (Though, in this case, since this district reports 0% Ineffective and Developing teachers, we can infer that this must be true for the school.) Nor, as you can see, are exam scores given on the report for the HS – just graduation rates.
So, to truly make the case the Governor wants to make, we need school-based Teacher Effectiveness ratings and school Regents Exam scores. Fortunately, with a little digging on the NYSED site, I was able to find all the school data I needed. I picked three struggling high schools from the Cuomo Report list and three of the top-rated high schools in New York City to compare. (All six schools are public schools.)
3 key questions. Before we look at any specific school data, let’s ponder three predictive questions. What do you think:

  1. Should Teacher Effectiveness ratings in struggling schools in general be lower than, equal to, or higher than such ratings in the most successful high schools?
  2. In a school that needs to improve and does not, what would be a reasonable expectation for the percentages of teachers in the 4 categories of Ineffective, Developing, Effective, and Highly Effective?
  3. Should teacher ratings in the most struggling schools be lower than the district or state average, more or less equal to it, or higher than average? Or should there be no correlations at all?

I think that you’ll be interested in trying to predict the answers and in what I found.
The data. Below are the teacher effectiveness scores from 6 NY public high schools. So: based on the data concerning teacher effectiveness ratings alone, which high schools are struggling and which are very successful on state outcome measures? Note that four different ratings appear for each school: the composite score; a state-calculated score based on test scores and value-added metrics; and two locally-generated scores.
The two scores that are most salient, then, are the bottom two, because they are based on locally-assigned ratings by administrators and on locally-developed growth measures proposed by teachers (along with other local measures, in some cases).

[Teacher effectiveness charts for six schools: 1. Brooklyn Latin, 2. Baccalaureate Global HS, 3. School of the Future, 4. Albany HS, 5. East HS, 6. Washington Irving HS]

From this data on six high schools, then, which three schools would you predict are struggling and which three are very successful? Let’s add a bit more suspense before revealing the answer.
School performance data for the six schools. Here is the data for the same high schools on Regents Exams and Graduation Rate trends:

[Graduation and Regents data charts for the same six schools]

You can thus fairly easily see which data reflect the successful vs. unsuccessful schools in absolute terms – the first three schools in this data. But can you pair each set of performance results with its partner teacher effectiveness data, above?
The results. Made your predictions? Here are the results: the schools are listed in the same order in both sets, so School #1 in the first list is School #1 in the second list, and Schools 1-3 are the successful ones. So: the teachers in the struggling schools are far more highly rated internally than the teachers in some of New York’s most successful schools. (Two of these successful schools are on many short lists as the best high schools in the City.) Indeed, in one of the struggling high schools (School #5), a school that has not made adequate improvement for 10 years, almost all teachers are rated Highly Effective locally!
Tentative conclusion. From this (limited) data we can infer that in a successful school – whether clearly improving or doing well in absolute terms, on credible exams and client survey results – the local teacher effectiveness ratings are often lower, sometimes far lower, than those provided locally to teachers in failing schools. So, there would appear to be some merit to the core premise of the Cuomo report, regardless of how mean-spirited the approach feels to many NY educators.
The Next Post. In Part 2, which I will post in a few days, I take a closer look at School #3 on this list – a successful high school in NYC – and compare its data to those from two other high schools in the City. New York City publishes a great deal of accountability data beyond test scores and teacher ratings, including survey data from teachers, students, and parents; and offers an extensive Quality Report for each school, based on site visits. So, we can come away far more confident as to whether the Teacher Effectiveness Ratings have merit or not in City schools. I will look at the data from the three schools and end with 4 recommendations on how to make the teacher effectiveness ratings more honest and valid – hence, credible.

12 Responses

  1. This just raises questions for me, which I am sure is your point.
    1. How are the schools that are very successful defined? Are they successful based on overall results or are they successful based on how much they have helped students grow?
    2. There is evidence that the rate at which one learns more information is based on how much one already knows and a consequence of this is that schools with incoming students who do not know as much suffer under value-added models (they have more room to grow, but the incoming knowledge of their students makes growth more challenging). There is also evidence that many external assessments suffer from a ceiling effect where students who already know a lot receive perfect or near perfect scores on the assessment and so how much they have actually grown is not known. The question is then, to what degree are the external measures biased toward measuring growth in students in the middle?
    3. Are we in fact measuring the right things to help support schools in improving? It seems to me that devoid of any research on what the successful schools are doing differently (and how these practices integrate into the rest of what they do) that an effectiveness score compresses the information we would want to know to make decisions too much.

    • NY has an improvement model, not an absolute threshold model. As you’ll see in the next post, a school with similar demographics, in Harlem, has a high rating on accountability from NY even though its absolute results are modest. But it has made steady progress and its Quality Report is very positive as is the teacher and student survey data. That there are schools that DO improve indicates that it is possible.
      Whether the metrics are the right ones – I think the ones in NYC are very solid, as you’ll see in the next post. The NY State ones are cruder. But gains in graduation rate surely seem like a necessary goal, if not a sufficient one, for a high school. When you see the sharp contrast between the student/teacher surveys in an improving school and an unimproving one, you’ll see that there are significant differences on lots of metrics.

  2. A few things that jump out to me:
    The 3rd school, an “effective” school, has more than half the teachers rated ineffective locally. How can that not be a dysfunctional environment – either at the teaching level, administrative level, or both? By what calculations was this determined to be an effective school?
    The “effective” schools had more teachers rated “ineffective” based on state growth factors than the ineffective schools. This undermines your point about local ratings a bit.
    Did you notice that almost all teachers at school #5, the school with 10 years of inadequate progress, were also rated highly effective based on “state growth factors”? What do you make of that?
    I have no idea how teachers at “ineffective” schools would compare to other teachers. We would be comparing apples and oranges on many different levels.

  3. Initially I thought that school #3 had over half its teachers receiving poor administrative evaluations. On further investigation, it appears that “locally selected” refers to local measures of achievement & “other measures” to administrative evaluations. I retract the dysfunctional comment.

  4. Correlation, which in this case appears to be no more than an artifact of the evaluation system, is not causation. The first thing that came to mind was ceiling effects, as mentioned in the first reply. I see that no examination of student demographics is being considered, no acknowledgement of the very wide range that exists in the capability of families and communities to ensure that students can show up at school ready, willing, and able to learn. Attempting to talk about schools while excluding any discussion of the nature of the students they serve seems to be a pointless exercise. The 7-state VAM study done by reformers that found that about 98% of teachers were highly effective comes to mind, mostly because they were stunned by the results, which did not align at all with their assumptions. Naturally they rejected the findings of their own inquiry for that reason. Mr. Vollmer’s tale of personal transformation, as described in the blueberry story, also comes to mind for a different reason: he was persuaded by reality to reconsider his position, which he did, resulting in a reversal of his previous views. That kind of intellectual integrity is far too rare, if it even exists at all, among policy makers today. The unspoken assumption of Cuomo and all others who keep whipping the dead and buried horse of teacher ratings vs. student outcomes is that everything in every school is exactly the same except for the teachers, an obviously absurd idea. http://www.jamievollmer.com/blueberries.html

    • I have served as an administrator in multiple school settings…urban, suburban, and rural. I have also studied teacher supervision and evaluation for the past 18 years. Multiple sources of data need to be part of the supervision process, but used carefully in evaluation. There is a great difference between supervision and evaluation…formative vs. summative. Student data and “appropriate” individual learning/elective goals should be part of the conversation about continuous improvement. I struggle with evaluation based on student data. The supervision process changes teacher behavior, not evaluation. I have had the pleasure to supervise outstanding teachers who have internalized student growth as their mission. Even with growth models, their students’ scores do not accurately reflect the quality of their teaching. Our focus needs to be on our administrators. Quality, collaborative supervision creates conditions for quality student achievement. We have work to do with our administrators. We also need to give them the latitude to make needed changes.

      • Agreed, fully. I think when you see the triangulated data for 3 schools in the follow-up post you’ll see this reflected in the data. The survey data and Quality Reports are very revealing.

  5. Grant,
    Thanks for the post. One metric / data point that readers should be aware of is that teachers receive their score based NOT on the students’ state exam scores, but rather on the students’ growth scores. Therefore it is possible for a student who was level 1 in 2nd grade to have a high growth score in 3rd grade while still not being proficient (say he received a higher level 1 score).
    Additionally, because students are compared year to year based on like scale scores from the state tests, it is highly probable that students who fall or rise substantially in one year will rebound in the opposite direction the next. There seems to be a flaw in this normative system, but without looking at really large data sets it is difficult to determine what the margin of error is.
    Looking forward to your next post. I ask frequently if internal norms have too much of an impact on our perceptions of efficacy and think every educator should be asking how to control against this.
    Mike
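    The rebound effect Mike describes is classic regression to the mean, and a quick simulation makes it visible. This is a sketch with hypothetical numbers (made-up ability and noise scales, not actual NY scale scores): if each year’s score is a student’s true ability plus independent measurement noise, then students with unusually high scores one year will tend to show negative “growth” the next, and vice versa.

    ```python
    import random
    import statistics

    random.seed(0)
    n = 10_000

    # Hypothetical setup: fixed true ability per student, plus fresh
    # measurement noise each year. (Scales are illustrative only.)
    ability = [random.gauss(300, 30) for _ in range(n)]
    year1 = [a + random.gauss(0, 20) for a in ability]
    year2 = [a + random.gauss(0, 20) for a in ability]

    # "Growth" as a simple year-over-year difference in scale score.
    growth = [y2 - y1 for y1, y2 in zip(year1, year2)]

    # Students who scored unusually high in year 1 tend to show negative
    # growth, and low scorers positive growth: regression to the mean.
    r = statistics.correlation(year1, growth)
    print(f"corr(year1 score, growth) = {r:.2f}")
    ```

    The correlation comes out strongly negative even though no student’s true ability changed at all, which is why raw year-over-year rebounds are a poor signal on their own.
    
    
    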

    • Mike, you are spot on – these are all growth scores, which is why a number of schools with modest absolute results score highly on the accountability measures. That was what immediately caught my eye in going back and forth between Quality Reports, Surveys, Test Scores, and Teacher Effectiveness Scores (as you’ll see in the next post). The “local norms” issue is what is hurting us – and has been for years, with both teacher and student results. By contrast, in other countries (and in IB and AP schools & programs) there are big incentives to calibrate internal results to external standards. This is the ONLY way forward to avoid excessive mandates like what we now have. That’s what many critics of reform fail to grasp: until and unless internal standards are more valid, schools will be pressed externally by the state and by policy-makers.
      In Canada, there is far better calibration internally and externally because exam scores count in student grades and for university entry. We need something more like that.

  6. Most of what Cuomo says about failing schools appears to originate from the Big 6 (major NY City School Districts). I’m interested to know how he feels about Upstate schools in general. We have much higher ELA and Math scores, Regents scores, and graduation rates. A blanket approach to NYS school reform is not the answer. His beef should be with the city schools.
    Paul Kelly Grade 4 Teacher at Lake George Elementary School

    • Well, it’s not Cuomo, recall: this is the NYSED list of schools in need of improvement. For sure, the schools are overwhelmingly urban (though there are schools on Long Island on the list). Agreed, though: blanket approach unwise.

  7. I have read this post several times and really have trouble understanding your tentative conclusions.
    You seem to be calling particular attention to the fact that the local ratings are higher at the struggling schools. Why? The teachers at the struggling schools have MUCH better ratings based on the state growth measures as well – 35% highly effective vs 20% at the highly rated schools & only 5% ineffective vs 15% at highly rated schools. Are you questioning the state growth model as well?
    As an aside, it seems “incongruous” that the struggling schools have teachers with HIGHER “state growth ratings” – which means their students had higher growth scores – which means they “learned” more than “expected” – which means it is a good school – which is why it is on the in need of improvement list… Waitaminute!
