To develop a teacher evaluation system that is exemplary, there have to be clear, valid, and robust standards for such a system. So, before offering my particular version, as promised recently, I offer below a set of standards for use in building, critiquing, or improving any such system – including my own, to be posted next time.
The purpose of any proper evaluation is legitimate accountability and helpful feedback. Accountability means that we must be both responsible and responsive to feedback against legitimate organizational goals. Humans need to be held accountable because we have blind spots as well as good intentions. So, formal feedback against results is useful for both organization and employee in a healthy system.
Evaluation asks: how are we doing against our obligations? In schools, that means asking: how well are students engaging, learning, and achieving? What have been our personal successes as causers of learning? What (inevitable) improvements are suggested by results to better honor our responsibilities?
Thus, for any evaluation to be legitimate and helpful it must be –
- Outcome-based, using salient performance-based job descriptions & indicators
- Evidence-based, in which all key inferences are supported by data
- Valid, based on Mission and key learning goals and tasks, with no arbitrary value-added targets, tests, performance criteria, or weighting of criteria
- Reliable, based on multiple measures, evidence, and feedback sources over time
- Transparent, based on direct evidence that provides a clear account of achievement as well as helpful and actionable feedback
- Honest about employee strengths and weaknesses relative to goals
- Fair, based on opportunities to show one’s results and strengths, in context; and where one can appeal a rating that one believes is unfair
- Growth focused, to encourage ongoing learning and constant adjustments, not unimaginative compliance
- Credible to all stakeholders; not hypocritically imposed unilaterally
- Feasible in terms of time for sufficient evidence collection by supervisors and discussion with employees
- Effective, whereby evaluations have substantive consequences that align with institutional interests and personal aspirations
I trust that you find these standards sensible on their face, but let me make a few observations about their implications: By these standards –
- The current systems in place in New York and New Jersey (among other states) are utterly unacceptable. By virtue of relying somewhat on ‘secure’ tests, non-released items and no item-by-item analysis, there is zero transparency. (The decision by many states in recent years, such as Florida, to end release of tests is truly wrong-headed and unethical.) The current value-added measures – while ‘growth focused’ in theory – are based on completely arbitrary growth targets, a function of non-transparent psychometrics, as opposed to direct and actionable feedback that one can use to improve. This is true even if the value-added scores are psychometrically sound – which many researchers question when used for just 1 year. (See my prior post on the analogy of basketball players and coaches playing without direct evidence of their achievement.) No teacher evaluation system can be valid and credible to many stakeholders without a careful look at student work on meaningful academic tasks.
- Almost all evaluation systems based on one or two classroom observations lack an outcomes-focus, honesty, validity, reliability, and a growth focus; and they lack credibility to most stakeholders (since historically almost everyone is evaluated as being fine). Evaluations based on a few observations – the dominant approach nationally – are especially unreliable and invalid when they focus on teacher behavior as opposed to teacher accomplishment – i.e. if they focus on the teacher and students instead of the learning. A teacher evaluation system that ignores teacher assessments and results (and student feedback) is invalid on its face.
- District-test-score-based evaluation systems are unreliable and mostly invalid. There are too few data-points and varied points of view from which to triangulate the data, direct assessment of complex performance related to Mission is typically missing, and there is no opportunity to address one’s context or special situation.
- Any generic evaluation system (e.g. the Danielson or Marzano framework) is likely to be invalid since it fails to evaluate according to local Mission statements, school and program aims, and personal goals; and too many of the dimensions in those frameworks have little to do with core outcomes.
Though I’ll save my own system for next time, readers will be able to predict many of its elements simply by considering the standards proposed and the weaknesses just cited. More to the point, one can easily imagine better evaluation systems simply by considering the systems that exist in the top professional and corporate organizations in which many of these standards have been deliberately attended to.
Put bluntly: many people outside of education would not stand for the evaluation systems inside education, and so it is rank hypocrisy for leaders in those organizations to propose most of the current schemes.
Here are some other helpful resources on teacher evaluation standards and processes:
http://www.nea.org/assets/docs/HE/TeachrAssmntWhtPaperTransform10_2.pdf
http://www.aft.org/pdfs/teachers/devmultiplemeasures.pdf
http://www.nea.org/assets/docs/2011NEA_Teacher_Eval_Toolkit.pdf
www.ccsso.org/documents/…/key_elements_for_educational_2007.pdf
http://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf
http://www.cpre.org/images/stories/cpre_pdfs/rb38.pdf
http://cisi.ucdavis.edu/wp-content/uploads/2012/09/EdSource-TEBrief06-11.pdf
http://www.uft.org/files/attachments/final-eval-quickstart-guide.pdf
http://www.uft.org/news-stories/evaluation-problems-worse-imagined-0
Readers should post links to other documents that offer sensible evaluation principles or policies.
27 Responses
Reblogged this on Diary of a Temporary Full Time Foreign EFL Instructor and commented:
If you work in education as a teacher or administrator this is a must read. I can’t wait for Part 2. It’s wrong to just accept the status quo in thinking about teacher evaluations.
Very helpful links. Thanks for posting them. I learned that my district is using the 2011 edition, rather than the 2012. Or, if they have changed we were not informed. I agree with smkelly8, the current status is unacceptable. However, besides contacting the union president in my district, what can be done? I feel like “Pavlov’s dogs” always jumping through hoops. It’s disheartening.
PS. any info on using UbD in elementary general music class?
I’ll have more to say on what can be done in the next post.
Let me look thru units I have and see if I have any elem. music ones. Recurring EQs: How can we be more musical together? What are we trying to communicate in this piece? How can we better communicate it?
+1. Not that you need my approval or anything! Thank you for your analysis of the current state of teacher evaluation. It is spot on!
Thank you for a very helpful post. I have added a link here to the Australian national teacher professional standards. In my experience, teachers use these to develop their goals and gather their own evidence of practice to demonstrate their achievement of and professional development towards achieving aspects of these standards. http://www.teacherstandards.aitsl.edu.au/OrganisationStandards/Organisation
Thank you for the link! This will be of interest to US readers.
Finally a voice of reason. I would add to this that district and building administrators create clear plans of professional development to find growth opportunities as they discover staff needs. We talk about collaboration and differentiation in the classroom. Why would teacher evaluations be any different?
That’s part of my pitch about ‘helpful’ to come in Part 2. But it also requires more actionable, specific feedback from results.
One of my ongoing concerns about the farce of VAM has been its utter blindness to chronic absenteeism, particularly to the fact that in high truancy schools, a teacher may never have the same group of kids in class each time that class meets. In effect, the teacher is being evaluated on the work of students that she can be viewed as never having had an honest chance of teaching. Scripted curricula and the difficulties of trying to teach it on time while catching kids up who are behind at random places in that curriculum exacerbate the problem. This also holds true for non-scripted curricula. I look forward to seeing how your system addresses all this. Here are some resources. http://education.illinois.edu/ber/School_Attendance and http://www.attendanceworks.org/
Alas, the links on the Illinois site are not good any more. Can you check it out? I clicked on the attendance link and got an error message at ieb
I clicked both links on your page here and both worked for me. I know this is no guarantee though, but why???? No clue!
No, I didn’t make myself clear. The ILLINOIS links work, but when you scroll down on that research page and click on STUDENT ABSENTEEISM, you get an error message.
For STUDENT ABSENTEEISM, it has been relocated. Try this link: http://nces.ed.gov/pubs2006/2006071_3.pdf
Scroll down to p. 61.
Thank you so much for all you write. Have you found a fair way to tie the results of these evaluations to a teacher’s salary? Our district has, not fairly in my opinion, instituted a merit based pay system. This coming year we are implementing one of the models you mentioned above and I’m pretty nervous.
[This book review came out today. From Gene Glass at ASU, editor of EDUCATION REVIEW: http://www.edrev.info/. Since teachers are evaluated partly on standardized test scores, this review & book has relevance. — Jane Jackson, Modeling Instruction Prgm, Arizona State University.]
de-testing + de-grading schools: Authentic Alternatives to Accountability and Standardization
edited by Joe Bower and P. L. Thomas
http://www.edrev.info/reviews/rev1265.pdf
Reviewed by J. Spencer Clark, Utah State University
In de-testing + de-grading schools: Authentic Alternatives to Accountability and Standardization, Joe Bower and P. L. Thomas bring together essays that provide a landscape of high-stakes accountability and standardization in current schools. More importantly, they highlight the ways in which administrators, teachers, and teacher educators are negotiating this landscape to lessen the presence of grades and testing in their classrooms and schools. Bower and Thomas provide a strong case against high-stakes accountability and standardization by bringing together a wide range of perspectives and educational stakeholders from Canada and the United States (U.S.).
…
[The authors] provide key distinctions between assessing and measuring students, collecting information on and evaluating students, as well as sharing information with and grading students. The authors in this section also highlight the agency of individuals to counter the enormous structural constraints of high-stakes accountability …
Great comments about evaluations. In Colorado, legislators passed a law requiring teacher evaluations based 50% on student growth and 50% on teacher performance. Jeffco Public Schools, my district, was several years into a grant studying strategic compensation. We had already developed a teacher performance rubric and were able to continue to use that rubric. It has sections for professional preparation, professional techniques, and professional responsibilities. The grant is, also, being used to fund peer evaluators. Teachers are observed at least 3 times for entire class periods (formal observations) and, I think, 5 or so times for shorter periods (informal observations). You can, also, request an additional observation. I had a formal observation yesterday. The observer emailed her notes and thoughts, I added my comments, emailed it back. Tomorrow we will meet and discuss it. I think the system is pretty good. I hate it when a class is observed that turns into a train wreck, for sure. And love it when I teach like a rock star (well, not really – rock stars probably would suck in my classroom). But, we have really worked at my school to use the observations and feedback to collaborate and build our effectiveness. It is HARD work – but worth it. Stakes are high – tenure is no longer a reality.
The student growth part of the evaluation is based on multiple measures including, of course, the high-stakes state test. After this year, that will be replaced by the PARCC. There are, also, school and team goals factored in and other assessments.
Perfect? Of course not and that can be terrifying – but we’re working on it.
Nancy, when you say that the stakes are high – tenure is no longer a reality, does that mean the new evaluation system overrides tenure?
Also: I looked here http://www.cde.state.co.us/educatoreffectiveness/smes-teacher
When it says that part of the 50% which is based on student growth is going to be the state-wide tests, how much weight does that state-wide test have?
I plan to respond to this in my next post. Short answer: SOME form of external calibration of internal standards is needed. Then, the challenge is how to honor the other criteria (such as fairness and transparency).
Grant, I see you mention district-wide assessments, but what about state-assessments? Do you think those should be used to measure student growth in teacher evaluations?
I wonder about the relationship between teacher evaluation and professional development. Is there a direct correlation between the two? That is, the components emphasized in teacher evaluation should be the components focused on in professional development (and even vice versa?). If so, how do unit plans fit in teacher evaluation if our school is focusing on developing unit plans in our professional development?
I have long felt that the clinical supervision people were correct: it should be separated.
This song sums up teacher evaluations. The song was written by a high school physics and chemistry teacher, John Novak, in Minnesota. Please share it; John gave permission. — Jane Jackson
——————————
Let Us Test
(Sung to the Tune of “Let It Snow”)
Oh the weather outside is delightful,
But here inside it is frightful.
And since the State knows what’s best,
Let us test, let us test, let us test.
Our students are diligently working,
I hope that each question they’re reading.
And since we’re compared to the rest,
Let us test, let us test, let us test.
(Bridge)
And when we finally get results,
I hope that they turn out on top.
For if our students don’t do really well,
We teachers will get our pay docked!
It’s just another class interruption,
I’d complain, but I don’t have the gumption.
But we can’t give our students our best,
When we test, when we test, when we test.
——————————
from a music teacher…that’s just BRILLIANT!
Grant, another thing: speaking with some colleagues in Colorado who are trying to implement the expanded evaluation system, one piece of feedback is that it is incredibly time consuming. Already we have administrators who are stretched pretty thin, so increasing their duties by expanding the evaluation system, while good in theory, makes me wonder if there is enough bang for your buck to put this into practice. Obviously we need to find ways of getting rid of bad teachers; however, these kinds of reforms concern me as the regression to the mean might land us back in the situation we have now: High stakes standardized testing.
Agreed. That’s why ‘feasible’ is a criterion!
Was Part 2 posted? I have been waiting ever so eagerly for it, but can’t find it!
Cheers, Kim
Kim Carter Executive Director QED Foundation “Choices for learning; Choices for life”
“If the question of where to start seems overwhelming, you are at the beginning not the end of this adventure. Being overwhelmed is the first step if you are serious about trying to get at things that really matter on a scale that makes a difference. So what do you do when you feel overwhelmed? Well, you have two things. You have a mind, and you have other people. Start with those, and change the world.” — Liz Coleman
Alas, not yet. I hope to get to it soon!