It is a longstanding ugly fact in education: the child’s socio-economic status is tightly correlated with test scores. The just-released SAT data from the College Board are right there for all to see and contemplate with the telling pattern visible for the umpteenth year: for every additional 20,000 dollars in parental income, scores rise in an almost perfect linear relationship by approximately 15 points.
Liberal policy-makers use such data to rail about inequities in education – see, the schools and teachers need more money! Conservatives bemoan the liberal take that money is what matters when there is little evidence that giving money to bad schools improves them; and teachers in schools that serve the poor cry out that all accountability systems are patently unfair if they ignore this ugly fact about achievement.
Me? I think they are all wrong. I would encourage readers to take a deep breath and consider a more likely meaning of all the data. I think the data best suggest that most schools are ineffective. Inappropriately and needlessly ineffective, given what we know about learning, and given what effective teachers and schools do and have always done.
Bear with me on this; keep an open mind.
I have spent 30 years working in schools in every state in the union, and in over a dozen countries around the world. In the US, I have worked extensively in Massachusetts schools (the best, statistically) and Mississippi (the worst, statistically). I have observed classes and done professional development work in the best prep schools and the worst of the worst urban schools; I have worked in Scarsdale and New Trier HS, and I have worked in a few of the lowest-performing schools in New Jersey, Ohio, and Georgia. I have watched Ron Clark engage 35 middle school inner-city kids in a high-powered lesson on area at his famed Ron Clark Academy, and I have watched a 30-year veteran prep school teacher in a highly-regarded school bore me and his entire class to tears for two days running on the Civil War.
In the ‘worst’ schools in the country there are some great teachers (think: Jaime Escalante, for example); in the ‘best’ schools in America there are some truly horrible teachers who know little about planning backward from goals or modern pedagogy (prep school teachers need not be certified or take any education courses to be hired). It is extremely rare to find a school in which almost all teachers are good and solid. Be honest! Do you think that almost every teacher in your school is effective in causing engaged and effective learning against key goals? I have personally never seen it, and in private most educators agree with me that it is rare. Indeed, I think this large range of quality in all schools is the dead give-away that schools are less effective than they might be.
This ‘crazy’ idea of mine can be substantiated a few additional ways. The key ideas here are value added and quality control. In genuinely effective schools we would see value added – given what walks in the door, they get better – and we would see quality control – minimal variance across teachers in effectiveness and engagement, and obvious mechanisms whereby error is sought out, corrected, and made likely not to occur next time. In short, MOST students and teachers would improve over time when pre/post or longitudinal assessments are used. But that is not what we see in school.
First, consider the massive value-added study of student achievement in Tennessee. As William Sanders (the architect of the value-added system) and Sandra Horn write:
For grades three through eight, the cumulative gains for schools across the entire state have been found to be unrelated to the racial composition of schools, the percentage of students receiving free and reduced-price lunches, or the mean achievement level of the school… Schools, systems, and teachers who do best under TVAAS are those who provide academic growth opportunities for students of all levels of prior academic attainment.
Differences in teacher effectiveness were found to be the dominant factor affecting student academic gain. The importance of the effects of certain classroom contextual variables (e.g., class size, classroom heterogeneity) appears to be rather minor and should not be viewed as inhibiting to the appropriate use of student outcome data in teacher assessment. These results indicate that any realistic teacher evaluation process should include as a major component a reliable, valid measure of a teacher’s effect on student academic growth. If the ultimate goal is the improvement in academic growth of student populations, one must conclude that improvement of student learning must begin with the improvement of relatively ineffective teachers regardless of the student placement strategies deployed within a school.
Good teaching matters, as Kati Haycock put it in summarizing this research a decade ago. “On average, the least effective teachers (Q1) produce gains of about 14 percentile points during the school year. By contrast, the most effective teachers (Q5) posted gains among low-achieving students that averaged 53 percentile points.”
Never mind public schools in Tennessee. There is little value added in the so-called best schools, public and private. That’s what most fatalists about SES miss. On average the kids come smart, leave smart and on formal pre- and post-tests don’t gain much if at all. In one of the elite prep schools in the country, they hired ETS to help them design a pre/post test of critical thinking for grades 9 and 12 – no gain; none. (The faculty bashed the test.) Maybe the most striking and overlooked sentence in Sanders’ research is this one: “Disproportionately, high-scoring students were found to make somewhat lower gains than average and lower-scoring students.”
Similar results can be found when looking at the literature on misconception in science. Using the Force Concept Inventory in Physics, many students never shed their basic misunderstandings of the counter-intuitive aspects of the big ideas, even after a full year of Physics in our most esteemed colleges. It takes aggressive dialogical/interactive instruction to overcome these hard-to-change ideas (which is not the norm). And for almost 110 years the research on transfer shows that it is achieved with great difficulty – and that many common school practices work against developing flexible transfer.
Another clue is in the international data in recent years. As readers may know, in recent administrations of TIMSS some states (e.g. Minnesota and Massachusetts) and a few districts (e.g. Naperville IL and Montgomery County MD) were able to participate head to head with all the other countries. The results are striking, and give the lie to the typical review of international data. While the US is well back in the pack in terms of mean scores, the districts mentioned outperform almost all other countries (and low-performing Miami-Dade County was almost at the bottom compared to all countries in the world)!
Again, it is the range or spread of scores in a school, in a district, in a state, and nationally that shows lack of quality control. It’s that lack that holds us back in these measures. It is not true at all that our best performers in math and science are worse than those of other countries; it’s that the range of results in the US pulls our score down. No wonder Shanghai and Singapore outperform us ‘on average’, then: there are numerous mechanisms in place in those places to ensure greater quality control of teaching and learning (yes, including culture, but not primarily because of culture). It’s not that Singapore Math is brilliant stuff – in fact, Singapore is greatly interested in using Understanding by Design to improve performance beyond the rote! It’s merely that Singapore math is used with fidelity in Singapore.
Here’s another set of findings about value added – effect size. Consider the exhaustive meta-analysis of educational approaches done by John Hattie and summarized in his amazing book Visible Learning. In looking at the effect size of dozens of interventions, this statistic caught my eye: there are more than 30 pedagogical “moves” that produce a larger effect size than socio-economic status. What are some of the most powerful? Making students self-assess, giving them robust feedback, high-quality teaching, and engaging them in a high-order questioning/inquiry curriculum – the very thing many of us have been going on about for decades. (Ken Bain found the same things in his exhaustive study of the best college teachers. Good summary and video here.)
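For readers unfamiliar with the statistic, here is a minimal sketch of how an effect size is commonly computed, assuming the standard standardized-mean-difference definition (Cohen's d with a pooled standard deviation). The function name `cohens_d` and both score lists are hypothetical and invented purely for illustration; they are not data from Visible Learning or any other study.

```python
from math import sqrt
from statistics import mean, stdev

def cohens_d(treated, control):
    """Standardized mean difference between two groups (pooled SD)."""
    n1, n2 = len(treated), len(control)
    s1, s2 = stdev(treated), stdev(control)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treated) - mean(control)) / pooled_sd

# Invented scores: a class that received an intervention vs. one that did not.
control_scores = [60, 65, 70, 75, 80]
treated_scores = [65, 70, 75, 80, 85]

effect_size = cohens_d(treated_scores, control_scores)
print(round(effect_size, 2))
```

The point of the statistic is that it expresses a gain in units of the spread of scores, which is what lets very different interventions (and a background factor like SES) be compared on one scale.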
A final reason my idea may not be so off-base is that most large-scale standardized test scores have remained basically static over decades, except for some movement at the extremes. This year had one of the lowest mean reading scores but more perfect scores than ever in math; on NAEP, performance at minimum competency is up over 20 years but there have been no broad gains in higher-order performance for years. This all says to me something that most people overlook: the student is in a lousy feedback system. In any decent feedback system, average performers improve over time. If scores on average remain stuck for decades that says to me that the average learner is in a feedback void – and, therefore, that nature/nurture from the family will make a bigger difference than it should.
Why should that static pattern surprise us, given the range of teachers AND the timing and security of tests which cause the feedback to be too little too late? Now we compound the ineffectiveness. When tests are given once per year at the end of the year, are secure, and the results come back weeks later (sometimes without even the ability to see all the questions and item analysis) then you have an absurdly ineffective feedback system. By contrast, look at music, video games or other transparent feedback systems – everyone improves. Consider sports: high-schoolers in track and swimming now perform at levels that were the province of college stars 30 years ago. I often use this thought experiment in workshops, based on my cross country coaching experience. Suppose we didn’t keep your times in cross country; suppose we only reported on your place of finish – and graded you on it. (That’s pretty much what local grading is.) Who would improve greatly as a runner under such a system? By contrast, even my ‘worst’ runners made great gains. Tellingly, the ‘worst’ (Numbers 8 and 9) runners never became the Number 1 runner, but the average time for the team dropped a full minute from the start to the end of the season, with the greatest gains coming from the worst runners (as value-added math suggests will always happen in an effective system).
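The cross country thought experiment can be made concrete with a small sketch. Every number here is an assumption invented for illustration (nine hypothetical runners, times in seconds, with larger gains assigned to the slower runners); it is not actual team data. It shows how a place-of-finish report can stay frozen while every runner, and the team average, improves.

```python
# Hypothetical start-of-season times in seconds for nine runners (invented).
start_times = [1020, 1050, 1080, 1110, 1140, 1170, 1200, 1260, 1320]
# Assumed gains over the season: slower runners improve more,
# but nobody improves enough to overtake the runner ahead.
gains = [20, 30, 40, 50, 60, 70, 80, 90, 100]
end_times = [start - gain for start, gain in zip(start_times, gains)]

def places(times):
    """Return each runner's place of finish (1 = fastest)."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    result = [0] * len(times)
    for place, runner in enumerate(order, start=1):
        result[runner] = place
    return result

def mean(values):
    return sum(values) / len(values)

# What a place-only report shows: no change at all.
places_unchanged = places(start_times) == places(end_times)
# What the stopwatch shows: the team average drops by a full minute.
mean_drop = mean(start_times) - mean(end_times)

print(places_unchanged, mean_drop)
```

Graded on place alone, Numbers 8 and 9 would look exactly as ‘bad’ in November as in September, even though in this made-up season they gained the most.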
What does this argument suggest we do more of? First, look more closely at the outlier teachers and, especially, schools, not the average teachers and schools; and demand that ‘best practice’ be better attended to in accountability schemes. This will of course be old news to readers of the work of Ron Edmonds and Larry Lezotte, and Doug Reeves with the 90-90-90 schools and others who have been focusing on the outlier schools and the causes of their success for decades. And it was the main conclusion drawn by Sanders et al in the value-added research:
Though the debate about whether student achievement data should be used as part of an assessment, evaluation, and accountability system for teachers will assuredly continue, the results of this study suggest that teachers do make a difference in student achievement. It is recognized here, however, that there were no direct, systematic observations of the quality of teaching and learning at the classroom level in this study. Thus, identifying teachers that clearly get results over time, and comparing them to teachers over time who do not, seems a logical, worthwhile next step in addressing the issues raised here and in further developing general lines of inquiry about the important relationship between teacher effectiveness and teacher evaluation.
Secondly, there has to be a much more intense and consistent effort to put in place a quality control system where we don’t wait until year’s end to find out how we did. It amazes me that 20 years into the Standards movement, most teachers cannot predict student performance on May’s test in November – and intervene accordingly. (By contrast, every music, art, and sports coach does this all the time.) The work of Deming – now decades old – has still not permeated school walls, though many of us have tried to make it happen. Drop your cudgels and prejudices for a moment, educators: how can we expect consistent quality – in any activity, really – if we just close the door and put not-particularly-well-trained-out-of-college teachers in a room for 30 years, with few models, no other adults, minimal feedback, and no valid self-assessment and self-correction systems? Indeed, the lack of consistency in teaching is arguably worse in many ‘good’ private schools than in good public school systems: most teachers never get any feedback or attend regular professional development sessions; many are forever stuck in idiosyncratic unhelpful habits that get romanticized ad nauseam among private school people as the ‘art’ of teaching.
Thirdly, the management of schools is totally ineffective, viewed structurally rather than personally. As major organizational consultants from Peter Drucker to Jim Collins and (recently) Steve Denning have noted, schools are stuck in a very old management paradigm, one that business long ago outgrew, in which workers are told what to do instead of given incentives and opportunities to find, own, and solve problems in teams related to performance as a key job function. What American schools lack – public and private, good and bad – are procedures and incentives that force every teacher, as part of the job, to minimize ineffective practice and fix achievement problems on a timely basis (instead of saying: I taught ’em, so the rest is their problem).
So, for me the question that really matters, if we are serious about reform, is: when will we all honestly face the fact that from a value-added perspective school is not very effective? When, therefore, will we face the fact that what the outliers do is OUR JOB to understand and do more of? When will education become like medicine and not allow people to free-lance core technique, serve on their own, or advance until they have met performance standards in residencies and internships? When will teachers be required, as central to the job description and evaluation – and for which time is allotted – to examine data and adjust practice accordingly on a regular and timely basis? (I firmly believe that education is currently where medicine was in the late 18th century – not yet but almost a science of best practice; we’re still in the village-barber, old-wives-tale phase of teaching).
Look, no one wants to try hard and be ineffective. This is neither a scolding nor a holier-than-thou harangue about ‘bad’ teachers. I merely think that a better explanation for the facts – a longstanding inability to raise student achievement – has to do with effectiveness, i.e. with what is in our control. I see little evidence for the fatalistic view that teachers and students are born, not made. The most important fact in Jay Mathews’ book on Jaime Escalante is overlooked by all the fatalists: even before Escalante left Garfield HS, his newly-enlightened colleagues were getting similar or better AP results than he did! In fact, such fatalism is a very odd position for any educator to take: to give up before you start.
The issue is not ‘great’ teachers or schools, just solid ones in which best practice is the norm, not the exception, and where the first instinct is not to blame the kid but to examine our practices. By contrast, as long as we allow teachers to do whatever they feel most comfortable doing, in isolation, schools will in general be ineffective and SES will thus be the determining factor of achievement results in all schools. Shouldn’t we at least try this idea on for size?