Can we stop the hysteria about testing, as if asking kids to take tests is inherently harmful and unfair? Perhaps not, but let me try by posing a more focused question in this plea for reason. Can we please stop the bogus logic that says that the growing demand for tests requires that the local response be massive ‘test prep’?
Put a tad sarcastically: would somebody please show me the research that says that the only way to RAISE test scores – and keep them rising – is via mind-numbing bad-teaching ‘test prep’? Would you please point me to any research that says that the best or only way to raise test scores is to teach worse?
I didn’t think so.
Look, accountability is a serious business. And the stakes have gotten much higher over the past 20 years. (Be careful what you wish for: in my day the lament was that nobody cared in the least about education and schools, and it was mostly true.) But it is simply false to say that asking students to take tests is essentially wrong. The challenge is to offer kids a fabulous education; then, the test results take care of themselves, as they are supposed to.
A thought experiment. Hold your immediate disagreement; let’s try a simple thought experiment. Close your eyes and visualize the classrooms in the best schools and districts in your state – especially in those schools that are outliers in their demographic. What do you see? Do you see an endless and grim test-prep regimen, with horrible Gradgrind teachers? Or do you see far better teaching than is found in the low-performing schools, whose only arrow in the quiver is – more worksheets? You know the answer: in the best schools in America, private as well as public, we see more high-level questioning, more intriguing assignments, more constructed-response tasks on local tests, and more higher-order instructional approaches than in low-performing schools. And we see high, not low, test scores. How could it be otherwise? That’s validity 101.
In fact, by definition, local assessments in the best schools in America are more rigorous than state tests. The most recent review of the NY State Regents exams only underscores what many of us have known (or should have known) for decades: state tests are not very hard, and there is going to be hell to pay when the new common assessments are rolled out (as NAEP scores have told us for decades).
Can we consider common sense for a spell, then, without the posturing? The crude response of test prep is understandable in light of local fears, and the harm it does to students and teachers is real. But test prep exists not because there are tests but because educators in weak schools have utterly lost their way. Test prep, in other words, is a compensatory strategy by educators who don’t seem to know what high-quality teaching and learning actually look like, and/or they don’t have faith in the power of a great education to cause good test results.
What we should then be freaking out about is that so many educators still don’t seem to know or believe in ‘best practice’, have no access to it in their work, and are under no accountability pressure to gain that expertise on the job. (My electrician and carpenter work to higher standards: they have to be re-certified every few years on current code and expert knowledge.) Alas, awareness and use of best practice is optional in almost every school in America. Would those of you who are quick to disagree with me here want your child to be taught by someone who prized their freedom to do whatever over their obligation to find out and do what works best – when current approaches are not clearly working? Would you go to doctors who had such an attitude?
The tests themselves: nowhere near as bad as people seem to believe. I have written various pieces over the last few years on what I have learned from looking at released test items. I refer you to those pieces as a start, in hopes of having a more rational discussion of the proper role, strengths and weaknesses of external testing in each school. If you do what I did – look at all the released tests in those states that do so, such as Florida and Massachusetts – I think you’ll come away thinking pretty much what I did: most test questions in the core subjects of reading and math are fair and appropriate, given the standards. (I confess I am less enthusiastic about some tests in other subjects, but those haven’t been tied to NCLB and other high stakes in most places.)
Here’s the epiphany that I had after reviewing dozens of the most challenging questions (as judged by the patterns of scores): the questions that are most difficult ironically demand transfer of learning, not rote learning – the very aim we all prize. This becomes especially obvious in ELA: the student gets a brand-new reading (or writing prompt) and has to make sense of it, using whatever strategies and skills they have learned, without being told explicitly what to do and without hints from teachers. The questions that are difficult for our kids in reading are inevitably questions at the heart of instruction: ‘main idea’ and ‘author’s purpose’. (See the gallery, below.)
Why would teachers who had taught their kids to read well fear such a test? Why are local administrators in weak schools so intellectually bankrupt that all they can counsel is test prep, in light of what the tests actually demand? Why isn’t there a local plan to hammer away at the big ideas of reading across all grade levels, as a team effort, given that the results on these targets have been dismal for years?
Same in math: I didn’t find one question that I thought was unfair or pointless, given the Standards. The questions may not be brilliant, but after a while they start to seem pretty straightforward and even predictable, and the poor student results start to seem really weird. For example, when there is a triangle and some missing info about sides and angles, you can almost bet, before looking at the numbers and the figure, that it is either about the 180 degrees in a triangle or the Pythagorean Theorem. In each case, this is core content. Yet once again, the results are surprisingly poor.
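To make that concrete, here is a made-up pair of items of the sort I mean (my own illustrations, not released questions):

If a right triangle has legs of 3 and 4, the Pythagorean Theorem gives the hypotenuse: c² = 3² + 4² = 9 + 16 = 25, so c = 5.

If two angles of a triangle measure 50° and 60°, the 180-degree rule gives the third: 180 − 50 − 60 = 70°.

Nothing exotic here – the whole game is recognizing which familiar tool the figure calls for.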
Here are many instructive examples in ELA and math:
The 64-million-dollar question, then, is: why are our low-performing kids not taking content we know they were taught and tested on and using it to address a test question that demands it – if you think about the question? The difficulty, in other words, is not in the content complexity but in the thinking demand: do the students know which content to apply, and when – in a question that is not so dumbed down as to basically tell you what to recall and use? That is the point of math: to see if, when faced with novel problems, students can solve them.
The weak results speak volumes about the weakness of local instruction and (especially) local assessment: in their classes, kids get few real problems, just low-level ‘plug and chug’, as if superficial drill could bypass the need to think.
In over 30 years of work in schools, I have rarely seen solid school-wide and district-wide tests. Rather, typical local common assessments mimic the format of state tests without their rigor. In the best schools, however, individual teacher assessments are often creative, funny, and challenging. I challenge you, therefore, to audit your local assessments using these audit materials: the ELA blank audit assessment, the Math blank audit assessment, and the Audit of Algebra I Test. Included is my audit of a good district’s exam in Algebra; you may want to compare it to your own. (I will provide audits of fully released state tests in math and ELA in my follow-up post.)
In that light, it’s worth remembering that when his students’ AP scores were challenged, Jaime Escalante had no interest in suing ETS or the College Board, because he felt that the external tests were key to raising the bar at Garfield HS (a point I made via sports in a past blog post). It is also worth remembering that once Escalante proved what was possible, AP scores from many other teachers at Garfield also climbed – even in non-math courses.
You don’t have to agree with me or like what I am saying. But please suspend disbelief until you have investigated all the released state test items, gone to the best schools in your state to see what they do, and (especially) audited local assessments. Then, see if you don’t feel as I do: that the problem is not testing and accountability per se but our unthinking response to testing and accountability in low-performing schools.
I am not criticizing the hard work or intentions of teachers in test-prep schools. I am criticizing the fact that, though the methods are ineffective and doomed to be so by simple logic, test prep is all they keep doing. It’s wrong for kids and it’s a thoughtless response to the challenge. Someone locally needs to say that the test-prep emperor has no clothes.
Nor am I saying that good scores = good schools; that reverses what I said. I said that in really good schools it happens that test scores tend to be high – as they should be, if the tests are valid. Nor am I saying that any school with a high-SES population is therefore a good school. There are countless schools in the suburbs that are not very good at all, i.e., they provide little or no value added, and the work is incredibly dreary – kids persist only out of extrinsic motivation. Such districts receive able kids, they graduate able kids, and not much vital learning happens in between.
Good schools and teachers, and our obligation to learn from them. What, then, do I mean by ‘good schools’ that are about something more than test prep, even in the face of tests? Nothing fancy. I mean the obvious: schools that any of us would want our kids to go to. Highly-qualified staff; a caring environment; where they really know your kid, and play to his/her interests/strengths; where they use endlessly engaging ways of getting kids hooked on a subject; where there is a challenging, yet stimulating curriculum, with worthy assignments; where they demand much of students but give much in the way of support; and whose graduates, regardless of GPAs, go on to make it in the world at something they care about – without being shocked and dismayed to learn that school standards were so low that they are unprepared for anything beyond school. In short, staff are mindful of the tests but not in a panic about them.
Many of you might not so much disagree with me as be disappointed that, in the political war we find ourselves in, I would provide aid and comfort to the crazies who would destroy public education as we know it.
But I know this deep in my bones: if we who care about public education keep avoiding reasoned challenges to our beliefs from friends, if we keep dragging our heels on reform and accountability, and if we keep falling back on ad hominem attacks against everyone who disagrees with us, then we are no better than our enemies. And we disgrace ourselves as educators who, more than any other group in society, have the obligation to keep learning and questioning – including the questioning of thoughtless or demonstrably ineffective approaches such as ‘coverage’, and ‘test prep’.
PS: A week later this piece in Edutopia proved my point beautifully.
10 Responses
I couldn’t agree more, Grant, and made the same point in a recent article with examples of actual kids who’ve been hurt by misguided responses to test score pressure: http://blogs.edweek.org/teachers/coach_gs_teaching_tips/2011/11/nclb_no_chance_for_latinos_and_blacks_1.html
Grant – I raise a glass in your general direction. I agree with your points and think you’ve a solid argument for focusing on what really matters. There’s one point I’d like to raise, though, and that’s around the issue of “test prep”. You talked about students doing worksheets and taking old state tests over and over again. What I’ve seen happen is that people recognize that what you describe is the worst kind of test prep, the worst kind of instruction, and stop. They then adopt the philosophy of “If we have high-quality instruction aligned to the standards, the tests will take care of themselves.” It’s my opinion that the second approach is nearly as dangerous as the first.
In a 1992 study, Scruggs and Mastropieri put forth that students who are test-wise can outperform students of equal ability who lack test-wiseness. When this conclusion is combined with the reality of testing anxiety [as many as 25% of all elementary- and secondary-level students in the United States, about 10 million, have some degree of test anxiety, and about 10%, some 4 to 5 million, have a level of test anxiety considered high (Pintrich and Schunk, 2006)], it almost seems unethical to subject students to tests without preparing them in some way for that particular test. Cronbach calls it the concept of “maximum performance.” Clearly, it’s a given that performing reading and math skills in isolation on a particular day in a particular way isn’t really maximum performance, but it is a way that students are asked to demonstrate their learning, and we have to acknowledge that and empower students to do their best. Ideally, this happens right before the test itself and is couched in the bigger picture of meta-cognition.
Students do well on tests, regardless of their content understandings, when they have a plan. When they understand their learning needs, their strengths and weaknesses, their stressors and tensions, they are much less likely to experience testing anxiety. They rarely, if ever, get a chance to tap into that knowledge in school. Instead, they’re told strategies to use because they worked for their teacher or because the teacher read them in a book. “Test prep” isn’t necessarily a bad thing unto itself. Bad test prep is bad, just like a bad rubric is bad or a bad essential question is bad.
I wonder about the consequences, though, of trying to wipe test prep off the table and sending students into that high-stakes, high-pressure exam with no understanding of who they are as a learner under pressure, no conversation about the demands of the test, and no explicit understanding of what it means to transfer the awesome, messy stuff they learned in school to the dry, yet rigorous and complex, tasks asked of them on the day of the test.
You raise an important point: test prep in the good sense means preparing students for the experience of having to go it alone under pressure. In that sense, some preparation for performance is clearly needed. I didn’t mean to exclude such preparation. Test-wiseness is needed, as is the recognition that the test is looking for the ‘best’ answer, not necessarily the perfect answer. But in good schools they do that – give kids some practice with the format and the silent context – without reducing teaching to drill and kill. Thanks for the clarification.
“Put a tad sarcastically… Would you please point me to any research that says that the best or only way to raise test scores is to teach worse?”
“Test prep, in other words, is a compensatory strategy by educators who don’t seem to know what high-quality teaching and learning actually look like, and/or they don’t have faith in the power of a great education to cause good test results.”
With your sarcasm duly noted, you, like so many champions of the current testing regime, seem not to notice the hard evidence that “test results” are not anywhere near as sensitive to “high-quality teaching” as you presume. Apparent face validity (a math test looking like a math test) is no guarantee of construct validity. And if a relatively stable test-taking ability (an artifact of how item response theory gets implemented in the real world of high-stakes testing) effectively crowds out factors related to domain-specific teaching and learning, then it is indeed a rational, if perverse, strategy for educators at every level in the system to target this test-taking ability with what you rightly characterize as “mind-numbing” test prep.
We shouldn’t be surprised to find that best practices associated with content-related teaching are being marginalized by a testing-centered curriculum, backed up by a barrage of short-cycle assessments that make anything other than test prep seem deviant, dangerous, and irresponsible.
Two closely related questions should be asked by state legislators, governors, secretaries of education, parents, the business community, and anyone else having a stake in our educational system: (1) What is the evidence, at scale, that the tests are adequately sensitive to the quality of instruction to be able to serve the goals of a high-stakes accountability system? Put another way, when implemented at scale (e.g., across a district or state), what fraction of the variance can be shown to be sensitive to instruction on anything like an annual basis? (2) Given that from the earliest days of high-stakes testing nearly everyone involved has acknowledged that some level of preparation for testing (e.g., what is sometimes discussed as a kind of familiarity) impacts results, what fraction of the overall variance in student outcomes would we be willing to tolerate as an unfortunate, if unavoidable, feature of real-world test development?
Not wishing to prejudice anyone’s response to these two questions, we will simply say that our experience has been that the reality is so far out of line with what everyone seems to be assuming that most begin to ask whether tests that give the same results across years of instruction provide any additional information that would warrant their more frequent administration (e.g., in end-of-course exams, or even the short-cycle assessments being administered in some districts every two weeks) or their continued use as instruments of schooling-related accountability.
Moreover, the repackaging of this invariance as a measure of something vaguely referred to as “college and career readiness” shouldn’t distract us from the simple fact that this invariance (around 72% of the variance in our studies) – whatever it gets called – so completely overwhelms any annual, school-related input factors (under 16% of the variance is what we are starting to believe is a pretty robust upper limit of the sensitivity to instruction for large-N studies) that targeting this test-taking construct makes sense if scores on these tests are to be the terms by which everyone, and every policy, is to be judged.
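To put those two figures side by side (a back-of-the-envelope tally of the numbers just cited, not a substitute for the studies themselves):

stable test-taking construct: roughly 72% of score variance
instruction-sensitive factors: under 16% of score variance
all other sources: the remaining 12% or so

Since 72/16 = 4.5, the stable construct is a target some four to five times larger than anything instruction can plausibly move in a given year – which is precisely why targeting it looks like the rational strategy.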
So maybe your sarcasm might need to be replaced by a willingness to actually look under the hood of how tests are developed and how these procedures result in tests that are remarkably insensitive to precisely those factors for which schools can, and should, be held accountable.
Some of our empirical results, as well as a discussion of alternatives that can be administered at scale and at costs likely to be well under what my state is currently spending (close to $500M for the current Texas contract for the “new” STAAR tests), can be found at participatorylearning.org.
I agree that the unintended consequences of so much testing are significant. And I thoroughly agree that conventional testing is relatively insensitive to day-in and day-out student learning (indeed, I made this point myself almost 20 years ago in Assessing Student Performance). But that still doesn’t fully explain why people act irrationally in the face of such tests. It is still the case that when people like Jaime Escalante held kids accountable to AP standards, they passed tests they otherwise would not have.
I have looked at the tests. And I have worked on many of the major alternative-assessment projects (the Vermont Portfolio, the Coalition of Essential Schools). I have worked for AP, I have worked for IB, and I have consulted to three states on their curriculum and assessment – I think I know of what I speak. More to the point, all really great schools and teachers, be they 90-90-90 schools, the Lezotte schools, or the teachers in the Tennessee value-added studies, make ‘outlier’ value-added gains on such tests. I have audited dozens of local assessments: they are almost all inferior to state and national assessments in terms of both validity and rigor. That’s my only point here: improvement on these tests is within our control, and great teaching and (especially) great assessment is key. I approve of neither excessive testing nor high-stakes accountability based on a single test score. All I am saying is that there is no evidence available that test prep is more effective than really good teaching and instructional activities at raising scores, so let’s worry about the right things and stop abdicating our obligations as educators.
“But that still doesn’t fully explain why people act irrationally in the face of such tests.”
To clarify, our explanation goes something like this: If we all agree that the tests we are using are remarkably insensitive (certainly in Texas) to meaningful content-related instruction (that is, to what the public and the state governments want them to be sensitive to for accountability purposes), then a question naturally arises … What are they sensitive to? And it is precisely in our willingness to try to answer this question that, as a practical matter, we just might start to understand how even the most zealous forms of test prep we are seeing in schools could be seen as a plausibly rational (if still perverse) response to the realities educators currently face.
Bluntly put, if your job is seen to depend — disproportionately, if not yet exclusively — on getting “the scores” to go up in the very near term, then you’ll probably put your limited resources, time and effort toward what might matter most in changing the scores. With only a few arrows to work with in your academic change quiver, why wouldn’t you shoot at the bigger target in terms of the overall variance? Why wouldn’t you target the 70+% of the variance that is some kind of test-taking ability (or whatever the publishers or their apologists might choose to call this part of the variance this week) and focus much less on the relatively small portion of the variance that can be attributed to, in any large N implementation study we’ve seen (certainly in math and science), differences in instructional practices, materials, or other domain-specific input factors?
If, however, the “analyses” of high-stakes items stay only at the level of which ones we like or don’t like, or of other under-specified characteristics, then we’ll never get at the underlying psychometric properties that drive the larger reality of what schooling starts to look like “at scale”. These like/don’t-like analyses might still have the upside, nonetheless, of permitting us to continue to dismiss the passionate and increasingly well-informed concerns of “people” attempting to do the best they can by way of their children or their students, by calling them “irrational” … or even trying out a “bit” of sarcasm to tamp down their enthusiasm.
Finally, we can compare credentials, backgrounds and the like some other time. I’ve never doubted your qualifications to speak to the issues you’ve raised. My only hope is that you might allow that those of us who now find ourselves with very different accounts of what is happening in schools are fully committed to being reasonable (or, at the very least, not hysterical) … and then we’ll see where we can go from there…
I like what you are saying, but how do you specifically address the needs of low-income and minority students? My husband teaches English at a low-income, predominantly African American school on the west side of Chicago. He creates engaging curriculum that focuses on the higher-order skills you spoke of; he truly is a star teacher and does an amazing job, yet his students still do not do well on tests, even when he KNOWS they KNOW the answer. Why? Because they lack something called domain knowledge, which often comes from having a stable family and lots of rich and engaging experiences on an everyday level. Many of his high school students don’t know where Europe is, or whether it is a country or a continent. There is a lot they do not know. So while he may get them engaged in a lesson that requires higher-order thought, that success does not always translate to a test. I have said it for years: “create great schools with great teachers and the high test scores will be there without any prep.” But I am not so sure that this is true anymore. What we need is the high-level questioning, intriguing assignments, etc., that you spoke of, REGARDLESS of test scores. We can use the scores as a guide, but they do not indicate whether learning is taking place in the classroom. I dare you and anyone else to see my husband in his classroom and tell me those kids are not learning and engaged. Of course, not all teachers are as great as my husband, but you could multiply him in all the classrooms in the entire school and the test scores probably would not go up that much. The lack of domain knowledge, the crises in these kids’ lives, lack of sleep, and many other unmet basic care necessities would make it really difficult to have high test scores.
There is no easy out in education. There is no panacea. Look for good teachers not through test scores but by watching them teach. Then get them to mentor other teachers. Use test scores as a guide, not as a goal of hitting an arbitrary number. Test scores should never be the focus – good teaching should be.
I apologize for any typos. I am trying to type this while taking care of my 9-month-old daughter!
That’s really an interesting article. I am not so well versed on these points, but I would appreciate it if all parents gave their children a happiness test, to get an actual measure of their interest and ability. Thought I’d share!
[…] understanding of the text? Might there then be a link between such an approach and the fact that released test items show repeatedly that only about half of our students can identify the main idea in a text […]
[…] you see, therefore, how test preparation done right would mean that students gain practice in drawing from their repertoire with no teacher prompting, […]