You might expect to see a headline like this in The Onion, but you won’t. The Onion can’t run it because it isn’t just ironic—it’s 100% true.
A few years ago, a researcher at one of the big testing companies told me that when developing a reading comprehension test, knowledge is a source of bias. He did not mean the obvious stuff like knowledge of a yacht’s anemometer. He meant typical K–12 subject matter.
Since reading comprehension depends chiefly on knowledge of the topic (including the vocabulary) in the passage, the student with that knowledge has a large advantage over the student without it. And since there have always been great educational inequities in the United States, students’ knowledge—acquired both at home and at school—is very strongly correlated with socioeconomic status.
A logical solution would be to test reading comprehension using only those topics that students have been taught. Teachers can do this, but testing companies can’t—how would they have any idea what topics have been taught in each grade? It’s rare for districts, much less states, to indicate what or when specific books, people, ideas, and events should be taught.
Without a curriculum on which to base their assessments, testing companies have devised their own logic—which is sound given the bind they’re in. They distinguish between common and specialized knowledge, and then they select or write test passages that draw only on common knowledge. In essence, they’ve defined “reading comprehension skill” as including broad common knowledge. This is perfectly reasonable. When educators, parents, etc. think about reading comprehension ability, they do not think of the ability to read about trains or dolphins or lightning. They expect the ability to read about pretty much anything one encounters in daily life (including the news).
I already had this basic understanding, but still I found the “ETS Guidelines for Fairness Review of Assessments” eye-opening. Guideline 1 is to “avoid cognitive sources of construct-irrelevant variance…. If construct-irrelevant knowledge or skill is required to answer an item and the knowledge or skill is not equally distributed across groups, then the fairness of the item is diminished” (p. 8). It continues, growing murkier:
Avoid unnecessarily difficult language. Use the most accessible level of language that is consistent with valid measurement…. Difficult words and language structures may be used if they are important for validity. For example, difficult words may be appropriate if the purpose of the test is to measure depth of general vocabulary or specialized terminology within a subject-matter area. It may be appropriate to use a difficult word if the word is defined in the test or its meaning is made clear by context. Complicated language structures may be appropriate if the purpose of the test is to measure the ability to read challenging material.
Avoid unnecessarily specialized vocabulary unless such vocabulary is important to the construct being assessed. What is considered unnecessarily specialized requires judgment. Take into account the maturity and educational level of the test takers in deciding which words are too specialized.
On page 10, it offers this handy table that “provides examples of common words that are generally acceptable and examples of specialized words that should be avoided…. The words are within several content areas known to be likely sources of construct-irrelevant knowledge”:
Since having good reading comprehension means being able to read about a wide variety of common topics, table 1 seems just fine. But testing companies’ silence about what their reading comprehension tests actually measure is not. They say they are measuring “reading comprehension skill,” but their guidelines show that they are measuring a vaguely defined body of “common knowledge.”
Common words are not common to all. Even “common” knowledge is knowledge that must be taught, and right now—at home and at school—far too many children from low-income homes don’t have an opportunity to learn that knowledge (which is common to youth from middle-class and wealthy homes). That’s why reading comprehension scores are so strongly and stubbornly correlated with socioeconomic status.
These tests of “common” knowledge are accurate assessments and predictors of reading comprehension ability, but they are not fair or productive tests for holding children (and their teachers) accountable before an opportunity to learn has been provided.
If all testing companies would clearly explain that their reading comprehension tests are tests of knowledge, and if they would explain—as the ACT’s Chrys Dougherty does—that the only way to prepare for them is to build broad knowledge, then we could begin to create a fair and productive assessment and accountability system. Before the end of high school, all students should have broad enough knowledge to perform well on a reading comprehension test. But what about in third, fourth, or even seventh grade? In the early and middle grades, is a test drawn only from topics that have been taught in school the only fair way to test reading comprehension? How many years of systematically teaching “common” knowledge are needed before a reading comprehension test that is not tied to the curriculum is fair, especially for a student whose opportunities to learn outside of school are minimal?
The answer depends not so much on the test as on what is done with the scores. If we accepted the fact that reading comprehension depends on broad knowledge, we would radically alter our accountability policies. Scores on “common knowledge” reading comprehension tests would be recognized as useful indicators of where students are in their journey toward broad knowledge—they would not be mistaken for indicators of teaching quality or children’s capacity. Instead of holding schools accountable for scores on tests with content that is not tied to the curriculum, we would hold them accountable for creating a content-rich, comprehensive, well-sequenced curriculum and delivering it in a manner that ensures equal opportunity to learn. To narrow the inevitable gaps caused by differences in out-of-school experiences, we would dramatically increase free weekend and summer enrichment opportunities (for toddlers to teenagers) in lower-income neighborhoods. (We would also address a range of health-related disparities, but that’s a topic for another day.)
In sum, reading comprehension really does rely on having a great deal of common knowledge, so our current reading comprehension tests really are valid and reliable. To make them fair and productive, children from lower-income families must be given an equal opportunity to learn the knowledge that is “common” to children from higher-income homes.
Reading is always a test of knowledge (image courtesy of Shutterstock).