JoLLE Forum-The Disconnect Between Standards and Assessment

Elizabeth A. Kahn
James B. Conant High School, Retired
Hoffman Estates, IL

 

Since at least the early 1960s, educators and policymakers have been engaged in a recurring process of developing, revising, updating, and rewriting learning standards (earlier called instructional objectives or goals), whether at the local, state, or national level. What has been constant through more than a half century is that for the most part standards and assessment have been disconnected from each other. I maintain that this disconnect makes it unlikely that either standards or assessment will achieve the intended purpose of improving teaching and learning.

In what sense have standards and assessment been disconnected? Let’s first consider the instructional purposes of learning standards. The basic purpose of standards is to answer the following two questions that “drive the daily work of the school”: “What do we want students to learn (what should each student know and be able to do?) and how will we know if they have learned?” (DuFour, DuFour, & Eaker, n.d.). Standards are intended to communicate the answers to these two questions as clearly and as transparently as possible to teachers, parents, students, administrators, community members, legislators, policymakers, and other stakeholders. The goal of a common set of standards is that any interested party should have a clear, shared, identical understanding of what students will know and be able to do, and how others will know if they have learned the intended curriculum. The Illinois State Board of Education website states that with the new Common Core standards, “students and parents will clearly understand the knowledge students are expected to attain each year” (emphasis in original).

In his book on how to develop instructional objectives (now “standards”), which has sold over three million copies worldwide, Robert F. Mager (1962, 1975, 1984) argues that an objective (or standard) is useful to the extent that it “succeeds in communicating an intended instructional result to the reader . . . to the extent that it conveys to others a picture of what a successful learner will be able to do that is identical to the picture the objective writer had in mind” (p. 19; emphasis in original). In order to clearly communicate an intended instructional result, Mager (1984) argues that an objective (or standard) needs three elements: the performance (what the learner is expected to be able to do, sometimes the product or result of the doing), the conditions under which the performance is to occur, and criteria describing the characteristics of acceptable performance, i.e., how well the learner must perform in order to be considered acceptable. Mager (1984) states that “a well-written objective will dictate the form of the test items by which the objective can be assessed” (p. 98; emphasis added). In other words, a standard will dictate the form of assessment.

While the Common Core standards for English language arts have some of these three elements to varying degrees, for the most part they are missing elements and as a result do not achieve the intended goal of clearly communicating expectations for student learning. Here’s an example from the standards for Grades 9-10 (all references to the Common Core standards for English language arts are from the Illinois State Board of Education website):

Cite strong and thorough textual evidence to support analysis of what the text says explicitly as well as inferences drawn from the text.

The question that is not answered is what exactly students will be asked to do to show their achievement of this standard. In other words, what will students do, or be expected to do, to show that they can “cite strong and thorough textual evidence to support analysis…”?

In Mager’s (1984) terms, this standard provides rather sketchy information about the performance expected of students, no explanation of the conditions for the performance, and a few insufficient criteria describing an acceptable performance. What specifically will students be asked to do? Would the assessment involve saying to students, “Now show us in some way that you can cite strong and thorough textual evidence to support analysis of what a text says explicitly as well as inferences drawn from the text”? Will students be given a specific text to read? Will it be a “surprise” text, or will they focus on a text that they have read previously? Will they be able to choose a text themselves? Will students be asked specific questions to answer about the text? If so, what will the nature of the questions be? Will questions be open-ended or multiple choice/forced choice? Will they be text dependent (e.g., How do Rainsford’s attitudes toward animals differ from Whitney’s?) or general questions that can be applied to various texts (e.g., Does the main character of the story change in any significant ways?)? Will students be expected to bubble a given answer, write a sentence or two, write a short argument (a paragraph or two), or write a well-developed, multi-paragraph argument, forms of writing that Langer and Applebee (1987) have demonstrated rely on vastly different challenges in thinking? The standard does not indicate what task students will be asked to do to show that they have achieved this standard. It hardly hints at, much less dictates, the form of the assessment.

Some of the Common Core standards for writing come a little closer to suggesting how students will be assessed. Here’s an example for Grades 9-10:

*    Write arguments to support claims in an analysis of substantive topics or texts, using valid reasoning and relevant and sufficient evidence.

  1. Introduce precise claim(s), distinguish the claim(s) from alternate or opposing claims, and create an organization that establishes clear relationships among claim(s), counterclaims, reasons, and evidence.
  2. Develop claim(s) and counterclaims fairly, supplying evidence for each while pointing out the strengths and limitations of both in a manner that anticipates the audience’s knowledge level and concerns.
  3. Use words, phrases, and clauses to link the major sections of the text, create cohesion, and clarify the relationships between claim(s) and reasons, between reasons and evidence, and between claim(s) and counterclaims.
  4. Establish and maintain a formal style and objective tone while attending to the norms and conventions of the discipline in which they are writing.
  5. Provide a concluding statement or section that follows from and supports the argument presented.

This standard identifies the performance and provides criteria describing the characteristics of an acceptable performance. However, the conditions are unclear. What kind of writing prompt will students be given? What does “substantive topic” mean? Will students be expected to write in a limited time frame under “test conditions”? Will students be given evidence to examine on the topic? Will students be able to engage in research to find evidence? Will students be expected to generate evidence on the spot only from their own memory? Will students be able to craft pieces of writing over time with input from others, as in creating a portfolio of writing?

The conditions are important because they determine what kind of instruction teachers will need to design to prepare students for the assessment. Teaching students how to generate effective supporting evidence from their own memory in a short time period (say, 30 minutes) will involve different instructional activities than teaching students to find evidence within a text or to find relevant evidence using sources such as the Internet.

So why does it matter? So what if the standards don’t dictate the assessment? Surely all secondary English language arts teachers focus on teaching students to provide textual evidence that is warranted to support their analyses or on writing effective arguments, both impromptu and long-term assignments? Certainly all secondary English language arts teachers use various methods on a regular basis to assess whether students can support claims, provide textual evidence, use valid and warranted reasoning, and so forth? Is this concern about how to develop standards just nit-picking “edu-babble”? Imagine encountering the following standard:

Students will read a text and answer multiple-choice questions focusing on vocabulary and explicit and implicit information in the text.

In over 35 years of teaching, I’ve never seen this standard in any set of standards. It doesn’t pass Mager’s (1984) criteria of being a meaningful, authentic performance worthy of status as a learning standard. I believe that most people would agree that the purpose of learning to read well is not for the ultimate purpose of taking traditional multiple-choice, standardized tests. Being able to read well is crucial for learning processes related to one’s career, understanding relationships, knowing what is happening in the world, participating in a democratic society, and otherwise engaging fruitfully with others.

However, the standard I created reflects the most common form of high-stakes assessment, hence the mismatch between standards and assessment. When this mismatch occurs, the focus becomes teaching to the high-stakes test rather than to the standards.

What is involved in creating standards that “dictate” assessment? Teachers need to determine what specific tasks we will ask students to perform to show that, for example, they can interpret a text and support their viewpoints with textual evidence. The explanation of the tasks would include the performance, the conditions for students, and the criteria describing the characteristics of acceptable performance. I use the term task, not in its connotation as an unpleasant, tiresome job, but to mean a specific action (such as relating a narrative or engaging in argumentation) with an observable outcome (such as a written or spoken performance) that is authentic, worthwhile, and meaningful in and of itself to the writer or speaker and, ideally, to those with whom they communicate.

Here’s an example of what one assessment task (standard) might be for the end of Grade 9. I’m not arguing that satisfying this task’s demands and expectations should indeed be a Common Core standard, but simply illustrating how the three elements—performance, conditions, and criteria—might be provided when developing a standard:

Students read a short literary text and write an argument explaining the central meaning of the piece with specific supporting evidence from the text. They will be given a literary text; its difficulty level could be described in much the way that the Common Core standards currently specify difficulty level with a number of sample works provided as illustrations. Students will have an hour to read the text and write their response without access to outside sources, etc. The standard would include a rubric describing the characteristics of an acceptable response as well as sample student responses at various levels of performance to help all those involved (students, teachers, parents, and other stakeholders) clearly understand the expectations.

A standard presented in this way would communicate clearly what students will be asked to do, under what conditions they will be assessed, and what characteristics are necessary for an acceptable response. This sample standard dictates the form of the assessment. And I believe that an argument can be made that this task is worthwhile, meaningful, and authentic. (For other possible assessment tasks, see Smagorinsky, Johannessen, Kahn, & McCann, 2010, 2011-2012.)

How would developing literacy standards in this way make a difference? How would it change the ways that standards are used and the effects of standards on teaching and learning?

First, in English language arts, we would have fewer standards. Realistically, how many different open-ended tasks can students be expected to perform in an end-of-year assessment situation? Standards developers would need to focus in on the most important “tasks” that students need to be able to perform. Though there would be fewer standards, each would include much more explanation since each standard would specify a task that students would be asked to perform along with conditions and criteria. Each would include a rubric for evaluating student work and numerous samples of student work at various levels of performance to illustrate the characteristics described in the rubric. Some of what are now separate, discrete standards in the Common Core, such as citing textual evidence, would become criteria within other standards, rather than being standards by themselves.

With fewer standards, it would most likely be more difficult for all stakeholders to agree to a set of standards. One way to achieve consensus on standards is by including everything to satisfy a multitude of different groups and interests. Some will argue that fewer standards will narrow the curriculum too much. But if standards are written so that they dictate assessment, can we realistically expect to assess students at the end of a school year by having them do 25 or 35 different assessment tasks that each likely involves extended responses? A number of researchers (Ainsworth, 2003; DuFour et al., 2010; Marzano & Haystead, 2008; Reeves, 2002) argue for fewer standards, say, six to 12 “power standards” each year, so that each can be assessed in the classroom every month or quarter.

Most important, the standards could be used meaningfully by teachers in designing instruction. By knowing exactly what the final assessments will be, they can use Wiggins and McTighe’s (2005) “backwards design” model (cf. Hillocks, McCabe, & McCampbell, 1971). Teachers would have enough information to be able to develop their own specific assessments to use in the classroom (such as the assessment task above involving interpreting the central meaning of a text). They could give students a pretest at the beginning of the year to determine what aspects of this task students could already do and what specific areas they would need to target through instruction. Teachers could then design instructional activities to engage students in practicing various aspects of the assessment task. In small groups, students could analyze a set of responses to the prompt and rate them according to the rubric, thereby learning to understand and apply the criteria. Teachers could give formative assessments on an ongoing basis to determine student progress and to direct the course of instruction as the year continues. This kind of “teaching to the test” can lead to effective instruction and is very different from teaching to a traditional standardized test.

Formative assessments replicating the standards would provide teachers with meaningful data for instructional decision making. For example, teachers might discover that on the pretest most of the students provided a short summary of the plot of the text as the central meaning. They would then conclude that they need to design instructional activities that will help students learn the difference between retelling what literally happens and interpreting the central meaning. And they would be able to have some idea of which students have and which students lack this understanding. They would be able to identify students who are confused about necessary background information or who can’t figure out the meaning of specific words. It is much easier to glean this kind of information about students when they explain their thinking in open-ended responses than when teachers have lists of scores based on responses to forced-choice questions on standardized tests. Standardized test scores provide little information about how the student was thinking about an item and what caused him or her to miss certain items or interpret them in ways not anticipated by the manner in which the task was posed. “Data” from formative assessments such as described above can guide instructional decision making in much richer ways.

With clearer standards, teachers would have a strong basis for working collaboratively in professional learning teams to create assessment tasks that match the standards, to reach a level of consistency in evaluating their students’ work, to identify strengths and weaknesses, and to plan instruction to address their students’ needs.

The standards would communicate clearly and specifically to teachers, parents, students, administrators, and policymakers exactly what the expectations for students are. Therefore, it would be clear when a particular assessment doesn’t match the standards. For example, a standardized, multiple-choice reading test does not match a standard that asks students to write an argument interpreting the central meaning of a text. At present, since the standards are vague and often do not dictate the method of assessment, it is difficult for teachers to convince policymakers, parents, and the general public that a particular assessment is inappropriate or problematic. And certainly standards that dictate meaningful assessment tasks would show the problems involved in evaluating teachers based on students’ scores on tests that are inconsistent with the standards. If there is a perceived need for traditional standardized testing, it would have to be justified for a purpose other than directly assessing the standards.

With clearer standards, teachers would be able to show evidence of their students’ progress by administering and evaluating the standards/assessment tasks themselves. They would be able to show evidence of student achievement without the need for flawed “value-added measurements” based on standardized test scores.

At this point assessments have not yet been developed for the Common Core standards. And because of the large number and vagueness of the standards, it is difficult to guess what the assessments will be. Teachers and administrators are left with many questions about what students will be expected to do. When the Common Core standards were developed, instead of thinking about standards (what do we want students to know or be able to do) in conjunction with thinking about the assessment (what will we ask students to do to show that they have learned), the process developed in at least two separate stages, starting with the standards, without clearly specifying assessment tasks (performance, conditions, and evaluation criteria). Then, some years later, assessments are devised and unveiled.

There are certainly many valid concerns that can be raised about the Common Core standards: their content, their development, their use, and their commercialization, to name a few. I’m not necessarily advocating for common national or state standards. But for standards to be effective in helping improve teaching and learning, they need to communicate clearly and specifically what students will be asked to do to show their learning, what conditions will be established, and what criteria will be used in determining whether students have learned. In other words, standards need to focus more explicitly on how student learning will be assessed.

References

Ainsworth, L. (2003). Power standards. Englewood, CO: Advanced Learning Press.

DuFour, R., DuFour, R., & Eaker, R. (n.d.). A big picture look at Professional Learning Communities. Bloomington, IN: Solution Tree. Available at http://www.allthingsplc.info/pdf/links/brochureText.pdf

DuFour, R., DuFour, R., Eaker, R., & Karhanek, G. (2010). Raising the bar and closing the gap. Bloomington, IN: Solution Tree.

Hillocks, G., McCabe, B. J., & McCampbell, J. F. (1971). The dynamics of English instruction, grades 7-12. New York: Random House. Available at http://smago.coe.uga.edu/Books/Dynamics/Dynamics_home.htm

Illinois State Board of Education. (2012). The new Illinois Learning Standards incorporating the Common Core. Available at www.isbe.state.il.us/common_core/default.htm.

Langer, J. A., & Applebee, A. N. (1987). How writing shapes thinking: A study of teaching and learning. NCTE Research Report No. 22. Urbana, IL: National Council of Teachers of English. Available at http://wac.colostate.edu/books/langer_applebee/langer_applebee.pdf

Mager, R. F. (1984). Preparing instructional objectives (2nd ed.). Belmont, CA: David S. Lake Publishers.

Marzano, R. J., & Haystead, M. W. (2008). Making standards useful in the classroom. Alexandria, VA: ASCD.

Reeves, D. B. (2002). Making standards work (3rd ed.). Denver, CO: Advanced Learning Press.

Smagorinsky, P., Johannessen, L. R., Kahn, E. A., & McCann, T. M. (2010). The dynamics of writing instruction: A structured process approach for middle and high school. Portsmouth, NH: Heinemann.

Smagorinsky, P., Johannessen, L. R., Kahn, E. A., & McCann, T. M. (2011-2012). The dynamics of writing instruction series. Portsmouth, NH: Heinemann.

Wiggins, G., & McTighe, J. (2005). Understanding by design. Alexandria, VA: ASCD.

 

