Spring cleaning? How about cleaning out – and cleaning up – your local assessments?
“Good educational systems must have the capacity to evolve over time. Testing systems must also have this capacity, both in relation to their purposes and the actual assessment instruments that are created. Given the more rigorous demands on learning and teaching that have become accepted internationally, exemplified by recent Common Core State Standards, test validation requires a concomitant rigor with a broad range of strong evidence.… not just testing those aspects that are easy to test” (ISDDE Working Group on Examinations and Policy, 2011).
What is alignment?
For many years, I’ve worked with states to develop assessments and/or conduct alignment studies for large-scale assessments administered to both general education and special education students. Alignment has generally been defined as a measure of the extent to which a state’s standards and assessments “agree” and the degree to which they work in conjunction with each other to guide and support student learning. It is not a question that yields a simple ‘yes’ or ‘no’ response; rather, alignment is a considered judgment based on a number of complex factors that collectively determine the degree to which the assessment tools used and evidence collected will gauge how well students are demonstrating achievement of the standards. In other words, how effective is the end-of-year summative assessment – and the assessment system as a whole – in measuring the depth and breadth of knowledge and skills set forth in the content standards in relation to the performance goals and expectations for each grade level? Seems like a tall order for local assessment systems to undertake, huh?
Actually, alignment studies used to be pretty straightforward: analyze each test item for what content and depth of knowledge it assesses, and use that data to determine what is emphasized. Is this test too easy? Too hard? Too narrow in scope? Or, as with the Goldilocks story, “is it just right?”
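For readers who like to see the bookkeeping, here is a minimal sketch of that kind of item-by-item tally. Everything in it is an assumption for illustration: the item IDs, the content strand names, the 1-4 depth-of-knowledge scale, and the emphasis function are hypothetical, not taken from any particular alignment study.

```python
from collections import Counter

# Hypothetical reviewer codings: (item ID, content strand, depth-of-knowledge level).
# The strand names and the 1-4 scale are assumptions for illustration only.
items = [
    ("item1", "Number & Operations", 1),
    ("item2", "Number & Operations", 2),
    ("item3", "Geometry", 1),
    ("item4", "Algebra", 3),
    ("item5", "Number & Operations", 1),
]

def emphasis(items):
    """Return the share of items per content strand and per depth-of-knowledge level."""
    total = len(items)
    by_strand = Counter(strand for _, strand, _ in items)
    by_dok = Counter(dok for _, _, dok in items)
    strand_share = {strand: n / total for strand, n in by_strand.items()}
    dok_share = {dok: n / total for dok, n in by_dok.items()}
    return strand_share, dok_share

strand_share, dok_share = emphasis(items)
print(strand_share)  # share of the test devoted to each strand
print(dok_share)     # share of the test at each depth-of-knowledge level
```

In this made-up form, most items sit in one strand at the lowest level, which is the sort of pattern that would prompt the "too easy?" and "too narrow?" questions above.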
Enter college and career-ready (CCR) standards and assessments
By their nature, CCR assessments must be broader in scope than assessments have needed to be in the past. They comprise not only test items but also performance tasks/prompts, scoring guides, and a range of text passages and text types, and they must be judged on how well they assess the complex challenges laid out in CCR expectations (e.g., the ability to conduct short research projects and presentations, the ability to construct viable mathematical arguments). Let’s face it, this probably cannot be done well in one on-demand test at the end of the school year. Given the purpose, scope, and challenges of implementing today’s college- and career-ready standards and assessments, a broader and deeper examination is now required than alignment studies of the past have provided. I believe that a combination of mixed measures (e.g., complementary formative, interim, and summative assessments), rather than a single assessment, is the best source for making the overall determination of strong alignment.
Development of new methodologies for validating CCR assessments is currently underway. One example is “An Alignment Study Methodology for Examining the Content of High-Quality Summative Assessments of College and Career Readiness in English Language Arts/Literacy and Mathematics” (Hess & Gong, May 2015).
In this paper, we propose six central questions to be answered by the results of the alignment analyses of large-scale assessments and/or local assessment systems, as a basis for making overall judgments about the degree of alignment between CCR standards and what is actually assessed. We then provide detailed guidance and sample tools showing how each question might be “answered” by reviewers using a combination of findings related to the established criteria and data collected.
The Item/Form Evaluation methodology for high-quality college- and career-ready ELA/literacy and mathematics assessments is designed to answer these six central alignment questions:
1. To what degree is there a strong content match between the test items/tasks (and the test as a whole) and the state’s college and career content standards (e.g., defined by grade-level eligible content or content assessment targets)? Content alignment includes grade-level skills/concepts and the ways in which students are expected to demonstrate their understanding of them.
2. Are the test items/tasks (and the test as a whole) more rigorous, less rigorous, or of comparable rigor and complexity to the intended rigor of the state’s college and career content and performance standards at each grade level? Rigor alignment includes analysis of the complexity of content, processing demands (e.g., recall versus justification), and the degree to which assessment tasks reflect preparation for more challenging work in the next grade level or course.
3. Is the source of challenge for test items/tasks appropriate? That is, is the hardest thing about the test items/tasks that which the item/task is targeting for assessment, or is there an underlying factor making the item more difficult to access or comprehend than it should be? (For example, is there an algebra demand embedded in items assessing other math content? Does the reader need particular background knowledge of the topic in order to answer some of the text-related questions? Is there an unnecessary or extensive linguistic or reading load (e.g., complex sentences) in mathematics items? Do math stimuli include unfamiliar and perhaps unnecessary or above-grade-level vocabulary?)
4. Are the texts/stimuli for reading/writing/literacy assessments of appropriate length and complexity for this grade level? And does the balance between literary and informational texts appropriately reflect the intent of the college and career standards and cohere with the state’s test blueprint?
5. To what degree do the content coverage and test design (of this assessment) include assessment of all of the major strands or claims (i.e., in evidence-centered design assessments) as described in grade-level eligible content (standards) in English language arts/literacy and mathematics at corresponding grade levels?
6. To what degree do the test blueprint and set of items (test as a whole) emphasize content and performance expectations (e.g., application of mathematical practices to mathematics concepts and procedures; increasing text complexity) to elicit evidence that students are preparing to perform more sophisticated, college and career-type work in the future?
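As a rough illustration of how data for the content-match and rigor questions might be summarized, here is a minimal sketch. It assumes each reviewed item has been matched to at most one standard and given a rigor rating on the same numeric scale as that standard's intended rigor. The standard codes, ratings, and the rigor_summary function are hypothetical and are not drawn from the Hess & Gong methodology.

```python
from collections import Counter

# Hypothetical intended rigor level for each standard (assumed numeric scale).
intended = {"STD.A": 2, "STD.B": 3, "STD.C": 1}

# Hypothetical reviewer codings: (item ID, matched standard or None, rated rigor level).
codings = [
    ("q1", "STD.A", 2),
    ("q2", "STD.A", 1),
    ("q3", "STD.B", 3),
    ("q4", "STD.B", 2),
    ("q5", "STD.C", 2),
    ("q6", None, 1),  # reviewer found no matching grade-level standard
]

def rigor_summary(codings, intended):
    """Count items that are more, less, or comparably rigorous relative to their standard."""
    counts = Counter()
    for _, standard, level in codings:
        if standard not in intended:
            counts["no content match"] += 1
        elif level > intended[standard]:
            counts["more rigorous"] += 1
        elif level < intended[standard]:
            counts["less rigorous"] += 1
        else:
            counts["comparable"] += 1
    return dict(counts)

summary = rigor_summary(codings, intended)
print(summary)
```

A tally like this only feeds the reviewers' discussion. Questions about source of challenge, text complexity, and overall emphasis still call for the considered judgment described earlier.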
More about the alignment criteria
Our first step in developing the content alignment methodology was to examine the criteria and sample evidences laid out in the document, Criteria for Procuring and Evaluating High-Quality Assessments. “…this document pays particular attention to not only the criteria states could ask vendors to meet, but also to the evidence states could ask vendors to provide to demonstrate criteria have been – or will be – met. States will, of course, adapt these criteria to reflect their context, standards, and procurement regulations” (CCSSO, 2013, p.1). The content criteria and sample evidences for mathematics and ELA/literacy assessments prioritize specific areas of the Common Core State Standards and describe particular features to be included in the vendor’s test design.
[CCSSO. (Oct. 2013). Criteria for procuring and evaluating high quality assessments. [Online] available: 2014.pdf]
Applying alignment methods locally
An examination of each central alignment question – through a thoughtful review of multiple test forms (including item banks), test blueprints, and supporting rationales that guided test development – will yield a range and variety of information (evidence) about the overall quality of the summative assessment. School districts can use many of these same tools and protocols to examine the quality of their local assessment systems, asking: how well does the combination of local and large-scale assessments give us meaningful and actionable information about student learning? The quality of the summative assessment now becomes part of making your overall decision.