Remarks on general and technical assessment topics at the Race to the Top Public Meeting held in Washington, DC on January 20, 2010.

Founded in 1926, CTB/McGraw-Hill has a long track record of innovation and distinction in serving the needs of students, educators, and policy makers with high-quality, valid, and reliable assessments. Right now, for example, we are delivering online writing assessments with artificial intelligence scoring for both formative and summative state tests, technology-enriched interim assessments in some of the nation’s largest school districts and states, vertical scales that support growth models in a number of state NCLB programs, and language assessments that meet the Title III needs of a consortium of states. No organization has greater expertise and experience in building high-quality tests of achievement on nationally appropriate K–12 standards and in conducting the rigorous large-scale research studies required to validate such instruments. We care deeply about the education and testing of our nation’s students, and we appreciate opportunities now and in the future to help shape meaningful education reform.

To address the questions on General and Technical Assessment issued in advance of the meeting, I offer these observations for the Department’s consideration:

  • Summative assessments—those that have real consequences for students, teachers, and schools—must be standardized and validated. They should be built in compliance with the Standards for Educational and Psychological Testing (AERA, APA, & NCME, 1999), reviewed and approved by appropriately credentialed technical advisory committees, and administered with appropriately high levels of test security. Assessments meeting these requirements may measure new Common Core Standards, may include diverse item types eliciting complex student performances (see Yen & Ferrara, 1997, for example), and may be designed in innovative ways (see Patz, 2006, for example).
  • If the "through-course" assessments under Department consideration could be built and used as "interim assessments" and not used for summative purposes requiring standardization, then important goals might be achieved sooner. Such interim assessments can be thoroughly researched, scaled, and linked to summative tests through appropriate data analyses. Freed of the burden of standardization and validation required for summative use, these tests could be used much more flexibly and in a variety of ways (e.g., administered at different times). Test security would be much simpler, teacher professional development could be enhanced, and the pedagogical value of a comprehensive summative assessment (e.g., at end of course) would not be lost.
  • It is more important that tools for formative and interim assessments integrate seamlessly with instructional practice than it is for these assessments to integrate seamlessly with summative tests. The tools that enable a teacher to use student performance data in real time or near real time and adapt instruction accordingly are distinctly different from the tools required to securely administer a high-stakes, large-scale assessment. Although it is possible to create one system incorporating all these tools, separate systems may be more feasible and may serve each purpose better.
  • Opportunities to improve the current K–12 state testing landscape abound. Much of the finest talent available to guide this improvement resides in the testing organizations. Finding ways to engage this talent, learn from past successes and failures, and leverage existing capabilities will be critical to designing future assessment programs that are truly innovative, technically defensible, and feasible.

CTB/McGraw-Hill supports the goals of the Common Core Standards initiative and the Race to the Top grant program. We look forward to opportunities to support the U.S. Department of Education, the states and their consortia, and the agencies and organizations that are advancing these important goals.

References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association.

Patz, R. J. (2006). Building NCLB science assessments: Psychometric and practical considerations. Measurement: Interdisciplinary Research and Perspectives, 4, 199-239.

Yen, W., & Ferrara, S. (1997). Maryland State Performance Assessment Program: Performance assessment with psychometric quality suitable for high stakes usage. Educational and Psychological Measurement, 57, 60-84.