AK STAR
It is essential that assessment design align with evidence-based instructional practices. When there is a disconnect between how students are taught and how they are assessed, the validity and reliability of the resulting data are compromised (American Educational Research Association [AERA], American Psychological Association [APA], & National Council on Measurement in Education [NCME], 2014).
In classroom instruction, students are explicitly taught research-supported strategies such as previewing questions to establish purpose and flagging items to revisit. These practices are grounded in cognitive science: research demonstrates that reducing cognitive load improves performance and accuracy (Sweller, 1988), while metacognitive strategies such as self-monitoring and revisiting work significantly enhance outcomes (Zimmerman, 2002). Additionally, large-scale testing organizations have found that students who are able to review and change answers tend to improve their overall scores (Educational Testing Service, 2018).
However, the current structure of the Alaska System of Academic Readiness (AK STAR) does not allow students to preview questions or flag and return to items. This creates a clear misalignment between instructional practice and assessment conditions.
From a measurement perspective, this introduces several critical threats to data quality:
Increased random error: When students encounter difficult questions early and cannot skip them, they are more likely to guess, reducing score reliability (Cronbach, 1951); a brief illustration follows this list.
Construct-irrelevant variance: The assessment begins measuring factors such as anxiety and frustration tolerance rather than intended academic skills (AERA, APA, & NCME, 2014).
Reinforcement of guessing behavior: Once a student is forced to guess early, the likelihood of continued guessing increases, compounding error across the assessment (Pellegrino et al., 2001).
Cognitive fatigue and disengagement: Restrictive testing conditions increase anxiety and reduce persistence, particularly among younger learners (Willingham, 2009).
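To make the reliability concern concrete, classical test theory offers a standard decomposition of any observed score X into a true score T and random error E:

X = T + E,    reliability = Var(T) / (Var(T) + Var(E))

The following figures are purely hypothetical, not AK STAR data: if Var(T) = 80 and forced guessing raises Var(E) from 20 to 40, reliability falls from 80/100 = .80 to 80/120 ≈ .67. On a four-option multiple-choice item, a forced guess carries only a 25% chance of credit, so each forced early response injects exactly this kind of chance variance into the score.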
When a student encounters a challenging question within the first few items and is unable to skip or return, the assessment shifts from measuring problem-solving ability to capturing a forced response. At that point, the resulting data can no longer be considered a valid indicator of student understanding.
If assessment systems are intended to produce valid, reliable, and actionable data, they must allow students to apply the same cognitive and metacognitive strategies they are taught in daily instruction. Without this alignment, we risk making high-stakes instructional and policy decisions based on data that reflect testing constraints rather than true student proficiency.