Multistage testing (MST) is a type of computerized adaptive testing (CAT). That is, it is an exam delivered by computer that is dynamically personalized for each examinee or student. Typically, this is done with respect to the difficulty of the questions, by making the exam easier for lower-ability students and harder for higher-ability students. Doing this can make the test shorter and more accurate while providing additional benefits. This post will provide more information on multistage testing so you can evaluate whether it is a good fit for your organization.
Already interested in MST and want to implement it? Contact us to talk to one of our experts and get access to our powerful online assessment platform, where you can create your own MST and CAT exams in a matter of hours.
What is multistage testing?
Like CAT, multistage testing adapts the difficulty of the items presented to the student. But while item-by-item adaptive testing adapts after every single item using item response theory, multistage testing adapts after each block of items. That is, CAT will deliver one item, score it, pick a new item, score it, pick a new item, etc. Multistage testing will deliver a block of items, such as 10, score them, then select and deliver another block of 10.
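The block-by-block flow can be sketched in a few lines of Python. This is a minimal illustration, not how any particular platform works: the function names (`route`, `deliver_mst`), the level labels, the placeholder items, and the number-correct cutoffs of 40% and 70% are all assumptions made for the example. Operational designs typically derive their routing rules from IRT rather than from fixed percentages.

```python
def route(num_correct, block_size):
    """Choose the next block's level from the share of items answered correctly.

    The 40% and 70% cutoffs are illustrative assumptions.
    """
    share = num_correct / block_size
    if share < 0.4:
        return "easy"
    if share < 0.7:
        return "medium"
    return "hard"


def deliver_mst(stages, answer_item):
    """Deliver a multistage test block by block.

    stages: list of dicts mapping a level name to a block (list) of items;
        the first stage holds a single "routing" block.
    answer_item: callable returning True if the examinee answers correctly.
    Returns the path taken as (level, number correct) pairs.
    """
    level = "routing"
    path = []
    for stage in stages:
        block = stage[level]
        # Unlike item-by-item CAT, the whole block is scored before adapting.
        num_correct = sum(1 for item in block if answer_item(item))
        path.append((level, num_correct))
        level = route(num_correct, len(block))
    return path


# Hypothetical 1-3-3 test with 10 placeholder items per block.
levels = ("easy", "medium", "hard")
stages = [
    {"routing": [f"R{i}" for i in range(10)]},
    {lvl: [f"S2-{lvl}-{i}" for i in range(10)] for lvl in levels},
    {lvl: [f"S3-{lvl}-{i}" for i in range(10)] for lvl in levels},
]

# An examinee who answers everything correctly is routed to the hard blocks.
print(deliver_mst(stages, lambda item: True))
```

An item-by-item CAT would instead re-select after every single response, which is the same loop with a block size of one and a much larger set of possible next items.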
The design of a multistage test is often referred to as panels. There is usually a single routing test or routing stage that starts the exam, and then students are directed to different levels of panels for subsequent stages. The number of levels at each stage is sometimes used to describe the design; the example on the right is a 1-3-3 design, meaning one routing stage followed by two stages with three levels each. Unlike CAT, there are only a few potential paths, unless each stage has a pool of available testlets.
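To make "only a few potential paths" concrete, the short sketch below enumerates the routes through a design such as 1-3-3. The function name `enumerate_paths` and the optional `max_jump` restriction (allowing moves only to the same or an adjacent level) are assumptions for illustration; real designs may or may not restrict routing this way.

```python
from itertools import product


def enumerate_paths(design, max_jump=None):
    """List every route through a panel.

    design: number of levels at each stage, e.g. [1, 3, 3].
    max_jump: if set, the largest allowed change in level between two
        consecutive multi-level stages (an illustrative restriction).
    """
    paths = []
    for path in product(*(range(n) for n in design)):
        allowed = True
        for stage in range(1, len(design)):
            # Leaving a single routing block, every level is reachable.
            if max_jump is None or design[stage - 1] == 1:
                continue
            if abs(path[stage] - path[stage - 1]) > max_jump:
                allowed = False
        if allowed:
            paths.append(path)
    return paths


print(len(enumerate_paths([1, 3, 3])))              # 9 unrestricted paths
print(len(enumerate_paths([1, 3, 3], max_jump=1)))  # 7 if only adjacent moves are allowed
```

Compare this with item-by-item CAT, where the number of possible item sequences grows with every item administered.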
As with item-by-item CAT, multistage testing is almost always done using item response theory (IRT) as the psychometric paradigm, selection algorithm, and scoring method. This is because IRT can score examinees on a common scale regardless of which items they see, which is not possible using classical test theory.
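The common-scale point can be illustrated with a small scoring sketch. It assumes the two-parameter logistic (2PL) IRT model and an expected a posteriori (EAP) estimate with a standard normal prior on a fixed grid; the model choice, the function names, and the item parameters are all assumptions for the example, not a description of any specific platform.

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct response (a = discrimination, b = difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def eap_theta(responses, items, grid_min=-4.0, grid_max=4.0, steps=81):
    """EAP ability estimate with a standard normal prior, computed on a grid."""
    numerator = denominator = 0.0
    for k in range(steps):
        theta = grid_min + k * (grid_max - grid_min) / (steps - 1)
        weight = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior
        for u, (a, b) in zip(responses, items):
            p = p_correct(theta, a, b)
            weight *= p if u == 1 else 1.0 - p
        numerator += theta * weight
        denominator += weight
    return numerator / denominator


# Hypothetical (a, b) parameters for an easy block and a hard block.
easy_items = [(1.0, -2.0), (1.0, -1.5), (1.0, -1.0), (1.0, -0.5), (1.0, 0.0)]
hard_items = [(1.0, 0.0), (1.0, 0.5), (1.0, 1.0), (1.0, 1.5), (1.0, 2.0)]

# Both examinees answer 3 of 5 items correctly, but on different blocks.
responses = [1, 1, 1, 0, 0]
theta_easy = eap_theta(responses, easy_items)
theta_hard = eap_theta(responses, hard_items)
print(f"3/5 on easy block: theta = {theta_easy:.2f}")
print(f"3/5 on hard block: theta = {theta_hard:.2f}")
```

The same raw score of 3 out of 5 yields a higher ability estimate on the hard block than on the easy block, which is the sense in which IRT scores are comparable across examinees who saw different items.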
Why multistage testing?
Item-by-item CAT is not the best fit for all assessments, especially those that naturally tend towards testlets, such as language assessments where there is a reading passage with 3-5 associated questions.
Multistage testing allows you to realize some of the well-known benefits of adaptive testing (see below), with more control over content and exposure. In addition to controlling content at the examinee level, it can also make it easier to manage item bank usage for the organization.
How do I implement multistage testing?
1. Develop your item banks using items calibrated with item response theory
2. Assemble a test with multiple stages, defining pools of items in each stage as testlets
3. Evaluate the test information functions for each testlet
4. Run simulation studies to validate the delivery algorithm with your predefined testlets
5. Publish for online delivery
Our industry-leading assessment platform manages much of this process for you. The image to the right shows our test assembly screen where you can evaluate the test information functions for each testlet.
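For step 4, a simulation study generates responses for simulated examinees of known ability and runs them through the routing rules. The sketch below is a deliberately small version of that idea, reporting only how often each path is used. Everything in it is an assumption for illustration: the 2PL response model, the hypothetical item parameters, the number-correct cutoffs, and the standard normal ability distribution. A fuller study would also compare estimated ability against the true simulated values.

```python
import math
import random
from collections import Counter


def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def simulate_path(theta, panel, rng):
    """Route one simulated examinee through the panel and return the path taken."""
    level, path = "routing", []
    for stage in panel:
        module = stage[level]
        # Generate a random response to each item from its 2PL probability.
        correct = sum(rng.random() < p_correct(theta, a, b) for a, b in module)
        path.append(level)
        share = correct / len(module)
        # Illustrative number-correct routing cutoffs (assumed).
        level = "easy" if share < 0.4 else "medium" if share < 0.7 else "hard"
    return tuple(path)


def make_module(center):
    """Five hypothetical items with a = 1.0 and difficulties spread around center."""
    return [(1.0, center + offset) for offset in (-0.5, -0.25, 0.0, 0.25, 0.5)]


centers = {"easy": -1.0, "medium": 0.0, "hard": 1.0}
panel = [
    {"routing": make_module(0.0)},
    {name: make_module(center) for name, center in centers.items()},
    {name: make_module(center) for name, center in centers.items()},
]

rng = random.Random(42)  # fixed seed so the run is repeatable
n_examinees = 2000
path_counts = Counter(
    simulate_path(rng.gauss(0.0, 1.0), panel, rng) for _ in range(n_examinees)
)
for path, count in path_counts.most_common():
    print(" -> ".join(path), f"{count / n_examinees:.1%}")
```

Paths that are rarely or never used, or modules that nearly everyone sees, are the kind of finding that would send you back to adjust the testlets or the routing rules before publishing.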
Benefits of MST
There are a number of benefits to this approach, which are mostly shared with CAT.
- Shorter exams: because difficulty is targeted, examinees spend less time on items that are far too easy or far too hard for them
- Increased security: There are many possible configurations, unlike a linear exam where everyone sees the same set of items
- Increased engagement: Lower-ability students are not discouraged, and higher-ability students are not bored
- Control of content: CAT has some content control algorithms, but they are sometimes not sufficient
- Supports testlets: item-by-item CAT does not easily accommodate tests that have testlets, like a reading passage with 5 questions
- Allows for review: CAT does not usually allow review, where students go back to a previous question and change an answer, while MST does, typically within the current block
Examples of multistage testing
MST is often used in language assessment, and therefore in educational assessment more broadly, such as K-12 benchmark exams, university admissions, and language placement or certification. One of the best-known examples is the SAT (formerly the Scholastic Aptitude Test) from the College Board, which is moving to an MST approach starting in 2023.
Because of the complexity of item response theory, most organizations that implement MST have a full-time psychometrician on staff. If your organization does not, we would love to discuss how we can work together.
Nathan Thompson, PhD, is CEO and Co-Founder of Assessment Systems Corporation (ASC). He is a psychometrician, software developer, author, researcher, and evangelist for AI and automation. His mission is to elevate the profession of psychometrics by using software to automate psychometric work like item review, job analysis, and Angoff studies, so we can focus on more innovative work. His core goal is to improve assessment throughout the world.
Nate was originally trained as a psychometrician, with an honors degree from Luther College with a triple major in Math, Psychology, and Latin, and then a PhD in Psychometrics from the University of Minnesota. He then worked in multiple roles in the testing industry, including item writer, test development manager, essay test marker, consulting psychometrician, software developer, project manager, and business leader. He is also cofounder and Membership Director at the International Association for Computerized Adaptive Testing (iacat.org). He’s published 100+ papers and presentations, but his favorite remains https://scholarworks.umass.edu/pare/vol16/iss1/1/.