Adaptive Assessment: Benefits and Considerations


Adaptive assessment, more often called computerized adaptive testing or computer-adaptive testing (CAT), is an AI-based technology that has existed since the 1980s.  By personalizing the test to each student, it provides substantial benefits, from shorter exams to increased security.  But most assessments in the world still don’t capitalize on the benefits of adaptive testing.  This post will introduce you to some of the advantages.

 

What is adaptive assessment?

Adaptive assessment is the delivery of a test to an examinee using an algorithm that adapts the difficulty of the test to their ability, as well as adapting the number of items used (no need to waste their time).  It is sort of like the High Jump in Track & Field.  You start the bar in a middling-low position.  If you make it, the bar is raised.  This continues until you fail, and then it is dropped a little.  Or if you fail the first one, it can be dropped until you succeed.  Adaptive assessment takes this idea but builds it around a machine learning paradigm known as item response theory.
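The high-jump loop can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not how any particular platform implements CAT: it assumes the Rasch (one-parameter) IRT model, picks the unused item whose difficulty is closest to the current ability estimate, re-estimates ability with a simple grid-based EAP (standard normal prior), and stops once the standard error drops below a target or a maximum length is reached.  The names run_cat, eap_estimate, se_target, and max_items are hypothetical.

```python
import math

def p_correct(theta, b):
    """Rasch model: probability that ability theta answers difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

GRID = [g / 10.0 for g in range(-40, 41)]  # candidate abilities from -4 to 4

def eap_estimate(responses):
    """Posterior mean and SD of ability, given (difficulty, correct) pairs.
    Assumes a standard normal prior evaluated on GRID."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # unnormalized prior
        for b, correct in responses:
            p = p_correct(theta, b)
            w *= p if correct else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)

def run_cat(bank, answer, se_target=0.4, max_items=30):
    """bank: list of item difficulties; answer: callable(difficulty) -> bool."""
    remaining = list(bank)
    responses = []
    theta, se = 0.0, float("inf")  # start the bar in the middle
    while remaining and len(responses) < max_items and se > se_target:
        # Raise or lower the bar: choose the item nearest the current estimate.
        b = min(remaining, key=lambda d: abs(d - theta))
        remaining.remove(b)
        responses.append((b, answer(b)))
        theta, se = eap_estimate(responses)
    return theta, se, responses

# Hypothetical examinee who answers correctly whenever the item is easier than 1.0.
bank = [i / 10.0 for i in range(-30, 31)]
theta, se, responses = run_cat(bank, answer=lambda b: b < 1.0)
print(f"items used: {len(responses)}, theta: {theta:.2f}, SE: {se:.2f}")
```

Operational CATs typically add content balancing and item exposure controls on top of this basic select, score, re-estimate, and stop cycle.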

For more info, visit this post.


Benefits of adaptive assessment

As you might imagine, by making the test more intelligent, adaptive testing provides a wide range of advantages.  Some of the well-known benefits of adaptive testing, recognized by scholarly psychometric research, are listed below.

Shorter tests

Research has found that adaptive tests produce anywhere from a 50% to 90% reduction in test length.  This is no surprise.  Suppose you have a pool of 100 items.  A top student is practically guaranteed to get the easiest 70 correct; only the hardest 30 will make them think.  The reverse holds for a low-ability student.  Middle-ability students do not need the super-hard or the super-easy items.
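One way to put a number on "practically guaranteed" is item information.  The sketch below assumes the Rasch model (the post does not specify a model), where the information an item provides at a given ability is p(1 - p), with p the probability of a correct answer.  The ability and difficulty values are made up for illustration.

```python
import math

def rasch_information(theta, b):
    """Item information under the Rasch model: p * (1 - p)."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)

top_student = 2.0      # hypothetical high ability
easy_item = -2.0       # hypothetical easy item
matched_item = 2.0     # item whose difficulty matches the student

info_easy = rasch_information(top_student, easy_item)
info_matched = rasch_information(top_student, matched_item)
print(f"easy item: {info_easy:.3f}, matched item: {info_matched:.3f}")
```

Under these assumptions the matched item carries more than ten times the information of the easy one, which is why dropping the easy items costs the top student almost nothing in precision.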

Why does this matter?  Primarily, it can greatly reduce costs.  Suppose you are delivering 100,000 exams per year in testing centers, and you are paying $30/hour for seat time.  If you can cut your exam from 2 hours to 1 hour, you just saved $3,000,000 per year (100,000 exams × 1 hour × $30).  Yes, there will be increased costs from the use of adaptive assessment, but you will likely save money in the end.

For K-12 assessment, you aren’t paying for seat time, but there is the opportunity cost of lost instruction time.  If students are taking formative assessments 3 times per year to check on progress, and you can reduce each by 20 minutes, that is 1 hour per student per year; if there are 500,000 students in your state, then you just saved 500,000 hours of learning.
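The two back-of-the-envelope figures above can be checked with a few lines of Python.  The variable names are illustrative, and the inputs are the hypothetical numbers from the text, not real pricing or enrollment data.

```python
# Testing-center example: seat-time savings
exams_per_year = 100_000
seat_cost_per_hour = 30          # dollars per hour, hypothetical rate from the text
hours_saved_per_exam = 2 - 1     # exam shortened from 2 hours to 1
annual_savings = exams_per_year * seat_cost_per_hour * hours_saved_per_exam

# K-12 example: instruction time recovered
students = 500_000
assessments_per_year = 3
minutes_saved_per_assessment = 20
hours_recovered = students * assessments_per_year * minutes_saved_per_assessment / 60

print(f"Seat-time savings: ${annual_savings:,}")                # $3,000,000
print(f"Instruction hours recovered: {hours_recovered:,.0f}")   # 500,000
```

Swapping in your own volumes and rates gives a first estimate to weigh against the added cost of going adaptive.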

More precise scores

CAT generally makes tests more accurate.  Its algorithms are designed specifically to obtain more accurate scores without wasting examinee time.

More control of score precision (accuracy)

CAT can be designed so that all students are measured with the same accuracy, making the test much fairer.  Traditional tests measure the middle students well but not the top or bottom students.  Is it better that A) students see the same items but can have drastically different accuracy of scores, or B) have equivalent accuracy of scores, but see different items?
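To illustrate why a fixed form measures the middle better than the extremes, the sketch below computes the conditional standard error of measurement as 1 / sqrt(test information).  It assumes the Rasch model and a made-up 40-item form with difficulties spread between -1 and 1; neither comes from the post.

```python
import math

def rasch_p(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def csem(theta, difficulties):
    """Conditional standard error: 1 / sqrt(sum of item information at theta)."""
    info = sum(rasch_p(theta, b) * (1.0 - rasch_p(theta, b)) for b in difficulties)
    return 1.0 / math.sqrt(info)

# Hypothetical fixed form: 40 items with difficulties evenly spaced from -1 to 1.
fixed_form = [-1.0 + 2.0 * i / 39 for i in range(40)]

for theta in (-3.0, 0.0, 3.0):
    print(f"ability {theta:+.1f}: CSEM = {csem(theta, fixed_form):.2f}")
```

On this hypothetical form the error at the extremes is roughly double the error in the middle.  An adaptive test with a precision-based stopping rule keeps administering well-targeted items until every examinee reaches about the same standard error.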

Better test security

Since all students are essentially getting an assessment that is tailored to them, there is better test security than everyone seeing the same 100 items.  Item exposure is greatly reduced; note, however, that this introduces its own challenges, and adaptive assessment algorithms have item exposure considerations of their own.

A better experience for examinees, with reduced fatigue

Adaptive assessments tend to be less frustrating for examinees across the full range of ability.  Moreover, implementing variable-length stopping rules (e.g., once we know you are a top student, we don’t give you the 70 easy items) reduces fatigue.

Increased examinee motivation

Since examinees only see items relevant to them, this provides an appropriate challenge.  Low-ability examinees will feel more comfortable and get many more items correct than with a linear test.  High-ability students will get the difficult items that make them think.


Frequent retesting is possible

The whole “unique form” idea also applies to the same student taking the same exam twice.  Suppose you take the test in September, at the beginning of a school year, and take the same one again in November to check your learning.  You’ve likely learned quite a bit and are higher on the ability range; you’ll get more difficult items, and therefore a new test.  If it were a linear test, you might see the exact same test.

This is a major reason that adaptive assessment plays a formative role in K-12 education, delivered several times per year to millions of students in the US alone.

Individual pacing of tests

Examinees can move at their own speed.  Some might move quickly and be done in only 30 items.  Others might waver, also seeing 30 items but taking more time.  Still others might see 60 items.  The algorithms can be designed to optimize this process.

Disadvantages

Of course, adaptive assessment is not the best fit for every organization.  For starters, it requires an expert psychometrician with experience in item response theory and CAT, and such experts are quite expensive.  You also need a software platform; not many support CAT.  You’ll most likely need a larger item bank, items that can always be scored in real time (no essays), and an appropriate piloting strategy.

Want to learn more?

Interested in learning more about the benefits of adaptive testing?  If you want a full book, I recommend Computerized Adaptive Testing: A Primer by Howard Wainer.  Prefer a short article for now?  Here’s my favorite.

Want to implement the benefits of adaptive assessment?

The first step is to perform simulation studies to evaluate the potential benefits for your organization.  We can help you with those, or recommend software if you prefer to do it on your own.  Ready to develop and deliver your own adaptive tests?  Sign up for a free account on our platform!

 

Nathan Thompson, PhD

Nathan Thompson, PhD, is CEO and Co-Founder of Assessment Systems Corporation (ASC). He is a psychometrician, software developer, author, researcher, and evangelist for AI and automation. His mission is to elevate the profession of psychometrics by using software to automate psychometric work like item review, job analysis, and Angoff studies, so we can focus on more innovative work. His core goal is to improve assessment throughout the world.

Nate was originally trained as a psychometrician, with an honors degree at Luther College with a triple major of Math/Psych/Latin, and then a PhD in Psychometrics at the University of Minnesota. He then worked multiple roles in the testing industry, including item writer, test development manager, essay test marker, consulting psychometrician, software developer, project manager, and business leader. He is also cofounder and Membership Director at the International Association for Computerized Adaptive Testing (iacat.org). He’s published 100+ papers and presentations, but his favorite remains https://scholarworks.umass.edu/pare/vol16/iss1/1/.
