Cutscores set with classical test theory methods, such as the modified-Angoff, Nedelsky, or Ebel methods, are easy to implement when the test is scored classically. The Angoff cutscore approach is legally defensible and meets international standards such as AERA/APA/NCME, ISO 17024, and NCCA. It also has the benefit that it does not require the test to be administered to a sample of candidates first; methods like Contrasting Groups, Borderline Group, and Bookmark do require that.
But if your test is scored with the item response theory (IRT) paradigm, you need to convert your cutscores onto the theta scale. The easiest way to do that is to reverse-calculate the test response function (TRF) from IRT. This post explains how.
The Test Response Function
The TRF (sometimes called a test characteristic curve) is an important method of characterizing test performance in the IRT paradigm. The TRF predicts a classical score from an IRT score. Like the item response function and test information function, it uses the theta scale as the X-axis. The Y-axis can be either the number-correct metric or the proportion-correct metric.
In this example, you can see that a theta of -0.4 translates to an estimated number-correct score of approximately 7. Note that the number-correct metric only makes sense for linear or LOFT (linear on-the-fly testing) exams, where every examinee receives the same number of items. In the case of CAT (computerized adaptive testing) exams, only the proportion-correct metric makes sense.
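To make the idea concrete, here is a minimal sketch of a TRF computed as the sum of item response functions. It assumes the three-parameter logistic (3PL) model with the common scaling constant D = 1.7; the post does not say which IRT model is in use, and the item parameters in `ITEMS` are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical 3PL item parameters (a, b, c) for a 10-item form.
# These are illustrative assumptions, not values from this post.
ITEMS = [
    (1.0, -2.0, 0.2), (1.2, -1.5, 0.2), (0.8, -1.0, 0.2), (1.1, -0.5, 0.2),
    (1.4, 0.0, 0.2), (0.9, 0.5, 0.2), (1.3, 1.0, 0.2), (1.0, 1.5, 0.2),
    (1.2, 2.0, 0.2), (0.9, 0.2, 0.2),
]


def irf_3pl(theta, a, b, c, D=1.7):
    """Probability of a correct response under the 3PL model (assumed here)."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def trf(theta, items, proportion=False):
    """Expected classical score at theta: the sum of the item probabilities.

    With proportion=True the result is divided by the number of items,
    giving the proportion-correct metric instead of number-correct.
    """
    expected = sum(irf_3pl(theta, a, b, c) for a, b, c in items)
    return expected / len(items) if proportion else expected


if __name__ == "__main__":
    for theta in (-2.0, -0.4, 0.0, 0.7, 2.0):
        print(f"theta={theta:+.1f}  number-correct={trf(theta, ITEMS):.2f}  "
              f"proportion={trf(theta, ITEMS, proportion=True):.3f}")
```

Because these parameters are made up, the printed scores will not match the values quoted in this post; the point is only that each theta maps to one expected classical score.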
Classical cutscore to IRT
So how does this help us with the conversion of a classical cutscore? Well, the TRF gives us a way of translating any number-correct or proportion-correct score onto the theta scale. So any classical cutscore can be reverse-calculated to a theta value. If your Angoff study (or Beuk) recommends a cutscore of 7 out of 10 points, you can convert that to a theta cutscore of -0.4 as above. If the recommended cutscore was 8, the theta cutscore would be approximately 0.7.
This works because IRT scores examinees on the same scale with any set of items, as long as those items have been part of a linking/equating study. Therefore, a cutscore from a single study on one set of items can be carried over to any other linear test form, LOFT pool, or CAT pool on that scale. This makes it possible to apply the classically-focused Angoff method to IRT-focused programs.
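The reverse calculation can be sketched numerically. The TRF has no closed-form inverse in general, so the sketch below uses bisection, which works because the TRF increases with theta whenever the discrimination parameters are positive. The 3PL model, the D = 1.7 constant, the search range of -6 to 6, and the item parameters are all assumptions for illustration; `theta_cutscore` is a hypothetical helper name, not part of any library.

```python
import math

# Hypothetical 3PL item parameters (a, b, c); illustrative only.
ITEMS = [
    (1.0, -2.0, 0.2), (1.2, -1.5, 0.2), (0.8, -1.0, 0.2), (1.1, -0.5, 0.2),
    (1.4, 0.0, 0.2), (0.9, 0.5, 0.2), (1.3, 1.0, 0.2), (1.0, 1.5, 0.2),
    (1.2, 2.0, 0.2), (0.9, 0.2, 0.2),
]


def expected_score(theta, items, D=1.7):
    """TRF in the number-correct metric under an assumed 3PL model."""
    return sum(c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))
               for a, b, c in items)


def theta_cutscore(raw_cut, items, lo=-6.0, hi=6.0, tol=1e-6):
    """Find the theta whose expected number-correct score equals raw_cut.

    Uses bisection on [lo, hi]. Raises ValueError if raw_cut cannot be
    reached in that range, for example a cut below the guessing floor.
    """
    if not expected_score(lo, items) <= raw_cut <= expected_score(hi, items):
        raise ValueError("raw cutscore is outside the range of the TRF")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_score(mid, items) < raw_cut:
            lo = mid  # the target theta lies above mid
        else:
            hi = mid  # the target theta lies at or below mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    for cut in (7, 8):
        print(f"raw cutscore {cut} -> theta {theta_cutscore(cut, ITEMS):.3f}")
```

With real calibrated parameters in place of `ITEMS`, the same call would return the theta cutscore for an Angoff-recommended raw score. A proportion-correct cutscore can be handled by multiplying it by the number of items first.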
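One way to picture carrying a cutscore to another form: once the theta cutscore is known, evaluating a second form's TRF at that theta gives the equivalent raw cutscore on that form. The sketch below assumes both forms are calibrated on the same linked scale and uses the 3PL model with D = 1.7; `FORM_B` and `equivalent_raw_cut` are hypothetical names with made-up parameters.

```python
import math

# Hypothetical 3PL parameters (a, b, c) for a second form that is assumed
# to be linked to the same theta scale as the form used in the Angoff study.
FORM_B = [
    (1.1, -1.8, 0.2), (0.9, -1.2, 0.2), (1.3, -0.7, 0.2), (1.0, -0.2, 0.2),
    (1.2, 0.3, 0.2), (0.8, 0.6, 0.2), (1.4, 1.1, 0.2), (1.0, 1.4, 0.2),
    (1.1, 1.9, 0.2), (0.9, 2.2, 0.2),
]


def expected_score(theta, items, D=1.7):
    """TRF in the number-correct metric under an assumed 3PL model."""
    return sum(c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))
               for a, b, c in items)


def equivalent_raw_cut(theta_cut, items):
    """Expected number-correct score on `items` at the theta cutscore."""
    return expected_score(theta_cut, items)


if __name__ == "__main__":
    theta_cut = -0.4  # the theta cutscore used as the example in this post
    raw = equivalent_raw_cut(theta_cut, FORM_B)
    print(f"theta cut {theta_cut} -> raw cut of about {raw:.2f} on Form B")
```

The result is usually not a whole number, and how to round it to a reportable raw cutscore is a policy decision for the testing program rather than something this sketch settles.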
Nathan Thompson, PhD