
Some time ago, I received this question about interpreting cutscores under item response theory (IRT):

In my examination system, we are currently labeling ‘FAIL’ for student’s mark with below 50% and ‘PASS’ for 50% and above.  I found that this amazing Xcalibre software can classify students’ achievement in 2 groups based on scores.  But, when I tried to run IRT EPC with my data (with cut point of 0.5 selected), it shows that students with 24/40 correct items were classified as ‘FAIL’. Because in CTT, 24/40 correctly answered items is equal to 60% (Pass).  I can’t find its interpretation in Guyer & Thompson (2013) User’s Manual for Xcalibre.  How exactly should I set my cut point to perform 2-group classification using IRT EPC in Xcalibre to make it about equal to 50% achievement in CTT?

In this context, EPC refers to expected percent (or proportion) correct.  IRT uses the test response function (TRF) to convert a theta score into the percentage of items in the pool that a student is expected to answer correctly.  So this Xcalibre user is asking how to set a cutscore on theta that meets their needs.
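Under a three-parameter logistic (3PL) model, the TRF is simply the average of the item response functions across the pool.  A minimal sketch in Python (the item parameters below are randomly generated and hypothetical, not from any real exam):

```python
import numpy as np

def p_correct(theta, a, b, c):
    """3PL probability of answering one item correctly."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

def trf(theta, items):
    """Test response function: expected proportion correct at theta."""
    return float(np.mean([p_correct(theta, a, b, c) for a, b, c in items]))

# Hypothetical 40-item pool with (a, b, c) parameters per item
rng = np.random.default_rng(42)
items = [(a, b, 0.2) for a, b in zip(rng.uniform(0.8, 2.0, 40),
                                     rng.normal(0.0, 1.0, 40))]

print(round(trf(0.0, items), 3))  # expected proportion correct at theta = 0
```

Because each item's probability rises with theta, the TRF is monotonically increasing, which is what makes the theta-to-percent conversion (and its reverse) well defined.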

Setting IRT cutscores

The short answer, in this case, is to evaluate the TRF and reverse-calculate the theta for the cutscore.  That is, find your desired cutscore on the y-axis and read off the corresponding value of theta.  In the example below, a percent cutscore of 54 corresponds to a theta of approximately -0.13.  In the case above, theta=0.5 likely corresponded to a percent-correct score of 60%-70%, so an observed score of 24/40 would indeed fail.
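That reverse lookup can be automated: since the TRF increases monotonically in theta, a simple bisection search finds the theta whose expected proportion correct equals the target.  A sketch, again using hypothetical 3PL item parameters:

```python
import numpy as np

def trf(theta, items):
    """Expected proportion correct at theta under a 3PL model."""
    return float(np.mean([c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))
                          for a, b, c in items]))

def theta_for_cutscore(target, items, lo=-4.0, hi=4.0, tol=1e-6):
    """Bisection: the TRF is monotone in theta, so narrow the interval
    until the expected proportion correct matches the target cutscore."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if trf(mid, items) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical 40-item pool with (a, b, c) parameters per item
rng = np.random.default_rng(42)
items = [(a, b, 0.2) for a, b in zip(rng.uniform(0.8, 2.0, 40),
                                     rng.normal(0.0, 1.0, 40))]

theta_cut = theta_for_cutscore(0.54, items)  # theta matching a 54% cutscore
print(round(theta_cut, 2))
```

The resulting theta_cut is then the value you would enter as the IRT cut point, rather than the percent-metric number itself.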

[Figure: test response function]

Of course, it is indefensible to set a cutscore at an arbitrary round number.  To be defensible, the cutscore must be set with an accepted methodology such as Angoff, modified-Angoff, Nedelsky, Bookmark, or Contrasting Groups.

A nice example is the modified-Angoff method, which is used extremely often in certification and licensure settings.  More information on this method is available here.  Its result is typically a specific cutscore on either the raw or percent metric.  The TRF can be presented in both of those metrics, so the conversion to theta is easily calculated.
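The arithmetic behind a modified-Angoff cutscore is straightforward: each panelist rates the probability that a minimally competent candidate answers each item correctly, the ratings are averaged per item, and the sum of those averages is the raw cutscore.  A toy sketch with made-up ratings for a 4-item exam:

```python
import numpy as np

# Hypothetical ratings[p][i]: panelist p's judged probability that a
# minimally competent candidate answers item i correctly
ratings = np.array([
    [0.60, 0.70, 0.50, 0.80],
    [0.50, 0.80, 0.40, 0.90],
    [0.70, 0.60, 0.50, 0.70],
])

item_means = ratings.mean(axis=0)     # mean rating per item across panelists
raw_cut = item_means.sum()            # cutscore on the raw-score metric
pct_cut = raw_cut / ratings.shape[1]  # cutscore on the percent metric
print(round(raw_cut, 2), round(pct_cut, 2))  # prints: 2.57 0.64
```

Either raw_cut or pct_cut can then be located on the TRF's y-axis to obtain the matching theta.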

Alternatively, some standard-setting methods can work directly on the IRT theta scale, including the Bookmark and Contrasting Groups approaches.

Interested in applying IRT to improve your assessments?  Download a free trial copy of Xcalibre here.  If you want to deliver online tests that are scored directly with IRT, in real time (including computerized adaptive testing), check out FastTest.

Want to improve the quality of your assessments?

Sign up for our newsletter and hear about our free tools, product updates, and blog posts first! Don’t worry, we would never sell your email address, and we promise not to spam you with too many emails.


The Contrasting Groups Method is a common approach to setting a cutscore.  It is very easy to do, but has the important drawback that some sort of “gold standard” is needed to assign examinees into categories such as Pass and Fail.  This “gold standard” should be unrelated to the test itself.

For example, suppose you wanted to set a cutscore on a practice test that is helping examinees determine if they are ready for a high-stakes certification test.  You might have past data for examinees who took both your practice exam and the actual certification test.  Their results from the certification test can be used to assign them to groups of Pass or Fail, and then you can evaluate the practice test score distributions for each group.  These distributions are typically smoothed, and their intersection represents an appropriate cutscore for the practice test.  In the example below, the two curves intersect near a score of 85, suggesting that this is an appropriate cutscore for the practice test that will closely predict the results of the official certification test.
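The intersection step can be sketched numerically.  A minimal example, assuming the two groups' practice-test score distributions are smoothed with normal approximations (the means and standard deviations below are made up for illustration, not the figure's data):

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution, used here to smooth each group."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Hypothetical practice-test score summaries for the two groups
mu_fail, sd_fail = 72.0, 8.0   # examinees who failed the certification test
mu_pass, sd_pass = 93.0, 6.0   # examinees who passed it

# Scan between the two means for the point where the smoothed
# distributions cross: fail density dominates below, pass density above
xs = np.linspace(mu_fail, mu_pass, 10001)
diff = normal_pdf(xs, mu_fail, sd_fail) - normal_pdf(xs, mu_pass, sd_pass)
cut = xs[np.argmax(diff < 0)]  # first score where pass density exceeds fail
print(round(cut, 1))
```

In practice you may prefer a kernel density estimate over a normal fit for the smoothing, but the intersection logic is the same.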

I developed a simple tool in MS Excel that allows our psychometricians to easily produce both the smoothed and unsmoothed versions of this method, given nothing more than a list of practice test scores and “real” test classification for examinees.  If you think this method might be appropriate for your exams, please contact us at sales@54.89.150.95 and one of our consultants will get in touch with you.
