Wesolowsky (2000) Zjk collusion detection index

Wesolowsky’s (2000) index is a collusion detection index, designed to flag possible exam cheating by finding similar response vectors among examinees. It is in the same family as g2 and Wollack’s ω. Like those, it produces a standardized statistic by taking the difference between the observed and expected number of common responses and dividing by a standard error. It is closer to the g2 index in that it is based on classical test theory rather than item response theory (IRT). This makes it conceptually simpler and more feasible for small samples (IRT calibration is commonly said to need a minimum of roughly 100 to 1,000 examinees, depending on the model). The trade-off is that it lacks the conceptual, theoretical, and mathematical appropriateness of IRT, which is the dominant psychometric paradigm for large-scale tests for good reason.

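Wesolowsky defined his collusion detection index as

$$Z_{jk} = \frac{C_{jk} - \hat{\mu}_{jk}}{\hat{\sigma}_{jk}}$$

where

$$\hat{\mu}_{jk} = \sum_{i=1}^{n} \hat{p}_i$$

$$\hat{\sigma}_{jk}^{2} = \sum_{i=1}^{n} \hat{p}_i \left(1 - \hat{p}_i\right)$$

$$\hat{p}_i = \hat{p}_{ij}\,\hat{p}_{ik} + \left(1 - \hat{p}_{ij}\right)\left(1 - \hat{p}_{ik}\right) \sum_{t} \hat{w}_{it}^{2}$$

with $C_{jk}$ the observed number of common responses for examinees j and k across the n items, $\hat{p}_{ij}$ the probability that examinee j answers item i correctly, and $\hat{w}_{it}$ the probability that an incorrect response to item i falls on distractor t.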

Here, the probability of a common response on item i is the joint probability that examinees j and k both answer it correctly, plus the probability that both answer it incorrectly and select the same distractor t. This single per-item probability is then summed across items to give the expected number of common responses. For the variance, each item's probability is multiplied by one minus itself, which is the binomial variance, and these terms are summed across items.
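The arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SIFT implementation: the function name `zjk_index` and its arguments are hypothetical, the per-item probabilities of a correct response are taken as given, and `same_distractor` is an assumed input holding, for each item, the probability that two incorrect responses fall on the same distractor.

```python
import math


def zjk_index(resp_j, resp_k, p_j, p_k, same_distractor):
    """Standardized count of common responses for examinees j and k.

    resp_j, resp_k  : selected option per item (e.g. "A", "B", ...)
    p_j, p_k        : per-item probabilities of a correct response
    same_distractor : per-item probability that two incorrect responses
                      land on the same distractor (assumed input)
    """
    # Observed number of items on which the two examinees match.
    observed = sum(a == b for a, b in zip(resp_j, resp_k))

    # Per-item match probability: both correct, or both incorrect
    # with the same distractor selected.
    p_match = [
        pj * pk + (1 - pj) * (1 - pk) * s
        for pj, pk, s in zip(p_j, p_k, same_distractor)
    ]

    expected = sum(p_match)                        # sum of item probabilities
    variance = sum(p * (1 - p) for p in p_match)   # sum of binomial variances
    return (observed - expected) / math.sqrt(variance)


# Toy example: four items, identical response vectors.
z = zjk_index(
    resp_j=["A", "C", "B", "D"],
    resp_k=["A", "C", "B", "D"],
    p_j=[0.5, 0.5, 0.5, 0.5],
    p_k=[0.5, 0.5, 0.5, 0.5],
    same_distractor=[0.5, 0.5, 0.5, 0.5],
)
print(round(z, 3))
```

With only four items even a perfect match gives a modest value, which is one reason the index is more informative on longer tests.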

The major difference between this and g2 is that g2 estimates the probability with a piecewise linear function that only roughly approximates an item response function from IRT. Wesolowsky instead used a family of smooth curves he called “iso-contours.” Being curvilinear, these are a better approximation, but they are still not on par with the item response function in terms of conceptual appropriateness. The iso-contours are described by a parameter Wesolowsky referred to as a (completely unrelated to the IRT discrimination parameter), which must be estimated numerically by bisection.
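To make the bisection step concrete, here is a minimal sketch. The text above does not give the functional form of the iso-contours, so both the curve used here, p = (1 - (1 - r)^a)^(1/a) with r the proportion of all examinees answering the item correctly, and the fitting criterion (the examinee's mean predicted probability should equal their observed proportion correct) are assumptions that should be checked against the original paper. The names `iso_contour` and `estimate_a` are hypothetical.

```python
def iso_contour(r, a):
    """Assumed iso-contour: probability of a correct response on an item
    that a proportion r of all examinees answered correctly."""
    return (1.0 - (1.0 - r) ** a) ** (1.0 / a)


def estimate_a(item_p_values, proportion_correct, lo=0.01, hi=100.0, iters=100):
    """Bisection for a, so that the mean iso-contour probability matches
    the examinee's observed proportion correct (assumed criterion)."""

    def mean_p(a):
        return sum(iso_contour(r, a) for r in item_p_values) / len(item_p_values)

    for _ in range(iters):
        mid = (lo + hi) / 2.0
        # mean_p increases with a, so shrink the bracket toward the target.
        if mean_p(mid) < proportion_correct:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# An examinee who scores exactly at the group average gets a close to 1,
# because the assumed curve reduces to p = r when a = 1.
a_hat = estimate_a([0.2, 0.5, 0.8], proportion_correct=0.5)
print(round(a_hat, 4))
```

Under this assumed form, stronger examinees get values of a above 1 and weaker examinees get values below 1, with the curve bending accordingly.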

How to interpret? This index is standardized onto a z-metric, so any cutoff can be converted to a probability under the standard normal distribution, and vice versa. A standardized value of 3.09 is the default for g2, ω, and Zjk because it corresponds to a one-tailed probability of about 0.001. A value beyond 3.09 therefore represents an event that is expected to be very rare under the assumption of no collusion.
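The conversion between a z cutoff and a probability can be checked with Python's standard library. This is a small sketch that assumes a one-tailed (upper-tail) test, since only unusually high similarity is of interest; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Upper-tail probability of observing a z value of 3.09 or more.
p_upper = 1.0 - std_normal.cdf(3.09)

# Going the other way: the cutoff for a chosen upper-tail probability.
cutoff = std_normal.inv_cdf(1.0 - 0.001)

print(f"P(Z >= 3.09) = {p_upper:.5f}")
print(f"cutoff for 0.001 = {cutoff:.3f}")
```

A stricter or more lenient flagging rule can be set the same way by changing the target probability passed to `inv_cdf`.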

Want to calculate this index? Download the free program SIFT.

Nathan Thompson, PhD
