Guttman errors are a concept derived from the Guttman scaling approach to evaluating assessments.  They can be used in a number of ways.  Meijer (1994) suggests evaluating Guttman errors as a way to flag aberrant response data, such as data produced by cheating or low motivation.  He quantified this with two indices, G and G*.

What is a Guttman error?

It occurs when an examinee answers an item incorrectly when we expect them to get it correct, or vice versa.  Here, we describe the Goodenough methodology as laid out in Dunn-Rankin, Knezek, Wallace, & Zhang (2004).  Goodenough is a researcher’s name, not a comment on the quality of the algorithm!

In Guttman scaling, we begin by taking the scored response matrix (0s and 1s for dichotomous items) and sorting both the columns and rows.  Rows (persons) are sorted by observed score and columns (items) are sorted by observed difficulty.  The following table is sorted in such a manner, and all the data fit the Guttman model perfectly: all 0s and 1s fall neatly on either side of the diagonal.

 

            Score   Item 1   Item 2   Item 3   Item 4   Item 5
P =                 0.0      0.2      0.4      0.6      0.8
Person 1    1       1        0        0        0        0
Person 2    2       1        1        0        0        0
Person 3    3       1        1        1        0        0
Person 4    4       1        1        1        1        0
Person 5    5       1        1        1        1        1
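
The sorting step can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `guttman_sort` and the shuffled input are hypothetical, and item difficulty is taken here as the proportion of examinees answering correctly, computed from the matrix itself.

```python
def guttman_sort(names, items, matrix):
    """Sort persons by ascending score and items from easiest to hardest."""
    n_items = len(items)
    # Proportion of examinees answering each item correctly (assumed difficulty measure).
    p_correct = [sum(row[j] for row in matrix) / len(matrix) for j in range(n_items)]
    # Easiest items (highest proportion correct) first; ties keep their input order.
    col_order = sorted(range(n_items), key=lambda j: -p_correct[j])
    # Lowest-scoring persons first.
    row_order = sorted(range(len(matrix)), key=lambda i: sum(matrix[i]))
    sorted_names = [names[i] for i in row_order]
    sorted_items = [items[j] for j in col_order]
    sorted_matrix = [[matrix[i][j] for j in col_order] for i in row_order]
    return sorted_names, sorted_items, sorted_matrix


# The data from the table above, with rows and columns deliberately shuffled.
names = ["Person 4", "Person 1", "Person 5", "Person 3", "Person 2"]
items = ["Item 3", "Item 1", "Item 5", "Item 2", "Item 4"]
matrix = [
    [1, 1, 0, 1, 1],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
]

sorted_names, sorted_items, sorted_matrix = guttman_sort(names, items, matrix)
for name, row in zip(sorted_names, sorted_matrix):
    print(name, sum(row), row)
```

After sorting, the rows run from Person 1 to Person 5 and the 1s form the staircase pattern shown in the table.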

 

Now consider the following table.  Ordering remains the same, but Person 3 has data that falls outside of the diagonal.

 

            Score   Item 1   Item 2   Item 3   Item 4   Item 5
P =                 0.0      0.2      0.4      0.6      0.8
Person 1    1       1        0        0        0        0
Person 2    2       1        1        0        0        0
Person 3    3       1        1        0        1        0
Person 4    4       1        1        1        1        0
Person 5    5       1        1        1        1        1

 

Some publications on the topic are unclear as to whether this is one error (two cells are flipped) or two errors (a cell that is 0 should be 1, and a cell that is 1 should be 0).  In fact, one article changes the definition from one to the other while looking at two rows of the same table.  The Dunn-Rankin et al. book is quite clear: you must subtract the examinee response vector from the perfect response vector for that person’s score, and each cell with a difference counts as an error.

 

             Score   Item 1   Item 2   Item 3   Item 4   Item 5
P =                  0.0      0.2      0.4      0.6      0.8
Perfect      3       1        1        1        0        0
Person 3     3       1        1        0        1        0
Difference           0        0        1        -1       0

 

Thus, there are two errors.
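
The subtraction just described is easy to express in code. The sketch below assumes the response vector is already ordered from easiest to hardest item; the function names `perfect_vector` and `guttman_errors` are hypothetical and chosen only for this illustration.

```python
def perfect_vector(score, n_items):
    """Ideal Guttman pattern: the `score` easiest items correct, the rest incorrect."""
    return [1] * score + [0] * (n_items - score)


def guttman_errors(responses):
    """Count cells that differ from the perfect pattern for the observed score.

    `responses` must already be ordered from easiest to hardest item.
    Returns the error count and the cell-by-cell differences (perfect minus observed).
    """
    score = sum(responses)
    ideal = perfect_vector(score, len(responses))
    differences = [p - x for p, x in zip(ideal, responses)]
    return sum(1 for d in differences if d != 0), differences


person_3 = [1, 1, 0, 1, 0]
g, diff = guttman_errors(person_3)
print(g, diff)  # 2 [0, 0, 1, -1, 0]
```

Note that under this counting rule a swapped pair of cells always contributes two errors, one for the unexpected 0 and one for the unexpected 1.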

Usage of Guttman errors in data forensics

Meijer suggested the use of G, raw Guttman error count, and a standardized index he called G*:

G* = G / (r(k - r)).

Here, k is the number of items on the test and r is the person’s score.
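
As a worked example, Person 3 above has G = 2, r = 3, and k = 5, so G* = 2 / (3 × 2) ≈ 0.333. A minimal sketch of the same arithmetic follows. The function name `g_star` is hypothetical, and returning `None` when the denominator is zero (a score of 0 or a perfect score) is a choice made for this sketch rather than something specified in the sources cited here.

```python
def g_star(g, r, k):
    """Standardized index G* = G / (r * (k - r)).

    g: raw Guttman error count
    r: the person's score
    k: number of items on the test
    Returns None when r is 0 or k, because the denominator is then zero
    (an assumption of this sketch).
    """
    denominator = r * (k - r)
    if denominator == 0:
        return None
    return g / denominator


print(round(g_star(2, 3, 5), 3))  # 0.333
print(g_star(0, 5, 5))            # None
```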

How is this relevant to data forensics?  Guttman errors can be indicative of several things:

  1. Preknowledge: A low-ability examinee memorizes answers to the 20 hardest questions on a 100-item test. Of the other 80 items, which they actually have to answer on their own, they get half correct. Correct answers on the hardest items alongside misses on easier ones produce many Guttman errors.
  2. Poor motivation or other non-cheating issues: In a K12 context, a smart kid who is bored might answer the difficult items correctly but get a number of easy items incorrect.
  3. External help: A teacher might be giving answers to some tough items, which would show in the data as a group having a suspiciously high number of errors on average compared to other groups.

 

How can I calculate G and G*?

Because the calculations are simple, it’s feasible to do both in a spreadsheet for small datasets. For a dataset of any reasonable size, however, you will need specially designed software for data forensics, such as SIFT.

What’s the big picture?

Guttman error indices are by no means perfect indicators of dishonest test-taking, but can be helpful in flagging potential issues at both an individual and group level.  That is, you could possibly flag individual students with high numbers of Guttman errors, or if your test is administered in numerous separate locations such as schools or test centers, you can calculate the average number of Guttman errors at each and flag the locations with high averages.
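
The group-level idea can be sketched as follows. Everything here is illustrative: the school names, the error counts, and the threshold of 6 are invented for the example, and in practice the cutoff would need to be chosen for your own testing program.

```python
from collections import defaultdict
from statistics import mean


def mean_errors_by_location(records):
    """Average Guttman error count per location.

    `records` is an iterable of (location, error_count) pairs, one per examinee.
    """
    by_location = defaultdict(list)
    for location, errors in records:
        by_location[location].append(errors)
    return {loc: mean(vals) for loc, vals in by_location.items()}


def flag_locations(records, threshold):
    """Return the locations whose mean error count exceeds `threshold`."""
    means = mean_errors_by_location(records)
    return sorted(loc for loc, m in means.items() if m > threshold)


# Hypothetical data: (location, Guttman errors for one examinee).
records = [
    ("School A", 2), ("School A", 4),
    ("School B", 10), ("School B", 12),
    ("School C", 3),
]
print(flag_locations(records, threshold=6))  # ['School B']
```

The same pattern works at the individual level by comparing each examinee's own count to a threshold instead of a location mean.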

As with all data forensics, though, this flagging process does not necessarily mean there are nefarious goings-on.  Instead, it could simply give you a possible reason to open a deeper investigation.

Nathan Thompson, PhD

Nathan Thompson earned his PhD in Psychometrics from the University of Minnesota, with a focus on computerized adaptive testing. His undergraduate degree was from Luther College with a triple major of Mathematics, Psychology, and Latin. He is primarily interested in the use of AI and software automation to augment and replace the work done by psychometricians, which has provided extensive experience in software design and programming. Dr. Thompson has published over 100 journal articles and conference presentations, but his favorite remains https://scholarworks.umass.edu/pare/vol16/iss1/1/ .