Entries by Nathan Thompson, PhD

Coefficient Alpha Reliability Index

Coefficient alpha reliability, sometimes called Cronbach’s alpha, is a statistical index used to evaluate the internal consistency, or reliability, of an assessment. That is, it quantifies how consistent we can expect scores to be by analyzing the item statistics. A high value indicates that the test is highly reliable, and a low […]
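The standard formula is alpha = k/(k-1) × (1 − Σ item variances / total-score variance). A minimal sketch of that calculation, using a made-up examinees-by-items matrix of dichotomous scores:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Coefficient alpha from an examinees-by-items matrix of item scores."""
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical dichotomous (0/1) responses: 5 examinees x 4 items
scores = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
])
print(round(cronbach_alpha(scores), 3))  # → 0.8
```

With real data you would use the full response matrix; the toy matrix here is only to show the arithmetic.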

Differential Item Functioning (DIF)

Differential item functioning (DIF) is a term in psychometrics for the statistical analysis of assessment data to determine if items are performing in a biased manner against some group of examinees.  This analysis is often complemented by item fit analysis, which ensures that each item aligns appropriately with the theoretical model and functions uniformly across […]
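One widely used DIF statistic is the Mantel-Haenszel common odds ratio, which compares reference-group and focal-group performance on an item within strata of examinees matched on total score. A minimal sketch, with entirely hypothetical 2x2 counts for a single item:

```python
# Hypothetical 2x2 tables for one item, one per total-score stratum:
# (reference correct, reference incorrect, focal correct, focal incorrect)
strata = [
    (30, 10, 25, 15),
    (40, 5, 35, 10),
]

# Mantel-Haenszel common odds ratio pooled across strata; a value near 1.0
# suggests no DIF, while values far from 1.0 flag the item for review.
num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
mh_odds_ratio = num / den
print(round(mh_odds_ratio, 2))  # → 1.99
```

Here the odds ratio near 2 would indicate the reference group finds the item easier than matched focal-group examinees, so the item would be routed to content reviewers.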

“Dichotomous” Vs “Polytomous” in IRT?

What is the difference between the terms dichotomous and polytomous in psychometrics?  Well, these terms represent two subcategories within item response theory (IRT), which is the dominant psychometric paradigm for constructing, scoring, and analyzing assessments.  Virtually all large-scale assessments utilize IRT because of its well-documented advantages.  In many cases, however, it is referred to as […]

How do I develop a test security plan?

A test security plan (TSP) is a document that lays out how an assessment organization addresses the security of its intellectual property, to protect the validity of the exam scores.  If a test is compromised, the scores become meaningless, so security is obviously important.  The test security plan helps an organization anticipate test security issues, establish […]

Multistage Testing

Multistage testing (MST) is a type of computerized adaptive testing (CAT).  This means it is an exam delivered on computers that dynamically personalize it for each examinee or student.  Typically, this is done with respect to the difficulty of the questions, by making the exam easier for lower-ability students and harder for higher-ability students.  Doing […]

Automated Item Generation

Automated item generation (AIG) is a paradigm for developing assessment items (test questions), utilizing principles of artificial intelligence and automation. As the name suggests, it tries to automate some or all of the effort involved with item authoring, as that is one of the most time-intensive aspects of assessment development – which is no news […]

Ebel Method of Standard Setting

The Ebel method of standard setting is a psychometric approach to establish a cutscore for tests consisting of multiple-choice questions. It is usually used for high-stakes examinations in the fields of higher education, medical and health professions, and for selecting applicants. How is the Ebel method performed? The Ebel method requires a panel of judges who […]
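In the Ebel method, the judges sort items into a grid of relevance-by-difficulty cells and agree on an expected percent-correct for a minimally competent candidate in each cell; the cutscore is then the item-weighted average of those values. A minimal sketch, assuming a hypothetical two-by-two grid and made-up panel judgments:

```python
# Hypothetical panel judgments: for each (relevance, difficulty) cell, the
# agreed expected percent-correct for a minimally competent candidate, and
# the number of items the judges placed in that cell.
expected_correct = {
    ("essential", "easy"): 0.90,
    ("essential", "hard"): 0.70,
    ("important", "easy"): 0.80,
    ("important", "hard"): 0.60,
}
item_counts = {
    ("essential", "easy"): 10,
    ("essential", "hard"): 5,
    ("important", "easy"): 20,
    ("important", "hard"): 15,
}

# Cutscore = item-weighted average of the expected percent-correct values.
total_items = sum(item_counts.values())
cutscore = sum(expected_correct[cell] * n
               for cell, n in item_counts.items()) / total_items
print(f"{cutscore:.0%}")  # → 75%
```

Real applications typically use more cells (e.g., three relevance levels by three difficulty levels), but the aggregation is the same weighted average.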

Distractor Analysis for Test Items

Distractor analysis refers to the process of evaluating the performance of the incorrect answers versus the correct answer for multiple-choice items on a test.  It is a key step in the psychometric analysis process to evaluate item and test performance as part of documenting test reliability and validity. What is a distractor? An item distractor, […]
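At its simplest, distractor analysis tabulates the proportion of examinees choosing each option. A minimal sketch, using made-up responses to a four-option item keyed "A":

```python
from collections import Counter

# Hypothetical responses from 10 examinees to a four-option item keyed "A"
responses = ["A", "B", "A", "C", "A", "D", "A", "B", "A", "A"]
key = "A"

counts = Counter(responses)
n = len(responses)
for option in "ABCD":
    role = "key" if option == key else "distractor"
    print(f"{option} ({role}): {counts[option] / n:.0%}")
```

A distractor that almost no one selects adds nothing to the item, and one that attracts high-scoring examinees more than the key suggests a flawed or miskeyed item; fuller analyses also compute option-level point-biserial correlations.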

What is multi-modal test delivery?

Multi-modal test delivery refers to an exam that is capable of being delivered in several different ways, or to an online testing software platform designed to support this process. For example, you might provide the option for a certification exam to be taken on computer at third-party testing centers or via paper at the annual […]

Confidence Interval for Test Scores

A confidence interval for test scores is a common way to interpret the results of a test by phrasing it as a range rather than a single number.  We all understand that tests provide imperfect measurements at a specific point in time, and actual performance can vary over different occasions.  The examinee might be sick […]
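A common construction bands the observed score with the standard error of measurement, SEM = SD × √(1 − reliability). A minimal sketch, with hypothetical test statistics:

```python
def score_confidence_interval(observed, sd, reliability, z=1.96):
    """Band an observed score with +/- z standard errors of measurement."""
    sem = sd * (1 - reliability) ** 0.5   # standard error of measurement
    return observed - z * sem, observed + z * sem

# Hypothetical test: score SD = 10, reliability = 0.91, observed score = 70
lo, hi = score_confidence_interval(70, 10, 0.91)
print(round(lo, 2), round(hi, 2))  # → 64.12 75.88
```

So rather than reporting a bare 70, we can say we are roughly 95% confident the examinee's true score lies between about 64 and 76; note how higher reliability shrinks the SEM and tightens the interval.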