Psychometrics

What is Job Task Analysis for Certification?
Job Task Analysis (JTA) is an essential step in designing a test to be used in the workforce, such as pre-employment or certification/licensure, by analyzing data on what is actually being done in the job.

Four-Fifths Rule: Fair Employment Selection
The Four-Fifths Rule is a term that refers to a guideline for fairness in hiring practices in the USA. Because tests are often used in making hiring decisions, the Four-Fifths Rule applies to them so

Classical Test Theory: Item Statistics
Classical Test Theory (CTT) is a psychometric approach to analyzing, improving, scoring, and validating assessments. It is based on relatively simple concepts, such as averages, proportions, and correlations. One of the most frequently used aspects

Content Validity in Assessment
Content validity is an aspect of validity, a term that psychometricians use to refer to evidence that interpretations of test scores are supported. For example, predictive validity provides evidence that a pre-employment test will predict

Predictive Validity of Test Scores
Predictive Validity is a type of test score validity which evaluates how well a test predicts something in the future, usually with a goal of making more effective decisions about people. For instance, it is

Classical Test Theory vs. Item Response Theory
Classical Test Theory and Item Response Theory (CTT & IRT) are the two primary psychometric paradigms. That is, they are mathematical approaches to how tests are analyzed and scored. They differ quite substantially in substance

All Psychometric Models Are Wrong
The British statistician George Box is credited with the quote, “All models are wrong but some are useful.” As psychometricians, it is important that we never forget this perspective. We cannot be so haughty as

The Graded Response Model – Samejima (1969)
Samejima’s (1969) Graded Response Model (GRM, sometimes SGRM) is an extension of the two parameter logistic model (2PL) within the item response theory (IRT) paradigm. IRT provides a number of benefits over classical test theory,

Coefficient Alpha Reliability Index
Coefficient alpha reliability, sometimes called Cronbach’s alpha, is a statistical index that is used to evaluate the internal consistency or reliability of an assessment. That is, it quantifies how consistent we can expect scores to

Differential Item Functioning (DIF)
Differential item functioning (DIF) is a term in psychometrics for the statistical analysis of assessment data to determine if items are performing in a biased manner against some group of examinees. This analysis is often

“Dichotomous” Vs “Polytomous” in IRT?
What is the difference between the terms dichotomous and polytomous in psychometrics? Well, these terms represent two subcategories within item response theory (IRT) which is the dominant psychometric paradigm for constructing, scoring and analyzing assessments.

How do I develop a test security plan?
A test security plan (TSP) is a document that lays out how an assessment organization address security of its intellectual property, to protect the validity of the exam scores. If a test is compromised, the

Multistage Testing
Multistage testing (MST) is a type of computerized adaptive testing (CAT). This means it is an exam delivered on computers which dynamically personalize it for each examinee or student. Typically, this is done with respect

Maximum Likelihood Estimation
Maximum Likelihood Estimation (MLE) is an approach to estimating parameters for a model. It is one of the core aspects of Item Response Theory (IRT), especially to estimate item parameters (analyze questions) and estimate person

Multidimensional Item Response Theory
Multidimensional item response theory (MIRT) has been developing from its Factor Analytic and unidimensional item response theory (IRT) roots. This development has led to an increased emphasis on precise modeling of item-examinee interaction and a

The IRT Item Pseudo-guessing Parameter
The item pseudo-guessing parameter is one of the three item parameters estimated under item response theory (IRT): discrimination a, difficulty b, and pseudo-guessing c. The parameter that is utilized only in the 3PL model is

The IRT Item Discrimination Parameter
The item discrimination parameter a is an index of item performance within the paradigm of item response theory (IRT). There are three item parameters estimated with IRT: the discrimination a, the difficulty b, and the

Automated Item Generation
Automated item generation (AIG) is a paradigm for developing assessment items (test questions), utilizing principles of artificial intelligence and automation. As the name suggests, it tries to automate some or all of the effort involved