What is psychometrics?
Psychometrics is the science of assessment, that is, the testing of psychoeducational variables. It is often confused with psychological assessment, but it is actually far broader. Psychometrics studies the assessment process itself (what makes a good test?) regardless of what the test is about. As such, it covers many other areas of testing, from K-12 math exams to accountancy certification, assessment of basic job skills, university admissions, and much more.
Psychometrics is an essential aspect of assessment, but to most people it remains a black box. However, a basic understanding is important for anyone working in the testing industry, especially those developing or selling tests.
Psychometrics is centered on the concept of validity, which is the documentation that the interpretations you are making from test scores are actually supported. A great deal of work goes into making high-quality exams.
Psychometrics serves an extremely important purpose in society. We use tests every day to make decisions about people, from hiring a person to helping a 5th grader learn math to providing career guidance. By using principles of engineering and science to improve these assessments, we are making those decisions more accurate, which can have far-reaching effects.
How can psychometrics help your organization?
Why is psychometrics so important, and how will it benefit your organization? There are two primary ways to implement better psychometrics in your organization: process improvement (typically implemented by psychometricians), and specially-designed software.
This article will outline some of the ways that your tests can be improved, but first, let me outline some of the things that psychometrics can do for you.
Define what should be covered by the test
Before writing any items, you need to define very specifically what will be on the test. Psychometricians typically run a job analysis study to form a quantitative, scientific basis for the test blueprints. A job analysis is necessary for a certification program to get accredited.
Improve development of test content
There is a corpus of scientific literature on how to develop test items that accurately measure whatever you are trying to measure. This is not limited to multiple-choice items, although that approach remains popular. Psychometricians leverage their knowledge of best practices to guide the item authoring and review process so that the result is highly defensible test content. Professional item banking software provides the most efficient way to develop high-quality content and publish multiple test forms, as well as store important historical information like item statistics.
Set defensible cutscores
Test scores are often used to classify candidates into groups, such as pass/fail (Certification/Licensure), hire/non-hire (Pre-Employment), and below-basic/basic/proficient/advanced (Education). Psychometricians lead studies to determine the cutscores, using methodologies such as Angoff, Beuk, Contrasting-Groups, and Borderline.
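To make the arithmetic behind one of these methods concrete, here is a minimal sketch of a basic Angoff calculation. It assumes each rater estimates, for every item, the probability that a minimally competent candidate answers correctly; the cutscore is then the sum of the average ratings. The ratings and the function name angoff_cutscore are made up for illustration, and real studies add steps such as rater training, discussion rounds, and review of rater agreement.

```python
from statistics import mean

def angoff_cutscore(ratings):
    """Return the raw cutscore from a rater-by-item matrix of ratings.

    ratings[r][i] is rater r's estimate of the probability that a
    minimally competent candidate answers item i correctly.
    """
    n_items = len(ratings[0])
    if any(len(rater) != n_items for rater in ratings):
        raise ValueError("every rater must rate every item")
    # Average across raters for each item, then sum across items.
    item_means = [mean(rater[i] for rater in ratings) for i in range(n_items)]
    return sum(item_means)

# Hypothetical data: 3 raters, 4 items.
ratings = [
    [0.60, 0.80, 0.50, 0.90],
    [0.70, 0.70, 0.40, 0.90],
    [0.50, 0.90, 0.60, 0.90],
]
cut = angoff_cutscore(ratings)
print(f"Raw cutscore: {cut:.2f} out of {len(ratings[0])} points")
```

The result is on the raw-score scale, so a candidate would need roughly that many items correct to pass.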
Statistically analyze results to improve the quality of items and scores
Psychometricians are essential for this step, as the statistical analyses can be quite complex. Smaller testing organizations typically utilize classical test theory, which is based on simple mathematics like proportions and correlations. Large, high-profile organizations typically use item response theory, which is based on a type of nonlinear regression analysis. Psychometricians evaluate overall reliability of the test, difficulty and discrimination of each item, distractor analysis, possible bias, multidimensionality, linking multiple test forms/years, and much more. Software such as Iteman and Xcalibre is also available for organizations with enough expertise to run statistical analyses internally.
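As a rough illustration of the classical side of this analysis, the sketch below computes KR-20, one common reliability index for items scored right/wrong. The response matrix is invented for illustration and the function name kr20 is my own; it is not meant to reproduce the output of any particular software package.

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 reliability for dichotomous (0/1) item scores.

    responses[e][i] is 1 if examinee e answered item i correctly, else 0.
    Population variances are used throughout so that the item variance
    p * (1 - p) and the total-score variance are on the same footing.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    total_var = pvariance(totals)
    if n_items < 2 or total_var == 0:
        raise ValueError("need at least 2 items and some score variance")
    sum_pq = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in responses) / n_people  # item difficulty
        sum_pq += p * (1 - p)                            # item variance
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_var)

# Hypothetical data: 6 examinees, 4 items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 0],
]
print(f"KR-20 = {kr20(responses):.3f}")
```

Values closer to 1 indicate more consistent measurement; a real analysis would use far more examinees and items than this toy example.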
Establish and document validity
Validity is the evidence provided to support score interpretations. For example, we might interpret scores on a test to reflect knowledge of English, and we need to provide documentation and research supporting this. There are several ways to provide this evidence. A straightforward approach is to establish content-related evidence, which includes the test definition, blueprints, and item authoring/review. In some situations, criterion-related evidence is important, which directly correlates test scores to another variable of interest. Delivering tests in a secure manner is also essential for validity.
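For criterion-related evidence, the core calculation is often a simple correlation between test scores and the criterion. Here is a minimal sketch using a Pearson correlation; the scores, the supervisor ratings, and the function name pearson_r are hypothetical, and a real validity study would also consider sample size, range restriction, and the reliability of the criterion.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    return sxy / sqrt(sxx * syy)

# Hypothetical data: test scores and later job performance ratings.
test_scores = [52, 61, 70, 75, 83, 90]
job_ratings = [2.5, 3.0, 3.2, 3.9, 4.1, 4.6]
print(f"Validity coefficient r = {pearson_r(test_scores, job_ratings):.2f}")
```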
Is there a lot of math in psychometrics?
Absolutely. A large portion of the work involves the statistical analysis of exam data, as mentioned above. Classical test theory uses basic math like proportions, averages, and correlations. An example of this is below, where we are analyzing a test question to determine if it is good. Here, we see that the majority of the examinees get the question correct (65%) and that it has a strongly positive point-biserial (the correlation between answering this item correctly and the total score), which is a good result given the low sample size in this case.
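To show how simple the classical statistics are, here is a small sketch that computes an item's proportion correct (the P value) and its point-biserial correlation with the total score. The data and the function name item_stats are invented for illustration and do not correspond to the example above. Note that this is the uncorrected version, in which the item itself is included in the total score.

```python
from math import sqrt
from statistics import mean, pstdev

def item_stats(item, totals):
    """Return (p, r_pb) for one dichotomous item.

    item[e] is 1 if examinee e got the item right, else 0;
    totals[e] is that examinee's total test score.
    """
    p = mean(item)  # proportion of examinees answering correctly
    right = [t for u, t in zip(item, totals) if u == 1]
    wrong = [t for u, t in zip(item, totals) if u == 0]
    sd = pstdev(totals)
    if not right or not wrong or sd == 0:
        raise ValueError("need both right and wrong answers and score variance")
    # Point-biserial: standardized gap between the mean total score of
    # those who got the item right and those who got it wrong.
    r_pb = (mean(right) - mean(wrong)) / sd * sqrt(p * (1 - p))
    return p, r_pb

# Hypothetical data: 10 examinees.
item = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
totals = [18, 16, 15, 9, 14, 8, 17, 12, 13, 10]
p, r_pb = item_stats(item, totals)
print(f"P = {p:.2f}, point-biserial = {r_pb:.2f}")
```

A positive point-biserial means that examinees who did well on the test overall tended to get this item right, which is what you want from a good item.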
Item response theory analyzes many of the same things, but with far more complex mathematics by fitting nonlinear models. However, doing so provides a number of advantages. It is much easier to equate across forms or years, build adaptive tests, and construct forms.
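To give a sense of what such a nonlinear model looks like, here is a sketch of the three-parameter logistic (3PL) item response function, one widely used IRT model. The parameter values are made up, the function name prob_correct is my own, and the optional 1.7 scaling constant that some formulations include is left out for simplicity.

```python
from math import exp

def prob_correct(theta, a, b, c=0.0):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability
    a: item discrimination, b: item difficulty, c: pseudo-guessing
    """
    return c + (1.0 - c) / (1.0 + exp(-a * (theta - b)))

# Hypothetical item: moderately discriminating, average difficulty,
# with a 20% chance of guessing correctly.
for theta in (-2.0, 0.0, 2.0):
    print(f"theta = {theta:+.1f} -> P = {prob_correct(theta, 1.0, 0.0, 0.2):.2f}")
```

Setting c to 0 gives the two-parameter model, and additionally fixing a to the same value for every item gives a one-parameter (Rasch-type) model.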
Here’s an article that compares CTT to IRT, if you are interested in learning more.
Where is psychometrics used?
In certification testing, psychometricians develop the test via a documented chain of evidence following a sequence of research outlined by accreditation bodies, typically: job analysis, test blueprints, item writing and review, cutscore study, and statistical analysis. Web-based item banking software like FastTest is typically useful because the exam committee often consists of experts located across the country or even throughout the world; they can then easily log in from anywhere and collaborate.
In pre-employment testing, validity evidence relies primarily on establishing appropriate content (for example, a test on PHP programming for a PHP programming job) and on the correlation of test scores with an important criterion like job performance ratings (which shows that the test predicts good job performance). Adaptive tests are becoming much more common in pre-employment testing because they provide several benefits, the most important of which is cutting test time by 50%, a big deal for large corporations that test a million applicants each year. Adaptive testing is based on item response theory, and requires a specialized psychometrician as well as specially designed software like FastTest.
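The reason adaptive tests can be shorter is that each item is chosen to be as informative as possible for the current estimate of the examinee's ability. Here is a minimal sketch of that selection step under a two-parameter logistic model, using the common maximum-information rule. The item bank, the parameter values, and the function names are hypothetical; this is not a description of how FastTest or any other platform works, and a full adaptive test also needs ability estimation, a stopping rule, and usually exposure and content controls.

```python
from math import exp

def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))

def information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def next_item(theta, bank, administered):
    """Pick the unused item with the most information at theta.

    bank maps item id -> (a, b); administered is a set of used ids.
    """
    candidates = [k for k in bank if k not in administered]
    if not candidates:
        raise ValueError("no items left in the bank")
    return max(candidates, key=lambda k: information(theta, *bank[k]))

# Hypothetical item bank: id -> (discrimination a, difficulty b).
bank = {
    "item_1": (1.0, -1.0),
    "item_2": (1.2, 0.0),
    "item_3": (0.8, 1.5),
    "item_4": (1.5, 0.2),
}
first = next_item(0.0, bank, set())
second = next_item(0.0, bank, {first})
print(first, second)
```

In a real adaptive test, theta would be re-estimated after every response, so the second pick would depend on how the examinee answered the first item.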
Most assessments in education fall into one of two categories: lower-stakes formative assessment in classrooms, and higher-stakes summative assessments like year-end exams. Psychometrics is essential for establishing the reliability and validity of higher-stakes exams, and for equating the scores across different years. It is also important for formative assessments, which are moving towards adaptive formats because of the 50% reduction in test time, meaning that students spend less time testing and more time learning.
Universities typically do not give much thought to psychometrics even though a significant amount of testing occurs in higher education, especially with the move to online learning and MOOCs. Many of these exams are high stakes (consider a certificate exam after completing a year-long graduate program!). Psychometricians should therefore be involved in establishing legally defensible cutscores and in the statistical analysis needed to ensure reliable tests. Professionally designed assessment systems should also be used for developing and delivering the tests, especially with enhanced security.
Nathan Thompson, PhD