Criterion-related validity is evidence that test scores relate to other data in the way we expect them to.  This is an essential part of the larger issue of test score validity, which is the process of providing evidence that test scores have the meaning we intend them to have.  If you’ve ever felt that a test doesn’t cover what it should be covering, or that it doesn’t reflect the skills needed to perform the job you are applying for – that’s validity.

What is criterion-related validity?

Criterion-related validity is an aspect of test score validity which refers to evidence that scores from a test correlate with an external variable that they should correlate with.  In many situations, this is the critical consideration for a test; for example, a university admissions exam would be quite suspect if scores did not correlate well with high school GPA or accurately predict university GPA.  That is literally its purpose for existence, so we want some proof that the test is performing that way.  A test serves its purpose, and people have faith in it, when we have such highly relevant evidence.

Incremental validity is a specific aspect of criterion-related validity that assesses the added predictive value of a new assessment or variable beyond the information provided by existing measures.  There are two approaches to establishing criterion-related validity: concurrent and predictive.  There are also two directions: discriminant and convergent.
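As a quick illustration, here is a minimal sketch of how incremental validity is often quantified, using simulated, entirely hypothetical data: fit a regression with the existing measure, add the new test, and compare the R-squared values.  The variable names and effect sizes below are made up purely for demonstration.

```python
import numpy as np
import statsmodels.api as sm

# Simulated, hypothetical data: an existing predictor (e.g., an interview rating),
# a new test score, and an outcome (e.g., a job performance rating).
rng = np.random.default_rng(42)
n = 500
interview = rng.normal(size=n)
new_test = 0.4 * interview + rng.normal(size=n)              # correlated with the interview
performance = 0.3 * interview + 0.25 * new_test + rng.normal(size=n)

# Step 1: baseline model with the existing measure only
X1 = sm.add_constant(interview)
r2_baseline = sm.OLS(performance, X1).fit().rsquared

# Step 2: add the new test and see how much R-squared improves
X2 = sm.add_constant(np.column_stack([interview, new_test]))
r2_full = sm.OLS(performance, X2).fit().rsquared

print(f"R2 with existing measure only: {r2_baseline:.3f}")
print(f"R2 with new test added:        {r2_full:.3f}")
print(f"Incremental validity (delta R2): {r2_full - r2_baseline:.3f}")
```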

Concurrent validity

The concurrent approach to criterion-related validity means that we are looking at variables at the same point in time, or at least very close to it.  In the example of university admissions testing, this would be correlating the test scores with high school GPA.  The students would most likely be just finishing high school at the time they took the test, excluding special cases like students who take a gap year before university.
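In practice, the concurrent validity coefficient is simply the correlation between the two measures collected at (roughly) the same time.  Here is a minimal sketch with simulated, hypothetical numbers; the score scale and GPA values are invented for illustration.

```python
import numpy as np

# Simulated, hypothetical data: admissions test scores and high school GPA,
# both collected at roughly the same point in time.
rng = np.random.default_rng(0)
n = 1000
ability = rng.normal(size=n)
test_score = 500 + 100 * (0.8 * ability + rng.normal(scale=0.6, size=n))
hs_gpa = np.clip(3.0 + 0.5 * (0.7 * ability + rng.normal(scale=0.7, size=n)), 0.0, 4.0)

# Concurrent validity coefficient: Pearson correlation of the two measures
r = np.corrcoef(test_score, hs_gpa)[0, 1]
print(f"Concurrent validity (r with high school GPA): {r:.2f}")
```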

Predictive validity

The predictive validity approach, as its name suggests, concerns the prediction of future variables.  For instance, in university admissions testing, we use test scores to predict outcomes like university GPA or graduation rates.  Studies show that SAT scores correlate with college GPA, with values typically ranging from 0.30 to 0.50, reflecting data from predictive validity research conducted by organizations like the College Board.  A common application of this is pre-employment testing, where job candidates are tested with the goal of predicting positive variables like job performance, or variables that the employer might want to avoid, like counterproductive work behavior.  Which leads us to the next point…
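Here is a minimal sketch of a predictive validity analysis for the pre-employment case, again with simulated, hypothetical data: correlate test scores collected at hiring with performance ratings gathered later, and fit a simple regression line so a new applicant’s score can be turned into a predicted rating.

```python
import numpy as np

# Simulated, hypothetical data: pre-employment test scores at hiring time and
# supervisor job-performance ratings collected a year later.
rng = np.random.default_rng(7)
n = 400
test_score = rng.normal(50, 10, size=n)
performance = 2.0 + 0.03 * test_score + rng.normal(scale=0.6, size=n)

# Predictive validity coefficient: correlation between the earlier test
# and the later criterion.
r = np.corrcoef(test_score, performance)[0, 1]

# Simple regression line for predicting future performance from a test score.
slope, intercept = np.polyfit(test_score, performance, 1)
print(f"Predictive validity (r): {r:.2f}")
print(f"Predicted rating for a score of 65: {intercept + slope * 65:.2f}")
```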

Convergent validity

Convergent validity refers to criterion-related validity where we want a positive correlation, such as test scores with job performance or university GPA.  This is frequently the case with criterion-related validity studies.  One thing to be careful of here is differential prediction, also known as predictive bias.  This is where the test predicts the criterion differently for one group of examinees, often a certain demographic group, even though the average score might be the same for each group.
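One common way to check for differential prediction is moderated regression: regress the criterion on test score, group membership, and their interaction, and examine the group-related coefficients.  A minimal sketch with simulated, hypothetical data (the group labels and effect sizes are invented) might look like this.

```python
import numpy as np
import statsmodels.api as sm

# Simulated, hypothetical data for two demographic groups
rng = np.random.default_rng(1)
n = 600
group = rng.integers(0, 2, size=n)                  # 0/1 group indicator
score = rng.normal(size=n)
# Criterion built with a slightly different slope for group 1 (predictive bias)
gpa = 3.0 + 0.40 * score + 0.15 * group * score + rng.normal(scale=0.5, size=n)

# Moderated regression: criterion ~ score + group + score*group
X = sm.add_constant(np.column_stack([score, group, score * group]))
fit = sm.OLS(gpa, X).fit()

# A notable group or interaction coefficient suggests differential prediction
print(fit.params)    # order: [intercept, score, group, interaction]
print(fit.pvalues)
```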

Here is an example of the data you might evaluate for predictive convergent validity of a university admissions test.

[Figure: predictive validity – example data for a university admissions test]
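If you wanted to produce a plot like this yourself, a minimal sketch with simulated, hypothetical admissions data might look like the following; the scores, GPAs, and resulting correlation are invented purely for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Simulated, hypothetical admissions data: test scores and later university GPA
rng = np.random.default_rng(3)
n = 300
test_score = rng.normal(500, 100, size=n)
uni_gpa = np.clip(1.0 + 0.004 * test_score + rng.normal(scale=0.45, size=n), 0.0, 4.0)

r = np.corrcoef(test_score, uni_gpa)[0, 1]

# Scatterplot of the kind you might include in a predictive validity report
plt.scatter(test_score, uni_gpa, alpha=0.5)
plt.xlabel("Admissions test score")
plt.ylabel("University GPA")
plt.title(f"Predictive validity: r = {r:.2f}")
plt.show()
```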

Discriminant validity

Unlike convergent validity, discriminant validity is where we want a test to correlate negatively, or not at all, with other variables.  As noted above, some pre-employment tests have this case.  An integrity or conscientiousness assessment should correlate negatively with instances of counterproductive work behavior, perhaps quantified as the number of disciplinary marks on employee HR files.  In some cases, the goal might be to find a zero correlation.  That can be the case with noncognitive traits, where a measure of conscientiousness should not have a strong correlation in either direction with the other members of the Big Five.
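A minimal sketch of a discriminant validity check, again with simulated, hypothetical data: build a correlation matrix and verify that conscientiousness correlates negatively with disciplinary marks and near zero with an unrelated trait such as extraversion.

```python
import numpy as np
import pandas as pd

# Simulated, hypothetical data: conscientiousness scores, counts of disciplinary
# marks (counterproductive work behavior), and an unrelated Big Five trait.
rng = np.random.default_rng(5)
n = 800
conscientiousness = rng.normal(size=n)
extraversion = rng.normal(size=n)                 # should correlate near zero
cwb_rate = np.exp(-0.5 * conscientiousness)       # higher trait -> fewer marks
disciplinary_marks = rng.poisson(cwb_rate)

df = pd.DataFrame({
    "conscientiousness": conscientiousness,
    "disciplinary_marks": disciplinary_marks,
    "extraversion": extraversion,
})

# Discriminant evidence: negative correlation with disciplinary marks,
# near-zero correlation with extraversion.
print(df.corr().round(2))
```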

The big picture

Validity is a complex topic with many aspects.  Criterion-related validity is only one part of the picture.  However, as seen in some of the examples above, it is profoundly important for some types of assessment, especially where the exam exists primarily to predict some future variable.

Want to delve further into validity?  The classic reference is Cronbach & Meehl (1955).  We also recommend work by Messick, such as this one.  Of course, check the standards relevant to your assessment, such as AERA/APA/NCME or NCCA.

Nathan Thompson, PhD

Nathan Thompson earned his PhD in Psychometrics from the University of Minnesota, with a focus on computerized adaptive testing. His undergraduate degree was from Luther College with a triple major of Mathematics, Psychology, and Latin. He is primarily interested in the use of AI and software automation to augment and replace the work done by psychometricians, which has provided extensive experience in software design and programming. Dr. Thompson has published over 100 journal articles and conference presentations, but his favorite remains https://scholarworks.umass.edu/pare/vol16/iss1/1/ .