Nedelsky Method of Standard Setting

The Nedelsky method is an approach to setting the cutscore of an exam.  Originally suggested by Nedelsky (1954), it was an early attempt to bring a quantitative, rigorous procedure to the process of standard setting.  Quantitative approaches are needed to reduce the arbitrariness and subjectivity that would otherwise dominate the process of setting a cutscore.  […]

What is Psychometrics?

Psychometrics is the science of educational/psychological testing.  It studies how tests are developed, delivered, and scored.  Psychometricians tackle fundamental questions around assessment, such as how to determine if a test is reliable or if a question is of good quality, as well as much more complex questions like those listed below.  The goal of psychometrics […]

What are enemy items?

"Enemy items" is a psychometric term for two test questions (items) that should not appear on the same test version seen by a given examinee.  This applies to linear forms, but also to linear-on-the-fly testing (LOFT) and computerized adaptive testing (CAT).  There are several reasons why two items might […]

Incremental Validity

Incremental validity is an aspect of validity that refers to what an additional assessment or predictive variable can add to the information provided by existing assessments or variables.  It is the amount of “bonus” predictive power gained by adding another predictor.  In many cases, it is on the same or similar trait, but often the […]

Summative and Formative Assessment

Summative and formative assessment are crucial components of the educational process.  If you work in the assessment field, you have probably encountered these terms.  What do they mean? Summative assessment refers to an assessment that is at the end (sum) of an educational experience.  The “educational experience” can vary widely.  Perhaps it […]

Test Score Reliability and Validity

Test score reliability and validity are core concepts in the field of psychometrics and assessment.  Both refer to the quality of a test, the scores it produces, and how we use those scores.  Because test scores are often used for high-stakes purposes, it is paramount that the […]

Exam Development for Professional Credentialing

Exam development for professional credentialing (licensure and certification tests) is incredibly important.  Such exams serve as gatekeepers into many professions, often after people have invested a ton of money and years of their life in preparation.  Therefore, it is critical that they be developed well and have the necessary supporting documentation.  So […]

Finding the Best Online Assessment Software

Online assessment software is a key business platform for many types of organizations, from K-12 schools to universities to employers to certification/licensure bodies. There are many possible solutions in the marketplace, so how can you tell them apart? This blog post provides a list of important functionalities and components that you should consider. How to Evaluate […]

What is the coefficient alpha index of reliability?

Coefficient alpha, sometimes called Cronbach’s alpha, is a statistical index that is used to evaluate the internal consistency or reliability of an assessment. That is, it quantifies how consistent we can expect scores to be, by analyzing the item statistics. A high value indicates that the test is of high reliability, and a low value […]

Differential item functioning (DIF): Definition and examples

Differential item functioning (DIF) is a term in psychometrics for the statistical analysis of assessment data to determine whether items are performing in a biased manner against some group of examinees. Most often, this is based on a demographic variable such as gender, ethnicity, or first language. For example, you might analyze a test to […]