Meet high standards, then exceed them.

Can psychometric consulting help you get accredited?

Are your program's registration numbers meeting your expectations? What kind of response are you getting from your examinees and stakeholders? Why do people want to take your assessment in the first place? Now may be the time to take a second look at your assessment program and refocus your business goals. Perhaps what your assessment needs is the credibility, with both examinees and industry stakeholders, that comes with accreditation.

We understand that the accreditation process can be daunting, especially if you're just getting started. Where do you begin? Our psychometricians have seen the accreditation process unfold many times before, and we've designed our consulting solutions to feed seamlessly into your accreditation goals.

No two projects are the same, and we can step in at virtually any point in your process to find the best way to meet your unique expectations. Our psychometric consultation approach, while relevant to many assessments, aligns with standards such as the AERA/APA/NCME Standards, the NCCA accreditation standards, and the ITC guidelines, and can be just the boost you need to achieve accreditation for your certification and meet—nay, exceed—your business goals.

Our Approach:


What are we trying to measure? What is the point of our program or assessment? Are we going to certify just basket weavers, or do we want to offer certification for underwater basket weavers, too? If we decide to create certifications for multiple domains, we need to determine what overlap exists between the domains and what makes each one distinct.


Once we know what we're trying to measure, we have to go out and learn everything we can about it. A job analysis (or Job Task Analysis, JTA) is a systematic approach to determining what makes up a job. We recommend that certification programs perform a JTA about once every two years and that new forms are built at least once every 3-5 years.


We take the report generated from the JTA and translate it into test specifications. What percentage of the examination should come from each of the KSAs identified? We use the rating data from SMEs to make these determinations. The output is another report, using the JTA as the justification for the decisions. This process is closely related to form assembly, in which our psychometricians are highly experienced.
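As a rough illustration of that step, the translation from SME ratings to blueprint weights can be sketched in a few lines of Python. The domain names, rating scale, and numbers here are invented for the example; a real JTA involves far more data and judgment:

```python
# Hypothetical sketch: turning SME importance ratings from a JTA
# into test blueprint percentages. All names and numbers are invented.
sme_ratings = {
    "Safety procedures":  [4, 5, 4, 5],   # one rating per SME, 1-5 scale
    "Materials handling": [3, 3, 4, 3],
    "Quality control":    [5, 4, 5, 4],
}

def blueprint_weights(ratings):
    """Average each domain's ratings, then normalize to percentages."""
    means = {d: sum(r) / len(r) for d, r in ratings.items()}
    total = sum(means.values())
    return {d: round(100 * m / total, 1) for d, m in means.items()}

weights = blueprint_weights(sme_ratings)
# Domains the SMEs rate as more important receive a larger share of items.
```

On a 100-item exam, a domain weighted at 36.7% would get roughly 37 items.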


Now that we have content areas and we know roughly how many items we need in each KSA/domain, it’s time to write items. Sometimes this involves the training of SMEs, and other times psychometricians write the items, even in highly technical areas. Item writers draw on their expertise, or the material they’ve been provided, to write items that map back to the test blueprint defined above, which we document to fulfill accreditation standards. A comprehensive review of items happens next. Things we look for: possible partially correct answers, questions that are offensive or biased, items that aren’t relevant, items that are too hard or easy, and items that are too specific.


Ideally, before a certification program goes live, there would be some beta testing of the items. This means giving them to people in our target population (those attempting to be certified) and using the data from that administration to inform decisions about which items to use on the final test forms. We usually don't report scores during pilot testing, whereas live testing usually has immediate score reporting.


Using the data from the beta administration, we do a thorough review of each item's content and performance. Some items might be revised, others thrown out completely. Once the live test administration window is done, it's time to revisit the statistics. We run a test and item analysis, most often using Classical Test Theory, to evaluate the performance of the exam, items, and options. At this point, we may determine that some items performed so poorly that they should be removed (i.e., not count toward candidates' scores), or we might identify that some items were miskeyed in the database.


Next, it's time to set a cutscore, or pass/fail point. The most common way we do this is the Modified Angoff Method. We call the SMEs back in, and each SME gives each item a rating between 0 and 100, estimating the percentage of minimally competent candidates they believe would answer the item correctly. We use a panel of SMEs, then aggregate and analyze their ratings. There are multiple discussions, usually surrounding the items on which the SMEs disagreed. In the end, we take all of this data into consideration when establishing a cut point.
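The arithmetic at the heart of the Modified Angoff method is straightforward, even though the panel discussions are where the real work happens. Here is a minimal Python sketch using invented panel ratings for a 10-item exam:

```python
# Hypothetical Modified Angoff aggregation; all ratings are invented.
# Each list holds one SME's ratings: the estimated % of minimally
# competent candidates who would answer each item correctly.
angoff_ratings = {
    "SME 1": [70, 60, 85, 55, 90, 65, 75, 80, 60, 70],
    "SME 2": [65, 55, 80, 60, 85, 70, 70, 85, 55, 75],
    "SME 3": [75, 65, 90, 50, 95, 60, 80, 75, 65, 70],
}

def angoff_cutscore(ratings):
    """Average each SME's item ratings, then average across the panel.
    The result is the recommended cut as a percent-correct score."""
    sme_means = [sum(r) / len(r) for r in ratings.values()]
    return sum(sme_means) / len(sme_means)

cut = angoff_cutscore(angoff_ratings)
```

In practice the panel would also examine the items with the widest spread of ratings and discuss before the final cut is adopted.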


Now that we've done some analysis, there's likely equating that needs to be done. No matter how closely we worked to make the forms equivalent during form assembly, it's inevitable that there will be differences post-delivery. It's important to eliminate these so that test scores are comparable across the different forms (i.e., so that Joe's score of 110 on Form A means the same thing as Melissa's score of 110 on Form B). This also means making sure that our classifications are consistent across forms. For instance, we want to be sure that a candidate who takes Form A and fails would also have failed had they gotten Form B.
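For a sense of what equating can look like in the simplest case, here is a linear (mean-sigma) equating sketch in Python, using invented score data. Real equating designs are considerably more involved, but the idea of mapping one form's scores onto another's scale is the same:

```python
import statistics

# Invented raw scores from candidates on two forms of the same exam.
form_a_scores = [98, 105, 110, 112, 120, 101, 108, 115]
form_b_scores = [95, 102, 107, 109, 117, 98, 105, 112]

def linear_equate(score_b, a_scores, b_scores):
    """Place a Form B score on the Form A scale by matching the two
    forms' means and standard deviations (mean-sigma linear equating)."""
    slope = statistics.stdev(a_scores) / statistics.stdev(b_scores)
    return statistics.mean(a_scores) + slope * (score_b - statistics.mean(b_scores))

# Here Form B ran a bit harder, so a Form B raw score maps to a
# slightly higher score on the Form A scale.
equated = linear_equate(107, form_a_scores, form_b_scores)
```

After equating, the same cutscore can be applied consistently, so a pass/fail decision does not depend on which form a candidate happened to receive.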


Now that we have everyone on the same scale, we can send out score information to test takers. This needs to include standards specified by the accrediting entity.

Our consultation is designed to help you conduct your program correctly, defensibly, and ethically the first time around, making it that much easier to achieve accreditation and give your assessment program the boost it needs. At Assessment Systems, we want all assessments to be smarter, faster, and fairer.

That includes yours.


Brian Long

Brian Long is the Director of Human Resources & Communication at Assessment Systems.
