Certification Exam Development and Delivery
Content
- Certification Exam Development
- Job Analysis / Practice Analysis
- Test Specifications and Blueprints
- Item Development
- Pilot Testing
- Standard Setting
- Equating
- Psychometric Analysis & Reporting
- Exam Development: It’s a Vicious Cycle
- Certification Exam Delivery & Administration
- 1. Determine the best approach for certification exam administration and proctoring
- Timing: Cohorts/Windows vs Continuous Availability
- Mode: Paper vs Computer
- Location: Test centers vs Online proctored vs Events vs Multi-Modal
- Geography: State, National, or International
- Security: Low vs High
- Online proctoring: AI vs Recorded vs Live
- 2. Determine other technology, psychometric, and operational needs
- 3. Find a provider – or several!
- 4. Establish the new process with policies and documentation
- 5. Let Everyone Know
- 6. Roll Out
- Ready to start?
Certification exams are a critical component of workforce development for many professions and play a significant role in the global Testing, Inspection, and Certification (TIC) market, which was valued at approximately $359.35 billion in 2022 and is projected to grow at a compound annual growth rate (CAGR) of 4.0% from 2023 to 2030. As such, a lot of effort goes into exam development and delivery, working to ensure that the exams are valid and fair, then delivered securely yet with enough convenience to reach the target market. If you work for a certification organization or awarding body, this article provides a guidebook to that process and how to select a vendor.
Certification Exam Development
Certification exam development is a well-defined process governed by accreditation guidelines such as NCCA, requiring steps such as job task analysis and standard setting studies. For certification, and other credentialing like licensure or certificates, this process is incredibly important to establishing validity. Such exams serve as gatekeepers into many professions, often after people have invested a great deal of money and years of their lives in preparation. Therefore, it is critical that the tests be developed well and have the necessary supporting documentation to show that they are defensible.
So what exactly goes into developing a quality exam with sound psychometrics and the validity documentation to back it up, perhaps enough to achieve NCCA accreditation for your certification? There is a recognized process for certification exam development, though it is rarely exactly the same for every organization. In general, the accreditation guidelines say what you need to address, but leave the specific approach up to you. For example, you have to do a cutscore study, but you are allowed to choose the Bookmark method, the Angoff method, or another defensible method.
Job Analysis / Practice Analysis
A job analysis study provides the vehicle for defining the important job knowledge, skills, and abilities (KSA) that will later be translated into content on a certification exam. During a job analysis, important job KSAs are obtained by directly analyzing job performance of highly competent job incumbents or surveying subject-matter experts regarding important aspects of successful job performance. The job analysis generally serves as a fundamental source of evidence supporting the validity of scores for certification exams.
Test Specifications and Blueprints
The results of the job analysis study are quantitatively converted into a blueprint for the certification exam. Basically, it comes down to this: if the experts say that a certain topic or skill is done quite often or is very critical, then it deserves more weight on the exam, right? There are different ways to do this. My favorite article on the topic is Raymond & Neustel, 2006. Here’s a free tool to help.
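To make the idea concrete, here is a minimal sketch of one simple way to turn survey ratings into blueprint weights: multiply each domain's mean frequency rating by its mean criticality rating, then normalize. The domain names, the ratings, and the multiplicative rule are all assumptions for illustration; your program might use an additive model or a different scale.

```python
# Hypothetical SME survey means per content domain (1-5 scales); illustrative values only.
ratings = {
    "Patient assessment": {"frequency": 4.5, "criticality": 4.8},
    "Treatment planning": {"frequency": 3.9, "criticality": 4.2},
    "Documentation": {"frequency": 4.1, "criticality": 2.6},
    "Ethics and law": {"frequency": 2.2, "criticality": 4.6},
}


def blueprint_weights(ratings):
    """Weight each domain by frequency x criticality, normalized to sum to 1."""
    raw = {d: r["frequency"] * r["criticality"] for d, r in ratings.items()}
    total = sum(raw.values())
    return {d: v / total for d, v in raw.items()}


def allocate_items(weights, n_items):
    """Turn proportional weights into whole item counts using largest remainders."""
    exact = {d: w * n_items for d, w in weights.items()}
    counts = {d: int(x) for d, x in exact.items()}
    leftover = n_items - sum(counts.values())
    # Give the remaining items to the domains with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda d: exact[d] - counts[d], reverse=True)
    for d in by_remainder[:leftover]:
        counts[d] += 1
    return counts


weights = blueprint_weights(ratings)
counts = allocate_items(weights, n_items=100)
for domain in ratings:
    print(f"{domain}: {weights[domain]:.1%} -> {counts[domain]} items")
```

The largest-remainder step is only there so the item counts add up exactly to the form length; in practice the SME committee usually reviews and adjusts the final numbers.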
Item Development
After important job KSAs are established, subject-matter experts write test items to assess them. The end result is the development of an item bank from which exam forms can be constructed. The quality of the item bank also supports test validity. A key operational step is the development of an Item Writing Guide and holding an item writing workshop for the SMEs.
Pilot Testing
There should be evidence that each item in the bank actually measures the content that it is supposed to measure; in order to assess this, data must be gathered from samples of test-takers. After items are written, they are generally pilot tested by administering them to a sample of examinees in a low-stakes context—one in which examinees’ responses to the test items do not factor into any decisions regarding competency. After pilot test data is obtained, a psychometric analysis of the test and test items can be performed. This analysis will yield statistics that indicate the degree to which the items measure the intended test content. Items that appear to be weak indicators of the test content generally are removed from the item bank or flagged for item review so they can be reviewed by subject matter experts for correctness and clarity.
Note that this is not always possible, and is one of the ways that different organizations diverge in how they approach exam development.
Standard Setting
Standard setting is also a critical source of evidence supporting the validity of the pass/fail decisions made from scores on professional credentialing exams. Standard setting is a process by which a passing score (or cutscore) is established; this is the point on the score scale that differentiates between examinees who are and are not deemed competent to perform the job. In order to be valid, the cutscore cannot be arbitrarily defined. Two examples of arbitrary methods are the quota (setting the cutscore to produce a certain percentage of passing scores) and the flat cutscore (such as 70% on all tests). Both of these approaches ignore the content and difficulty of the test. Avoid these!
Instead, the cutscore must be based on one of several well-researched criterion-referenced methods from the psychometric literature. There are two types of criterion-referenced standard-setting procedures (Cizek, 2006): examinee-centered and test-centered.
The Contrasting Groups method is one example of a defensible examinee-centered standard-setting approach. This method compares the scores of candidates previously defined as Pass or Fail. The obvious drawback is that it requires a separate method of classifying candidates to already exist. Moreover, examinee-centered approaches such as this require data from examinees, but many testing programs wish to set the cutscore before publishing the test and delivering it to any examinees. Therefore, test-centered methods are more commonly used in credentialing.
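As a rough illustration of Contrasting Groups, the sketch below picks the cutscore that misclassifies the fewest candidates across the two predefined groups. The scores are made up, and minimizing misclassifications is only one of several ways to locate the cutscore (others use the point where the two score distributions intersect), so treat this as an assumption rather than the standard procedure.

```python
# Hypothetical scores; group membership comes from an external judgment
# (for example, supervisor ratings), not from the test itself.
competent = [78, 82, 75, 90, 68, 85, 73, 88, 80, 71]
not_competent = [55, 62, 70, 58, 66, 49, 72, 60, 64, 69]


def misclassified(cut, competent, not_competent):
    """Count candidates the cutscore would classify against their known group."""
    false_fails = sum(score < cut for score in competent)
    false_passes = sum(score >= cut for score in not_competent)
    return false_fails + false_passes


def contrasting_groups_cut(competent, not_competent):
    """Return the lowest candidate cutscore with the fewest misclassifications."""
    candidates = range(min(not_competent), max(competent) + 1)
    return min(candidates, key=lambda c: misclassified(c, competent, not_competent))


cut = contrasting_groups_cut(competent, not_competent)
print("cutscore:", cut, "misclassified:", misclassified(cut, competent, not_competent))
```

Ties are broken toward the lower score here; a program that considers false passes more costly than false fails might instead break ties upward or weight the two error types differently.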
The most frequently used test-centered method is the Modified Angoff Method (Angoff, 1971) which requires a committee of subject matter experts (SMEs). Another commonly used approach is the Bookmark Method.
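The arithmetic behind a Modified Angoff study is simple enough to sketch. Each SME estimates, for each item, the probability that a minimally competent candidate answers correctly; the ratings are averaged per item and summed across items to give the raw cutscore. The ratings below are invented for illustration, and a real study adds rater training, discussion rounds, and checks on rater agreement.

```python
from statistics import mean

# Hypothetical ratings: each row is one SME, each column is one item.
# A rating is the SME's estimate of the probability that a minimally
# competent candidate answers the item correctly.
ratings = [
    [0.70, 0.45, 0.90, 0.60, 0.80],
    [0.65, 0.50, 0.85, 0.50, 0.75],
    [0.75, 0.40, 0.95, 0.55, 0.85],
]


def angoff_cutscore(ratings):
    """Average the ratings per item, then sum the item means for the raw cutscore."""
    item_means = [mean(column) for column in zip(*ratings)]
    return sum(item_means), item_means


raw_cut, item_means = angoff_cutscore(ratings)
percent_cut = raw_cut / len(item_means)
print(f"raw cutscore: {raw_cut:.2f} of {len(item_means)} items ({percent_cut:.0%})")
```

Because the cutscore is built from item-level judgments, a harder form naturally gets a lower cutscore, which is exactly what a flat 70% rule cannot do.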
Equating
If the test has more than one form – which is required by NCCA Standards and other guidelines – they must be statistically equated. If you use classical test theory, there are methods like Tucker or Levine. If you use item response theory, you can either bake the equating into the item calibration process with software like Xcalibre, or use conversion methods like Stocking & Lord.
What does this process do? Well, if this year’s certification exam had an average score 3 points higher than last year’s, how do you know whether this year’s version was 3 points easier, this year’s cohort was 3 points smarter, or a mixture of both? Equating separates differences in form difficulty from differences in candidate ability. Learn more here.
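To show the basic idea, here is a minimal sketch of plain linear equating under a random-groups design, where the two forms are given to randomly equivalent groups so that any difference in means reflects form difficulty. This is deliberately simpler than the Tucker, Levine, or IRT methods named above, which handle nonequivalent groups through anchor items. The scores are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical raw scores from two randomly equivalent groups of candidates.
form_x_scores = [62, 70, 75, 68, 80, 73, 66, 78]  # new form
form_y_scores = [59, 67, 72, 65, 77, 70, 63, 75]  # reference form


def linear_equate(x, x_scores, y_scores):
    """Map a Form X score onto the Form Y scale by matching means and SDs."""
    slope = pstdev(y_scores) / pstdev(x_scores)
    return mean(y_scores) + slope * (x - mean(x_scores))


# Form X averages 3 points higher with equivalent groups, so it is 3 points easier.
# A score of 73 on Form X therefore corresponds to 70 on the reference form.
equated = linear_equate(73, form_x_scores, form_y_scores)
print(f"Form X score 73 is equivalent to Form Y score {equated:.1f}")
```

In this example, a cutscore of 70 set on the reference form would be applied as 73 on the new form, so candidates are not rewarded or penalized for which form they happened to receive.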
Psychometric Analysis & Reporting
This part is an absolutely critical step in the exam development cycle for professional credentialing. You need to statistically analyze the results to flag any items that are not performing well, so you can replace or modify them. This looks at statistics like item p-value (difficulty), item point biserial (discrimination), option/distractor analysis, and differential item functioning. You should also look at overall test reliability/precision and other psychometric indices. If you are accredited, you need to perform year-end reports and submit them to the governing body. Learn more about item and test analysis.
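The core classical statistics can be sketched in a few lines. The example below computes each item's p-value and a corrected point-biserial (the correlation between the item and the total score on the remaining items), plus KR-20 reliability. The response data and the 0.10 flagging threshold are assumptions for illustration; real programs set their own flagging rules and use far larger samples.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical scored responses: rows are examinees, columns are items (1 = correct).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]


def pearson(x, y):
    """Pearson correlation using population moments."""
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (pstdev(x) * pstdev(y))


def item_analysis(responses):
    """Return p-value and corrected point-biserial for each item."""
    totals = [sum(row) for row in responses]
    stats = []
    for j, item in enumerate(zip(*responses)):
        rest = [t - x for t, x in zip(totals, item)]  # total score excluding this item
        stats.append({"item": j + 1, "p": mean(item), "rpbis": pearson(item, rest)})
    return stats


def kr20(responses):
    """KR-20 reliability for dichotomously scored items."""
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    sum_pq = sum(mean(item) * (1 - mean(item)) for item in zip(*responses))
    return (k / (k - 1)) * (1 - sum_pq / pvariance(totals))


stats = item_analysis(responses)
flagged = [s["item"] for s in stats if s["rpbis"] < 0.10]  # assumed threshold
for s in stats:
    print(f"item {s['item']}: p={s['p']:.2f}, rpbis={s['rpbis']:.2f}")
print("flagged for review:", flagged, "| KR-20:", round(kr20(responses), 2))
```

An item with a negative point-biserial is being answered correctly more often by low scorers than by high scorers, which usually points to a miskeyed answer or an ambiguous stem and is worth sending back to the SMEs.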
Exam Development: It’s a Vicious Cycle
Now, consider the big picture: in many cases, an exam is not a one-and-done thing. It is re-used, perhaps continually. Often there are new versions released, perhaps based on updated blueprints or simply to swap out questions so that they don’t get overexposed. That’s why this is better conceptualized as an exam development cycle, like the circle shown above. Often some steps like Job Analysis are only done once every 5 years, while the rotation of item development, piloting, equating, and psychometric reporting might happen with each exam window (perhaps you do exams in December and May each year).
ASC has extensive expertise in managing this cycle for professional credentialing exams, as well as many other types of assessments. Get in touch with us to talk to one of our psychometricians.
Certification Exam Delivery & Administration
Certification exam administration and proctoring is a crucial component of the professional credentialing process. Certification exams are expensive to develop well, so an organization wants to protect that investment by delivering the exam with appropriate security so that items are not stolen. Moreover, there is an obvious incentive for candidates to cheat. So, a certification body needs appropriate processes in place to deliver the certification exams. Here are some tips.
1. Determine the best approach for certification exam administration and proctoring
Here are a few of the considerations to take into account. These can be crossed with each other, such as delivering paper exams at Events vs. Test Centers.
Timing: Cohorts/Windows vs Continuous Availability
Do you have cohorts, where events make more sense, or do you need continuous availability? For example, if the test is tied to university training programs that graduate candidates in December and May each year, that affects your need for delivery. Alternatively, some certifications are not tied to such training; candidates might only have to show work experience. In those cases, candidates are ready to take the test continuously throughout the year.
Mode: Paper vs Computer
Does it make more sense to deliver the test on paper or on computer? This used to be a cost issue, but now the cost of computerized delivery, especially with online proctoring at home, has dropped significantly while saving so much time for candidates. Also, some exam types like clinical simulations can only be delivered on computers.
Location: Test centers vs Online proctored vs Events vs Multi-Modal
Some types of tests require events, such as a clinical assessment in an actual clinic with standardized patients. Some tests can be taken anywhere. Exam events can also coincide with other events; perhaps you have online delivery through the year but deliver a paper version of the test at your annual conference, for convenience.
Do you have an easy way to make your own locations, if you are considering that? One example is that you have quarterly regional conferences for your profession, where you could simply get a side room to deliver your test to candidates since they will already be there. Another is that most of your candidates are coming from training programs at universities, and you are able to use classrooms at those universities.
Geography: State, National, or International
If your exam is for a small US state or a small country, it might be easy to require exams in a test center, because you can easily set up only one or two test centers to cover the geography. Some certifications are international, and need to deliver on-demand throughout the year; those are a great fit for online.
Security: Low vs High
If your test has extremely high stakes, there is extremely high incentive to cheat. An entry-level certification on WordPress is different than a medical licensure exam. The latter is a better fit for test centers, while the former might be fine with online proctoring on-demand.
Online proctoring: AI vs Recorded vs Live
If you choose to explore this approach, here are three main types to evaluate.
A. AI only: AI-only proctoring means that there are no humans. The examinee is recorded on video, and AI algorithms flag potential issues, such as if they leave their seat, then notify an administrator (usually a professor) of students with a high number of flags. This approach is usually not relevant for certifications or other credentialing exams; it is more for low-stakes exams like a Psychology 101 midterm at your local university. The vendors for this approach are interested in large-scale projects, such as proctoring all midterms and finals at a university, perhaps hundreds of thousands of exams per year.
B. Record and Review: Record and review proctoring means that the examinee is recorded on video, and that video is later watched by a real human who flags it if they think there is cheating, theft, or other issues. This is much higher quality, and higher price, but has one major flaw that might be concerning for certification tests: if someone steals your test by taking pictures, you won’t find out until tomorrow. But at least you know who it was and you are certain of what happened, with video proof. This approach is perhaps useful for microcredentials or recertification exams.
C. Live Online Proctoring: Live online proctoring (LOP), or what I call “live human proctoring” (because some AI proctoring is also “live” in real time!) means that there is a professional human proctor on the other side of the video from the examinee. They check the examinee in, confirm their identity, scan the room, provide instructions, and actually watch them take the test. Some providers like MonitorEDU even have the examinee make a second video stream on their phone, which is placed on a bookshelf or similar spot to see the entire room through the test. Certainly, this approach is a very good fit with certification exams and other credentialing. You protect the test content as well as the validity of that individual’s score; that is not possible with the other two approaches.
We have also prepared a list of the best online proctoring software platforms.
2. Determine other technology, psychometric, and operational needs
Next, your organization should establish any other needs for your exams that could impact the vendor selection.
- Do you require special item types, such that the delivery platform needs to support or integrate with them?
- Do you have simulations or OSCEs?
- Do you have specific needs around accessibility and accommodations for your candidates?
- Do you need adaptive testing or linear on the fly testing?
- Do you need extensive Psychometric consulting services?
- Do you need an integrated registration and payment portal? Or a certification management system to track expirations and other important information?
Write all these up so that you can use the list to shop for a provider.
3. Find a provider – or several!
While it might seem easier to find a single provider for everything, that’s often not the best solution. Look for those vendors that specifically fit your needs.
For example, most providers of remote proctoring are just that: remote proctoring. They do not have a professional platform to manage item banks, schedule examinees, deliver tests, create custom score reports, and analyze psychometrics. Some do not even integrate with such platforms, and only integrate with learning management systems like Moodle, seeing as their entire target market is only low-stakes university exams. So if you are seeking a vendor for certification testing or other credentialing, the list of potential vendors is smaller.
Likewise, there are some vendors that only do the exam development and psychometrics, but lack a software platform and proctoring services for delivery. In these cases, they might have very specific expertise, and often have lower costs due to lower overhead. An example is JML Testing Services.
Once you have some idea what you are looking for, start shopping for vendors that provide services for certification exam delivery, development, and scoring. In some cases, you might not settle on a certain approach right away, and that’s OK. See what is out there and compare prices. Perhaps the cost of Live Remote Proctoring is more affordable than you anticipated, and you can upgrade to that.
Besides a simple Google search, some good places to start are the member listings of the Association of Test Publishers and the Institute for Credentialing Excellence.
4. Establish the new process with policies and documentation
Once you have finalized your vendors, you need to write policies and documentation around them. For example, if your vendor has a certain login page for proctoring (we have ascproctor.com), you should take relevant screenshots and write up a walkthrough so candidates know what to expect. Much of this should go into your Candidate Handbook. Some of the things to cover that are specific to exam day for the candidates:
- How to prepare for the exam
- How to take a practice test
- What is allowed during the exam
- What is not allowed
- ID needed and the check-in process
- Details on specific locations (if using locations)
- Rules for accessibility and accommodations
- Time limits and other practical considerations in the exam
Next, consider all the things that are impacted other than exam day.
- Eligibility pathways and applications
- Registration and scheduling
- Candidate training and practice tests
- Reporting: just to the candidates, or perhaps to training programs as well?
- Accounting and other operations: consider your business needs, such as how you manage money, monthly accounting reports, etc.
- Test security plan: What do you do if someone is caught taking pictures of the exam with their phone, or if another security incident occurs?
5. Let Everyone Know
Once you have written up everything, make sure all the relevant stakeholders know. Publish the new Candidate Handbook and announce to the world. Send emails to all upcoming candidates with instructions and an opportunity for a practice exam. Put a link on your homepage. Get in touch with all the training programs or universities in your field. Make sure that everyone has ample opportunity to know about the new process!
6. Roll Out
Finally, of course, you can implement the new approach to certification exam delivery. You might launch a new certification exam from scratch, or perhaps you are moving one from paper to online with remote proctoring, or some other change. Either way, you need a date to start using it and a change management process. The good news is that, even though it’s probably a lot of work to get here, the new approach is probably going to save you time and money in the long run. Roll it out!
Also, remember that this is not a single point in time. You will need to keep updating the process as your program changes. You should also consider implementing audits or quality control as a way to drive improvement.
Ready to start?
Certification exam delivery is the process of administering a certification test to candidates. This might seem straightforward, but it is surprisingly complex. The greater the scale and the stakes, the more potential threats and pitfalls. Assessment Systems Corporation is one of the world leaders in the development and delivery of certification exams. Contact us to get a free account in our platform and experience the examinee process, or to receive a demonstration from one of our experts.
Nathan Thompson, PhD