Since the first tests were developed 2000 years ago for entry into the civil service of Imperial China, test security has been a concern. The reason is quite straightforward: most threats to test security are also threats to validity, and the decisions we make with test scores could therefore be invalid, or at least suboptimal. It is therefore imperative that organizations that develop or utilize tests also develop a Test Security Plan (TSP). The TSP is a document that helps an organization anticipate test security issues, establish deterrent and detection methods, and plan responses. It can also cover validity threats that are not security-related, such as how to deal with examinees who have low motivation.
There are several reasons to develop a Test Security Plan. First, it drives greater security and therefore validity. Second, it enhances the legal defensibility of the testing program. Third, it helps to safeguard the content, which is typically an expensive investment for any organization that develops its own tests. Fourth, if incidents do happen, they can be dealt with more swiftly and effectively. Finally, it helps to coordinate all of the security-related efforts.
The development of such a complex document requires a strong framework. We advocate a framework with three phases: planning, implementation, and response. In addition, the TSP should be revised periodically.
Phase 1: Planning
The first step in this phase is to list all potential threats to each assessment program at your organization. This could include harvesting of test content, preknowledge of test content from past harvesters, copying from other examinees, proxy testers, proctor help, and outside help. Next, these should be rated on axes that are important to the organization; a simple approach would be to rate each threat on potential impact to score validity, cost to the organization, and likelihood of occurrence. This risk assessment exercise will inform the remainder of the framework.
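To make the multi-axis rating concrete, here is a minimal sketch in Python. The 1-to-5 scale, the example ratings, and the choice to multiply the three axes into a single priority score are all assumptions made for illustration; an organization might prefer to weight or sum the axes, or simply review the ratings side by side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    """One threat, rated 1 (low) to 5 (high) on each axis (hypothetical scale)."""
    name: str
    impact: int      # potential impact on score validity
    cost: int        # cost to the organization
    likelihood: int  # likelihood of occurrence

    def priority(self) -> int:
        # Assumption: combine the axes by multiplying them.
        return self.impact * self.cost * self.likelihood


def rank_threats(threats):
    """Return the threats sorted from highest to lowest priority."""
    return sorted(threats, key=lambda t: t.priority(), reverse=True)


# Illustrative ratings only; these are not real data.
threats = [
    Threat("proxy testers", impact=5, cost=2, likelihood=1),
    Threat("content harvesting", impact=5, cost=5, likelihood=3),
    Threat("copying from other examinees", impact=3, cost=1, likelihood=4),
]

for threat in rank_threats(threats):
    print(f"{threat.priority():>3}  {threat.name}")
```

The ranked list is only a starting point for discussion; the value of the exercise lies in agreeing on the ratings with stakeholders.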
Next, the organization should develop the TSP. The first piece is to identify deterrents and procedures to reduce the possibility of issues. This includes delivery procedures (such as a lockdown browser or proctoring), proctor training manuals, a strong candidate agreement, anonymous reporting pathways, confirmation testing, and candidate identification requirements. The second piece is to explicitly plan for psychometric forensics. This can range from complex collusion indices based on item response theory to simple flags, such as a candidate selecting a certain multiple-choice option more than 50% of the time, or obtaining a score in the top 10% while being in the lowest 10% of testing time. The third piece is to establish planned responses. What will you do if a proctor reports that two candidates were copying from each other? What if someone obtains a high score in an unreasonably short time? What if someone obviously did not try to pass the exam, but still sat there for the allotted time? If a candidate were to lose a job opportunity due to your response, it helps your defensibility to show that the process was established ahead of time with the input of important stakeholders.
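The two simple flags mentioned above can be sketched in a few lines of Python. This is an illustrative sketch, not a forensic tool: the function names, the record layout (`id`, `score`, `minutes`), and the rank-based way of defining "top 10%" and "lowest 10%" are assumptions, and a flag should only ever trigger a review, never a decision on its own.

```python
import math
from collections import Counter


def option_overuse(responses, threshold=0.5):
    """True if any single option accounts for more than `threshold` of responses."""
    if not responses:
        return False
    most_common_count = Counter(responses).most_common(1)[0][1]
    return most_common_count / len(responses) > threshold


def fast_high_scorers(candidates, fraction=0.10):
    """IDs of candidates in the top `fraction` of scores AND the lowest `fraction` of times.

    Assumption: cutoffs are rank-based, taken from the k-th best score and
    the k-th shortest time, where k = ceil(fraction * number of candidates).
    """
    if not candidates:
        return set()
    k = max(1, math.ceil(fraction * len(candidates)))
    score_cutoff = sorted((c["score"] for c in candidates), reverse=True)[k - 1]
    time_cutoff = sorted(c["minutes"] for c in candidates)[k - 1]
    return {
        c["id"]
        for c in candidates
        if c["score"] >= score_cutoff and c["minutes"] <= time_cutoff
    }


# Hypothetical data: ten candidates, where candidate 10 has the highest score
# and by far the shortest testing time.
candidates = [{"id": i, "score": 45 + 5 * i, "minutes": 60 + i} for i in range(1, 10)]
candidates.append({"id": 10, "score": 95, "minutes": 20})

print(fast_high_scorers(candidates))
print(option_overuse(["A", "A", "A", "B"]))
```

The thresholds are parameters on purpose, so that the values written into the TSP (50% and 10% here) can be changed without touching the logic.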
Phase 2: Implementation
The second phase is to implement the relevant aspects of the Test Security Plan, such as training all proctors in accordance with the manual and login procedures, setting IP address limits, or ensuring that a new secure testing platform with lockdown is rolled out to all testing locations. There are generally two approaches. Proactive approaches attempt to reduce the likelihood of issues in the first place, and reactive methods happen after the test is given. The reactive methods can be observational, quantitative, or content-focused. Observational methods include proctor reports or an anonymous tip line. Quantitative methods include psychometric forensics, for which you will need software like SIFT. Content-focused methods include automated web crawling.
Both approaches require continuous attention. You might need to train new proctors several times per year, or update your lockdown browser. If you use a virtual proctoring service based on record-and-review, flagged candidates must be periodically reviewed. The reactive methods are similar: incoming anonymous tips or proctor reports must be dealt with as they arrive. The least continuous aspect is the portion of psychometric forensics that depends on large-scale data analysis; for example, you might gather data from tens of thousands of examinees in a testing window and only be able to run a complete analysis at that point, which could take several weeks.
Phase 3: Response
The third phase is, of course, to put your planned responses into motion if issues are detected. Some of these could be relatively innocuous; if a proctor is reported as not following procedures, they might need some remedial training, and it’s certainly possible that no security breach occurred. The more dramatic responses include actions taken against the candidate. The most lenient is to provide a warning or simply ask them to retake the test. The most extreme responses include a full invalidation of the score with future sanctions, such as a five-year ban on taking the test again, which could prevent someone from entering a profession for which they spent eight years and hundreds of thousands of dollars in educational preparation.
What does a test security plan mean for me?
It is clear that test security threats are also validity threats, and that the extensive (and expensive!) measures warrant a strategic and proactive approach in many situations. A framework like the one advocated here will help organizations identify and prioritize threats so that the measures are appropriate for a given program. Note that the results can be quite different across programs if an organization has several, ranging from a practice test to an entry-level screening test to a promotional test to a professional certification or licensure exam.
Another important difference is that between test sponsors/publishers and test consumers. In the case of an organization that purchases off-the-shelf pre-employment tests, the validity of score interpretations is of more direct concern, while the theft of content might not be an immediate concern. Conversely, the publisher of such tests has invested heavily in the content and could be massively impacted by theft, while copying between two examinees at the hiring organization is not of immediate concern.
In summary, there are more security threats, deterrents, procedures, and psychometric forensic methods than can be discussed in one blog post, so the focus here is on the framework itself. To get started, think strategically about test security and how it impacts your assessment programs by using the multi-axis rating approach, then begin to develop a Test Security Plan. The end goal is to improve the health and validity of your assessments.