Last night, I had the honor to sit on a panel discussing Current Themes in Educational Assessment at an Educelerate event. Educelerate “is a networking forum for people who are passionate about promoting innovation and entrepreneurship in education – particularly through the use of technology.” It is a national network, and the Twin Cities has an active chapter due to the substantial presence of the Education industry here. See the local MeetUp page for more information or to join up. There is also a national Twitter feed to follow.
I’d like to thank Sean Finn for organizing the event and serving as moderator. I’d also like to thank the other three panelists, as well as everyone who attended.
- Jennifer Dugan – Director of Assessment at the State of Minnesota
- Greg Wright – CEO at Naiku
- Steve Mesmer – COO at Write the World
After an overview of assessment at the State level by Ms. Dugan, each panelist was asked to provide a brief response to three questions regarding assessment. Here are mine:
- In your opinion, how do you perceive the role of technology in educational assessment?
I think this depends on the purpose of the assessment. In assessment of learning, from 3rd grade benchmark exams to medical licensure tests, the purpose of the test is to obtain an accurate estimate of student mastery. The greater the stakes, the more accuracy is needed. Technology should serve this goal.
In assessment for learning, the goal is more to engage the student and be integral to the learning process. Using complex psychometrics to gain more accurate scores is less important. Technology should explore ways to engage the student and enhance learning, such as simulations.
However, we must not lose sight of these purposes; adding technology merely for the sake of appearing innovative is actually counterproductive. I’ve already seen this happen twice with a PARCC project. They have “two-part” items that are supposed to delve deeper into student understanding, but because the approach is purely pedagogical and not psychometric, the data they produce are unusable. PARCC also takes the standard multiple response item (choose two out of five checkboxes) and turns it into a drag-and-drop item: no difference whatsoever in data or psychometrics, just sleeker-looking technology.
- What opportunities do you see for new technologies that can help improve educational assessment in the 21st century?
There are a few ways this can happen.
My favorite is adaptive testing, whereby we leverage computing power to make tests more accurate and efficient. The same is true of more sophisticated psychometric modeling in general.
Another great idea is automated essay scoring, which is not safe as the ONLY scoring method, but improves accuracy when used appropriately. Given the massive back-end cost of scoring essay items by hand, any alleviation on that front will allow for greater use of constructed-response formats.
New item types that allow us to glean more information in a shorter amount of time will improve the efficiency and accuracy of assessment. But as I mentioned previously, development of new item types should always be done with the correct purpose in mind.
Big Data will likely improve the use of assessment data, but can also come into play in terms of the development and delivery of tests.
I’d also like to see Virtual Reality break into the assessment arena. Our company works with Crane Operator exams. Who WOULDN’T want to take an exam like that via virtual reality?
- Adaptive testing is a common term in the educational assessment world, especially given the focus of Smarter Balanced. What is the future of adaptive testing in your opinion, and how will that impact educational assessment?
The primary purpose of adaptive testing is to improve the efficiency of the assessment process. That is, research has generally shown that it can produce scores just as accurate as those from a linear test, but with half as many items. Moreover, the improvement in precision is typically more pronounced for students who are very high or low in ability, because the typical exam does not provide many items for them; most items are of middle difficulty. Alternatively, if we want to keep the same precision, we can cut testing time roughly in half; this is especially relevant for quick diagnostic tests, as opposed to longer, high-stakes tests.
While the time savings are notable at an individual level, consider the overall time savings across hundreds of thousands of students – certainly relevant in an environment of “less testing!”
CAT also has secondary advantages, such as increasing student engagement because students are only presented with items of appropriate difficulty.
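To make the idea of “items of appropriate difficulty” concrete, here is a minimal sketch (my own toy illustration, not any vendor’s algorithm) of the core CAT selection step under a simple Rasch model: at each point in the test, pick the unadministered item that provides the most statistical information at the student’s current ability estimate. The item bank and ability values are made up for the example.

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of a Rasch item at ability theta: P * (1 - P)."""
    p = rasch_prob(theta, b)
    return p * (1.0 - p)

def select_next_item(theta, item_bank, administered):
    """Pick the index of the unadministered item with maximum information."""
    candidates = [i for i in range(len(item_bank)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, item_bank[i]))

# Toy bank of item difficulty (b) parameters; note most items sit near the
# middle, which is exactly why a fixed linear test measures extreme students
# poorly -- it has little information to offer them.
bank = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]

# For a high-ability student (theta = 1.8), the CAT immediately targets the
# hardest item instead of wasting time on middle-difficulty ones.
print(select_next_item(1.8, bank, administered=set()))
```

Information for an item peaks when its difficulty matches the student’s ability, which is why this simple rule keeps every student working on appropriately challenging material.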
One major opportunity is for CAT to start using more sophisticated models, such as multidimensional item response theory, cognitive diagnostic models, and models that utilize item response time. This will improve its performance even further.
The future involves more widespread use of CAT as the cost of providing it continues to come down. While it will never be something a classroom or school can build on its own, since it requires PhD-level psychometric expertise, more companies will be able to provide it at a lower price point, which means it will end up being used more widely.
Want to improve the quality of your assessments?
Sign up for our newsletter and hear about our free tools, product updates, and blog posts first! Don’t worry, we would never sell your email address, and we promise not to spam you with too many emails.