How Much of a Problem is Guessing on
a Criterion-Referenced (Mastery) Test?*
by Steven Just Ed.D.
Anyone who gives multiple-choice tests worries
about the effects of guessing on student scores. After all,
on a test with four choices per item, even a student who knows
absolutely nothing will, on average, get a score of 25%. Of course,
that’s not even close to a passing score, so we don’t worry
too much about students at the low end of the distribution.
The more important question is: For students at the higher
end of the distribution (at or above the passing score), how
dramatic are the effects of guessing? How many students who
otherwise would not have passed the test will actually pass
because they guessed some number of correct answers? Surprisingly,
on a mastery test with a high cut score, the problem is not
as serious as you might think. Let’s look at why.
Most of our clients somewhat arbitrarily set
test passing scores at 90% (that’s a separate problem,
worthy of a separate article). Since almost all students pass
each test, the average test score tends to be quite high (above
90%). So, let’s look at the “average” student
with a test score of 94% and the consequences of guessing
on that student’s score. To make our calculations easy
let’s assume the student took a 100 question multiple-choice
test, with four choices per question, so each question is
worth 1 point. This means that on this test this student got
6 questions wrong. Since the student did not know the correct
answer to these questions, it’s reasonable to assume
that he/she “guessed” on these questions.
So we know that the “average” student
guessed incorrectly on six questions. But the student must
also have guessed correctly on some questions, artificially
raising his/her score. How many correct guesses did this student
have? Well, it’s pretty easy to calculate. If we know
that guessing provides the incorrect answer 75% of the time
and the correct answer 25% of the time (on a four-choice item)
and the average student guessed incorrectly 6 times on this
test, then those 6 incorrect guesses are 75% of all guesses,
so the student must have guessed on a total of 6 / 0.75 = 8
questions (6 incorrect guesses, 2 correct guesses). So even
after removing the “correct guesses” and counting them
as incorrect, the “average” student would still
have passed the test with a score of 92%.
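The arithmetic above can be sketched in a few lines of Python. This is only an illustration under the article's assumption that every wrong answer was a blind guess among equally likely choices; the function name `guess_adjusted_score` and its parameters are hypothetical, not part of any testing tool.

```python
def guess_adjusted_score(num_correct: int, num_questions: int, num_choices: int) -> float:
    """Return the guess-adjusted score as a percentage.

    Assumption (from the article): every wrong answer was a blind guess,
    and each of the `num_choices` options was equally likely to be picked.
    """
    wrong = num_questions - num_correct
    # Wrong guesses are (k - 1)/k of all guesses and lucky guesses are 1/k,
    # so the expected number of lucky guesses is wrong / (k - 1).
    lucky_guesses = wrong / (num_choices - 1)
    return 100 * (num_correct - lucky_guesses) / num_questions


# The "average" student: 94 of 100 correct on a four-choice test.
print(guess_adjusted_score(94, 100, 4))  # 92.0
```

For the 94% student this gives 6 wrong answers, 2 expected lucky guesses, and an adjusted score of 92%, matching the hand calculation.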
What about the student who “just passed”
with a score of 90%? This student guessed incorrectly on 10
questions. Using our 75% formula, this means that he/she guessed
correctly on about 3.3 questions (let’s round up to 4), meaning
his/her “guess-adjusted” score would have been about 86%. Doing
the math on scores of 91%, 92% and 93% gives “guess-adjusted”
scores of 88%, 89.3%, and 90.7%, respectively. So the bottom
line is that those students who received scores of 90%, 91%
or 92% (the 92% case could go either way) passed, at least in part,
by guessing.
Is this a big problem? I think not. No test
score is a “true” score; there is always measurement
error. And having been through a number of cut score setting
processes using the Angoff method, I can guarantee you that
arriving at a cut score is always a somewhat subjective process.
So I think it is unlikely that you can discern the difference
between an employee with a “guess-adjusted” score
of 86% and one with a “guess-adjusted” score of
90% (someone who would have passed even if he/she had not
artificially inflated his/her score by guessing).
Let’s take a look at three other
cases of “guess-adjusted” scores:
- A test composed of all
true-false questions
- A test composed of
four-choice multiple-choice questions, but where one distractor
can be easily eliminated on all the questions (in effect,
three-choice questions)
- A test composed of
five-choice multiple-choice questions where all choices
are plausible
The results, along with our original calculations for four
plausible choices, are summarized in the table below:
“Guess-Adjusted” Scores

| Original Test Score | True-False Test | Three-Choice Multiple-Choice Test | Four-Choice Multiple-Choice Test | Five-Choice Multiple-Choice Test |
|---|---|---|---|---|
| 90% | 80% | 85% | 86.7% | 87.5% |
| 91% | 82% | 86.5% | 88% | 88.75% |
| 92% | 84% | 88% | 89.3% | 90% |
| 93% | 86% | 89.5% | 90.7% | 91.25% |
| 94% | 88% | 91% | 92% | 92.5% |
| 95% | 90% | 92.5% | 93.3% | 93.75% |
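For readers who want to recompute the table or try other test designs, here is a minimal Python sketch. It assumes a 100-question test and that every wrong answer was a blind guess; the names `guess_adjusted_score` and `build_table` are illustrative only. Values are rounded to two decimals, so the last digit may differ slightly from figures rounded by hand.

```python
def guess_adjusted_score(raw_percent: float, num_choices: int) -> float:
    """Guess-adjusted percentage on a 100-question test.

    Assumption: every wrong answer was a blind guess among
    `num_choices` equally likely options.
    """
    wrong = 100 - raw_percent
    # Expected lucky guesses = wrong / (k - 1) for k equally likely choices.
    return raw_percent - wrong / (num_choices - 1)


def build_table(raw_scores=range(90, 96), choice_counts=(2, 3, 4, 5)):
    """Map each raw score to its adjusted scores, one per choice count."""
    return {
        raw: [round(guess_adjusted_score(raw, k), 2) for k in choice_counts]
        for raw in raw_scores
    }


# Columns: true-false (2), three-, four-, and five-choice items.
for raw, row in build_table().items():
    print(f"{raw}%", *(f"{value:g}%" for value in row))
```

Passing other values for `raw_scores` or `choice_counts` extends the table to different score ranges or item formats.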
There are four lessons here:
- Do not use True/False
questions. For a test of True/False questions, a score of
95% would be “guess-adjusted” to just passing
(90%), and anything below that would be failing.
- Make sure all distractors
are plausible. The ability to eliminate even one distractor
significantly raises the probability of a correct guess.
- Consider using five-choice
questions (assuming you can come up with four plausible
distractors for each question). They further reduce the
“guessing effect.”
- Except for all-True/False
tests, the “guessing effect” for mastery tests with
high average scores is smaller than you might think.
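One way to compare the item formats in these lessons is to ask what raw score a student needs so that the guess-adjusted score still meets the cut score. The sketch below does this under the same assumption as the rest of the article (every wrong answer was a blind guess among equally likely choices); the function name `min_raw_score_to_pass` is hypothetical, and exact fractions are used to avoid rounding surprises at the boundary.

```python
from fractions import Fraction


def min_raw_score_to_pass(cut_percent: int, num_choices: int, num_questions: int = 100):
    """Smallest number correct whose guess-adjusted score meets the cut.

    Assumption: every wrong answer was a blind guess among
    `num_choices` equally likely options.
    """
    for correct in range(num_questions + 1):
        wrong = num_questions - correct
        # Adjusted number correct = correct - expected lucky guesses.
        adjusted = Fraction(correct) - Fraction(wrong, num_choices - 1)
        if adjusted * 100 >= cut_percent * num_questions:
            return correct
    return None  # Not reachable for cut scores of 100% or less.


for k in (2, 3, 4, 5):
    print(k, "choices:", min_raw_score_to_pass(90, k))
```

With a 90% cut score on a 100-question test, this yields 95 for true-false items, 94 for three-choice, 93 for four-choice, and 92 for five-choice items.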