# Probability problem - Test Cheaters

1. Jun 9, 2006

### carolyn2112

Hi All,
Here is my problem: there are 6 students suspected of cheating on a NJ Real Estate pre-licensing course exam. You must pass this course to take the NJ State exam, and for obvious ethical reasons we would rather not see these people get licenses! I need help figuring out the probability of them giving the same answers, regardless of right or wrong. Here is what I know:

There is one student, our control "Jane", suspected of changing other students' answers.
The test contains 110 questions with 4 answer choices per question
5 students have 94% to 98% of the same answers as Jane (Again, regardless of right or wrong). A fifth student has 85%
We are in the process of analyzing the other 30 exams for a frequency graph... so far they have roughly 50% of the same answers as Jane.

What is the probability of 2 students having 98% of the questions answered the same? 94% the same? We would like to be able to state it as a "one in x,000" sort of ratio.

I am somewhat proficient in stats, but I am hitting a brick wall on this one. If someone could show formulas and calculations I would be greatly appreciative!!

Thanks much!
Carolyn

2. Jun 9, 2006

### coalquay404

It's a meaningless question unless one has a precise measure of the difficulty of the questions on the test. Regardless, this is something which doesn't yield easily to analysis; it's something which should have been prevented by the examination invigilators.

3. Jun 9, 2006

### carolyn2112

Ok, assume that all questions are equally difficult and all students have the same knowledge of the material.

Yes, the instructor may have been able to prevent it, but that is moot at this juncture.
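Under those assumptions, the chance of two independent answer sheets agreeing could be sketched as a binomial tail. The snippet below is only an illustration: the function name `match_tail` is made up for this example, and the per-question agreement rates (0.25 for pure guessing among 4 choices, 0.50 for the rough agreement seen on the other exams) are assumed baselines, not measured values.

```python
from math import comb

def match_tail(n, k, p):
    """P(at least k of n answers agree) if each question agrees
    independently with probability p (binomial model, an assumption)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

N_QUESTIONS = 110  # from the original post

# Assumed per-question agreement rates:
#   0.25 = both students guessing at random among 4 choices
#   0.50 = rough agreement with Jane seen on the other exams
for label, p in (("random guessing", 0.25), ("observed baseline", 0.50)):
    for share in (0.94, 0.98):
        k = round(share * N_QUESTIONS)  # 103 and 108 matching answers
        prob = match_tail(N_QUESTIONS, k, p)
        print(f"{label}: >= {k}/{N_QUESTIONS} matches, about 1 in {1 / prob:.3g}")
```

The "one in x" figure is only as good as the independence and equal-difficulty assumptions; a higher baseline agreement rate shrinks it dramatically.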

4. Jun 9, 2006

### reilly

No matter what you do, the best outcome will be a measure of the significance of the difference between a "suspect group" and the rest. And that significance test will vary with your initial assumptions about the statistics of the test -- binomial comparison or various non-parametric tests. Even if, as seems possible, the differences are significant at the .001 level, you cannot make any judgements about cheating, at least any that would stand up in court. All you can say is that the two groups differ, not why. The best you can do is to give the test again, and take better precautions. Sorry. (I'm sensitive to this issue, as I was offered a bribe while I was in grad school to take the Professional Engineer's Licensing Exam, which I did not take.)

Regards,
Reilly Atkinson

5. Jun 9, 2006

### Tide

How, exactly, does any one student get the opportunity to change another student's answers? That invalidates the test from the outset.

6. Jun 13, 2006

### carolyn2112

Unfortunately the school giving the test crammed way too many people into one class. The students who took the answers will most likely never pass the state licensure exam.

We really didn't need it to hold up in court, more to be able to make a statement like "For two students to have 98% of the same answers would be a 30k to one coincidence". I do appreciate you all taking the time to look at it though. Thank you!

7. Jun 13, 2006

### Gokul43201

Staff Emeritus

What can be done, however, so long as you have a reason to suspect the suspects that is independent of their scores (e.g., they were seated near Jane in the test center), is to test the hypothesis that Jane's proximity affected the scores of the other test-takers. There are several ways to tackle this problem statistically, but in essence you want to do a test of independence (i.e., find out whether similarity to Jane's score is independent of, say, proximity). A chi-squared test would be one approach. Something like ANOVA may work as well.
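As a rough sketch of what such a test of independence might look like, here is a Pearson chi-squared calculation on a 2x2 table. The counts are HYPOTHETICAL placeholders, not data from this thread, and the helper name `chi_squared_2x2` is invented for the example. With expected counts this small the chi-squared approximation is shaky, so treat the p-value as indicative at best.

```python
from math import erfc, sqrt

def chi_squared_2x2(table):
    """Pearson chi-squared statistic and p-value (1 degree of freedom)
    for a 2x2 table. Assumes every row and column total is nonzero."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, the upper tail is erfc(sqrt(stat / 2)).
    return stat, erfc(sqrt(stat / 2))

# HYPOTHETICAL counts, for illustration only.
# Rows:    seated near Jane / seated elsewhere
# Columns: high similarity to Jane's answers / ordinary similarity
seating_vs_similarity = [[5, 1],
                         [0, 29]]

stat, p_value = chi_squared_2x2(seating_vs_similarity)
print(f"chi-squared = {stat:.2f}, p = {p_value:.2g}")
```

A small p-value would only say that similarity and seating are associated, not why.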

You want to be careful with how you set up the calculation. For it to be legally useful, it will probably have to be verified by a professional statistician.

Take a look into it, and if you tell us how you've set up the calculation, we could check it for rigor. Of course, this is all contingent upon the condition above: having a reason to suspect them that is independent of their scores.

PS : A link to the chi-squared test - http://davidmlane.com/hyperstat/chi_square.html

Last edited: Jun 13, 2006

8. Jun 15, 2006

### Tide

It seems to me that the probability of someone's answers matching 98% of the answers of another person would be about the same as the probability of any person scoring 98% on the test.

9. Jun 15, 2006

### Gokul43201

Staff Emeritus

I don't believe that's true. It would be if all the test takers were random guessers. But assuming they have non-zero knowledge of the test material, the probability of choosing a correct answer is greater than the probability of choosing a specific wrong answer.
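To put rough numbers on this, here is a minimal sketch of the per-question chance that two independent students pick the same option. It assumes each student is correct with the same probability and, when wrong, picks uniformly among the three wrong choices; both are simplifying assumptions, and `per_question_match` is a name made up for the example.

```python
def per_question_match(p_correct, n_choices=4):
    """Chance two independent students pick the same option on one question.

    Assumes each student is correct with probability p_correct and otherwise
    picks uniformly among the wrong options (a simplifying assumption).
    """
    n_wrong = n_choices - 1
    both_correct = p_correct ** 2
    # Both wrong, and both landing on the same one of the wrong options.
    same_wrong = (1 - p_correct) ** 2 / n_wrong
    return both_correct + same_wrong

for p_correct in (0.25, 0.50, 0.70, 0.90):
    p_same = per_question_match(p_correct)
    print(f"P(correct) = {p_correct:.2f} -> P(same answer) = {p_same:.3f}")
```

Only at pure guessing (P(correct) = 0.25) does the chance of matching equal the chance of answering correctly; with real knowledge the two diverge, and most agreement comes from both students being right.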

10. Jun 15, 2006

### Tide

Gokul,

I agree but I think it would be a decent approximation given the lack of specifics in this case.

11. Jun 15, 2006

### es

I agree with Gokul43201 and think that the chi-squared test would be the best way to go about this. First measure the similarity of each test to Jane's, then see if there is a correlation between closeness to Jane and similarity to Jane's test.

However, keep in mind this just says that there is a correlation (and your confidence in that correlation), not why. I mean, they could have been in the same study group or taken the same prep class, and that's why they sat together.

Also, I suspect if most people scored well on the test then the final confidence in the correlation will be quite low.