Rolling a Fair Die: Analyzing Data & Results

In summary: it's not counterproductive; it's just that statistically significant results would only be achieved by doing a lot more trials.
  • #1
DaveC426913
I don't know how to analyze data proper-like.

What variance would be considered statistically significant, such that one might conclude a die is not fair?

I just rolled a six-sided die 192 times.

Here are my results: 32, 28, 35, 37, 29, 31.
So, they vary from average by 0, -4, +3, +5, -3, -1.

Should I just keep rolling?
 
  • #2
Are the numbers the number of occurrences? For example, is 32 the number of times 1 came up and 31 the number of times 6 came up? One common way to do this is to run a hypothesis test using the Pearson chi-square test. I can say with reasonably high confidence that the die is likely fair. I got a p-value of .96ish.
 
  • #3
MarneMath said:
Are the numbers the number of occurrences? For example, is 32 the number of times 1 came up and 31 the number of times 6 came up?
Yes. No.

I mean yes, each number is the number of occurrences. But no, the faces are not 1 through 6 (though that is not relevant to the fairness).

MarneMath said:
One common way to do this is to run a hypothesis test using the Pearson chi-square test. I can say with reasonably high confidence that the die is likely fair. I got a p-value of .96ish.
I ... guess I'll be Googling 'Pearson chi-square test'.

Or, I suppose, I can just Google 'how to determine if a die is fair'...
 
  • #4
I think the Numberphile channel on YouTube had a video on dice fairness, or on creating special dice with unique properties.

 
  • #5
The Chi-squared goodness of fit test that @MarneMath recommends is standard and routinely used for your type of question. There are several calculators on the internet. Here is one https://graphpad.com/quickcalcs/chisquared2/

For your data, it gave a two-tailed P value of 0.8662, which means that the probability of getting sample results from a fair die that differ from the expected counts (32 per side) by at least that much is 0.8662. So your results are very reasonable for a fair die.

Whether you should keep rolling the die is up to you. You can assume that your die is not perfectly fair since no die is perfect. But you may have to roll a huge number of times to detect it with any statistical certainty.
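For anyone who wants to reproduce this without the web calculator, here is a minimal sketch in Python using SciPy's chi-square goodness-of-fit test (scipy.stats.chisquare), applied to the counts from post #1 with 32 expected per face:
Code:
# Minimal sketch: Pearson chi-square goodness-of-fit test for the 192 rolls in post #1.
from scipy.stats import chisquare

observed = [32, 28, 35, 37, 29, 31]        # counts per face, 192 rolls total
expected = [sum(observed) / 6] * 6         # fair die: 32 expected per face

chi2_stat, p = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {chi2_stat:.3f}, p-value = {p:.4f}")
# With these counts: chi-square = 1.875 on 5 degrees of freedom, p ~ 0.866,
# matching the two-tailed P value from the calculator.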
 
  • #6
You could also use a Bayesian approach. You would probably model it with a Dirichlet prior and posterior. Then you could either decide how much deviation from fairness you are willing to ignore, or compare directly to the fair-die hypothesis.
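For anyone curious what that might look like in practice, here is a rough sketch in Python (my own illustration, not a prescribed recipe): a uniform Dirichlet prior over the six face probabilities is updated with the observed counts, and the posterior is then sampled to see how much probability mass sits near the fair value of 1/6 for every face. The 0.02 tolerance is an arbitrary choice for the example.
Code:
# Rough sketch of a Bayesian Dirichlet analysis of the counts (illustration only).
import numpy as np

counts = np.array([32, 28, 35, 37, 29, 31])
prior_alpha = np.ones(6)                  # uniform Dirichlet(1,...,1) prior
posterior_alpha = prior_alpha + counts    # conjugate update

rng = np.random.default_rng(0)
samples = rng.dirichlet(posterior_alpha, size=100_000)  # posterior draws of face probabilities

print("posterior means:", samples.mean(axis=0).round(3))
# Fraction of posterior draws where every face probability is within 0.02 of 1/6
# (the 0.02 tolerance is the "how much deviation you are willing to ignore" choice).
within = np.all(np.abs(samples - 1/6) < 0.02, axis=1).mean()
print("P(all faces within 0.02 of 1/6) ~", round(within, 3))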
 
  • #7
Too bad it'll be something I can only revel in alone.

[Image: dice.jpg]


My gamer friends are all
"Hey that die has a nine."
"Hey that die has two threes."
"Hey that die has a zero."


stupid mathphobes...
 
  • #8

If it's a pitted die, as depicted, it is imperfect with respect to fairness by virtue of non-uniform density.

[Image: dice-jpg.113225.jpg]


These flush-faced casino dice are closer to fair:

[Image: flush-spots-casino-dice.jpg]

 
  • #9
sysprog said:
If it's a pitted die, as depicted, it is imperfect with respect to fairness by virtue of non-uniform density.​
Indeed. And worse than normal dice because of the large discrepancy in number of holes.
Also, I did not distribute the numbers around the faces. You'll notice that 7 and 9 are adjacent.
My plan is to fill the pits with epoxy, then retest to see if that fairs them up. (Also: Dude wth! My post is all centre-justified cause-uh you!)
 
  • #10
I would be surprised if the difference shows up before millions of tosses. I think that the tiny difference would be lost in other random influences.
 
  • #11
FactChecker said:
I would be surprised if the difference shows up before millions of tosses. I think that the tiny difference would be lost in other random influences.
Well, that's why we do empirical observations. To surprise us (or more accurately, to disabuse us of our preconceptions).
 
  • #12
DaveC426913 said:
Well, that's why we do empirical observations. To surprise us (or more accurately, to disabuse us of our preconceptions).
I agree. It would be interesting to me if a reasonable number of trials could detect that they are not fair.
 
  • #13
Just for fun, I tested what the Chi-squared result would be if all your results were multiplied by 10 as though you had done 10 times more experiments and got those proportions. Those results would be statistically very significant. If the die were fair, results that "unfair" or worse would only happen 2 in every 1000 times.

With 5 times as many trials giving the same proportions, the odds are 1 in 10 of getting those results or worse from a fair die.
 
  • #14
FactChecker said:
Just for fun, I tested what the Chi-squared result would be if all your results were multiplied by 10 as though you had done 10 times more experiments and got those proportions.
Isn't that kind of counter-productive?

I mean, if I rolled a perfectly fair die only 6 times in total, then multiplied the results by 333, I'd obviously get vastly biased results.
 
  • #15
DaveC426913 said:
Isn't that kind of counter-productive?

I mean, if I rolled a perfectly fair die only 6 times in total, then multiplied the results by 333, I'd obviously get vastly biased results.
I wasn't interested in a perfectly fair die. Just to get some scope on the problem, I was trying to see how many rolls it would take if the biased results were due to an unfair die and continued at the same rate in a larger experiment. Five times as many rolls would not be enough and 10 times would be more than enough.
 
  • #16
I wasn't sure if just multiplying the numbers by 10 would be a good way to calculate how many reps the OP would need to check whether the results are significant. So I ran a quick power calculation. With an effect size of w = .1, a significance level of .05, and power at .95, the OP would need 1979 rolls. The practical problem here is obviously what you consider to be a practically significant difference. I can always find something to be statistically significant if I just increase my sample size. By their nature, most statistical tests become rather sensitive as the sample size increases.
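To make the power calculation concrete, here is a sketch of the standard noncentral chi-square power computation in Python with SciPy (my own illustration, not necessarily the tool used above): it searches for the smallest number of rolls at which the goodness-of-fit test reaches the target power, and it lands in the same neighbourhood as the ~1979 rolls quoted.
Code:
# Sketch: required sample size for a chi-square goodness-of-fit test with
# effect size w = 0.1, alpha = 0.05, target power = 0.95, 6 categories (df = 5).
from scipy.stats import chi2, ncx2

w, alpha, target_power, df = 0.1, 0.05, 0.95, 5
crit = chi2.ppf(1 - alpha, df)                       # rejection threshold under H0

n = 1
while ncx2.sf(crit, df, n * w**2) < target_power:    # power = P(noncentral chi2 > crit)
    n += 1
print("required rolls:", n)                          # roughly 1980, close to the figure above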
 
  • #17
MarneMath said:
I wasn't sure if just multiplying the numbers by 10 would be a good way to calculate how many reps the OP would need to check whether the results are significant. So I ran a quick power calculation. With an effect size of w = .1, a significance level of .05, and power at .95, the OP would need 1979 rolls. The practical problem here is obviously what you consider to be a practically significant difference. I can always find something to be statistically significant if I just increase my sample size. By their nature, most statistical tests become rather sensitive as the sample size increases.
The original data is not uniform, but the difference between it and a uniform, fair die is very insignificant. If the lack of uniformity is entirely due to an unfair die, then it would continue the same proportions in a larger experiment. So multiplying both the results and the expected uniform counts would give an idea of how an unfair die like that would perform in a Chi-squared test. Using expected bin sizes of 320 and data 320, 280, 350, 370, 290, 310 gave these results:
Code:
P value and statistical significance: 
  Chi squared equals 18.750 with 5 degrees of freedom. 
  The two-tailed P value equals 0.0021 
  By conventional criteria, this difference is considered to be very statistically significant.

The P value answers this question: If the theory that generated the expected values were correct, what is the probability of observing such a large discrepancy (or larger) between observed and expected values? A small P value is evidence that the data are not sampled from the distribution you expected.
Multiplying by 5 instead of 10 gives the expected bin size of 160 and data 160, 140, 175, 185, 145, 155. The Chi-squared results are:
Code:
P value and statistical significance: 
  Chi squared equals 9.375 with 5 degrees of freedom. 
  The two-tailed P value equals 0.0950 
  By conventional criteria, this difference is considered to be not quite statistically significant.

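Both scaled scenarios (and the original data) can be reproduced in one go with SciPy; here is a short sketch:
Code:
# Sketch: chi-square test on the original counts and on the counts scaled by 5 and by 10
# (same proportions, more "rolls"), as in the calculator outputs above.
from scipy.stats import chisquare

base = [32, 28, 35, 37, 29, 31]
for factor in (1, 5, 10):
    observed = [c * factor for c in base]
    expected = [32 * factor] * 6
    stat, p = chisquare(observed, f_exp=expected)
    print(f"x{factor}: chi-square = {stat:.3f}, p = {p:.4f}")
# Expected: chi-square = 1.875 / 9.375 / 18.750 and p ~ 0.866 / 0.095 / 0.002,
# matching the values quoted in this thread.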
 
  • #18
Well, I understand what you're saying. I'm just not sure that's a method I would use to calculate the required number of reps. We essentially came up with the same number of required reps, though. At your particular sample size, the effect you're measuring is basically a difference of .1. Perhaps in dice land that's a practically significant difference; I'm not sure. Either way, that's my only caution with regard to increasing the sample size until you have a decent p-value.

*Disclosure: I'm assuming a power of .9.
 
  • #19
MarneMath said:
Well, I understand what you're saying. I'm just not sure that's a method I would use to calculate the required number of reps.
You're right. It would only work for an unfair die that gives exactly the biased results of that data sample. It could not be used for an unknown die. It's just an example that I thought was interesting.
MarneMath said:
At your particular sample size, the effect you're measuring is basically a difference of .1. Perhaps in dice land that's a practically significant difference.
That is the result for 5 times as many trials. It is not significant enough. The result for 10 times as many is shown above. It is very significant (0.002).
 
  • #21
MarneMath said:
Maybe there's a misunderstanding when I say effect size. I'm referring to this (https://en.wikipedia.org/wiki/Effect_size), not the p-value.
Oh. I see what you mean. Good point. I stand corrected. Yes, I think that level of "unfairness" would be considered significant in a die.
 
  • #22
DaveC426913 said:
Indeed. And worse than normal dice because of the large discrepancy in number of holes.
Also, I did not distribute the numbers around the faces. You'll notice that 7 and 9 are adjacent.
My plan is to fill the pits with epoxy, then retest to see if that fairs them up. (Also: Dude wth! My post is all centre-justified cause-uh you!)
It's because of pairs of [center] and [/center] tags. You can see them if you click Edit on the post, and then click the little paper-looking icon at the upper right side of the window. However, since the post I replied to is several days old, the time window for being able to edit your post has elapsed. I have tweaked the two relevant tags to keep them from rendering.

At any rate, this is what the text above looks like in the BBCode editor (which you get by clicking the paper thingy I mentioned):
[CENTER]Indeed. And worse than normal dice because of the large discrepancy in number of holes.
Also, I did not distribute the numbers around the faces. You'll notice that 7 and 9 are adjacent.
My plan is to fill the pits with epoxy, then retest to see if that fairs them up. (Also: Dude wth! My post is all centre-justified cause-uh you!)[/CENTER]
 
  • #23
I looked up what should be expected for die rolls.

Roll a 6-sided die [itex]N[/itex] times, and let [itex]f[/itex] be the fraction of times that you roll a 1 (for a fair die, it should be the same for 1-6).

Then:

[itex]\langle f \rangle = 1/6[/itex] (naturally)
[itex]\sigma_N(f) = \sigma_1(f)/\sqrt{N}[/itex]

where [itex]\sigma_N[/itex] is the standard deviation of [itex]f[/itex] for [itex]N[/itex] rolls.

As for [itex]\sigma_1[/itex], we can compute it this way:

[itex]\sigma_1 = \sqrt{var_1}[/itex] where [itex]var_1[/itex] is the variance. By definition, the variance in [itex]f[/itex] is:

[itex]var_1 = \langle f^2 \rangle - \langle f \rangle^2[/itex]

For a single roll, [itex]f = 1[/itex] (with probability 1/6) and [itex]f=0[/itex] with probability 5/6. So [itex]\langle f^2 \rangle = 1/6[/itex] and [itex]\langle f \rangle = 1/6[/itex]. So [itex]var_1 = 1/6 - (1/6)^2 = 5/36[/itex]. So [itex]\sigma_1 = \frac{\sqrt{5}}{6}[/itex]

So the standard deviation in [itex]f[/itex] for [itex]N[/itex] rolls should be [itex]\frac{\sqrt{5/N}}{6}[/itex]

In your case, [itex]N = 192[/itex], so this gives [itex]\sigma_N = 0.027[/itex].

So you can expect that typically [itex] 1/6 - \sigma_N < f < 1/6 + \sigma_N[/itex]. So [itex]0.140 < f < 0.194[/itex]

Your particular frequencies were:
  • [itex]32/192 = 0.167[/itex]
  • [itex]28/192 = 0.146[/itex]
  • [itex]35/192 = 0.182[/itex]
  • [itex]37/192 = 0.193[/itex]
  • [itex]29/192 = 0.151[/itex]
  • [itex]31/192 = 0.161[/itex]
So all your frequencies are within 1 standard deviation of 1/6 (although that 37/192 is getting close to the limit).
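As a quick numeric check of the above, here is a sketch using the same counts:
Code:
# Sketch: compare each observed frequency to 1/6 +/- one standard deviation,
# with sigma_N = sqrt(5/N)/6 from the derivation above (N = 192 rolls).
import math

counts = [32, 28, 35, 37, 29, 31]
N = sum(counts)                         # 192
sigma_N = math.sqrt(5 / N) / 6          # ~0.027

for c in counts:
    f = c / N
    z = (f - 1/6) / sigma_N             # deviation in units of sigma_N
    print(f"{c}/{N} = {f:.3f}  ({z:+.2f} sigma)")
# All six frequencies fall within one sigma of 1/6; 37/192 is closest to the
# edge at about +0.97 sigma.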
 

1. What is the probability of rolling a specific number on a fair die?

The probability of rolling a specific number on a fair die is 1/6 or approximately 16.7%. This is because there are six possible outcomes (numbers 1-6) and each outcome has an equal chance of occurring.

2. How many times should a fair die be rolled to get accurate data?

A common rule of thumb is to roll the die at least 30 times, but the more rolls you collect, the more reliable the estimates become; detecting a small bias with any confidence can require thousands of rolls, as the power calculation earlier in the thread shows.

3. What is the expected distribution of results when rolling a fair die?

The expected distribution when rolling a fair die is a uniform distribution, meaning that each possible outcome has an equal chance of occurring. This means that over a large number of rolls, the results should be evenly distributed among all six numbers.

4. How can we analyze the data from rolling a fair die?

One way to analyze the data from rolling a fair die is to create a frequency table or graph. This will show the number of times each outcome occurred and can help identify any patterns or anomalies in the data.
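For example, a frequency table for a batch of rolls takes only a few lines of Python (illustration only, using simulated rolls):
Code:
# Sketch: tally die rolls into a frequency table with collections.Counter.
import random
from collections import Counter

rolls = [random.randint(1, 6) for _ in range(192)]   # 192 simulated rolls of a fair die
freq = Counter(rolls)
for face in range(1, 7):
    print(face, freq[face])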

5. Is it possible for a fair die to land on the same number multiple times in a row?

Yes, it is possible for a fair die to land on the same number multiple times in a row. Each roll is an independent event and the previous outcomes do not affect the next outcome. Therefore, it is possible for a fair die to land on the same number multiple times in a row, although the probability of this happening decreases with each additional roll.
