# (Probabilities) Sum of n dice add up to 117. Estimate n, confidence interval of 90%.

1. May 2, 2012

### Perrault

1. The problem statement, all variables and given/known data

(I translated this from french)

Some n dice are thrown and the sum of all their face values is 117.
Estimate n through a confidence interval of 90%.
In other words, of all the possible rolls of n dice that sum up to 117, find a range in which n should be located 90% of the time.
For example, $20 \leq n \leq 117$ holds 100% of the time: 117 dice can sum to 117 but 118 dice cannot, and 19 dice give a maximum sum of 114.

2. Relevant equations

To estimate the population average (here, the population is every possible roll of n dice that sums to 117) while the variance is unknown, and X is normally distributed (which it probably is), we use the following formula
$\frac{\bar{X} - \mu}{\sqrt{\frac{S^{2}_{n-1}}{n}}} \sim T_{n-1}$

Where:

$\bar{X}$ is the sample average, if that has anything to do with anything.
$S^{2}_{n-1}$ would be equal to $\frac{n}{n-1} S^{2}$, where $S^{2}$ would be the sample's variance and n the number of items in the sample.
$T_{n-1}$ is the symbol for Student's t-distribution.

There are five other, closely related, formulae, but this is the one that seems the most reasonable to use.

But I'm not even sure how that can be used to estimate n.

3. The attempt at a solution

The above probably shows just how lost I am in this affair. We have to use some distribution, but I'm not even sure my choice is right.

Thanks!

2. May 2, 2012

### Ray Vickson


I don't think Student's t-distribution has anything to do with this problem. We can regard N (the number of dice) as a random variable with some prior distribution, say uniform over some large interval. We toss N dice and observe a total of 117. We could use a computer algebra system to work out the probability that the sum of n dice is 117 for each n between 20 and 117, but it is easier to work with a normal approximation.

Given n, we let f(x|n) be the normal density at x with mean m = (7/2) n and variance (35/12) n; these are the exact mean and variance of Sn = sum of n dice values. We can regard our observation as verifying {116.5 <= Sn <= 117.5}, and
$$P\{116.5 \leq S_n \leq 117.5 \} = \int_{116.5}^{117.5} f(x|n) \, dx \doteq f(117|n),$$
to good approximation. So, given the observation, we can regard N as having a posterior distribution
$$P(n) = \frac{ f(117|n)}{\sum_{k=20}^{117} f(117|k) }.$$
In this formula we use
$$f(117|n) = \frac{1}{\sqrt{2 \pi (35/12) n}} \exp\left(-\frac{1}{2} \frac{(117 - (7/2) n)^2}{(35/12)n} \right).$$
You want to determine an interval $I=\{n_1, n_1 + 1, \ldots, n_2 \}$ such that $\sum_{n \in I} P(n) \geq 0.90,$ and presumably you would like I to be as short as possible, or nearly so.
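
The recipe above can be sketched in a few lines of Python. This is only a minimal sketch, assuming a uniform prior on n = 20, ..., 117 and taking "as short as possible" to mean the shortest run of consecutive n with posterior mass of at least 0.90. The function names (`normal_density`, `posterior`, `shortest_interval`) are made up for illustration and do not come from the thread.

```python
import math

TARGET = 117
N_MIN, N_MAX = 20, 117  # only these n can produce a sum of 117


def normal_density(x, n):
    """Normal density at x with mean (7/2)*n and variance (35/12)*n."""
    mean = 3.5 * n
    var = 35.0 / 12.0 * n
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def posterior(target=TARGET, n_min=N_MIN, n_max=N_MAX):
    """Posterior P(n) under a uniform prior (an assumption) on n_min..n_max."""
    weights = {n: normal_density(target, n) for n in range(n_min, n_max + 1)}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


def shortest_interval(post, level=0.90):
    """Shortest run of consecutive n whose posterior mass is >= level."""
    ns = sorted(post)
    best = None
    for i in range(len(ns)):
        mass = 0.0
        for j in range(i, len(ns)):
            mass += post[ns[j]]
            if mass >= level:
                # keep the first interval found among those of equal length
                if best is None or (j - i) < (best[1] - best[0]):
                    best = (ns[i], ns[j], mass)
                break
    return best


post = posterior()
n1, n2, mass = shortest_interval(post)
print(f"n in [{n1}, {n2}] carries posterior mass {mass:.4f}")
```

The brute-force search over all start points is quadratic in the number of candidate n, which is negligible for 98 candidates.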

RGV

3. May 2, 2012

### D H

Staff Emeritus

This probability isn't too hard to calculate exactly with a scripting language such as perl or python. Simply use the generating polynomial $(x+x^2+x^3+x^4+x^5+x^6)/6$. The probability of getting a sum of 117 with N rolls is the coefficient of $x^{117}$ of the polynomial $(x+x^2+x^3+x^4+x^5+x^6)^N/6^N$.
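
A hedged Python sketch of that idea: it raises the polynomial $x+x^2+\cdots+x^6$ to the N-th power by repeated multiplication and reads off the coefficient of $x^{117}$, dividing by $6^N$ at the end. The helper names `poly_mul` and `prob_sum` are illustrative assumptions, not part of the thread.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] += ai * bj
    return out


def prob_sum(num_dice, target):
    """Exact P(sum of num_dice fair dice == target) as a Fraction."""
    die = [0, 1, 1, 1, 1, 1, 1]  # x + x^2 + ... + x^6 (the 1/6 is applied at the end)
    poly = [1]                   # the polynomial "1", i.e. zero dice
    for _ in range(num_dice):
        poly = poly_mul(poly, die)
    ways = poly[target] if 0 <= target < len(poly) else 0
    return Fraction(ways, 6 ** num_dice)


print(prob_sum(2, 7))            # 1/6
print(float(prob_sum(33, 117)))  # exact value converted to a float for display
```

Working with integer coefficients and dividing once at the end keeps the result exact; the integers get large, but Python handles that natively.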

4. May 2, 2012

### Ray Vickson


Right. It is also straightforward in Maple, so one can find q(n) = P{Sn=117} for n from 20 to 117 and work with that array. However, the results are not very different from those obtained from the normal distribution. In particular, one gets the same confidence interval, but with very slightly different probabilities.

RGV

5. May 2, 2012

### Perrault


Hello, and thanks for your replies,

I have trouble understanding your work; I don't understand from here on:
If it's easier to use Maple, how would I do that?

Thanks again!

6. May 2, 2012

### Ray Vickson


For a given large n, say n = 80, we want to compute the probability that S80 = 117, which is the probability that the sum of 80 dice values equals 117. In principle we could use a computer algebra system to evaluate the probability exactly, but it is almost as good, and much easier, to use a normal approximation. So, S80 is a DISCRETE random variable taking values in the set {80, 81, 82, ... ,480}, and with mean ES80 = 80*3.5 and variance VS80 = (35/12)*80. We want to use the normal distribution with this same mean and variance, but the normal describes a CONTINUOUS random variable X, with P{X = 117} = 0.

How can we make a continuous distribution approximate a discrete one? Well, would you not agree that for the discrete random variable S80, the two events {S80 = 117} and {116.5 < S80 < 117.5} are exactly the same? (After all, S80 can only take integer values!) We cannot replace P{S80 = 117} by P{X = 117}, but we *can* replace P{116.5 < S80 < 117.5} by P{116.5 < X < 117.5}. In principle, we ought to use the exact interval probability (obtained by integrating the normal density f(x) from x = 116.5 to x = 117.5), but we can make a further approximation, and just replace the integral of f(x) over the interval by f(x) at the center of the interval (116.5,117.5), times the length (=1) of the interval; that is, the probability is approximately f(117)*1 = f(117). That is the form used in later calculations.

Note: there is nothing mysterious here: that is what we *always* do when we approximate a discrete probability by a continuous one.
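
To see how little the last step costs, here is a small Python sketch comparing the interval probability P{116.5 < X < 117.5} with the midpoint value f(117). It uses n = 33, which is an illustrative choice made here (the mean 3.5*33 = 115.5 is close to 117), not a value taken from the thread. The function names are likewise hypothetical.

```python
import math


def normal_cdf(x, mean, var):
    """Normal CDF written in terms of the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def normal_pdf(x, mean, var):
    """Normal density at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def compare(n, s=117):
    """Return (integral of f over (s-0.5, s+0.5), f(s) * 1) for n dice."""
    mean, var = 3.5 * n, 35.0 / 12.0 * n
    interval = normal_cdf(s + 0.5, mean, var) - normal_cdf(s - 0.5, mean, var)
    midpoint = normal_pdf(s, mean, var) * 1.0  # density times interval length 1
    return interval, midpoint


interval_33, midpoint_33 = compare(33)
print(f"interval probability: {interval_33:.6f}")
print(f"midpoint density:     {midpoint_33:.6f}")
```

One caveat on the sketch itself: far out in the tail (n = 80, say) the difference of two CDF values loses all floating-point precision, so the comparison is only meaningful for n near 117/3.5.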

I just used the formulas in my first response and evaluated them all in Maple. I had also done an exact analysis (first, before using the approximation), obtained by getting the exact discrete probabilities P{Sn = 117} for n from 20 to 117. Basically, the Maple commands were:
> f := 1/6*(x+x^2+x^3+x^4+x^5+x^6): # the generating function for 1 die
> #
> # the generating function for n dice is f^n, and the coefficient of x^117 is P{Sn=117}
> #
> for n from 20 to 117 do
> q[n] := evalf(coeff(expand(f^n),x,117)): end do: n:='n':
> Tot := add(q[n], n = 20..117): # normalizing constant (its definition was not shown; assumed to be the sum of the q[n])
> for n from 20 to 117 do
> P[n] := q[n]/Tot: end do: n:='n':
Note: it is probably possible to speed this up considerably because we don't really need coefficients for x^k with k > 117, so at each n we can truncate at x^117, then just multiply by f and truncate again, etc. Also, one can keep expression swell down by using evalf at each stage, so the coefficients of x^k are all floats. However, the direct method was fast enough, so I did not bother.
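
The truncation idea in the note can be sketched in Python as follows: keep only the coefficients of x^0 through x^117 as floats, multiply by f once per die, and discard anything above x^117 at each step. This is a sketch under the same equal-prior assumption as before; `exact_likelihoods` and the variable names are illustrative, not a transcription of the Maple session.

```python
TARGET = 117


def exact_likelihoods(target=TARGET):
    """Return {n: P(S_n == target)} for every n that can reach target."""
    n_min = -(-target // 6)          # ceil(target / 6); 20 for target = 117
    coeffs = [0.0] * (target + 1)    # coefficients of x^0 .. x^target only
    coeffs[0] = 1.0                  # generating function of zero dice
    q = {}
    for n in range(1, target + 1):
        nxt = [0.0] * (target + 1)
        for k, c in enumerate(coeffs):
            if c:
                for face in range(1, 7):
                    if k + face > target:
                        break        # truncate: powers above x^target are never needed
                    nxt[k + face] += c / 6.0
        coeffs = nxt                 # now holds f^n truncated at x^target
        if n >= n_min:
            q[n] = coeffs[target]
    return q


q = exact_likelihoods()
tot = sum(q.values())                        # normalizing constant
P = {n: qn / tot for n, qn in q.items()}     # posterior under equal priors
print(max(P, key=P.get))                     # most probable number of dice
```

Truncation is safe because multiplying by f only raises powers of x, so a discarded term can never contribute to the coefficient of x^117 later on.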

RGV

Last edited: May 2, 2012
7. May 2, 2012

### D H

Staff Emeritus

What Ray did (and I would have approached this the same way) is to use Bayes' theorem,
$$P(A_i|E) = \frac{P(E|A_i)P(A_i)}{\sum_k P(E|A_k)P(A_k)}$$
where
• $\{A_i\}$ is a set of mutually exclusive events that collectively span the probability space. In this problem, the events are the number of dice rolled: N=1, N=2, ..., up to some rather large but finite number.
• $E$ is some observed event, or evidence. In this problem, the event E is the given fact that the sum of the N dice rolls was 117.
• $P(A_i|E)$ is the probability of event $A_i$ given the observed event $E$. For example, what is the probability that the die was rolled 20 times to yield that total of 117? 21 times? These posterior probabilities are the desired quantities.
• $P(E|A_i)$ is the probability of the observed event $E$ given the event $A_i$.
• $P(A_i)$ is some estimate of the probability of event $A_i$ without that supporting evidence.
Without any prior supporting evidence, the principle of insufficient reason is about all one can go with: The priors are equiprobable. With this assumption of equal priors, Bayes' law reduces to
$$P(A_i|E) = \frac{P(E|A_i)}{\sum_k P(E|A_k)}$$

To illustrate, suppose you were told that the sum of the dice was seven. There are six possible values for N here, N=2 through 7. The probability of rolling seven with two dice, P(S=7|N=2), is 6/36. Continuing in this way,
P(S=7|N=3)=15/216
P(S=7|N=4)=20/1296
P(S=7|N=5)=15/7776
P(S=7|N=6)=6/46656
P(S=7|N=7)=1/279936

These probabilities of course don't sum to one. There's no reason to expect them to do so. They instead sum to 70993/279936. This is in effect the normalization factor that lets us scale the posterior probabilities so they sum to one. With this scaling,

P(N=2|S=7)=0.6571915540968828
P(N=3|S=7)=0.2738298142070345
P(N=4|S=7)=0.06085106982378544
P(N=5|S=7)=0.00760638372797318
P(N=6|S=7)=0.0005070922485315454
P(N=7|S=7)=1.408589579254293e-05

So N=2 and N=3 alone give the 90% confidence interval in this case (P=93.1%). This is obviously not the answer to your problem, as there is no way to roll a sum of 117 with just 2 or 3 dice, but the same procedure applies with 117 in place of 7.
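
The sum-of-seven illustration can be reproduced with a short Python sketch using exact fractions. It assumes equal priors on N=2 through 7, as above; the helper `ways` and the variable names are made up for this example.

```python
from fractions import Fraction

TOTAL = 7


def ways(n, total):
    """Number of ordered rolls of n dice whose faces sum to total."""
    counts = {0: 1}  # with zero dice, only the sum 0 is reachable
    for _ in range(n):
        nxt = {}
        for s, c in counts.items():
            for face in range(1, 7):
                nxt[s + face] = nxt.get(s + face, 0) + c
        counts = nxt
    return counts.get(total, 0)


# Likelihoods P(S=7 | N=n) for the six feasible values of N
likelihood = {n: Fraction(ways(n, TOTAL), 6 ** n) for n in range(2, TOTAL + 1)}
norm = sum(likelihood.values())                     # normalization factor
posterior = {n: p / norm for n, p in likelihood.items()}

print(norm)                                         # 70993/279936
for n, p in posterior.items():
    print(f"P(N={n}|S=7) = {float(p):.6f}")
```

Changing `TOTAL` to 117 (and the range of n to 20 through 117) turns the same few lines into the calculation for the original problem.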