Puzzling "roll X dice, choose Y highest" problem

by chawk
Tags: average, choose, dice, probability, roll
chawk is offline
Jul24-10, 05:31 PM
P: 4
Hi folks,

Over the past few days, a friend and I have been wracking our brains over a dice throw probability problem. As the post title describes, it's simply rolling X dice and picking the Y highest (we coined the notation 'xky' for 'roll x, keep y highest'), then pondering what the average value of the sum would be. The end goal is trying to develop a formula in terms of X dice throws of Z-sided die, keeping the highest Y, and then summing/averaging that.

We initially sought to figure out xk1 (roll x dice, keep the highest) and came up with a solution, but the methodology didn't seem to fit with xk2 and up.

Anyone have any pointers or guidance on how to mathematically represent the "choosing Y highest" step of this problem? That seems to be our biggest road block.

Phys.Org News Partner Science news on Phys.org
NASA's space station Robonaut finally getting legs
Free the seed: OSSI nurtures growing plants without patent barriers
Going nuts? Turkey looks to pistachios to heat new eco-city
techmologist is offline
Jul25-10, 05:04 PM
P: 254
Hi chawk,

I don't see a trick to make this problem easy. Maybe someone else does and can post it. Still, you don't have to compute the actual probability distribution for the sum of the Y highest to get its average. That would be very hard. To mathematically represent "choose the Y highest", you can break the average down into Z mutually exclusive cases: The (Y+1)st highest is 1, the (Y+1)st highest is 2, ...., the (Y+1)st highest is Z. You find the average of the sum for each of these cases (fairly easy), and then find the average of these averages, where each of these "conditional" averages is weighted by the probability that the (Y+1)st highest has a particular value. Finding the probability that the (Y+1)st highest has a particular value is the hard part.

It is doable, though. If you are interested in trying it I can give more help.
techmologist is offline
Aug25-10, 09:37 AM
P: 254
The approach I had in mind here doesn't work. I was thinking that given the value of the (Y+1)st highest roll (call it r), each of the Y highest take on values r, r+1, ... ,Z with equal probability. But that isn't true. So while finding averages, or expected values, by first finding conditional expected values and then averaging them works, it doesn't seem to make things easier here. On the bright side, calculating the actual probability distribution for the sum of the Y highest is not nearly so difficult as I originally thought it was.

bpet is offline
Aug29-10, 02:05 AM
P: 523

Puzzling "roll X dice, choose Y highest" problem

Yes, working out order statistics of discrete random variables can be tricky. We can label the sorted n-sided dicerolls as [tex]Z_1,\ldots,Z_x[/tex] where

[tex]n\ge Z_1\ge \ldots \ge Z_x\ge 1[/tex].

To work out the expectation


we just need to know the marginal distribution of each Z. In fact [tex]P[Z_k\le j][/tex] is the probability that at most k-1 of the dice are greater than j, so

[tex]P[Z_k\le j] = \sum_{i=0}^{k-1}(^xC_i)(1-j/n)^i (j/n)^{x-i}[/tex]

and with

[tex]P[Z_k= j] = P[Z_k\le j] - P[Z_k\le j-1][/tex]

we can work out [tex]E[Z_k][/tex] for each k.
techmologist is offline
Aug30-10, 04:12 PM
P: 254
Ah, so you can compute the average without having to find the distribution of Z1+...+Zy. That is very cool. I was hoping someone would post a way to do that. Thanks, bpet.

Finding the distribution of Z1+...+Zy is not so bad after all. I originally thought it could only be computed recursively, and not in a way that would lead to a general formula. But then I saw how to break the problem down into smaller parts.

For instance, assume that you roll x dice, each of which has z sides, and want the sum S of the y highest values rolled. Let r be the smallest value that occurs among the y highest. Suppose that exactly i of the x dice have values less than r, j are equal to r, and the rest are greater than r. If i+j = x, then each of the y highest takes on the value r, and S = y*r is the only possibility for the sum. Otherwise, if i+j < x, then that leaves x-i-j dice whose values are greater than r. Of the highest y, y - (x-i-j) of them take on the value r. So for the sum S to take on some particular value s, the remaining x-i-j dice need to sum to s-r*(y+i+j-x). Further, these values of these dice are restricted to r+1, r+2, ...., z.

The number of ways this can be done is the same as the number of ways to distribute s-r*(y+i+j-x) indistinguishable balls among x-i-j cells such that every cell has between r+1 and z balls. And this is the same as the number of ways to distribute s-r*(y+i+j-x)-(r+1)*(x-i-j) = s+i+j-x-r*y balls among x-i-j cells so that each cell has between 0 and z-r-1 balls.

Using the principle of inclusion-exclusion, the number of ways to distribute B indistinguishable balls among C cells so that each cell has between 0 and D balls is

[tex]N(B;C,D) = \sum_{k=0}^C (-1)^k \binom{C}{k} \binom{B-k(D+1) + C-1}{C-1}[/tex]

for C > 0. We take N(0;0,D) to be 1.

So for each pair i,j, there are


ways to separate the x dice into those that are less than, equal to, and greater than r. Then there are (r-1)i ways to choose the values of those less than r. And finally there are N(s+i+j-x-r*y; x-i-j, z-r-1) ways for the y highest to sum to a particular value s. So, summing the resulting expression over the allowed values of r, i, and j results in

[tex]\sum_{r=1}^{z}\sum_{i=0}^{x-y}\sum_{j=x-y-i+1}^{x-i}\frac{x!}{i!j!(x-i-j)!}(r-1)^i N(s+i+j-x-ry; x-i-j, z-r-1)[/tex]

ways for the sum S of the y highest of x z-sided dice to take on the value s.

Register to reply

Related Discussions
Highest "income to working-hour" ratio Academic Guidance 6
"God does not play dice" referred to what exactly? General Discussion 22
The Second Law of Thermodynamics says "god" doesn't throw dice... General Physics 0
The "Roll and grow" General Discussion 10
The Second Law of Thermodynamics says "god" doesn't throw dice... General Physics 0