Puzzling "roll X dice, choose Y highest" problemby chawk Tags: average, choose, dice, probability, roll 

#1
Jul2410, 05:31 PM

P: 4

Hi folks,
Over the past few days, a friend and I have been wracking our brains over a dice throw probability problem. As the post title describes, it's simply rolling X dice and picking the Y highest (we coined the notation 'xky' for 'roll x, keep y highest'), then pondering what the average value of the sum would be. The end goal is trying to develop a formula in terms of X dice throws of Zsided die, keeping the highest Y, and then summing/averaging that. We initially sought to figure out xk1 (roll x dice, keep the highest) and came up with a solution, but the methodology didn't seem to fit with xk2 and up. Anyone have any pointers or guidance on how to mathematically represent the "choosing Y highest" step of this problem? That seems to be our biggest road block. Thanks! 



#2
Jul2510, 05:04 PM

P: 254

Hi chawk,
I don't see a trick to make this problem easy. Maybe someone else does and can post it. Still, you don't have to compute the actual probability distribution for the sum of the Y highest to get its average. That would be very hard. To mathematically represent "choose the Y highest", you can break the average down into Z mutually exclusive cases: The (Y+1)st highest is 1, the (Y+1)st highest is 2, ...., the (Y+1)st highest is Z. You find the average of the sum for each of these cases (fairly easy), and then find the average of these averages, where each of these "conditional" averages is weighted by the probability that the (Y+1)st highest has a particular value. Finding the probability that the (Y+1)st highest has a particular value is the hard part. It is doable, though. If you are interested in trying it I can give more help. 



#3
Aug2510, 09:37 AM

P: 254

The approach I had in mind here doesn't work. I was thinking that given the value of the (Y+1)st highest roll (call it r), each of the Y highest take on values r, r+1, ... ,Z with equal probability. But that isn't true. So while finding averages, or expected values, by first finding conditional expected values and then averaging them works, it doesn't seem to make things easier here. On the bright side, calculating the actual probability distribution for the sum of the Y highest is not nearly so difficult as I originally thought it was.




#4
Aug2910, 02:05 AM

P: 523

Puzzling "roll X dice, choose Y highest" problem
Yes, working out order statistics of discrete random variables can be tricky. We can label the sorted nsided dicerolls as [tex]Z_1,\ldots,Z_x[/tex] where
[tex]n\ge Z_1\ge \ldots \ge Z_x\ge 1[/tex]. To work out the expectation [tex]E[Z_1+\ldots+Z_y][/tex] we just need to know the marginal distribution of each Z. In fact [tex]P[Z_k\le j][/tex] is the probability that at most k1 of the dice are greater than j, so [tex]P[Z_k\le j] = \sum_{i=0}^{k1}(^xC_i)(1j/n)^i (j/n)^{xi}[/tex] and with [tex]P[Z_k= j] = P[Z_k\le j]  P[Z_k\le j1][/tex] we can work out [tex]E[Z_k][/tex] for each k. 



#5
Aug3010, 04:12 PM

P: 254

Ah, so you can compute the average without having to find the distribution of Z_{1}+...+Z_{y}. That is very cool. I was hoping someone would post a way to do that. Thanks, bpet.
Finding the distribution of Z_{1}+...+Z_{y} is not so bad after all. I originally thought it could only be computed recursively, and not in a way that would lead to a general formula. But then I saw how to break the problem down into smaller parts. For instance, assume that you roll x dice, each of which has z sides, and want the sum S of the y highest values rolled. Let r be the smallest value that occurs among the y highest. Suppose that exactly i of the x dice have values less than r, j are equal to r, and the rest are greater than r. If i+j = x, then each of the y highest takes on the value r, and S = y*r is the only possibility for the sum. Otherwise, if i+j < x, then that leaves xij dice whose values are greater than r. Of the highest y, y  (xij) of them take on the value r. So for the sum S to take on some particular value s, the remaining xij dice need to sum to sr*(y+i+jx). Further, these values of these dice are restricted to r+1, r+2, ...., z. The number of ways this can be done is the same as the number of ways to distribute sr*(y+i+jx) indistinguishable balls among xij cells such that every cell has between r+1 and z balls. And this is the same as the number of ways to distribute sr*(y+i+jx)(r+1)*(xij) = s+i+jxr*y balls among xij cells so that each cell has between 0 and zr1 balls. Using the principle of inclusionexclusion, the number of ways to distribute B indistinguishable balls among C cells so that each cell has between 0 and D balls is [tex]N(B;C,D) = \sum_{k=0}^C (1)^k \binom{C}{k} \binom{Bk(D+1) + C1}{C1}[/tex] for C > 0. We take N(0;0,D) to be 1. So for each pair i,j, there are [tex]\frac{x!}{i!j!(xij)!}[/tex] ways to separate the x dice into those that are less than, equal to, and greater than r. Then there are (r1)^{i} ways to choose the values of those less than r. And finally there are N(s+i+jxr*y; xij, zr1) ways for the y highest to sum to a particular value s. So, summing the resulting expression over the allowed values of r, i, and j results in [tex]\sum_{r=1}^{z}\sum_{i=0}^{xy}\sum_{j=xyi+1}^{xi}\frac{x!}{i!j!(xij)!}(r1)^i N(s+i+jxry; xij, zr1)[/tex] ways for the sum S of the y highest of x zsided dice to take on the value s. 


Register to reply 
Related Discussions  
Highest "income to workinghour" ratio  Academic Guidance  6  
"God does not play dice" referred to what exactly?  General Discussion  22  
The Second Law of Thermodynamics says "god" doesn't throw dice...  General Physics  0  
The "Roll and grow"  General Discussion  10  
The Second Law of Thermodynamics says "god" doesn't throw dice...  General Physics  0 