Calculating Expectation Value of Winning Number in Probability Puzzle

Ken G
Gold Member
Messages
4,949
Reaction score
570
Find an elegant solution to this question: if 4 players are playing a game where they each are given a random number from 1 to a million, and the largest number wins, what is the expectation value of the winning number?
 
Physics news on Phys.org
take X as uniform random variable so winning number = 1,000,000 * x

the 50th percentile of the max of X after N trials is 0.5^(1/n) so for n=4 the expectation is 0.8409 or 840,900

i.e. there is a 50% chance of getting a max of .8409 in 4 trials
 
But that's not necessarily the expectation value of the winning number.
 
you are right, its the median

testing it in a spreadsheet the mean is about 350 -400 BP lower

subtract 1/2 the variance (1/12)(b-a)^2 and it seems to be in line with the average

which makes it kind of a negative lognormal distribution with the upper bound at 1
 
I think your solution is quantitatively precise, can you find an elegant proof? There are some ways to attack it that do come out simply.
 
Ken G said:
I think your solution is quantitatively precise, can you find an elegant proof? There are some ways to attack it that do come out simply.

Are you saying that the 1000 row spreadsheet of random numbers in excel that I used to prove this to myself is not elegant? ;)
 
He he, I'll leave that for you to decide!
 
Ken G said:
Find an elegant solution to this question: if 4 players are playing a game where they each are given a random number from 1 to a million, and the largest number wins, what is the expectation value of the winning number?

EWinningNumber = \frac 1 {1000000^4} ( \sum^{1000000}_{k=1} k^5 - \sum^{1000000}_{k=1} k \sum^{k-1}_{i=1}i^4 )

[EDIT]I haven't worked it out yet, but this can be calculated with the formulas from http://en.wikipedia.org/wiki/List_of_mathematical_series[/EDIT]
 
If you haven't even worked it out yourself, it's a bad sign that the solution is elegant!

Time for a hint: imagine what it means if we add one more player.
 
  • #10
I like Serena said:
EWinningNumber = \frac 1 {1000000^4} ( \sum^{1000000}_{k=1} k^5 - \sum^{1000000}_{k=1} k \sum^{k-1}_{i=1}i^4 )

[EDIT]I haven't worked it out yet, but this can be calculated with the formulas from http://en.wikipedia.org/wiki/List_of_mathematical_series[/EDIT]
That is a *huge* negative number, about -2.85713×1016
 
  • #11
D H said:
That is a *huge* negative number, about -2.85713×1016

Oops! Let me check my formulas again.

And btw, with a fifth player I only expect to see the expection go up in a polynomial manner.:confused:

[EDIT]Yes. I see my mistake.
To summarize with four players the winning number can only by 1 if all 4 players have 1.
That is a chance of 1/n^4.

The winning number can only be 2 if all four players have 1 or 2, but not all 4 have 1.
So the chance is (2^4-1)/n^4.

The winning number can only be 3 if all four players have 1, 2 or 3, but not all 4 have 1 or 2.
So the chance is (3^4-2^4)/n^4.

In general:

P(winning number is k) = (k^4 - (k-1)^4) / n^4

This makes the expectation:

E(WinningNumber) = \sum^{n}_{k=1} k (k^4 - (k-1)^4) / n^4
[/EDIT]
 
Last edited:
  • #12
Adding a fifth player let's you answer, and prove, the original puzzle in a single line.
 
  • #13
I like Serena *did* answer it in a single line, and this time she got it right.

Ken, you probably want something nice and concise like 4/5*1e6=800,000, which isn't quite right.

800,000.5 is better but still isn't quite right.
 
  • #14
Edited once or twice, but here is my simplest solution guided by solving for the chance of a fifth player winning:

1/5 = sum_n p(n) [1-n/106] = 1 - N/106 so N = 800,000 exactly.

In the above, p(n) is the probability distribution for the highest number n of the first 4 players, and N is the expectation of n, as desired.
 
Last edited:
  • #15
D H said:
Ken, you probably want something nice and concise like 4/5*1e6=800,000, which isn't quite right.
Edit: Ah, I see what you mean, you are dealing with the discreteness of the problem. I stand corrected, I am indeed taking the limit where 106 is effectively infinite. I should have let the puzzle be drawing real numbers between 0 and 1!
 
  • #16
Ken G said:
Edited once or twice, but here is my simplest solution guided by solving for the chance of a fifth player winning:

1/5 = sum_n p(n) [1-n/106] = 1 - N/106 so N = 800,000 exactly.

In the above, p(n) is the probability distribution for the highest number n of the first 4 players, and N is the expectation of n, as desired.
That might be elegant, but it is wrong.

With a fifth player the expected value of the winner's number is about 5/6*1e6. Here's an easy way to look at it: With one player, the expected value of the winning draw is about 500,000, or 1/2*1e6. With two players, the expected value is about 2/3*1e6, with three, about 3/4*1e6. In general, with n players the expected value is about n/(n+1)*1e6.

The exact value is

E(P_{(n)}(x)) = \frac{\sum_{k=1}^{1000000} k(k^n-(k-1)^n)}{1000000^n}Edit
This is easily derived from the order statistics for a discrete random variable drawn uniformly from 1 to k,

P(X_{(n)}(x) \le r) = \left(\frac r k\right)^n
 
Last edited:
  • #17
D H said:
That might be elegant, but it is wrong.

With a fifth player the expected value of the winner's number is about 5/6*1e6. Here's an easy way to look at it: With one player, the expected value of the winning draw is about 500,000, or 1/2*1e6. With two players, the expected value is about 2/3*1e6, with three, about 3/4*1e6. In general, with n players the expected value is about n/(n+1)*1e6.
That was the solution I gave.
The exact value is

E(P_{(n)}(x)) = \frac{\sum_{k=1}^{1000000} k(k^n-(k-1)^n)}{1000000^n}
This accounts for the discreteness that I overlooked. It's not really even correct either, because the puzzle doesn't stipulate what to do about ties. Presumably, if there's a tie, you either have to do it all over, or divide the win equally among those tied (say with a random tie-breaker). Any of those will change the answer-- discreteness is just a pain here, I should have said to solve for the continuous limit.
 
  • #18
Ken G said:
Edited once or twice, but here is my simplest solution guided by solving for the chance of a fifth player winning:

1/5 = sum_n p(n) [1-n/106] = 1 - N/106 so N = 800,000 exactly.

In the above, p(n) is the probability distribution for the highest number n of the first 4 players, and N is the expectation of n, as desired.

I don't quite understand what you write here.
What is the function of [1-n/106]?
And where does 1 - N/106 come from?

Either way, your example does not work out for a number between 1 and 2.
In that case we have an expectation of (1 x 1 + 15 x 2)/2^4 = 31/16.
According to your argument this should be N = (1 - 1/5) x 2 = 8/5 but it isn't.

I simplified my formula to: E(winning number) = \frac 4 5 n + \frac 1 2 - \frac 1 {3n} + \frac 1 {30n^3}

This works out for n = 1 and 2 (being 1 resp. 31/16).
For n=1000000 we get 800000.4999996666666666667

It is funny though that this is almost exactly 800000 or 800000.5 :approve:

It kind of reminds me of the Intel Pentium, which is actually the Intel 80584.9997932 :rolleyes:
 
  • #19
Ken G said:
discreteness is just a pain here, I should have said to solve for the continuous limit.
By that I presume you mean U(0,1000000) rather than U(1,1000000). The latter makes for a not quite so nice answer as the former.

In the case of U(0,r), the expected value of the kth order statistic is simply

E(X_{(k)}) = \frac {rk} {n+1}

The probability distribution for the winning ticket is that of the nth order statistic X(n), and thus for U(0,1000000)

E(\mbox{winner}) = \frac n {n+1}\,1000000
 
  • #20
So the "elegant" formulation would be the following?

Find an elegant solution to this question: if 4 players are playing a game where they each are given a random number from 0 to a million, and the largest number wins, what is the expectation value of the winning number?
Note: in case of a tie they play again.
 
  • #21
That's correct, assuming that
(a) These random numbers from 0 to a million includes numbers such as 1/2 and pi, and
(b) The numbers are drawn from a uniform distribution.

The probability of a tie is now zero, so we can essentially ignore this case.
 
  • #22
D H said:
By that I presume you mean U(0,1000000) rather than U(1,1000000).
Not just starting at 0, but also going to an arbitrarily large maximum, if we are using integers. Basically, machine precision. Then you can get as close to 4/5 of the maximum as you like.
 
  • #23
I like Serena said:
So the "elegant" formulation would be the following?
I had in mind going to such a nearly-continuous variable that ties are essentially impossible. Your solution accounts for both discreteness, and redoing ties, so is highly accurate, but it can't be made elegant because it requires a computer to make the calculation. It's the fault of how I set up the puzzle originally, I basically used 106 where I had in mind, say, 101000. Had I said the latter, we could be happy with the elegant solution that yields 4/5*101000, without much concern for discreteness or ties.
 
  • #24
I like Serena said:
I don't quite understand what you write here.
What is the function of [1-n/106]?
It's not a function, it's just multiplication. My apologies for the square brackets.
And where does 1 - N/106 come from?
N = sum_n p(n)*n, it is the expectation value of the highest of 4 players. I think you will see where the 1 - N/106 comes from immediately when you see what I wrote is much easier than you are imagining, because of those square brackets.
Either way, your example does not work out for a number between 1 and 2.
The solution I gave applies for an essentially continuous variable from 0 to some maximum, which might as well be 1 if we are using a continuous variable. Because it's not really possible to sample from a continuous variable, there has to be some machine precision in there somewhere, I gave a discrete version, but I didn't really mean to and it messes up the elegant solution.
It is funny though that this is almost exactly 800000 or 800000.5 :approve:
I hope you will see, once you understand my notation above, that the elegant solution leads immediately to the result 800,000, to within errors of discreteness and what to do with ties, which go away in the continuous limit.
 
  • #25
Ken G said:
It's not a function, it's just multiplication. My apologies for the square brackets.
N = sum_n p(n)*n, it is the expectation value of the highest of 4 players. I think you will see where the 1 - N/106 comes from immediately when you see what I wrote is much easier than you are imagining, because of those square brackets.

I guess I formulated my questions not sharp enough.
Let's break down your formula:

1/5 = sum_n p(n) [1-n/106] = 1 - N/106

1/5 is the chance of player 5 winning. I get that.

sum_n p(n) K is the calculation of an expectation

So apparently you calculate the expectation of 1-n/106.
No wait, I think you calculate a chance here.
That would be the chance that player 5 gets a number less than n.
Is that it? Or what?

sum_n p(n) n would be the expection of the winning number for 4 players

Now I get that this leads to your N.

N the expectation of the winning number for 4 players

That leaves the middle part. What did you do there?
 
  • #26
I like Serena said:
I guess I formulated my questions not sharp enough.
Let's break down your formula:

1/5 = sum_n p(n) [1-n/106] = 1 - N/106

1/5 is the chance of player 5 winning. I get that.
Yes, the left-hand side is the probability of the added player winning, and the expression on the right is another way to state that same probability.
sum_n p(n) K is the calculation of an expectation

So apparently you calculate the expectation of 1-n/106.
Don't think of it as an expectation, think of it as a probability of winning. p(n) is the probability that the score to beat is n. (1-n/106) is the probability of player #5 winning, given n. So sum_n p(n) *(1-n/106) is the full probability that player #5 wins, so =1/5. But when we evaluate the sum, we find it is 1 - N/106, where N is the expectation value of the score to beat. So we get an equation involving N that let's us solve for N.
No wait, I think you calculate a chance here.
That would be the chance that player 5 gets a number less than n.
Correct, but it is the chance of greater than n, and since p(n) is the probability that the score to beat is n, we get our result without ever knowing what p(n) actually is-- which is the work-saving part!
 
  • #27
Ken G said:
Yes, the left-hand side is the probability of the added player winning, and the expression on the right is another way to state that same probability.
Don't think of it as an expectation, think of it as a probability of winning. p(n) is the probability that the score to beat is n. (1-n/106) is the probability of player #5 winning, given n. So sum_n p(n) *(1-n/106) is the full probability that player #5 wins, so =1/5. But when we evaluate the sum, we find it is 1 - N/106, where N is the expectation value of the score to beat. So we get an equation involving N that let's us solve for N.
Correct, but it is the chance of greater than n, and since p(n) is the probability that the score to beat is n, we get our result without ever knowing what p(n) actually is-- which is the work-saving part!

Thanks!
I get it now.

And obviously it is the discrete sum that causes a discretization error.
Using an integral would eliminate that problem.

So \frac 1 5 = \int_0^{10^6}p(x)(1 - \frac x {10^6})dx = 1 - \frac N {10^6}
 
  • #28
Yes, that's really how I intended the puzzle, but I can't blame you for solving it correctly the way it was written!
 

Similar threads

Back
Top