Calculating Expectation Value of Winning Number in Probability Puzzle

Ken G · Mar 9, 2011

Find an elegant solution to this question: if 4 players are playing a game where they each are given a random number from 1 to a million, and the largest number wins, what is the expectation value of the winning number?

BWV · Mar 9, 2011

take X as uniform random variable so winning number = 1,000,000 * x

the 50th percentile of the max of X after N trials is 0.5^(1/n) so for n=4 the expectation is 0.8409 or 840,900

i.e. there is a 50% chance of getting a max of .8409 in 4 trials

Ken G · Mar 9, 2011

But that's not necessarily the expectation value of the winning number.

BWV · Mar 9, 2011

you are right, its the median

testing it in a spreadsheet the mean is about 350 -400 BP lower

subtract 1/2 the variance (1/12)(b-a)^2 and it seems to be in line with the average

which makes it kind of a negative lognormal distribution with the upper bound at 1

Ken G · Mar 9, 2011

I think your solution is quantitatively precise, can you find an elegant proof? There are some ways to attack it that do come out simply.

BWV · Mar 9, 2011

Ken G said:

I think your solution is quantitatively precise, can you find an elegant proof? There are some ways to attack it that do come out simply.

Are you saying that the 1000 row spreadsheet of random numbers in excel that I used to prove this to myself is not elegant? ;)

Ken G · Mar 9, 2011

He he, I'll leave that for you to decide!

I like Serena · Mar 22, 2011

Ken G said:

Find an elegant solution to this question: if 4 players are playing a game where they each are given a random number from 1 to a million, and the largest number wins, what is the expectation value of the winning number?

EWinningNumber = \frac 1 {1000000^4} ( \sum^{1000000}_{k=1} k^5 - \sum^{1000000}_{k=1} k \sum^{k-1}_{i=1}i^4 )

[EDIT]I haven't worked it out yet, but this can be calculated with the formulas from http://en.wikipedia.org/wiki/List_of_mathematical_series[/EDIT]

Ken G · Mar 22, 2011

If you haven't even worked it out yourself, it's a bad sign that the solution is elegant!

Time for a hint: imagine what it means if we add one more player.

D H · Mar 22, 2011

I like Serena said:

EWinningNumber = \frac 1 {1000000^4} ( \sum^{1000000}_{k=1} k^5 - \sum^{1000000}_{k=1} k \sum^{k-1}_{i=1}i^4 )

[EDIT]I haven't worked it out yet, but this can be calculated with the formulas from http://en.wikipedia.org/wiki/List_of_mathematical_series[/EDIT]

That is a *huge* negative number, about -2.85713×10¹⁶

I like Serena · Mar 22, 2011

D H said:

That is a *huge* negative number, about -2.85713×10¹⁶

Oops! Let me check my formulas again.

And btw, with a fifth player I only expect to see the expection go up in a polynomial manner.

[EDIT]Yes. I see my mistake.
To summarize with four players the winning number can only by 1 if all 4 players have 1.
That is a chance of 1/n^4.

The winning number can only be 2 if all four players have 1 or 2, but not all 4 have 1.
So the chance is (2^4-1)/n^4.

The winning number can only be 3 if all four players have 1, 2 or 3, but not all 4 have 1 or 2.
So the chance is (3^4-2^4)/n^4.

In general:

P(winning number is k) = (k^4 - (k-1)^4) / n^4

This makes the expectation:

E(WinningNumber) = \sum^{n}_{k=1} k (k^4 - (k-1)^4) / n^4
[/EDIT]

Ken G · Mar 22, 2011

Adding a fifth player let's you answer, and prove, the original puzzle in a single line.

D H · Mar 22, 2011

I like Serena *did* answer it in a single line, and this time she got it right.

Ken, you probably want something nice and concise like 4/5*1e6=800,000, which isn't quite right.

800,000.5 is better but still isn't quite right.

Ken G · Mar 22, 2011

Edited once or twice, but here is my simplest solution guided by solving for the chance of a fifth player winning:

1/5 = sum_n p(n) [1-n/10⁶] = 1 - N/10⁶ so N = 800,000 exactly.

In the above, p(n) is the probability distribution for the highest number n of the first 4 players, and N is the expectation of n, as desired.

Ken G · Mar 22, 2011

D H said:

Ken, you probably want something nice and concise like 4/5*1e6=800,000, which isn't quite right.

Edit: Ah, I see what you mean, you are dealing with the discreteness of the problem. I stand corrected, I am indeed taking the limit where 10⁶ is effectively infinite. I should have let the puzzle be drawing real numbers between 0 and 1!

D H · Mar 22, 2011

Ken G said:

Edited once or twice, but here is my simplest solution guided by solving for the chance of a fifth player winning:

1/5 = sum_n p(n) [1-n/10⁶] = 1 - N/10⁶ so N = 800,000 exactly.

In the above, p(n) is the probability distribution for the highest number n of the first 4 players, and N is the expectation of n, as desired.

That might be elegant, but it is wrong.

With a fifth player the expected value of the winner's number is about 5/6*1e6. Here's an easy way to look at it: With one player, the expected value of the winning draw is about 500,000, or 1/2*1e6. With two players, the expected value is about 2/3*1e6, with three, about 3/4*1e6. In general, with n players the expected value is about n/(n+1)*1e6.

The exact value is

E(P_{(n)}(x)) = \frac{\sum_{k=1}^{1000000} k(k^n-(k-1)^n)}{1000000^n}Edit
This is easily derived from the order statistics for a discrete random variable drawn uniformly from 1 to k,

P(X_{(n)}(x) \le r) = \left(\frac r k\right)^n

Ken G · Mar 22, 2011

D H said:

That might be elegant, but it is wrong.

With a fifth player the expected value of the winner's number is about 5/6*1e6. Here's an easy way to look at it: With one player, the expected value of the winning draw is about 500,000, or 1/2*1e6. With two players, the expected value is about 2/3*1e6, with three, about 3/4*1e6. In general, with n players the expected value is about n/(n+1)*1e6.

That was the solution I gave.

The exact value is

E(P_{(n)}(x)) = \frac{\sum_{k=1}^{1000000} k(k^n-(k-1)^n)}{1000000^n}

This accounts for the discreteness that I overlooked. It's not really even correct either, because the puzzle doesn't stipulate what to do about ties. Presumably, if there's a tie, you either have to do it all over, or divide the win equally among those tied (say with a random tie-breaker). Any of those will change the answer-- discreteness is just a pain here, I should have said to solve for the continuous limit.

I like Serena · Mar 22, 2011

Ken G said:

Edited once or twice, but here is my simplest solution guided by solving for the chance of a fifth player winning:

1/5 = sum_n p(n) [1-n/10⁶] = 1 - N/10⁶ so N = 800,000 exactly.

In the above, p(n) is the probability distribution for the highest number n of the first 4 players, and N is the expectation of n, as desired.

I don't quite understand what you write here.
What is the function of [1-n/10⁶]?
And where does 1 - N/10⁶ come from?

Either way, your example does not work out for a number between 1 and 2.
In that case we have an expectation of (1 x 1 + 15 x 2)/2^4 = 31/16.
According to your argument this should be N = (1 - 1/5) x 2 = 8/5 but it isn't.

I simplified my formula to: E(winning number) = \frac 4 5 n + \frac 1 2 - \frac 1 {3n} + \frac 1 {30n^3}

This works out for n = 1 and 2 (being 1 resp. 31/16).
For n=1000000 we get 800000.4999996666666666667

It is funny though that this is almost exactly 800000 or 800000.5 :approve:

It kind of reminds me of the Intel Pentium, which is actually the Intel 80584.9997932

D H · Mar 22, 2011

Ken G said:

discreteness is just a pain here, I should have said to solve for the continuous limit.

By that I presume you mean U(0,1000000) rather than U(1,1000000). The latter makes for a not quite so nice answer as the former.

In the case of U(0,r), the expected value of the k^th order statistic is simply

E(X_{(k)}) = \frac {rk} {n+1}

The probability distribution for the winning ticket is that of the n^th order statistic X_(n), and thus for U(0,1000000)

E(\mbox{winner}) = \frac n {n+1}\,1000000

I like Serena · Mar 22, 2011

So the "elegant" formulation would be the following?

Find an elegant solution to this question: if 4 players are playing a game where they each are given a random number from 0 to a million, and the largest number wins, what is the expectation value of the winning number?
Note: in case of a tie they play again.

D H · Mar 22, 2011

That's correct, assuming that
(a) These random numbers from 0 to a million includes numbers such as 1/2 and pi, and
(b) The numbers are drawn from a uniform distribution.

The probability of a tie is now zero, so we can essentially ignore this case.

Ken G · Mar 22, 2011

D H said:

By that I presume you mean U(0,1000000) rather than U(1,1000000).

Not just starting at 0, but also going to an arbitrarily large maximum, if we are using integers. Basically, machine precision. Then you can get as close to 4/5 of the maximum as you like.

Ken G · Mar 22, 2011

I like Serena said:

So the "elegant" formulation would be the following?

I had in mind going to such a nearly-continuous variable that ties are essentially impossible. Your solution accounts for both discreteness, and redoing ties, so is highly accurate, but it can't be made elegant because it requires a computer to make the calculation. It's the fault of how I set up the puzzle originally, I basically used 10⁶ where I had in mind, say, 10¹⁰⁰⁰. Had I said the latter, we could be happy with the elegant solution that yields 4/5*10¹⁰⁰⁰, without much concern for discreteness or ties.

Ken G · Mar 22, 2011

I like Serena said:

I don't quite understand what you write here.
What is the function of [1-n/10⁶]?

It's not a function, it's just multiplication. My apologies for the square brackets.

And where does 1 - N/10⁶ come from?

N = sum_n p(n)*n, it is the expectation value of the highest of 4 players. I think you will see where the 1 - N/10⁶ comes from immediately when you see what I wrote is much easier than you are imagining, because of those square brackets.

Either way, your example does not work out for a number between 1 and 2.

The solution I gave applies for an essentially continuous variable from 0 to some maximum, which might as well be 1 if we are using a continuous variable. Because it's not really possible to sample from a continuous variable, there has to be some machine precision in there somewhere, I gave a discrete version, but I didn't really mean to and it messes up the elegant solution.

It is funny though that this is almost exactly 800000 or 800000.5

I hope you will see, once you understand my notation above, that the elegant solution leads immediately to the result 800,000, to within errors of discreteness and what to do with ties, which go away in the continuous limit.

I like Serena · Mar 22, 2011

Ken G said:

It's not a function, it's just multiplication. My apologies for the square brackets.
N = sum_n p(n)*n, it is the expectation value of the highest of 4 players. I think you will see where the 1 - N/10⁶ comes from immediately when you see what I wrote is much easier than you are imagining, because of those square brackets.

I guess I formulated my questions not sharp enough.
Let's break down your formula:

1/5 = sum_n p(n) [1-n/10⁶] = 1 - N/10⁶

1/5 is the chance of player 5 winning. I get that.

sum_n p(n) K is the calculation of an expectation

So apparently you calculate the expectation of 1-n/10⁶.
No wait, I think you calculate a chance here.
That would be the chance that player 5 gets a number less than n.
Is that it? Or what?

sum_n p(n) n would be the expection of the winning number for 4 players

Now I get that this leads to your N.

N the expectation of the winning number for 4 players

That leaves the middle part. What did you do there?

Ken G · Mar 22, 2011

I like Serena said:

I guess I formulated my questions not sharp enough.
Let's break down your formula:

1/5 = sum_n p(n) [1-n/10⁶] = 1 - N/10⁶

1/5 is the chance of player 5 winning. I get that.

Yes, the left-hand side is the probability of the added player winning, and the expression on the right is another way to state that same probability.

sum_n p(n) K is the calculation of an expectation

So apparently you calculate the expectation of 1-n/10⁶.

Don't think of it as an expectation, think of it as a probability of winning. p(n) is the probability that the score to beat is n. (1-n/10⁶) is the probability of player #5 winning, given n. So sum_n p(n) *(1-n/10⁶) is the full probability that player #5 wins, so =1/5. But when we evaluate the sum, we find it is 1 - N/10⁶, where N is the expectation value of the score to beat. So we get an equation involving N that let's us solve for N.

No wait, I think you calculate a chance here.
That would be the chance that player 5 gets a number less than n.

Correct, but it is the chance of greater than n, and since p(n) is the probability that the score to beat is n, we get our result without ever knowing what p(n) actually is-- which is the work-saving part!

I like Serena · Mar 22, 2011

Ken G said:

Yes, the left-hand side is the probability of the added player winning, and the expression on the right is another way to state that same probability.
Don't think of it as an expectation, think of it as a probability of winning. p(n) is the probability that the score to beat is n. (1-n/10⁶) is the probability of player #5 winning, given n. So sum_n p(n) *(1-n/10⁶) is the full probability that player #5 wins, so =1/5. But when we evaluate the sum, we find it is 1 - N/10⁶, where N is the expectation value of the score to beat. So we get an equation involving N that let's us solve for N.
Correct, but it is the chance of greater than n, and since p(n) is the probability that the score to beat is n, we get our result without ever knowing what p(n) actually is-- which is the work-saving part!

Thanks!
I get it now.

And obviously it is the discrete sum that causes a discretization error.
Using an integral would eliminate that problem.

So \frac 1 5 = \int_0^{10^6}p(x)(1 - \frac x {10^6})dx = 1 - \frac N {10^6}

Ken G · Mar 22, 2011

Yes, that's really how I intended the puzzle, but I can't blame you for solving it correctly the way it was written!

Calculating Expectation Value of Winning Number in Probability Puzzle

Similar threads

Hot Threads

B A Little Probability Puzzle

I Need help solving this Existence Algorithm for truth

A Does this computation satisfy LTL formulas?

A Prove that points which are indistinguishable from 0 exist (using logic)

A Mathematical Connection between Cosmic Expansion and Exponential Growth

Recent Insights

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Insights What Exactly is Dirac’s Delta Function? - Insight

Insights Relativator (Circular Slide-Rule): Simulated with Desmos - Insight

Insights Fixing Things Which Can Go Wrong With Complex Numbers

Insights Fermat's Last Theorem

Insights Why Vector Spaces Explain The World: A Historical Perspective