Runners in a race, probability paradox


Discussion Overview

The discussion revolves around the probabilities of runners finishing first in a race, given their expected times and standard deviations. It explores the implications of these probabilities when the number of runners varies, particularly focusing on the case of two versus more than two runners.

Discussion Character

  • Exploratory
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • Some participants note that for two runners, if one runner has a lower expected time, they will always have a higher probability of finishing first, regardless of their standard deviations.
  • Others argue that with more than two runners, this relationship may not hold, as the probabilities can be influenced by the standard deviations of the runners.
  • A participant provides a specific example with three runners, illustrating that a runner with a lower expected time may not have the highest probability of winning if other runners have significant chances of finishing before them.
  • Concerns are raised about discrepancies between analytical results from integrals and results from Monte Carlo simulations, suggesting potential errors in the numerical methods used.
  • One participant shares specific values for expected times and standard deviations, comparing results from integrals and Monte Carlo simulations, highlighting differences in probabilities calculated.
  • There is a mention of the sensitivity of results to the number of integration steps used, indicating that small changes can affect the outcomes significantly.

Areas of Agreement / Disagreement

Participants express disagreement regarding the validity of the probability relationships as the number of runners increases. While some assert that the established property holds only for two runners, others challenge this view, suggesting that the situation becomes more complex with additional runners.

Contextual Notes

There are unresolved questions about the accuracy of numerical methods, particularly the Simpson's rule approximation and Monte Carlo simulations, which may introduce errors in the probability calculations.

cosmicminer
There are n runners in a race.
We know their expected times from start to finish μ(i) and the corresponding standard deviations σ(i).
The probability that runner 0 finishes first is given by this integral:

integral1.png

It's from here:

https://www.untruth.org/~josh/math/normal-min.pdf

The index 0 in the image of the formula really stands for any one of the runners i.
I would rather write i instead of "0", with the product then running over j ≠ i.
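
The image itself is not reproduced here. Assuming independent, normally distributed finishing times, the standard expression for the probability that runner i has the smallest time would read as below. It matches the description in this post (a product over j ≠ i, and erfc for the tail probability), but it is a reconstruction under those assumptions, not a copy of the image.

```latex
P(i) = \int_{-\infty}^{\infty}
       \frac{1}{\sigma_i \sqrt{2\pi}}
       \exp\!\left( -\frac{(x - \mu_i)^2}{2 \sigma_i^2} \right)
       \prod_{j \neq i} \frac{1}{2}\,
       \operatorname{erfc}\!\left( \frac{x - \mu_j}{\sigma_j \sqrt{2}} \right)
       \, dx
```

Each factor in the product is the probability that runner j takes longer than x, so the integrand is the density of runner i finishing at time x times the probability that everyone else is still running.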

This can be computed easily using Simpson's rule, perhaps with the Abramowitz-Stegun approximation for erfc.
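
A minimal sketch of such a computation, assuming independent normal finishing times, so that the integrand is the density of runner i multiplied by the probability that every other runner is slower. The names `win_probability`, `steps` and `width` are hypothetical. Two choices here are not from the post: Python's built-in `math.erfc` stands in for the Abramowitz-Stegun approximation, and the integral is taken over the standardized variable z = (x - μ(i)) / σ(i) so that the grid always covers runner i's own density, however small σ(i) is.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def survival(x, mu, sigma):
    """P(T > x) for T ~ N(mu, sigma^2), written with erfc."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))


def win_probability(i, mus, sigmas, steps=1000, width=8.0):
    """Composite Simpson estimate of P(runner i finishes first).

    Integrates over z in [-width, width], where x = mus[i] + sigmas[i] * z.
    """
    if steps % 2:          # Simpson's rule needs an even number of intervals
        steps += 1
    h = 2.0 * width / steps
    total = 0.0
    for k in range(steps + 1):
        z = -width + k * h
        x = mus[i] + sigmas[i] * z
        value = normal_pdf(z)
        for j in range(len(mus)):
            if j != i:
                value *= survival(x, mus[j], sigmas[j])
        # Simpson weights: 1 at the ends, 4 at odd nodes, 2 at even nodes
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * value
    return total * h / 3.0
```

For example, `win_probability(0, [60, 60.05], [0.001, 3])` estimates the chance that the first of two runners wins. When one of the other runners has a very small σ, the integrand for runner i contains a near-step that a coarse grid may not resolve, so the result for that runner can depend noticeably on `steps`.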

The strange thing is this:
If I choose n = 2 and any values for μ and σ then the following holds true:

if μ(1) < μ(2) then always P(1) > P(2), irrespective of the σ's    (1)

This is a property of the double normal distribution.
Thus if runner 1 has a delta function for a distribution (the limiting normal with σ = 0) and runner 2 is a close second but with a big σ, then runner 1 still has the higher probability.
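
For reference, property (1) can be derived in a few lines, assuming the two finishing times T1 and T2 are independent and normal (the symbols T1, T2 and Φ, the standard normal CDF, are introduced here for illustration):

```latex
T_1 - T_2 \sim \mathcal{N}\!\left( \mu_1 - \mu_2,\; \sigma_1^2 + \sigma_2^2 \right)

P(1) = P(T_1 - T_2 < 0)
     = \Phi\!\left( \frac{\mu_2 - \mu_1}{\sqrt{\sigma_1^2 + \sigma_2^2}} \right)

\mu_1 < \mu_2 \;\Longrightarrow\; P(1) > \tfrac{1}{2} > P(2)
```

The σ's only scale the argument of Φ; they cannot change its sign, which is why the ordering for n = 2 depends on the μ's alone.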

But if n > 2, law (1) may or may not hold, depending on the sigmas.
So for n > 2 it is possible that one of the theoretically faster runners has a lower probability than a slower runner with a bigger σ.
How is this possible?
 
It's not just about being faster than each rival in a pairwise comparison, but about being faster than all of them simultaneously. Thus if (because of the sigmas) there are several with a significant probability of beating the one with lowest μ, that one may not be the most likely to finish first.

Consider an extension of your example: 3 runners, A, B and C. A has a delta function, while B and C have identical (but independent) distributions with μ and σ such that the probability of B finishing after A is 55%.
The probability of A winning is the probability that both B and C finish later, i.e. 0.55² ≈ 0.3.
The probability that at least one of B and C finishes before A is 0.7.
By the symmetry of the situation, P(B) = P(C) = 0.35.
So A is more likely to beat B than B is to beat A, likewise with A and C. But B and C are each more likely to finish first of the 3 than A.
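
The arithmetic of this three-runner example can be checked in a few lines. This is only a sketch of the numbers quoted above; the variable names are made up for illustration, and B and C are assumed independent as stated.

```python
# Probability that B (or, identically, C) finishes after A
p_later_than_a = 0.55

# A wins only if both B and C finish later (independent events)
p_a = p_later_than_a ** 2

# The remaining probability is split evenly between B and C by symmetry
p_b = p_c = (1.0 - p_a) / 2.0

print(f"P(A) = {p_a:.4f}, P(B) = {p_b:.4f}, P(C) = {p_c:.4f}")
```

The unrounded values are 0.3025 for A and 0.34875 each for B and C, so A wins every pairwise comparison yet is the least likely of the three to finish first.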
 
mjc123 said:
It's not just about being faster than each rival in a pairwise comparison, but about being faster than all of them simultaneously. Thus if (because of the sigmas) there are several with a significant probability of beating the one with lowest μ, that one may not be the most likely to finish first.

Consider an extension of your example: 3 runners, A, B and C. A has a delta function, while B and C have identical (but independent) distributions with μ and σ such that the probability of B finishing after A is 55%.
The probability of A winning is the probability that both B and C finish later, i.e. 0.55² ≈ 0.3.
The probability that at least one of B and C finishes before A is 0.7.
By the symmetry of the situation, P(B) = P(C) = 0.35.
So A is more likely to beat B than B is to beat A, likewise with A and C. But B and C are each more likely to finish first of the 3 than A.

I'm not sure about what you say.
Doing this with Monte Carlo random numbers (Box-Muller) doesn't seem to help.
With n = 2 and 100,000 samples it even finds errors: P(2) > P(1), when we know that for n = 2, P(2) < P(1).

The proof for n = 2 exists somewhere, but I don't have a proof that the ordering P(1) > P(2) is strictly valid only for n = 2. But if the integral says so then it is so, unless some error is introduced by the Simpson rule approximation.
 
I try

μ1 = 60, σ1 = 0.001
μ2 = 60.05, σ2 = 3

Integral says ok, P1 = 0.512884, P2 = 0.487116

Monte Carlo with 100,000 Box-Muller samples finds "error":

P1 = 0.48982, P2 = 0.51018

I add a μ3 = 60.05, σ3 = 3, so it's a 3-way contest.
Integral finds P1 = 0.2598426, P2 = 0.3700787, P3 = 0.3700787
Monte Carlo with 100,000 Box-Muller samples again finds:
P1 = 0.27264, P2 = 0.35933, P3 = 0.36803

So it seems Monte Carlo finds it difficult even with 100,000 samples, while the integral says that the strict ordering holds for n = 2 only.
27% versus 36% looks like too big a difference to be caused by integration errors.
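
A minimal Monte Carlo sketch for comparison, which also reports the binomial standard error of each estimate. It assumes independent normal finishing times and uses the standard library's `random.gauss` rather than a hand-written Box-Muller routine; the function name `simulate` and its parameters are hypothetical.

```python
import math
import random


def simulate(mus, sigmas, samples=100_000, seed=0):
    """Estimate each runner's win probability by direct simulation.

    Returns (probabilities, standard_errors), one entry per runner.
    """
    rng = random.Random(seed)
    wins = [0] * len(mus)
    for _ in range(samples):
        times = [rng.gauss(mu, sigma) for mu, sigma in zip(mus, sigmas)]
        wins[times.index(min(times))] += 1   # smallest time finishes first
    probs = [w / samples for w in wins]
    # Standard error of a binomial proportion: sqrt(p * (1 - p) / N)
    errors = [math.sqrt(p * (1.0 - p) / samples) for p in probs]
    return probs, errors


probs, errors = simulate([60.0, 60.05], [0.001, 3.0])
print(probs, errors)
```

With 100,000 samples and probabilities near 0.5, the standard error is about 0.0016. A discrepancy of one or two hundredths between a simulation and the true value is therefore many standard errors wide, which may point to the random number generation or the comparison logic rather than to sampling noise.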

However, when I increase the number of integration steps from 200 to 1000 I find:

for n = 2, P(1)= 0.50519, P(2) = 0.4948101
for n = 3, P(1) = 0.2559455, P(2) = 0.3720273, P(3) = 0.3720273

So going from 200 to 1000 steps affects the second decimal digit.
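
For these particular values there are closed-form reference numbers that neither Simpson's rule nor sampling is needed for, assuming independent normal times. For n = 2 the difference of two normals is normal, so P(1) is a single normal CDF. For n = 3, runner 1's σ of 0.001 is treated as a delta function at 60, which is an approximation, not an exact result. The helper `phi_cdf` and the variable names are illustrative.

```python
import math


def phi_cdf(x):
    """Standard normal CDF, written with erfc."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


mu1, sigma1 = 60.0, 0.001
mu2, sigma2 = 60.05, 3.0      # runner 3 has the same mu and sigma as runner 2

# n = 2: exact for independent normals
p1_two = phi_cdf((mu2 - mu1) / math.hypot(sigma1, sigma2))

# n = 3: runner 1 approximated as a delta function at mu1
p_later = phi_cdf((mu2 - mu1) / sigma2)   # P(runner 2 finishes after mu1)
p1_three = p_later ** 2
p2_three = (1.0 - p1_three) / 2.0         # split evenly by symmetry

print(f"n = 2: P(1) = {p1_two:.4f}")
print(f"n = 3: P(1) = {p1_three:.4f}, P(2) = P(3) = {p2_three:.4f}")
```

These come out near 0.5066 for n = 2 and near 0.2567 and 0.3717 for n = 3, close to the 1000-step integration results above. One possible reason for the sensitivity to the step count, not confirmed in this thread, is that a distribution with σ = 0.001 is a very narrow feature that a coarse integration grid may fail to resolve.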
 
