Predicting World Record Mile Time with Limited Training and Population Data

  • Context: Undergrad
  • Thread starter: futurebird
  • Tags: Time

Discussion Overview

The discussion revolves around predicting the potential world record mile time for women based on limited training and population data. Participants explore statistical models, assumptions about training, and historical context regarding women's running times.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant suggests that only 5% of the population trains for running, proposing a statistical approach to predict the world record mile time based on the average and best recorded times.
  • Another participant argues that most naturally talented runners likely train at some level, implying that including the remaining 95% may not significantly improve the world record.
  • Concerns are raised about the reliability of the statistical assumptions, particularly regarding self-selection among runners and the normality of the distribution of mile times.
  • A participant expresses skepticism about the predicted improvement of 16.4 seconds, suggesting a much smaller estimate of 0.1 seconds instead.
  • Historical context is introduced, noting that the women's world record was significantly slower in 1920 (around 6:15), prompting questions about the factors contributing to changes in running performance over time.
  • Participants acknowledge that the prediction model does not account for various factors such as increased participation, social acceptance, nutrition, and healthcare improvements.

Areas of Agreement / Disagreement

Participants express differing views on the reliability of the statistical model and the assumptions it relies on. There is no consensus on the predicted improvement in world record times, with some participants suggesting a significant change while others propose a much smaller adjustment.

Contextual Notes

Limitations include assumptions about the normal distribution of mile times, the self-selection of runners, and the impact of various social and biological factors on running performance.

futurebird
predict the best mile time??

Now, most people don't run very much and only a few train at all. If you want to know your "personal best" mile time you need to do a lot of training. Let's say that only 1 in 20 people in a population bother to do this.

We have all of their mile times. Let's say the average is 7:30 and the best is 4:12. The SD is... I don't know, let's say, 1 min.

Now, given that this is only 5% of the population, can we predict the world record mile time if EVERY woman trained until she could run her personal best?
 


I would imagine that most people who have a special ability to run do train -- at least at a local, e.g. high school level -- and so learn how good they are. I don't think adding the other 95% would even double the number of world-class runners. But if it did, you might expect with ~50% probability an improvement in the record. This would likely be a small improvement (probably less than a tenth of a second).
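The ~50% figure can be read as a symmetry argument: if the pool of world-class runners doubles and the newcomers are drawn from the same distribution, the overall best is equally likely to be in either half. Here is a minimal simulation sketch of that idea. The pool sizes, the normal distribution, and its parameters (mean 450 s, SD 34 s) are illustrative assumptions, not figures from this post.

```python
import random

def prob_record_improves(n_current=50, n_added=50, trials=4000, seed=0):
    """Estimate how often the best of the added group beats the current best.

    All parameters are hypothetical; both groups are drawn from the same
    normal distribution of mile times (in seconds).
    """
    rng = random.Random(seed)
    improved = 0
    for _ in range(trials):
        # Lower time is better, so the "record" of each group is its minimum.
        best_current = min(rng.gauss(450.0, 34.0) for _ in range(n_current))
        best_added = min(rng.gauss(450.0, 34.0) for _ in range(n_added))
        if best_added < best_current:
            improved += 1
    return improved / trials

estimate = prob_record_improves()
print(f"Estimated probability that the record improves: {estimate:.3f}")
```

With equal group sizes the estimate should land close to one half, whatever distribution is used, as long as both groups share it.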

But if you want to treat it as a stats problem, let's assume that each person is chosen from a normal distribution with mean 7:30, and that the current runners were selected arbitrarily (that the other 95%, if trained, would be just as good on average, and equally distributed).

With ~325 million runners who now train (5% of 6.5 billion), the world record would be the best of 325 million draws, about 5.81 standard deviations below the mean. Since the record of 4:12 is 198 seconds below the 7:30 mean, this makes the standard deviation ~34.1 seconds (198/5.81). With 6.5 billion people, the record would be 6.29 standard deviations below the mean instead. This would improve the time by ~16.4 seconds to an amazing 3:56.
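For anyone who wants to check the arithmetic, here is a rough Python sketch of the calculation. It assumes the record-holder sits at the 1-in-N tail of a standard normal distribution, which is one common way to approximate the expected best of N draws; the function name `record_z` is made up for this example.

```python
from statistics import NormalDist

def record_z(n_people):
    """Standard deviations below the mean for a 1-in-n_people performance."""
    # inv_cdf returns a negative z for a small lower-tail probability.
    return -NormalDist().inv_cdf(1.0 / n_people)

mean_s = 7 * 60 + 30       # average mile time of those who train, 7:30
record_s = 4 * 60 + 12     # best observed time, 4:12
world = 6.5e9              # total population used in the post
trained = 0.05 * world     # the 5% who currently train (~325 million)

z_now = record_z(trained)  # record position among current runners
z_all = record_z(world)    # record position if everyone trained

# Back out the SD from the observed record, then extrapolate.
sd_s = (mean_s - record_s) / z_now
new_record_s = mean_s - z_all * sd_s
improvement_s = record_s - new_record_s

minutes, seconds = divmod(round(new_record_s), 60)
print(f"z now: {z_now:.2f}, z all: {z_all:.2f}, SD: {sd_s:.1f} s")
print(f"Improvement: {improvement_s:.1f} s, new record: {minutes}:{seconds:02d}")
```

This reproduces the figures above to within rounding, but only under the same assumptions (normality and arbitrary selection of the current runners).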
 


What percentile did you give the top score to get 5.81 SDs?
 


Oh that's based on the population. I see.

How reliable is this? I mean if our data are good (unbiased sample)...
 


futurebird said:
How reliable is this? I mean if our data are good (unbiased sample)...

Not reliable at all. It's based on the assumption that runners don't self-select for a propensity for speed, and I think that's a terrible assumption. The assumption of normality is also a bad assumption, though probably less of a cause of error than the first. Finally, we assume that 5% push their limits far enough to tell if they could be world-class runners; this could be high or low, I have no idea.

I would much sooner guess 0.1 seconds than 16.4 seconds.
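To see why self-selection matters so much, here is a small simulation sketch comparing two extremes. Everything in it is an illustrative assumption: a toy population of 200,000, normally distributed potential mile times (mean 450 s, SD 34.1 s), and a deliberately extreme "fastest" case in which exactly the most talented 5% are the ones who train.

```python
import random

def record_gain(selection, n_population=200_000, train_fraction=0.05, seed=1):
    """Seconds by which the record improves if everyone trains.

    selection: "random"  -> the people who train are an arbitrary 5%
               "fastest" -> the people who train are the most talented 5%
    All numbers here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # Each person's potential (fully trained) mile time in seconds.
    times = [rng.gauss(450.0, 34.1) for _ in range(n_population)]
    n_trained = int(n_population * train_fraction)
    if selection == "random":
        trained = rng.sample(times, n_trained)
    elif selection == "fastest":
        trained = sorted(times)[:n_trained]
    else:
        raise ValueError(f"unknown selection rule: {selection}")
    # Current record minus the record once the whole population trains.
    return min(trained) - min(times)

gain_random = record_gain("random")
gain_fastest = record_gain("fastest")
print(f"Gain with arbitrary selection: {gain_random:.1f} s")
print(f"Gain with full self-selection: {gain_fastest:.1f} s")
```

With full self-selection the gain is zero by construction, since the fastest person is already among those who train. Reality presumably lies somewhere between the two cases, which is why the 16.4-second figure looks like an upper bound rather than a prediction.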
 


CRGreathouse said:
I would much sooner guess 0.1 seconds than 16.4 seconds.


Still, the women's world record was like 6:15 in 1920.
 


futurebird said:
Still, the women's world record was like 6:15 in 1920.

And to what do you attribute that change?
 


CRGreathouse said:
And to what do you attribute that change?


Mostly? More women running. It became more socially acceptable.
 


futurebird said:
Mostly? More women running. It became more socially acceptable.

Yes, and my 'prediction' model doesn't take that into account. Nor does it take better nutrition, improved healthcare, gene mixing ('hybrid vigor'), artificial genetic selection, or a host of other things into account. It just takes a few questionable assumptions and takes them to their logical conclusion.
 
