Probability one hitter is better

  • Context: Undergrad
  • Thread starter: KenNKC
  • Tags: Probability

Discussion Overview

The discussion revolves around evaluating the true batting abilities of two baseball players based on their hit statistics. Participants explore statistical methods to determine the probability that one player has a higher true ability than the other, considering various assumptions and conditions related to their performance data.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • Some participants question whether the Student's t-test is appropriate for comparing the players' abilities given their sample sizes and distributions.
  • One participant suggests using the TrueSkill system for evaluating player abilities.
  • Another participant argues that player A's higher batting average suggests superiority, but acknowledges the independence of events in batting.
  • Some participants highlight that the context of the leagues (high school vs. major leagues) and the quality of pitching must be considered when assessing true ability.
  • A participant proposes a controlled experiment to eliminate variability from pitching quality, suggesting that player B might be outperformed under such conditions.
  • One participant provides a statistical analysis, estimating the ranges of true talent for both players and indicating that player B could still potentially be better despite player A's higher average.
  • Another participant discusses the use of normal approximation to the binomial distribution to derive probabilities related to the players' performances.
  • Some participants debate the appropriateness of different statistical tests, including a difference of means test and a one-sided test for equal proportions.
  • One participant clarifies the distinction between frequentist and Bayesian interpretations of probability in the context of the players' abilities.

Areas of Agreement / Disagreement

Participants express differing views on the appropriate statistical methods to use, the impact of league differences on player ability, and the interpretation of probability in relation to the players' performances. There is no consensus on a single method or conclusion regarding which player is definitively better.

Contextual Notes

Participants mention various assumptions, such as equal quality pitching and the distribution of hits, which may affect the validity of their analyses. The discussion includes unresolved mathematical steps and varying interpretations of statistical results.

KenNKC
Let's say baseball player A gets 90 hits in 300 at bats, and player B gets 25 hits in 100 at bats. The true ability of both players is unknown. What is the probability that player A has a higher true ability than player B? Would the Student's t-test be used to calculate this? Thanks.
 
KenNKC said:
Let's say baseball player A gets 90 hits in 300 at bats, and player B gets 25 hits in 100 at bats. The true ability of both players is unknown. What is the probability that player A has a higher true ability than player B? Would the Student's t-test be used to calculate this? Thanks.

Student's t-test is generally used for small samples; the usual rule of thumb is fewer than thirty observations.

Beyond that, which distribution do a baseball player's hits follow? Once you know that, you can look up the appropriate test in a statistics text.
 
KenNKC said:
Let's say baseball player A gets 90 hits in 300 at bats, and player B gets 25 hits in 100 at bats. The true ability of both players is unknown. What is the probability that player A has a higher true ability than player B? Would the Student's t-test be used to calculate this? Thanks.
Isn't it obvious enough? Player A hits 3 in 10 and player B hits 2.5 in 10, so player A beats B by 0.5 in 10.
Even if the probability of hitting the ball were 50% per pitch and every event were independent of the others, there's no way B would surpass A.
 
If player B is batting .250 in the major leagues and player A is batting .300 in their high school league, then player B is still probably better. The true ability required to hit depends also on the true ability of the pitcher.

Even within the same league, the difficulty of the pitchers must be accounted for. I use a similar approach for my pinewood derby.
 
DaleSpam said:
If player B is batting .250 in the major leagues and player A is batting .300 in their high school league, then player B is still probably better. The true ability required to hit depends also on the true ability of the pitcher.

Even within the same league, the difficulty of the pitchers must be accounted for. I use a similar approach for my pinewood derby.

Say we wanted to remove the variability in the pitcher's true ability, for instance by using one pitching machine for both A and B. Would that count? I think B would most probably lose.
 
Sure. If you are performing a controlled experiment then you could control for that. The OP wasn't clear what conditions the data were acquired under. But they sound like regular game play rather than experimentation. In regular play the skill of the competition matters greatly.
 
Thanks for the replies. To clarify, I'm assuming both players faced equal quality pitching. Assuming a normal distribution, the true talent of player A with a 95% certainty is between .248 and .352. For player B the range is from .165 to .335. So player A's true talent could be .270 for example, and he just had a lucky 300 at bats. And player B's true talent could be .320 and he had an unlucky 100 at bats. So although Player A is more likely to be better than B based on the sample data, Player B could also be the better one. The part I am struggling with is coming up with a formula that gives this probability.
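For anyone who wants to check those ranges, here is a minimal sketch of the normal-approximation (Wald) interval, which appears to be what produced them. The function name `wald_interval` and the multiplier `z=1.96` are my own choices for illustration, not something stated in the post.

```python
import math


def wald_interval(hits, at_bats, z=1.96):
    """Normal-approximation (Wald) interval for a binomial proportion.

    z=1.96 is the usual two-sided 95% multiplier (an assumption here).
    """
    p_hat = hits / at_bats
    standard_error = math.sqrt(p_hat * (1 - p_hat) / at_bats)
    return p_hat - z * standard_error, p_hat + z * standard_error


interval_a = wald_interval(90, 300)   # player A: 90 hits in 300 at bats
interval_b = wald_interval(25, 100)   # player B: 25 hits in 100 at bats

print("A: %.3f to %.3f" % interval_a)  # A: 0.248 to 0.352
print("B: %.3f to %.3f" % interval_b)  # B: 0.165 to 0.335
```

The two intervals overlap substantially, which is why the sample alone cannot settle which player is better.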
 
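So the normal approximation to the binomial distribution has ##\mu = np## and ##\sigma = \sqrt{np(1-p)}##. Since the two players have different numbers of at bats, compare hit proportions rather than raw counts: each proportion is approximately normal with mean ##p## and standard deviation ##\sqrt{p(1-p)/n}##. You can take the PDFs of two such normal variables and simply multiply them together to get a joint PDF.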

Once you have the joint PDF, then you just integrate in the region ##X>Y## to get the probability.

I think.
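A rough sketch of that integral, under some assumptions of mine: the two variables are independent, each is a hit proportion (not a raw count) with the sample proportion plugged in for ##p##, and the normal curves are read as describing each player's true ability. For independent normals the region ##X>Y## does not need numerical integration, because ##X-Y## is itself normal, so the probability reduces to a single normal CDF. The helper names below are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_first_exceeds(hits_x, n_x, hits_y, n_y):
    """P(X > Y) for two independent normal approximations to hit proportions."""
    p_x, p_y = hits_x / n_x, hits_y / n_y
    var_x = p_x * (1 - p_x) / n_x   # variance of a proportion
    var_y = p_y * (1 - p_y) / n_y
    # X - Y is normal with mean p_x - p_y and variance var_x + var_y,
    # so P(X > Y) = P(X - Y > 0) = Phi(mean / standard deviation).
    z = (p_x - p_y) / math.sqrt(var_x + var_y)
    return normal_cdf(z)


prob_a_better = prob_first_exceeds(90, 300, 25, 100)
print(round(prob_a_better, 2))  # roughly 0.84
```

Here each player keeps his own variance; a test that pools the two samples uses a slightly different standard error and so gives a slightly different number.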
 
  • #10
Isn't this a difference of means test?
 
  • #11
You could do it that way, but it would probably be better to do a one-sided test for equal proportions.

Either way would be much simpler than my last suggestion.
 
  • #12
Also, remember that saying, "With 95% certainty we cannot reject the hypothesis that A is better than B," is not the same as saying, "There is a 95% probability that A is better than B."
 
  • #13
The test for equal proportions did the trick. I have a book with results of problems of the same type as my example, but the method to derive them isn't included. The equal proportions test got the right answers. For my example, the probability is about 83%. Thanks for everyone's input.
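The post does not show the calculation, so the following is only my guess at how the 83% figure can be reproduced: a one-sided two-proportion z-test with a pooled standard error, reporting one minus the p-value. The function name is hypothetical.

```python
import math


def one_sided_equal_proportions(hits_a, n_a, hits_b, n_b):
    """One-sided z-test of H0: p_a == p_b against H1: p_a > p_b.

    Uses the pooled proportion for the standard error (assumed here).
    Returns the z statistic and the one-sided p-value.
    """
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z >= z)
    return z, p_value


z_stat, p_value = one_sided_equal_proportions(90, 300, 25, 100)
print(round(z_stat, 2), round(1 - p_value, 2))  # 0.96 0.83
```

Strictly speaking, one minus the p-value is a statement about the data under the hypothesis of equal ability, which is the distinction raised in post #12.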
 
  • #15
insightful said:
Also, remember that saying, "With 95% certainty we cannot reject the hypothesis that A is better than B," is not the same as saying, "There is a 95% probability that A is better than B."
I think you are alluding to Bayesian vs frequentist methods? A frequentist approach would give the probability of the data given the hypothesis. The Bayesian approach would give the probability of the hypothesis given the data.
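To make the Bayesian side concrete, here is a minimal Monte Carlo sketch of the probability of the hypothesis given the data. It assumes a uniform Beta(1, 1) prior on each player's true average and independence between the players; both are my assumptions, and a different prior would change the answer. The function name and the number of draws are arbitrary.

```python
import random


def posterior_prob_a_better(hits_a, n_a, hits_b, n_b, draws=100_000, seed=0):
    """Estimate P(p_a > p_b | data) by sampling from Beta posteriors.

    With a uniform Beta(1, 1) prior (an assumption), the posterior for a
    player with h hits in n at bats is Beta(1 + h, 1 + n - h).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        p_a = rng.betavariate(1 + hits_a, 1 + n_a - hits_a)
        p_b = rng.betavariate(1 + hits_b, 1 + n_b - hits_b)
        if p_a > p_b:
            wins += 1
    return wins / draws


estimate = posterior_prob_a_better(90, 300, 25, 100)
print(estimate)
```

With these inputs the estimate should land in the low 0.8s, close to the figure from the equal proportions test, but it answers a different question: it is a direct probability that A's true ability exceeds B's under the stated prior.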
 
