Comparing two averages with different group sizes

  • Context: Undergrad
  • Thread starter: Mohammad
  • Tags: Group
SUMMARY

This discussion centers on comparing the averages of two groups of different sizes, using the Binomial distribution with the Maximum Likelihood Estimate (MLE) as its parameter to evaluate the probability of each observed result. The example contrasts Driver A, who wins 65 out of 161 races (about 40%), with Driver B, who wins 68 out of 244 races (about 28%). Although the win counts are similar and Driver A has the higher win rate, Driver B's result has the lower calculated probability, so the method counterintuitively rates it as the rarer achievement. This prompts a closer look at the parameter used. The reply points to weighting the variances when comparing groups of unequal size.

PREREQUISITES
  • Understanding of Binomial distribution
  • Familiarity with Maximum Likelihood Estimation (MLE)
  • Knowledge of probability theory
  • Basic statistics concepts, including averages and variances
NEXT STEPS
  • Research how to apply variance weighting in statistical comparisons
  • Learn about alternative statistical tests for comparing proportions
  • Explore the implications of sample size on statistical significance
  • Investigate advanced topics in probability distributions
USEFUL FOR

Statisticians, data analysts, and researchers interested in comparing group averages, particularly in scenarios involving unequal sample sizes and probability assessments.

Mohammad
Hi everyone,

I have been recently intrigued by a seemingly simple problem: How to compare the averages of two groups with different sizes.

For example: Suppose you have a driver A who wins 100 out of 200 races, and a driver B who wins 1 out of 2 races. Although the average is the same, driver A's result is clearly less likely to occur (so it can be considered more valuable?).

I worked out a solution based on the Binomial distribution with the MLE for each driver as the parameter.

Pr(X = 100|1/2) = 0.0563 (N = 200)
Pr(X = 1|1/2) = 0.5 (N = 2)

The result matches my expectation, as it indicates that the first event is less likely to occur. The problem, however, comes when I have a situation like this:

Driver A wins 65 out of 161 races.
Driver B wins 68 out of 244 races.

By evaluating the probabilities in the same way I got:

Pr(X = 65|65/161) = 0.0640
Pr(X = 68|68/244) = 0.0569
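
The four probabilities quoted in this post can be reproduced with a short sketch. It assumes independent races with a fixed win probability per driver; the function names binom_pmf and pmf_at_mle are made up for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k wins in n independent races with win probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def pmf_at_mle(wins, races):
    """Binomial pmf at the observed count, with p set to wins/races (the MLE)."""
    return binom_pmf(wins, races, wins / races)

# The four cases from the post: (wins, races)
for wins, races in [(100, 200), (1, 2), (65, 161), (68, 244)]:
    print(f"Pr(X = {wins} | {wins}/{races}) = {pmf_at_mle(wins, races):.4f}")
```

Each value is the probability of hitting the observed count exactly, given that the driver's true win rate equals the observed one.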

Intuitively, I reject this result because it is clear that driver A did a better job (because both drivers won almost the same number of races). I know it is probably because of the parameter I am using, but I don't know how to fix it.

Any thoughts?
 
See http://geography.uoregon.edu/GeogR/topics/ttest.htm : different group sizes are factored into the problem through the weighting of the variances across the two groups.
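
One way to make this concrete for win/loss counts is a two-proportion z-test with a pooled variance. This is a stand-in chosen here, not the exact procedure on the linked page, which concerns the t-test for means. The sketch assumes independent races and a normal approximation, which is poor for very small samples such as 1 win out of 2; the function name two_proportion_z is hypothetical. For background, the binomial probability evaluated at its own MLE is roughly 1/sqrt(2*pi*n*p*(1-p)), so it mostly reflects the number of races and the spread of the count rather than which driver has the higher win rate.

```python
from math import sqrt, erf

def two_proportion_z(wins_a, races_a, wins_b, races_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    Assumes 0 < pooled win rate < 1, otherwise the standard error is zero.
    """
    p_a = wins_a / races_a
    p_b = wins_b / races_b
    # Pooled win rate under the null hypothesis that both drivers are equally good
    pooled = (wins_a + wins_b) / (races_a + races_b)
    # Group sizes enter through the variance of each sample proportion
    se = sqrt(pooled * (1 - pooled) * (1 / races_a + 1 / races_b))
    z = (p_a - p_b) / se
    # Standard normal CDF via the error function
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)

z, p_value = two_proportion_z(65, 161, 68, 244)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```

A positive z means driver A's win rate is higher than driver B's, and the p-value indicates how surprising a gap of that size would be if both drivers had the same underlying win rate. For the 100/200 versus 1/2 example the difference is zero, so the test reports no evidence either way.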
 