Statistics: How to find mean of bins

  • Thread starter: Niles
  • Tags: Mean, Statistics
SUMMARY

To calculate the mean of binned data, use the formula mean = (1/N) * Σ(n_j * x_j), where n_j is the number of data points in bin j, x_j is the midpoint of bin j, and N = Σ n_j is the total number of data points. For the bins provided (20-29, 30-39, 40-49, 50-59), the midpoints are 24.5, 34.5, 44.5, and 54.5, respectively. The midpoint serves as a representative value for each bin: it comes from the bin limits rather than from the collected data, and it avoids systematically over- or under-estimating the typical value in the bin. The result is an approximation of the mean of the underlying data, obtained by treating the binned data as a weighted mean problem.
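
As a worked check, substituting the counts from the question in this thread (2, 7, 12, 14, so N = 35) into the formula gives the following. This is only the arithmetic for this one example, not a general result:

$$\text{mean} \approx \frac{2(24.5) + 7(34.5) + 12(44.5) + 14(54.5)}{35} = \frac{49 + 241.5 + 534 + 763}{35} = \frac{1587.5}{35} \approx 45.36$$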

PREREQUISITES
  • Understanding of binned data and its representation
  • Familiarity with the concept of midpoints in statistics
  • Knowledge of weighted mean calculations
  • Basic algebra for manipulating summation formulas
NEXT STEPS
  • Research the concept of midpoints in statistical analysis
  • Learn about weighted averages and their applications
  • Explore methods for handling binned data in statistical software
  • Study the implications of bin size on statistical results
USEFUL FOR

Students, statisticians, and data analysts who are working with binned data and need to accurately calculate means for analysis and reporting.

Niles

Homework Statement


Hi

Say I have the following bins, where the number in parentheses is the number of data points contained in the bin:

20-29 : (2)
30-39 : (7)
40-49 : (12)
50-59 : (14)

How would I go about finding the mean for this binned data? I know that I should use

$$\text{mean} = \frac{1}{N}\sum_j n_j x_j,$$

where bin j corresponds to a value x_j and contains n_j elements. But in my case, what are the values x_j of the bins?
 
You usually treat this as a weighted mean problem. Think this way: if you needed to select one value from inside each bin, what value (intuitively) would be the one to pick if you didn't want to over- or under-estimate typical values in the bin? That's the value you use for x.
 
statdad said:
You usually treat this as a weighted mean problem. Think this way: if you needed to select one value from inside each bin, what value (intuitively) would be the one to pick if you didn't want to over- or under-estimate typical values in the bin? That's the value you use for x.

I would use the average value of the data samples in that particular bin. Would you also do that?
 
Niles said:
I would use the average value of the data samples in that particular bin. Would you also do that?

No - you need to use a number that comes from the bins, not the collected data.
 
Then the average of the bin's lower and upper limits, i.e. for 20-29 it is 24.5?
 
Yes - it's called the midpoint of the bin.
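
For anyone who wants to check this numerically, here is a minimal Python sketch of the midpoint approach discussed above. The function name `binned_mean` and the representation of each bin as a `(lower, upper)` pair are assumptions made for illustration, not something from the thread; the bins and counts are the ones from the original question.

```python
def binned_mean(bins, counts):
    """Approximate the mean of binned data.

    bins   -- list of (lower, upper) limits, one pair per bin (hypothetical layout)
    counts -- number of data points n_j in each bin
    """
    if len(bins) != len(counts):
        raise ValueError("bins and counts must have the same length")
    total = sum(counts)  # N, the total number of data points
    if total == 0:
        raise ValueError("at least one bin must contain data")
    # x_j: midpoint of each bin, taken from the bin limits, not from the data
    midpoints = [(lower + upper) / 2 for lower, upper in bins]
    # Weighted mean: (1/N) * sum_j n_j * x_j
    return sum(n * x for n, x in zip(counts, midpoints)) / total


bins = [(20, 29), (30, 39), (40, 49), (50, 59)]
counts = [2, 7, 12, 14]
print(round(binned_mean(bins, counts), 2))  # about 45.36
```

Keep in mind the result is only an estimate: the true mean of the raw data may differ, since every point in a bin is treated as if it sat at the midpoint.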
 
Thanks, it is kind of you to help me.

Best wishes,
Niles.
 
