Statistics Formula: Where did it come from?

  • Context: High School 
  • Thread starter: Learning Curve
  • Tags: Formula, Statistics
SUMMARY

The discussion centers on the derivation of the standard deviation formula in statistics, specifically the formula's components and the reason for using (n-1) in the calculation. The mathematical expectation of s² equals the theoretical variance, and the standard deviation is calculated as sqrt((n/(n-1))mean((x-mean(x))²)). Dividing by (n-1) instead of n compensates for measuring deviations from the sample mean rather than the true mean, which makes s² an unbiased estimate of the population variance. The user is encouraged to explore the concept of variance further through the provided Wikipedia link.

PREREQUISITES
  • Understanding of basic statistics concepts, including mean and variance
  • Familiarity with mathematical notation and operations
  • Knowledge of sample vs. population statistics
  • Basic algebra skills for manipulating formulas
NEXT STEPS
  • Study the derivation of the sample variance formula in detail
  • Learn about the Central Limit Theorem and its implications for statistics
  • Explore the differences between population variance and sample variance
  • Investigate the role of degrees of freedom in statistical calculations
USEFUL FOR

Students in AP Statistics, educators teaching statistical concepts, and anyone seeking to deepen their understanding of variance and standard deviation in data analysis.

Learning Curve
This may not answer your question completely. However, the mathematical expectation (theoretical average) of s² is the theoretical variance of the distribution the sample is drawn from.
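A quick way to check this claim numerically is a small simulation. The sketch below is only an illustration, not part of the original reply: the function name `average_sample_variance`, the normal distribution, the sample size of 5 and the value sigma = 2.0 are all assumptions chosen for the example. It averages s² over many samples, once with the (n-1) divisor and once with the plain n divisor.

```python
import random
import statistics

def average_sample_variance(n, trials, sigma=2.0, seed=0):
    """Average the variance estimate over many samples of size n.

    Returns (average with divisor n-1, average with divisor n).
    The normal distribution and the default sigma are assumptions
    made for this illustration only.
    """
    rng = random.Random(seed)
    total_unbiased = 0.0
    total_biased = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        total_unbiased += statistics.variance(sample)   # divides by n - 1
        total_biased += statistics.pvariance(sample)    # divides by n
    return total_unbiased / trials, total_biased / trials

unbiased, biased = average_sample_variance(n=5, trials=20000)
# The theoretical variance here is sigma**2 = 4.0.
print("divisor n-1:", unbiased)
print("divisor n:  ", biased)
```

Up to simulation noise, the (n-1) version should land close to the theoretical variance of 4.0, while the n version should come out smaller by a factor of (n-1)/n, here 4/5.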
 
This will not completely answer it, but:
In statistics the idea is to get a picture of how lots of numbers behave by using a few numbers. The first statistic often used is the mean:
mean=sum/number
If our data has mean 0, we might like to know: are all the numbers zero, or most of them, or are maybe half 1000000000 and half -1000000000? We want an idea of spread-outness, so we consider
mean(x-mean(x))
but that is always zero. We cure that by squaring the deviations:
mean((x-mean(x))^2)
but we are treating n numbers as if they were n+1 (mean(x) is computed from x, so it is not an independent number of its own). To compensate, we do
(n/(n-1))mean((x-mean(x))^2)
but that is in squared units, so we take the square root:
sqrt((n/(n-1))mean((x-mean(x))^2))
which is the standard deviation we know and love
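The steps above can be followed one at a time in code. This is a minimal sketch: the helper names `mean` and `sample_std` and the data values are made up for illustration, and the result is compared against Python's built-in `statistics.stdev`, which uses the same (n-1) convention.

```python
import math
import statistics

def mean(values):
    """mean = sum / number"""
    return sum(values) / len(values)

def sample_std(x):
    """Standard deviation built up step by step, as in the post above."""
    n = len(x)
    m = mean(x)
    deviations = [xi - m for xi in x]           # x - mean(x)
    mean_sq = mean([d * d for d in deviations]) # mean((x - mean(x))^2)
    corrected = (n / (n - 1)) * mean_sq         # (n/(n-1)) correction
    return math.sqrt(corrected)                 # undo the squaring

# Hypothetical data, chosen so the mean is a round number (5.0).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# The plain mean of the deviations is zero, which is why we square them.
print(mean([xi - mean(data) for xi in data]))

print(sample_std(data))
print(statistics.stdev(data))  # library version, also divides by n - 1
```

For this data the squared deviations sum to 32, so the result is sqrt(32/7), and the hand-built function and the library call should agree.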
 
That helps thx. I now get most of the formula except why is it (n-1)?
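One standard way to see where the (n-1) comes from is to work out the expectation of the sum of squared deviations. The sketch below assumes (this is not stated in the thread) that x_1, ..., x_n are independent draws from a distribution with mean \mu and variance \sigma^2, and writes \bar{x} for mean(x).

```latex
\begin{align*}
\sum_{i=1}^{n} (x_i - \bar{x})^2
  &= \sum_{i=1}^{n} (x_i - \mu)^2 - n(\bar{x} - \mu)^2 \\
E\!\left[\sum_{i=1}^{n} (x_i - \bar{x})^2\right]
  &= n\sigma^2 - n \cdot \frac{\sigma^2}{n}
   = (n-1)\,\sigma^2 \\
E\!\left[\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2\right]
  &= \sigma^2
\end{align*}
```

The second line uses the fact that the variance of \bar{x} is \sigma^2/n. Measuring deviations from \bar{x} instead of \mu loses exactly one \sigma^2 worth of spread on average, so dividing by (n-1) rather than n restores the right average.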
 
