Difference between sample standard deviation and population standard deviation? 
#1
Jun24-12, 12:17 AM

P: 446

1. The problem statement, all variables and given/known data
Just as the title suggests, although this is more to do with the formula. I know that a sample is a subset of a population. Why do you divide by n - 1 in the formula for a sample, whereas for the standard deviation of a population you divide by the total number of elements in it?


#2
Jun24-12, 01:55 AM

P: 12,974

Short Answer:
The idea is to use the sample to estimate the population statistics. Dividing by n - 1 gives you a better estimator: it makes the variance estimate unbiased.

Longer Answer:
The reason that n - 1 is used instead of n in the formula for the sample variance is as follows. The sample variance can be thought of as a random variable, i.e. a function which takes on different values for different samples from the same distribution. Its use is as an estimate of the true variance of the distribution. In statistics, one typically does not know the true variance; one uses the sample variance to ESTIMATE the true variance.

Since the sample variance is a random variable, it usually has a mean, or average value. One would hope that this average value is close to the actual value that the sample variance is estimating, i.e. close to the true variance. In fact, if n - 1 is used in the defining formula for the sample variance, then it is possible to prove that the average value of the sample variance EQUALS the true variance. If we replace the n - 1 by n, then the average value of the sample variance is only (n - 1)/n times the true variance.

A random variable X which is used to estimate a parameter p of a distribution is called an unbiased estimator if the expected value of X equals p. Thus, using n - 1 gives an unbiased estimator of the variance of a distribution.
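For reference, the "it is possible to prove" step can be sketched in a few lines. This is only an outline, assuming the sample values X_1, ..., X_n are independent draws from a distribution with mean \mu and variance \sigma^2; the symbols \mu, \sigma^2 and \bar{X} (the sample mean) are notation introduced here, not taken from the posts above.

```latex
\begin{align*}
\sum_{i=1}^{n} (X_i - \bar{X})^2
  &= \sum_{i=1}^{n} (X_i - \mu)^2 - n(\bar{X} - \mu)^2 \\
E\!\left[\sum_{i=1}^{n} (X_i - \mu)^2\right]
  &= n\sigma^2,
  \qquad
  E\!\left[(\bar{X} - \mu)^2\right] = \operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n} \\
E\!\left[\sum_{i=1}^{n} (X_i - \bar{X})^2\right]
  &= n\sigma^2 - n \cdot \frac{\sigma^2}{n} = (n - 1)\sigma^2 \\
E\!\left[\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2\right]
  &= \sigma^2,
  \qquad
  E\!\left[\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2\right] = \frac{n-1}{n}\,\sigma^2
\end{align*}
```

The first line is the key step: deviations measured from the sample mean are, in total, smaller than deviations measured from the true mean, and the shortfall averages out to exactly one \sigma^2.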


#3
Jun24-12, 02:17 AM

P: 446

OK, thanks for the response. I understand what is being said there in general. However, I don't quite understand why n - 1 is still used. It's saying that if n - 1 is used in the defining formula then it's possible to prove etc... I do not understand that part. Can you give me a short numerical example?



#4
Jun24-12, 11:16 PM

P: 12,974

The usual exercise is to get the student to work out the distribution of sample variances.

But I think the confusion arises over the terms used, viz.: the "sample variance" is a technical term that does not quite mean the same thing as "the variance of the sample", but the "population variance" is the same thing as "the variance of the population". The sample variance is an estimate of the population variance, defined that way by convention. Dividing by (n - 1) gives a better estimate than dividing by n (which would have given you the variance of the sample).

To see what they are doing: remember that the idea is to figure out what the population mean and variance are without actually polling the entire population. You could take a sample of 1000 out of a population of several million ... what can you say, in general, about the entire population, from such a small number? You could find the mean and variance for the sample ... OK. But if you took another sample of 1000 tomorrow you would very likely get a different mean and variance from them.

If you take a lot of samples, and they are all random, then you get a distribution of sample means and a distribution of sample variances; by the central limit theorem, the distribution of the means is approximately normal. If the population is normally distributed, then the mean of the means will get closer to the population mean as the number of samples increases, but the mean of the variances (of the sample, dividing by n) will come out smaller than the population variance, by a factor of (n - 1)/n. You should be able to confirm that by working them out for just three or four random normal variables. You should know how to add random distributions by now.

What the passage quoted is saying is that if you define the sample variance to divide by (n - 1), its average comes out equal to the population variance, which is more convenient for estimating the population variance, which is what we are after. We don't have to do it that way, it's a convention.
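For a short numerical check of the claim above, here is a minimal sketch in Python. It is not the usual textbook exercise: instead of normal variables it assumes a tiny made-up population {1, 2, 3} and samples of size 2 drawn with replacement, so that every possible sample can be listed and the averages come out as exact fractions. The names `population`, `sum_sq_dev` and so on are just illustrative.

```python
from fractions import Fraction
from itertools import product


def mean(values):
    """Exact arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values, Fraction(0)) / len(values)


def sum_sq_dev(data):
    """Sum of squared deviations of the data from its own mean."""
    m = mean(data)
    return sum((Fraction(x) - m) ** 2 for x in data)


# Hypothetical toy population and sample size (assumptions for illustration).
population = [1, 2, 3]
n = 2

# Population variance: divide by the number of elements in the population.
pop_var = sum_sq_dev(population) / len(population)

# Every equally likely ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# Average, over all samples, of the two candidate formulas.
avg_div_n = mean(sum_sq_dev(s) / n for s in samples)
avg_div_n_minus_1 = mean(sum_sq_dev(s) / (n - 1) for s in samples)

print("population variance:        ", pop_var)            # 2/3
print("average when dividing by n: ", avg_div_n)          # 1/3
print("average when dividing by n-1:", avg_div_n_minus_1)  # 2/3
```

Dividing by n gives an average of 1/3, which is (n - 1)/n = 1/2 of the population variance 2/3; dividing by n - 1 gives exactly 2/3. Any individual sample can still be off in either direction; it is only the average over all samples that matches.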


#5
Jun25-12, 02:37 AM

P: 446

Thanks for that Simon, I have a clearer understanding now.


