Standard deviation

  • I
  • Thread starter tzx9633
  • #1
I know that the standard deviation of sample = standard deviation of population divided by sqrt(n) ...

However, in the following question, I don't know how to tell whether the standard deviation given is the standard deviation of the sample or the standard deviation of the population. Can anyone help me identify it?

At a large university, the mean age of students is 22.3 years and the standard deviation is 4 years. A random sample of 64 students is drawn. What is the probability that the average age of these students is greater than 23 years?

According to the author, the 4 given is the standard deviation of the population.
The standard deviation of the mean is 4 / sqrt(64).

Why is that so?
I think it's wrong because we only picked 64 students out of the population, so the standard deviation we get is the standard deviation of the sample, not the standard deviation of the population.
 

Answers and Replies

  • #2
Orodruin
Staff Emeritus
Science Advisor
Homework Helper
Insights Author
Gold Member
20,004
10,658
I know that the standard deviation of sample = standard deviation of population divided by sqrt(n) ...
I think you better unlearn that. You may be thinking of the standard deviation in the estimate of the population mean.

The sample itself will (on average) have the same standard deviation as the population.

In addition, the question gives too little information to be solved. It also depends on how student ages are distributed. The author may be implicitly assuming that the distribution is Gaussian, but I see no reason why this would be the case.
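The distinction between the spread inside one sample and the spread of the sample mean can be checked with a small simulation. This is only a sketch: it assumes the ages are normally distributed (which the question does not state), and the seed and number of trials are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 22.3, 4.0  # values from the question
N = 64                        # sample size from the question
TRIALS = 5000                 # number of repeated samples (arbitrary choice)

sample_sds = []
sample_means = []
for _ in range(TRIALS):
    # Assumption: ages follow a normal distribution.
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    sample_sds.append(statistics.stdev(sample))    # spread inside this sample
    sample_means.append(statistics.fmean(sample))  # this sample's mean

typical_sample_sd = statistics.fmean(sample_sds)
sd_of_sample_means = statistics.stdev(sample_means)

print(f"typical SD within one sample: {typical_sample_sd:.2f}")
print(f"SD of the sample means:       {sd_of_sample_means:.2f}")
print(f"4 / sqrt(64):                 {POP_SD / N ** 0.5:.2f}")
```

The first number should come out close to 4 and the second close to 0.5, which is the quantity the 1/sqrt(n) rule actually describes.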
 
  • #3
I think you better unlearn that. You may be thinking of the standard deviation in the estimate of the population mean.

The sample itself will (on average) have the same standard deviation as the population.

In addition, the question gives too little information to be solved. It also depends on how student ages are distributed. The author may be implicitly assuming that the distribution is Gaussian, but I see no reason why this would be the case.
Why? Can you explain further?
 
  • #4
I think you better unlearn that. You may be thinking of the standard deviation in the estimate of the population mean.

The sample itself will (on average) have the same standard deviation as the population.

In addition, the question gives too little information to be solved. It also depends on how student ages are distributed. The author may be implicitly assuming that the distribution is Gaussian, but I see no reason why this would be the case.
Isn't this a case where we need to estimate the population standard deviation from the sample standard deviation?
 
  • #5
Orodruin
Isn't this a case where we need to estimate the population standard deviation from the sample standard deviation?
No. This thread should really go in the homework section, where you need to show what you have done and explain your own thoughts about the problem in detail.
 
  • #6
No. This thread should really go in the homework section, where you need to show what you have done and explain your own thoughts about the problem in detail.
What I think is:

We only take a random sample from the population; it's practically impossible to measure the whole population.
So the standard deviation 4 means the standard deviation of the sample, not the standard deviation of the population.
 
  • #7
Orodruin
What I think is:

We only take a random sample from the population; it's practically impossible to measure the whole population.
So the standard deviation 4 means the standard deviation of the sample, not the standard deviation of the population.
As I said, the standard deviation of the sample is typically going to be the same as that of the population. Do not confuse it with the standard deviation in the estimate of the mean.
 
  • #8
As I said, the standard deviation of the sample is typically going to be the same as that of the population. Do not confuse it with the standard deviation in the estimate of the mean.
Why is the standard deviation of the population going to be the same as the standard deviation of the sample?
 
  • #9
Orodruin
Because the standard deviation of a sample is defined in such a way that it is a good estimator for the standard deviation of the stochastic variable being sampled. This should be in any basic textbook on statistics.
 
  • #10
fresh_42
Mentor
Insights Author
2022 Award
17,796
18,969
The standard deviation of a sample is in general an estimator of the standard deviation of the random variable. Here is where we need additional information about its distribution, since the estimator is itself a random variable. In the best case the observations are identically distributed. There are relations between the required sample size, the desired confidence interval, and the overall standard deviation. I haven't checked your figures and didn't see how the students' ages are distributed. Maybe this information is sufficient to get the required equation.
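One common textbook form of the relation between sample size, confidence level, and standard deviation is n = (z * sigma / E)^2, where E is the desired half-width of the confidence interval for the mean. The sketch below assumes that sigma is known and that the sample mean is approximately normal; the function name `required_sample_size` and the margin of 1 year are made up for illustration.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n for which the confidence interval for the mean has
    half-width <= margin, assuming a known sigma and a normal sample mean."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * sigma / margin) ** 2)

# With the sigma = 4 years from the question and a hypothetical margin of 1 year:
n_needed = required_sample_size(sigma=4.0, margin=1.0)
print(n_needed)  # 62
```

Halving the margin roughly quadruples the required sample size, since n scales with 1/E^2.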
 
  • #11
WWGD
Science Advisor
Gold Member
6,326
8,382
What I think is:

We only take a random sample from the population; it's practically impossible to measure the whole population.
So the standard deviation 4 means the standard deviation of the sample, not the standard deviation of the population.
I think it is assumed in the problem -- I don't know how realistically -- that the population s.d. is somehow known to be 4.
 
  • #12
StoneTemplePython
Science Advisor
Gold Member
1,260
597
The sample itself will (on average) have the same standard deviation as the population.

It's a technical point, but this is subtly wrong. On average your variance estimate should line up with that of the population. (Why? Because you have linearity with respect to independent samples/trials and estimating ##X^2## and ##X##, and can then apply weak law of large numbers.)

Setting aside the zero variance case (and assuming a finite second moment):

The square root function ##g(v) = \sqrt{v}## is strictly negative convex (i.e. strictly concave) over the positive numbers, which means ##E\big[g(v)\big] \lt g\big(E[v]\big)##. Hence if the variance is right on average, the standard deviation cannot be.

As a note to OP: this is one of many reasons to generally work with variance, not standard deviation.
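The effect can be seen numerically. The following sketch assumes normally distributed data with ##\sigma = 1## and a deliberately small sample size of 5 so that the bias is visible; the seed and trial count are arbitrary.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

SIGMA = 1.0     # true standard deviation (so the true variance is 1)
N = 5           # small sample size, chosen to make the bias visible
TRIALS = 20000  # number of repeated samples (arbitrary choice)

variances = []
sds = []
for _ in range(TRIALS):
    sample = [random.gauss(0.0, SIGMA) for _ in range(N)]
    v = statistics.variance(sample)  # unbiased estimator, divides by n - 1
    variances.append(v)
    sds.append(v ** 0.5)             # square root of the same estimate

mean_variance = statistics.fmean(variances)
mean_sd = statistics.fmean(sds)

print(f"average variance estimate: {mean_variance:.3f}  (true value 1)")
print(f"average SD estimate:       {mean_sd:.3f}  (true value 1)")
```

The averaged variance should land near 1, while the averaged standard deviation should stay noticeably below 1, as the strict inequality predicts.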
 
  • #13
Orodruin
It's a technical point, but this is subtly wrong. On average your variance estimate should line up with that of the population. (Why? Because you have linearity with respect to independent samples/trials and estimating ##X^2## and ##X##, and can then apply weak law of large numbers.)
I disagree, there is nothing subtle about it. I would agree that it is just "wrong". I glossed over this point based on the OP level, but errors should be corrected.
 
  • #14
ChrisVer
Gold Member
3,381
464
The square root function ##g(v) = \sqrt{v}## is strictly negative convex (i.e. strictly concave) over the positive numbers, which means ##E\big[g(v)\big] \lt g\big(E[v]\big)##. Hence if the variance is right on average, the standard deviation cannot be.
Hmm, why do you use < instead of <= ? In which case I don't quite agree with the "cannot be" (definite statement), and instead would say "may not be".
 
  • #15
StoneTemplePython
Hmm, why do you use < instead of <= ? In which case I don't quite agree with the "cannot be" (definite statement), and instead would say "may not be".

Look up strict convexity and its implications for the use of Jensen's inequality.
 
  • #16
ChrisVer
Look up strict convexity and its implications for the use of Jensen's inequality.
OK makes sense now, thanks.
 
  • #17
Stephen Tashi
Science Advisor
7,783
1,541
Relating statistical problems to the theory of probability is often difficult because of ambiguities in terminology. (These ambiguities are traditional in the field of statistics and not the fault of students.)

One difficulty is that a statistic computed from a "sample" of a given size can also be considered to be a "population".

If we consider the means of samples of size n to be a "population" , that population has a certain distribution. This distribution has its own standard deviation. With that interpretation, the statement:

I know that the standard deviation of sample = standard deviation of population divided by sqrt(n) ...

is correct if "standard deviation of sample" signifies "standard deviation of the population of sample means".

However, keep in mind that the phrases "sample standard deviation" and "standard deviation of the sample" can have other interpretations. They might refer to one of the various formulae for estimating the standard deviation of the population from the data in a sample. (Have you studied "estimators" yet?) They might also refer to a single number, such as in the statement "the standard deviation of the sample was 13.97".

(As an example of the tangles caused by ambiguous terminology, see the old thread: https://www.physicsforums.com/threads/standard-deviation-in-excel.371424/ )

The standard deviation of the mean is 4 / sqrt(64).

Why is that so?
I think it's wrong because we only picked 64 students out of the population, so the standard deviation we get is the standard deviation of the sample, not the standard deviation of the population.

By itself, the term "mean" is ambiguous. It might signify the mean of the distribution of the ages of students or it might refer to the mean of the population of means of samples of size n. You are correct that 4/sqrt(64) is not the standard deviation of the distribution of individual students' ages. The author is correct that 4/sqrt(64) is the standard deviation of the population of means of samples of size 64.
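With that reading, the textbook calculation can be sketched as follows. It assumes the mean of 64 ages is approximately normally distributed (the usual central limit theorem argument), which the question itself does not state; the variable names are only for illustration.

```python
from statistics import NormalDist

pop_mean, pop_sd, n = 22.3, 4.0, 64   # values from the question

# Standard deviation of the population of sample means: 4 / sqrt(64) = 0.5
standard_error = pop_sd / n ** 0.5

# How many standard errors the threshold 23 lies above the mean
z = (23 - pop_mean) / standard_error

# Upper-tail probability under the assumed normal approximation
p_greater = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, P(sample mean > 23) is about {p_greater:.4f}")
```

This gives z = 1.4 and a probability of roughly 0.08. Using 4 instead of 0.5 in the denominator would answer a different question, namely the probability that a single student's age exceeds 23.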
 
  • #18
Number Nine
813
25
The sample itself will (on average) have the same standard deviation as the population.

The "sample standard deviation" (the square root of the sum of squared deviations from the mean divided by n) is a biased estimator of the population SD. It's actually very difficult to construct an unbiased estimator of the standard deviation, even for a normal distribution.
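For the special case of normally distributed data, the size of the bias is known in closed form: with the n - 1 denominator, the expected value of the sample standard deviation s is c4(n) times sigma, where c4(n) = sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2). The sketch below only evaluates that factor; the function name `c4` follows the usual quality-control notation, and the correction does not carry over to other distributions.

```python
import math

def c4(n):
    """Bias factor E[s] / sigma for the (n - 1)-denominator sample SD,
    valid under the assumption of normally distributed data."""
    if n < 2:
        raise ValueError("need at least two observations")
    # Work with log-gamma so that large n does not overflow.
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)

for n in (2, 5, 10, 64):
    print(n, round(c4(n), 4))
```

The factor is about 0.80 for n = 2 and about 0.94 for n = 5, and it approaches 1 as n grows, so for a sample of 64 the bias is well under one percent.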
 
  • #19
Orodruin
The "sample standard deviation" (the square root of the sum of squared deviations from the mean divided by n) is a biased estimator of the population SD. It's actually very difficult to construct an unbiased estimator of the standard deviation, even for a normal distribution.
This was already discussed and sorted out. See posts #12 and #13.
 
