Standard Deviation Versus Sample Size & T-Distribution

Discussion Overview

The discussion revolves around the relationship between standard deviation, sample size, and the t-distribution. Participants explore the implications of sample size on the accuracy of standard deviation estimates and the conditions under which sample standard deviation may underestimate population standard deviation.

Discussion Character

  • Technical explanation
  • Debate/contested

Main Points Raised

  • Some participants express confusion about why the standard deviation of a t-distribution decreases with increasing degrees of freedom and sample size, despite the sample standard deviation potentially underestimating the population standard deviation.
  • It is noted that more data generally leads to a more accurate estimate of the true population standard deviation.
  • Participants discuss the conditions under which the sample standard deviation underestimates the population standard deviation, particularly when using the sample mean and dividing by n.
  • There is mention of Bessel's correction, where the sum of squares of deviations from the sample mean is divided by (n-1) to provide an unbiased estimate.
  • A minor point is raised that even the Bessel-corrected sample standard deviation (dividing by n-1) is a biased estimate of the population standard deviation, by Jensen's inequality, and that an unbiased estimator is hard to construct.
  • One participant acknowledges a correction to their prior statement, indicating a willingness to refine their understanding based on the discussion.
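For reference on the first point, a Student's t-distribution with ν > 2 degrees of freedom has variance ν/(ν-2), so its standard deviation shrinks toward 1 (the standard normal value) as ν grows. The sketch below only tabulates that standard formula; the function name `t_std` is made up for illustration and is not taken from the thread.

```python
import math

def t_std(df: float) -> float:
    """Standard deviation of a Student's t-distribution with df degrees of freedom."""
    if df <= 2:
        # The variance is infinite or undefined for df <= 2.
        raise ValueError("the variance is finite only for df > 2")
    return math.sqrt(df / (df - 2))

# The spread decreases toward 1 as the degrees of freedom increase.
for df in (3, 5, 10, 30, 100):
    print(df, round(t_std(df), 4))
```

One way to read this: the extra spread at small ν accounts for the noise in the estimated standard deviation, and that noise fades as the sample grows.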

Areas of Agreement / Disagreement

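Participants agree that more data gives a more accurate estimate of the population standard deviation and that Bessel's correction divides the sum of squared deviations from the sample mean by (n-1). The one contested claim, that dividing by (n-1) removes the underestimate of the standard deviation, is withdrawn after a correction. The thread does not explicitly return to the original question about the t-distribution.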

Contextual Notes

Limitations include the dependence on definitions of standard deviation and the conditions under which estimators are considered biased or unbiased. The discussion does not resolve the complexities surrounding these definitions and their implications.

OpheliaM
I don't understand why the standard deviation of a t-distribution decreases as the degrees of freedom (and thus the sample size) increase, when the sample standard deviation underestimates the population standard deviation.
 
FactChecker
OpheliaM said:
I don't understand why the standard deviation of a t-distribution decreases as the degrees of freedom (and thus the sample size) increase
More data tends to give a more accurate estimate of the true population standard deviation.
when the sample standard deviation underestimates the population standard deviation.
The sample standard deviation underestimates the population standard deviation if you use the sample mean and divide by n. If you use the true population mean and divide by n, or use the sample mean and divide by (n-1), that is not true. (CORRECTION: those choices make the variance estimate unbiased, but the standard deviation is still underestimated on average. See @Number Nine's post below.) For the degrees of freedom of the t-distribution, you should use the n or (n-1) that you divided by.

PS. Just to be more clear: the sample mean should always be the sum of the sample divided by n. When I say "use the sample mean and divide by (n-1)", I mean that the sum of squares of deviations from the sample mean is divided by (n-1). That is Bessel's correction (see https://en.wikipedia.org/wiki/Bessel's_correction ).
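The three choices of divisor can be compared with a small simulation. This is a minimal sketch, assuming normally distributed data with true variance 1; the function names, sample size, trial count, and seed are all illustrative choices, not anything specified in the thread. Note that it averages variance estimates, which is the quantity Bessel's correction makes unbiased.

```python
import random

def variance_estimates(sample, true_mean):
    """Return three variance estimates for one sample."""
    n = len(sample)
    xbar = sum(sample) / n  # the sample mean always divides by n
    ss_sample_mean = sum((x - xbar) ** 2 for x in sample)
    ss_true_mean = sum((x - true_mean) ** 2 for x in sample)
    return (
        ss_sample_mean / n,        # sample mean, divide by n (biased low)
        ss_sample_mean / (n - 1),  # sample mean, divide by n-1 (Bessel's correction)
        ss_true_mean / n,          # known population mean, divide by n
    )

def average_estimates(n=5, trials=50_000, mu=0.0, sigma=1.0, seed=0):
    """Average each estimator over many simulated samples of size n."""
    rng = random.Random(seed)
    totals = [0.0, 0.0, 0.0]
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        for i, value in enumerate(variance_estimates(sample, mu)):
            totals[i] += value
    return [total / trials for total in totals]

biased, bessel, known_mean = average_estimates()
print(f"sample mean, divide by n:     {biased:.3f}")
print(f"sample mean, divide by n-1:   {bessel:.3f}")
print(f"population mean, divide by n: {known_mean:.3f}")
```

With n = 5 and true variance 1, the first average should land near (n-1)/n = 0.8 and the other two near 1.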
 
Number Nine
FactChecker said:
More data tends to give a more accurate estimate of the true population standard deviation. The sample standard deviation underestimates the population standard deviation if you use the sample mean and divide by n. If you use the true population mean and divide by n, or use the sample mean and divide by (n-1), that is not true. For the degrees of freedom of the t-distribution, you should use the n or (n-1) that you divided by.

PS. Just to be more clear: the sample mean should always be the sum of the sample divided by n. When I say "use the sample mean and divide by (n-1)", I mean that the sum of squares of deviations from the sample mean is divided by (n-1). That is Bessel's correction (see https://en.wikipedia.org/wiki/Bessel's_correction ).

A minor point: the "sample standard deviation" (i.e. the square root of the sum of squared deviations from the sample mean, divided by n-1) is actually a biased estimate of the population standard deviation. Dividing by n-1 makes the variance estimate unbiased, but the square root is a concave function, so by Jensen's inequality the expected value of the square root falls below the true standard deviation. It's fairly difficult to find an unbiased estimator of a standard deviation in general, because the correction depends on both n and the shape of the distribution; see https://en.wikipedia.org/wiki/Unbia...deviation#Results_for_the_normal_distribution
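For normally distributed data specifically, the size of this bias is usually written as E[s] = c4(n)·σ, where s divides by n-1 and c4(n) = sqrt(2/(n-1)) · Γ(n/2) / Γ((n-1)/2). The sketch below evaluates that standard factor; the function name `c4` follows the common quality-control notation rather than anything in this thread, and the formula should not be assumed to hold for non-normal data.

```python
import math

def c4(n: int) -> float:
    """Ratio E[s] / sigma for a normal sample of size n, with s using the n-1 divisor."""
    if n < 2:
        raise ValueError("need at least two observations")
    # Work with log-gamma so large n does not overflow.
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)

# c4(n) is below 1 for every n and approaches 1 as n grows,
# so dividing s by c4(n) gives an unbiased estimate in the normal case.
for n in (2, 5, 10, 30, 100):
    print(n, round(c4(n), 4))
```

The factor stays below 1, which is the underestimate described above, and it moves toward 1 as the sample size grows.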
 
FactChecker
Number Nine said:
A minor point: the "sample standard deviation" (i.e. the square root of the sum of squared deviations from the sample mean, divided by n-1) is actually a biased estimate of the population standard deviation. Dividing by n-1 makes the variance estimate unbiased, but the square root is a concave function, so by Jensen's inequality the expected value of the square root falls below the true standard deviation. It's fairly difficult to find an unbiased estimator of a standard deviation in general, because the correction depends on both n and the shape of the distribution; see https://en.wikipedia.org/wiki/Unbia...deviation#Results_for_the_normal_distribution
I stand corrected. Thanks. I will correct my prior post.
 
