Standard Deviation Versus Sample Size & T-Distribution

AI Thread Summary
The discussion centers on the relationship between standard deviation, sample size, and the t-distribution. As the degrees of freedom increase with larger sample sizes, the standard deviation of the t-distribution decreases, reflecting more accurate estimates of the population standard deviation. Dividing the sum of squared deviations from the sample mean by n underestimates the population variance, while dividing by (n-1) applies Bessel's correction and yields an unbiased variance estimate. Even so, the sample standard deviation computed with (n-1) remains a biased estimate of the population standard deviation, by Jensen's inequality, which complicates the search for an unbiased estimator. Understanding these nuances is important for accurate statistical analysis.
OpheliaM
I don't understand why the standard deviation of a t-distribution decreases as the degrees of freedom (and thus also the sample size) increase, when the sample standard deviation underestimates the population standard deviation?
 
OpheliaM said:
I don't understand why the standard deviation of a t-distribution decreases as the degrees of freedom (and thus also the sample size) increase
More data tends to give a more accurate estimate of the true population standard deviation.
when the sample standard deviation underestimates the population standard deviation?
The sample standard deviation underestimates the population standard deviation if you use the sample mean and divide by n. If you use the true population mean and divide by n, or use the sample mean and divide by (n-1), that is not true. (CORRECTION: it is still under-estimated. See @Number Nine 's post below.) For the degrees of freedom of the t-distribution, you should use the n or (n-1) that you divided by.
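As an illustration of the first point (this sketch is mine, not from the thread): the standard deviation of a t-distribution with ν > 2 degrees of freedom is √(ν/(ν−2)), which shrinks toward 1 (the standard normal's value) as ν grows:

```python
import math

# Standard deviation of Student's t-distribution with nu degrees of freedom.
# It is only defined for nu > 2 and equals sqrt(nu / (nu - 2)),
# which decreases monotonically toward 1 as nu increases.
def t_std(nu):
    if nu <= 2:
        raise ValueError("standard deviation is undefined for nu <= 2")
    return math.sqrt(nu / (nu - 2))

for nu in (3, 5, 10, 30, 100):
    print(nu, round(t_std(nu), 4))
# nu = 3 gives sqrt(3) ~ 1.7321; nu = 100 gives ~ 1.0102
```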

PS. Just to be more clear: the sample mean should always be the sum of the sample divided by n. When I say "use the sample mean and divide by (n-1)", I mean that the sum of squared deviations from the sample mean is divided by (n-1). That is Bessel's correction (see https://en.wikipedia.org/wiki/Bessel's_correction ).
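A quick Monte Carlo sketch of Bessel's correction (my own illustration, with an arbitrary true variance of 4 and sample size 5): dividing by n is biased low, dividing by (n-1) is not:

```python
import random

# Compare dividing by n vs (n-1) when estimating the variance of a
# normal population (true sigma^2 = 4) from many small samples.
random.seed(0)
n, trials = 5, 200_000

biased_sum = unbiased_sum = 0.0
for _ in range(trials):
    xs = [random.gauss(0.0, 2.0) for _ in range(n)]
    m = sum(xs) / n                      # sample mean always divides by n
    ss = sum((x - m) ** 2 for x in xs)   # sum of squared deviations
    biased_sum += ss / n                 # divide by n: underestimates
    unbiased_sum += ss / (n - 1)         # Bessel's correction

print(biased_sum / trials)    # close to 3.2, i.e. sigma^2 * (n-1)/n
print(unbiased_sum / trials)  # close to 4.0
```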
 
FactChecker said:
The sample standard deviation underestimates the population standard deviation if you use the sample mean and divide by n. ... That is Bessel's correction.

A minor point: the sample standard deviation (i.e. the square root of the sum of squared deviations from the mean, divided by n-1) is actually a biased estimate of the population standard deviation. This follows from Jensen's inequality, since the square root is a concave function. It's fairly difficult to find an unbiased estimator of a normal standard deviation, and the corrections have no simple closed form -- see https://en.wikipedia.org/wiki/Unbia...deviation#Results_for_the_normal_distribution
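This can be checked by simulation (my own sketch, with an arbitrary true σ = 2 and sample size 5): even with Bessel's correction, the average of s falls short of σ because the square root is concave:

```python
import random, math

# Even with the (n-1) divisor, E[s] < sigma by Jensen's inequality.
# For a normal population with n = 5, E[s] = c4(5) * sigma ~ 0.94 * sigma.
random.seed(1)
n, sigma, trials = 5, 2.0, 200_000

s_sum = 0.0
for _ in range(trials):
    xs = [random.gauss(0.0, sigma) for _ in range(n)]
    m = sum(xs) / n
    s_sum += math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))

print(s_sum / trials)  # noticeably below sigma = 2 (around 1.88)
```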
 
Number Nine said:
A minor point: ... the square root of the sum of squared deviations from the mean, divided by n-1, is actually a biased estimate of the standard deviation.
I stand corrected. Thanks. I will correct my prior post.
 