Normal Distribution v. Student's T Distribution

AI Thread Summary
The Empirical Rule says that for normally distributed data, 95.45% of values lie within two standard deviations of the mean, with no reference to sample size. The Student's T Distribution, by contrast, depends on sample size through its degrees of freedom: at the same .0455 two-tailed level, the multiplier is about 2.14 for N=20 (19 degrees of freedom) and approaches 2 as N grows. The key difference is that the normal-distribution result treats the population parameters as known, while the Student's T Distribution applies when the mean and standard deviation are estimated from sample data. The distinction matters most when sample sizes are small.
kimberley
The "Empirical Rule" states that if your data is normally distributed, 95.45% of that data should fall within "2" standard deviations of your Mean. There doesn't appear to be any reference to sample size in the literature regarding the Empirical Rule and a Normal Distribution.

By contrast, the Student's T Distribution table for a two-tailed test has multipliers that differ from the Empirical Rule. Where N=10000 (9999 degrees of freedom), the multiplier at the .0455 level is "2" sd, matching the Empirical Rule, but where N=20 (19 degrees of freedom), the multiplier at the .0455 level is "2.14" sd.

In sum, then, I don't understand the difference between the "normal distribution" and the "Student's T-Distribution". Is the difference that the Empirical Rule assumes that your data is both normal and "stationary" whereas the Student's T Distribution (i.e., degrees of freedom) assumes that your data is not stationary and that your Mean and Standard Deviations for any period of N will shift with the addition of new data? It's the only thing I can think of since the formulas for confidence intervals for Means and prediction intervals for individual outcomes use the numbers from the Student's T-Distribution.

Thanks in advance.

Kimberley
 
Wikipedia said:
Student's distribution arises when (as in nearly all practical statistical work) the population standard deviation is unknown and has to be estimated from the data. Textbook problems treating the standard deviation as if it were known are of two kinds: (1) those in which the sample size is so large that one may treat a data-based estimate of the variance as if it were certain, and (2) those that illustrate mathematical reasoning, in which the problem of estimating the standard deviation is temporarily ignored because that is not the point that the author or instructor is then explaining.
http://en.wikipedia.org/wiki/T_distribution
http://en.wikipedia.org/wiki/Normal_distribution
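
For anyone who wants to check the multipliers quoted in the question without a printed table, here is a minimal sketch using only the Python standard library. The helper names (t_pdf, t_central_prob, t_two_sided_multiplier) are made up for this post, and the t quantile is found by plain numerical integration plus bisection rather than by any library routine, so treat it as an illustration and not a reference implementation.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_central_prob(c, df, steps=2000):
    """P(-c <= T <= c): Simpson's rule on [0, c], doubled by symmetry."""
    h = c / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(c, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3


def t_two_sided_multiplier(alpha, df):
    """Find c such that P(|T| > c) = alpha, by bisection on c."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_central_prob(mid, df) > alpha:
            lo = mid  # tail still too heavy, so c must be larger
        else:
            hi = mid
    return (lo + hi) / 2


# Two-tailed level implied by "within 2 sd" under the normal distribution.
ALPHA = 2 * (1 - NormalDist().cdf(2.0))

z_multiplier = NormalDist().inv_cdf(1 - ALPHA / 2)
t_n20 = t_two_sided_multiplier(ALPHA, 19)
t_n10000 = t_two_sided_multiplier(ALPHA, 9999)

print(f"alpha = {ALPHA:.4f}")
print(f"normal multiplier      = {z_multiplier:.3f}")
print(f"t multiplier, df=19    = {t_n20:.3f}")
print(f"t multiplier, df=9999  = {t_n10000:.3f}")
```

The df=19 value should land near the 2.14 from the table, and the df=9999 value should be almost indistinguishable from the normal multiplier of 2, which is case (1) in the Wikipedia quote above.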
 
I believe that the Student's t distribution does not assume that the sample mean is the true (underlying) mean. So it is not just the variance or SD that is taken from the data, and I would say that the fact that the sample mean is used is more important than the fact that the standard deviation is estimated from the data.
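
A quick simulation may make this concrete. The sketch below is my own illustration, not anything from the Wikipedia article: it assumes the data really are normal with a fixed true mean and SD (so nothing is "non-stationary"), draws many samples of N=20, estimates the mean and SD from each sample, and counts how often the true mean falls within multiplier * s / sqrt(N) of the sample mean. The function name and the constants TRUE_MEAN and TRUE_SD are hypothetical choices for the example.

```python
import math
import random

TRUE_MEAN = 0.0  # assumed population mean for the simulation
TRUE_SD = 1.0    # assumed population standard deviation


def coverage_of_mean_interval(n, multipliers, trials=20000, seed=0):
    """Fraction of samples whose interval mean +/- m*s/sqrt(n) covers TRUE_MEAN."""
    rng = random.Random(seed)
    hits = [0] * len(multipliers)
    for _ in range(trials):
        sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n)]
        mean = sum(sample) / n
        # Sample SD with the n - 1 divisor, i.e. estimated from the data.
        sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
        t_stat = abs(mean - TRUE_MEAN) / (sd / math.sqrt(n))
        for i, m in enumerate(multipliers):
            if t_stat <= m:
                hits[i] += 1
    return [h / trials for h in hits]


cov_2, cov_214 = coverage_of_mean_interval(20, [2.0, 2.14])
print(f"coverage with multiplier 2.00: {cov_2:.4f}")
print(f"coverage with multiplier 2.14: {cov_214:.4f}")
```

With N=20 the multiplier of 2 should cover the true mean noticeably less often than 95.45% of the time, while 2.14 should come out close to it. The data are perfectly normal and stable throughout; the wider multiplier only compensates for using estimates from a small sample.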
 