Why can the t statistic deal with small samples?

  • Context: Undergrad
  • Thread starter: thrillhouse86
  • Tags: Numbers, Statistic

SUMMARY

The discussion clarifies the distinction between t-tests and z-tests, emphasizing that t-tests are appropriate for small sample sizes because they account for the extra variability of a sample estimate of the population variance, while z-tests assume the population variance is known (or the sample large enough that the estimate is effectively exact). The historical context highlights that Gosset developed the t-procedure after observing that normal-distribution methods gave results that didn't match experiment for small samples. As the sample size increases, the sample estimate of the population variance approaches the true population variance, which is why the t- and z-procedures agree for large samples.

PREREQUISITES
  • Understanding of basic statistical concepts, including hypothesis testing
  • Familiarity with t-distribution and z-distribution
  • Knowledge of sample mean and standard deviation calculations
  • Awareness of the central limit theorem
NEXT STEPS
  • Study the derivation and application of the t-test and z-test in statistical analysis
  • Explore the implications of the central limit theorem in practical scenarios
  • Learn about confidence intervals and their calculation using both t and z distributions
  • Investigate the historical development of statistical methods, focusing on Gosset's contributions
USEFUL FOR

Statisticians, data analysts, researchers, and students seeking to deepen their understanding of statistical testing methods and their appropriate applications based on sample size.

thrillhouse86
Hi,

I've been trying to get my head around z and t statistics, and I almost have a mantra in my head: "when the samples are small, use the t test; when the samples are big, use either the t or the z test".

Now, as I understand it, the z test requires a large number of samples because it assumes a normal distribution, and you need a certain number of samples before the distribution of the sample mean starts to look normal.

But why does the t test allow us to deal with smaller samples? What does it have (or what assumptions doesn't it have) that allows it to deal with smaller samples?

Is it that in the z test the standard error of the mean is determined from the KNOWN population variance, whereas in the t test it is determined from an ESTIMATE of the population variance, and in the limit of a large number of samples the ESTIMATE of the population variance will approach the TRUE population variance?

If this is indeed the case, does the central limit theorem show us that in the limit of a large number of samples the estimate of the population variance will approach the true population variance?

Thanks
 
"When the samples are small, use the t test; when the samples are big, use either the t or the z test" is really old advice, originating before calculators and even computers brought us cheap and easy calculation.

The origin of the idea: Gosset (the developer of the t-procedure) found that when sample sizes were small, estimates based on the normal distribution (the Z-test and Z-confidence interval) gave results that didn't match experimental observations. (Statistics using the sample mean and standard deviation were more variable for small samples than the normal distribution "expected".) Methods based on the t-distribution were developed empirically to circumvent this. By the time sample sizes were around 30, results from the two procedures were in general agreement, and this observation grew into the "small sample size" vs. "large sample size" distinction. That was convenient: suppose you always create a 95% confidence interval. With t-distribution intervals you need a different critical value for each sample size, which many years ago meant consulting tables; with Z-distribution intervals a single critical value works no matter what the sample size.
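The agreement of the two procedures by around n = 30 is easy to see by comparing critical values directly. A minimal sketch, assuming scipy is available (the two-sided 95% level is an illustrative choice):

```python
# Compare two-sided 95% critical values: the t value depends on the
# sample size through df = n - 1, the z value never changes.
from scipy import stats

z_crit = stats.norm.ppf(0.975)             # ~1.96 for every sample size
for n in (5, 10, 30, 100, 1000):
    t_crit = stats.t.ppf(0.975, df=n - 1)  # wider for small n
    print(f"n={n:5d}  t={t_crit:.4f}  z={z_crit:.4f}")
```

The t critical value is noticeably larger for small n (reflecting the extra uncertainty in s) and is already close to the z value by n = 30.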

"Is it that in the z test the standard error of the mean distribution is determined from the KNOWN population variance"
It doesn't have to be - that really was the point of the distinction. If you are performing a hypothesis test, and all you have are the sample size, sample mean, and sample standard deviation, the form of the test statistic is

$$\frac{\overline{x} - \mu_0}{s/\sqrt{n}}$$

For "small samples" you would compare this to critical values from the appropriate t-distribution; for "large samples" you compare it to a value from the standard normal distribution.

If you actually had the true population standard deviation, and you were sure the underlying population was normal, then the test statistic would be

$$\frac{\overline{x} - \mu_0}{\sigma/\sqrt{n}}$$

and you would compare it to a critical value from the normal distribution.
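The two statistics can be computed side by side. A minimal sketch, assuming scipy is available; the sample values, mu0, and the "known" sigma below are made-up illustrations, not numbers from this thread:

```python
# One-sample test of H0: population mean = mu0, computed both ways.
import math
from scipy import stats

sample = [5.1, 4.8, 5.4, 5.0, 4.7, 5.3, 4.9, 5.2]   # illustrative data
n = len(sample)
xbar = sum(sample) / n
s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))  # sample sd
mu0 = 5.0

# t statistic: sigma unknown, estimated by s; compare to t with n-1 df
t_stat = (xbar - mu0) / (s / math.sqrt(n))
p_t = 2 * stats.t.sf(abs(t_stat), df=n - 1)

# z statistic: only valid if the population sd really is known
sigma = 0.25                     # assumed known, for illustration only
z_stat = (xbar - mu0) / (sigma / math.sqrt(n))
p_z = 2 * stats.norm.sf(abs(z_stat))

print(f"t = {t_stat:.4f} (p = {p_t:.4f}),  z = {z_stat:.4f} (p = {p_z:.4f})")
```

The only structural difference is the denominator: s (an estimate, so extra variability, so the heavier-tailed t reference distribution) versus sigma (known, so the normal reference distribution).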

"If this is indeed the case, does the central limit theorem show us that in the limit of a large number of samples the estimate of the population variance will approach the true population variance ?"

Yes. (Strictly speaking, it is the law of large numbers that guarantees the sample variance converges to the population variance; the central limit theorem is what makes the distribution of the sample mean approximately normal.)
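That convergence is easy to watch in simulation. A minimal sketch using only the standard library; the normal population with standard deviation 2.0 is an illustrative assumption:

```python
# The sample variance s^2 is a consistent estimator: computed from n
# draws, it settles near the true population variance as n grows.
import random
import statistics

random.seed(0)
sigma2 = 4.0                              # true variance (sd = 2.0)
for n in (10, 100, 10_000):
    draws = [random.gauss(0.0, 2.0) for _ in range(n)]
    s2 = statistics.variance(draws)       # unbiased (n - 1) estimator
    print(f"n={n:6d}  s^2={s2:.3f}  (true sigma^2 = {sigma2})")
```

For small n the estimate scatters widely around 4.0, which is exactly the extra variability the t-distribution's heavier tails absorb.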
 
