# How to decide between using the z and t tests?

In summary: Use the z-test when the population variance is known and the t-test when it has to be estimated from the sample. Both tests assume that the sample mean is normally distributed, at least approximately. A population that is not normal, such as a uniformly distributed X, does not by itself rule the tests out: the mean Y of several independent samples of X is closer to normal than X is, and the Central Limit Theorem makes the approximation better as the sample size grows.

#### songoku

Let's say the math scores of the students in a class are normally distributed. The teacher wants to check whether the average score of the students is below 6. Five students are chosen at random and their math scores are noted.

For the case above, is a z-test or a t-test more appropriate? The population is normally distributed, so the sample is also normally distributed, so the z-test is used? Or, because the variance of the population is unknown and the sample size is small, is the t-test more appropriate?

Thanks

Do you know the variance of the population, or are you estimating it from the sample?

Dale said:
Do you know the variance of the population, or are you estimating it from the sample?

I am estimating it from the sample. In this case, I should use the t-test because the variance of the population is unknown? Can I use the z-test because the sample and population are normally distributed, even though the sample size is small?

Thanks

songoku said:
I am estimating it from the sample. In this case, I should use the t-test because the variance of the population is unknown?
Yes

songoku said:
Can I use the z-test because the sample and population are normally distributed, even though the sample size is small?
No. The sample variance for the t test is not the same as the population variance for the z test.

Dale said:
No. The sample variance for the t test is not the same as the population variance for the z test.

So I can say that the factor to choose between z - test or t - test is whether the population variance is known or not? If population variance is known, I use z - test and if not, I use t - test, independent of the type of distribution of the population?

Thanks

songoku said:
So I can say that the factor to choose between z - test or t - test is whether the population variance is known or not? If population variance is known, I use z - test and if not, I use t - test, independent of the type of distribution of the population?

Thanks
Yes, almost. They both need to have a normal distribution. The z test is for a normal distribution with known variance and the t test is for normal distribution with unknown variance.
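
The rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five scores are made up, the "known" population standard deviation of 0.57 is hypothetical, and the t critical value for 4 degrees of freedom (one-sided, 5% level) is taken from a standard table because the Python standard library has no Student's t distribution.

```python
import math
import statistics

def z_statistic(sample, mu0, sigma):
    """z statistic: uses a KNOWN population standard deviation sigma."""
    n = len(sample)
    return (statistics.mean(sample) - mu0) / (sigma / math.sqrt(n))

def t_statistic(sample, mu0):
    """t statistic: estimates the standard deviation from the sample (n - 1 divisor)."""
    n = len(sample)
    s = statistics.stdev(sample)
    return (statistics.mean(sample) - mu0) / (s / math.sqrt(n))

# Hypothetical scores for the five students (made up for illustration).
scores = [5.1, 6.2, 4.8, 5.5, 5.9]
mu0 = 6.0  # H0: mean = 6, H1: mean < 6

t_stat = t_statistic(scores, mu0)
# One-sided 5% critical value of Student's t with n - 1 = 4 degrees of freedom.
T_CRIT_DF4 = -2.132
reject_t = t_stat < T_CRIT_DF4

# If sigma were genuinely known (0.57 is an assumption), the z-test would apply.
z_stat = z_statistic(scores, mu0, sigma=0.57)
Z_CRIT = statistics.NormalDist().inv_cdf(0.05)  # about -1.645
reject_z = z_stat < Z_CRIT

print(f"t = {t_stat:.3f}, reject H0 with t-test: {reject_t}")
print(f"z = {z_stat:.3f}, reject H0 with z-test: {reject_z}")
```

The two statistics are almost identical here, but the t critical value is further from zero than the z critical value, so treating an estimated standard deviation as if it were known can change the conclusion for a sample this small.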

Dale said:
Yes, almost. They both need to have a normal distribution. The z test is for a normal distribution with known variance and the t test is for normal distribution with unknown variance.

If the population is not normally distributed, t - test and z - test both can not be used to test the mean and we should use another type of test?

Thanks

songoku said:
If the population is not normally distributed, t - test and z - test both can not be used to test the mean and we should use another type of test?
Yes, that is correct

Thank you very much

songoku said:
If the population is not normally distributed, t - test and z - test both can not be used to test the mean and we should use another type of test?

Let's hope that is not a general principle because we see the t-test used by respectable people on populations that aren't normally distributed when large samples are involved.

According to the current Wikipedia article for the Student's T-test, it is the sample mean that is assumed to be normally distributed, not the population itself. If you have a large sample size, the sample mean will have an approximately normal distribution.

When not to use a t-test is discussed on the blog http://thestatsgeek.com/2013/09/28/the-t-test-and-robustness-to-non-normality/

(I, myself, haven't tried to verify the Wikipedia article or the blog. )

Stephen Tashi said:
Let's hope that is not a general principle because we see the t-test used by respectable people on populations that aren't normally distributed when large samples are involved
It is a general principle, and such tests can indeed be misused. But ...

Stephen Tashi said:
it is the sample mean that is assumed to be normally distributed, not the population itself.
Yes. Suppose you have a random variable, X, which is uniformly distributed, then you can also form a random variable, Y, which is the mean of 5 samples of X. Then it would be appropriate to use a t test on Y, but not on X.

Stephen Tashi said:
According to the current Wikipedia article for the Student's T-test, it is the sample mean that is assumed to be normally distributed, not the population itself. If you have a large sample size, the sample mean will have an approximately normal distribution.
Ah I see, so that's what central limit theorem is for.

Dale said:
Yes. Suppose you have a random variable, X, which is uniformly distributed, then you can also form a random variable, Y, which is the mean of 5 samples of X. Then it would be appropriate to use a t test on Y, but not on X.

By "uniformly distributed", do you mean normally distributed or it can be any type of distributions?

Thanks

songoku said:
Ah I see, so that's what central limit theorem is for.
By "uniformly distributed", do you mean normally distributed or it can be any type of distributions?

Thanks

He meant that it follows the uniform distribution.

songoku said:
By "uniformly distributed", do you mean normally distributed or it can be any type of distributions?
The uniform distribution is not the normal distribution. The normal distribution is a Gaussian curve. The uniform distribution is a rectangle. But yes, it was just an example of a non-normal distribution

I am really sorry for late reply.

Dale said:
Yes. Suppose you have a random variable, X, which is uniformly distributed, then you can also form a random variable, Y, which is the mean of 5 samples of X. Then it would be appropriate to use a t test on Y, but not on X.
Dale said:
The uniform distribution is not the normal distribution. The normal distribution is a Gaussian curve. The uniform distribution is a rectangle. But yes, it was just an example of a non-normal distribution

So X is uniformly distributed and Y is mean of 5 samples of X. Will Y also have uniform distribution? If yes, it means that we can use t - test for other distribution besides normal?

Thanks

songoku said:
I am really sorry for late reply.

So X is uniformly distributed and Y is mean of 5 samples of X. Will Y also have uniform distribution? If yes, it means that we can use t - test for other distribution besides normal?

Thanks

Y would have the Irwin-Hall distribution (given that the Xi's are truly independent). And yes, you can use it for non-normal samples, as long as the sample mean is (approximately) normally distributed. For this to happen, usually it has to be a random sample (i.e. a sample made of independent random variables), with identically distributed random variables, and the sample size has to be >= 30, so that the Central Limit Theorem applies (approximately). There are also other conditions, such as the random variables having 1st and 2nd order moments (expected value and variance), both of which could be violated if you were dealing with distributions such as the Cauchy distribution, for example. This is the simpler version of the CLT: there are also other versions of the CLT which don't require some of these conditions, but they're replaced with different ones instead.

songoku said:
Will Y also have uniform distribution?
No, Y will be approximately normally distributed. It will not be uniformly distributed

ZeGato said:
Y would have the Irwin-Hall distribution (given that the Xi's are truly independent). And yes, you can use it for non-normal samples, as long as the sample mean is (approximately) normally distributed. For this to happen, usually it has to be a random sample (i.e. a sample made of independent random variables), with identically distributed random variables, and the sample size has to be >= 30, so that the Central Limit Theorem applies (approximately). There are also other conditions, such as the random variables having 1st and 2nd order moments (expected value and variance), both of which could be violated if you were dealing with distributions such as the Cauchy distribution, for example. This is the simpler version of the CLT: there are also other versions of the CLT which don't require some of these conditions, but they're replaced with different ones instead.
Dale said:
No, Y will be approximately normally distributed. It will not be uniformly distributed

Sorry I am confused. Y would have Irwin - Hall distribution or normal distribution?

Suppose I have a random variable X, which has a Poisson distribution, and Y, which is the mean of 5 samples of X. Will Y have a normal distribution or a Poisson distribution?

Thanks

songoku said:
Sorry I am confused. Y would have Irwin - Hall distribution or normal distribution?
It will have a Bates distribution which is approximately normal for large n.
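
A small simulation can make this concrete. The sketch below assumes X is Uniform(0, 1) and compares X with Y, the mean of 5 independent draws of X. The helper names, the seed, and the number of trials are arbitrary choices for illustration. It uses one simple diagnostic: the share of values within one standard deviation of the mean, which is about 0.58 for a uniform distribution and about 0.68 for a normal one.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

def mean_of_uniform(n):
    """Y: the mean of n independent Uniform(0, 1) draws."""
    return sum(random.random() for _ in range(n)) / n

def fraction_within_one_sd(values):
    """Share of values within one standard deviation of their mean."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(abs(v - m) <= s for v in values) / len(values)

trials = 20000
x_draws = [random.random() for _ in range(trials)]     # X itself
y_draws = [mean_of_uniform(5) for _ in range(trials)]  # Y, the mean of 5 X's

frac_x = fraction_within_one_sd(x_draws)
frac_y = fraction_within_one_sd(y_draws)
sd_y = statistics.pstdev(y_draws)

print(f"X: share within one sd = {frac_x:.3f}")
print(f"Y: share within one sd = {frac_y:.3f}")
print(f"sd of Y = {sd_y:.4f} (theory: sqrt(1/60) = {(1 / 60) ** 0.5:.4f})")
```

Even with only 5 draws per mean, Y behaves much more like a normal variable than like a uniform one, although the match is only approximate at this sample size.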

songoku said:
So X is uniformly distributed and Y is mean of 5 samples of X. Will Y also have uniform distribution?

Have you studied how to compute the probability distribution of the sum of two independent random variables in terms of their individual distributions? You need to understand the fundamental facts about that situation in order to understand the facts about the distribution of a sample mean.

Dale said:
It will have a Bates distribution which is approximately normal for large n.

By "for large n", you mean so that CLT can be applied?

Stephen Tashi said:
Have you studied how to compute the probability distribution of the sum of two independent random variables in terms of their individual distributions? You need to understand the fundamental facts about that situation in order to understand the facts about the distribution of a sample mean.
I am not sure what you mean. What I have learned is linear combination of random variables such as:

E(aX +b) = aE(X) + b
E(X + Y) = E(X) + E(Y)
Var(aX + b) = a^2 Var(X)
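
These rules are already enough to obtain the mean and variance of a sample mean, though not its full distribution. The derivation below is a sketch under two assumptions that are not in the list above: the draws $X_1, \dots, X_n$ are independent with common mean $\mu$ and variance $\sigma^2$, and variances of independent variables add.

```latex
\begin{align*}
\bar{X} &= \frac{1}{n}\sum_{i=1}^{n} X_i \\
E(\bar{X}) &= \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \mu \\
\operatorname{Var}(\bar{X}) &= \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i) = \frac{\sigma^2}{n}
\end{align*}
```

The quantity $\sigma/\sqrt{n}$ is the standard error that appears in the denominator of the z statistic. The t statistic replaces $\sigma$ with the sample standard deviation.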

songoku said:
I am not sure what you mean.

The distribution of the sum of two independent random variables is computed by taking the "convolution" of their individual distributions. Perhaps you haven't studied that yet.
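
For reference, the convolution mentioned here can be written as follows. The notation is an assumption for illustration: $f_X$ and $f_Y$ are the densities of two independent continuous random variables, and the second line is the analogue for discrete variables.

```latex
\begin{align*}
f_{X+Y}(z) &= \int_{-\infty}^{\infty} f_X(x)\, f_Y(z - x)\, dx \\
P(X + Y = z) &= \sum_{x} P(X = x)\, P(Y = z - x)
\end{align*}
```

The sample mean is the sum divided by $n$, so its distribution follows from repeated convolution and a rescaling.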

Stephen Tashi said:
The distribution of the sum of two independent random variables is computed by taking the "convolution" of their individual distributions. Perhaps you haven't studied that yet.

No, I haven't studied that yet, so it means there are some parts I won't be able to understand right now. What parts are they?

Thanks

songoku said:
No, I haven't studied that yet, so it means there are some parts I won't be able to understand right now. What parts are they?

Thanks

You won't be able to understand how the distribution of the sample mean is related to the distribution of the population from which the samples are taken. So you won't understand which statistical tests can be applied to the sample mean.

Stephen Tashi said:
You won't be able to understand how the distribution of the sample mean is related to the distribution of the population from which the samples are taken. So you won't understand which statistical tests can be applied to the sample mean.

I see. I always think like this: if X is random variable and Y is mean of sample taken from X, Y will always have same distribution as X. So this is wrong?

Thanks

Of course, there are also alternatives that can be used for non-normally distributed data, such as the Mann-Whitney U test (also known as the Wilcoxon rank-sum test).

songoku said:
I always think like this: if X is random variable and Y is mean of sample taken from X, Y will always have same distribution as X. So this is wrong?
Yes. Consider X to be the roll of a standard die. Its distribution assigns probability 1/6 to each of 1, 2, 3, 4, 5, 6 and probability 0 to anything else. If Y is the mean of two samples of X, then the most likely outcome is Y = 3.5. The value 3.5 has probability 0 for X and the highest probability for Y, so clearly the distributions are different.
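
The dice example can be checked exactly by enumerating all 36 equally likely pairs of rolls. The sketch below uses only the Python standard library. The variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# X: a single roll, each face has probability 1/6.
pmf_x = {Fraction(f): Fraction(1, 6) for f in faces}

# Y: the mean of two independent rolls, found by enumerating all 36 pairs.
counts = Counter(Fraction(a + b, 2) for a, b in product(faces, repeat=2))
pmf_y = {value: Fraction(c, 36) for value, c in counts.items()}

most_likely = max(pmf_y, key=pmf_y.get)
print("P(X = 3.5) =", pmf_x.get(Fraction(7, 2), Fraction(0)))
print("P(Y = 3.5) =", pmf_y[Fraction(7, 2)])
print("most likely value of Y:", float(most_likely))
```

The distribution of Y is triangular, peaking at 3.5, whereas X is flat over the six faces.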

I think I get it for now.

Thank you very much for all the help (Dale, Stephen Tashi, ZeGato, DrDu)

## 1. What is the difference between the z-test and t-test?

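The main difference between the z-test and t-test is how the standard deviation enters the test statistic. The z-test is used for comparing means when the population standard deviation is known, so the statistic follows a standard normal distribution. The t-test is used when the population standard deviation is unknown and is estimated from the sample, so the statistic follows Student's t distribution with n - 1 degrees of freedom. The distinction matters most for small samples, because the t distribution approaches the normal distribution as the sample size grows.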

## 2. When should I use a z-test?

A z-test should be used when the population standard deviation is known and the sample mean is normally distributed. This holds exactly when the data are normally distributed, and approximately when the sample is large (a common rule of thumb is 30 or more), so that the Central Limit Theorem applies.

## 3. When should I use a t-test?

A t-test should be used when the population standard deviation is unknown and must be estimated from the sample, and the sample mean is at least approximately normally distributed. For small samples (typically fewer than 30), this requires the data themselves to be close to normal. For large samples the t-test is still valid, and its results approach those of the z-test.

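## 4. Can I use a z-test or t-test for non-normal data?

Both the z-test and the t-test are parametric tests that assume the sample mean is normally distributed. With a small sample from a clearly non-normal population, that assumption fails, and a non-parametric alternative should be used instead: for example, the Wilcoxon signed-rank test for one sample or paired samples, or the Mann-Whitney U test (also called the Wilcoxon rank-sum test) for two independent samples. With large samples, the Central Limit Theorem often makes the z-test or t-test a reasonable approximation.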

## 5. How do I determine which test is appropriate for my data?

The choice between a z-test and a t-test depends mainly on whether the population standard deviation is known (z-test) or must be estimated from the sample (t-test). In either case, check that the sample mean can reasonably be treated as normally distributed, either because the data are normal or because the sample is large enough for the Central Limit Theorem to apply.
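
As a rough summary, the decision can be written as a tiny helper. The function `choose_test` and its parameter names are hypothetical and only encode the rule of thumb discussed in this thread. Judging whether the sample mean is approximately normal is left to the user.

```python
def choose_test(population_sd_known: bool, mean_approx_normal: bool) -> str:
    """Hypothetical helper encoding the rule of thumb from this thread."""
    if not mean_approx_normal:
        # Neither test's assumption holds; consider a non-parametric test.
        return "non-parametric test"
    return "z-test" if population_sd_known else "t-test"

# The classroom example: normal population, variance estimated from 5 scores.
print(choose_test(population_sd_known=False, mean_approx_normal=True))
```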