How to decide between using the z and t tests?

songoku · Sep 12, 2018

Let's say the math scores of the students in a class are normally distributed. The teacher wants to check whether the average score of the students is below 6. Five students are chosen at random and their math scores are noted.

For the case above, is a z-test or a t-test more appropriate? The population is normally distributed, so the sample is also normally distributed; does that mean a z-test should be used? Or, because the variance of the population is unknown and the sample size is small, is a t-test more appropriate?

Thanks

Dale · Sep 12, 2018

Do you know the variance of the population, or are you estimating it from the sample?

songoku · Sep 12, 2018

Dale said:

Do you know the variance of the population, or are you estimating it from the sample?

I am estimating it from the sample. In this case, should I use the t-test because the variance of the population is unknown? Can I use the z-test because the sample and population are normally distributed, even though the sample size is small?

Thanks

Dale · Sep 12, 2018

songoku said:

I am estimating it from the sample. In this case, should I use the t-test because the variance of the population is unknown?

Yes

songoku said:

Can I use the z-test because the sample and population are normally distributed, even though the sample size is small?

No. The sample variance for the t test is not the same as the population variance for the z test.

songoku · Sep 12, 2018

Dale said:

No. The sample variance for the t test is not the same as the population variance for the z test.

So can I say that the deciding factor between the z-test and the t-test is whether the population variance is known? If the population variance is known, I use the z-test, and if not, I use the t-test, independent of the type of distribution of the population?

Thanks

Dale · Sep 13, 2018

songoku said:

So can I say that the deciding factor between the z-test and the t-test is whether the population variance is known? If the population variance is known, I use the z-test, and if not, I use the t-test, independent of the type of distribution of the population?

Thanks

Yes, almost. They both need to have a normal distribution. The z test is for a normal distribution with known variance and the t test is for normal distribution with unknown variance.
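As a rough illustration of the distinction above, here is a minimal sketch using only the Python standard library. The five scores are made up for the example, and the t critical value -2.132 (lower 5% point, 4 degrees of freedom) is taken from standard tables rather than computed. It evaluates the one-sided test of "mean below 6" with the standard deviation estimated from the sample, then compares the statistic against both the t cut-off and the z cut-off.

```python
import math
import statistics

# Hypothetical scores for the five students (invented for illustration).
scores = [5.1, 6.5, 4.8, 5.6, 5.0]
mu0 = 6.0                         # H0: mean = 6, H1: mean < 6
n = len(scores)

mean = statistics.mean(scores)
s = statistics.stdev(scores)      # sample standard deviation (n - 1 denominator)

# Test statistic built from the ESTIMATED standard deviation.
t_stat = (mean - mu0) / (s / math.sqrt(n))

# Correct reference distribution: Student's t with n - 1 = 4 degrees of freedom.
t_crit = -2.132                   # lower 5% point, from tables
reject_t = t_stat < t_crit

# Cut-off that would apply only if the population variance were known.
z_crit = statistics.NormalDist().inv_cdf(0.05)   # about -1.645
reject_if_z_cutoff = t_stat < z_crit

print(f"statistic = {t_stat:.3f}, t cut-off = {t_crit}, z cut-off = {z_crit:.3f}")
```

With these made-up scores the statistic is about -1.97, which falls between the two cut-offs: the t-test does not reject at the 5% level, while the z cut-off applied to an estimated standard deviation would reject.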

songoku · Sep 13, 2018

Dale said:

Yes, almost. They both need to have a normal distribution. The z test is for a normal distribution with known variance and the t test is for normal distribution with unknown variance.

If the population is not normally distributed, neither the t-test nor the z-test can be used to test the mean, and we should use another type of test?

Thanks

Dale · Sep 13, 2018

songoku said:

If the population is not normally distributed, neither the t-test nor the z-test can be used to test the mean, and we should use another type of test?

Yes, that is correct

songoku · Sep 13, 2018

Thank you very much

Stephen Tashi · Sep 14, 2018

songoku said:

If the population is not normally distributed, neither the t-test nor the z-test can be used to test the mean, and we should use another type of test?

Let's hope that is not a general principle, because we see the t-test used by respectable people on populations that aren't normally distributed when large samples are involved.

According to the current Wikipedia article for the Student's T-test, it is the sample mean that is assumed to be normally distributed, not the population itself. If you have a large sample size, the sample mean will have an approximately normal distribution.

When not to use a t-test is discussed on the blog http://thestatsgeek.com/2013/09/28/the-t-test-and-robustness-to-non-normality/

(I, myself, haven't tried to verify the Wikipedia article or the blog. )

Dale · Sep 14, 2018

Stephen Tashi said:

Let's hope that is not a general principle, because we see the t-test used by respectable people on populations that aren't normally distributed when large samples are involved.

It is a general principle, and such tests can indeed be misused. But ...

Stephen Tashi said:

it is the sample mean that is assumed to be normally distributed, not the population itself.

Yes. Suppose you have a random variable, X, which is uniformly distributed, then you can also form a random variable, Y, which is the mean of 5 samples of X. Then it would be appropriate to use a t test on Y, but not on X.

songoku · Sep 14, 2018

Stephen Tashi said:

According to the current Wikipedia article for the Student's T-test, it is the sample mean that is assumed to be normally distributed, not the population itself. If you have a large sample size, the sample mean will have an approximately normal distribution.

Ah, I see, so that's what the central limit theorem is for.

Dale said:

Yes. Suppose you have a random variable, X, which is uniformly distributed, then you can also form a random variable, Y, which is the mean of 5 samples of X. Then it would be appropriate to use a t test on Y, but not on X.

By "uniformly distributed", do you mean normally distributed, or can it be any type of distribution?

Thanks

ZeGato · Sep 14, 2018

songoku said:

Ah, I see, so that's what the central limit theorem is for.
By "uniformly distributed", do you mean normally distributed, or can it be any type of distribution?

Thanks

He meant that it follows the uniform distribution.

Dale · Sep 15, 2018

songoku said:

By "uniformly distributed", do you mean normally distributed, or can it be any type of distribution?

The uniform distribution is not the normal distribution. The normal distribution is a Gaussian curve. The uniform distribution is a rectangle. But yes, it was just an example of a non-normal distribution

songoku · Oct 7, 2018

I am really sorry for the late reply.

Dale said:

Yes. Suppose you have a random variable, X, which is uniformly distributed, then you can also form a random variable, Y, which is the mean of 5 samples of X. Then it would be appropriate to use a t test on Y, but not on X.

Dale said:

The uniform distribution is not the normal distribution. The normal distribution is a Gaussian curve. The uniform distribution is a rectangle. But yes, it was just an example of a non-normal distribution

So X is uniformly distributed and Y is the mean of 5 samples of X. Will Y also have a uniform distribution? If yes, does it mean that we can use the t-test for distributions other than the normal?

Thanks

ZeGato · Oct 7, 2018

songoku said:

I am really sorry for the late reply.

So X is uniformly distributed and Y is the mean of 5 samples of X. Will Y also have a uniform distribution? If yes, does it mean that we can use the t-test for distributions other than the normal?

Thanks

Y would have the Irwin-Hall distribution (given that the Xi's are truly independent). And yes, you can use the t-test for non-normal samples, as long as the sample mean is approximately normally distributed. For this to happen, it usually has to be a random sample (i.e. a sample made of independent random variables) of identically distributed random variables, and the sample size has to be >= 30 as a rule of thumb, so that the Central Limit Theorem applies (approximately). There are also other conditions, such as the random variables having finite 1st and 2nd order moments (expected value and variance), both of which are violated by distributions such as the Cauchy. This is the simpler version of the CLT; other versions drop some of these conditions but replace them with different ones.

Dale · Oct 7, 2018

songoku said:

Will Y also have uniform distribution?

No, Y will be approximately normally distributed. It will not be uniformly distributed
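A small simulation can make this visible. The sketch below assumes X is uniform on (0, 1) and uses an arbitrary seed and trial count; it is an illustration, not a proof. It draws many values of Y (each the mean of 5 draws of X) and checks that Y keeps the mean 0.5 but is far more concentrated around it than a uniform variable would be.

```python
import random

random.seed(0)        # fixed seed so the run is repeatable (arbitrary choice)

n = 5                 # draws of X per mean, as in the example above
trials = 20_000       # number of simulated values of Y (arbitrary choice)

# X ~ Uniform(0, 1); Y = mean of n independent draws of X.
ys = [sum(random.random() for _ in range(n)) / n for _ in range(trials)]

y_mean = sum(ys) / trials
y_var = sum((y - y_mean) ** 2 for y in ys) / trials

# For a uniform X, P(0.4 < X < 0.6) is exactly 0.2.
# Y piles up around 0.5, so the same interval holds a much larger share of Y.
share_near_centre = sum(0.4 < y < 0.6 for y in ys) / trials

print("mean of Y:", y_mean)
print("variance of Y:", y_var, "theory 1/(12 n):", 1 / (12 * n))
print("share of Y in (0.4, 0.6):", share_near_centre)
```

If Y were uniform, the last number would be close to 0.2; in the simulation it comes out well above that, which is the bell-shaped concentration described above.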

songoku · Oct 11, 2018

ZeGato said:

Y would have the Irwin-Hall distribution (given that the Xi's are truly independent). And yes, you can use the t-test for non-normal samples, as long as the sample mean is approximately normally distributed. For this to happen, it usually has to be a random sample (i.e. a sample made of independent random variables) of identically distributed random variables, and the sample size has to be >= 30 as a rule of thumb, so that the Central Limit Theorem applies (approximately). There are also other conditions, such as the random variables having finite 1st and 2nd order moments (expected value and variance), both of which are violated by distributions such as the Cauchy. This is the simpler version of the CLT; other versions drop some of these conditions but replace them with different ones.

Dale said:

No, Y will be approximately normally distributed. It will not be uniformly distributed

Sorry, I am confused. Would Y have an Irwin-Hall distribution or a normal distribution?

Suppose I have a random variable X, which has a Poisson distribution, and Y, which is the mean of 5 samples of X. Will Y have a normal distribution or a Poisson distribution?

Thanks

Dale · Oct 12, 2018

songoku said:

Sorry, I am confused. Would Y have an Irwin-Hall distribution or a normal distribution?

It will have a Bates distribution which is approximately normal for large n.
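The two names are closely related. Assuming the X_i are independent and uniform on (0, 1), the sum has the Irwin-Hall distribution and the mean, which is the sum divided by n, has the Bates distribution. Their first two moments follow from the rules for linear combinations:

```latex
% X_1, ..., X_n independent, each Uniform(0, 1)
S_n = \sum_{i=1}^{n} X_i \quad \text{(Irwin-Hall)}, \qquad
\mathrm{E}(S_n) = \frac{n}{2}, \qquad \mathrm{Var}(S_n) = \frac{n}{12}

\bar{X}_n = \frac{S_n}{n} \quad \text{(Bates)}, \qquad
\mathrm{E}(\bar{X}_n) = \frac{1}{2}, \qquad \mathrm{Var}(\bar{X}_n) = \frac{1}{12\,n}
```

For n = 5 the mean therefore has variance 1/60, compared with 1/12 for a single uniform draw.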

Stephen Tashi · Oct 12, 2018

songoku said:

So X is uniformly distributed and Y is mean of 5 samples of X. Will Y also have uniform distribution?

Have you studied how to compute the probability distribution of the sum of two independent random variables in terms of their individual distributions? You need to understand the fundamental facts about that situation in order to understand the facts about the distribution of a sample mean.

songoku · Oct 14, 2018

Dale said:

It will have a Bates distribution which is approximately normal for large n.

By "for large n", do you mean so that the CLT can be applied?

Stephen Tashi said:

Have you studied how to compute the probability distribution of the sum of two independent random variables in terms of their individual distributions? You need to understand the fundamental facts about that situation in order to understand the facts about the distribution of a sample mean.

I am not sure what you mean. What I have learned is linear combinations of random variables, such as:

E(aX +b) = aE(X) + b
E(X + Y) = E(X) + E(Y)
Var(aX + b) = a²Var(X)

Stephen Tashi · Oct 14, 2018

songoku said:

I am not sure what you mean.

The distribution of the sum of two independent random variables is computed by taking the "convolution" of their individual distributions. Perhaps you haven't studied that yet.
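For discrete random variables the convolution is just a double sum, which can be sketched in a few lines. The example below picks up the Poisson case raised earlier in the thread; the rates 2 and 3, the cut-off `k_max`, and the helper names are arbitrary choices for illustration. It convolves two Poisson distributions and compares the result with a Poisson distribution whose rate is the sum of the two.

```python
import math

def poisson_pmf(lam, k_max):
    """Return [P(X = k) for k = 0..k_max] for X ~ Poisson(lam)."""
    return [math.exp(-lam) * lam**k / math.factorial(k) for k in range(k_max + 1)]

def convolve(p, q):
    """Distribution of X + Y for independent X, Y taking values 0, 1, 2, ..."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j     # P(X = i) * P(Y = j) contributes to X + Y = i + j
    return out

k_max = 20
p = poisson_pmf(2.0, k_max)
q = poisson_pmf(3.0, k_max)

# Entries 0..k_max of the convolution only need terms with i, j <= k_max,
# so they are unaffected by truncating the two input lists.
sum_pmf = convolve(p, q)[: k_max + 1]
direct = poisson_pmf(5.0, k_max)

max_gap = max(abs(a - b) for a, b in zip(sum_pmf, direct))
print("largest difference from Poisson(5):", max_gap)
```

The sum of independent Poisson variables is again Poisson, so the difference is at the level of floating-point rounding. The mean of several draws is that sum divided by the number of draws, so it can take non-integer values and is not itself Poisson distributed.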

songoku · Oct 16, 2018

Stephen Tashi said:

The distribution of the sum of two independent random variables is computed by taking the "convolution" of their individual distributions. Perhaps you haven't studied that yet.

No, I haven't studied that yet, so it means there are some parts I won't be able to understand right now. What parts are they?

Thanks

Stephen Tashi · Oct 16, 2018

songoku said:

No, I haven't studied that yet, so it means there are some parts I won't be able to understand right now. What parts are they?

Thanks

You won't be able to understand how the distribution of the sample mean is related to the distribution of the population from which the samples are taken. So you won't understand which statistical tests can be applied to the sample mean.

songoku · Oct 16, 2018

Stephen Tashi said:

You won't be able to understand how the distribution of the sample mean is related to the distribution of the population from which the samples are taken. So you won't understand which statistical tests can be applied to the sample mean.

I see. I have always thought of it like this: if X is a random variable and Y is the mean of a sample taken from X, Y will always have the same distribution as X. So this is wrong?

Thanks

DrDu · Oct 17, 2018

Of course, there are alternatives that you can also use for non-normally distributed data, such as the Mann-Whitney U test (also known as the Wilcoxon rank-sum test).

Dale · Oct 17, 2018

songoku said:

I have always thought of it like this: if X is a random variable and Y is the mean of a sample taken from X, Y will always have the same distribution as X. So this is wrong?

Yes. Consider X to be the roll of a standard die. The distribution is a 1/6 probability of getting each of 1, 2, 3, 4, 5, 6 and 0 probability of getting anything else. If Y is the mean of two samples of X, then the most likely outcome is Y = 3.5. So 3.5 has 0 probability for X and the highest probability for Y, so clearly the distributions are different.
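The dice example is small enough to check exactly by listing all 36 equally likely pairs of rolls. The sketch below assumes a fair six-sided die and independent rolls, and uses exact fractions so that no rounding is involved.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# X: one roll of a fair die.
x_dist = {face: Fraction(1, 6) for face in faces}

# Y: mean of two independent rolls; enumerate all 36 equally likely pairs.
counts = Counter(Fraction(a + b, 2) for a, b in product(faces, repeat=2))
y_dist = {value: Fraction(count, 36) for value, count in counts.items()}

p_x = x_dist.get(Fraction(7, 2), Fraction(0))   # 3.5 cannot occur on a single roll
p_y = y_dist[Fraction(7, 2)]                    # probability that the mean is 3.5

for value in sorted(y_dist):
    print(float(value), y_dist[value])
```

Here `p_x` is 0 while `p_y` is 1/6, the largest probability in the distribution of Y, which matches the argument above.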

songoku · Oct 17, 2018

I think I get it for now.

Thank you very much for all the help (Dale, Stephen Tashi, ZeGato, DrDu)
