F-test / t-test
Maybe_Memorie
Jul24-11, 12:41 PM
1. The problem statement, all variables and given/known data
A study of weight gain in male rats compared two diets, A and B,
which were formulated in terms of level and source of protein.
A total of twenty two rats were randomly assigned to the diets, eleven to each diet.
Summary statistics for the weight gains (grams) by the end of the study are shown.
          A      B
Mean      99.5   78.7
St. Dev   10.9   16.5
n         11     11
(a) An F-test was carried out to compare the sample standard deviations.
Why might this have been done? Carry out the F-test: specify the null hypothesis,
the significance level and critical value you use in the test and interpret the
test result.
(b) Carry out a t-test to compare the sample means. State explicitly the null hypothesis,
the degrees of freedom for the reference distribution, and the critical values for your test.
Interpret the results of the test in practical terms.
(c) It has been suggested that rather than carry out a test, it would be better to calculate
a confidence interval. Explain why this is either correct or not correct. Calculate and interpret an appropriate confidence interval.
(d) It was asserted that had the raw data been available, a paired t-test would be the appropriate test for analyzing the study results. Discuss.
(e) Before a study such as this one can be carried out, the researchers need to decide on a sample size for the study. Discuss the issues that need to be considered in arriving at a suitable sample size for
this type of study.
2. Relevant equations
3. The attempt at a solution
(a) I used F = (s1)^2/(s2)^2
This gave (16.5)^2/(10.9)^2 = 2.29
The null hypothesis, Ho: s1=s2
Is this correct?
I don't know how to get the significance level or critical value.
(b) I used t = (x1 - x2)/root[(s1^2 - s2^2)/n]
which gave me 3.49
Ho: u1 = u2
11-1=10 degrees of freedom
Is it okay to do a 95% confidence interval? That's what all the examples do, but I don't know why...
(c) Using (x1 - x2) +- t* root[(s1^2 - s2^2)/n]
I need to get t* from the previous part.
For (d) and (e) any advice would be greatly appreciated as I'm lost.
I like Serena
Jul24-11, 05:45 PM
Hi Maybe_Memorie! :smile:
First I think you need some background information.
This should all be in your text book, but I'll try to help you along.
To do any statistical test you need to define a null hypothesis Ho, an alternative hypothesis Ha, and a significance level alpha.
An important choice is whether you select a Ha to be 1-sided or 2-sided.
The significance level alpha is usually chosen as 5%, meaning the chance of rejecting a null hypothesis that is actually true should be at most 5%.
This is a typical experiment where two populations are compared.
To do that we'll do an "Independent two-sample t-test".
For this it matters whether the population variances are the same or not.
If they are the same, they must be "pooled".
So you would either do the "Independent two-sample t-test with equal variances" or the "Independent two-sample t-test with unequal variances".
Do you have the appropriate formulas available?
The formula you used for (b) is not quite the right formula.
To find out if the variances can be assumed equal the F-test is done.
To do an F-test you need to know the degrees of freedom of the populations, and the selected significance level alpha (pick alpha = 5%).
The critical value comes from looking these parameters up in the appropriate table, or use a statistical calculator.
Do you have such a table available?
Does this help you to find the answer to (a)?
For (b) you did not use quite the right formula.
And yes, it's okay to do a 95% confidence interval.
That means that you have chosen a significance level alpha of 5%, which is usually chosen.
Btw, a program like SPSS does all these things for you.
Are you supposed to use for instance SPSS?
Maybe_Memorie
Jul24-11, 06:35 PM
Hello again :smile:
So you would either do the "Independent two-sample t-test with equal variances" or the "Independent two-sample t-test with unequal variances".
Do you have the appropriate formulas available?
The formula you used for (b) is not quite the right formula.
For unequal variances
[(x1 - x2) - (u1 - u2)]/root[(s1^2 - s2^2)/n]
I'm using n as a common denominator since the number in A and B is the same.
Equal variances, I don't have my book with me at the moment so I can't state the formula off-hand, but I know there is a formula for a pooled t-test.
Btw, I'm using An Introduction to the Practice of Statistics by Moore and McCabe.
So I'm assuming the appropriate formula to use would be determined by the F-test?
To find out if the variances can be assumed equal the F-test is done.
To do an F-test you need to know the degrees of freedom of the populations, and the selected significance level alpha (pick alpha = 5%).
The critical value comes from looking these parameters up in the appropriate table, or use a statistical calculator.
Do you have such a table available?
Does this help you to find the answer to (a)?
Each population has 11 members, so 10 degrees of freedom.
Yes, I have two available tables, one for 10% critical values and one for 5% critical values.
Using the 5% one, and 10 degrees of freedom, should my critical value be 3?
Btw, a program like SPSS does all these things for you.
Are you supposed to use for instance SPSS?
Unfortunately no. In my exam I have to get all information from the supplied tables.
I like Serena
Jul24-11, 07:19 PM
Btw, I'm using An Introduction to the Practice of Statistics by Moore and McCabe.
I'm sorry to say that I don't have a good book available, so I'm dependent on Google.
It's best if you pick the right formulas from your own book.
For unequal variances
[(x1 - x2) - (u1 - u2)]/root[(s1^2 - s2^2)/n]
My problem with your formula is that you subtract the sample variances.
Variances are never subtracted; they should be added.
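To see why the variances add, here is the standard identity for the difference of two independent sample means, written out as a short worked equation. The symbols \sigma_1, \sigma_2, n_1 and n_2 are notation assumed here for the population standard deviations and the sample sizes; the thread itself writes s1, s2 and n.

```latex
\operatorname{Var}(\bar{x}_1 - \bar{x}_2)
  = \operatorname{Var}(\bar{x}_1) + \operatorname{Var}(\bar{x}_2)
  = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}
```

With n_1 = n_2 = n and the sample values plugged in, the standard error becomes root[(s1^2 + s2^2)/n], with a plus sign.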
So I'm assuming the appropriate formula to use would be determined by the F-test?
Yes.
Each population has 11 members, so 10 degrees of freedom.
Yes, I have two available tables, one for 10% critical values and one for 5% critical values.
Using the 5% one, and 10 degrees of freedom, should my critical value be 3?
An F-test has 2 parameters that are both degrees of freedom (df).
They are called the df of the numerator and the df of the denominator.
I believe in this case you need 10 and 10, which comes out at alpha=5% as F*=2.98.
So what is your conclusion for the F-test?
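As a rough check of the steps above, here is a minimal Python sketch of the F-test using only the summary statistics from the problem. The critical value 2.98 is an assumption read from a printed 5% table for (10, 10) degrees of freedom, and the variable names are made up for illustration.

```python
# Summary statistics from the problem statement.
s_a, s_b = 10.9, 16.5      # sample standard deviations for diets A and B
n_a, n_b = 11, 11          # rats per diet

# Put the larger sample variance in the numerator so that F >= 1.
f_stat = max(s_a, s_b) ** 2 / min(s_a, s_b) ** 2
df_num, df_den = n_b - 1, n_a - 1   # 10 and 10

# Assumption: upper 5% point of F(10, 10) as read from a printed table.
F_CRIT = 2.98

reject_equal_variances = f_stat > F_CRIT
print(f"F = {f_stat:.2f}, df = ({df_num}, {df_den}), F* = {F_CRIT}")
print("reject Ho" if reject_equal_variances else "cannot reject Ho")
```

The comparison is between a test statistic and a critical value, both on the F scale; the 5% only enters through the choice of table.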
Ray Vickson
Jul24-11, 09:37 PM
For part (a): if the null hypothesis \sigma_1 = \sigma_2 holds then both s1^2/s2^2 and s2^2/s1^2 should have the so-called F(10,10) distribution. (Here, the '10s' are the "degrees of freedom" (n-1) used in computing the s^2 values.) In other words, we want to know how likely it would be to see a value of s1^2/s2^2 as large as 2.29, or a value of s2^2/s1^2 as small as 1/2.29 = 0.437, from a random variable with distribution F(10,10). You need to look at F-tables (or use an on-line calculator, or spreadsheet, or whatever) to look at the corresponding probabilities.
Note: you may be confusing yourself by using sloppy notation: s1 is a sample standard deviation, not a true "population" standard deviation. We are not testing whether s1 = s2 because this is obviously not true: one of them is about 1.5 times the other. However, the underlying standard deviations \sigma_1 and \sigma_2 may, or may not be equal, or nearly so. That is what we are trying to find out.
RGV
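If no F-table is at hand, the tail probabilities described above can be approximated numerically. The sketch below is only an illustration, not a replacement for proper tables or software: it integrates the F(10,10) density with Simpson's rule, and the helper names f_pdf and f_cdf are invented for this example.

```python
from math import exp, lgamma, log

def f_pdf(x, d1, d2):
    """Density of the F(d1, d2) distribution at x."""
    if x <= 0:
        return 0.0
    log_beta = lgamma(d1 / 2) + lgamma(d2 / 2) - lgamma((d1 + d2) / 2)
    log_pdf = ((d1 / 2) * log(d1 / d2) + (d1 / 2 - 1) * log(x)
               - ((d1 + d2) / 2) * log(1 + d1 * x / d2) - log_beta)
    return exp(log_pdf)

def f_cdf(x, d1, d2, steps=2000):
    """P(F <= x) by composite Simpson's rule on [0, x]; steps must be even."""
    h = x / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return total * h / 3

f_stat = 16.5 ** 2 / 10.9 ** 2            # about 2.29
upper_tail = 1 - f_cdf(f_stat, 10, 10)    # P(F >= 2.29)
lower_tail = f_cdf(1 / f_stat, 10, 10)    # P(F <= 0.437)
p_two_sided = upper_tail + lower_tail
print(f"upper = {upper_tail:.3f}, lower = {lower_tail:.3f}, two-sided p = {p_two_sided:.3f}")
```

Because both degrees of freedom are 10, the two tails are equal, and each one is larger than 0.05, which matches 2.29 lying below the 5% critical value.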
Maybe_Memorie
Jul25-11, 01:27 PM
An F-test has 2 parameters that are both degrees of freedom (df).
They are called the df of the numerator and the df of the denominator.
I believe in this case you need 10 and 10, which comes out at alpha=5% as F*=2.98.
So what is your conclusion for the F-test?
My book says to double F* obtaining 6. (in my tables the value for F(10,10) is 3 so I'm just going to use that.)
This is greater than the 5% level of significance, so is significant in our test.
So the null hypothesis fails?
I like Serena
Jul25-11, 01:35 PM
My book says to double F* obtaining 6. (in my tables the value for F(10,10) is 3 so I'm just going to use that.)
This is greater than the 5% level of significance, so is significant in our test.
So the null hypothesis fails?
Hmm, doubling F*?
I've never heard of that.
Aren't you mixing it up with the t-test?
There's doubling involved in the t-test..... but we'll get to that later.
Either way, your F was less than 3 (I assume this is a rounded value), so whatever the doubling, the F is less than the critical F*.
Now concerning the null hypothesis.
Suppose the null hypothesis was perfectly true, what F-value would you get?
And if the null hypothesis would be very untrue, what kind of F-value would you have then?
Maybe_Memorie
Jul25-11, 01:44 PM
Now concerning the null hypothesis.
Suppose the null hypothesis was perfectly true, what F-value would you get?
And if the null hypothesis would be very untrue, what kind of F-value would you have then?
If it was perfectly true, F would equal 1, if it was very untrue, F would be very large.
Since my F is smaller than 5, can I take it that the variances are equal and then in (b) use
t = (x1 - x2)/root[(s1^2 + s2^2)/n] ?
I like Serena
Jul25-11, 01:57 PM
If it was perfectly true, F would equal 1, if it was very untrue, F would be very large.
Since my F is smaller than 5, can I take it that the variances are equal and then in (b) use
t = (x1 - x2)/root[(s1^2 + s2^2)/n] ?
Yes. :smile:
To be a little more accurate, according to the F-test you do not have enough evidence to reject the null hypothesis, so we'll assume equal variances.
And so yes, you can use the formula for the independent two-sample t-test for equal variances and equal sample sizes. :wink:
But usually, before you do that, you should first state the null hypothesis and the alternative hypothesis.
In particular you need to decide whether you're testing 1-sided or 2-sided...
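For reference, the equal-variance formula can be checked with a few lines of Python. This is a sketch under the assumptions already made in the thread (independent samples, equal population variances); the variable names are illustrative only. It also shows that the general pooled form collapses to root[(s1^2 + s2^2)/n] when both groups have the same size.

```python
from math import sqrt

mean_a, mean_b = 99.5, 78.7
s_a, s_b = 10.9, 16.5
n_a, n_b = 11, 11

# General pooled-variance form (equal population variances assumed).
pooled_var = ((n_a - 1) * s_a ** 2 + (n_b - 1) * s_b ** 2) / (n_a + n_b - 2)
se_pooled = sqrt(pooled_var * (1 / n_a + 1 / n_b))

# Shortcut used in the thread, valid because n_a == n_b.
se_equal_n = sqrt((s_a ** 2 + s_b ** 2) / n_a)

t_stat = (mean_a - mean_b) / se_pooled
print(f"SE = {se_pooled:.2f}, t = {t_stat:.2f}")
```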
Maybe_Memorie
Jul25-11, 02:03 PM
Yes. :smile:
To be a little more accurate, according to the F-test you do not have enough evidence to reject the null hypothesis, so we'll assume equal variances.
And so yes, you can use the formula for the independent two-sample t-test for equal variances and equal sample sizes. :wink:
But usually, before you do that, you should first state the null hypothesis and the alternative hypothesis.
In particular you need to decide whether you're testing 1-sided or 2-sided...
The null hypothesis is that u1 = u2
The alternative is that u1 =/= u2.
My book only mentions and uses the 1-sided test, I can't find a mention of a 2-sided test...
I like Serena
Jul25-11, 02:04 PM
Since my F is smaller than 5
Btw, I noticed that now you are referring to 5.
Any particular reason?
Actually, I was not entirely sure about the degrees of freedom being 10 and 10.
They might well be 2-1 and 11-1 respectively, in which case you would get 4.9647....
Pity I do not have a good reference book available. :(
I like Serena
Jul25-11, 02:08 PM
The null hypothesis is that u1 = u2
The alternative is that u1 =/= u2.
My book only mentions and uses the 1-sided test, I can't find a mention of a 2-sided test...
Well, you just defined a 2-sided test!
Basically you have three choices for the alternative hypothesis Ha.
1. u1 < u2
2. u1 > u2
3. u1 ≠ u2
The first and second are 1-sided, the third is 2-sided.
You should always carefully read the problem statement to find out what the alternative hypothesis should be.
Did you?
Anyway, let's assume a 2-sided test, which is what you correctly selected! :wink:
Maybe_Memorie
Jul25-11, 02:13 PM
Btw, I noticed that now you are referring to 5.
Any particular reason?
Actually, I was not entirely sure about the degrees of freedom being 10 and 10.
They might well be 2-1 and 11-1 respectively, in which case you would get 4.9647....
Pity I do not have a good reference book available. :(
When I said 5 I was referring to my alpha.
I'm assuming this is wrong?
My book said when the populations have n1 and n2 members, then my degrees of freedom are n1 - 1 and n2 -1
I like Serena
Jul25-11, 02:20 PM
When I said 5 I was referring to my alpha.
I'm assuming this is wrong?
You are wrong if you were comparing your F-value with the significance level alpha of 5%.
In all these statistical theories you're dealing each and every time with two types of variables: test-statistics (like t-values and F-values) and probabilities (like significance level alpha and p-values).
Never mix these two types! :surprised
My book said when the populations have n1 and n2 members, then my degrees of freedom are n1 - 1 and n2 -1
Okay, then we were and are on track! :smile:
Maybe_Memorie
Jul25-11, 02:21 PM
Well, you just defined a 2-sided test!
Basically you have three choices for the alternative hypothesis Ha.
1. u1 < u2
2. u1 > u2
3. u1 ≠ u2
The first and second are 1-sided, the third is 2-sided.
You should always carefully read the problem statement to find out what the alternative hypothesis should be.
Did you?
Anyway, let's assume a 2-sided test, which is what you correctly selected! :wink:
Oh right I see! :smile:
So I should be using 2-sided because the question said "compare the sample means", not investigate which is bigger?
Maybe_Memorie
Jul25-11, 02:27 PM
You are wrong if you were comparing your F-value with the significance level alpha of 5%.
In all these statistical theories you're dealing each and every time with two types of variables: test-statistics (like t-values and F-values) and probabilities (like significance level alpha and p-values).
Never mix these two types! :surprised
Right... so what should I have been comparing my F-value to in order to not be able to reject Ho?
I like Serena
Jul25-11, 02:27 PM
Oh right I see! :smile:
So I should be using 2-sided because the question said "compare the sample means", not investigate which is bigger?
Almost! :smile:
As you formulate it, they would both be 2-sided tests.
"Investigate which is bigger" means you have no clue which one it is.
What you're looking for, is for instance a test to find out if people become more intelligent after taking a course.
That would be typically 1-sided.
Or for instance if the weight of rats becomes less after taking a diet.
That is, if the problem statement uses the word "more" or "less". :wink:
I like Serena
Jul25-11, 02:30 PM
Right... so what should I have been comparing my F-value to in order to not be able to reject Ho?
You should compare your F-value with the critical F-value F*, which you found was 3 (or 6?!?).
Alternatively you can look up your F-value in a table to find a so called p-value.
In that case you would compare your p-value with the significance level alpha of 5%.
Maybe_Memorie
Jul25-11, 02:53 PM
Oh! Right!
Sorry I got a little lost there. :redface:
Okay, I understand the F test. :smile:
So in my t-test, I'm getting t = 3.49.
My critical value is 2.26, using 10 degrees of freedom and alpha = 5%
Because my t is greater than t* I can reject the null hypothesis.
Is this right?
I like Serena
Jul25-11, 03:09 PM
Oh! Right!
Sorry I got a little lost there. :redface:
Okay, I understand the F test. :smile:
Good! :smile:
My critical value is 2.26, using 10 degrees of freedom and alpha = 5%
Wait! Stop! :smile:
Where did you find 2.26?
I can't find it.
Btw, this is the point where there is possible doubling or halving in the t-test.
For these calculations a 1-sided alpha of 2.5% corresponds to a 2-sided alpha of 5%.
Do you have references to that in your table?
Then the degrees of freedom.
I think you should not use 10 degrees of freedom.
Does your book say anything about the degrees of freedom in a t-test with equal variances assumed?
(I think it should be 20 degrees of freedom.)
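To illustrate the halving and doubling, here is a small numerical sketch. It integrates the Student t density with Simpson's rule instead of using a table; t_pdf and t_upper_tail are hypothetical helper names, and 2.228 and 2.086 are assumed to be the usual table entries for 10 and 20 degrees of freedom.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    const = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return const * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, using composite Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Assumption: 2.228 and 2.086 are the usual table entries for df = 10 and df = 20.
for df, t_star in ((10, 2.228), (20, 2.086)):
    one_sided = t_upper_tail(t_star, df)
    print(f"df = {df}: P(T > {t_star}) = {one_sided:.4f}, "
          f"P(|T| > {t_star}) = {2 * one_sided:.4f}")
```

Each critical value cuts off about 2.5% in one tail, so about 5% in both tails together, and the cut-off shrinks as the degrees of freedom go up.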
Maybe_Memorie
Jul25-11, 03:21 PM
Good! :smile:
Wait! Stop! :smile:
Where did you find 2.26?
I can't find it.
Btw, this is the point where there is possible doubling or halving in the t-test.
For these calculations a 1-sided alpha of 2.5% corresponds to a 2-sided alpha of 5%.
Do you have references to that in your table?
Then the degrees of freedom.
I think you should not use 10 degrees of freedom.
Does your book say anything about the degrees of freedom in a t-test with equal variances assumed?
(I think it should be 20 degrees of freedom.)
Sorry, should have said 2.23.
Yes, in my tables I have a row for alpha and a column for degrees of freedom.
I can't find anything about that in my book, but if I used df=20 I would get 2.09 according to my tables..
I like Serena
Jul25-11, 03:28 PM
Sorry, should have said 2.23.
Yes, in my tables I have a row for alpha and a column for degrees of freedom.
I can't find anything about that in my book, but if I used df=20 I would get 2.09 according to my tables..
Ah well, I'm relying on wikipedia here:
http://en.wikipedia.org/wiki/T-value
It says for an Independent two-sample t-test with Equal sample sizes, equal variance:
"For significance testing, the degrees of freedom for this test is 2n − 2 where n is the number of participants in each group."
I hope it's reliable.
SPSS would know. :wink:
But yes, I also find 2.09.
Because my t is greater than t* I can reject the null hypothesis.
Is this right?
And yes, this is right! :smile:
Maybe_Memorie
Jul25-11, 06:37 PM
Okay, so for my confidence interval,
(x1 - x2) +- t* root[(s1^2 + s2^2)/n]
giving 20.8 +- (2.09)(5.96)
giving my confidence interval (8.34, 33.26)
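The arithmetic above can be reproduced with a short Python sketch. It assumes the table value t* = 2.09 for 20 degrees of freedom that was used earlier in the thread; the names are for illustration only.

```python
from math import sqrt

diff = 99.5 - 78.7                           # observed difference in means
se = sqrt((10.9 ** 2 + 16.5 ** 2) / 11)      # standard error, equal n
T_STAR = 2.09                                # table value for df = 20 used in the thread

margin = T_STAR * se
ci_low, ci_high = diff - margin, diff + margin
print(f"95% CI for u1 - u2: ({ci_low:.2f}, {ci_high:.2f})")
```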
However I don't know why it would be better to do so or what the result means. :confused:
I like Serena
Jul26-11, 03:28 AM
Let's start with the definition of a confidence interval.
From wikipedia:
"In statistics, a confidence interval (CI) is a particular kind of interval estimate of a population parameter and is used to indicate the reliability of an estimate. It is an observed interval [...]".
Which "population parameter" did you "observe"?
What is the chance that the actual population parameter is within this interval?
How "significant" is this interval? Or rather, what is the significance of it?
Maybe_Memorie
Jul26-11, 08:09 AM
Which "population parameter" did you "observe"?
What is the chance that the actual population parameter is within this interval?
How "significant" is this interval? Or rather, what is the significance of it?
The population parameter was the mean, yes?
And since we used alpha = 0.05 there's a 95% chance the actual mean is in this interval.
I like Serena
Jul26-11, 10:52 AM
The population parameter was the mean, yes?
And since we used alpha = 0.05 there's a 95% chance the actual mean is in this interval.
Nope.
The population parameter is the difference in means.
And yes, we can be 95% confident that the actual difference in means is in this interval.
So how can you tell from this confidence interval whether it is significant?
Actually, you already know that it is significant from the previous question.
But how would it look if it were not significant?
Maybe_Memorie
Jul28-11, 12:00 PM
So how can you tell from this confidence interval whether it is significant?
Actually, you already know that it is significant from the previous question.
But how would it look if it were not significant?
If it were not significant the boundaries of the interval would be very close?
Or even 2(standard deviation) would be the difference in the boundaries?
I like Serena
Jul28-11, 12:49 PM
If it were not significant the boundaries of the interval would be very close?
Or even 2(standard deviation) would be the difference in the boundaries?
Nope.
Now you have the interval (8.34, 33.26) for the difference in population means.
How would this interval look if the population means were the same?
Maybe_Memorie
Jul28-11, 01:09 PM
Nope.
Now you have the interval (8.34, 33.26) for the difference in population means.
How would this interval look if the population means were the same?
If the means were the same the difference should be zero?
I like Serena
Jul28-11, 01:12 PM
If the means were the same the difference should be zero?
Yes.......
Maybe_Memorie
Jul28-11, 01:15 PM
Yes.......
So the interval would be close to zero in the plus and minus direction, something like
(-3, 3)
Like that yeah?
I like Serena
Jul28-11, 01:16 PM
Yes! :smile:
So how can you see from your interval whether it is significant or not?
Maybe_Memorie
Jul28-11, 01:20 PM
Because my interval has both boundaries greater than zero, we're 95% certain the difference is between these boundaries, so that means we're 95% certain the difference is greater than zero. This is significant and forces us to reject the null hypothesis?
I like Serena
Jul28-11, 01:26 PM
Yep! :smile:
When you use a CI in a test to compare the means of two samples, the criterion is whether the CI contains zero.
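As a tiny illustration of that criterion, here is a sketch in Python. The function name excludes_zero is made up for this example, and (-3, 3) is the hypothetical interval mentioned earlier, not a computed result.

```python
def excludes_zero(ci_low, ci_high):
    """True when a CI for u1 - u2 lies entirely on one side of zero."""
    return ci_low > 0 or ci_high < 0

print(excludes_zero(8.34, 33.26))   # interval from this thread
print(excludes_zero(-3.0, 3.0))     # hypothetical interval centred on zero
```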
Maybe_Memorie
Jul28-11, 01:29 PM
Ah I see! :smile:
So it would be better to do a confidence interval instead of a t-test?
I like Serena
Jul28-11, 01:37 PM
Ah, now we're getting into the murky stuff that is open questions and discussions.
Let me counter that by asking: what are the pros and cons of a CI versus a t-test?
What's the difference anyhow between a t-test and this confidence interval?
And I'll ask one more question: can you do a 1-sided test with a confidence interval?
Maybe_Memorie
Jul28-11, 02:06 PM
With a CI, we know that if the interval doesn't contain zero the means can't be the same. With a t-test we're relying on probabilities and approximations.
I would say yes, because if you test Ha: u1>u2, and find a CI for u1-u2, and if this doesn't contain zero we're 95% certain Ha is true
I like Serena
Jul28-11, 02:14 PM
With a CI, we know that if the interval doesn't contain zero the means can't be the same. With a t-test we're relying on probabilities and approximations.
Wow! Stop!
A CI does not give certainty!
Basically the CI is a t-test. It's just represented differently.
But the ultimate result (rejection or not) is the same.
I would say yes, because if you test Ha: u1>u2, and find a CI for u1-u2, and if this doesn't contain zero we're 95% certain Ha is true
Hmm, suppose the CI is (-6, -1).
That does not contain 0.
Does that mean Ha is probably true?
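Both points can be made concrete with a rough sketch. It assumes the same t* is used for the 2-sided test and for the interval, and the function names are hypothetical. The first function shows that the two procedures always agree on rejection; the second only checks which side of zero an interval falls on, which is what matters for a directional Ha. A strict 1-sided test would use a one-sided bound with its own critical value, which this sketch does not attempt.

```python
def two_sided_decisions(diff, se, t_star):
    """Return (reject_by_t_test, reject_by_ci) for Ho: u1 = u2."""
    reject_by_t = abs(diff / se) > t_star
    low, high = diff - t_star * se, diff + t_star * se
    reject_by_ci = low > 0 or high < 0
    return reject_by_t, reject_by_ci

def supports_u1_greater(interval):
    """A CI for u1 - u2 points towards u1 > u2 only if it lies above zero."""
    low, _high = interval
    return low > 0

print(two_sided_decisions(20.8, 5.96, 2.09))   # values from this thread
print(supports_u1_greater((-6, -1)))           # excludes zero, but on the wrong side
```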