The t-test and the central limit theorem

SUMMARY

The discussion revolves around the application of the t-test and the central limit theorem in hypothesis testing. Participants clarify the correct formulation of the null hypothesis (H0) and the alternative hypothesis (Ha), emphasizing the importance of understanding statistical concepts rather than merely applying formulas. Key mistakes identified include the calculation of the standard error (SE) and the interpretation of critical t-values. The conversation highlights the necessity of grasping the underlying principles of statistical tests to avoid common errors.

PREREQUISITES
  • Understanding of t-test fundamentals, including null and alternative hypotheses.
  • Familiarity with the central limit theorem and its implications for statistical inference.
  • Knowledge of standard error calculation and its role in hypothesis testing.
  • Ability to interpret critical values from the t-distribution table.
NEXT STEPS
  • Study the derivation and application of the t-test in statistical analysis.
  • Learn about the central limit theorem and its significance in hypothesis testing.
  • Review the calculation of standard error (SE) and its correct formula: SE = s/√n.
  • Explore resources on statistical hypothesis testing, including recommended textbooks like "Statistics for Mathematicians" by D.J. Finney.
USEFUL FOR

Statisticians, data analysts, researchers in biology and medical fields, and anyone involved in hypothesis testing and statistical analysis.

TytoAlba95
Homework: No Effort - Member warned that some effort must be shown
Homework Statement
It is hypothesized that the mean dry weight of a female in a Drosophila population is 4.5 mg. In a sample of 16 females with ȳ = 4.8 mg and s = 0.8 mg, what dry weight values would lead to rejection of the null hypothesis at the p = 0.05 level?
(Take t0.05 = 2.1.)
Relevant Equations
1. Values lower than 4.0 and values higher than 5.6
2. Values lower than 3.20 and values higher than 6.40
3. Values lower than 4.38 and values higher than 5.22
4. Values lower than 3.22 and values higher than 6.48
The given answer is 3.

I know the basic t-test, but I have no idea how to solve this question.
Thanks.
 
It would appear that by "what dry weight values would lead...?" they mean "what hypothesised values of the mean dry weight would lead...?" Which seems odd when they've given you a hypothesised value already. It sounds as if they're talking about measured values, but that would make no sense. Mean, standard deviation and hypothesised mean are all you need for a t-test; if they were talking about rejecting outliers that would be a different test.
 
I'm sorry, I couldn't understand what you said.
The book I'm following has provided a solution to this question, which is too complex for me. It talks about the central limit theorem.
[Attached image: the book's solution]
 
What in particular don't you understand? You say you "know basic t-test". Do you mean you know a formula that you apply blindly, or do you understand where it comes from?
The question as you state it (have you copied it correctly?) is badly expressed and unclear. The answer also contains mistakes, e.g. in the last line "does not contain" should be "contains". The paragraph immediately after the diagram should read
"... there is a 95% probability that the interval μ ± 2.1(σ/√n) contains Y. Likewise, there is a 95% probability that the interval specified by Y ± 2.1(σ/√n) contains μ."
If the value of the hypothesised mean was greater than 5.22 or less than 4.38, you would reject the hypothesis on the basis of the data. This is what answer 3 must mean. But they tell you that the hypothesised mean is 4.5. It makes no sense.
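As a minimal sketch of the arithmetic behind answer 3, assuming the usual standard error s/√n and the critical value 2.1 given in the question, the range of hypothesised means that the data would not reject can be computed as below. The function name is illustrative and does not come from the thread or the book.

```python
import math

def non_rejected_interval(sample_mean, sample_sd, n, t_crit):
    """Return the range of hypothesised means NOT rejected by a two-tailed t-test."""
    se = sample_sd / math.sqrt(n)   # standard error of the mean, s / sqrt(n)
    margin = t_crit * se            # half-width of the interval
    return sample_mean - margin, sample_mean + margin

# Values taken from the homework statement; t_crit = 2.1 is the value the question supplies.
low, high = non_rejected_interval(sample_mean=4.8, sample_sd=0.8, n=16, t_crit=2.1)
print(f"Reject the hypothesis if the hypothesised mean is below {low:.2f} or above {high:.2f}")

# The hypothesised mean stated in the question lies inside the interval.
print("4.5 rejected?", not (low <= 4.5 <= high))
```

With these inputs the standard error is 0.2 and the margin is 0.42, which gives the limits 4.38 and 5.22 quoted in answer 3.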
 
Sorry, I understand it was too vague to have said 'I know basic t-test'. I only know how to apply formulae, I should have mentioned that.
I have copied the sum correctly (there's no mistake...).

My Attempt :
Ho= There's no difference between the sample mean (ȳ =4.8 mg) and the population mean μ=4.5.
Ha= There's difference between the sample mean and population mean. The sample doesn't belong to the population.

tcal= (ȳ -μ)/SE
here SE=s/√n-1.
tcal= {4.8-4.5}/(0.8/3.8) = 0.3/.21 = 0.38

My tcal is greater than the given t0.05 = 2.1 (though t0.05, df=15 =1.7), so the Ho is rejected. The sample does not belong to the population.

what dry weight values would lead to rejection of the null hypothesis at p = 0.05 level?

From the above quote it appears to me that the Ho should not have been rejected, and the question is asking the hypothetical μ for which it will be rejected.

Then again, after checking the solution, I got more confused by confidence intervals and the central limit theorem. (Could you suggest some easy reads on these terms?)

I hope this makes my current understanding of the problem clearer.
 
I'm afraid your post illustrates that "only knowing how to apply formulae" without understanding them means that you will sometimes apply them wrongly.
Your H0 is nonsensical - of course there is a difference between 4.8 and 4.5. Do you mean "the difference is not statistically significant at the 0.05 level"? Similarly with Ha: "There is a statistically significant difference..."
What do you mean by SE = s/√n - 1? As written it is ambiguous. Do you mean s/√(n - 1), or s/(√n - 1), or (s/√n) - 1? I suspect the first, but that is wrong - it should be s/√n.
0.3/.21 is not 0.38, it is 1.43 - which is still not greater than 2.1, so why do you reject the hypothesis? And "t0.05,15 = 1.7" is wrong - that is a single-tailed value, while you are looking for the two-tailed value.
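A short sketch of the corrected calculation, assuming SE = s/√n as stated above and the two-tailed critical value 2.1 supplied in the question. With the exact standard error of 0.2 (rather than the rounded 0.21), the statistic comes out at 1.5 instead of 1.43, and the conclusion is unchanged. The function name is illustrative only.

```python
import math

def one_sample_t(sample_mean, hypothesised_mean, sample_sd, n):
    """One-sample t statistic using the standard error s / sqrt(n)."""
    se = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesised_mean) / se

# Values from the homework statement.
t_cal = one_sample_t(sample_mean=4.8, hypothesised_mean=4.5, sample_sd=0.8, n=16)
t_crit = 2.1  # two-tailed critical value given in the question

# Two-tailed test: reject only if |t| exceeds the critical value.
reject = abs(t_cal) > t_crit
print(f"t = {t_cal:.2f}, critical value = {t_crit}, reject H0: {reject}")
```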
SanjuktaGhosh said:
From the above quote it appears to me that the Ho should not have been rejected, and the question is asking the hypothetical μ for which it will be rejected.
I agree with you. That was what I was saying, I'm sorry if it wasn't clear.
There's no real substitute for understanding where the t distribution comes from and how the t test should be used. Unfortunately, I can't recommend a simple source for you, as the book I learned from is probably unobtainable now, and the Wikipedia article looks fearsomely mathematical. Perhaps someone else could help with suggestions?
 
mjc123 said:
I'm afraid your post illustrates that "only knowing how to apply formulae" without understanding them means that you will sometimes apply them wrongly.
Your H0 is nonsensical - of course there is a difference between 4.8 and 4.5. Do you mean "the difference is not statistically significant at the 0.05 level"? Similarly with Ha: "There is a statistically significant difference..."
What do you mean by SE = s/√n - 1? As written it is ambiguous. Do you mean s/√(n - 1), or s/(√n - 1), or (s/√n) - 1? I suspect the first, but that is wrong - it should be s/√n.

Oh! I didn't know it should be s/√n; I was taught SE = s/√(n-1).
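One possible source of the s/√(n-1) form, offered as an assumption since the thread does not say which definition of s the course used: if the standard deviation is computed with divisor n instead of n - 1, the two expressions for the standard error coincide. The symbol s_n below is introduced only for this illustration.

```latex
s_n^2 = \frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y})^2, \qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2
\quad\Longrightarrow\quad
\frac{s_n}{\sqrt{n-1}}
= \sqrt{\frac{\sum_{i=1}^{n}(y_i - \bar{y})^2}{n(n-1)}}
= \frac{s}{\sqrt{n}}
```

When s is the usual sample standard deviation (divisor n - 1), as is assumed in the replies in this thread, the standard error is s/√n.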

0.3/.21 is not 0.38, it is 1.43 - which is still not greater than 2.1, so why do you reject the hypothesis? And "t0.05,15 = 1.7" is wrong - that is a single-tailed value, while you are looking for the two-tailed value.

Sorry, those were silly mistakes.

I agree with you. That was what I was saying, I'm sorry if it wasn't clear.
There's no real substitute for understanding where the t distribution comes from and how the t test should be used. Unfortunately, I can't recommend a simple source for you, as the book I learned from is probably unobtainable now, and the Wikipedia article looks fearsomely mathematical. Perhaps someone else could help with suggestions?

Yes, Wikipedia is too mathematical.
Can I create a post and ask other biologists for book recommendations?
 
I suggest posting a request in the biology forum, in case they aren't looking here.
 
The book I used was "Statistics for Mathematicians" by D.J. Finney; it appears that second-hand copies are available on Amazon. The same author also appears to have written a somewhat shorter "Statistics for Biologists", which costs about £100 new, but used copies are available much cheaper. I don't know how much easier it is.
 
Thank you for helping me so much and bearing with me.
I'll post in Biology/Medical forum.
 
