Qualitative Stats question (central limit theorem)

AI Thread Summary
The discussion covers which of several sampling scenarios are genuine consequences of the Central Limit Theorem (CLT). It concludes that scenarios II and III give an approximately normally distributed sample average or sample proportion, because the sample sizes are large enough. Scenario IV is judged unsuitable for a normal approximation: the proportion of interest is so low that the expected count Np is small, and the approximation is poor at a sample size of 100. The rule of thumb Np > 10 is given as the condition under which the normal approximation to the binomial is usable.
habman_6
3) Which of the following are consequences of the Central Limit Theorem?
I) A SRS of resale house prices for 100 randomly selected transactions from all sale transactions in 2001 (in Toronto) will be obtained. Since the sample is large, we should expect the histogram for the sample to be nearly normal.
II) We will draw a SRS (simple random sample) of 100 students from all University of Toronto students, and measure each person’s cholesterol level. The average cholesterol level for the sample should be approximately normally distributed.
III) We want to estimate the proportion of Ontario voters who intend to vote for the Liberal party in the next election, and decide to draw a SRS of 400 voters. The percentage of the people in the sample who will say that they intend to vote Liberal is approximately normally distributed.
IV) We will draw a SRS of 100 adults from the Canadian military, and count the number who have the AIDS virus. The number of individuals in the sample who will be found to have the AIDS virus should be approximately normally distributed.
V) We are interested in the average income for all Canadian families for 2001. The mean income for all Canadian families should be approximately normal, due to the large number of families in the population.


The answer is II and III. I understand why I is wrong, and I understand why V is wrong. However, to me, IV seems exactly the same as III. Apparently it's because the proportion is too low, but that does not make sense to me in terms of the CLT. Why should it matter what the proportion is, as long as the sample means have that same proportion?
 
If the sample size or the proportion is too small, the normal approximation is too poor to be useful. As an example, consider the Binomial(N, p) distribution. The CLT applies to this case, but only as N grows; for a fixed N, the quality of the approximation depends on p. The rule of thumb is that the normal approximation to the binomial is poor unless Np > 10 (and, by symmetry, N(1 - p) > 10).

http://www.stat.yale.edu/Courses/1997-98/101/binom.htm

To convince yourself, look at the probability mass function of some binomial distributions that don't satisfy Np > 10, e.g. N = 100, p = 0.01, where Np = 1.
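A rough way to run that check numerically is sketched below. It is only an illustration, not something from the linked page: it compares the exact binomial probabilities with a normal approximation that has the same mean and variance, using a continuity correction. The helper names are made up for this sketch, and the second parameter pair (N = 400, p = 0.4) is an assumed stand-in for scenario III, since the question does not state the true proportion.

```python
import math

def binom_pmf(n, p, k):
    """Exact Binomial(n, p) probability of observing k successes."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_cdf(x, mu, sigma):
    """CDF of Normal(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def max_abs_error(n, p):
    """Largest gap between the exact pmf and the normal approximation
    (with continuity correction) over k = 0..n."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    worst = 0.0
    for k in range(n + 1):
        approx = normal_cdf(k + 0.5, mu, sigma) - normal_cdf(k - 0.5, mu, sigma)
        worst = max(worst, abs(binom_pmf(n, p, k) - approx))
    return worst

def normal_mass_below_zero(n, p):
    """Probability the fitted normal assigns to impossible negative counts."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return normal_cdf(-0.5, mu, sigma)

if __name__ == "__main__":
    # (100, 0.01) is the example above; (400, 0.4) is an assumed stand-in for III.
    for n, p in [(100, 0.01), (400, 0.4)]:
        print(f"N={n}, p={p}: Np={n * p:g}, "
              f"max pmf error={max_abs_error(n, p):.4f}, "
              f"normal mass below 0={normal_mass_below_zero(n, p):.4f}")
```

With Np = 1 the binomial is strongly right-skewed and piled up at 0 and 1, so the fitted normal puts noticeable probability on negative counts and misses the spike at zero. With Np well above 10 the two agree closely.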
 
Thanks Aleph, I understand now.
 