tzx9633 said:
Here are my notes:
• pairs of observations are independent and
– the sample size is large, or it is small and the data are normal, then use the t-test.
– the sample size is small and the data are not normal, then use the Wilcoxon rank sum (Mann-Whitney U) test.
• pairs of observations are dependent and
– the sample size is large, or it is small and the data are normal, then use the paired t-test.
– the sample size is small and the data are not normal, then use the Wilcoxon signed rank test.
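The decision rule in these notes can be written out as a small lookup. This is only a sketch of the notes as read here ("large sample, or small sample with normal data" goes to the t-test family); the function name `choose_test` and its three boolean parameters are made up for illustration.

```python
def choose_test(paired: bool, large_sample: bool, normal: bool) -> str:
    """Return the test suggested by the notes for comparing two sets of observations.

    Assumption: "large or small and data normal" is read as
    "large sample, OR small sample with normal data".
    """
    parametric = large_sample or normal
    if paired:
        # Dependent (paired) observations
        return "paired t-test" if parametric else "Wilcoxon signed rank test"
    # Independent observations
    return "t-test" if parametric else "Wilcoxon rank sum (Mann-Whitney U) test"


# Small, non-normal, paired data: the notes point to the signed rank test.
print(choose_test(paired=True, large_sample=False, normal=False))
```

Note that under this reading the Mann-Whitney test sits on the independent branch, not the dependent one.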
In the previous example, it's a dependent test (testing whether the position on the plant affects the number of seeds in the pods or not), am I right? For a dependent test, there are only 2 choices, right? Which are the Mann-Whitney test and the t-test, am I right? Why does the author use a z-test in the first example?
There are many things wrong with this example.
(1) The data make little sense. How can the number of seeds be 5.2 in the top pod and 3.7 in the bottom pod of the exact same plant? Don't seeds come in integer numbers, 0,1,2,...? How can you have an "average number of seeds" for a single plant? I suppose there could be N1 pods on top and N2 pods on the bottom, with the "averages" being the average number per pod among the N1 top pods, etc. However, in that case, why bother with averages? Just look at the total number of seeds on top and on the bottom. It looks to me like a highly artificial problem using some arbitrary numbers, designed to give the illusion of a real problem. However, YOU are stuck with doing the example, whether it makes sense or not!
(2) How can the seed numbers in the top or bottom, or their difference, be Binomial(10, 0.5)? Here, the '10' looks like the number of plants tested, but the number of seeds in an individual plant will not depend on how many plants we choose to examine. The number of seeds per plant will be determined by the biology of the plant itself, and possibly by environmental factors, etc. Possibly, saying that the number of seeds is random with distribution ##\text{Binom}(N,p)## could be a good approximation, but there is no way to say a priori that ##N=10## and ##p = 1/2##. I guess it is possible that the person setting the problem really meant to say that "observation shows that the distribution of seed numbers is approximately binomial with parameters ##N=10## and ##p = 0.5##", but if that is the case, that is the way they should have said it. Otherwise, they are likely to maximize the confusion of the student.
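To make the objection concrete, here is what ##\text{Binom}(10, 0.5)## would imply if taken literally. These are standard facts about the binomial distribution, not something stated in the problem:

$$P(X = k) = \binom{10}{k}\left(\frac{1}{2}\right)^{10}, \quad k = 0, 1, \ldots, 10, \qquad E[X] = Np = 5, \qquad \operatorname{Var}(X) = Np(1-p) = 2.5.$$

In particular, such an ##X## takes only integer values from 0 to 10, so it could not describe a difference (top minus bottom), which can be negative.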
(3) Since the top/bottom numbers are paired (to a single plant), using a paired-sample test (such as a paired-sample t-test) might be appropriate. Certainly for a single plant the top and bottom numbers are dependent, but with careful experimental design or appropriate data-gathering, the numbers between different plants might, possibly, be independent. Even if a paired-sample t-test is not appropriate (because of non-normality of the data), it would make sense to use a non-parametric test of ##H_0: \mu=0## vs ##H_1: \mu \neq 0## for the mean ##\mu## of a sample ##X_1, X_2, \ldots, X_{10}##. Here ##X = \text{top number} - \text{bottom number}##. And, a two-sided test would be the way to go.
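As a rough illustration of point (3), here is a minimal sketch that forms the differences ##X_i## and computes both a paired t statistic and an exact two-sided Wilcoxon signed rank p-value by enumerating all sign patterns. The `top` and `bottom` numbers are invented for illustration only (they are not the data from your problem), and the helper names are my own. Strictly speaking, the signed rank test addresses the centre of a symmetric distribution of differences, which coincides with the mean ##\mu## only under that symmetry assumption.

```python
from itertools import product
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(diffs):
    """t = mean(d) / (s_d / sqrt(n)), to be compared with a t distribution on n-1 d.f."""
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n))


def average_ranks(values):
    """Ranks 1..n of the values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def signed_rank_exact_p(diffs):
    """Return (W+, exact two-sided p-value) for the Wilcoxon signed rank test.

    Zero differences are dropped; the null distribution is obtained by
    enumerating all 2^n sign assignments, which is fine for n around 10.
    """
    nonzero = [d for d in diffs if d != 0]
    ranks = average_ranks([abs(d) for d in nonzero])
    w_obs = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    centre = sum(ranks) / 2  # null expectation of W+
    dev_obs = abs(w_obs - centre)
    extreme = 0
    total = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        total += 1
        if abs(w - centre) >= dev_obs - 1e-12:
            extreme += 1
    return w_obs, extreme / total


# HYPOTHETICAL per-plant averages, invented for illustration only.
top = [5.2, 4.8, 6.1, 5.5, 4.9, 5.0, 6.3, 5.7, 4.4, 5.9]
bottom = [3.7, 4.9, 5.0, 4.1, 4.2, 5.3, 5.1, 4.0, 4.6, 4.5]
diffs = [round(t - b, 1) for t, b in zip(top, bottom)]  # X = top - bottom

print("paired t statistic:", paired_t_statistic(diffs))
print("signed rank (W+, two-sided p):", signed_rank_exact_p(diffs))
```

In practice you would more likely call a library routine such as `scipy.stats.ttest_rel` or `scipy.stats.wilcoxon`; the hand-rolled version is only meant to show what is being computed on the differences.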
I looked only at your first posted image; as I said before, I won't look at posted images of solutions, only at typed work.