Is a Two-Tailed Test Appropriate for Analyzing Plant Position and Seed Count?

  • Thread starter tzx9633
  • Start date
  • Tags
    Sign Test
In summary, the conversation discusses the appropriate statistical test to use for a given hypothesis and data set. The topic of discussion is whether to use a one-tailed or two-tailed test, and the use of z-tests and other tests such as t-tests, Mann-Whitney U test, and Wilcoxon rank sum test. The example given in the conversation is not entirely clear and may have errors, making it difficult to determine the correct test to use.
  • #1
tzx9633

Homework Statement


In the attached photo,
H0: the position of the plant doesn't affect the number of seeds in the pods.
H1: the position of the plant affects the number of seeds in the pods.

The Attempt at a Solution


This is a two-tailed test, am I right? Referring to the normal distribution table, for ##\alpha = 0.05## we have ##z_{\alpha} = z_{0.05} = 1.645## and ##z_{\alpha/2} = z_{0.025} = 1.960##. In the example, it's clear that the author uses a one-tailed test. I think that's wrong; he should use ##z_{\alpha/2} = z_{0.025} = 1.960##. Am I right? Correct me if I am wrong.
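The two table values quoted above can be reproduced with a short sketch that uses only Python's standard library. The helper name `z_critical` is made up for illustration; it simply inverts the standard normal CDF, splitting ##\alpha## across both tails when the test is two-tailed.

```python
from statistics import NormalDist


def z_critical(alpha: float, tails: int) -> float:
    """Upper critical value of the standard normal for a 1- or 2-tailed test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A two-tailed test puts alpha/2 in each tail; a one-tailed test puts
    # all of alpha in a single tail.
    return NormalDist().inv_cdf(1 - alpha / tails)


one_tailed = z_critical(0.05, tails=1)
two_tailed = z_critical(0.05, tails=2)
print(f"one-tailed: {one_tailed:.3f}, two-tailed: {two_tailed:.3f}")
# prints "one-tailed: 1.645, two-tailed: 1.960"
```

So the same ##\alpha = 0.05## gives 1.645 or 1.960 depending only on whether the rejection region sits in one tail or is split between two.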

Homework Equations

 

Attachments

  • 771.PNG
  • 772.PNG
  • 773.PNG
  • #2
If the author intended you to use a one-tailed test, it's not very clear from the problem description.
 
  • #3
Mark44 said:
If the author intended you to use a one-tailed test, it's not very clear from the problem description.
Then is it a one-tailed or a two-tailed test, in your opinion? Because H1 = the position of the plant affects the number of seeds in the pods, I assume it's a two-tailed test.
 
  • #4
I think you are right. The one tail test would be appropriate for these hypotheses:
H1 = Being on top increases the number of seeds
H0 = Being on top does not increase the number of seeds

The two-tailed test is appropriate for the hypotheses stated in the book, which concern any difference, increase or decrease. The Z value used in the book, 1.645, is the two-tailed critical value for the 90% confidence level, i.e. 5% in each tail (the 95th percentile on each side). (See the second table, "Critical values", in http://pegasus.cc.ucf.edu/~pepe/Tables )
 
  • #5
FactChecker said:
I think you are right. The one tail test would be appropriate for these hypotheses:
H1 = Being on top increases the number of seeds
H0 = Being on top does not increase the number of seeds

The two-tailed test is appropriate for the hypotheses stated in the book, which concern any difference, increase or decrease. The Z value used in the book, 1.645, is the two-tailed critical value for the 90% confidence level, i.e. 5% in each tail (the 95th percentile on each side). (See the second table, "Critical values", in http://pegasus.cc.ucf.edu/~pepe/Tables )

Here are my notes:

• pairs of observations are independent and

– the sample size is large, or it is small and the data are normal: use the t-test.

– the sample size is small and the data are not normal: use the Wilcoxon rank sum (Mann-Whitney U) test.

• pairs of observations are dependent and

– the sample size is large, or it is small and the data are normal: use the paired t-test.

– the sample size is small and the data are not normal: use the Wilcoxon signed rank test.

In the previous example, it's a dependent test (testing whether the position on the plant affects the number of seeds in the pods or not), am I right? For a dependent test there are only 2 choices, right? Which are the Mann-Whitney and the t-test, am I right? Why does the author use a z-test in the first example?
 
  • #6
tzx9633 said:
Here are my notes:

• pairs of observations are independent and

– the sample size is large, or it is small and the data are normal: use the t-test.

– the sample size is small and the data are not normal: use the Wilcoxon rank sum (Mann-Whitney U) test.

• pairs of observations are dependent and

– the sample size is large, or it is small and the data are normal: use the paired t-test.

– the sample size is small and the data are not normal: use the Wilcoxon signed rank test.

In the previous example, it's a dependent test (testing whether the position on the plant affects the number of seeds in the pods or not), am I right? For a dependent test there are only 2 choices, right? Which are the Mann-Whitney and the t-test, am I right? Why does the author use a z-test in the first example?

There are many things wrong with this example.
(1) The data make little sense. How can the number of seeds be 5.2 in the top pod and 3.7 in the bottom pod of the exact same plant? Don't seeds come in integer numbers, 0,1,2,...? How can you have an "average number of seeds" for a single plant? I suppose there could be N1 pods on top and N2 pods on the bottom, with the "averages" being the average number per pod among the N1 top pods, etc. However, in that case, why bother with averages? Just look at the total number of seeds on top and on the bottom. It looks to me like a highly artificial problem using some arbitrary numbers, designed to give the illusion of a real problem. However, YOU are stuck with doing the example, whether it makes sense or not!

(2) How can the seed numbers in the top or bottom, or their difference, be Binomial(10, 0.5)? Here, the '10' looks like the number of plants tested, but the number of seeds in an individual plant will not depend on how many plants we choose to examine. The number of seeds per plant will be determined by the biology of the plant itself, and possibly by environmental factors, etc. Possibly, saying that the number of seeds is random with distribution ##\text{Binom}(N,p)## could be a good approximation, but there is no way to say a priori that ##N=10## and ##p = 1/2##. I guess it is possible that the person setting the problem really meant to say that "observation shows that the probability of seed numbers is approximately binomial with parameters ##N=10## and ##p = 0.5##", but if that is the case, that is the way they should have said it. Otherwise, they are likely to maximize the confusion of the student.

(3) Since the top/bottom numbers are paired (to a single plant), using a paired-sample test (such as a paired-sample t-test) might be appropriate. Certainly for a single plant the top and bottom numbers are dependent, but with careful experimental design or appropriate data-gathering, the numbers between different plants might, possibly, be independent. Even if a paired-sample t-test is not appropriate (because of non-normality of the data), it would make sense to use a non-parametric test for ##H_0: \mu=0## vs ##H_1: \mu \neq 0## for the mean ##\mu## of a sample ##X_1, X_2, \ldots, X_{10}##. Here ##X = \text{top number} - \text{bottom number}##. And, a two-sided test would be the way to go.
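To make point (3) concrete, here is a minimal sketch of how the paired differences ##X = \text{top number} - \text{bottom number}## could be formed and summarized. The ten pairs of numbers are invented for illustration and are NOT the values from the attached images; only the standard library is used, so the t statistic would still have to be compared with a t table on ##n - 1## degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical per-plant averages; NOT the values from the attached images.
top = [5.2, 4.1, 3.9, 4.6, 5.0, 4.4, 3.8, 4.9, 4.2, 4.7]
bottom = [3.7, 4.3, 3.5, 4.0, 4.1, 4.6, 3.2, 4.0, 3.9, 4.1]

# One difference per plant: X = top number - bottom number.
diffs = [t - b for t, b in zip(top, bottom)]
n = len(diffs)

# Sign counts, the raw material for a sign test (zero differences count as neither).
n_plus = sum(d > 0 for d in diffs)
n_minus = sum(d < 0 for d in diffs)

# Paired-sample t statistic for H0: mu = 0 (two-sided alternative mu != 0).
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))

print(f"n = {n}, plus = {n_plus}, minus = {n_minus}, t = {t_stat:.3f}")
```

Either summary works on the same ten differences: the sign counts ignore the magnitudes, while the t statistic uses them and therefore leans on the normality assumption.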

I looked only at your first posted image; as I said before, I won't look at posted images of solutions, only at typed work.
 
  • #7
tzx9633 said:
In the previous example, it's a dependent test (testing whether the position on the plant affects the number of seeds in the pods or not), am I right? For a dependent test there are only 2 choices, right? Which are the Mann-Whitney and the t-test, am I right? Why does the author use a z-test in the first example?
It would require a separate statistical test to determine if the paired results are dependent or not. If an unpaired test is used where a paired test is possible, a lot of information is lost and the unpaired test may be much weaker. So I agree that a paired test would be better. There is no reason to assume that the paired results are independent of each other. The book answer is a paired test. Each plant result is a comparison of its pair of top versus bottom and turned into a single binomial result (top greater => +, top smaller => -). The total of the binomial results is then approximated by the normal distribution. This is the usual thing to do for a large binomial sample.
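The procedure described above (one + or - per plant, then a normal approximation to the binomial count) can be sketched as follows. The function name `sign_test` and the counts 8 and 2 are hypothetical, and the continuity correction is an added assumption, since the thread does not say whether the book applies one.

```python
from math import comb, sqrt


def sign_test(n_plus: int, n_minus: int) -> tuple[float, float]:
    """Return (exact two-sided p-value, normal-approximation z) for a sign test."""
    n = n_plus + n_minus  # tied pairs are assumed to have been dropped already
    k = max(n_plus, n_minus)

    # Exact: under H0 the number of + signs is Binomial(n, 0.5).
    # Double the upper tail P(X >= k) and cap the result at 1.
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    p_exact = min(1.0, 2 * upper_tail)

    # Normal approximation: mean n/2, standard deviation sqrt(n)/2,
    # with a continuity correction of 0.5 (an assumption, see lead-in).
    z = max(0.0, abs(n_plus - n / 2) - 0.5) / (sqrt(n) / 2)
    return p_exact, z


p_value, z_stat = sign_test(n_plus=8, n_minus=2)  # hypothetical counts
print(f"exact two-sided p = {p_value:.4f}, z = {z_stat:.3f}")
```

The z value would then be compared with 1.960 for a two-tailed test at the 0.05 level, or with 1.645 for a one-tailed test.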
 
  • #8
FactChecker said:
It would require a separate statistical test to determine if the paired results are dependent or not. If an unpaired test is used where a paired test is possible, a lot of information is lost and the unpaired test may be much weaker. So I agree that a paired test would be better. There is no reason to assume that the paired results are independent of each other. The book answer is a paired test. Each plant result is a comparison of its pair of top versus bottom and turned into a single binomial result (top greater => +, top smaller => -). The total of the binomial results is then approximated by the normal distribution. This is the usual thing to do for a large binomial sample.
OK, thanks for your explanation. Why isn't the t-test used here? Why is the z-test used?
 
  • #9
FactChecker said:
It would require a separate statistical test to determine if the paired results are dependent or not. If an unpaired test is used where a paired test is possible, a lot of information is lost and the unpaired test may be much weaker. So I agree that a paired test would be better. There is no reason to assume that the paired results are independent of each other. The book answer is a paired test. Each plant result is a comparison of its pair of top versus bottom and turned into a single binomial result (top greater => +, top smaller => -). The total of the binomial results is then approximated by the normal distribution. This is the usual thing to do for a large binomial sample.
Post #5 states that when pairs of observations are dependent and

– the sample size is large, or it is small and the data are normal, the paired t-test should be used.
 
  • #10
Rigorous application of the t-test would be difficult for this problem. The paired samples t-test needs paired samples from fixed populations. In this example, each number is an average of seeds from pods. The number of pods of each plant and each top/bottom may be different and so each average may be from a different distribution. By turning each plant into one binomial sample point (top larger or top smaller), those issues disappear. Once the problem is turned into a binomial, the approximation for large n (preferably n > 20, but this is just an example problem) is the normal distribution.
 
  • #11
FactChecker said:
Rigorous application of the t-test would be difficult for this problem. The paired samples t-test needs paired samples from fixed populations. In this example, each number is an average of seeds from pods. The number of pods of each plant and each top/bottom may be different and so each average may be from a different distribution. By turning each plant into one binomial sample point (top larger or top smaller), those issues disappear. Once the problem is turned into a binomial, the approximation for large n (preferably n > 20, but this is just an example problem) is the normal distribution.
So the notes are wrong? When the observations are dependent, we should use the normal distribution (z-test) and not the t-test?
 
  • #12
tzx9633 said:
So the notes are wrong? When the observations are dependent, we should use the normal distribution (z-test) and not the t-test?
In this case, I think so. There is more to using the t-test than just "dependent" (see https://en.wikipedia.org/wiki/Student's_t-test#Assumptions ). The population distributions should also be the same normal distribution for each member of the pair. In other words, the population of the top pod numbers should be from one normal distribution and the population of the bottom pod numbers should be from another normal distribution. The two distributions (top and bottom) should have the same variances.

That being said, the Student's t-test is reasonably robust regarding violations of the required assumptions. So it may be acceptable to use. I have not studied violations of the assumptions. What the book did by approximating the binomial with a small sample of 10 with a normal distribution is also marginal. The recommendation is to have a sample size of at least 20 (see https://en.wikipedia.org/wiki/Binomial_distribution#Normal_approximation ).
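To get a feel for how marginal the normal approximation is at a sample size of 10, the exact binomial tail can be compared with its approximation. This is only a sketch: the function names and the (n, k) pairs are chosen for illustration, and the continuity correction is an assumption rather than something taken from the book.

```python
from math import comb, sqrt
from statistics import NormalDist


def upper_tail_exact(n: int, k: int) -> float:
    """P(X >= k) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n


def upper_tail_normal(n: int, k: int) -> float:
    """Continuity-corrected normal approximation to P(X >= k)."""
    approx = NormalDist(mu=n / 2, sigma=sqrt(n) / 2)
    return 1 - approx.cdf(k - 0.5)


# Illustrative (n, k) pairs; n = 10 matches the sample size in the example.
for n, k in [(10, 8), (10, 9), (20, 15), (40, 27)]:
    exact = upper_tail_exact(n, k)
    approx = upper_tail_normal(n, k)
    print(f"n={n:2d} k={k:2d} exact={exact:.4f} normal={approx:.4f}")
```

Running it shows how far apart the two columns are for each case, which is the quantity the "at least 20" rule of thumb is trying to keep small.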
 
  • #13
FactChecker said:
I think you are right. The one tail test would be appropriate for these hypotheses:
H1 = Being on top increases the number of seeds
H0 = Being on top does not increase the number of seeds

The two-tailed test is appropriate for the hypotheses stated in the book, which concern any difference, increase or decrease. The Z value used in the book, 1.645, is the two-tailed critical value for the 90% confidence level, i.e. 5% in each tail (the 95th percentile on each side). (See the second table, "Critical values", in http://pegasus.cc.ucf.edu/~pepe/Tables )
Can you help me confirm this again? I read several online notes, and the alpha used is 0.05, not 0.025 (alpha/2), for a 2-tailed test...
 
  • #14
1) The test should be a 2-tailed test because the null hypothesis you specified is "doesn't affect". Rejecting it covers effects in either direction, higher or lower, so 2 tails.
2) You specified a Zc value of 1.645 in the OP. You can see on the "Critical Values" table in the link that the Zc value of 1.645 is the 0.90 2-sided value.
3) A 2-tailed test at 0.90 confidence has 0.05 on each side (tail). So the number you give is for a 90% confidence level.

If you want 95% confidence, you need to either change the hypothesis to the 1-sided "tops have more seeds than the bottoms", or change the Zc value to 1.96.
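The correspondence between a Zc value and its confidence level can be checked numerically. The sketch below uses only Python's standard library; the function names are invented for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def two_sided_confidence(z_c: float) -> float:
    """Probability that a standard normal value lies between -z_c and +z_c."""
    return 2 * STANDARD_NORMAL.cdf(z_c) - 1


def one_sided_confidence(z_c: float) -> float:
    """Probability that a standard normal value lies below +z_c."""
    return STANDARD_NORMAL.cdf(z_c)


for z_c in (1.645, 1.96):
    print(
        f"Zc = {z_c}: two-sided {two_sided_confidence(z_c):.3f}, "
        f"one-sided {one_sided_confidence(z_c):.3f}"
    )
```

For Zc = 1.645 the two-sided level comes out near 0.90 and the one-sided level near 0.95; for Zc = 1.96 the two-sided level is near 0.95.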
 
  • #15
FactChecker said:
1) The test should be a 2-tailed test because the null hypothesis you specified is "doesn't affect". Rejecting it covers effects in either direction, higher or lower, so 2 tails.
2) You specified a Zc value of 1.645 in the OP. You can see on the "Critical Values" table in the link that the Zc value of 1.645 is the 0.90 2-sided value.
3) A 2-tailed test at 0.90 confidence has 0.05 on each side (tail). So the number you give is for a 90% confidence level.

If you want 95% confidence, you need to either change the hypothesis to the 1-sided "tops have more seeds than the bottoms", or change the Zc value to 1.96.
So your conclusion is that at alpha = 0.05, Zc should be Z0.025 = 1.96, am I right?
 
  • #16
FactChecker said:
1) The test should be a 2-tailed test because the null hypothesis you specified is "doesn't affect". Rejecting it covers effects in either direction, higher or lower, so 2 tails.
2) You specified a Zc value of 1.645 in the OP. You can see on the "Critical Values" table in the link that the Zc value of 1.645 is the 0.90 2-sided value.
3) A 2-tailed test at 0.90 confidence has 0.05 on each side (tail). So the number you give is for a 90% confidence level.

If you want 95% confidence, you need to either change the hypothesis to the 1-sided "tops have more seeds than the bottoms", or change the Zc value to 1.96.
https://www.physicsforums.com/threads/wilcoxon-sign-rank-test-rejection-region.935726/

Can you refer to this thread, please? It is a Wilcoxon signed-rank test (a 2-tailed test), but the author uses alpha instead of 0.5 alpha... I'm wondering whether the sign test is the same as the signed-rank test, which uses alpha for a 2-tailed test instead of 0.5 alpha?
 

Related to Is a Two-Tailed Test Appropriate for Analyzing Plant Position and Seed Count?

1. What is a sign test distribution table?

A sign test distribution table is a table that shows the critical values for conducting a sign test, which is a non-parametric statistical test used to determine if there is a significant difference between two related samples.

2. How is a sign test distribution table used?

A sign test distribution table is used by finding the critical value that corresponds to the sample size and significance level of the sign test being performed. This critical value is then compared to the calculated test statistic to determine if there is a significant difference between the two samples.

3. What is the purpose of a sign test distribution table?

The purpose of a sign test distribution table is to make it easier for researchers to conduct a sign test by providing the critical values needed to determine if the results are statistically significant. It also helps to standardize the process of conducting a sign test.

4. Are there different sign test distribution tables for different sample sizes or significance levels?

Yes, there are different sign test distribution tables for different sample sizes and significance levels. This is because the critical values needed for the sign test vary based on these factors.
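As an illustration of why the critical values change with sample size and significance level, such a table can be generated directly from the Binomial(n, 0.5) distribution. This is a minimal sketch under one common convention (reject when the smaller sign count is at most the critical value); the function name is hypothetical and published tables may use a different convention.

```python
from math import comb
from typing import Optional


def sign_test_critical(n: int, alpha: float, tails: int = 2) -> Optional[int]:
    """Largest c with P(X <= c) <= alpha / tails for X ~ Binomial(n, 0.5).

    Returns None when even c = 0 exceeds the allowed tail probability,
    i.e. when the sample is too small to reject at this level.
    """
    allowed = alpha / tails
    cumulative = 0.0
    critical = None
    for c in range(n + 1):
        cumulative += comb(n, c) / 2**n
        if cumulative > allowed:
            break
        critical = c
    return critical


# Small table for a two-tailed test; rows are sample sizes.
for n in (5, 8, 10, 12, 15, 20):
    row = [sign_test_critical(n, alpha) for alpha in (0.10, 0.05, 0.01)]
    print(n, row)
```

The test is then carried out by comparing the smaller of the two sign counts with the table entry for the given n and significance level.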

5. Can a sign test distribution table be used for any type of data?

Yes, a sign test distribution table can be used for any type of data as long as the assumptions of the sign test are met. These assumptions include having two related samples and a dichotomous outcome (e.g. success or failure).
