Testing Hypothesis (vaccine)

  • Context: High School
  • Thread starter: Agent Smith
  • Tags: hypothesis, Vaccine

Discussion Overview

The discussion revolves around the statistical analysis of vaccine efficacy using hypothesis testing. Participants explore different methods for comparing disease incidence between vaccinated and unvaccinated groups, including the use of Chi-square tests and a specific formula for calculating z-scores. The conversation includes questions about the validity of approaches and interpretations of results.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant presents a hypothesis testing framework with null and alternative hypotheses regarding disease incidence in vaccinated versus unvaccinated groups.
  • Another participant suggests that the proposed hypothesis can be tested using a formula that indicates significant differences.
  • Concerns are raised about discrepancies between P-values obtained from different statistical methods, specifically between Chi-square tests and the method outlined in the original post.
  • A participant questions the implications of known versus unknown variances in the context of the analysis.
  • Clarifications are sought regarding the components of the formula, particularly how proportions and standard deviations are calculated and combined.
  • Participants confirm that assigned values for proportions can be used in the analysis, leading to a reformulation of hypotheses.

Areas of Agreement / Disagreement

Participants express differing views on the equivalence of statistical methods and the interpretation of results, indicating that the discussion remains unresolved with multiple competing perspectives on the analysis.

Contextual Notes

Participants note potential limitations related to the assumptions of variance and the conditions under which the statistical tests are applied, but these remain unresolved.

Agent Smith
TL;DR
Vaccine efficacy hypothesis using a method that may/may not be appropriate
The placebo/control group has 800 people. They aren't given the vaccine. 60 of them develop the disease.
The treatment group has 1000 people. They're given the vaccine. 15 of them develop the disease.

It seems there's a formula for this, viz. ##\frac{p_1 - p_2}{\sqrt{p (1 - p)\left(\frac{1}{n_1} + \frac {1}{n_2}\right)}}##.

However, I tried something different and would like to know if it's valid.

With ##p_1## = proportion diseased among the unvaccinated and ##p_2## = proportion diseased among the vaccinated, my hypotheses are:
##H_0: p_2 =p_1##
##H_a: p_2 < p_1##

Is this ok?
 
It is equivalent. Your ##H_0## can be tested by seeing if the above formula is significantly different from 0.
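
A minimal sketch of that calculation with the thread's counts, using only the Python standard library. The helper names `pooled_z` and `upper_tail` are made up for illustration, and taking group 1 as the unvaccinated group is an assumption that matches the hypotheses in the first post.

```python
import math

def pooled_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)  # pooled proportion under H0
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def upper_tail(z):
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Assumed labeling: group 1 = unvaccinated (60 of 800), group 2 = vaccinated (15 of 1000)
z = pooled_z(60, 800, 15, 1000)
p_value = upper_tail(z)  # one-sided P-value for H_a: p2 < p1
print(f"z = {z:.2f}, one-sided P = {p_value:.2e}")
```

The upper tail is used because the alternative ##p_2 < p_1## corresponds to a positive ##p_1 - p_2##.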
 
@Dale when I do a Chi-square test, I get ##\chi^2 \approx 40## with 1 degree of freedom, which gives a P-value << 0.05,

BUT

when I use the method in the OP, the P-value = 0.0113. It is < 0.05, but not << 0.05.

How come?
 
I don’t know. I think they should be the same, but I have to admit that I have not calculated this test by hand in almost 30 years.
 
[Image attachment: Capture.PNG]

@Dale can you break the formula down for me? We seem to be "combining" the frequencies with p.
##p = \frac{15 + 60}{1000 + 800}## and then using it to compute a standard deviation ##\sqrt {p (1 - p)}##. What does ##\left(\frac{1}{n_1} + \frac{1}{n_2}\right)## do?

A person has assigned values for ##p_1## and ##p_2## as follows:
[Image attachment: Capture.PNG]

Can I do ##p_1 = 60/800## and ##p_2 = 15/1000##?
 
In the original question, it says "variances unknown". What if variances are known? 🤔
All the relevant variances/standard deviations seem computable from givens.
 
Agent Smith said:
@Dale when I do a Chi-square test, I get ##\chi^2 \approx 40## with 1 degree of freedom, which gives a P-value << 0.05,
This is okay.

Agent Smith said:
BUT

when I use the method in the OP, the P-value = 0.0113. It is < 0.05, but not << 0.05.

How come?
I get ## z \approx 6.3 ##, which gives a ## P ##-value much less than ## 0.05 ##.
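
One way to check that the two approaches agree is to compute both statistics from the same counts. The sketch below assumes the Pearson chi-square for a 2×2 table without continuity correction, and it uses the identity that a chi-square variable with 1 degree of freedom is the square of a standard normal; the variable names are illustrative.

```python
import math

# 2x2 table: rows = group, columns = (diseased, not diseased)
a, b = 60, 740   # unvaccinated: 60 of 800
c, d = 15, 985   # vaccinated: 15 of 1000
n = a + b + c + d

# Pearson chi-square for a 2x2 table (no continuity correction assumed)
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Pooled two-proportion z statistic from the same counts
p = (a + c) / n
se = math.sqrt(p * (1 - p) * (1 / (a + b) + 1 / (c + d)))
z = (a / (a + b) - c / (c + d)) / se

# With 1 degree of freedom, P(chi2 >= x) = P(|Z| >= sqrt(x)) = erfc(sqrt(x / 2))
p_chi2 = math.erfc(math.sqrt(chi2 / 2))
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"chi2 = {chi2:.2f}, z^2 = {z * z:.2f}")
print(f"P(chi2) = {p_chi2:.2e}, two-sided P(z) = {p_two_sided:.2e}")
```

The chi-square P-value corresponds to the two-sided z test, so a one-sided P-value from the z test is half of it.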

Agent Smith said:
[Image attachment 353603]

@Dale can you break the formula down for me? We seem to be "combining" the frequencies with p.
##p = \frac{15 + 60}{1000 + 800}## and then using it compute a standard deviation ##\sqrt {p (1 - p)}##. What does ##\left(\frac{1}{n_1} + \frac{1}{n_2}\right)## do?
$$
\begin{aligned}
z&=\frac{p_1-p_2}{\sqrt{p(1-p)\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}\\
&=\frac{p_1-p_2}{\sqrt{\frac{p(1-p)}{n_1}+\frac{p(1-p)}{n_2}}}
\end{aligned}
$$

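where ## \sigma_1^2=\frac{p(1-p)}{n_1} ## is the variance of the first sample proportion under ## H_0 ## (800 people, 60 of whom develop the disease) and ## \sigma_2^2=\frac{p(1-p)}{n_2} ## is the variance of the second sample proportion under ## H_0 ## (1000 people, 15 of whom develop the disease).

The two samples are independent, so the variances add: under ## H_0 ##, ## p_1-p_2 ## is approximately normally distributed, ## \mathcal{N}(0,\sigma_1^2+\sigma_2^2) ##, and ## \frac{1}{n_1}+\frac{1}{n_2} ## is what remains after factoring ## p(1-p) ## out of that sum.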
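
As a rough numeric check with the counts from this thread (values rounded, and taking group 1 to be the unvaccinated group, which is an assumption about the labeling):

$$
\begin{aligned}
p &= \frac{60+15}{800+1000} = \frac{75}{1800} \approx 0.0417, \qquad p(1-p) \approx 0.0399\\
\sigma_1^2 &\approx \frac{0.0399}{800} \approx 4.99\times10^{-5}, \qquad \sigma_2^2 \approx \frac{0.0399}{1000} \approx 3.99\times10^{-5}\\
z &= \frac{0.075-0.015}{\sqrt{\sigma_1^2+\sigma_2^2}} \approx \frac{0.060}{0.00948} \approx 6.33
\end{aligned}
$$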

Agent Smith said:
A person has assigned values for ##p_1## and ##p_2## as follows:
[Image attachment 353604]

Can I do ##p_1 = 60/800## and ##p_2 = 15/1000##?
Yes, you can, and now you will have ## H_0:p_1-p_2=0 ## and ## H_A:p_1-p_2<0 ##. In this case the ## z ## value will be ## -6.3 ## and ## P ## will be the same as for the case in the original post.
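
A small sketch of why the P-value does not depend on the labeling: swapping which group is called group 1 flips the sign of ## z ##, and the test then uses the opposite tail. The function names are hypothetical, and the counts are the ones given in the thread.

```python
import math

def pooled_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic, computed as (group 1) - (group 2)."""
    p = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se

def upper_tail(z):
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def lower_tail(z):
    """P(Z <= z) for a standard normal Z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

# Unvaccinated group listed first: the difference is positive, so use the upper tail
z_unvacc_first = pooled_z(60, 800, 15, 1000)
p_unvacc_first = upper_tail(z_unvacc_first)

# Vaccinated group listed first: the difference is negative, so use the lower tail
z_vacc_first = pooled_z(15, 1000, 60, 800)
p_vacc_first = lower_tail(z_vacc_first)

print(f"z = {z_unvacc_first:.2f} vs z = {z_vacc_first:.2f}")
print(f"P = {p_unvacc_first:.2e} vs P = {p_vacc_first:.2e}")
```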
 
@Gavran thanks. Very clear. I hope I can remember. :smile:
 
