Statistics - Joint PMF / Hypothesis Testing

brojesus111

Homework Statement



We sample a population 50 times with replacement, with each individual equally likely to be sampled. We survey gender and quality of life. The counts are: male: 13 high, 11 low | female: 18 high, 8 low.

Let P_m, P_f, P_h, P_l denote the population proportions for each of the categories, and let the population proportions for the combinations be denoted by P_mh, P_ml, P_fh, P_fl. The parameters are denoted by θ=(P_mh, P_ml, P_fh, P_fl).

Null hypothesis: P_mh = P_m * P_h, P_ml = P_m * P_l... same for females
Alternative hypothesis: P_mh ≠ P_m * P_h, P_ml ≠ P_m * P_l... same for females

1. Data counts are denoted by N_mh, N_ml, N_fh, N_fl. What is the joint probability mass function for these counts?
2. Write down the sets w_0 and w_A so that the null hypothesis is θ∈w_0 and the alternative hypothesis is θ∈w_A.

Homework Equations


The Attempt at a Solution



1. So I understand that the problem wants us to find P[(N_mh, N_ml, N_fh, N_fl)=(n_mh, n_ml, n_fh, n_fl)]. But I'm a bit confused about the notation and how to answer this problem.

2. Once again I'm not sure I understand exactly how to approach this problem.

Any help is greatly appreciated.
 
The population falls into 4 groups, g_1 to g_4, say. According to the null hypothesis they have frequencies P_m * P_h, P_m * P_l etc. Call these p_1 to p_4. Suppose you take N samples from the population, and you get, in order, 2 1 2 4 4 3 2 1 4 ..., getting, in all, n_i from group g_i. What would the probability of that specific sequence be? How many sequences give the same total for each group?
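
For readers following along, the two questions in this hint can be sketched as below. The symbols p_i, n_i and N are the ones introduced in the hint, and the draws are assumed independent (sampling with replacement). The same counting argument holds for any cell probabilities p_1, ..., p_4 that sum to 1, not only those of the null hypothesis.

$$P(\text{one specific sequence}) = p_1^{n_1}\, p_2^{n_2}\, p_3^{n_3}\, p_4^{n_4}, \qquad \#\{\text{sequences with totals } n_1,\dots,n_4\} = \frac{N!}{n_1!\, n_2!\, n_3!\, n_4!}, \qquad n_1+n_2+n_3+n_4 = N$$

Multiplying the two factors gives the probability of observing the totals (n_1, n_2, n_3, n_4).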
 

So it's a multinomial distribution.
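
A minimal Python sketch of that multinomial pmf may help make the notation concrete. The function name `multinomial_pmf` and the value of `theta` are illustrative assumptions and do not come from the problem; only the observed counts are taken from the problem statement.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """P[(N_1, ..., N_k) = counts] for sum(counts) independent draws
    with cell probabilities probs (which must sum to 1)."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("cell probabilities must sum to 1")
    n = sum(counts)
    # Number of distinct sequences that give these cell totals: n! / (n_1! ... n_k!)
    n_sequences = factorial(n)
    for c in counts:
        n_sequences //= factorial(c)
    # Probability of any one such sequence: p_1^n_1 * ... * p_k^n_k
    p_sequence = prod(p ** c for p, c in zip(probs, counts))
    return n_sequences * p_sequence


# Observed counts (n_mh, n_ml, n_fh, n_fl) from the problem statement.
observed = (13, 11, 18, 8)
# Hypothetical parameter value, chosen only for illustration.
theta = (0.25, 0.25, 0.25, 0.25)
print(multinomial_pmf(observed, theta))
```

The pmf is a function of θ = (P_mh, P_ml, P_fh, P_fl); the null hypothesis only restricts which values of θ are allowed.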

Any hints or tips on the second question?
 
I think the key point about the second question is that θ is a vector of four unknowns, but:
- they must add up to 1
- the null hypothesis does not prescribe them all, only a relationship between them.
So I think it's looking for a vector, involving some free parameter, which generically describes elements of w_0.
 

So I think w_0 is just a point and w_A is a hyperplane in 4 dimensions that is missing the point that is in w_0. My problem is how I would write that in the context of this problem.
 
brojesus111 said:
So I think w_0 is just a point
No, that's exactly what I'm saying it isn't. The null hypothesis makes no claim about the gender ratio in the population. Therefore there is a continuum of values of the 4-vector which satisfy it. The answer will be something like {(f_1(t), f_2(t), f_3(t), f_4(t))} where the f_i(t) are functions (which you need to determine) of some unknown parameter t. Or maybe there are two free parameters.
w_A will be everywhere in the 4-space except for that continuum.
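
One way to make this concrete, as a sketch rather than the thread's stated answer: take two free parameters a = P_m and b = P_h, so that elements of w_0 have the form (ab, a(1-b), (1-a)b, (1-a)(1-b)), and let w_A be every other 4-vector of non-negative entries summing to 1. The function names `theta_under_null` and `in_w0` below are hypothetical helpers, and the tolerance is an arbitrary choice for floating-point comparison.

```python
def theta_under_null(p_m, p_h):
    """Element of w_0 built from the two free parameters p_m and p_h."""
    p_f, p_l = 1.0 - p_m, 1.0 - p_h
    return (p_m * p_h, p_m * p_l, p_f * p_h, p_f * p_l)


def in_w0(theta, tol=1e-9):
    """True if theta = (P_mh, P_ml, P_fh, P_fl) satisfies the null hypothesis."""
    p_mh, p_ml, p_fh, p_fl = theta
    if min(theta) < 0 or abs(sum(theta) - 1.0) > tol:
        raise ValueError("theta must be non-negative and sum to 1")
    # Marginal proportions implied by theta.
    p_m = p_mh + p_ml
    p_h = p_mh + p_fh
    # Under the null, every cell equals the product of its marginals.
    expected = theta_under_null(p_m, p_h)
    return all(abs(t - e) <= tol for t, e in zip(theta, expected))


print(in_w0(theta_under_null(0.3, 0.6)))  # built from the parametrization
print(in_w0((0.4, 0.1, 0.1, 0.4)))        # gender and quality of life are dependent here
```

Varying a and b over [0, 1] sweeps out a two-dimensional surface, which is why w_0 is a continuum rather than a single point.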
 