Statistics - Joint PMF / Hypothesis Testing

Homework Help Overview

The discussion revolves around a statistical problem involving joint probability mass functions and hypothesis testing related to gender and quality of life in a sampled population. Participants are tasked with defining population proportions and formulating hypotheses based on observed data counts.

Discussion Character

  • Exploratory, Conceptual clarification, Mathematical reasoning, Assumption checking

Approaches and Questions Raised

  • Participants explore the joint probability mass function for the counts of different groups and question the notation used in the problem. There is discussion about the multinomial distribution and the implications of the null hypothesis on the parameters involved.

Discussion Status

Some participants have offered insights into the nature of the parameters and their relationships under the null hypothesis. There is an ongoing exploration of how to express the sets w_0 and w_A, with differing interpretations of their dimensionality and structure being discussed.

Contextual Notes

Participants note that the parameters must sum to one and that the null hypothesis does not specify all parameters, leading to a potential continuum of values for the parameter vector. This introduces complexity in defining the sets related to the hypotheses.

brojesus111

Homework Statement



We sample a population 50 times with replacement, with every individual equally likely to be sampled. For each person sampled we record gender and quality of life. The counts are: male: 13 high quality, 11 low quality; female: 18 high quality, 8 low quality.

Let P_m, P_f, P_h, P_l denote the population proportions for each of the categories, and let the population proportions for the combinations be denoted by P_mh, P_ml, P_fh, P_fl. The parameters are denoted by θ = (P_mh, P_ml, P_fh, P_fl).

Null hypothesis: P_mh = P_m * P_h, P_ml = P_m * P_l, and likewise for females.
Alternative hypothesis: P_mh ≠ P_m * P_h, P_ml ≠ P_m * P_l, and likewise for females.

1. Data counts are denoted by N_mh, N_ml, N_fh, N_fl. What is the joint probability mass function for these counts?
2. Write down the sets w_0 and w_A so that the null hypothesis is θ∈w_0 and the alternative hypothesis is θ∈w_A.

Homework Equations


The Attempt at a Solution



1. So I understand that the problem wants us to find P[(N_mh, N_ml, N_fh, N_fl)=(n_mh, n_ml, n_fh, n_fl)]. But I'm a bit confused by the notation and by how to answer this problem.

2. Once again, I'm not sure I understand how to approach this problem.

Any help is greatly appreciated.
 
The population falls into 4 groups, g1 to g4, say. According to the null hyp they have frequencies P_m * P_h, P_m * P_l etc. Call these p1 to p4. Suppose you take N samples from the population, and you get, in order, 2 1 2 4 4 3 2 1 4 ..., getting, in all, ni from group gi. What would the probability of that specific sequence be? How many sequences give the same total for each group?
 
haruspex said:
The population falls into 4 groups, g1 to g4, say. According to the null hyp they have frequencies P_m * P_h, P_m * P_l etc. Call these p1 to p4. Suppose you take N samples from the population, and you get, in order, 2 1 2 4 4 3 2 1 4 ..., getting, in all, ni from group gi. What would the probability of that specific sequence be? How many sequences give the same total for each group?

So it's a multinomial distribution.

Any hints or tips on the second question?
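
For reference, the multinomial probability mass function arrived at above can be sketched in code. With N = n_mh + n_ml + n_fh + n_fl, it is P[(N_mh, N_ml, N_fh, N_fl) = (n_mh, n_ml, n_fh, n_fl)] = N! / (n_mh! n_ml! n_fh! n_fl!) * P_mh^n_mh * P_ml^n_ml * P_fh^n_fh * P_fl^n_fl. The function name `multinomial_pmf` and the uniform θ used in the example are illustrative assumptions, not part of the problem statement; only the counts (13, 11, 18, 8) come from the thread.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Probability of observing exactly `counts` in sum(counts) independent
    draws, where each draw lands in cell i with probability probs[i]."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must sum to 1")

    # Number of distinct orderings that give these totals: N! / (n_1! ... n_k!)
    coefficient = factorial(sum(counts))
    for c in counts:
        coefficient //= factorial(c)  # each partial quotient is an exact integer

    # Probability of any one specific ordering: product of p_i ** n_i
    sequence_prob = prod(p ** c for p, c in zip(probs, counts))
    return coefficient * sequence_prob


# Observed counts from the problem, ordered as (N_mh, N_ml, N_fh, N_fl).
observed = (13, 11, 18, 8)

# Hypothetical theta, chosen only to show the call; the true theta is unknown.
example_theta = (0.25, 0.25, 0.25, 0.25)
example_probability = multinomial_pmf(observed, example_theta)
```

The two factors mirror the two questions in the hint: the product term is the probability of one specific sequence, and the coefficient counts how many sequences give the same totals.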
 
I think the key point about the second question is that θ is a vector of four unknowns, but:
- they must add up to 1
- the null hypothesis does not prescribe them all, only a relationship between them.
So I think it's looking for a vector, involving some free parameter, which generically describes elements of w_0.
 
haruspex said:
I think the key point about the second question is that θ is a vector of four unknowns, but:
- they must add up to 1
- the null hypothesis does not prescribe them all, only a relationship between them.
So I think it's looking for a vector, involving some free parameter, which generically describes elements of w_0.

So I think w_0 is just a point and w_A is a hyperplane in 4 dimensions that is missing the point that is in w_0. My problem is how would I write that in the context of this problem?
 
brojesus111 said:
So I think w_0 is just a point
No, that's exactly what I'm saying it isn't. The null hypothesis makes no claim about the gender ratio in the population. Therefore there is a continuum of values of the 4 vector which satisfy it. The answer will be something like {(f1(t), f2(t), f3(t), f4(t))} where fi(t) are functions (you need to determine) of some unknown parameter t. Or maybe there are two free parameters.
w_A will be everywhere in the 4-space except for that continuum.
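
One way to make this concrete, as a sketch rather than the official answer: take the two free parameters to be P_m and P_h. Since P_f = 1 - P_m and P_l = 1 - P_h, every element of w_0 then has the form (P_m P_h, P_m (1 - P_h), (1 - P_m) P_h, (1 - P_m)(1 - P_h)) with P_m and P_h in [0, 1]. The sketch below assumes θ is restricted to vectors of non-negative entries summing to one, so w_A is the rest of that set rather than the rest of all 4-space. The function names `theta_under_null` and `in_w0` are hypothetical.

```python
def theta_under_null(p_m, p_h):
    """Return theta = (P_mh, P_ml, P_fh, P_fl) when gender and quality of life
    are independent, given the marginals P_m and P_h."""
    if not (0.0 <= p_m <= 1.0 and 0.0 <= p_h <= 1.0):
        raise ValueError("p_m and p_h must lie in [0, 1]")
    p_f, p_l = 1.0 - p_m, 1.0 - p_h
    return (p_m * p_h, p_m * p_l, p_f * p_h, p_f * p_l)


def in_w0(theta, tol=1e-9):
    """True if theta is a valid probability vector whose four cells equal
    the products of its own marginals (the null hypothesis)."""
    p_mh, p_ml, p_fh, p_fl = theta
    if any(p < -tol for p in theta) or abs(sum(theta) - 1.0) > tol:
        return False  # not a probability vector, so in neither w_0 nor w_A
    p_m, p_f = p_mh + p_ml, p_fh + p_fl
    p_h, p_l = p_mh + p_fh, p_ml + p_fl
    expected = (p_m * p_h, p_m * p_l, p_f * p_h, p_f * p_l)
    return all(abs(a - b) <= tol for a, b in zip(theta, expected))


# Sample proportions from the observed counts (13, 11, 18, 8) out of 50.
sample_theta = (13 / 50, 11 / 50, 18 / 50, 8 / 50)

# A point of w_0 built from the sample marginals P_m = 24/50, P_h = 31/50.
null_theta = theta_under_null(24 / 50, 31 / 50)
```

Under this parametrization w_0 is a two-dimensional surface inside the set of probability vectors, which matches the remark that there may be two free parameters.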
 
