Hi all,
I am using a chi-squared test for independence of two sets of categorical data.
So let's say I have a vector x1 of 1s and 0s and a vector x2 of 1s and 0s, and I am testing whether x1 and x2 are independent.
And let's say that, for my given data with n = 200, I have
        x1=1   x1=0
x2=1     40     80
x2=0     40     40
For this particular table, I get a p-value of 0.0184.
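For reference, the quoted p-value can be reproduced with a short sketch. It assumes the Pearson chi-squared statistic with 1 degree of freedom and no continuity correction, which is my guess at what produced 0.0184; the function name `chi2_independence_2x2` is just an illustrative choice.

```python
import math

def chi2_independence_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 table.

    Assumes 1 degree of freedom and no continuity correction.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(chi2 > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Rows are x2=1, x2=0; columns are x1=1, x1=0 (as in the table above).
stat, p_value = chi2_independence_2x2([[40, 80], [40, 40]])
print(f"chi2 = {stat:.4f}, p = {p_value:.4f}")
```

The statistic comes out at about 5.56, and the p-value agrees with 0.0184 to the quoted precision.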
How can I 'verify' this using a Monte Carlo method?
I have tried two ways so far.
First, from the table above, I calculated P(x2=1|x1=1) = 40/80 = 0.5.
I then randomly generated 10000 tables like the one above, with p(x1=1) = 0.4 and p(x2=1) = 0.6 (the marginal proportions 80/200 and 120/200 from the table).
I then counted the tables that had P(x2=1|x1=1) > 0.5.
This didn't work, and I realized that I am not checking for the correct thing. But I am using the chi-squared test in the first place to see whether the conditional probability is 'significant', so shouldn't this tell me something?
In my second attempt, I generated 10000 tables just as before.
The average of these tables is
    48  72
    32  48
so I counted all the tables with
    <40  >80
    >40  <40
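One possible way to set up this kind of check is sketched below. Instead of comparing cells one by one, it compares each simulated table's chi-squared statistic with the observed one, which is a swapped-in criterion and not what either attempt above does. It assumes the marginal probabilities are estimated from the observed row and column totals and that x1 and x2 are drawn independently; all function names are hypothetical.

```python
import random

def chi2_stat(table):
    """Pearson chi-squared statistic for a 2x2 table (no continuity correction)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                return 0.0  # degenerate table: an empty row or column
            stat += (table[i][j] - expected) ** 2 / expected
    return stat

def simulate_table(n, p_x1, p_x2, rng):
    """Draw n independent (x1, x2) pairs; rows are x2=1, x2=0, columns x1=1, x1=0."""
    counts = [[0, 0], [0, 0]]
    for _ in range(n):
        x1 = rng.random() < p_x1
        x2 = rng.random() < p_x2
        counts[0 if x2 else 1][0 if x1 else 1] += 1
    return counts

def monte_carlo_p_value(observed, n_sims=10000, seed=0):
    """Fraction of simulated independent tables at least as extreme as observed."""
    rng = random.Random(seed)
    n = sum(sum(row) for row in observed)
    p_x1 = (observed[0][0] + observed[1][0]) / n  # column total for x1=1
    p_x2 = (observed[0][0] + observed[0][1]) / n  # row total for x2=1
    observed_stat = chi2_stat(observed)
    hits = 0
    for _ in range(n_sims):
        simulated = simulate_table(n, p_x1, p_x2, rng)
        if chi2_stat(simulated) >= observed_stat - 1e-9:
            hits += 1
    return hits / n_sims

mc_p = monte_carlo_p_value([[40, 80], [40, 40]])
print(f"Monte Carlo p-value estimate: {mc_p:.4f}")
```

Using the statistic as the measure of "extremeness" counts departures from independence in both directions, which is what the chi-squared p-value does; a cell-wise region like the one above only counts one direction, so it would be expected to give roughly half as many hits.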
Now, one more related question: if I find that I can reject the null hypothesis that x1 and x2 are independent, what do I use to measure the accuracy of the calculated conditional probability?
For example, suppose I have x1 = [zeros(1,998), 1, 1] and x2 = [zeros(1,998), 1, 1].
Then I find that I can reject the null hypothesis, but with what certainty can I say p(x1=1|x2=1) = 1?
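To make the concern concrete: in this example the conditional probability is estimated from only the two observations with x2 = 1. One common way to put an interval around such a proportion is the Wilson score interval; the sketch below assumes a 95% level (z = 1.96), and it is only one of several possible choices, not necessarily the right one for this problem.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1.0 + z * z / trials
    center = (p_hat + z * z / (2.0 * trials)) / denom
    half_width = z * math.sqrt(
        p_hat * (1.0 - p_hat) / trials + z * z / (4.0 * trials * trials)
    ) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)

# Of the 2 observations with x2 = 1, both have x1 = 1.
low, high = wilson_interval(successes=2, trials=2)
print(f"95% Wilson interval: [{low:.3f}, {high:.3f}]")
```

With only two relevant observations the lower bound is roughly 0.34, so the data are compatible with a conditional probability far below 1 even though independence is rejected.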
Will