- #1

- 23

- 0

## Main Question or Discussion Point

Hi all,

I am using a Xi squared test to for independence of two sets of categorical data.

So lets say I have a vector x1 of 1s and 0s and x2 of 1s and 0s, and I am testing to see if x1 and x2 are independent.

and let's say, for my given data, with n = 200, I have

x1=1 x1=0

x2=1 40 80

x2=0 40 40

For this particular distribution, I get a p value of 0.0184.

How can I 'verify' this using monte carlo method?

I tired two ways so far.

First I calculated, from above, P(x2|x1) = 0.5

I then randomly generated 10000 above tables with p(x1=1) = 0.6 and p(x2=1) = 0.4.

I then looked for the number of groups which had P(x2|x1) > 0.5.

This didn't work...and I realized to I am not checking for the correct thing. But I am using Xi squared in the first place to see if the conditional probability is 'significant', so this should tell me something?

I tried another way in which I generated 10000 above tables, just as before.

The average of these tables is

48 72

32 48

so I looked for all the tables with

<40 >80

>40 <40

Now, one more related question is: if I find that I can reject the null hypothesis that x1 and x2 are independent. What do I use to measure accuracy of the calculated condition probability.

For example if I have x1 = [ zeros(1,998), 1,1] and x2 = [ zeros(1,998), 1,1] .

Then I find that I can reject the null hypothesis, but with what certainty can I say p(x1|x2) = 1?

Will

I am using a Xi squared test to for independence of two sets of categorical data.

So lets say I have a vector x1 of 1s and 0s and x2 of 1s and 0s, and I am testing to see if x1 and x2 are independent.

and let's say, for my given data, with n = 200, I have

x1=1 x1=0

x2=1 40 80

x2=0 40 40

For this particular distribution, I get a p value of 0.0184.

How can I 'verify' this using monte carlo method?

I tired two ways so far.

First I calculated, from above, P(x2|x1) = 0.5

I then randomly generated 10000 above tables with p(x1=1) = 0.6 and p(x2=1) = 0.4.

I then looked for the number of groups which had P(x2|x1) > 0.5.

This didn't work...and I realized to I am not checking for the correct thing. But I am using Xi squared in the first place to see if the conditional probability is 'significant', so this should tell me something?

I tried another way in which I generated 10000 above tables, just as before.

The average of these tables is

48 72

32 48

so I looked for all the tables with

<40 >80

>40 <40

Now, one more related question is: if I find that I can reject the null hypothesis that x1 and x2 are independent. What do I use to measure accuracy of the calculated condition probability.

For example if I have x1 = [ zeros(1,998), 1,1] and x2 = [ zeros(1,998), 1,1] .

Then I find that I can reject the null hypothesis, but with what certainty can I say p(x1|x2) = 1?

Will