How to calculate the right chisquare

  • Thread starter BillKet
  • #1
Hello! I have some simulated background only data and some background plus signal simulated data. After some cuts I end up with a histogram for each of the 2 sets and I want to calculate the chisquare (and hence the p-value for a signal actually being present in the background plus signal simulated data). However, it seems like the p-value changes quite a lot when changing the number of bins I use to bin my data. How do I take the number of bins into account when calculating the chisquare? Thank you!
 

Answers and Replies

  • #2
Stephen Tashi
Science Advisor
How do I take the number of bins into account when calculating the chisquare?
Are you actually asking that question? The chi-square statistic is a function of the number of "cells" in the contingency table.

However, it seems like the p-value changes quite a lot when changing the number of bins I use to bin my data.
A usable null hypothesis requires that you be able to compute the expected number of observations that fall in each cell. How are you computing this expected number? How does it change when you change the number of bins? (Perhaps this calculation is done by using the simulated data. If so, exactly how is the simulated data used?)
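
To make the question concrete, here is a minimal sketch of one way the expected counts could come from the background-only simulation. It is not necessarily what the original poster does. The function names `expected_counts` and `pearson_chi_square`, the `scale` factor (a normalization between the simulated and tested samples), and the example numbers are all hypothetical. The sketch assumes the expected counts are fixed independently of the tested histogram, so the degrees of freedom equal the number of usable bins, and it neglects the statistical uncertainty of the simulation itself.

```python
import numpy as np
from scipy.stats import chi2


def expected_counts(background_events, bin_edges, scale=1.0):
    """Expected count per bin: histogram of simulated background times a scale factor."""
    counts, _ = np.histogram(background_events, bins=bin_edges)
    return scale * counts


def pearson_chi_square(observed, expected):
    """Return (statistic, degrees of freedom, p-value) for fixed expected counts."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    usable = expected > 0  # a bin with zero expectation cannot enter the sum
    stat = float(np.sum((observed[usable] - expected[usable]) ** 2 / expected[usable]))
    # Assumption: nothing was fitted or normalized to the observed data,
    # so each usable bin contributes one degree of freedom.
    dof = int(usable.sum())
    return stat, dof, float(chi2.sf(stat, dof))


# Made-up counts in four bins, purely for illustration.
stat, dof, p_value = pearson_chi_square([12, 9, 11, 8], [10, 10, 10, 10])
print(stat, dof, p_value)
```

Changing `bin_edges` changes both the expected counts and `dof`, so the p-value has to be computed with the degrees of freedom that match the binning actually used.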
 
  • #3
BillKet
Are you actually asking that question? The chi-square statistic is a function of the number of "cells" in the contingency table.

A usable null hypothesis requires that you be able to compute the expected number of observations that fall in each cell. How are you computing this expected number? How does it change when you change the number of bins? (Perhaps this calculation is done by using the simulated data. If so, exactly how is the simulated data used?)
What do you mean by expected number? Isn't that the number of events?
 
  • #4
Stephen Tashi
Science Advisor
What do you mean by expected number? Isn't that the number of events?
I mean the "expected number" in the sense of the expected value of a random variable.

Have you read an article about Pearson's chi-square test or whatever variant of the chi-square test you want to use? For example, in the Wikipedia article https://en.wikipedia.org/wiki/Pearson's_chi-squared_test, the expected number of counts in a cell is denoted as "##E_i##".
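
For reference, Pearson's statistic can be written as follows, where ##O_i## is the observed count in cell ##i##, ##E_i## is the expected count under the null hypothesis, and ##n## is the number of cells (the symbols ##O_i## and ##n## are notation chosen here):

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

The number of cells enters through the degrees of freedom of the chi-square distribution used to convert this statistic into a p-value, so the same value of ##\chi^2## corresponds to different p-values for different binnings.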
 
  • #5
gleem
Science Advisor
Education Advisor
After some cuts I end up with a histogram for each of the 2 sets and I want to calculate the chisquare (and hence the p-value for a signal actually being present in the background plus signal simulated data).
What do you mean by the number of cuts? And what is the nature of the data?
 
