How to calculate the right chi-square

BillKet
Hello! I have some simulated background-only data and some simulated background-plus-signal data. After some cuts I end up with a histogram for each of the two sets, and I want to calculate the chi-square (and hence the p-value for a signal actually being present in the background-plus-signal data). However, it seems like the p-value changes quite a lot when changing the number of bins I use to bin my data. How do I take the number of bins into account when calculating the chi-square? Thank you!
 
BillKet said:
How do I take the number of bins into account when calculating the chi-square?
Are you actually asking that question? The chi-square statistic is a function of the number of "cells" in the contingency table.

BillKet said:
However, it seems like the p-value changes quite a lot when changing the number of bins I use to bin my data.

A usable null hypothesis requires that you be able to compute the expected number of observations that fall in each cell. How are you computing this expected number? How does it change when you change the number of bins? (Perhaps this calculation is done by using the simulated data. If so, exactly how is the simulated data used?)
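To make the question about expected counts concrete, here is a minimal sketch of one possible way to get them from a simulation. It is not the original poster's actual procedure. The distributions, the range 0 to 4, the sample sizes, and the helper names `histogram` and `pearson_chi2` are all hypothetical. The sketch assumes the background-only histogram is rescaled to the observed total to serve as the expected counts, and it ignores the statistical uncertainty of the simulated template itself.

```python
import random

def histogram(values, n_bins, lo, hi):
    """Count values into n_bins equal-width bins on [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:
            # min() guards against a rounding edge case just below hi
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return counts

def pearson_chi2(observed, expected):
    """Return (chi2, dof) for Pearson's statistic sum((O_i - E_i)^2 / E_i).

    Assumes every expected count is positive and that the expected
    counts were normalised to the observed total, hence dof = cells - 1.
    """
    if any(e <= 0 for e in expected):
        raise ValueError("every cell needs a positive expected count")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1

random.seed(0)
# Hypothetical toy data: exponential background, narrow Gaussian signal.
background = [random.expovariate(1.0) for _ in range(20000)]
sample = ([random.expovariate(1.0) for _ in range(2000)]
          + [random.gauss(2.0, 0.1) for _ in range(100)])

results = {}
for n_bins in (5, 10, 20):
    obs = histogram(sample, n_bins, 0.0, 4.0)
    bkg = histogram(background, n_bins, 0.0, 4.0)
    scale = sum(obs) / sum(bkg)          # normalise template to observed total
    expected = [b * scale for b in bkg]  # expected count per cell under the null
    results[n_bins] = pearson_chi2(obs, expected)
    print(n_bins, "bins: chi2 =", round(results[n_bins][0], 1),
          "dof =", results[n_bins][1])
```

The degrees of freedom change with the number of bins, so the statistic has to be compared against a different chi-square distribution for each binning; if SciPy is available, `scipy.stats.chi2.sf(chi2, dof)` gives the corresponding p-value.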
 
Stephen Tashi said:
Are you actually asking that question? The chi-square statistic is a function of the number of "cells" in the contingency table.
A usable null hypothesis requires that you be able to compute the expected number of observations that fall in each cell. How are you computing this expected number? How does it change when you change the number of bins? (Perhaps this calculation is done by using the simulated data. If so, exactly how is the simulated data used?)
What do you mean by expected number? Isn't that the number of events?
 
BillKet said:
What do you mean by expected number? Isn't that the number of events?

I mean the "expected number" in the sense of the expected value of a random variable.

Have you read an article about Pearson's chi-square test, or whatever variant of the chi-square test you want to use? For example, in the Wikipedia article https://en.wikipedia.org/wiki/Pearson's_chi-squared_test, the expected number of counts in a cell is denoted as "##E_i##".
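For reference, the statistic in that notation can be written as follows, with ##O_i## the observed count in cell ##i##, ##E_i## the expected count under the null hypothesis, and ##k## the number of cells (the symbol ##k## is chosen here for illustration):

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

If the ##E_i## are fixed in advance apart from matching the total count, the statistic is approximately chi-square distributed with ##k - 1## degrees of freedom under the null hypothesis, and each parameter estimated from the data removes one more degree of freedom. This is how the number of bins enters the p-value.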
 
BillKet said:
After some cuts I end up with a histogram for each of the two sets, and I want to calculate the chi-square (and hence the p-value for a signal actually being present in the background-plus-signal data).

What do you mean by the number of cuts? And what is the nature of the data?
 