Goodness of Fit Test Without Uncertainties


Discussion Overview

The discussion revolves around performing a goodness of fit test on a dataset representing an x-ray spectrum, specifically focusing on how to approach the test without uncertainties in the data. Participants explore statistical methods applicable to the dataset, which consists of energy bins and counts of photon detections.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant seeks guidance on conducting a goodness of fit test without uncertainties in the dataset, which consists of counts of photon detections in energy bins.
  • Another participant questions the definition of "uncertainties" and the methods used for fitting Gaussian distributions to the data.
  • A participant clarifies that the data is a histogram with energy bins and counts, and expresses confusion about performing a goodness of fit test without explicit uncertainties.
  • There is a suggestion that the uncertainty could be approximated using 1/sqrt(N), where N is the count for each bin, although this is not universally accepted.
  • One participant emphasizes the need to define "uncertainty" in the context of statistical tests, raising questions about its meaning related to standard deviation or precision of measurement.
  • A participant points out that the Pearson Chi-squared test can be applied to the fitted distribution, despite the lack of error bars.
  • Another participant expresses confusion about the claim that the chi-squared value tends to infinity when no uncertainties are given.
  • There is a discussion about the underlying physical model for the data generation, questioning whether photon energy is treated as a random variable or if a deterministic process is assumed.
  • One participant expresses a tentative inclination to consider photon energy as a random variable and references an experiment outline for further context.

Areas of Agreement / Disagreement

Participants do not reach a consensus on the definition of "uncertainty" or how to appropriately handle it in the context of the goodness of fit test. Multiple competing views on the application of statistical methods remain, particularly regarding the use of Pearson's Chi-squared test.

Contextual Notes

The discussion highlights limitations in defining uncertainties and their implications for statistical tests. There is ambiguity regarding the assumptions made about the data generation process and the appropriateness of various statistical approaches.

nicholls
In what way can one do a goodness of fit test on a set of data that contains no uncertainties?

I am doing a simulated data analysis assignment for a physics lab course, and they provided me with data for a supposed x-ray spectrum for some material. The data consists of energy bins with a count of photon detections for each bin (or something like that). There is no uncertainty on these values.

I can fit a couple of Gaussian distributions to it, but without uncertainties I am confused as to how to approach a goodness of fit test. I do not have much statistics knowledge, so any help would be great!
 
It isn't clear what you mean by "uncertainties". It also isn't clear why you can fit a couple of gaussian distributions to the data. Did you use two different methods of fitting?

The use of "bins" in a statistics problem hints at using a test like Pearson's Chi-Squared test. However, since this is a physics class, it would be wise to consult your course materials to see what they have in mind.
 
Basically, the data I am given is a histogram where the x values are energy bins of width 0.05 keV, and the y values are the counts of photon detections per energy bin. There is no uncertainty given for the y values.

The data consists of two peaks, both Gaussian in shape. Using Python, I have managed to fit two Gaussian functions to the two peaks, along with a linear function describing the background noise.
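A minimal sketch of the kind of fit described above, assuming NumPy and SciPy are available. The assignment data is not reproduced in this thread, so the spectrum here is synthetic, and the function name `model`, the peak positions, and the starting guesses `p0` are all hypothetical stand-ins rather than the poster's actual code.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(E, a1, mu1, s1, a2, mu2, s2, m, b):
    """Two Gaussian peaks on top of a linear background m*E + b."""
    g1 = a1 * np.exp(-0.5 * ((E - mu1) / s1) ** 2)
    g2 = a2 * np.exp(-0.5 * ((E - mu2) / s2) ** 2)
    return g1 + g2 + m * E + b

# Synthetic stand-in for the assignment data (hypothetical values):
# bin centers for 0.05 keV wide bins, with Poisson-distributed counts.
rng = np.random.default_rng(0)
energy = np.arange(5.0, 9.0, 0.05) + 0.025
true_params = (200.0, 6.4, 0.15, 80.0, 7.1, 0.15, -2.0, 40.0)
counts = rng.poisson(model(energy, *true_params)).astype(float)

# Rough starting guesses, as one might read off a plot of the histogram.
p0 = (150.0, 6.3, 0.2, 60.0, 7.2, 0.2, 0.0, 30.0)
popt, pcov = curve_fit(model, energy, counts, p0=p0)

# Fitted curve evaluated at each bin center; these are the "expected"
# counts that a goodness of fit test compares against the data.
expected = model(energy, *popt)
print("fitted peak centers (keV):", popt[1], popt[4])
```

The fit has eight free parameters, which matters later when counting degrees of freedom for a chi-squared test.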

I would now like to perform a goodness of fit test on this fit. However, without explicit y uncertainties I am not sure how to proceed. Someone told me, although they did not give me a good explanation, that for the y uncertainty I can use 1/sqrt(N) where N is the count or y value for each bin.

Does this make more sense?
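On the 1/sqrt(N) suggestion: if each bin count is assumed to follow a Poisson distribution (an assumption, not something stated in the assignment), the standard deviation of a count N is approximately sqrt(N), and 1/sqrt(N) is the relative uncertainty sqrt(N)/N. A small sketch of a chi-squared sum built on that assumption, with made-up numbers and hypothetical helper names:

```python
import math

def poisson_sigma(counts):
    """Approximate standard deviation of each bin count, assuming Poisson statistics."""
    return [math.sqrt(n) for n in counts]

def chi_squared(observed, predicted, sigma):
    """Sum of squared residuals in units of sigma; bins with sigma == 0 are skipped."""
    return sum(((o - p) / s) ** 2
               for o, p, s in zip(observed, predicted, sigma) if s > 0)

# Made-up example: observed counts and the values a fitted curve predicts.
observed = [100, 400, 25]
predicted = [110.0, 380.0, 30.0]

sigma = poisson_sigma(observed)            # [10.0, 20.0, 5.0]
relative = [s / n for s, n in zip(sigma, observed)]  # equals 1/sqrt(N) per bin
chi2 = chi_squared(observed, predicted, sigma)
print(sigma, relative, chi2)
```

Empty bins have sigma equal to zero under this approximation, which is why the sketch skips them; merging sparse bins is another common way to handle that.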
 
You still haven't managed to define what "uncertainty" is. It may be obvious to physicists what it means in this context, but this is the math section and it isn't obvious to me what it means in the context of statistical tests. Are you talking about "standard deviation"? "probability"? "precision of measurement"?

If you have fit a distribution (which in your case is a mixture of distributions) to the data then the distribution you fit can be used to do the Pearson Chi-squared test or any other test which relies on computing the probability that [itex]y_i[/itex] values fell in bin [itex]i[/itex].
 
What I mean by uncertainty is that there are no error bars on the y values. I'm not exactly sure, but I suppose you would just call them standard deviations. Thus, with zero uncertainty, the chi-squared value of the fit tends to infinity.
 
Look at the Wikipedia article on Pearson's chi-squared test: http://en.wikipedia.org/wiki/Pearson's_chi-squared_test

The test statistic is [itex]\chi^2 = \sum_{i=1}^n \frac { (O_i - E_i)^2} { E_i}[/itex]

There is nothing in the computation of that test statistic that involves error bars, so I don't understand why it would tend to infinity.
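The statistic above can be computed directly from the observed counts O_i and the counts E_i predicted by the fitted curve. A minimal sketch with made-up numbers; the function name and the choice of two fitted parameters are hypothetical, and taking the degrees of freedom as bins minus fitted parameters is an assumption that suits a fit with a free overall amplitude.

```python
def pearson_chi_squared(observed, expected):
    """Pearson statistic: sum over bins of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Made-up example: counts per bin and the fitted curve's prediction per bin.
observed = [18, 55, 102, 60, 15]
expected = [20.0, 50.0, 100.0, 64.0, 16.0]

chi2 = pearson_chi_squared(observed, expected)

# Hypothetical: suppose the fitted curve had 2 free parameters.
n_fit_params = 2
dof = len(observed) - n_fit_params
reduced_chi2 = chi2 / dof
print(chi2, dof, reduced_chi2)
```

A p-value can then be read from the chi-squared distribution with `dof` degrees of freedom, for example with `scipy.stats.chi2.sf(chi2, dof)`. Bins with very small expected counts make the approximation unreliable, so they are often merged with their neighbors first.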
 
One thing we need to clarify is the physical model for how the data is generated.

Are we assuming that photon energy is a random variable and that each photon energy measurement is an independent realization of that random variable?

Or is the model some deterministic process?
 
Pearson's chi-squared seems to do the trick. Thanks for showing me that.

I'm not quite sure whether to say that the photon energy is a random variable, although I am inclined to say yes.

You could have a quick read of the outline of the experiment if you want:

http://www.physics.utoronto.ca/~phy326/simdata/SDA%20Assignment.pdf

I am doing experiment A, which is on the second page.
 