What are the recommended tests for comparing two sets of data?

  • Thread starter: VVS2000
  • Tags: Testing
VVS2000
I have two columns of data, one containing experimental values and the other containing expected values. I read that the chi-squared test and ANOVA can be used to compare two sets of data. My main aim is to know quantitatively how different these two sets of data are. Are these two tests enough, or are there other tests that you would suggest?
 
The standard measure of distance would be the sum of squared residuals. However, I am not sure that what you are asking has real meaning as stated.

As I understand your statement, you don’t have two sets of data; you have one set of data and a model. Any set of data can plausibly come from any model given sufficiently uncertain measurements. So the sum of squared residuals on its own doesn’t say much.

If your model has any free parameters, then you can meaningfully ask which parameters minimize the residuals. Or if you have two competing models, you can ask which model minimizes the residuals. Or if you can predict the expected uncertainty in your measurements, then you can determine how likely the data is under the model.
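To make the distinction concrete, here is a minimal Python sketch of the two quantities mentioned above: the raw sum of squared residuals, and the same sum with each residual scaled by the measurement uncertainty. The focal-length values and the uncertainty `sigma` are made-up placeholders, not data from this thread, and the sketch assumes every measurement shares one uncertainty and that the model has no fitted parameters.

```python
# Hypothetical numbers for illustration only (not the poster's data).
observed = [10.12, 10.05, 9.98, 9.91, 9.80]  # measured focal length, cm
expected = [10.10, 10.04, 9.96, 9.86, 9.74]  # model prediction, cm
sigma = 0.05  # assumed common measurement uncertainty, cm


def sum_squared_residuals(obs, exp):
    """Sum of squared differences between the data and the model."""
    return sum((o - e) ** 2 for o, e in zip(obs, exp))


def chi_squared(obs, exp, sigma):
    """Sum of squared residuals, each scaled by the uncertainty sigma."""
    return sum(((o - e) / sigma) ** 2 for o, e in zip(obs, exp))


ssr = sum_squared_residuals(observed, expected)
chi2 = chi_squared(observed, expected, sigma)
dof = len(observed)  # no parameters were fitted in this sketch

print(f"SSR   = {ssr:.4f} cm^2")
print(f"chi^2 = {chi2:.2f} with {dof} degrees of freedom")
print(f"chi^2 / dof = {chi2 / dof:.2f}")
```

The raw sum depends on the units and on the number of points, so it is hard to interpret alone. Once the residuals are scaled by the uncertainty, a value of chi^2 / dof near 1 is roughly what one expects when the model and the stated uncertainty are both reasonable.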
 
Asking "how different" your data is from the expected values of a model is a vague question.
Are you asking how probable it is that data from the model would fit it this poorly (or worse)? The chi-squared goodness-of-fit test is well suited for that.
Are you asking for some numerical measure of how large the differences are? The sum of squared errors is well suited for that.
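For the first question, the probability can be sketched in Python as the upper tail of a chi-squared distribution. The block below uses only the standard library and evaluates the tail through the series for the regularized lower incomplete gamma function. The statistic `chi2_value` and the degrees of freedom `dof` are hypothetical inputs, and the series is only a rough tool: it loses accuracy for very small p-values, where a statistics library would be the better choice.

```python
import math


def chi2_sf(x, dof):
    """P(X >= x) for X ~ chi-squared with `dof` degrees of freedom.

    Uses the series gamma(a, z) = z^a e^-z * sum_n z^n / (a (a+1) ... (a+n))
    with a = dof / 2 and z = x / 2, then divides by Gamma(a).
    """
    if x <= 0:
        return 1.0
    a, z = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


# Hypothetical inputs: a chi-squared statistic from 5 points, no fitted parameters.
chi2_value = 2.8
dof = 5
p_value = chi2_sf(chi2_value, dof)
print(f"p-value = {p_value:.3f}")
```

A small p-value suggests the data would rarely deviate this much if the model and the stated uncertainties were right. A large one only says the data are compatible with the model, not that the model is correct.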
 
OK, sorry for not being clear. I have two sets of data. One column contains the observed (experimental) focal length of a lens at different heights from the axis. The other column contains the expected (theoretical) focal length. So I want to know quantitatively how different these two sets of data are, given the uncertainty and errors in the observed data.
 
VVS2000 said:
The Other column contains the expected or theoretical value of focal length
Does this column have uncertainty associated with it?
 
Dale said:
Does this column have uncertainty associated with it?
Yes, but all values have the same uncertainty.
 
Two possibilities are the chi-squared goodness-of-fit test and linear regression. The first is very general, but it requires that the data be combined into categories. The second requires that the variable of interest be modeled as a linear function of the other variables.
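One way to apply the regression idea to an observed-versus-expected comparison is to regress the observed values on the expected ones: if the model describes the data, the fitted slope should be near 1 and the intercept near 0. That setup is an assumption for illustration, not something stated in this thread, and the numbers below are made-up placeholders. A minimal sketch using ordinary least squares:

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical focal lengths in cm (not the poster's data).
expected = [10.10, 10.04, 9.96, 9.86, 9.74]
observed = [10.12, 10.05, 9.98, 9.91, 9.80]

slope, intercept = fit_line(expected, observed)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f} cm")
```

A slope far from 1 would point to a discrepancy that grows with focal length, while a nonzero intercept with a slope near 1 would point to a constant offset. Judging whether either departure is significant still requires the uncertainties on the fitted values.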
 