Errors & Consistency: Finding Agreement

  • Context: Graduate
  • Thread starter: Astudious
  • Tags: Errors

Discussion Overview

The discussion revolves around the question of whether a set of observed values for a constant (c) is consistent with an independently reported value based on error analysis. Participants explore methods for assessing agreement between experimental data and theoretical estimates, including statistical approaches and confidence intervals.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant questions the validity of comparing the mean of the observed data set to the estimated value of c using standard deviation, noting that some values fall outside one standard deviation.
  • Another participant suggests using hypothesis testing and confidence intervals to determine consistency, emphasizing that a value cannot be deemed consistent without considering its uncertainty.
  • There is a discussion about the appropriate method for estimating error from a data set, with one participant proposing a method based on the range of values, while others challenge this approach as inadequate.
  • Participants mention that the experimental value of c is reported with a specific uncertainty, and some values from the data set are consistent within the 68% confidence limit, while others are not.
  • One participant emphasizes that quoting values to many decimal places does not imply accuracy and that all measurements are estimations subject to various methods of error propagation.
  • There is a suggestion that the best way to assess consistency is to determine if the observed values could originate from the same distribution as the experimental value.

Areas of Agreement / Disagreement

Participants express differing views on how to assess the consistency of the observed values with the estimated value of c. There is no consensus on the best method for error estimation or the interpretation of the results.

Contextual Notes

Participants note limitations in the information available, such as the lack of uncertainty for each individual measurement in the data set and the potential differences in measurement methods used by different experimenters.

Who May Find This Useful

This discussion may be of interest to those involved in experimental physics, statistics, or anyone looking to understand the complexities of data consistency and error analysis in scientific measurements.

Astudious
The inspiration for this thread is the following question:

"Without further calculation, say whether the observations are consistent with
the set of values [1.215; 1.216; 1.209; 1.212] independently reported for c."

I had previously found that c = 1.150259067 ≈ 1.15 (3 s.f.) and, from error analysis, that STDEV(c) = 0.0631654689 ≈ 0.06 (1 s.f.).

Now I am wondering what, fundamentally, we would do to find out whether data are in agreement with a value we have estimated by some other means. As a general way of handling such queries, I am tempted to take the mean of the data set (here 1.213) and check whether it lies within c ± 1·STDEV(c) (here the upper limit is c + STDEV(c) = 1.213424536, which is just above 1.213, hence on this basis we would say "agreement"), but something does not ring right about this approach.

The data set has two values lying outside one standard deviation of the predicted mean. Not only that, but the error of the data set is ##3.5\times10^{-3}##, and the mean of the data set is nowhere near that close to the value of c found earlier. Just from inspection, we can see how odd these measurements would be if our value of c were correct; I suspect this would barely pass a 2% right-tail hypothesis test.

So how do we judge whether the calculated value is consistent with the newly measured data?
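
(For concreteness, here is a minimal sketch of one such check: a z-test of the data-set mean against the estimated c, assuming Gaussian errors. This is an illustration added here, not code from the thread; the variable names are ours.)

```python
import math
from statistics import mean

# Numbers from the thread: the estimated c, its standard deviation, and
# the four independently reported values.
c_est, c_sd = 1.150259067, 0.0631654689
reported = [1.215, 1.216, 1.209, 1.212]

# z-score of the reported mean relative to the estimated distribution of c.
z = (mean(reported) - c_est) / c_sd

# Two-sided p-value from the standard normal CDF, via the complementary
# error function.
p = math.erfc(abs(z) / math.sqrt(2))
print(f"z = {z:.2f}, two-sided p = {p:.2f}")  # z ~ 0.99, p ~ 0.32
```

With these numbers the mean sits about one standard deviation above c (p ≈ 0.32), so by this crude test alone the mean does not look inconsistent.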
 
Hint: read up about "hypothesis testing". Your intuition there is a good one.

Note: a value is consistent or not within a confidence interval ... it cannot be consistent all by itself.
A common confidence interval for calling something "consistent" is 95%, i.e. 2 standard deviations from the mean.
If you are comparing two measurements, each with its own uncertainty, then you may want to look at the distribution of the difference ... i.e. is the difference within 2 sd of 0?
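
(A minimal sketch of that difference test, assuming independent and roughly Gaussian errors; the ±0.0005 figure is the implied quoting uncertainty discussed in the next paragraph.)

```python
import math

def consistent(x1, s1, x2, s2, k=2.0):
    """True if two measurements, each with its own 1-sd uncertainty,
    agree within k standard deviations of their difference.
    Assumes independent, roughly Gaussian errors."""
    return abs(x1 - x2) < k * math.sqrt(s1**2 + s2**2)

# Illustration with the thread's numbers: c = 1.15 +/- 0.06 against one
# reported value, taking +/-0.0005 as its implied quoting uncertainty.
print(consistent(1.150259067, 0.0631654689, 1.213, 0.0005))  # True
```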

"the error on the data set" does not make sense in this context ... you have 4 independently quoted values ... maybe from 4 different experimenters using the same method as you but probably using different methods. You are not told what the uncertainty on each measurement was - you are not told that they all come from the same distribution (method). Implicitly, the uncertainty on each value in the data set is ##\pm0.0005## since they are quoted to 3dp.

By rule of thumb:

You have an experimental value ##1.20\pm0.06## units, so any value between 1.08 and 1.32 units will be consistent within 95% confidence limits.
(It is uncommon to quote a value to more dp than the uncertainty unless the uncertainty's leading significant figure is a 1 or a 2 - much the same reasoning as why you usually report only 1 sig fig for the uncertainty.)

The closer the figures are to the value you got, the better the agreement, so 68% confidence limits are a stricter check ... that would be values between 1.14 and 1.26 ... all the values fall within those.
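
(As a sketch of this rule-of-thumb check, assuming Gaussian errors and the ##1.20\pm0.06## reading used in this post:)

```python
# Rule-of-thumb interval check: a reported value is "consistent" if it
# falls within k standard deviations of the experimental value.
c_est, c_sd = 1.20, 0.06  # the experimental value as read in this post
reported = [1.215, 1.216, 1.209, 1.212]

for k, label in [(1, "68%"), (2, "95%")]:
    lo, hi = c_est - k * c_sd, c_est + k * c_sd
    inside = sum(lo <= x <= hi for x in reported)
    print(f"{label} limits [{lo:.2f}, {hi:.2f}]: {inside}/4 values inside")
```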
 
Simon Bridge said:
"the error on the data set" does not make sense in this context ... you have 4 independently quoted values ... maybe from 4 different experimenters using the same method as you but probably using different methods. You are not told what the uncertainty on each measurement was - you are not told that they all come from the same distribution (method). Implicitly, the uncertainty on each value in the data set is ##\pm0.0005## since they are quoted to 3dp.

I thought that the error from a set of data (for variable ##x##) was to be estimated as ##\tfrac{1}{2}(x_\text{max}-x_\text{min})##, where ##x_\text{max}## is the largest value of ##x## and ##x_\text{min}## the smallest value of ##x## in the set?

Simon Bridge said:
You have an experimental value ##1.20\pm0.06## units, so any value between 1.08 and 1.32 units will be consistent within 95% confidence limits.
(It is uncommon to quote a value to more dp than the uncertainty unless the uncertainty's leading significant figure is a 1 or a 2 - much the same reasoning as why you usually report only 1 sig fig for the uncertainty.)

The closer the figures are to the value you got, the better the agreement, so 68% confidence limits are a stricter check ... that would be values between 1.14 and 1.26 ... all the values fall within those.

The experimental value here was actually ##1.15\pm0.06## units. That is why I put in the exact values (##1.150259067 \pm 0.0631654689##).

Some of the individual values are consistent within the 68% confidence limit (1 standard deviation), but some are not. That is why the question arises of how to deal with this problem in general. As I noted in the OP, I tried looking at the mean of the data set, which is just about within 1 standard deviation of the experimental value, but I am not convinced. And from the perspective of the data set, the experimentally determined value is nowhere near within 1 standard deviation of the data-set mean (if you take the uncertainty of the data set as ##\tfrac{1}{2}(c_\text{max}-c_\text{min})##).
 
Astudious said:
I put in the exact values (##1.150259067 \pm 0.0631654689##).
Do not be fooled by the large number of decimal places; they are meaningless. These values are not exact, and by quoting them you are not being exact.
You may keep 2 dp if it makes you feel more comfortable.

Astudious said:
I thought that the error from a set of data (for variable ##x##) was to be estimated as ##\tfrac{1}{2}(x_\text{max}-x_\text{min})##, where ##x_\text{max}## is the largest value of ##x## and ##x_\text{min}## the smallest value of ##x## in the set?

This is not correct. I don't know what you were told to do, but this would be a poor approach to use. (It is a common method taught to beginners for propagating errors through calculations, not for estimating errors from data.)
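
(For comparison, here is what the half-range estimate gives next to the usual sample statistics for the four reported values; an illustration added for this write-up, not part of the original exchange.)

```python
from statistics import stdev

reported = [1.215, 1.216, 1.209, 1.212]

half_range = (max(reported) - min(reported)) / 2  # the proposed estimate
sample_sd = stdev(reported)                       # sample standard deviation
sem = sample_sd / len(reported) ** 0.5            # standard error of the mean

print(f"half-range = {half_range:.4f}")  # 0.0035
print(f"sample sd  = {sample_sd:.4f}")   # ~0.0032
print(f"std error  = {sem:.4f}")         # ~0.0016
```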

Note: all measurements and their errors are estimates, and there are many ways to arrive at such an estimate.

To get your value, you probably took a data set of repeated measurements of the same thing by the same method. From that data you estimate the actual value by the mean of all the measurements in the data set, and you estimate the uncertainty (how good or bad that estimate is) by the standard deviation of the data set divided by the square root of the number of data points.
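
(A minimal sketch of that recipe; the data list here is hypothetical, standing in for repeated measurements by one method.)

```python
from statistics import mean, stdev

# Hypothetical repeated measurements of one quantity by one method
# (illustrative numbers only).
data = [1.14, 1.21, 1.09, 1.18, 1.13, 1.17, 1.12, 1.16]

best = mean(data)                     # estimate of the true value
sem = stdev(data) / len(data) ** 0.5  # uncertainty of that estimate
print(f"c = {best:.3f} +/- {sem:.3f}")
```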

If you want to know whether four more data points are consistent with this, you are basically asking whether they could all have come from the same distribution that gave rise to the data set you used. There are a number of different ways to go about this; the fastest is to compare each point. Which method is appropriate depends on what you think the values in the data set represent.
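
(The point-by-point comparison might look like this, assuming Gaussian errors and a 2-sd cut; a sketch, not the thread's own code.)

```python
c_est, c_sd = 1.150259067, 0.0631654689
reported = [1.215, 1.216, 1.209, 1.212]

# How many standard deviations does each reported point sit from the
# estimated value? Within 2 sd is a common "consistent" cut.
for x in reported:
    z = (x - c_est) / c_sd
    verdict = "consistent" if abs(z) < 2 else "inconsistent"
    print(f"{x}: z = {z:+.2f} ({verdict})")
```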

I already told you how to deal with the problem in general in my last post ... but don't take my word for it:
https://www2.southeastern.edu/Academics/Faculty/rallain/plab194/error.html
 
