Errors & Consistency: Finding Agreement

In summary, the conversation revolves around deciding whether a set of data is consistent with a previously estimated value, specifically the values [1.215, 1.216, 1.209, 1.212] reported for c. The original poster suggests taking the mean of the data set and comparing it to c +/- 1*STDEV(c) to judge consistency. However, another poster argues that this approach may not be sound and suggests looking into hypothesis testing. They also discuss how to estimate the error from a set of data and conclude that using ##\frac{1}{2}(x_{max}-x_{min})## is not an appropriate method.
  • #1
Astudious
The inspiration for this thread is the following question:

"Without further calculation, say whether the observations are consistent with
the set of values [1.215; 1.216; 1.209; 1.212] independently reported for c."

I had previously found that c = 1.150259067 ≈ 1.15 (3 s.f.) and, from error analysis, that STDEV(c) = 0.0631654689 ≈ 0.06 (1 s.f.).

Now I am wondering what, fundamentally, we should do to find out whether data are in agreement with a value we have estimated by some other means. As a general way of handling such queries, I am tempted to take the mean of the data set (here 1.213) and compare it to c +/- 1*STDEV(c) (here c + STDEV(c) = 1.213424536, which is larger than 1.213, hence we would say "agreement" on this basis), but something does not sit right about this approach.

The data set has two values outside one standard deviation of the predicted mean. Not only that, but the error of the data set, taken as half its range, is ##3.5\times10^{-3}##, and the mean value found is nowhere near this close to the c value found earlier. Just from inspection, we can see how odd these measurements would be if our value of c were correct. I suspect this would barely pass a 2% right-tail hypothesis test.
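
To make the comparison concrete, here is a minimal Python sketch of the check described above, using the numbers from this thread:

```python
data = [1.215, 1.216, 1.209, 1.212]

c = 1.150259067         # value estimated earlier by other means
sigma_c = 0.0631654689  # its standard deviation from error analysis

mean = sum(data) / len(data)  # 1.213
upper = c + sigma_c           # 1.213424536...

# The data-set mean sits just inside c + 1*STDEV(c), hence "agreement"
# on this (questionable) basis.
print(mean, upper, mean <= upper)
```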

So how do we judge whether the calculated value is consistent with the newly measured data?
 
  • #2
Hint: read up on "hypothesis testing". Your intuition there is a good one.

Note: a value is consistent or not within a confidence interval ... it cannot be consistent all by itself.
A common confidence interval to call something "consistent" is 95% or 2 standard deviations from the mean.
If you are comparing two measurements, each with their own uncertainty, then you may want to look at the distribution of the difference... i.e. is the difference within 2sd of 0?
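
A minimal sketch of that difference test (the numbers here are placeholders, not values from the thread):

```python
import math

# Two measurements with their own uncertainties (placeholder values).
a, sigma_a = 1.15, 0.06
b, sigma_b = 1.213, 0.0035

# For independent measurements the uncertainties add in quadrature.
diff = a - b
sigma_diff = math.sqrt(sigma_a**2 + sigma_b**2)

# "Consistent" here means the difference is within 2 sd of zero.
print(abs(diff) < 2 * sigma_diff)
```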

"the error on the data set" does not make sense in this context ... you have 4 independently quoted values ... maybe from 4 different experimenters using the same method as you but probably using different methods. You are not told what the uncertainty on each measurement was - you are not told that they all come from the same distribution (method). Implicitly, the uncertainty on each value in the data set is ##\pm0.0005## since they are quoted to 3dp.

By rule of thumb:

You have an experimental value ##1.20\pm0.06## units, so any value between 1.08 and 1.32 units will be consistent within 95% confidence limits.
(You round the value to the same decimal place as the uncertainty, and you usually quote the uncertainty to only 1 significant figure - 2 at most, and then only when its leading digit is a 1 or a 2.)

The closer the figures to the value you got, the better the agreement, so 68% confidence limits are better ... that would be values between 1.14 and 1.26 ... all the values fall within those.
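
A quick sketch of this rule of thumb, using the numbers as quoted in this post:

```python
c, sigma = 1.20, 0.06
data = [1.215, 1.216, 1.209, 1.212]

for k, label in [(1, "68%"), (2, "95%")]:
    lo, hi = c - k * sigma, c + k * sigma
    inside = all(lo <= x <= hi for x in data)
    print(label, (round(lo, 2), round(hi, 2)), inside)

# 68% (1.14, 1.26) True
# 95% (1.08, 1.32) True
```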
 
  • #3
Simon Bridge said:
"the error on the data set" does not make sense in this context ... you have 4 independently quoted values ... maybe from 4 different experimenters using the same method as you but probably using different methods. You are not told what the uncertainty on each measurement was - you are not told that they all come from the same distribution (method). Implicitly, the uncertainty on each value in the data set is ##\pm0.0005## since they are quoted to 3dp.

I thought that the error from a set of data (for variable x) was to be estimated as ##\frac{1}{2}(x_{max}-x_{min})##, where ##x_{max}## is the largest and ##x_{min}## the smallest value of x in the set?

Simon Bridge said:
You have an experimental value ##1.20\pm0.06## units, so any value between 1.08 and 1.32 units will be consistent within 95% confidence limits.
(You round the value to the same decimal place as the uncertainty, and you usually quote the uncertainty to only 1 significant figure - 2 at most, and then only when its leading digit is a 1 or a 2.)

The closer the figures to the value you got, the better the agreement, so 68% confidence limits are better ... that would be values between 1.14 and 1.26 ... all the values fall within those.

The experimental value here was actually ##1.15\pm0.06## units, which is why I put in the exact values (##1.150259067 \pm 0.0631654689##, to be exact).

Some of the individual values are consistent within the 68% confidence limit (one standard deviation) but some are not. That is why the question arises of how to deal with this problem in general. As I noted in the OP, I tried looking at the mean of the data set, which is just about within one standard deviation of the experimental value, but I'm not convinced. From the perspective of the data set, the experimentally determined value is nowhere near within one standard deviation of its mean (if you take the uncertainty of the data set as ##\frac{1}{2}(c_{max}-c_{min})##).
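
To put numbers on that, a quick sketch taking the uncertainty of the data set as half its range:

```python
data = [1.215, 1.216, 1.209, 1.212]
c = 1.150259067

half_range = (max(data) - min(data)) / 2  # 0.0035
mean = sum(data) / len(data)              # 1.213

# The estimated value sits roughly 18 half-range units below the
# data-set mean - nowhere near within one such unit.
print((mean - c) / half_range)
```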
 
  • #4
Astudious said:
I put in the exact values (##1.150259067 \pm 0.0631654689##, to be exact).
Do not be fooled by the large number of decimal places; they are meaningless. These values are not exact, and by quoting them you are not being exact.
You may keep 2dp if it makes you feel more comfortable.

Astudious said:
I thought that the error from a set of data (for variable x) was to be estimated as ##\frac{1}{2}(x_{max}-x_{min})##, where ##x_{max}## is the largest and ##x_{min}## the smallest value of x in the set?
This is not correct. Well, I don't know what you were told to do, but this would be a poor approach to use. (The half-range is a method commonly taught to beginners for propagating errors through calculations, not for estimating errors from data.)

Note: all measurements and their errors are estimates, and there are many ways to arrive at them.

To get your value you probably took a data set of repeated measurements of the same thing by the same method. From that data you estimate the actual value by the mean of all the measurements in the data set, and you estimate the uncertainty - how bad that estimate is - by the standard deviation of the data set divided by the square root of the number of data points.
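
A minimal sketch of that estimate; the measurement values here are hypothetical stand-ins for your original data:

```python
import statistics

measurements = [1.09, 1.21, 1.13, 1.18, 1.14]  # hypothetical repeats

value = statistics.mean(measurements)
# Standard error of the mean: sample standard deviation / sqrt(n).
uncertainty = statistics.stdev(measurements) / len(measurements) ** 0.5

print(f"{value:.2f} +/- {uncertainty:.2f}")
```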

If you want to know whether four more data points are consistent with this, you are basically asking whether they could all have come from the same distribution that gave rise to the data set you used. There are a number of different ways to go about this; the fastest is to compare each point, and which approach is appropriate depends on what you think the values in the data set represent.
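
One way to formalise the question is a one-sample t-test of the four reported values against your estimated mean - just one of the possible approaches, and a sketch only, since it ignores the uncertainty on your own value:

```python
from scipy import stats

data = [1.215, 1.216, 1.209, 1.212]
c = 1.150259067  # the previously estimated value

# Tests whether the data could plausibly have come from a distribution
# with mean c; a tiny p-value argues against consistency.
t, p = stats.ttest_1samp(data, popmean=c)
print(f"t = {t:.1f}, p = {p:.2g}")
```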

I already told you how to deal with the problem in general in my last post ... but don't take my word for it:
https://www2.southeastern.edu/Academics/Faculty/rallain/plab194/error.html
 

Related to Errors & Consistency: Finding Agreement

1. What is the importance of finding agreement in scientific research?

Finding agreement is crucial in scientific research because it ensures the accuracy and reliability of the results. It allows for the confirmation of findings by multiple researchers and reduces the likelihood of errors or biases affecting the conclusions drawn from the data.

2. What are the common types of errors in scientific research?

The common types of errors in scientific research include human errors, measurement errors, sampling errors, and systematic errors. Human errors can occur due to mistakes in data collection or analysis, while measurement errors can arise from faulty instruments or techniques. Sampling errors occur when the sample selected for the study is not representative of the entire population, and systematic errors can be caused by external factors that bias the study's results.

3. How can consistency be achieved in scientific research?

Consistency can be achieved in scientific research by following established protocols and methods, using reliable and standardized measurement tools, and ensuring that data is collected and analyzed consistently throughout the study. It is also essential to have multiple researchers replicate the study to confirm the consistency of the results.

4. Why is it important to address errors and inconsistencies in scientific research?

Addressing errors and inconsistencies in scientific research is crucial because they can significantly impact the validity and credibility of the study's findings. Ignoring or overlooking these issues can lead to incorrect conclusions and potentially harm the progress and advancement of knowledge in the field.

5. How can researchers reduce errors and increase consistency in their studies?

Researchers can reduce errors and increase consistency in their studies by implementing rigorous quality control measures, conducting thorough data checks, and having multiple researchers review and validate the results. It is also crucial to be transparent and report any potential errors or inconsistencies in the study's methods or results to ensure the accuracy and integrity of the research.
