Error propagation question - when do we combine repeated measurements?


Discussion Overview

The discussion revolves around the statistical methods for calculating a derived quantity A from multiple measurements of a variable x, each with associated measurement errors. Participants explore different approaches to error propagation and the implications of using averages versus individual measurements, particularly in the context of non-linear functions.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant proposes three methods for calculating A from measurements of x, questioning which is statistically sound.
  • Another participant suggests that under certain assumptions, such as Gaussian distribution and small relative errors, Gaussian error propagation can be applied.
  • A later reply clarifies that the exact method to estimate the mean of B(C(x)) is to take B(C(x)) for each measurement and then average the results, noting that linearity of B and C affects the outcome.
  • Concerns are raised about the non-linearity and potential discontinuity of functions B and C, which may complicate the averaging approach.
  • One participant expresses uncertainty about their goal, indicating a desire for a result that approximates a noiseless case, and reflects on the trade-offs between using smaller-error measurements versus averaging larger-error measurements.
  • Another participant suggests that the original question may not be relevant if the context involves curve fitting, indicating a potential shift in focus.

Areas of Agreement / Disagreement

Participants do not reach a consensus on the best method for calculating A, with multiple competing views on the appropriateness of different approaches and the implications of non-linearity in the functions involved.

Contextual Notes

Participants note limitations related to the assumptions of Gaussian distribution, the nature of the functions B and C, and the impact of measurement errors on the derived calculations.

mikeph
Hi,

say I measure something ten times and get x+dx1, x+dx2, ..., where dx1, dx2, ... are the measurement errors.

Now, say I want to calculate something from these measurements according to:

A = B(C(x)),

Where A is what I want, and B and C are known functions.

Which is statistically the most "sound" way of calculating it?

take the average of the measurements, then find B(C(avg))?
take C(x) then take the average and then find B(avg)?
take A(B(x)) of each measurement, then average the result?
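The three options can be sketched numerically. The functions below are stand-ins (the thread never specifies B and C), chosen only to make the three estimators distinguishable:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the unspecified functions in the question.
def C(y):
    return y ** 2          # assumed non-linear inner function

def B(w):
    return np.exp(-w / 5)  # assumed non-linear outer function

x_true = 2.0
xs = x_true + rng.normal(0.0, 0.3, size=10)  # ten measurements x + dx_i

m1 = B(C(np.mean(xs)))   # option 1: average the measurements, then B(C(avg))
m2 = B(np.mean(C(xs)))   # option 2: take C(x_i), average, then B(avg)
m3 = np.mean(B(C(xs)))   # option 3: B(C(x_i)) per measurement, then average

print(m1, m2, m3)  # for non-linear B, C these generally differ
```

With linear B and C the three numbers would coincide exactly; the spread between them here comes entirely from the curvature of the assumed functions.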

Thanks
 
If you assume a Gaussian distribution, have small relative errors, and B and C are not completely weird, you can propagate the relative uncertainty through your functions with Gaussian error propagation (i.e., see how A changes if x changes a little bit, and multiply by your uncertainty).
If those assumptions are bad (or if you just have 10 measurements), you can calculate A for each measurement and do the statistical analysis on the A values directly.
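A minimal sketch of first-order Gaussian propagation, using a numerical derivative and assumed example functions (sigma_A ≈ |d/dx B(C(x))| · sigma_x):

```python
import numpy as np

# Hypothetical stand-ins for B and C; not from the thread.
def C(y):
    return y ** 2

def B(w):
    return np.exp(-w / 5)

def propagate(x_mean, sigma_x, h=1e-6):
    """First-order (linearized) error propagation for A = B(C(x))."""
    f = lambda x: B(C(x))
    dfdx = (f(x_mean + h) - f(x_mean - h)) / (2 * h)  # central difference
    return abs(dfdx) * sigma_x

sigma_A = propagate(2.0, 0.1)
print(sigma_A)
```

This is exactly the "see how A changes if x changes a little bit" step: the derivative is the local slope, and the approximation is only as good as the linearization over a range of about one sigma_x.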
 
MikeyW said:
Which is statistically the most "sound" way of calculating it?
You didn't say what "it" is. It appears you want to estimate the mean value of B(C(x)).

The mathematically exact way to compute the usual estimator of the mean of B(C(x)) is:
take B(C(x)) of each measurement, then average the result
(You meant to say "B(C(x))" instead of "A(B(x))" )

If B and C are linear functions (such as B(w) = 3w + 2, C(y) = 6y - 4) then you will get the same result as above by using the other two methods:

take the average of the measurements, then find B(C(avg))
take C(x) then take the average and then find B(avg)

If B and C are differentiable functions and the errors dx are small enough that B and C are well approximated by the linear functions defined by their tangent lines in the vicinity of x, you will get approximately the same answer from all 3 methods. On the other hand, if the curvature of B and C is "large" over distances typical of your measurement errors, the latter two methods should not be used.
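The large-curvature case can be illustrated with an assumed strongly curved stand-in for B(C(x)) and deliberately large errors; for a convex function, Jensen's inequality guarantees that f(mean) ≤ mean(f), so averaging first systematically disagrees with the exact estimator:

```python
import numpy as np

rng = np.random.default_rng(1)
xs = 1.0 + rng.normal(0.0, 0.5, size=10_000)  # large relative errors

f = lambda x: x ** 4              # convex stand-in for the composite B(C(x))

avg_then_f = f(np.mean(xs))       # average first (methods 1/2 style)
f_then_avg = np.mean(f(xs))       # exact estimator of the mean of f(x)

print(avg_then_f, f_then_avg)     # the second is systematically larger
```

For x ~ N(1, 0.5), the true mean of x^4 is 1 + 6·0.25 + 3·0.0625 ≈ 2.69, while f evaluated at the mean is close to 1, so the gap is far larger than the sampling noise.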

That's the story if the goal is "Estimate the mean value of B(C(x))?" If the question is "Estimate the value of B(C(w)) evaluated at w = the mean of x?" then B(C(avg)) seems like the natural answer. So it's important that you ask a specific question. Say what you are trying to estimate.
 
That's a good question; I don't know exactly what I want. The data are noisy, so I'd like a result that's as close as possible to the noiseless case. What's confusing me is that I thought taking the average first and then doing the analysis would give the best result, but it doesn't.

B and C: both non-linear and non-analytic, and I'm afraid I'm not 100% certain that they're continuous, but they are deterministic. They're non-linear regression fit parameters, and they do tend to amplify "seemingly" random errors in the data.

I suppose the trade-off lies between a value of A derived from a single smaller-error measurement, and an average of A values derived from many larger-error measurements; the fact that A is not necessarily continuous might mean the latter is the better option...
 
It sounds like you are doing a curve fit of some kind. If that's the case you should explain the details. Your original question seems irrelevant to doing a curve fit.
 
