# Error propagation question - when do we combine repeated measurements?

## Main Question or Discussion Point

Hi,

Say I measure something ten times and get x+dx1, x+dx2, ..., where dx1, dx2, ... are the measurement errors.

Now, say I want to calculate something from these measurements according to:

A = B(C(x)),

Where A is what I want, and B and C are known functions.

Which is statistically the most "sound" way of calculating it?

take the average of the errors, then find B(C(avg))?
take C(x) then take the average and then find B(avg)?
take A(B(x)) of each measurement, then average the result?

Thanks

mfb
Mentor
If you assume a Gaussian distribution, have small relative errors, and B and C are not completely weird, you can propagate the relative uncertainty through your functions with Gaussian error propagation (i.e. see how A changes if x changes a little bit, and multiply that by your uncertainty).
If those assumptions are bad (or if you just have 10 measurements), you can calculate A for all measurements, and do the analysis on A only.
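A minimal sketch of both suggestions, for illustration only. The functions `B` and `C` and the ten measurements below are made-up stand-ins (the thread never specifies them), and the derivative is taken numerically with a central finite difference.

```python
import math
import statistics

# Hypothetical stand-ins for B and C (assumptions, not from the thread).
def C(x):
    return x ** 2

def B(y):
    return math.exp(0.1 * y)

def A(x):
    return B(C(x))

def propagate(f, x, sigma_x, h=1e-6):
    """First-order (Gaussian) propagation: sigma_f ~ |df/dx| * sigma_x."""
    slope = (f(x + h) - f(x - h)) / (2 * h)  # central finite difference
    return abs(slope) * sigma_x

# Made-up measurements with small scatter, for illustration only.
measurements = [2.02, 1.97, 2.05, 1.99, 2.01, 1.96, 2.03, 2.00, 1.98, 2.04]
n = len(measurements)

# Option 1: average x, then propagate the standard error of the mean through A.
x_mean = statistics.mean(measurements)
sigma_mean = statistics.stdev(measurements) / math.sqrt(n)
a_linear = A(x_mean)
sigma_a_linear = propagate(A, x_mean, sigma_mean)

# Option 2: compute A for every measurement and do the analysis on A only.
a_values = [A(x) for x in measurements]
a_direct = statistics.mean(a_values)
sigma_a_direct = statistics.stdev(a_values) / math.sqrt(n)

print(f"propagated: A = {a_linear:.4f} +/- {sigma_a_linear:.4f}")
print(f"direct:     A = {a_direct:.4f} +/- {sigma_a_direct:.4f}")
```

With scatter this small relative to the curvature of `A`, the two options should agree closely; a visible disagreement is a hint that the linear approximation is not good enough.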

Stephen Tashi
Which is statistically the most "sound" way of calculating it?
You didn't say what "it" is. It appears you want to estimate the mean value of B(C(x)).

The mathematically exact way to compute the usual estimator of the mean of B(C(x)) is:
take B(C(x)) of each measurement, then average the result
(You meant to say "B(C(x))" instead of "A(B(x))" )

If B and C are linear functions (such as B(w) = 3w + 2 , C(y) = 6y - 4 ) then you will get the same result as above by using the other two methods:

take the average of the errors, then find B(C(avg))
take C(x) then take the average and then find B(avg)
If B and C are differentiable functions and the errors dx are small enough that B and C are well approximated by the linear functions defined by their tangent lines in the vicinity of x, you will get approximately the same answer from all 3 methods. On the other hand, if the curvature of B and C is "large" over distances that are typical of your measurement errors, the latter two methods should not be used.
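A small sketch comparing the three orderings. The linear pair is the one given above; the curved pair (`C(y) = y**2`, `B(w) = log(w)`) and the widely spread data are hypothetical choices made only to exaggerate the effect.

```python
import math
import statistics

def compare(B, C, xs):
    """Return the three estimates for measurements xs."""
    avg_first = B(C(statistics.mean(xs)))                # average x, then B(C(avg))
    avg_middle = B(statistics.mean([C(x) for x in xs]))  # average C(x), then B(avg)
    avg_last = statistics.mean([B(C(x)) for x in xs])    # B(C(x)) each, then average
    return avg_first, avg_middle, avg_last

# Made-up data, deliberately spread out so curvature matters.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]

# Linear case: B(w) = 3w + 2, C(y) = 6y - 4.
linear = compare(lambda w: 3 * w + 2, lambda y: 6 * y - 4, xs)

# Curved case (hypothetical): B(w) = log(w), C(y) = y**2.
curved = compare(math.log, lambda y: y ** 2, xs)

print("linear:", linear)
print("curved:", curved)
```

In the linear case all three numbers coincide; in the curved case they are clearly different, so the order of averaging changes the answer.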

That's the story if the goal is "Estimate the mean value of B(C(x))?" If the question is "Estimate the value of B(C(w)) evaluated at w = the mean of x?" then B(C(avg)) seems like the natural answer. So it's important that you ask a specific question. Say what you are trying to estimate.
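As a rough guide to how far apart those two targets are, here is the standard second-order Taylor estimate. It assumes the composite f = B∘C is twice differentiable near the mean and that x has mean μ and variance σ² (the symbols f, μ and σ are introduced here for illustration):

$$
\mathrm{E}\big[f(x)\big] \;\approx\; f(\mu) + \tfrac{1}{2}\, f''(\mu)\, \sigma^{2}
$$

So the mean of B(C(x)) and B(C(mean of x)) differ by roughly half the curvature times the variance of the measurements, which vanishes when f is linear or the scatter is small.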

That's a good question, I don't know exactly what I want. The data is noisy, so I'd like to get a result that's closest to the noiseless case. What's confusing me is that I thought taking the average and then doing the analysis would give the best result, but it doesn't.

B and C: both non-linear, non-analytic, and I'm afraid I'm not 100% certain that they're continuous, but they are deterministic. They're non-linear regression fit parameters, but they do tend to amplify "seemingly" random errors in the data.

I suppose the trade-off lies between having a value of A that is due to a single smaller-error measurement, or an average of As due to many larger-error measurements, and the fact that A is not necessarily continuous might mean the latter is a better option...

Stephen Tashi