Error propagation question - when do we combine repeated measurements?

mikeph
Hi,

Say I measure something ten times and get x+dx1, x+dx2, ..., where dx1, dx2, ... are the measurement errors.

Now, say I want to calculate something from these measurements according to:

A = B(C(x)),

Where A is what I want, and B and C are known functions.

Which is statistically the most "sound" way of calculating it?

take the average of the measurements, then find B(C(avg))?
take C(x) then take the average and then find B(avg)?
take A(B(x)) of each measurement, then average the result?

Thanks
 
If you assume a Gaussian distribution, have small relative errors, and B and C are not completely weird, you can propagate the relative uncertainty through your functions with Gaussian error propagation (i.e. see how A changes if x changes a little bit, and multiply that rate of change by your uncertainty).
If those assumptions are bad (or if you just have 10 measurements), you can calculate A for all measurements, and do the analysis on A only.
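A minimal sketch of the two approaches, assuming made-up measurements and hypothetical stand-ins for B and C (the thread never says what they are). `propagate` does first-order Gaussian propagation with a numerical derivative; `analyse_on_A` computes A for every measurement and does the statistics on A only.

```python
import math
import statistics

def B(w):
    # Hypothetical stand-in for B (not given in the thread)
    return math.exp(w)

def C(y):
    # Hypothetical stand-in for C (not given in the thread)
    return y ** 2

def A(x):
    return B(C(x))

def propagate(x_mean, sigma_x, h=1e-6):
    """First-order propagation: sigma_A ~ |dA/dx| * sigma_x."""
    # Central finite difference approximates dA/dx at x_mean
    dA_dx = (A(x_mean + h) - A(x_mean - h)) / (2 * h)
    return A(x_mean), abs(dA_dx) * sigma_x

def analyse_on_A(xs):
    """Compute A for each measurement, then mean and standard error of A."""
    a_values = [A(x) for x in xs]
    mean_a = statistics.mean(a_values)
    sem_a = statistics.stdev(a_values) / math.sqrt(len(a_values))
    return mean_a, sem_a

# Made-up measurements, for illustration only
xs = [1.02, 0.98, 1.01, 0.99, 1.00, 1.03, 0.97, 1.00, 1.01, 0.99]

x_mean = statistics.mean(xs)
sem_x = statistics.stdev(xs) / math.sqrt(len(xs))

a_lin, sigma_lin = propagate(x_mean, sem_x)
a_direct, sigma_direct = analyse_on_A(xs)

print(f"propagated:  A = {a_lin:.4f} +/- {sigma_lin:.4f}")
print(f"direct on A: A = {a_direct:.4f} +/- {sigma_direct:.4f}")
```

With small, well-behaved errors like these, the two results should be close. If they disagree noticeably, that suggests the linear approximation is not good enough for your data.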
 
MikeyW said:
Which is statistically the most "sound" way of calculating it?
You didn't say what "it" is. It appears you want to estimate the mean value of B(C(x)).

The mathematically exact way to compute the usual estimator of the mean of B(C(x)) is:
take B(C(x)) of each measurement, then average the result
(You meant to say "B(C(x))" instead of "A(B(x))".)

If B and C are linear functions (such as B(w) = 3w + 2, C(y) = 6y - 4), then you will get the same result as above by using the other two methods:

take the average of the measurements, then find B(C(avg))
take C(x) then take the average and then find B(avg)
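
A quick check of the linear case, using the example functions B(w) = 3w + 2 and C(y) = 6y - 4 from above. The measurement values are made up for illustration.

```python
import statistics

def B(w):
    return 3 * w + 2

def C(y):
    return 6 * y - 4

# Made-up measurements, for illustration only
xs = [1.0, 2.0, 4.0, 5.0]

# Method 1: average the measurements, then apply B(C(.))
method_1 = B(C(statistics.mean(xs)))

# Method 2: apply C to each measurement, average, then apply B
method_2 = B(statistics.mean([C(x) for x in xs]))

# Method 3: apply B(C(.)) to each measurement, then average
method_3 = statistics.mean([B(C(x)) for x in xs])

print(method_1, method_2, method_3)  # 44.0 44.0 44.0
```

All three agree because averaging commutes with functions of the form a*x + b.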

If B and C are differentiable functions and the errors dx are small enough that B and C are well approximated by the linear functions defined by their tangent lines in the vicinity of x, you will get approximately the same answer from all three methods. On the other hand, if the curvature of B and C is "large" over distances that are typical of your measurement errors, the latter two methods should not be used.
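
To make "curvature" concrete, here is a second-order Taylor sketch. It assumes the composite f(x) = B(C(x)) is twice differentiable near the sample mean, which is an assumption added here and may not hold for every B and C. The symbols \bar{x} (sample mean), s^2 (sample variance with divisor N) and f are notation introduced for this sketch.

```latex
% Expand f(x_i) about the sample mean \bar{x}:
f(x_i) \approx f(\bar{x}) + f'(\bar{x})\,(x_i - \bar{x})
       + \tfrac{1}{2} f''(\bar{x})\,(x_i - \bar{x})^2

% Average over the N measurements. The linear term vanishes
% because \sum_i (x_i - \bar{x}) = 0:
\frac{1}{N}\sum_{i=1}^{N} f(x_i) \approx f(\bar{x})
       + \tfrac{1}{2} f''(\bar{x})\, s^2,
\qquad s^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2
```

So, to this order, averaging B(C(x)) over the measurements differs from B(C(avg)) by about (1/2) f''(avg) s^2. The difference is negligible when the curvature or the scatter is small and grows with both.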

That's the story if the goal is to estimate the mean value of B(C(x)). If instead the goal is to estimate the value of B(C(w)) evaluated at w = the mean of x, then B(C(avg)) seems like the natural answer. So it's important that you ask a specific question. Say what you are trying to estimate.
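
A small numerical illustration that these really are two different quantities when the functions are non-linear. B, C and the data here are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import statistics

def B(w):
    # Hypothetical example function
    return w + 1

def C(y):
    # Hypothetical non-linear example function
    return y ** 2

# Made-up measurements, for illustration only
xs = [1.0, 2.0, 3.0]

# Question 1: the mean value of B(C(x)) over the measurements
mean_of_a = statistics.mean([B(C(x)) for x in xs])  # (2 + 5 + 10) / 3

# Question 2: B(C(w)) evaluated at w = the mean of x
a_at_mean = B(C(statistics.mean(xs)))  # B(C(2.0)) = 5.0

print(mean_of_a, a_at_mean)
```

The two numbers differ, so which one is "right" depends on which question you are asking.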
 
That's a good question, I don't know exactly what I want. The data is noisy, so I'd like to get a result that's closest to the noiseless case. What's confusing me is that I thought taking the average and then doing the analysis would give the best result, but it doesn't.

B and C: both non-linear, non-analytic, and I'm afraid I'm not 100% certain that they're continuous, but they are deterministic. They're non-linear regression fit parameters, but they do tend to amplify "seemingly" random errors in the data.

I suppose the trade-off lies between having a value of A that is due to a single smaller-error measurement, or an average of As due to many larger-error measurements, and the fact that A is not necessarily continuous might mean the latter is a better option...
 
It sounds like you are doing a curve fit of some kind. If that's the case you should explain the details. Your original question seems irrelevant to doing a curve fit.
 