Math with experimental uncertainties

1. Feb 26, 2010

Identity

Why is it that if you have two data points $$a \pm b$$ and $$c \pm d$$ whose uncertainties are symmetrically distributed, the sum of the points is

$$a+c \pm \sqrt{b^2+d^2}$$

A separate question: suppose I have many uncertain data points, $$x_1 \pm y_1$$, $$x_2 \pm y_2$$, ..., $$x_n \pm y_n$$, and a function that acts on all of them: $$f(x_1 \pm y_1, x_2 \pm y_2,...,x_n \pm y_n)$$

Is the following reasoning valid:

For each i, choose the sign in $$x_i \pm y_i$$ that maximizes f.

(For instance, if I had $$f = \frac{1}{x \pm y}$$, I would choose $$x - y$$ to maximize f.)

Next, for each i, choose the sign in $$x_i \pm y_i$$ that minimizes f.

Once you have $$f_{max}$$ and $$f_{min}$$, you take the average of the two, so you have:

$$f(x_1 \pm y_1, x_2 \pm y_2,..., x_n \pm y_n) = \frac{f_{max}+f_{min}}{2} \pm \frac{f_{max}-f_{min}}{2}$$

Thanks!!

Last edited: Feb 26, 2010
2. Feb 27, 2010

vanesch

Staff Emeritus
It is a property of probability distributions that when two of them are convolved, the standard deviation of the result is the square root of the sum of the squares of the individual standard deviations.

In other words: consider a random variable B giving your first error (with standard deviation b), and consider another random variable D giving your second error (with standard deviation d).

If we assume that B and D are statistically independent, we can consider the distribution of the random variable F = B + D, and we know that the probability distribution of F will be the convolution of the distributions of B and D (if they ARE statistically dependent, this is no longer true).

F is the error on the sum, of course. So the probability distribution of the error of the sum (namely of F) is the convolution of the distributions of B and of D. It is a property of the convolution that the standard deviation of the distribution of F, say, f, is given by $$\sqrt{b^2 + d^2}$$.

Hence, $$f = \sqrt{b^2 + d^2}$$.
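As a rough numerical check of the quadrature rule above, here is a minimal Python sketch. The function names `add_with_uncertainty` and `simulate_sum_spread` are made up for illustration, and the simulation assumes independent Gaussian errors (the rule itself only needs independence, not a Gaussian shape).

```python
import math
import random
import statistics


def add_with_uncertainty(a, b, c, d):
    """Return (value, uncertainty) for (a ± b) + (c ± d), assuming independent errors."""
    return a + c, math.hypot(b, d)  # hypot(b, d) = sqrt(b^2 + d^2)


def simulate_sum_spread(b, d, n=100_000, seed=0):
    """Estimate the standard deviation of B + D from independent Gaussian draws."""
    rng = random.Random(seed)
    samples = [rng.gauss(0.0, b) + rng.gauss(0.0, d) for _ in range(n)]
    return statistics.pstdev(samples)


value, sigma = add_with_uncertainty(10.0, 3.0, 20.0, 4.0)
print(value, sigma)                   # 30.0 5.0
print(simulate_sum_spread(3.0, 4.0))  # close to 5, up to sampling noise
```

With b = 3 and d = 4 the simulated spread should come out near 5 rather than near b + d = 7, which is the point of adding in quadrature.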

As to your second question, what you do is a kind of heuristic guessing, which might give an answer that is not too far from the right one, but it is not a correct technique (although, as I said, it may be heuristically useful, say in a computer program that has to give you a rough estimate of the error).

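The correct way to do it, at least if the errors are independent and small enough that the function f can be linearized over their range, is to calculate $$\sqrt{\left(\frac{\partial f}{\partial x_1} y_1\right)^2 + \left(\frac{\partial f}{\partial x_2} y_2\right)^2 + \dots + \left(\frac{\partial f}{\partial x_n} y_n\right)^2}$$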
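To make the comparison concrete, here is a small Python sketch of the linearized formula next to the max/min heuristic from the question. Everything named here (`propagate_linear`, `half_range`, the example function `area`) is hypothetical. The partial derivatives are estimated by central finite differences, which is just one convenient choice, and `half_range` assumes the extremes of f sit at the corners $$x_i \pm y_i$$, which holds for monotonic functions but not in general.

```python
import itertools
import math


def propagate_linear(f, xs, ys, rel_step=1e-6):
    """First-order uncertainty of f(*xs) for small, independent errors ys."""
    total = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        h = rel_step * max(abs(x), 1.0)  # finite-difference step (an assumption)
        up = list(xs)
        up[i] = x + h
        down = list(xs)
        down[i] = x - h
        dfdx = (f(*up) - f(*down)) / (2.0 * h)  # central-difference estimate of df/dx_i
        total += (dfdx * y) ** 2
    return math.sqrt(total)


def half_range(f, xs, ys):
    """Max/min heuristic: (f_max - f_min) / 2 over all sign choices x_i ± y_i."""
    values = [
        f(*(x + s * y for x, s, y in zip(xs, signs, ys)))
        for signs in itertools.product((-1.0, 1.0), repeat=len(xs))
    ]
    return (max(values) - min(values)) / 2.0


def area(x1, x2):
    """Example function: a simple product."""
    return x1 * x2


xs, ys = [2.0, 3.0], [0.1, 0.2]
print(propagate_linear(area, xs, ys))  # about 0.5 = sqrt((3*0.1)^2 + (2*0.2)^2)
print(half_range(area, xs, ys))        # about 0.7, larger than the quadrature result
```

In this example the max/min heuristic gives a wider interval than the quadrature formula, because it treats all the errors as pushing in the worst direction at once.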

The explanation is close to that of the first question: you construct a new random variable (the error on the outcome), which in this case is a weighted sum of the random variables representing the errors on $$x_1, x_2, ..., x_n$$ (with standard deviations $$y_1, y_2, ..., y_n$$). The weights are the partial derivatives, obtained by linearizing f around $$f(x_1, x_2, ..., x_n)$$.
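Spelled out as a sketch, with $$\delta_i$$ as my own notation for the random error on $$x_i$$ (zero mean, standard deviation $$y_i$$) and $$\sigma_f$$ for the standard deviation of the result, the first-order expansion is

$$f(x_1+\delta_1, ..., x_n+\delta_n) \approx f(x_1, ..., x_n) + \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} \delta_i$$

The error on the outcome is then the weighted sum $$\sum_i \frac{\partial f}{\partial x_i} \delta_i$$. If the $$\delta_i$$ are statistically independent, the variances of the terms add, just as in the first question:

$$\sigma_f^2 = \sum_{i=1}^{n} \left(\frac{\partial f}{\partial x_i}\right)^2 y_i^2$$

Taking the square root gives the formula above. For f = x_1 + x_2 all the partial derivatives are 1, and this reduces to $$\sqrt{y_1^2 + y_2^2}$$.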

Last edited: Feb 27, 2010
3. Mar 1, 2010

Identity

Thanks vanesch :)