Math with experimental uncertainties

SUMMARY

The discussion focuses on the derivation of error propagation in the context of experimental uncertainties, specifically addressing the formula for the sum of two data points with uncertainties: a ± b and c ± d, resulting in a + c ± √(b² + d²). It also critiques a heuristic approach of maximizing and minimizing a function f acting on multiple uncertain data points, suggesting that while this method may yield rough estimates, it is not mathematically rigorous. The correct approach involves linearizing the function and calculating the combined uncertainty using the formula √((∂f/∂x₁ · y₁)² + (∂f/∂x₂ · y₂)² + ... + (∂f/∂xₙ · yₙ)²).

PREREQUISITES
  • Understanding of basic statistics, particularly standard deviation and convolution of distributions.
  • Familiarity with error propagation techniques in experimental physics.
  • Knowledge of calculus, specifically differentiation and linearization of functions.
  • Experience with statistical independence and its implications in error analysis.
NEXT STEPS
  • Study the principles of error propagation in experimental data analysis.
  • Learn about convolution of probability distributions and its applications in statistics.
  • Explore linearization techniques for functions with multiple variables in uncertainty analysis.
  • Investigate the use of numerical methods for estimating uncertainties in complex functions.
USEFUL FOR

Researchers, experimental physicists, and statisticians who deal with data analysis involving uncertainties and error propagation in their work.

Why is it that if you have two data points [tex]a \pm b[/tex] and [tex]c \pm d[/tex] whose uncertainties are symmetrically distributed, the sum of the points is

[tex]a+c \pm \sqrt{b^2+d^2}[/tex]

Can someone please help me with this derivation?



A separate question: suppose I have many uncertain data points [tex]x_1 \pm y_1[/tex], [tex]x_2 \pm y_2[/tex], ..., [tex]x_n \pm y_n[/tex], and I have a function that acts on all of them: [tex]f(x_1 \pm y_1, x_2 \pm y_2,...,x_n \pm y_n)[/tex]

Is the following reasoning valid:

Choose [tex]x_i \pm y_i[/tex] in order to maximize f.

(For instance, if I had [tex]f = \frac{1}{x \pm y}[/tex], you would choose [tex]x - y[/tex] to maximize f.)

Next, you choose [tex]x_i \pm y_i[/tex] in order to minimize f.

Once you have [tex]f_{max}[/tex] and [tex]f_{min}[/tex], you find the average of the two, so you have:

[tex]f(x_1 \pm y_1, x_2 \pm y_2,..., x_n \pm y_n) = \frac{f_{max}+f_{min}}{2} \pm \frac{f_{max}-f_{min}}{2}[/tex]
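For concreteness, here is a minimal Python sketch of the max/min heuristic described above, using a made-up data point x = 4.0 ± 0.5 and the function f(x) = 1/x (all numbers are hypothetical, just to illustrate the bookkeeping):

```python
# Sketch of the max/min heuristic for f(x) = 1/x
# with a hypothetical data point x = 4.0 ± 0.5.
def f(x):
    return 1.0 / x

x, y = 4.0, 0.5

# f is decreasing in x, so the maximum occurs at x - y and the minimum at x + y.
f_max = f(x - y)   # 1/3.5 ≈ 0.2857
f_min = f(x + y)   # 1/4.5 ≈ 0.2222

centre     = (f_max + f_min) / 2   # ≈ 0.2540
half_width = (f_max - f_min) / 2   # ≈ 0.0317

print(f"f = {centre:.4f} ± {half_width:.4f}")  # f = 0.2540 ± 0.0317
```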


Thanks!
 
It is a property of probability distributions that when they are convolved, the standard deviation of the result is the square root of the sum of the squares of the individual standard deviations.

In other words: consider a random variable B giving your first error (with standard deviation b), and consider another random variable D giving your second error (with standard deviation d).

If we assume that B and D are statistically independent, we can consider the distribution of the random variable F = B + D, and we know then that the probability distribution of F will be the convolution of the distributions of B and D (if they ARE statistically dependent, this is no longer true).

F is the error on the sum of course. So the probability distribution of the error of the sum (namely of F) is the convolution of the distributions of B and of D. It is a property of the convolution that the standard deviation of the distribution of F, say, f, is given by sqrt(b^2 + d^2).

Hence, f = sqrt(b^2 + d^2).
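One way to see this concretely is a quick Monte Carlo check (a sketch with made-up values b = 0.3 and d = 0.4, assuming Gaussian errors): sample the two independent errors, add them, and compare the empirical standard deviation of the sum with sqrt(b^2 + d^2):

```python
import math
import random

# Monte Carlo check with hypothetical standard deviations b and d:
# the spread of the summed errors should match sqrt(b**2 + d**2).
random.seed(0)
b, d = 0.3, 0.4
n = 200_000

# Independent Gaussian errors, added sample by sample.
sums = [random.gauss(0, b) + random.gauss(0, d) for _ in range(n)]

mean = sum(sums) / n
std = math.sqrt(sum((s - mean) ** 2 for s in sums) / n)

print(std, math.sqrt(b**2 + d**2))  # both close to 0.5
```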

As to your second question, what you do is a kind of heuristic guessing, which might give an answer that is not too far from the right one, but it is not a correct technique (although, as I said, it may be heuristically useful, say in a computer program that has to give you a rough estimate of the error).

The correct way to do it, at least if the errors are small enough that the function f can be linearized over the range of the errors, is to calculate sqrt( (∂f/∂x1 * y1)^2 + (∂f/∂x2 * y2)^2 + ... + (∂f/∂xn * yn)^2 )

The explanation is close to that of the first question: you construct a new random variable (the error on the outcome), which in this case is a weighted sum of the random variables representing the errors y1, y2, ..., yn. The weights are obtained by linearizing f around the point (x1, x2, ..., xn).
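As a sketch of that linearized formula (using a hypothetical function f(x1, x2) = x1 * x2 and made-up uncertainties, with central-difference numerical derivatives standing in for the analytic partials):

```python
import math

# Hypothetical function and data points: x1 = 2.0 ± 0.1, x2 = 3.0 ± 0.2.
def f(x1, x2):
    return x1 * x2

x1, y1 = 2.0, 0.1
x2, y2 = 3.0, 0.2

# Numerical partial derivatives via central differences.
h = 1e-6
df_dx1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2 * h)   # = x2 = 3
df_dx2 = (f(x1, x2 + h) - f(x1, x2 - h)) / (2 * h)   # = x1 = 2

# Combine the scaled uncertainties in quadrature.
sigma_f = math.sqrt((df_dx1 * y1) ** 2 + (df_dx2 * y2) ** 2)
print(sigma_f)  # sqrt(0.3^2 + 0.4^2) = 0.5
```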
 
Thanks vanesch :)
 
