# How do you find the uncertainty in a function of two variables?

1. Sep 30, 2012

### JDoolin

I'm trying to make up an automatic spreadsheet for my Physics students so they can enter the data and see everything calculated for them, but I ran into a snag.

I have two random variables, h and t, with uncertainties Δh and Δt, respectively.

The uncertainty in h is simply estimated, based on the precision and technique used to get the data.

The uncertainty in t is taken as the standard deviation of 4 trials of an experiment.

With these two numbers, h and t, I'm generating the function $a=2h/t^2$ (derived from $h=\frac{1}{2} a t^2$).

Here's my question: What would be the standard method for figuring the uncertainty in a?

I'm thinking the uncertainty in $t^2$ is $2 t \Delta t$ and the uncertainty in $h$ is $\Delta h$, and then you'd multiply these together to get the uncertainty in $a$.

Then to get an idea of uncertainty, percentage-wise, you just take that and divide by the average value of $a$ which is $2 h / \bar t^2$.

Does that sound like the right approach? I'm troubled because I feel like there's an implied factor, $\Delta h/h$, in the final result that is always going to decrease the uncertainty. If this is done correctly, the uncertainty in the height should increase the total uncertainty.

2. Sep 30, 2012

### JDoolin

Another thought:

Using the product rule from calculus, $\Delta (2 h t^{-2})= (2 h) (-2 t^{-3}) \Delta t + 2 (\Delta h) (t^{-2})$, right?
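For what it's worth, here is a minimal sketch of how those two product-rule terms are commonly combined in introductory error analysis. It assumes Δh and Δt are small and, for the quadrature version, independent. Since uncertainties are magnitudes, the minus sign on the Δt term is dropped by taking absolute values. The function name and the numbers are hypothetical, chosen only for illustration.

```python
import math

def acceleration_with_uncertainty(h, dh, t, dt):
    """Return (a, da_linear, da_quadrature) for a = 2*h/t**2."""
    a = 2.0 * h / t**2
    # Partial derivatives of a with respect to h and t (the product-rule terms).
    da_dh = 2.0 / t**2
    da_dt = -4.0 * h / t**3
    # Worst-case (linear) sum: absolute values so the terms cannot cancel.
    da_linear = abs(da_dh) * dh + abs(da_dt) * dt
    # Quadrature sum: assumes the two uncertainties are independent.
    da_quadrature = math.hypot(da_dh * dh, da_dt * dt)
    return a, da_linear, da_quadrature

# Hypothetical measurements: h in metres, t in seconds.
a, da_lin, da_quad = acceleration_with_uncertainty(h=1.000, dh=0.002, t=0.450, dt=0.010)
print(f"a = {a:.2f} m/s^2, linear: {da_lin:.2f}, quadrature: {da_quad:.2f}")
```

Either way, a larger Δh can only increase the result, because the two contributions are added rather than multiplied.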

3. Sep 30, 2012

### chiro

Hey JDoolin.

When you want to look at a function of a random variable and then talk about its uncertainty, I'm assuming you are referring to either its variance or some kind of probability interval.

In the case that you are talking about variance, what you want to calculate is $\mathrm{Var}[t^2]$, which is given by $E[t^4] - (E[t^2])^2$. You can calculate $E[g(t)]$ in the standard way even if you only have a sample of data with no known distribution, as opposed to a distribution with an analytic form.
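As a rough illustration of that formula, the sketch below replaces each expectation with a sample average. This is a plug-in (divide-by-n) estimate, which is biased for a sample as small as four trials; the function name and the timings are hypothetical.

```python
def variance_of_square(times):
    """Plug-in estimate of Var[t^2] = E[t^4] - (E[t^2])^2 from a sample."""
    n = len(times)
    mean_t2 = sum(t**2 for t in times) / n  # sample estimate of E[t^2]
    mean_t4 = sum(t**4 for t in times) / n  # sample estimate of E[t^4]
    return mean_t4 - mean_t2**2

# Hypothetical timings in seconds from four trials.
trial_times = [0.44, 0.46, 0.45, 0.47]
var_t2 = variance_of_square(trial_times)
print(f"Var[t^2] is roughly {var_t2:.3e} s^4")
```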

If this is truly a random variable, then this is the only real way you can quantify uncertainty with a single number (and it's typically what is done when quoting variance or standard deviation which is the square root of variance).

4. Sep 30, 2012

### chiro

You can't do this, and it's one of the reasons why stochastic calculus needs all the higher mathematics tools and why all the "quants", for lack of a better word, need to know all this math (or at least be able to learn it quickly enough to know what the theory says).

In the stochastic calculus model, you have a measure that is with respect to a random variable. In normal calculus your measure (if it's a Riemann integral) is the infinitesimal one and if it's Lebesgue it's the Lebesgue one.

The thing about the measure for random variables (the starting point is typically the Wiener process, which you can think of as a Normal distribution) is that not only is this measure random, but it can also be negative as well as positive.

Because of these attributes, the chain rule that you take for granted doesn't work because these measures are completely different (the dWt is not the same as dt) and because of this, people needed to come up with a way to differentiate functions with respect to these random variables.

The big step for this after Wiener introduced his process was with Itô, who showed a formula for derivatives of processes based on Wiener processes, and this became useful for mathematical finance.

Now you can use approximation methods to get approximate values of stochastic integrals in the exact same way you use the trapezoidal method or Simpson's rule to approximate a normal integral, but instead of the dt you use the difference in the values of the random numbers generated, which means you will get all kinds of values (and this can help illustrate why you can't use calculus in the same way).
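To make that concrete, here is a small sketch that approximates the stochastic integral of W with respect to W using a left-endpoint sum over simulated Wiener increments. The function name, step count, and seed are assumptions made for illustration. Ordinary calculus would suggest the answer is half of W_T squared, but the sum differs from that by half of the accumulated squared increments, which stays close to T/2 instead of vanishing.

```python
import random

def ito_left_sum(n_steps, total_time, seed=0):
    """Left-endpoint sum for the integral of W dW along one simulated path."""
    rng = random.Random(seed)
    dt = total_time / n_steps
    w = 0.0         # current value of the simulated Wiener path
    ito_sum = 0.0   # running sum of W_i * (W_{i+1} - W_i)
    quad_var = 0.0  # running sum of squared increments
    for _ in range(n_steps):
        dw = rng.gauss(0.0, dt ** 0.5)  # increment with variance dt
        ito_sum += w * dw               # integrand taken at the left endpoint
        quad_var += dw * dw
        w += dw
    return ito_sum, w, quad_var

ito_sum, w_end, quad_var = ito_left_sum(n_steps=10_000, total_time=1.0)
print("left-endpoint sum:       ", ito_sum)
print("ordinary-calculus guess: ", 0.5 * w_end**2)
print("guess minus half of sum of squared increments:", 0.5 * (w_end**2 - quad_var))
```

The first and third printed values agree up to rounding, while the second does not.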

5. Sep 30, 2012

### JDoolin

Well, let's see, I think I've broken down the uncertainties into

(1) precision uncertainty--introduced by either limitations in the measurement device, or just a lack of precision in measurement on the part of the experimenter. Don't have any data on this one; the experimenter just has to be honest about how closely he measured it, although with a meter-stick, you can probably put a minimum of around 1 millimeter.
(2) random uncertainty--introduced by limitations on the ability of the experimenter to start a timer as an object is released, and stop the timer when it gets to a certain mark.
(3) systematic uncertainty--introduced by many different things... For instance, (a) if the experimenter's eye isn't lined up with the mark, so the object actually doesn't fall all the way to the mark, (b) if the experimenter consistently starts the timer late or early because of the way the object is dropped.

I'm trying to get an estimate of uncertainty based on (1) and (2) and figure out how uncertainty in the two variables (height, and time) translate into the uncertainty in acceleration.

My end goal is to calculate the acceleration due to gravity. If the actual acceleration of gravity lies within the uncertainty, then there are probably not any significant systematic errors. But if the actual acceleration of gravity lies well outside the uncertainty, then the students need to explain what might have happened during their measurements to create systematic errors.
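The check described above could be sketched in a spreadsheet-like way as follows. The reference value of 9.81 m/s² is an assumed nominal figure (the local value differs slightly), and the `coverage` multiplier is a hypothetical knob for how many uncertainty widths count as "within".

```python
def is_consistent(measured, uncertainty, reference=9.81, coverage=1.0):
    """True if the reference value lies within coverage * uncertainty of the measurement."""
    return abs(measured - reference) <= coverage * uncertainty

# Hypothetical student results in m/s^2.
print(is_consistent(9.88, 0.44))  # reference falls inside the quoted uncertainty
print(is_consistent(8.90, 0.30))  # reference falls well outside: look for systematic errors
```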

6. Sep 30, 2012

### chiro

Well, precision uncertainty is usually something bounded (i.e. a constant) with a plus/minus value. You could give this a distribution if you wanted to (like a uniform distribution for example) but you don't necessarily have to.

In terms of the random error, a lot of this kind of error uses a normal distribution and for the kind of error you are describing, a normal distribution is a good model not only mathematically but physically as well (since you expect most values to cluster around a mean and get other values that decay as things get further). It may not be exact, but it is a good model to use (Remember: all models aren't correct but some are more useful than others).

In terms of 3 and the final error, you introduce random variables for each "error" and the total error becomes the sum of those errors. You can figure out the mean and variance of the sum of these errors using standard expectation and variance operator identities.

You can actually correlate these errors if you want to and make them dependent: it's up to you. Also you can obtain confidence intervals for the final random variable (remember they are just sums of individual ones) or get variance/standard-deviations and use that as well.
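For two error terms, the identities referred to above can be written out as follows. The symbols $\varepsilon_1$ and $\varepsilon_2$ are notation introduced here for the individual errors:

$$E[\varepsilon_1+\varepsilon_2] = E[\varepsilon_1] + E[\varepsilon_2], \qquad \mathrm{Var}[\varepsilon_1+\varepsilon_2] = \mathrm{Var}[\varepsilon_1] + \mathrm{Var}[\varepsilon_2] + 2\,\mathrm{Cov}[\varepsilon_1,\varepsilon_2]$$

If the errors are independent, the covariance term is zero and the variances simply add, which is where adding standard deviations in quadrature comes from.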

7. Sep 30, 2012

### JDoolin

Alright, let me tell you another way I thought of doing it.

Instead of dealing with the uncertainty in the time, directly, I calculate all four accelerations from each individual time. Then at the end, I calculate the standard-deviation in the acceleration. However, this does not include the uncertainty in the height.
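A minimal sketch of that per-trial calculation, using made-up numbers, might look like this. The height, the timings, and the function name are all hypothetical, and `statistics.stdev` is the sample (n-1) standard deviation.

```python
import statistics

def per_trial_accelerations(h, times):
    """Acceleration a = 2*h/t**2 computed separately for each timing."""
    return [2.0 * h / t**2 for t in times]

h = 1.000                         # hypothetical drop height in metres
times = [0.44, 0.46, 0.45, 0.47]  # hypothetical timings in seconds

accels = per_trial_accelerations(h, times)
a_mean = statistics.mean(accels)
a_std = statistics.stdev(accels)    # spread of the individual trials
a_sem = a_std / len(accels) ** 0.5  # spread of the mean, if the mean is what gets reported

print(f"mean a = {a_mean:.2f}, std = {a_std:.2f}, std of mean = {a_sem:.2f} (m/s^2)")
```

As noted, this spread reflects only the timing scatter; the same h is used in every trial, so Δh does not show up in it.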

In order to fix that, I take the uncertainty in the height as a fraction and add 1. $1+\frac{\Delta h}{h}$. Then I multiply this height uncertainty factor by the uncertainty in acceleration found earlier.

Just for clarification though, I'm not sure this is a stochastic process... The true quantity that the students are trying to measure is not random at all. However, the experimental results they have; the numbers that are written on the paper--that may be a random variable, after all. So I'm basically trying to get an estimate of the true number from a set of random numbers generated from that true number. If the experiment were done perfectly, I would expect the time to be exactly the same for each trial, within the precision capabilities of the measurement devices. Almost all of the randomness in the experiment comes from a lack of precision and accuracy in the measurements of the heights and times.

The experiment itself might be a stochastic process, but the number we're trying to estimate from it is a constant, if that helps.

8. Sep 30, 2012

### JDoolin

Let's quantify that a little more... Of course I can't directly add the uncertainty in h to the uncertainty in t or $t^2$, because they don't have the same units.

But the factor of Δh/h as a percentage uncertainty in the acceleration can be added on to the uncertainty in the acceleration due to the timing as a percentage increase.

That's what I was doing in my last post, by multiplying by $1+\frac{\Delta h}{h}$.
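A quick check of the algebra, as a sketch: writing $\delta_t$ (notation introduced here) for the fractional uncertainty in $a$ due to timing alone, the multiplication gives

$$\delta_t\left(1+\frac{\Delta h}{h}\right) = \delta_t + \delta_t\,\frac{\Delta h}{h},$$

so the height contributes $\delta_t\,\Delta h/h$, a product of two small numbers, instead of $\Delta h/h$ itself. Adding the percentages directly would give $\delta_t + \Delta h/h$ as a worst case, or $\sqrt{\delta_t^2 + (\Delta h/h)^2}$ if the two sources are assumed independent.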

9. Sep 30, 2012

### chiro

I'll comment on the other points later, but this problem of a true parameter being a constant but being embedded in a sample is exactly what estimation in statistics is all about.

The one that most people are familiar with is estimating a population mean. The population mean is fixed and doesn't change at all, yet with a limited sample we can't determine it with certainty, so we need to construct the best estimator we can. This is exactly what has happened with t-tests, ANOVAs, chi-squared tests, F-tests, and all the results related to estimation and hypothesis testing (which is quite a lot).

So if for example your process was modelled by a normal distribution with the true value of the experiment being the mean, and the actual realizations that account for error being the possible values in the distribution, then given this assumption, you can take your sample and estimate either of these values or you could even assume that one of these values takes a specific value (for example if you know theoretically what the value should be for the real result you can set the population mean to that value and it becomes a known quantity).

In the case of the above, the only thing left is to estimate the variance (I'm still assuming normality, but this doesn't have to be the case: it just makes it easier to introduce the methods and the thinking).

So you have a known mean, you assume an underlying Normal distribution: you then check if you meet the assumptions for a chi-square test in which you can get a distribution involving the population variance and the sample variance which is a function of your observations.
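For reference, the standard results behind that chi-square statement, assuming $n$ independent observations $x_i$ from a Normal distribution with mean $\mu$ and variance $\sigma^2$ (notation introduced here), are

$$\frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2 \sim \chi^2_{n}, \qquad \frac{(n-1)\,s^2}{\sigma^2} \sim \chi^2_{n-1},$$

where the first applies when the mean $\mu$ is known and the second when it is replaced by the sample mean, with $s^2$ the usual sample variance.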

This is an extremely simple model to account for random errors and is used in many statistical models as well as in error analysis (which is what physicists and scientists use to understand errors).

So the point is that you can take a sample and assume some distributional constraint (like normality). If you do, you should check that the sample meets the criteria by looking at the histogram and running some goodness-of-fit tests (like Shapiro-Wilk, but also others like Jarque-Bera and Kolmogorov-Smirnov, and anything else that gives indicators to support evidence for normality).

You can then assume a population parameter being equal to something (you might do this from a theoretical estimate using physics calculations) or you can take your sample and make an inference on what that parameter should be and then given some significance level, see what the values of this parameter are given a specific interval corresponding to a specific probability and then make up your mind on whether this makes sense physically.

The parameters for a normal are mean and variance and you can estimate both from the sample, or set any of them to a fixed value (as explained above) but from what you are saying, you probably will not fix the variance and you will want to estimate this even if you do fix the mean by using a theoretical physical calculation.

What I'd recommend though for the mean is that you do inference on the mean by doing your theoretical calculation and then testing the hypothesis that the population mean equals the value of the theoretical calculation and see if you fail to reject that.

If you do reject that, it means that the data supports the notion that the theory is wrong (you can't conclude this, but it provides evidence for it), provided everything has been set up and done according to the right protocol.
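As a rough sketch of the hypothesis test recommended above, a one-sample t statistic can be computed by hand for four trials. The data, the function name, and the theoretical value of 9.81 m/s² are assumptions for illustration; 3.182 is the two-sided 5% critical value of Student's t with 3 degrees of freedom, and the test assumes the per-trial values are roughly normally distributed.

```python
import statistics

def one_sample_t(sample, mu0):
    """t statistic for testing whether the population mean equals mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n-1 denominator)
    return (mean - mu0) / (s / n ** 0.5)

# Hypothetical per-trial accelerations in m/s^2.
accels = [9.6, 10.1, 9.9, 9.7]
t_stat = one_sample_t(accels, mu0=9.81)

T_CRIT_DF3_5PCT = 3.182  # two-sided 5% critical value, 3 degrees of freedom
reject = abs(t_stat) > T_CRIT_DF3_5PCT
print(f"t = {t_stat:.3f}, reject at 5%: {reject}")
```

Failing to reject here only means the data are compatible with the theoretical value at that significance level; with four trials the test has little power.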