# Distribution function approach to error propagation

1. Aug 7, 2007

### Jezabel

Hello,

I'm familiar with the common calculus approach with partial derivatives to evaluate error propagation in calculations with random variables. However, I'm looking for a way to derive the classic formula with the sum of fractional errors squared:
$${\left(\frac{\Delta Z}{Z}\right)}^2 = {\left(\frac{\Delta X}{X}\right)}^2 + {\left(\frac{\Delta Y}{Y}\right)}^2$$

for the error propagation in a quotient of random variables X & Y:
$$Z = X/Y$$

using only the probability density functions (pdf), given that in my specific situation X and Y follow typical Gaussian distributions. I've played around with transformations of pdfs and multivariate joint pdfs (though in my case X and Y are independent), but I haven't reached my goal so far.

Is it possible to formally replace the $$\Delta X, \Delta Y, \Delta Z$$ of the above equation by the standard deviations of the corresponding Gaussian pdfs? Or am I obliged to resort to Monte Carlo brute force to work with the pdfs from the start?
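For what it's worth, the square-error formula can be checked numerically without any calculus. Here is a quick Monte Carlo sketch in stdlib Python (the means and widths are made-up test values, not from any real measurement) that compares the sampled fractional spread of Z = X/Y against the formula above:

```python
import math
import random

def mc_ratio_fractional_error(mu_x, sigma_x, mu_y, sigma_y, n=200_000, seed=1):
    """Sample independent Gaussians X and Y, form Z = X/Y, and compare the
    observed fractional spread of Z with sqrt((sx/mx)^2 + (sy/my)^2)."""
    rng = random.Random(seed)
    zs = [rng.gauss(mu_x, sigma_x) / rng.gauss(mu_y, sigma_y) for _ in range(n)]
    mean_z = sum(zs) / n
    var_z = sum((z - mean_z) ** 2 for z in zs) / (n - 1)
    observed = math.sqrt(var_z) / abs(mean_z)
    predicted = math.sqrt((sigma_x / mu_x) ** 2 + (sigma_y / mu_y) ** 2)
    return observed, predicted

obs, pred = mc_ratio_fractional_error(100.0, 2.0, 50.0, 1.5)
print(obs, pred)  # agree to within a few percent when the relative errors are small
```

The agreement is only as good as the linearization, so it degrades if the relative errors grow or Y gets close to zero.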

2. Aug 8, 2007

### EnumaElish

If Z = X/Y, isn't $\Delta Z/Z = \Delta X/X - \Delta Y/Y$? How do you go from this to the squared errors? (Are you assuming $\Delta X\Delta Y = 0$?) Regardless, you may find the following useful: http://www.stata.com/support/faqs/stat/deltam.html

Last edited: Aug 8, 2007
3. Aug 8, 2007

### Dr Transport

In the derivation of the mean square error, you are assuming that the cross-terms are zero, hence your formula given in the first post.

4. Aug 14, 2007

### Jezabel

Thank you very much for your help, and I apologize for not having answered earlier. I'm facing a deadline to submit a paper, and it's keeping me rather busy...

The link provided by EnumaElish guided me to the resource I very much needed: a more solid book on statistics than what I already had. It provided the clue I was looking for: the square-error method comes from a linearization of the problem, passing into the realm of calculus through a Taylor series expansion. And yes, as was pointed out, I was assuming $$\Delta X \Delta Y = 0$$ at that point in my analysis.

However, I took a new approach to modeling my experiment, redefining X and Y so that they are very much correlated; I'll therefore include the cross-term in the end. Unfortunately, this new model has me stuck at a new place, so I thought I'd share the whole model with you instead of only the tip of the iceberg as I did previously. It goes as follows:

I have red dots and green dots that get drawn by bigger beads. The total number of dots is in large excess compared to the number of beads, so I consider what happens at each bead independently. I know the relative proportion of red dots versus green dots in my experiment, hence I can set the probability of drawing a red dot, $$p_r$$, versus the probability of drawing a green dot, $$p_g = 1 - p_r$$. I should end up with a binomial distribution for the number of red dots $$n_r$$ and for the number of green dots $$n_g$$ drawn when I observe a large sample of beads. So far, so good. My first hurdle was the question I initially posted: experimentally, I don't have access to the absolute numbers of red and green dots, but only to their ratio $$n_r/n_g$$. How do I calculate the resulting variance of this ratio of random variables? But given the experimental conditions this was measured in, I'm totally happy with the square-error approximation method.

The second hurdle, I just realized, is that the number of draws (hence the total number of dots $$n_t = n_r + n_g$$ on a bead) is not fixed; it fluctuates from one bead to the next. *sigh* The number of draws for each bead is limited by the number of dots that fit on its surface, and the beads have roughly, but not exactly, the same size. I have a few measurements that let me estimate the mean value of $$n_t$$, and a very rough idea of its variance across beads. As I write this post, I see better the link between $$n_t$$ and the ratio $$n_r/n_g$$ I measure for each bead, but it's far from obvious to me how the fluctuations of $$n_t$$ will affect the distribution of the ratio measured over many beads...
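If it helps, the joint effect of the binomial split and a fluctuating $$n_t$$ is easy to play with numerically. Below is a rough stdlib-Python sketch; the counts, the ~10% bead-size spread, and the use of a normal approximation for the binomial draw are all illustrative assumptions of my own, not values from the experiment:

```python
import math
import random

def simulate_ratios(p_r, mean_nt, sd_nt, beads=50_000, seed=2):
    """Per bead: draw a total dot count n_t (rounded Gaussian, a crude
    stand-in for bead-size fluctuation), then n_r ~ Binomial(n_t, p_r) via
    its normal approximation (fine while n_t*p_r >> 1); return n_r/n_g."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(beads):
        n_t = max(2, round(rng.gauss(mean_nt, sd_nt)))
        s = math.sqrt(n_t * p_r * (1 - p_r))
        n_r = min(n_t - 1, max(1, round(rng.gauss(n_t * p_r, s))))
        ratios.append(n_r / (n_t - n_r))
    return ratios

def mean_var(xs):
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

print(mean_var(simulate_ratios(0.3, 1000, 0)))    # n_t fixed
print(mean_var(simulate_ratios(0.3, 1000, 100)))  # n_t fluctuating ~10%
```

Comparing the two runs gives a feel for how much of the spread of $$n_r/n_g$$ comes from the binomial split itself versus the bead-to-bead fluctuation of $$n_t$$.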

Thanks again for any insight!

P.S. I feel any advice given beyond this point should be acknowledged in my paper. If you help me and do not mind sharing your real name with me (via private message), I'll gladly include your name in my acknowledgment section and give you enough info about the paper so you can track its publication.

5. Aug 15, 2007

### Jezabel

Progress

After a day of thinking about it (and reading up on statistics!), here is the approach I'm currently taking to treat the problem:

I'm considering the drawing of dots by the bigger beads as a random experiment as well (and neglecting quite a few experimental factors at the same time, but I'll see about improvements to include them later). I'm taking the binomial approach where a dot has a probability $$1/B$$ (where $$B$$ is the total number of beads) of being drawn by a given bead and a probability $$1-1/B$$ of being drawn by any of the other beads. Since $$B$$ is large in my experiment and the number of draws made (i.e. $$D$$, the total number of dots going onto all beads) is even larger, I can approximate the distribution of the number of dots on a given bead, $$n_t$$, by a Poisson distribution with expected value and variance $$\lambda=D/B$$.
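The binomial-to-Poisson step is easy to sanity-check numerically. A small stdlib-Python comparison of the two pmfs, computed in log space to avoid overflow at large $$D$$ ($$D$$ and $$B$$ here are arbitrary illustrative counts, not the experiment's):

```python
import math

def binom_pmf(k, n, p):
    """Binomial pmf evaluated in log space (safe for very large n)."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)

def poisson_pmf(k, lam):
    """Poisson pmf, also in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

D, B = 100_000, 1_000          # illustrative: lambda = D/B = 100
for k in (80, 100, 120):
    print(k, binom_pmf(k, D, 1 / B), poisson_pmf(k, D / B))
```

The two columns agree to several decimal places, which is why the Poisson model for $$n_t$$ is safe once $$B$$ is large and $$D/B$$ is moderate.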

Then back to the red-dot-versus-green-dot part of the experiment which I described in my previous post. Another binomial distribution, unless I'm very much mistaken! Looking at joint and conditional distributions, I found an example (on p. 86 of J.A. Rice, Mathematical Statistics and Data Analysis, Duxbury Press (1995)) that seemed similar to my situation. It showed how to sum over a Poisson distribution (of $$n_t$$ in my case) multiplied by the binomial distribution conditional on it ($$n_t$$ fixing the number of trials of the red-versus-green part of the experiment). By the law of total probability, this summation should give me the probability distribution of drawing $$n_r$$ red dots on a bead, which the book gives as:

$$p(n_r) = \frac{ (\lambda p_r)^{n_r} }{ n_r! } e^{ -\lambda p_r }$$

where $$\lambda=D/B$$ as above and $$p_r$$ is known as mentioned in the previous post. I now have a Poisson distribution for $$n_r$$ as well, with expectation value and variance:

$$\lambda_{n_r} = \frac{D \cdot p_r}{B}$$

where $$D/B$$ is the expected value of the number of dots $$n_t$$ per bead, for which I have an experimental estimate. I can apply the same reasoning to $$n_g$$, and I know the proportion of red dots versus green dots, hence $$p_r$$ and $$p_g = 1-p_r$$. Thus, I should have all the information required to use the square-error method (adding the covariance term of $$n_r$$ and $$n_g$$) to approximate the variance of the ratio $$Z = n_r/n_g$$ (see the first post of this thread). Voilà.
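This thinning result (a Poisson number of trials, each kept with probability $$p_r$$, gives a Poisson count again) can itself be checked by simulation. A stdlib-Python sketch with a made-up $$\lambda$$ and $$p_r$$; `poisson_sample` is Knuth's textbook multiplication method, adequate for moderate $$\lambda$$:

```python
import math
import random

def poisson_sample(lam, rng):
    """Knuth's multiplication method; fine for moderate lambda."""
    threshold = math.exp(-lam)
    k, prod = 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= threshold:
            return k
        k += 1

lam, p_r = 20.0, 0.3   # illustrative values, not the real D/B and p_r
rng = random.Random(3)
n_rs = []
for _ in range(100_000):
    n_t = poisson_sample(lam, rng)                            # total dots on a bead
    n_rs.append(sum(rng.random() < p_r for _ in range(n_t)))  # binomial thinning
mean = sum(n_rs) / len(n_rs)
var = sum((x - mean) ** 2 for x in n_rs) / (len(n_rs) - 1)
print(mean, var)  # both should sit near lam * p_r = 6, as for Poisson(6)
```

The sample mean and sample variance of $$n_r$$ come out equal (to within sampling noise), exactly as a Poisson distribution with parameter $$\lambda p_r$$ requires.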

Hoping this is somewhat clear, any comment is welcome. I'm somewhat disturbed by the absence of a contribution of the fluctuations of $$n_t$$ to the variance of $$n_r$$ and $$n_g$$: only its expected value $$D/B$$ is included. But I guess this is due to my choice of a Poisson distribution as the model for $$n_t$$, which fixes the variance to equal the expectation value...
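Actually, writing out the variance with the law of total variance (a standard identity, conditioning on $$n_t$$) shows where the fluctuations went:

$$\mathrm{Var}[n_r] = \mathrm{E}\big[\mathrm{Var}[n_r \mid n_t]\big] + \mathrm{Var}\big[\mathrm{E}[n_r \mid n_t]\big] = \mathrm{E}[n_t]\,p_r(1-p_r) + \mathrm{Var}[n_t]\,p_r^2 = \lambda p_r(1-p_r) + \lambda p_r^2 = \lambda p_r$$

The first term is the binomial spread at fixed $$n_t$$; the second term, $$\lambda p_r^2$$, is precisely the contribution of the bead-to-bead fluctuations of $$n_t$$ (using $$\mathrm{Var}[n_t]=\lambda$$ for the Poisson model), and it is what inflates the variance from $$\lambda p_r(1-p_r)$$ up to the Poisson value $$\lambda p_r$$.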

6. Aug 20, 2007

### EnumaElish

Theorem 4 of Chapter V in Mood, Graybill & Boes, Introduction to the Theory of Statistics:

$$\mathrm{E}[X/Y] \approx \frac{\mathrm{E}[X]}{\mathrm{E}[Y]} - \frac{\mathrm{Cov}[X,Y]}{(\mathrm{E}[Y])^2} + \frac{\mathrm{E}[X]\,\mathrm{Var}[Y]}{(\mathrm{E}[Y])^3}$$

and

$$\mathrm{Var}[X/Y] \approx \left(\frac{\mathrm{E}[X]}{\mathrm{E}[Y]}\right)^2 \left(\frac{\mathrm{Var}[X]}{(\mathrm{E}[X])^2} + \frac{\mathrm{Var}[Y]}{(\mathrm{E}[Y])^2} - \frac{2\,\mathrm{Cov}[X,Y]}{\mathrm{E}[X]\,\mathrm{E}[Y]}\right)$$

P.S. EX = Mean[X] and EY = Mean[Y].

If X and Y are independent, their Cov = 0. If they are identically distributed, EX = EY and Var[X] = Var[Y].

The set of random variables {X, Y, X+Y} has one redundant element. If you know any two, then you know the third.
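These two approximations are easy to spot-check by simulation. Here is a quick stdlib-Python sketch for the variance formula, using correlated Gaussians built from shared noise (the means, variances, and covariance are arbitrary test values of my choosing):

```python
import random

def mc_check_ratio_variance(n=300_000, seed=4):
    """Build correlated Gaussians X = 100 + e1, Y = 50 + 0.5*e1 + e2 with
    e1 ~ N(0, 2) and e2 ~ N(0, 1), so that Var[X] = 4, Var[Y] = 2, and
    Cov[X, Y] = 2 exactly by construction; then compare the sample
    variance of X/Y against the delta-method approximation."""
    rng = random.Random(seed)
    rs = []
    for _ in range(n):
        e1, e2 = rng.gauss(0, 2), rng.gauss(0, 1)
        rs.append((100 + e1) / (50 + 0.5 * e1 + e2))
    m = sum(rs) / n
    sample_var = sum((r - m) ** 2 for r in rs) / (n - 1)
    ex, ey, vx, vy, cov = 100.0, 50.0, 4.0, 2.0, 2.0
    delta_var = (ex / ey) ** 2 * (vx / ex**2 + vy / ey**2 - 2 * cov / (ex * ey))
    return sample_var, delta_var

sv, dv = mc_check_ratio_variance()
print(sv, dv)  # both close to 0.0016
```

With relative errors this small the two values agree closely; the approximation degrades as Var[Y]/(E[Y])^2 grows.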

Last edited: Aug 20, 2007