Distribution function approach to error propagation

Jezabel
Hello,

I'm familiar with the common calculus approach with partial derivatives to evaluate error propagation in calculations with random variables. However, I'm looking for a way to derive the classic formula with the sum of fractional errors squared:
\left(\frac{\Delta Z}{Z}\right)^2 = \left(\frac{\Delta X}{X}\right)^2 + \left(\frac{\Delta Y}{Y}\right)^2

for the error propagation in a quotient of random variables X & Y:
Z = X/Y

using only the probability density functions (pdf), given that in my specific situation X & Y follow typical Gaussian distributions. I've played around with transformations of pdfs and with multivariate joint pdfs (though in my case X & Y are independent), but I haven't reached my goal so far.

Is it possible to formally replace the \Delta X, \Delta Y, \Delta Z of the above equation by the variances of the corresponding Gaussian pdfs? Or am I obliged to resort to Monte Carlo brute force and work with the pdfs from the start?
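Strictly speaking, the ratio of two Gaussians has heavy tails (its variance is not finite in general), but when the relative errors are small, Y stays far from zero and the linearized quadrature formula works well in practice. A minimal Monte Carlo sketch of this comparison (the means and widths below are illustrative, not from any real measurement):

```python
import numpy as np

rng = np.random.default_rng(0)

# Independent Gaussians with small relative errors, so Y stays far
# from zero and the linearization is a good approximation.
muX, sigX = 10.0, 0.2
muY, sigY = 5.0, 0.1

X = rng.normal(muX, sigX, 1_000_000)
Y = rng.normal(muY, sigY, 1_000_000)
Z = X / Y

# Quadrature (linearized) prediction for the fractional error of Z = X/Y
frac_pred = np.sqrt((sigX / muX) ** 2 + (sigY / muY) ** 2)

# Empirical fractional spread of the sampled ratio
frac_mc = Z.std() / Z.mean()

print(frac_pred, frac_mc)  # the two should agree closely
```

With 4% fractional errors on X and Y, the sampled spread and the quadrature prediction typically agree to well under a percent.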
 
In the derivation of the mean square error, you are assuming that the cross-terms are zero, hence your formula given in the first post.
 
Thank you very much for your help, and I apologize for not having answered earlier. I'm facing a deadline to submit a paper; it's keeping me rather busy...

The link provided by EnumaElish guided me to the resource I very much needed: a more solid book on statistics than what I already had. It provided the clue I was looking for: the square error method comes from a linearization of the problem, passing into the realm of calculus through a Taylor series expansion. And yes, as was pointed out, I was assuming \Delta X \Delta Y = 0 at that point in my analysis.

However, I took a new approach to model my experiment, redefining X and Y so that they are strongly correlated; I'll therefore include the cross-term in the end. Unfortunately, this new model has me stuck at a new place, so I thought I'd share the whole model with you instead of only the tip of the iceberg as I did previously. It goes as follows:

I have red dots and green dots that are drawn by bigger beads. The total number of dots is in large excess compared to the number of beads, so I consider what happens at each bead independently. I know the relative proportion of red dots versus green dots in my experiment, hence I can set the probability of drawing a red dot p_r versus the probability of drawing a green dot p_g = 1 - p_r. I should end up with a binomial distribution for the number of red dots n_r and for the number of green dots n_g that are drawn when I observe a large sample of beads. So far, so good. My first hurdle was the question I initially posted. Experimentally, I don't have access to the absolute numbers of red and green dots, but instead their relative value n_r/n_g. How to calculate the resulting variance of this ratio of random variables? But given the experimental conditions this was measured in, I'm totally happy with the square error approximation method.
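For the fixed-n_t version of this setup, the square error method with the covariance term included can be checked against simulation. Since n_g = n_t - n_r, the two counts are perfectly anticorrelated, with Cov[n_r, n_g] = -n_t p_r p_g. A sketch with illustrative numbers (n_t = 1000 and p_r = 0.3 are assumptions, not values from the experiment):

```python
import numpy as np

rng = np.random.default_rng(1)

n_t, p_r = 1000, 0.3          # dots per bead (fixed here) and red-dot probability
p_g = 1 - p_r

# Simulate many beads: n_r ~ Binomial(n_t, p_r), n_g = n_t - n_r
n_r = rng.binomial(n_t, p_r, 500_000)
n_g = n_t - n_r
ratio = n_r / n_g

# Delta-method (square error) variance of n_r/n_g, including covariance.
# With n_g = n_t - n_r, Cov[n_r, n_g] = -n_t p_r p_g (perfect anticorrelation).
mu_r, mu_g = n_t * p_r, n_t * p_g
var_r = var_g = n_t * p_r * p_g
cov = -n_t * p_r * p_g
var_pred = (mu_r / mu_g) ** 2 * (
    var_r / mu_r**2 + var_g / mu_g**2 - 2 * cov / (mu_r * mu_g)
)

print(var_pred, ratio.var())  # prediction vs. sampled variance of the ratio
```

Note the minus sign in front of the covariance term: a negative covariance therefore *increases* the variance of the ratio relative to the independent case.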

The second hurdle I just realized I had was that the number of draws (hence the total number of dots n_t = n_r + n_g on a bead) is not a fixed number, it fluctuates from one bead to the next. *sigh* The number of draws for each bead is limited by the number of dots I can put to fill its surface and the beads have roughly, but not exactly, the same size. I have a few measurements that allow me to estimate the mean value of n_t and have a very rough idea of its variance across the beads. As I write this post, I see better the link between n_t and the ratio I measure for each bead n_r/n_g, but it's far from obvious to me how the fluctuations of n_t will affect the distribution of my ratio measured for many beads...

Thanks again for any insight!

P.S. I feel any advice given beyond this point should be acknowledged in my paper. If you help me and do not mind sharing your real name with me (via private message), I'll gladly include your name in my acknowledgment section and give you enough info about the paper so you can track its publication.
 
Progress

After a day of thinking about it (and reading up on statistics!), here is the approach I'm currently taking to treat the problem:

I'm considering the drawing of dots by the bigger beads as a random experiment as well (neglecting quite a few experimental factors at the same time, but I'll see about improvements to include them later). I'm taking the binomial approach, where a dot has a probability 1/B (where B is the total number of beads) of being drawn by a given bead and a probability 1 - 1/B of being drawn by one of the other beads. Since B is large in my experiment and the number of draws made (i.e. D, equal to the total number of dots going onto all beads) is even larger, I can approximate the distribution of the number of dots on a given bead, n_t, by a Poisson distribution with expected value and variance \lambda = D/B.

Then back to the red dot versus green dot part of the experiment, which I described in my previous post. Another binomial distribution, unless I'm very much mistaken! Looking at joint distributions and conditional distributions, I found an example (on p. 86 of J.A. Rice, Mathematical Statistics and Data Analysis, Duxbury Press (1995)) that seemed similar to my situation. It showed how to sum over a Poisson distribution (of n_t in my case) multiplied by the binomial distribution conditional on it (n_t fixing the number of trials of the red dot versus green dot part of the experiment). By the law of total probability, this summation should give the probability distribution of drawing n_r red dots on a bead, which the book gives as:

p(n_r) = \frac{ (\lambda p_r)^{n_r} }{ n_r! } e^{ -\lambda p_r }

where \lambda=D/B as above and p_r is known as mentioned in the previous post. I now have a Poisson distribution for n_r as well, with expectation value and variance:

\lambda_{n_r} = \frac{D \cdot p_r}{B}

where D/B is the expectation value of n_t dots per bead for which I have an experimental estimate. I can take the same reasoning for n_g and I know the proportion of red dots versus green dots, hence p_r and p_g = 1-p_r. Thus, I should have all the information required to use the square error method (adding the covariance term of n_r and n_g) to approximate the variance of the ratio Z = n_r/n_g (see first post of this thread). Voilà.
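The two-stage model can be simulated directly to check this Poisson thinning result. One consequence worth noting: when n_t is Poisson, the thinned counts n_r and n_g come out *independent* (a known property of Poisson splitting), so under this model the covariance term in the square error method should be close to zero. A sketch with illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(3)

lam, p_r = 100.0, 0.3          # lambda = D/B and red-dot probability (illustrative)

# Two-stage draw: n_t ~ Poisson(lam), then n_r | n_t ~ Binomial(n_t, p_r)
n_t = rng.poisson(lam, 500_000)
n_r = rng.binomial(n_t, p_r)
n_g = n_t - n_r

# Poisson thinning: n_r ~ Poisson(lam * p_r), n_g ~ Poisson(lam * (1 - p_r)),
# and (perhaps surprisingly) n_r and n_g are independent under this model.
print(n_r.mean(), n_r.var())   # both should be close to lam * p_r = 30
print(np.cov(n_r, n_g)[0, 1])  # should be close to 0
```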

Hoping this is somewhat clear, any comment is welcomed :smile: I'm somewhat disturbed by the absence of a contribution of the fluctuations of n_t to the variances of n_r and n_g; only its expected value D/B is included. But I guess this is due to my choice of a Poisson distribution as the model for n_t, which fixes the variance to the expectation value...
 
Theorem 4 of Chp. V in Mood, Graybill, Boes, Intro. to the Theory of Stat.:

E[X/Y] \approx \frac{E[X]}{E[Y]} - \frac{\mathrm{Cov}[X,Y]}{(E[Y])^2} + \frac{E[X]\,\mathrm{Var}[Y]}{(E[Y])^3}

and

\mathrm{Var}[X/Y] \approx \left(\frac{E[X]}{E[Y]}\right)^2 \left( \frac{\mathrm{Var}[X]}{(E[X])^2} + \frac{\mathrm{Var}[Y]}{(E[Y])^2} - \frac{2\,\mathrm{Cov}[X,Y]}{E[X]\,E[Y]} \right).

P.S. E[X] and E[Y] denote the means of X and Y.

If X and Y are independent, their Cov = 0. If they are identically distributed, EX = EY and Var[X] = Var[Y].

The set of random variables {X, Y, X+Y} has one redundant element. If you know any two, then you know the third.
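A quick numeric check of these two approximations against simulation, for a correlated Gaussian pair with means well away from zero (all numbers below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

# Correlated Gaussian pair with means far from zero, so X/Y is well-behaved
mean = np.array([20.0, 10.0])
cov = np.array([[0.25, 0.10],
                [0.10, 0.16]])
X, Y = rng.multivariate_normal(mean, cov, 2_000_000).T

EX, EY = mean
VarX, VarY, CovXY = cov[0, 0], cov[1, 1], cov[0, 1]

# Second-order approximation for the mean of the ratio
mean_pred = EX / EY - CovXY / EY**2 + EX * VarY / EY**3

# First-order (delta-method) approximation for its variance
var_pred = (EX / EY) ** 2 * (
    VarX / EX**2 + VarY / EY**2 - 2 * CovXY / (EX * EY)
)

Z = X / Y
print(mean_pred, Z.mean())   # note the small shift away from EX/EY = 2.0
print(var_pred, Z.var())
```

The second-order mean correction shows that E[X/Y] is not exactly E[X]/E[Y]: positive covariance pulls it down, while the spread of Y pushes it up.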
 