# RMS of Gaussian distribution

1. Aug 1, 2013

### cqfd

Hi everyone, I'm new here and this is my first post in this forum. ^^

1. The problem statement, all variables and given/known data

Suppose that you observe a fluorescent object whose true location is x0. Individual photons come from this object with apparent locations xi in an approximately Gaussian distribution about x0. The root-mean-square deviation σ of this distribution mostly reflects the wavelength of the light, not the true size of the object. But you are interested in the object's position, not its size. To estimate the true mean x0, you decide to take a finite sample {x1, ... , xN} and compute its average <x>N.

Even if the object is not moving, each time you repeat the measurement you'll find a slightly different value for <x>N. Find the root-mean-square deviation of <x>N from the true mean x0.

2. Relevant equations

Gaussian distribution: P(x) = 1/(sqrt(2π)*σ) * exp(-(x - x0)^2/(2σ^2))

3. The attempt at a solution

What I've done so far is to compute <x>N = Ʃ xi * P(xi), weighting each xi by the Gaussian, using the definition of the mean. This is the final form I got, and I didn't find any way to simplify it.

Then I assumed that, because the measurement is done several times, <x>N becomes <x>N,j (one value per repetition j).

Then I inserted this in the definition of the RMS deviation, which got me this:

RMS = sqrt(<(<x>N,j-x0)^2>).

But this is like a dead end because, honestly, I have no idea how I could solve this analytically. Could you give me a hint or a trick on how I could solve this?

2. Aug 1, 2013

### vela

Staff Emeritus
This isn't correct. The xi's are specific numbers, the result of N measurements. How do you calculate the mean of N numbers?

Once you have that, think of <x>N (in general) as a function of N random variables. In particular, <x>N is itself a random variable, which obeys some distribution. To find the mean of a random variable, you want to calculate its expected value.

3. Aug 1, 2013

### cqfd

Hmm, this makes sense. In fact this was what I thought of in the first place. So I get the mean like this:

<x>N = 1/N*sum(xi)

I'm sorry but I don't understand what this means. All I've calculated so far are the means of the individual samples <x>N,j with the now hopefully correct formula.

But how can I relate the <x>N,j with a Gaussian?

4. Aug 1, 2013

### vela

Staff Emeritus
You need to differentiate between the samples and the random variables. So here you have Xi, which are the random variables, and xi, which are the specific measurements obtained. As you noted, to calculate the sample mean, you use
$$\langle x \rangle_N = \frac{1}{N} \sum_{i=1}^N x_i.$$ The value of $\langle x \rangle_N$ obviously changes every time you repeat the experiment, so it corresponds to some random variable, which I'll call M. The relationship between M and the random variables Xi's is given by
$$M = \frac{1}{N} \sum_{i=1}^N X_i.$$ It's the same relationship as before except this time you're using the random variables instead of the specific samples. The problem requires you to calculate the expected value of M. You should find that it's simply equal to the true mean, $x_0$. Once you find that, you can proceed to find the standard deviation for M.
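The distinction between the sample mean and the random variable M can be illustrated numerically. A minimal sketch, assuming numpy is available (the values of x0, σ, and N here are arbitrary choices, not taken from the problem statement):

```python
import numpy as np

# x0, sigma, and N are arbitrary illustration values.
x0, sigma, N = 2.0, 0.5, 100

rng = np.random.default_rng(0)

# One experiment: N photon positions, and their sample mean <x>_N.
sample = rng.normal(x0, sigma, size=N)
m = sample.mean()

# Each repetition of the experiment gives a new value of M;
# averaging many such values estimates E[M], which should be x0.
means = rng.normal(x0, sigma, size=(10_000, N)).mean(axis=1)
print(m, means.mean())
```

Each row of the 10,000 × N array is one repetition of the experiment; the sample mean fluctuates from row to row, but the average of those means clusters tightly around x0.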

5. Aug 1, 2013

### cqfd

Thanks for your help so far. This all makes sense to me intuitively: it's clear that if we repeat the measurement many times and take the mean value, that value will converge to x0. But I just can't figure out how to write this down formally, and I still can't see where the Gaussian enters.

I've never had a course on this subject; I'm working through a book on my own, without solutions, just because the topic is very interesting. ;)

6. Aug 2, 2013

### cqfd

Still can't figure this one out... :/

7. Aug 27, 2013

### cqfd

I've given the problem a second try, from the beginning, thinking about what you said. This is what I came up with; I tried to organize all the different terms, and I hope you'll correct me if something is wrong:

We define a random variable, X, the position of the object. We know the pdf of X, given by

$$P(X = x) = pdf(x) = \frac{1}{\sqrt{2\pi}\,\sigma}e^{-\frac{(x-x_{0})^2}{2\sigma^2}}$$

Secondly, we have a sample, $x_i$, defined as N random draws from $P(X = x)$. Because we know a priori (i.e. it was given in the exercise text) that our samples are drawn from a Gaussian, we can directly conclude that the expected value of the mean of any sample is $x_0$. In mathematical terms:

$\langle\langle x\rangle_N\rangle = x_0$ (Or is the correct notation: $E[\langle x\rangle_N] = x_0$?)

Can this be considered a valid proof, or is there some fancy mathematical way?

Then, you defined some second random variable you called M, which you defined as:

$$M = \frac{1}{N'}\sum_{i=1}^{N'}X_i$$

Now the meaning of this is that M is nothing but the average of N' successive draws of $\langle x\rangle_N$. Therefore we can directly conclude that $E[M]=E[X]$. Is that correct?

That being done, we finally come to the question asked in the problem. If I understood correctly, we want to find $RMS(M)$. (Is that even the correct notation, or should I rather call it $\sigma_M$?)

For that part, unfortunately, I don't know how to begin. I'd be very grateful if someone could correct me if I stated something wrong and/or give me a hint on how to solve the RMS part.

8. Aug 28, 2013

### vela

Staff Emeritus
No, it's not a valid proof. You're just asserting what you think the mean should be based on your intuition.

You have N samples, $x_1, x_2, \dots, x_N$. These correspond to the N random variables $X_1, X_2, \dots, X_N$. The sample mean $m = (x_1+x_2+\dots+x_N)/N$ corresponds to the random variable $M = (X_1+X_2+\dots+X_N)/N$. The fact that E[M]=x0 tells you that if you perform the experiment repeatedly, the sample means will cluster around the value x0.

9. Aug 28, 2013

### cqfd

But you're saying basically the same thing, and I don't understand why your way is a valid proof but mine isn't.

And shouldn't the expected value of each random variable $X_i$, corresponding to a single draw from the sample, already be $x_0$? I.e. $\forall i \in [1,N],\ E[X_i] = x_0$?

ONLY THEN, because we know that $E[X+Y]=E[X]+E[Y]$, we can conclude that $E[M] = N*E[X_i]/N = E[X_i] = x_0$!

10. Aug 28, 2013

### vela

Staff Emeritus
The claim E[Xi]=x0 and the claim E[M]=x0 are different statements. You can certainly say that given Xi's distribution that E[Xi]=x0, but simply asserting then that E[M]=x0, which was what you said, isn't a valid proof.

11. Aug 28, 2013

### cqfd

Oh, OK, I see what you mean. I should have stated it like in post #9 from the beginning. ^^

Now I've thought about the RMS part some more, but I can't get my head around it. Does it have something to do with the property that the variances of independent random variables add?

Because then we would have $RMS(X_1+X_2+\dots+X_N) = \sqrt{RMS(X_1)^2+RMS(X_2)^2+\dots+RMS(X_N)^2} = \sqrt{N\sigma^2} = \sqrt{N}\,\sigma$

Then we would have $RMS(M) = \frac{\sqrt{N}\,\sigma}{N} = \frac{\sigma}{\sqrt{N}}$.

?
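This σ/√N result can be checked with a quick Monte Carlo simulation. A sketch with arbitrary parameter values, assuming numpy is available:

```python
import numpy as np

# Arbitrary illustration values; the check works for any choice.
x0, sigma, N = 2.0, 0.5, 100
trials = 200_000

rng = np.random.default_rng(1)

# Each row is one experiment of N draws; take the sample mean of each row.
means = rng.normal(x0, sigma, size=(trials, N)).mean(axis=1)

# RMS deviation of <x>_N from the true mean x0, vs. the predicted sigma/sqrt(N).
rms = np.sqrt(np.mean((means - x0) ** 2))
print(rms, sigma / np.sqrt(N))
```

The two printed numbers should agree closely, up to the statistical error of the finite number of trials.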

Last edited: Aug 28, 2013
12. Aug 28, 2013

### cqfd

Sorry, I first thought I wanted to add something, but it ended up being wrong. :/

So this post can be deleted.

Last edited: Aug 28, 2013
13. Aug 28, 2013

### vanhees71

Suppose you have stochastically independent realizations of your random experiment. The distribution for $N$ outcomes is
$$P_N(x_1,\ldots,x_N)=P_1(x_1) \cdots P_2(x_N)$$
The expectation value of the mean is
$$\langle M \rangle=\left \langle \frac{x_1+\cdots+x_N}{N} \right \rangle=\int_{\mathbb{R}^N} \mathrm{d}^N x \, \frac{x_1+\cdots+x_N}{N} P_N(x_1,\ldots,x_N)$$
and its variance, the square of the standard deviation,
$$\langle (M-\langle{M} \rangle)^2 \rangle = \langle M^2 \rangle -\langle M \rangle^2.$$
Calculate this for your distribution, and you'll find the correct answer in a systematic way.
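The integrals above can be carried out symbolically for a small case. A sketch for N = 2, assuming sympy is available (the symbol names are illustrative):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2', real=True)
x0 = sp.Symbol('x0', real=True)
s = sp.Symbol('sigma', positive=True)

# Gaussian pdf for a single measurement.
def p(x):
    return sp.exp(-(x - x0)**2 / (2*s**2)) / (sp.sqrt(2*sp.pi) * s)

# Joint pdf of two independent draws, and their mean M.
P2 = p(x1) * p(x2)
M = (x1 + x2) / 2

# <M> and <M^2> as integrals over R^2, as in the post above.
EM = sp.integrate(M * P2, (x1, -sp.oo, sp.oo), (x2, -sp.oo, sp.oo))
EM2 = sp.integrate(M**2 * P2, (x1, -sp.oo, sp.oo), (x2, -sp.oo, sp.oo))

# Expect <M> = x0 and variance sigma**2/2, i.e. (sigma/sqrt(N))**2 with N = 2.
var = sp.simplify(EM2 - EM**2)
print(sp.simplify(EM), var)
```

The same calculation goes through for general N; only the bookkeeping over the N integration variables gets longer.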

14. Aug 28, 2013

### cqfd

I'm always a little uncertain about such notational things, so before I start: by your notation, do you mean that the experiment is performed 2 times, each time taking a sample of size N? In that case, is the distribution of the position given by the product over the total number of samples we've taken?

In mathematical terms, if we realize the experiment N' times (if I understood correctly, the following formula is just a more detailed rewriting of yours):

$$P_{N'}(x_1,\ldots,x_N) = P_1(x_1)\cdots P_1(x_N)\,P_2(x_1)\cdots P_2(x_N)\cdots P_{N'}(x_1)\cdots P_{N'}(x_N)$$

Is that right?

15. Aug 28, 2013

### vela

Staff Emeritus
The 2 is a typo. He meant $P_N(x_N)$.

16. Aug 29, 2013

### vanhees71

No, it's
$$P_N(x_1,\ldots,x_N)=P_1(x_1) P_1(x_2) \cdots P_1(x_N),$$
because it's the same probability distribution of each single random experiment.

17. Aug 29, 2013

### BruceW

That is the correct answer (and it relies on the fact that the Gaussian random variables $X_i$ are independent and identically distributed). As others have said, you can calculate the answer from first principles. I just thought I should say that this is right, so that you have something to aim for while you are doing it from first principles.