What is the RMS deviation from the true mean for a Gaussian distribution?

In summary, to estimate the true location of a fluorescent object, you take a finite sample of N individual photon measurements and compute their average. This average will vary slightly each time the experiment is repeated. To find its root-mean-square deviation from the true mean, treat the average as a function of N random variables: its expected value equals the true mean, and its standard deviation, which works out to σ/√N, quantifies the precision of the estimate.
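As a quick sanity check on this reasoning, here is a minimal simulation sketch (not part of the original thread; it assumes NumPy is available, and the values of x0, σ, N and the number of repetitions are purely illustrative). It repeats the N-photon measurement many times, computes the sample average each time, and compares the empirical RMS deviation of those averages from x0 with σ/√N.

[code]
import numpy as np

# Illustrative parameters (hypothetical, chosen only for the demo)
x0, sigma, N = 5.0, 0.25, 100   # true position, photon spread, photons per measurement
repeats = 10_000                # how many times the whole N-photon measurement is repeated

rng = np.random.default_rng(0)
# Each row is one repetition: N apparent photon positions, Gaussian about x0
samples = rng.normal(loc=x0, scale=sigma, size=(repeats, N))
sample_means = samples.mean(axis=1)   # <x>_N for each repetition

rms_dev = np.sqrt(np.mean((sample_means - x0) ** 2))
print(f"empirical RMS deviation of <x>_N from x0: {rms_dev:.5f}")
print(f"predicted sigma/sqrt(N):                  {sigma / np.sqrt(N):.5f}")
[/code]

With these numbers the two printed values should agree to within roughly a percent, which is the σ/√N behaviour derived in the thread below.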
  • #1
cqfd
Hi everyone, I'm new here and this is my first post in this forum. ^^


Homework Statement



Suppose that you observe a fluorescent object whose true location is x0. Individual photons come from this object with apparent locations xi in an approximately Gaussian distribution about x0. The root-mean-square deviation σ of this distribution mostly reflects the wavelength of the light, not the true size of the object. But you are interested in the object's position, not its size. To estimate the true mean x0, you decide to take a finite sample {x1, ... , xN} and compute its average <x>N.

Even if the object is not moving, each time you repeat the measurement you'll find a slightly different value for <x>N. Find the root-mean-square deviation of <x>N from the true mean x0.


Homework Equations



Gaussian distribution: $$P(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(x-x_0)^2}{2\sigma^2}}$$


The Attempt at a Solution



What I've done so far is to compute <x>N = Ʃ xi P(xi), using the definition of the mean. That's the final form I arrived at, and I couldn't find any way to simplify it.

Then I assumed that, because the measurement is repeated several times, <x>N becomes
<x>N,j (one value for each repetition j).

Then I inserted this in the definition of the RMS deviation, which got me this:

RMS = sqrt(<(<x>N,j-x0)^2>).

But this feels like a dead end because, honestly, I have no idea how to evaluate this analytically. Could you give me a hint or a trick for how to proceed?
 
  • #2
cqfd said:
Hi everyone, I'm new here and this is my first post in this forum. ^^

Homework Statement



Suppose that you observe a fluorescent object whose true location is x0. Individual photons come from this object with apparent locations xi in an approximately Gaussian distribution about x0. The root-mean-square deviation σ of this distribution mostly reflects the wavelength of the light, not the true size of the object. But you are interested in the object's position, not its size. To estimate the true mean x0, you decide to take a finite sample {x1, ... , xN} and compute its average <x>N.

Even if the object is not moving, each time you repeat the measurement you'll find a slightly different value for <x>N. Find the root-mean-square deviation of <x>N from the true mean x0.

Homework Equations



Gaussian distribution: $$P(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(x-x_0)^2}{2\sigma^2}}$$

The Attempt at a Solution



What I've done so far is to compute <x>N = Ʃ xi P(xi), using the definition of the mean. That's the final form I arrived at, and I couldn't find any way to simplify it.
This isn't correct. The xi's are specific numbers, the result of N measurements. How do you calculate the mean of N numbers?

Once you have that, think of <x>N (in general) as a function of N random variables. In particular, <x>N is itself a random variable, which obeys some distribution. To find the mean of a random variable, you want to calculate its expected value.

Then I assumed that, because the measurement is repeated several times, <x>N becomes
<x>N,j (one value for each repetition j).

Then I inserted this in the definition of the RMS deviation, which got me this:

RMS = sqrt(<(<x>N,j-x0)^2>).

But this feels like a dead end because, honestly, I have no idea how to evaluate this analytically. Could you give me a hint or a trick for how to proceed?
 
  • #3
Hmm, this makes sense. In fact this was what I thought of in the first place. So I get the mean like this:

$$\langle x \rangle_N = \frac{1}{N}\sum_i x_i$$
Once you have that, think of <x>N (in general) as a function of N random variables. In particular, <x>N is itself a random variable, which obeys some distribution. To find the mean of a random variable, you want to calculate its expected value.

I'm sorry but I don't understand what this means. All I've calculated so far are the means of the individual samples <x>N,j with the now hopefully correct formula.

But how can I relate the <x>N,j to the Gaussian?
 
  • #4
You need to differentiate between the samples and the random variables. So here you have Xi, which are the random variables, and xi, which are the specific measurements obtained. As you noted, to calculate the sample mean, you use
$$\langle x \rangle_N = \frac{1}{N} \sum_{i=1}^N x_i.$$ The value of ##\langle x \rangle_N## obviously changes every time you repeat the experiment, so it corresponds to some random variable, which I'll call M. The relationship between M and the random variables Xi's is given by
$$ M = \frac{1}{N} \sum_{i=1}^N X_i.$$ It's the same relationship as before except this time you're using the random variables instead of the specific samples. The problem requires you to calculate the expected value of M. You should find that it's simply equal to the true mean, ##x_0##. Once you find that, you can proceed to find the standard deviation for M.
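For reference, the expectation calculation suggested here can be written out using linearity of expectation; the same steps reappear in post #9 below:

$$E[M] = E\!\left[\frac{1}{N}\sum_{i=1}^N X_i\right] = \frac{1}{N}\sum_{i=1}^N E[X_i] = \frac{1}{N}\,N x_0 = x_0,$$

since each ##X_i## is drawn from the same Gaussian centred on ##x_0##, so ##E[X_i] = x_0##.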
 
  • #5
Thanks for your help so far. This all makes sense to me intuitively: it's clear that when we repeat the measurement many times and take the mean value, that value will converge to x0. But I just can't figure out how to write this down formally, and I still can't see where the Gaussian enters.

I've never had a course on this material; I'm working through a book independently, without any solutions, just because the topic is very interesting. ;)
 
  • #6
Still can't figure this one out... :/
 
  • #7
vela said:
You need to differentiate between the samples and the random variables. So here you have Xi, which are the random variables, and xi, which are the specific measurements obtained. As you noted, to calculate the sample mean, you use
$$\langle x \rangle_N = \frac{1}{N} \sum_{i=1}^N x_i.$$ The value of ##\langle x \rangle_N## obviously changes every time you repeat the experiment, so it corresponds to some random variable, which I'll call M. The relationship between M and the random variables Xi's is given by
$$ M = \frac{1}{N} \sum_{i=1}^N X_i.$$ It's the same relationship as before except this time you're using the random variables instead of the specific samples. The problem requires you to calculate the expected value of M. You should find that it's simply equal to the true mean, ##x_0##. Once you find that, you can proceed to find the standard deviation for M.

I've given the problem a second try from the beginning, thinking about what you said. This is what I came up with; I tried to keep all the different terms in order, and I hope you'll correct me if something is wrong:

We define a random variable, X, the position of the object. We know the pdf of X, given by

$$P(X = x) = \mathrm{pdf}(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(x-x_{0})^2}{2\sigma^2}}$$

Secondly, we have a sample ##\{x_i\}##, defined as N random draws from ##P(X = x)##. Because we know a priori (i.e. it was given in the exercise text) that our samples are drawn from a Gaussian, we can directly conclude that the expected value of the mean of any sample is ##x_0##. In mathematical terms:

##\langle\langle x\rangle_N\rangle = x_0## (Or is the correct notation: ##E[\langle x\rangle_N] = x_0##?)

Can this be considered a valid proof, or is there some fancy mathematical way?

Then you defined a second random variable, which you called M, given by:

$$ M = \frac{1}{N'}\sum_{i=1}^{N'} X_i $$

Now the meaning of this is that M is nothing but the average of N' successive draws of ##\langle x\rangle_N##. Therefore we can directly conclude that ##E[M]=E[X]##. Is that correct?


That being done, we finally come to the question asked in the problem. If I understood correctly, we want to find ##RMS(M)##. (Is that even the correct notation, or should I rather call it ##\sigma_M##?)

For that part, unfortunately, I don't know how to begin. I'd be very grateful if someone could correct me if I stated something wrong and/or give me a hint on how to solve the RMS problem.

Thanks very much in advance!
 
  • #8
cqfd said:
I've given the problem a second try from the beginning, thinking about what you said. This is what I came up with; I tried to keep all the different terms in order, and I hope you'll correct me if something is wrong:

We define a random variable, X, the position of the object. We know the pdf of X, given by

$$P(X = x) = \mathrm{pdf}(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(x-x_{0})^2}{2\sigma^2}}$$

Secondly, we have a sample ##\{x_i\}##, defined as N random draws from ##P(X = x)##. Because we know a priori (i.e. it was given in the exercise text) that our samples are drawn from a Gaussian, we can directly conclude that the expected value of the mean of any sample is ##x_0##. In mathematical terms:

##\langle\langle x\rangle_N\rangle = x_0## (Or is the correct notation: ##E[\langle x\rangle_N] = x_0##?)

Can this be considered a valid proof, or is there some fancy mathematical way?
No, it's not a valid proof. You're just asserting what you think the mean should be based on your intuition.

You have N samples, ##x_1, x_2, \dots, x_N##. These correspond to the N random variables ##X_1, X_2, \dots, X_N##. The sample mean ##m = (x_1+x_2+\dots+x_N)/N## corresponds to the random variable ##M = (X_1+X_2+\dots+X_N)/N##. The fact that ##E[M]=x_0## tells you that if you perform the experiment repeatedly, the sample means will cluster around the value ##x_0##.
 
  • #9
But you're saying basically the same thing, and I don't understand why your way should be a valid proof but not mine?

And shouldn't the expected value of each random variable ##X_i## corresponding to a draw of the sample already be ##x_0##? I.e. ##\forall i \in \{1,\ldots,N\}:\ E[X_i] = x_0##?

ONLY THEN, because we know that ##E[X+Y]=E[X]+E[Y]##, we can conclude that ##E[M] = N*E[X_i]/N = E[X_i] = x_0##!
 
  • #10
The claim ##E[X_i]=x_0## and the claim ##E[M]=x_0## are different statements. You can certainly say that, given the distribution of the ##X_i##, ##E[X_i]=x_0##; but simply asserting that ##E[M]=x_0##, which is what you did, isn't a valid proof.
 
  • #11
Oh, OK, I see what you mean. I should have stated it like in post #9 from the beginning. ^^

Now I've thought about the RMS part some more, but I can't get my head around it. Does it have something to do with this property?

$$RMS(X+Y)^2 = RMS(X)^2 + RMS(Y)^2 \quad\text{(for independent } X \text{ and } Y\text{)}$$


Because then we would have ##RMS(X_1+X_2+\dots+X_N) = \sqrt{RMS(X_1)^2+RMS(X_2)^2+\dots+RMS(X_N)^2} = \sqrt{N\sigma^2} = \sqrt{N}\,\sigma##

Then we would have ##RMS(M) = \frac{\sqrt{N}\,\sigma}{N} = \frac{\sigma}{\sqrt{N}}##.

?
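For completeness, the property quoted above is short to derive for two independent variables (and extends to N terms by induction):

$$\operatorname{Var}(X+Y) = E\big[(X+Y)^2\big] - \big(E[X]+E[Y]\big)^2 = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\big(E[XY] - E[X]\,E[Y]\big),$$

and the cross term vanishes when ##X## and ##Y## are independent, because then ##E[XY] = E[X]\,E[Y]##. Combined with ##\operatorname{Var}(aX) = a^2\operatorname{Var}(X)## for a constant ##a##, this is exactly what turns ##\sqrt{N}\,\sigma## into ##\sigma/\sqrt{N}## after dividing by ##N##.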
 
  • #12
Sorry, I first thought I wanted to add something, but it ended up being wrong. :/

So this post can be deleted.
 
  • #13
Suppose you have stochastically independent realizations of your random experiment. The joint distribution for [itex]N[/itex] outcomes is
[tex]P_N(x_1,\ldots,x_N)=P_1(x_1) \cdots P_2(x_N)[/tex]
The expectation value of the mean is
[tex]\langle M \rangle=\left \langle \frac{x_1+\cdots+x_N}{N} \right \rangle=\int_{\mathbb{R}^N} \mathrm{d}^N x \,\frac{x_1+\cdots+x_N}{N}\, P_N(x_1,\ldots,x_N)[/tex]
and its variance, i.e. the square of the standard deviation,
[tex]\langle (M-\langle{M} \rangle)^2 \rangle = \langle M^2 \rangle -\langle M \rangle^2.[/tex]
Calculate this for your distribution and take the square root, and you'll find the correct answer in a systematic way.
 
  • #14
vanhees71 said:
Suppose you have stochastically independent realizations of your random experiment. The joint distribution for [itex]N[/itex] outcomes is
[tex]P_N(x_1,\ldots,x_N)=P_1(x_1) \cdots P_2(x_N)[/tex]

I'm always a little uncertain about such notational things, so before I start: do you mean by your notation that the experiment is performed 2 times, each time taking a sample of size N? And that in that case the distribution of the positions is given by the product over all the samples we've taken?

In mathematical terms, if we perform the experiment N' times (if I understood correctly, the following formula is just your formula rewritten in more detail):

$$P_{N'}(x_1,\ldots,x_N) = P_1(x_1)\cdots P_1(x_N)\, P_2(x_1)\cdots P_2(x_N) \cdots P_{N'}(x_1)\cdots P_{N'}(x_N)$$

Is that right?
 
  • #15
The 2 is a typo; he meant ##P_N(x_N)##.
 
  • #16
No, it's
[tex]P_N(x_1,\ldots,x_N)=P_1(x_1) P_1(x_2) \cdots P_1(x_N),[/tex]
because it's the same probability distribution of each single random experiment.
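With this factorization, the expectation value written down in post #13 can be evaluated term by term; a sketch, using only that each ##P_1## is normalized and has mean ##x_0##:

[tex]\langle M \rangle = \frac{1}{N}\sum_{i=1}^N \int_{\mathbb{R}^N} \mathrm{d}^N x \, x_i \, P_1(x_1)\cdots P_1(x_N) = \frac{1}{N}\sum_{i=1}^N \int_{\mathbb{R}} \mathrm{d}x_i \, x_i \, P_1(x_i) = \frac{1}{N}\sum_{i=1}^N x_0 = x_0,[/tex]

because in each term the integrals over the other ##N-1## variables just give factors of 1.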
 
  • #17
cqfd said:
Oh, OK, I see what you mean. I should have stated it like in post #9 from the beginning. ^^

Now I've thought about the RMS part some more, but I can't get my head around it. Does it have something to do with this property?

$$RMS(X+Y)^2 = RMS(X)^2 + RMS(Y)^2 \quad\text{(for independent } X \text{ and } Y\text{)}$$


Because then we would have ##RMS(X_1+X_2+\dots+X_N) = \sqrt{RMS(X_1)^2+RMS(X_2)^2+\dots+RMS(X_N)^2} = \sqrt{N\sigma^2} = \sqrt{N}\,\sigma##

Then we would have ##RMS(M) = \frac{\sqrt{N}\,\sigma}{N} = \frac{\sigma}{\sqrt{N}}##.

?
That is the correct answer (and it relies on the fact that the Gaussian random variables ##X_i## are independent and identically distributed). As others have said, you can calculate the answer from first principles; I just thought I should confirm that this is right, so that you have something to aim for while doing the first-principles calculation.
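For anyone who wants the first-principles calculation mentioned here, a sketch of the remaining step (the mean ##\langle M \rangle = x_0## was handled above; independence is what removes the cross terms):

$$\langle M^2 \rangle = \frac{1}{N^2}\sum_{i,j}\langle X_i X_j\rangle = \frac{1}{N^2}\Big[N\big(\sigma^2 + x_0^2\big) + N(N-1)\,x_0^2\Big] = \frac{\sigma^2}{N} + x_0^2,$$

so that

$$\sqrt{\big\langle (M - x_0)^2 \big\rangle} = \sqrt{\langle M^2 \rangle - x_0^2} = \frac{\sigma}{\sqrt{N}},$$

in agreement with the ##\sigma/\sqrt{N}## quoted above.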
 

Related to What is the RMS deviation from the true mean for a Gaussian distribution?

What is the "RMS" of a gaussian distribution?

The "RMS" of a gaussian distribution refers to the root mean square, which is a statistical measure of the spread or variability of a set of data. It is calculated by taking the square root of the average of the squared deviations from the mean.

How is the "RMS" of a gaussian distribution different from the standard deviation?

The "RMS" and the standard deviation are both measures of variability, but they differ in their calculation and interpretation. The "RMS" is a measure of the overall spread of the data, while the standard deviation is a measure of how much the data deviates from the mean.

What does the "RMS" of a gaussian distribution tell us about the data?

The "RMS" of a gaussian distribution provides information about the variability of the data. A higher "RMS" value indicates a wider spread of the data, while a lower "RMS" value indicates a narrower spread.

How is the "RMS" of a gaussian distribution used in scientific research?

The "RMS" of a gaussian distribution is commonly used in scientific research to compare the variability of data sets. It can also be used in statistical tests to determine the significance of differences between data sets.

What is the relationship between the "RMS" and the standard deviation in a gaussian distribution?

The "RMS" and the standard deviation are mathematically related in a gaussian distribution. The "RMS" is equal to the standard deviation multiplied by the square root of 2, or approximately 1.414. This relationship allows for easy conversion between the two measures of variability.
