Extracting sigma from sum of distributions with moving mean


Discussion Overview

The discussion revolves around the extraction of the standard deviation (σ) from the sum of distributions in an experimental context where the mean (μ) varies. Participants explore the implications of averaging multiple distributions, particularly Gaussian ones, and how the shifting mean affects the observed standard deviation. The conversation includes theoretical considerations, statistical methods, and experimental challenges related to measuring diffusion in a 2D distribution of particles.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification
  • Debate/contested
  • Experimental/applied

Main Points Raised

  • One participant describes an experiment where each data point represents an entire distribution, suggesting that the sum of these distributions appears broader due to a varying mean, and questions how to isolate the actual σ.
  • Another participant proposes that if the mean is variable but the standard deviation is constant, the total observed standard deviation can be expressed as a quadratic sum of the true σ and the standard deviation of the mean (σμ).
  • A clarification indicates that while the standard deviation should remain constant across traces, the random shift in the mean complicates the averaging process, leading to a broader distribution.
  • One participant suggests that if the shift in the mean can be measured, it could be corrected for, potentially improving the accuracy of σ estimation.
  • Another participant notes that if the mean cannot be directly measured, it may be challenging to separate the variation of the mean from the variation of a single distribution.
  • A follow-up question introduces the concept of a bivariate distribution and asks how variations in the mean of one variable (y) might relate to the observed σ in another variable (x).
  • Participants discuss the independence of x and y in the context of diffusion, questioning whether the movement in y affects the distribution observed in x.
  • Some participants express uncertainty about the relationship between the two dimensions and the implications for measuring the spread of the distribution.

Areas of Agreement / Disagreement

Participants express differing views on the ability to isolate the effects of a shifting mean from the standard deviation of the distribution. There is no consensus on the best approach to handle the complexities introduced by the bivariate nature of the data and the independence of the variables.

Contextual Notes

Participants acknowledge limitations in measuring the mean directly and the challenges of averaging over multiple distributions, which may introduce additional uncertainties in estimating σ. The discussion also highlights the dependence on the specific experimental setup and the assumptions made regarding the distributions involved.

Who May Find This Useful

This discussion may be of interest to researchers and practitioners involved in experimental physics, statistics, and data analysis, particularly those working with Gaussian distributions and diffusion processes in multidimensional settings.

ArchieDave
I have an experiment in which I want to extract the distribution function of a process. I expect it to be Gaussian. Each data point measured is an entire distribution, f(x), but I am forced to average over many points, so the result of the experiment is the sum of many measurements of f(x). If the amplitude A and the width σ are believed to be constants but the mean, μ, varies a little for each point, the resulting sum of distributions appears broader, as if σ were larger. My question: if I believe I know the deviation of the mean μ, can this effect be subtracted out so that I am left with the actual value of σ?
 
I'm not sure what you mean by "each data point measured is an entire distribution". If every data point is a single point, but comes from a distribution with fixed σ and variable μ, then it is possible. The measured standard deviation of your data points will be the quadratic sum of σ and the standard deviation of μ:
$$\sigma_{total} = \sqrt{\sigma^2 + \sigma_\mu^2}$$
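
A quick numerical check of this relation (a minimal sketch in NumPy; all parameter values are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

sigma = 1.0        # true width of each individual distribution (assumed)
sigma_mu = 0.7     # spread of the shifting mean (assumed)
n_traces = 2000
n_per_trace = 500

# Each trace is Gaussian with the same sigma but a randomly shifted mean.
mus = rng.normal(0.0, sigma_mu, size=n_traces)
data = rng.normal(mus[:, None], sigma, size=(n_traces, n_per_trace))

sigma_total = data.std()                            # width of the pooled data
sigma_back = np.sqrt(sigma_total**2 - sigma_mu**2)  # back out the true sigma

print(sigma_total, np.sqrt(sigma**2 + sigma_mu**2))  # these two agree
print(sigma_back, sigma)                             # so does the recovered sigma
```

Rearranged, the correction is just σ = sqrt(σ_total² − σ_μ²), which is what the last line of the sketch checks.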
 
Let me clarify. I'm running an experiment for which each result (what I called point) is a trace that gives a distribution. Based on the nature of the experiment, the standard deviation should be the same for each trace. However, there is a random shift of the mean that can obviously be measured. In order to achieve sufficient signal/noise, I must average over many traces such that this shifting of the mean causes a widening of the summation. I wasn't completely sure how to subtract out this effect. I wasn't able to find the solution in my old stats book that I had to dust off. I believe you answered the question, but I was also hoping to see this relationship proven. Thanks for the help.
 
If you can measure the shift for each data point, then you can do better: you can just shift your measurement to correct for it (and take the uncertainty of this correction as uncertainty on μ). If σ_μ is small compared to σ this won't make a significant difference, but if the two are comparable it can be a huge improvement.
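
As a sketch of that correction (here the per-trace shifts are generated directly; in practice they would come from whatever measurement provides them):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, sigma_mu = 1.0, 0.7                      # assumed values
shifts = rng.normal(0.0, sigma_mu, size=1000)   # measured shift of each trace
traces = rng.normal(shifts[:, None], sigma, size=(1000, 200))

recentred = traces - shifts[:, None]  # subtract each trace's measured shift

print(traces.std())      # ~ sqrt(sigma**2 + sigma_mu**2): broadened
print(recentred.std())   # ~ sigma: broadening removed before averaging
```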
 
Sorry again for the poor explanation. Because of the nature of the experiment, I can't measure the mean directly from each trace, but I can infer the deviation of the mean indirectly from other data.
 
If you can do repeat measurements within the same distribution, before the mean changes, then you can estimate the variance for each distribution and draw some conclusions. Otherwise, you cannot statistically separate the variation of the distribution mean from the variation within a single distribution with a fixed mean. If you know why and how the mean changes, maybe you can back that out from the data before applying statistical methods. Otherwise, I don't think you can do anything to estimate the variance at a fixed mean.
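
If such repeats are available, the two contributions can be estimated separately, since the spread within each trace estimates σ while the spread of the per-trace means estimates σ_μ. A sketch, assuming each row below is one trace recorded before the mean moves:

```python
import numpy as np

rng = np.random.default_rng(2)
sigma, sigma_mu = 1.0, 0.7                     # assumed values
n_traces, n_repeats = 500, 100

mus = rng.normal(0.0, sigma_mu, size=n_traces)
traces = rng.normal(mus[:, None], sigma, size=(n_traces, n_repeats))

within = traces.std(axis=1, ddof=1).mean()   # ~ sigma: spread within each trace
between = traces.mean(axis=1).std(ddof=1)    # ~ sqrt(sigma_mu**2 + sigma**2 / n_repeats)

print(within, between)
```

The small σ²/n term in the between-trace estimate shrinks as the number of repeats per trace grows.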
 
Follow-up question. I get that if I know σ_μ then I can back out σ. Now if the distribution is bivariate, such that the trace I see is actually g(x, y=0), and I also have some variation in the mean of y, is there a simple way to relate the variation in the mean of y to the resulting σ_x? In other words, I will be averaging over g(x, y=±dy) and the trace (distribution) I see will appear broader. If I assume σ_x = σ_y, it seems as though I could back out σ again. Thanks again.
 
How are y and x related?

It would help to know more about the setup and your datasets...
 
I'd avoided explaining for fear it might devolve off topic, but here goes. My question is not actually directly related to statistics, but I have reason to assume a normal distribution, so it seemed like a good place to pose the question. I have a 2D distribution of particles injected into a gas medium that I can image using a gated camera and a laser that provides excitation (LIF images). I want to study the rate of diffusion (take the image at different times). The result should be symmetric and Gaussian. However, the limitations are that I need to accumulate over many shots/injections in order to see a distribution for a given moment in time after the particles enter. Also, I know from using a medium with higher signal that, while sigma is not expected to change at one point in time, the centerline/mean varies from shot to shot. I can measure this under other circumstances, and I'd like to back out the effect from the intensity/population distributions that I see. Imagine trying to measure the spread of bird shot by looking at a target but needing to subtract out changes in centerline trajectory caused by an imperfect rifleman.

So if I only needed to worry about the dimension in the direction of the beam (x-direction), I'd just use the equation you provided to subtract out the deviation of the mean, which I measure under other circumstances. However, since there is variation in the y-direction (perpendicular to beam) as well, I will be "missing" the centerline and getting a broadening effect there as well. It would be easy for me to generate a 2D normal distribution, assign random values for sigma within a confidence interval of my determination and take the average of a single line over many thousands of points and perform this repeatedly until I arrive at the answer/distribution that I see. However, I'd prefer to just find an actual solution. Any help is, again, greatly appreciated.
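
That Monte-Carlo check is quick to sketch. The version below assumes independent x and y, Gaussian jitter of the centerline in both directions, and arbitrary parameter values; it averages the line-out at y = 0 over many shots:

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(-6, 6, 601)

sig_x = sig_y = 1.0            # assumed true widths
sig_mux = sig_muy = 0.5        # assumed shot-to-shot centerline jitter
n_shots = 5000

# Each shot: a 2D Gaussian centred at (dx, dy); record its line-out at y = 0.
dx = rng.normal(0.0, sig_mux, n_shots)
dy = rng.normal(0.0, sig_muy, n_shots)
lineouts = (np.exp(-(x[None, :] - dx[:, None])**2 / (2 * sig_x**2))
            * np.exp(-dy[:, None]**2 / (2 * sig_y**2)))

avg = lineouts.mean(axis=0)                  # shot-averaged profile
p = avg / avg.sum()
width = np.sqrt(np.sum(p * x**2) - np.sum(p * x)**2)

print(width)                           # ~ sqrt(sig_x**2 + sig_mux**2)
print(np.sqrt(width**2 - sig_mux**2))  # ~ sig_x: the y jitter adds no broadening
```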
 
For diffusion, x and y are independent, yes you can ignore y then (or look at it separately).
 
hmmm

I understand that the rate of diffusion is independent for x and y, but if I'm measuring the distribution over many "shots" and it's moving in y, then I'll be probing a different point at each measurement. The distribution will appear broader when y ≠ 0, correct?
 
Why should the x-distribution change if your y moves?
 
Because the slice f(x, y = const) of the function f(x,y) is different for each value of y? I'm just looking at the equation for a bivariate normal distribution.
 
Why should the x-dependence be different?
Independent variables mean you can find g, h such that f(x,y)=g(x)*h(y). You don't have to care about h to find g.
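
A small numerical illustration of that factorisation (a sketch assuming an independent bivariate normal with σ_x = σ_y = 1): slices taken at different fixed y differ only by an overall scale factor, so the normalised x-profile, and hence its width, is the same for every y.

```python
import numpy as np

x = np.linspace(-5, 5, 1001)
sig_x = sig_y = 1.0

def f(x, y):
    # independent bivariate normal: f(x, y) = g(x) * h(y)
    return np.exp(-x**2 / (2 * sig_x**2)) * np.exp(-y**2 / (2 * sig_y**2))

for y0 in (0.0, 0.5, 1.0):
    s = f(x, y0)
    p = s / s.sum()                   # normalising removes the h(y0) factor
    width = np.sqrt(np.sum(p * x**2) - np.sum(p * x)**2)
    print(y0, width)                  # same width (~ sig_x) for every y0
```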
 
