Effect of a Moving Average on Gaussian Noise

Summary
Averaging multiple data sets measuring the same signal reduces the noise by a factor of 1/√N, where N is the number of data sets, improving the signal-to-noise ratio (SNR) by √N. The discussion clarifies that whether the noise is Gaussian or Poisson, as long as it is uncorrelated, the standard deviation of the noise in the averaged data set will be σ/√N. The importance of constructing a test statistic and understanding variance in relation to sample size is emphasized. The Central Limit Theorem supports that the distribution of averages approaches normality with large samples, allowing for accurate estimation of parameters. Overall, the conversation highlights the mathematical principles behind noise reduction through averaging in data analysis.
Twigg
Hi all,

I think there is a really obvious answer to this, but I just don't see it yet. Suppose you had N data sets that all measured the same quantity as a function of time. Each data set shows the same signal plus a random noise component which is normally distributed about the signal with a constant standard deviation. If you were to take an average over the N data sets, you would expect to see the same signal with reduced noise. If I'm not mistaken, the more data sets averaged, the lower the resulting noise. What would the standard deviation of the noise be in the averaged data? Thanks!
 
IF the signals correlate and the noise doesn't, you expect the noise to drop by a factor of ##\ 1/\sqrt N\ ##, i.e. the signal-to-noise ratio improves by ##\sqrt N##, with ##N## the number of data sets.
 
That makes sense, I think. Let me know if I have this wrong:

For each of the N data sets, the noise is Poisson distributed at each time with a sample size of 1, which has standard deviation ##\sigma = \sqrt{1} = 1##. For the average of the N data sets, the noise is Poisson distributed at each time with a sample size of N, which has a standard deviation of ##\sigma = \sqrt{N}##. That gives the factor of ##\frac{1}{\sqrt{N}}## in SNR.
 
Didn't the thread mention Gaussian noise? Anyway, the key is: the noise is supposed to be uncorrelated, so it adds up squared, and the signal adds up directly. Divide by ##N## to average and the signal/noise goes like ##N/\sqrt N##.
 
I appreciate the correction! I clearly need to review and practice more.

Since I've already made a fool of myself once, can I ask you to verify that I've got it this time? If ##u_n(t)## is the noise component at time ##t## in the n-th data set, ##\bar{u}(t) = \frac{1}{N}\sum_{n=1}^{N} u_n(t)## is the noise in the averaged data set, and ##\sigma## is the standard deviation of the Gaussian noise in each set, then the standard deviation of the noise in the averaged data set is going to be ##\sigma_{avg} = \sqrt{\langle \bar{u}^2(t)\rangle} = \frac{1}{N}\sqrt{\sum_{n=1}^{N} \langle u_{n}^2(t)\rangle} = \frac{\sqrt{N\sigma^2}}{N} = \frac{\sigma}{\sqrt{N}}## (the cross terms vanish because the noise is uncorrelated)? So SNR improves with ##\sqrt{N}##.

And from your last post, am I right to think that it doesn't matter whether the noise is Gaussian distributed (e.g., Johnson-Nyquist noise) or Poissonian (e.g., shot noise), so long as it is uncorrelated and the value of ##\sigma## doesn't change in time?
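A quick numerical sanity check of the ##\sigma/\sqrt{N}## scaling (just a sketch in numpy; the signal shape, noise levels, and N below are arbitrary assumptions, not from the thread), using both Gaussian and Poisson noise:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100          # number of repeated data sets (assumed)
T = 1000         # samples per data set (assumed)
signal = np.sin(np.linspace(0, 2 * np.pi, T))  # any common signal works

# N traces of the same signal plus uncorrelated Gaussian noise (sigma = 1)
gauss_traces = signal + rng.normal(0.0, 1.0, size=(N, T))
# N traces with Poisson (shot-noise-like) fluctuations: mean 100, sigma = 10
pois_traces = rng.poisson(100.0, size=(N, T)).astype(float)

# Averaging over the N traces should shrink the noise std by 1/sqrt(N)
print(np.std(gauss_traces.mean(axis=0) - signal))   # ~ 1/sqrt(100)  = 0.1
print(np.std(pois_traces.mean(axis=0) - 100.0))     # ~ 10/sqrt(100) = 1.0
```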
 
Hey Twigg.

Just construct the test statistic and then get a variance for it and see what it is as a function of your sample.

More information should always reduce the variance (via things like the Cramér-Rao bound). When it comes to relating the noise to the assessment of what the signal is saying: if the signal has enough redundancy and you collect enough actual information, then statistically (at least) the probability that the noise corrupts the assessment of the signal (taken as a function of all data values) will be so low that there will be a high degree of confidence that whatever is inferred is likely to be the information that was communicated.

This idea is found in things like probabilistic prime testing (like Rabin-Miller testing).
 
A moving average is a filter, more specifically a low-pass filter. You can get the same effect using a digital low-pass filter:

Assume that you take the moving average over N samples. Express this as ##y_{m}=\frac{1}{N}\sum_{i=m-N+1}^{m}x_{i}##. Then ##y_{m+1}=y_{m}+\frac{1}{N}x_{m+1}-\frac{1}{N}x_{m+1-N}##. From the formula for ##y_m##, the sample that drops out of the window is approximately the current average, ##x_{m+1-N}\approx y_{m}##. Therefore a good approximation to a moving average over N samples is ##y_{m+1}=\frac{N-1}{N}y_{m}+\frac{1}{N}x_{m+1}##...
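A minimal sketch (my own, in numpy) comparing the true N-sample moving average with the recursive approximation above; the noisy step input and N = 16 are just assumptions for illustration:

```python
import numpy as np

N = 16
rng = np.random.default_rng(1)
# Noisy step input: 200 samples at 0, then 400 samples at 1, plus noise
x = np.concatenate([np.zeros(200), np.ones(400)]) + rng.normal(0, 0.2, 600)

# True moving average over the last N samples (an FIR low-pass filter)
y_ma = np.convolve(x, np.ones(N) / N, mode="full")[: len(x)]

# Recursive approximation: y[m+1] = ((N-1)/N) * y[m] + (1/N) * x[m+1]
y_iir = np.zeros_like(x)
for m in range(1, len(x)):
    y_iir[m] = (N - 1) / N * y_iir[m - 1] + x[m] / N

# The two outputs should stay close; the IIR filter is only an approximation
print(np.max(np.abs(y_ma - y_iir)))
```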
 
Thanks all for the replies!

chiro said:
Just construct the test statistic and then get a variance for it and see what it is as a function of your sample.

I'm not sure I fully understand, as my statistics knowledge is lacking and I'm not familiar with the lingo. I think this is what I tried to show in my last post. I record the variance in each of the N measurements, and then take the average variance of the sample. If the signal is the same in each measurement (which I assumed), the variance is just the time-average of the square of the noise. Assuming the noise doesn't get bigger or smaller between measurements, the variance in each measurement should be roughly constant. By that argument, I thought the noise standard deviation after N averages would be smaller by a factor of ##\sqrt{N}##. Does that make sense? Intuitively I understand why averaging reduces noise levels; it's the quantitative analysis (how much the noise gets reduced) that I'm having more difficulty with.

@Svein I think I didn't explain the problem well. The "moving average" I meant isn't temporal; it's an average over separate data sets, each taken over time. For example, suppose I had a DC power supply which produced 3V DC plus 10mV of noise, and I saved voltage-vs-time data on an oscilloscope over 10 seconds, and repeated this 100 times. I'm asking about averaging over the 100 different 10s-long traces, not a moving average of the data within each 10s trace. Sorry for the confusion.
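For concreteness, here's a rough sketch of that scenario in numpy (the numbers are just the ones from the example above; this is illustrative, not real data):

```python
import numpy as np

rng = np.random.default_rng(2)
n_traces, n_samples = 100, 10_000      # 100 repeats of a 10 s trace
signal = 3.0                           # 3 V DC supply
noise_rms = 0.010                      # 10 mV of noise on each trace

traces = signal + rng.normal(0, noise_rms, size=(n_traces, n_samples))

# Average point-by-point across the 100 traces (NOT a moving average in time)
avg_trace = traces.mean(axis=0)

print(np.std(traces[0] - signal))   # ~10 mV noise in a single trace
print(np.std(avg_trace - signal))   # ~10 mV / sqrt(100) = ~1 mV after averaging
```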
 
A test statistic is technically just a function of a set of random variables - like a normal function except each variable is a random variable and not a deterministic one.

You have a moving average or some other quantity you are trying to estimate (i.e., you are estimating something that is a function of your random variables but is constant regardless of them, since the distribution you are looking at is supposed to be the same for all of the sample elements), and you construct what is called an estimator.

An estimator is something that is used to get the distribution of the parameter you are trying to actually find.

Usually it's a mean, standard deviation, variance, median or something of that nature.

If it's an expectation then the standard statistical trick for large samples is via the Central Limit Theorem and its analogues. This means that if you form sums (or means) of quantities that are IID (independent and identically distributed), then the distribution of that sum or mean approaches a normal distribution.

When you do the expectation and variance calculations on this, you find that the standard deviation is ##\sigma/\sqrt{n}## and the mean is just the mean of the sample. You usually have to estimate ##\sigma## using the sample variance, and the sample mean is your estimate for your actual mean.

This is used to get point estimates and confidence intervals.
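To make that concrete, here is a minimal sketch (my own, in numpy; the data below are simulated, not from any real measurement) of the usual recipe: estimate the mean with the sample mean, estimate ##\sigma## with the sample standard deviation, and quote the standard error ##\sigma/\sqrt{n}## and an approximate 95% confidence interval:

```python
import numpy as np

rng = np.random.default_rng(3)
sample = rng.normal(5.0, 2.0, size=400)     # assumed data: true mean 5, sigma 2

mean_hat = sample.mean()                    # point estimate of the mean
sigma_hat = sample.std(ddof=1)              # sample standard deviation (estimates sigma)
std_err = sigma_hat / np.sqrt(sample.size)  # CLT: std dev of the mean is sigma/sqrt(n)

# Approximate 95% confidence interval for the mean (normal approximation)
print(mean_hat, std_err)
print(mean_hat - 1.96 * std_err, mean_hat + 1.96 * std_err)
```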

I'd strongly suggest you pick up an introductory textbook on statistical inference (which is the body of knowledge concerning this sort of thing) and look at the Maximum Likelihood Estimator technique to get an idea of how they are constructed.

That way - when you look at the formulas they can be followed rather than taking everything completely "on faith".
 
