Effect of a Moving Average on Gaussian Noise


Discussion Overview

The discussion centers around the effect of averaging multiple data sets containing Gaussian noise on the resulting signal-to-noise ratio (SNR). Participants explore the implications of averaging in the context of statistical noise reduction, specifically addressing how the standard deviation of noise changes when multiple measurements are averaged. The conversation includes theoretical considerations and practical implications of moving averages and noise characteristics.

Discussion Character

  • Exploratory
  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant notes that averaging N data sets should reduce the noise, with the noise standard deviation expected to shrink by a factor of ##\frac{1}{\sqrt{N}}## (an SNR improvement of ##\sqrt{N}##), provided the signals correlate and the noise is uncorrelated.
  • Another participant proposes that the noise in each data set is Poisson distributed, arguing for a standard deviation of ##\sigma = \sqrt{N}## and hence a factor of ##\frac{1}{\sqrt{N}}## in relative noise; this reasoning is questioned later in the thread.
  • A different viewpoint emphasizes that uncorrelated noise adds in quadrature while the signal adds directly, so averaging yields a signal-to-noise ratio scaling as ##N/\sqrt{N} = \sqrt{N}##.
  • One participant questions the initial assumption about noise distribution, clarifying that it is Gaussian and discussing the implications of uncorrelated noise on the averaging process.
  • Another participant discusses the construction of a test statistic and the relationship between information and variance reduction, referencing statistical principles like the Cramer-Rao bound.
  • A participant introduces the concept of a moving average as a low-pass filter, providing a mathematical expression for the moving average and its implications for noise reduction.
  • One participant expresses uncertainty about the statistical concepts involved and seeks clarification on the relationship between variance and noise reduction through averaging.
  • Another participant explains the role of estimators and the Central Limit Theorem in deriving the standard deviation of the averaged data, suggesting that the standard deviation is ##\sigma/\sqrt{N}##.

Areas of Agreement / Disagreement

Participants express varying interpretations of the noise characteristics and the implications of averaging. While some agree on the general principle that averaging reduces noise, there is no consensus on the specifics of the noise distribution or the mathematical derivations involved.

Contextual Notes

Participants note the dependence of their arguments on the assumptions regarding noise distribution (Gaussian vs. Poisson) and the correlation of signals. There are unresolved mathematical steps and varying interpretations of statistical concepts that contribute to the complexity of the discussion.

Twigg
Hi all,

I think there is a really obvious answer to this, but I just don't see it yet. Suppose you had N data sets that all measured the same quantity as a function of time. Each data set shows the same signal plus a random noise component which is normally distributed about the signal with a constant standard deviation. If you were to take an average over the N data sets, you would expect to see the same signal with reduced noise. If I'm not mistaken, the more data sets averaged, the lower the resulting noise. What would the standard deviation of the noise be in the averaged data? Thanks!
 
IF the signals correlate and the noise doesn't, you expect the noise to shrink by a factor of ##\ 1/\sqrt N\ ## (an improvement of the signal-to-noise ratio by ##\sqrt N##), with ##N## the number of data sets.
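This is easy to check numerically. A minimal NumPy sketch (the sine signal, noise level ##\sigma = 0.5##, and ##N = 100## are made-up values for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100            # number of data sets
T = 1000           # samples per data set
sigma = 0.5        # noise standard deviation in each data set

signal = np.sin(np.linspace(0, 2 * np.pi, T))          # same signal in every set
data = signal + rng.normal(0.0, sigma, size=(N, T))    # N noisy copies

avg = data.mean(axis=0)                 # average over the N data sets
residual_sd = (avg - signal).std()      # noise left after averaging

print(residual_sd)              # close to sigma / sqrt(N) = 0.05
print(sigma / np.sqrt(N))
```

The residual noise comes out near ##\sigma/\sqrt{N}##, matching the factor above.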
 
That makes sense, I think. Let me know if I have this wrong:

For each of the N data sets, the noise is Poisson distributed at each time with a sample size of 1, which has standard deviation ##\sigma = \sqrt{1} = 1##. For the average of the N data sets, the noise is Poisson distributed at each time with a sample size of N, which has a standard deviation of ##\sigma = \sqrt{N}##. That gives the factor of ##\frac{1}{\sqrt{N}}## in SNR.
 
Didn't the thread mention Gaussian noise? Anyway, the key is: the noise is supposed to be uncorrelated, so it adds up in quadrature, while the signal adds up directly. Divide by ##N## to average, and the signal/noise goes like ##N/\sqrt N = \sqrt N##.
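The "noise adds in quadrature, signal adds directly" point can be seen by summing simulated traces. A small sketch (the constant signal level, ##\sigma = 1##, and ##N = 64## are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, sigma = 64, 5000, 1.0
s = 1.0                                    # constant "signal" level in each set

traces = s + rng.normal(0.0, sigma, size=(N, T))
total = traces.sum(axis=0)                 # add the N data sets

print(total.mean())   # signal adds directly: close to N * s = 64
print(total.std())    # noise adds in quadrature: close to sqrt(N) * sigma = 8
```

So before dividing by ##N##, the summed signal grows like ##N## while the summed noise grows like ##\sqrt N##, giving the ##N/\sqrt N## ratio.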
 
I appreciate the correction! I clearly need to review and practice more.

Since I've already made a fool of myself once, can I ask you to verify that I've got it this time? If ##u_n(t)## is the noise component at time t in the n-th data set and ##\sigma## is the standard deviation of the Gaussian noise, then the standard deviation of the noise in the averaged data set is going to be ##\sigma_{avg} = \sqrt{<u^2(t)>} = \frac{\sqrt{\sum_{n=1}^{N} <u_{n}^2(t)>}}{N} = \frac{\sqrt{N\sigma^2}}{N} = \frac{\sigma}{\sqrt{N}}##? So SNR improves with ##\sqrt{N}##.

And from your last post, am I right to think that it doesn't matter if the noise is Gaussian distributed (e.g., Johnson-Nyquist noise) or Poissonian (e.g., shot noise), so long as it is uncorrelated and that the value of ##\sigma## doesn't change in time?
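The distribution-agnostic point can also be checked directly: as long as the noise is uncorrelated with constant ##\sigma##, both Gaussian and (mean-subtracted) Poisson noise average down the same way. A sketch with made-up parameters (##N = 100## sets, Poisson rate 25 so both noises have ##\sigma = 5##):

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 100, 2000
lam = 25.0                                         # Poisson rate; count sd is sqrt(lam)

gauss = rng.normal(0.0, np.sqrt(lam), size=(N, T))   # Johnson-Nyquist-like noise
poiss = rng.poisson(lam, size=(N, T)) - lam          # shot-noise-like, zero-mean

# ratio of per-set noise sd to the sd after averaging N sets
ratio_gauss = gauss.std() / gauss.mean(axis=0).std()
ratio_poiss = poiss.std() / poiss.mean(axis=0).std()

print(ratio_gauss, ratio_poiss)   # both close to sqrt(N) = 10
```

Both ratios come out near ##\sqrt{N}##, independent of the distribution shape.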
 
Hey Twigg.

Just construct the test statistic and then get a variance for it and see what it is as a function of your sample.

More information should always reduce the variance (via results like the Cramér-Rao bound). When it comes to relating the noise to the assessment of what the signal is saying: if the signal has enough redundancy and you gather enough actual information, then statistically the error probabilities (i.e., the probability of misreading the signal given all the data values) will be so low that there is a high degree of confidence that whatever is inferred is likely to be the information that was communicated.

This idea is found in things like probabilistic prime testing (like Rabin-Miller testing).
 
A moving average is a filter, more specifically a low-pass filter. You can get the same effect using a digital low-pass filter:

Assume that you take the moving average over the last ##N## samples. Express this as ##y_{m}=\frac{1}{N}\sum_{i=m-N+1}^{m}x_{i}##. Then ##y_{m+1}=y_{m}+\frac{1}{N}x_{m+1}-\frac{1}{N}x_{m-N+1}##. From the formula for ##y_m## you have ##x_{m-N+1}\approx y_{m}##, since the oldest sample in the window is close to the running average. Therefore a good approximation to a moving average over ##N## samples is ##y_{m+1}=\frac{N-1}{N}y_{m}+\frac{1}{N}x_{m+1}##...
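The recursion ##y_{m+1}=\frac{N-1}{N}y_{m}+\frac{1}{N}x_{m+1}## can be compared with an exact windowed average; both smooth the noise toward the underlying level. A small sketch (the constant signal level 1.0, noise sd 0.3, and ##N = 20## are made-up values):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 20
x = 1.0 + rng.normal(0.0, 0.3, size=500)   # noisy samples of a constant signal

# exact moving average over the last N samples
exact = np.convolve(x, np.ones(N) / N, mode="valid")

# recursive low-pass approximation: y[m+1] = ((N-1)/N)*y[m] + x[m+1]/N
y = np.empty_like(x)
y[0] = x[0]
for m in range(len(x) - 1):
    y[m + 1] = (N - 1) / N * y[m] + x[m + 1] / N

print(exact[-1], y[-1])   # both settle near the true level 1.0
```

The recursive form needs no buffer of the last ##N## samples, which is why it is popular as a cheap digital low-pass filter.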
 
Thanks all for the replies!

chiro said:
Just construct the test statistic and then get a variance for it and see what it is as a function of your sample.

I'm not sure I fully understand, as my statistics knowledge is lacking and I'm not familiar with the lingo. I think this is what I tried to show in my last post: I record the variance in each of the N measurements, and then take the average variance over the sample. If the signal is the same in each measurement (which I assumed), the variance is just the time-average of the square of the noise. Assuming the noise doesn't get bigger or smaller between measurements, the variance in each measurement should be roughly constant. By that argument, I thought the noise standard deviation after averaging N data sets would be smaller by a factor of ##\sqrt{N}##. Does that make sense? Intuitively I understand why averaging reduces noise levels; it's the quantitative analysis (how much the noise gets reduced) that I'm having more difficulty with.

@Svein I think I didn't explain the problem well. The "moving average" I meant isn't temporal, it's over separate data sets each taken over time. For example, if I had a DC power supply which produced 3V DC plus 10mV of noise, and I saved voltage-vs-time data on an oscilloscope over 10 seconds, and repeated this 100 times. I'm asking about averaging over the 100 different 10s-long traces, not a moving average of the data in each 10s trace. Sorry for the confusion.
 
A test statistic is technically just a function of a set of random variables - like a normal function except each variable is a random variable and not a deterministic one.

You have a moving average or some other quantity you are trying to estimate (i.e., you are estimating something that is a function of your random variables but is constant regardless of them, since the distribution is assumed to be the same for all sample elements), and you construct what is called an estimator.

An estimator is something that is used to get the distribution of the parameter you are trying to actually find.

Usually it's a mean, standard deviation, variance, median or something of that nature.

If it's an expectation then the standard statistical trick for large samples is via the Central Limit Theorem and its analogues. This means that if you find expectations of sums (or even means) where everything is IID (independent and identically distributed) then the distribution becomes normal.

When you do the expectation and variance calculations on this, you find that the standard deviation is ##\sigma/\sqrt{n}## and the mean is just the mean of the sample. You usually have to estimate ##\sigma## using the sample variance, and the sample mean is your estimate for your actual mean.

This is used to get point estimates and confidence intervals.
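The point estimate and confidence interval described above can be sketched in a few lines of NumPy (the true mean 10, ##\sigma = 2##, and ##n = 400## are invented for the example; 1.96 is the usual two-sided 95% normal quantile):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 400
sigma_true = 2.0
sample = rng.normal(10.0, sigma_true, size=n)

mean_hat = sample.mean()            # point estimate of the mean
sigma_hat = sample.std(ddof=1)      # estimate of sigma from the sample
se = sigma_hat / np.sqrt(n)         # standard error, sigma/sqrt(n)

# ~95% confidence interval for the mean, via the Central Limit Theorem
ci = (mean_hat - 1.96 * se, mean_hat + 1.96 * se)
print(mean_hat, se, ci)
```

Note how the interval width is set by ##\sigma/\sqrt{n}##, the same factor that governs the noise reduction in the averaging question.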

I'd strongly suggest you pick up an introductory textbook on statistical inference (which is the body of knowledge concerning this sort of thing) and look at the Maximum Likelihood Estimator technique to get an idea of how they are constructed.

That way - when you look at the formulas they can be followed rather than taking everything completely "on faith".
 
