What is the error on averaging?

yungman · Sep 2, 2016

Let's say I have a random number generator that produces numbers between -200 and +200, inclusive. We know the average has to be zero if you sample an infinite number of times. But if you are only allowed to sample 30 times, what is the maximum error from the ideal zero in the average of those 30 random samples?

Is there a formula for this? I am not a math student and this is not homework. This is a real-life engineering problem I am facing. I have 140 mVrms of white noise (total random) riding on a DC level, and I want to find the DC level. If I spot sample 30 times and take the average, how close can I get to the true DC level?

Thanks

Dale · Sep 2, 2016

This is called the standard error of the mean.

yungman · Sep 2, 2016

Can you help by telling me how to calculate this?

Thanks

Mark44 · Sep 2, 2016

yungman said:

Can you help by telling me how to calculate this?

Do a web search for "standard error of the mean". There should be tons of references to this very well known statistic.

yungman · Sep 3, 2016

I did. The question is: what is the standard deviation of a total random number as described in my original post? I know the deviation of the average of 30 samples is σ_x = σ/√30. But I need to know what σ is for this. Is σ = 200 in my case, since it is total random? So σ_x = 200/5.48 = 36.5?

As I said before, this is not homework. I am an EE with a problem. I am really not interested in learning statistics or spending hours reading about standard deviation, as I will never use this again. Please tell me what σ is for my original post; I assume σ_x is the deviation after 30 samples. I would really appreciate your help so I can get on with my design.

Thanks

chiro · Sep 3, 2016

Hey yungman.

You are going to have to get a joint distribution of your sample and use statistics and intervals to estimate what the error is.

I say estimate because you will only get so many percent within the interval.

As an example - a 95% interval for some parameter means that 95% of the time, the real value will be in that interval that you get and the other 5% it will be outside it.

This is assuming that your probability model is correct - if it isn't then you have to update it to something that is more in line with what actually is correct.

At some point you need to make assumptions and once you do this you get your interval and/or point estimates and go from there.
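
As one concrete instance of such an interval, here is a sketch under assumptions that are not stated in the post above: the n samples are independent, their standard deviation σ is known, and the average is treated as approximately normal (the usual large-sample approximation, where 1.96 is the 95% point of the standard normal). The true DC level μ is then covered about 95% of the time by:

```latex
\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}}
\;\le\; \mu \;\le\;
\bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}
```

Here \bar{x} is the average of the n samples. If the model is wrong (for example, the samples are not independent), the interval no longer has this coverage.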

Heinera · Sep 3, 2016

yungman said:

I did. The question is: what is the standard deviation of a total random number as described in my original post? I know the deviation of the average of 30 samples is σ_x = σ/√30. But I need to know what σ is for this. Is σ = 200 in my case, since it is total random? So σ_x = 200/5.48 = 36.5?

No, σ will not be 200. To find σ, you will need to know what the distribution looks like. If the random numbers are uniformly distributed between -200 and 200, then σ = 115.47.

The formula for σ for a uniform random variable between a and b is σ = (b - a)/√12.
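
To put numbers on this, here is a minimal Python sketch. It assumes the uniform model on [-200, 200] and 30 independent samples; the function names are only illustrative.

```python
import math


def uniform_sigma(a: float, b: float) -> float:
    """Standard deviation of a continuous uniform distribution on [a, b]."""
    return (b - a) / math.sqrt(12)


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean of n independent, identically distributed samples."""
    return sigma / math.sqrt(n)


sigma = uniform_sigma(-200.0, 200.0)
sem = standard_error(sigma, 30)

print(f"sigma = {sigma:.2f}")                     # 115.47
print(f"standard error for n = 30: {sem:.2f}")    # 21.08
```

So under these assumptions the average of 30 samples has a standard deviation of about 21, not 36.5.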

Dale · Sep 3, 2016

yungman said:

But I need to know what σ is for this. Is σ = 200 in my case, since it is total random?

I can't help more than that. I don't know what you mean by "total random". I know what a uniformly distributed random is, and I know what a Gaussian distributed random is, but I don't know what a total random is.

yungman · Sep 3, 2016

Dale said:

I can't help more than that. I don't know what you mean by "total random". I know what a uniformly distributed random is, and I know what a Gaussian distributed random is, but I don't know what a total random is.

That goes to show you how little I know about this subject. I already spent almost an hour reading about this topic after you told me what to look for. My knowledge of statistics is from high school...50 years ago!

What I meant by total random is that there is no pattern; the probability of any number coming up between +200 and -200 is the same. I don't know exactly what a Gaussian distribution is. I don't know how to say it any better.

Thanks

Dale · Sep 3, 2016

yungman said:

the probability of any number coming up between +200 and -200 is the same.

That sounds like a uniform distribution.

yungman said:

there is no pattern

That sounds like an independent, identically distributed (IID) sample.

For an IID sample from a uniform distribution, @Heinera has given the calculation for the standard deviation that goes into the standard error.

https://en.m.wikipedia.org/wiki/Uniform_distribution_(continuous)

yungman · Sep 3, 2016

Heinera said:

No, σ will not be 200. To find σ, you will need to know what the distribution looks like. If the random numbers are uniformly distributed between -200 and 200, then σ = 115.47.

The formula for σ for a uniform random variable between a and b is σ = (b - a)/√12.

Thanks, I just remember σ = (b - a)/√12.

In my circuit, the +/-200 is actually the noise voltage I see on the oscilloscope. It was +/-200 mV of noise. I want to sample 30 times and use the average to reduce the noise at the output. 30 samples is all I can afford within that period.

I was thinking: the formula says σ_x = σ/√30 = σ/5.48, which means I would expect an improvement by a factor of 5.48. That is, after sampling, I would expect the noise to be reduced to +/-200 mV/5.48 = +/-36.5 mV.

I really don't even have to know σ; all I have to know is σ_x = σ/5.48. The noise is proportional to this.

Do you think this is correct?

Thanks

yungman · Sep 3, 2016

chiro said:

Hey yungman.

You are going to have to get a joint distribution of your sample and use statistics and intervals to estimate what the error is.

I say estimate because you will only get so many percent within the interval.

As an example - a 95% interval for some parameter means that 95% of the time, the real value will be in that interval that you get and the other 5% it will be outside it.

This is assuming that your probability model is correct - if it isn't then you have to update it to something that is more in line with what actually is correct.

At some point you need to make assumptions and once you do this you get your interval and/or point estimates and go from there.

Sorry, I really don't understand this, it's so over my head, but thanks.

yungman · Sep 3, 2016

Dale said:

That sounds like a uniform distribution.

That sounds like an independent, identically distributed (IID) sample

For an IID sample of a uniform distribution @Heinera has given the calculation for the standard error.

https://en.m.wikipedia.org/wiki/Uniform_distribution_(continuous)

I think this is it.

Please help me by looking at post #9. If my assumption is correct that I reduce the noise by a factor of 5.48 by sampling 30 times, that's my answer already. Hopefully I don't have to go any deeper into this.

Thanks

Heinera · Sep 3, 2016

yungman said:

I really don't even have to know σ; all I have to know is σ_x = σ/5.48. The noise is proportional to this.

Do you think this is correct?

Thanks

Yes, but only if all the data points in your sample of 30 are independent and have the same distribution. It sounds like you have an analog signal that you are digitally sampling. If these samples are taken at a high rate (that is, with a short time interval between them), then there are several possible (and plausible) reasons why they might not be independent, and other formulas will apply. The best way to figure this out is to take a lot of samples and run them through some statistical tests before you start implementing your design.
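
To illustrate why independence matters, here is a rough simulation sketch in Python. It is not a model of the actual circuit: it assumes Gaussian noise with unit standard deviation and uses a first-order autoregressive (AR(1)) process as a stand-in for samples taken too close together. The function name, `rho`, and the trial count are all illustrative choices.

```python
import math
import random
import statistics


def averaged_noise_std(n_samples, rho, sigma=1.0, trials=10000, seed=0):
    """Estimate the std of the mean of n_samples AR(1) noise samples.

    rho is the correlation between consecutive samples (0 means independent).
    Every sample has standard deviation sigma regardless of rho.
    """
    rng = random.Random(seed)
    # Scale the new random part so the process keeps a constant variance.
    innovation_scale = sigma * math.sqrt(1.0 - rho * rho)
    means = []
    for _ in range(trials):
        x = rng.gauss(0.0, sigma)  # start in the stationary distribution
        total = x
        for _ in range(n_samples - 1):
            x = rho * x + rng.gauss(0.0, innovation_scale)
            total += x
        means.append(total / n_samples)
    return statistics.pstdev(means)


independent = averaged_noise_std(30, rho=0.0)
correlated = averaged_noise_std(30, rho=0.9)

print(f"independent samples: {independent:.3f} (sigma/sqrt(30) = {1 / math.sqrt(30):.3f})")
print(f"correlated samples (rho = 0.9): {correlated:.3f}")
```

With independent samples the averaged noise should come out close to σ/√30; with strongly correlated samples the reduction is much smaller, so the factor of 5.48 would be optimistic.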

FactChecker · Sep 3, 2016

You must be careful to distinguish between the maximum error and the expected error. It is highly unlikely, but possible, for all 30 values to be 200. In that case, the error would be 200, and that is the maximum (you can also get -200). You may want to consider the expected error of the average, or the probability distribution of the sample error.
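
The difference between the worst case and the typical case can be seen in a small simulation. This is only a sketch, assuming independent uniform values on [-200, 200]; the seed and the number of trials are arbitrary.

```python
import random
import statistics

rng = random.Random(1)
N_SAMPLES = 30
TRIALS = 20000

# Each trial: average 30 independent uniform values on [-200, 200].
# The true mean is 0, so each average is also the error of that trial.
errors = [
    statistics.fmean([rng.uniform(-200.0, 200.0) for _ in range(N_SAMPLES)])
    for _ in range(TRIALS)
]
abs_errors = sorted(abs(e) for e in errors)

std_of_average = statistics.pstdev(errors)
within_95 = abs_errors[int(0.95 * TRIALS)]
largest_seen = abs_errors[-1]

print(f"std of the average: {std_of_average:.1f}")
print(f"95% of the averages are within +/-{within_95:.1f}")
print(f"largest error seen: {largest_seen:.1f} (theoretical bound: 200)")
```

The theoretical bound of 200 is essentially never approached; the spread of the average is set by σ/√30, which is about 21 for this model.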

chiro · Sep 4, 2016

What I mean is that you have a sample which has a distribution and you use that to get a statistic to estimate something.

This is what statistics is all about - you estimate the distribution (or obtain it from first principles) and then you use that to get something you are looking for.

FactChecker · Sep 4, 2016

The OP asked about the maximum error, but all the replies are talking about the statistic, with its expected value and distribution. I hope the OP noticed the switch, but I don't think it was clearly mentioned. The maximum error is +/-200.
