# Measurements - How can it be that precise?


## Summary:

How can even a few measured values be "honored" by a factor of ##1/\sqrt{N}##?

## Main Question or Discussion Point

I have a question about statistics or measurement technology.

Let ##K## and ##N## be natural numbers. We measure a variable ##x_{kn}##, where ##k\in\{1,\dots,K\}## and ##n\in\{1,\dots,N\}##.

Let us say that you have ##K## measurement series, each consisting of ##N## values ##x_{kn}##, no matter how big ##N## is.

The measured value is the mean value of the ##x_{kn}## plus or minus the uncertainty ##u_k##.

The uncertainty is ##s_{xk}##, and we can correct the result by a Student factor ##t_k## to be more precise for ##N \leq 200##, as a rule of thumb (sorry, it is engineering I am speaking about).

Now the miracle begins (honestly, I just don't understand it). The saying goes: the value we measured is much more accurate than we might believe. If we let ##K \rightarrow \infty##, with ##N## measurements each time, then we can estimate the "true value" much more accurately, which is obviously true. We basically calculate the standard deviation of the mean value. We get a factor ##1/\sqrt{N}##, which makes the uncertainty much smaller.
And here I need your help: this is done even for ##K=1##. The story is "imagine that we did it several times, very often"... but we didn't! In measurement theory we divide by ##\sqrt{10}## even for ##10## single values.
How does this work?

Thanks,
Jens

Dale
Mentor
Summary:: How can even a few measured values be "honored" by a factor of ##1/\sqrt{N}##?

How does this work?
If you have 1 measurement the error may be large or small. If it is large then the next measurement is very likely to reduce the average error. If it is small then the next measurement will still probably reduce the average error, although the probability will be closer to 50% than in the case of a large first error.

If you have 1 measurement the error may be large or small. If it is large then the next measurement is likely to reduce the average error. The more measurements you have the more likely you are to have reduced the measurement error of a typical measurement.
Sure, this is how it works - but if you have only one run you will do the same - but why?

FactChecker
Gold Member
I am confused by the problem statement. How many measurements from a single distribution are being averaged to a single estimate number? Your use of the two variables N and K has me confused. If you have a question, even for K=1, why not restate the question without mentioning K? I think I could understand that more easily.

I am using ##K## to signal that we need a bunch of measurement series to get a mean of the means.
But this seems not to be needed. Even if you have only one series of data, you simply believe that the data stream to your sensors would have carried on the same way forever, if you believe it would do so forever.

If it is small then the next measurement will still probably reduce the average error, although the probability will be closer to 50% than in the case of a large first error.
So I basically believe in the future? I reduce the uncertainty because I believe that the same will happen in the future?

FactChecker
Gold Member
Of course, averaging N values does not necessarily mean that the average is more accurate. It just means that the average is probably more accurate given the assumption that the errors, ##\epsilon_i##, (not "uncertainties") are independent random variables with mean 0.
Perhaps the confusion is in the use of "uncertainty". The more standard, defined description of an error probability distribution in this context is standard deviation.

Dale
Mentor
Sure, this is how it works - but if you have only one run you will do the same - but why?
I am not sure why you are worried about N=1. ##1/\sqrt{1}=1## means no reduction for 1 measurement

Perhaps the confusion is in the use of "uncertainty". The more standard, defined description of an error probability distribution in this context is standard deviation.
Nope, the term "error" is a zombie term which has to die out. Nature is not an "error".
See the GUM terms. https://www.iso.org/sites/JCGM/GUM-introduction.htm
Of course, averaging N values does not necessarily mean that the average is more accurate. It just means that the average is probably more accurate given the assumption that the errors, ##\epsilon_i##, (not "uncertainties") are independent random variables with mean 0.
Perhaps the confusion is exactly in not using "uncertainty", see GUM.

Dale
Mentor
So what is the question now? Are we focusing on N=1 and K=1? What is the concern?

I am not sure why you are worried about N=1. ##1/\sqrt{1}=1## means no reduction for 1 measurement
Let us speak about 20 measurements with t=2.97 for a confidence interval of 95.5%.
The next step is that we simply multiply the uncertainty by ##1/\sqrt{20}##
This is enormous! It is an enormous amount of trust to say, no prob, just give it ##1/\sqrt{20}## for the future, isn't it?

FactChecker
Gold Member
Nope, the term "error" is a zombie term which has to die out. Nature is not an "error".
See the GUM terms. https://www.iso.org/sites/JCGM/GUM-introduction.htm

Perhaps the confusion is exactly in not using "uncertainty", see GUM.
IMHO, that is not the same subject as the one where you can average several readings and reduce the standard deviation by a factor of ##1/\sqrt N##. Uncertainty in measurements does not necessarily imply the required properties of independent samples with a random error term with mean zero.

In that context, you are correct that you cannot say there is a reduction of the error. But then you are left with very little applicable theory. There is a great advantage in designing experiments and measurements so that the powerful and profound theories of probability and statistics can be applied.

Dale
Mentor
Let us speak about 20 measurements with t=2.97 for a confidence interval of 95.5%.
The next step is that we simply multiply the uncertainty by ##1/\sqrt{20}##
This is enormous! It is an enormous amount of trust to say, no prob, just give it ##1/\sqrt{20}## for the future, isn't it?
What are you talking about? What is the trust for the future? You are making no sense. Please explain very clearly what your concern is.

We have 20 measurements with some uncertainty ##\sigma## so the uncertainty of the mean is ##\sigma/\sqrt{20}##. What is at all confusing about that? And where does trust or the future enter in?
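The distinction between "one run" and "many imagined runs" can be checked numerically. The following is only a minimal sketch, assuming independent Gaussian readings; `TRUE_VALUE`, `SIGMA`, `RUNS` and the helper `one_run` are hypothetical names and numbers chosen for illustration, not anything from the thread. It compares ##s/\sqrt{N}## computed from a single run of 20 readings with the scatter of the means actually observed over many repeated runs.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

TRUE_VALUE = 10.0  # hypothetical "true" value of the measurand
SIGMA = 2.0        # hypothetical standard deviation of a single reading
N = 20             # readings per run
RUNS = 5000        # number of imagined repetitions of the whole run


def one_run(n):
    """Return n independent Gaussian readings (an assumption of this sketch)."""
    return [random.gauss(TRUE_VALUE, SIGMA) for _ in range(n)]


# What is actually done in the lab: one run, one mean, one sample standard deviation.
single = one_run(N)
estimated_sem = statistics.stdev(single) / N ** 0.5

# What the formula is a statement about: the scatter of the mean over many runs.
means = [statistics.fmean(one_run(N)) for _ in range(RUNS)]
observed_spread = statistics.stdev(means)

theoretical_sem = SIGMA / N ** 0.5

print(f"s/sqrt(N) from one run:     {estimated_sem:.3f}")
print(f"scatter of {RUNS} run means: {observed_spread:.3f}")
print(f"sigma/sqrt(N):              {theoretical_sem:.3f}")
```

Under these assumptions the single-run value is a noisy estimate of the same quantity that the repeated runs reveal directly; no further runs are needed to compute it, only the assumption that the readings are independent draws from one fixed distribution.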

This question is not well stated, so I will give my interpretation.
For any set of, say, ##N=10## measurements of ##x## repeated many times, there will be a distribution of calculated values for ##\sigma^2##. This distribution (for, say, a Gaussian distribution of ##x##; it may be more general, but I forget) will conform to a (scaled) chi-square distribution for ##\sigma^2## with ##N-1## degrees of freedom. So you get ##(N-1)## in the denominator and nonsense for ##N=1##, as you should. I believe that is the crux here.
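A rough numerical check of the chi-square statement, as a sketch only: it assumes Gaussian readings with a hypothetical `SIGMA`, and it uses the standard fact that a chi-square variable with ##N-1## degrees of freedom has mean ##N-1## and variance ##2(N-1)##. The quantity examined is the scaled sample variance ##(N-1)s^2/\sigma^2##.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

SIGMA = 1.5      # hypothetical standard deviation of a single reading
N = 10           # readings per set
REPEATS = 20000  # how many sets of N readings are simulated

scaled = []
for _ in range(REPEATS):
    sample = [random.gauss(0.0, SIGMA) for _ in range(N)]
    s2 = statistics.variance(sample)  # sample variance with N-1 in the denominator
    scaled.append((N - 1) * s2 / SIGMA ** 2)

mean_scaled = statistics.fmean(scaled)
var_scaled = statistics.variance(scaled)

print(f"mean of (N-1)s^2/sigma^2:     {mean_scaled:.2f}  (chi-square: {N - 1})")
print(f"variance of (N-1)s^2/sigma^2: {var_scaled:.2f}  (chi-square: {2 * (N - 1)})")
```

The ##N=1## case shows up here as well: `statistics.variance` raises `StatisticsError` when given fewer than two data points, since the ##N-1## denominator would be zero.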

FactChecker
Gold Member
Let us speak about 20 measurements with t=2.97 for a confidence interval of 95.5%.
The next step is that we simply multiply the uncertainty by ##1/\sqrt{20}##
This is enormous! It is an enormous amount of trust to say, no prob, just give it ##1/\sqrt{20}## for the future, isn't it?
That is the confidence that we have in the true location of the mean of a random variable. That is not the confidence of where any single sample of that random variable will be.
Consider the toss of a fair coin and give numerical values, H=0, T=1. The true mean is 1/2 and the variance is 1/4. Consider the experiment of a hundred tosses, taking the average of the results. The average will be very close to 1/2, with a variance of (1/4)/100 = 1/400, i.e. a standard deviation of (1/2)/10 = 1/20. That does not mean that any single result will be that close, since single results are 0 or 1. But it does mean that if I did 100 coin tosses and took the average, it would be within (1/2-1/20, 1/2+1/20) with a probability of about 0.68 (using the normal approximation).
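The coin example can be checked exactly rather than through the normal approximation. This is a small sketch assuming a fair coin with H=0, T=1; the helper `prob_tails_between` is a hypothetical name. For 100 tosses the standard deviation of the average is ##(1/2)/\sqrt{100} = 1/20##, so "within one standard deviation of 1/2" corresponds to between 45 and 55 tails.

```python
from math import comb, sqrt

N_TOSSES = 100
P_TAIL = 0.5  # fair coin, H=0, T=1

sd_single = sqrt(P_TAIL * (1 - P_TAIL))  # standard deviation of one toss
sd_mean = sd_single / sqrt(N_TOSSES)     # standard deviation of the average


def prob_tails_between(lo, hi):
    """Exact probability that the number of tails k satisfies lo <= k <= hi."""
    return sum(comb(N_TOSSES, k) for k in range(lo, hi + 1)) / 2 ** N_TOSSES


# The average is discrete, so it matters whether the endpoints 0.45 and 0.55
# (45 and 55 tails) are counted as "within one standard deviation" or not.
p_closed = prob_tails_between(45, 55)  # endpoints included
p_open = prob_tails_between(46, 54)    # endpoints excluded

print(f"sd of one toss:      {sd_single}")
print(f"sd of the average:   {sd_mean}")
print(f"P(0.45 <= avg <= 0.55) = {p_closed:.4f}")
print(f"P(0.45 <  avg <  0.55) = {p_open:.4f}")
```

The normal-approximation figure of about 0.68 falls between the two exact values, which is what one would expect when a continuous curve is laid over a discrete distribution.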

What are you talking about? What is the trust for the future? You are making no sense. Please explain very clearly what your concern is.

We have 20 measurements with some uncertainty ##\sigma## so the uncertainty of the mean is ##\sigma/\sqrt{20}##. What is at all confusing about that? And where does trust or the future enter in?

If you have measured a value 20 times, then you have measured a value 20 times. This gives you exactly one (1) mean value. This gives you exactly one (1) sample standard deviation. On a timeline: it is done. It is over, okay?

The next step is the following: if the conditions for the measurement are pretty much the same, we can measure and measure again to make the measurement more accurate.
We will get new samples. This allows us to achieve new precision levels, which is not done by mathematics but by physics, okay?

In short: to speak about the uncertainty of the uncertainty obviously only makes sense if the future is taken into account; obviously the stability of the setup plays a role.

Dale
Mentor

If you have measured a value 20 times, then you have measured a value 20 times. This gives you exactly one (1) mean value. This gives you exactly one (1) sample standard deviation.
There is one sample mean and one sample standard deviation, but the uncertainty in the sample mean is not equal to the sample standard deviation. What is so difficult to understand about that?

In short: to speak about the uncertainty of the uncertainty obviously only makes sense if the future is taken into account
If it is obvious then it should be easy to provide a professional scientific reference that makes this “obvious” claim. Please provide such a reference.

FactChecker
Gold Member
We will get new samples. This allows us to achieve new precision levels, which is not done by mathematics but by physics, okay?
What do you mean by this? Are you taking the same sample size? The first sample gave a number which is the sample variance. The process of obtaining a sample and calculating a sample variance is, itself, a random process that has a mean and variance. The variance of the variance is a complicated equation. See https://mathworld.wolfram.com/SampleVarianceDistribution.html.
In short: to speak about the uncertainty of the uncertainty obviously only makes sense if the future is taken into account,
Not really. One can treat the variance estimating process as a random process and (in theory) figure out its properties ahead of time. It will generate a number that has a mean and its own variance. That can be used to explain and better understand past experimental results. There is no reason to insist on future experiments or that the experimental setup still exists.
obviously the stability of the setup plays a role.
Throughout this, I would have to stipulate that the process remains completely unchanged so that it is a random process with an unchanging probability density function. There are subjects like Markov processes and time series, where the situation changes over time, but that is another can of worms.

There is one sample mean and one sample standard deviation, but the uncertainty in the mean is not equal to the sample standard deviation. What is so difficult to understand about that?
To calculate the mean of means requires more than one (1) measurement. What is so difficult to understand about that?

Dale
Mentor
To calculate the mean of means requires more than one (1) measurement. What is so difficult to understand about that?
We are not calculating a mean of means. We are calculating the uncertainty of one sample mean.

Dale, yes, we are calculating the uncertainty of a simple one sample measurement. Sorry. Of course not the mean of the means.

Dale
Mentor
Ok, so is the confusion resolved?

Stephen Tashi
Now the miracle begins (honestly, I just don't understand it). The saying goes: the value we measured is much more accurate than we might believe.
Saying "the value" is "more accurate" is a poetic way of describing facts about mathematical probability. In the common-language interpretation of "accurate", there is no guarantee that "the value" becomes more accurate. In fact there is no guarantee that "the value" refers to a single value since the mean value of a collection of samples can change as more samples are taken.

A less poetic way of stating the mathematical facts is to say that the probability distribution for the sample mean of N independent identically distributed random variables is approximately a normal distribution whose standard deviation (aka "uncertainty") is approximately ##1/\sqrt{N}## times the standard deviation of an individual random variable.

To understand why this is true requires studying probability theory. If you are studying "uncertainties" merely as a set of procedures to be followed in reporting lab data, you might not study the probability theory that underlies the procedures. If you are studying probability theory, we can discuss why the normal distribution arises.
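As a hedged sketch of where the ##1/\sqrt{N}## factor itself comes from (the normal shape is a separate matter): assume ##x_1,\dots,x_N## are independent random variables with a common variance ##\sigma^2##. These symbols are introduced here only for illustration. Then

$$\operatorname{Var}(\bar{x})=\operatorname{Var}\left(\frac{1}{N}\sum_{i=1}^{N}x_i\right)=\frac{1}{N^2}\sum_{i=1}^{N}\operatorname{Var}(x_i)=\frac{1}{N^2}\,N\sigma^2=\frac{\sigma^2}{N},\qquad \sigma_{\bar{x}}=\frac{\sigma}{\sqrt{N}}.$$

Only independence and equal variances enter in the second and third steps. The individual values do not have to be identical, and normality is not needed for this part of the argument.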

Okay, let's speak briefly about the mathematics, and sorry Dale etc. if this sounds too stupid; I just want to explain my point.

I learned that the mean value of a measurement is a pretty good estimate of the "true value" for one single measurement campaign. The standard deviation gives you confidence intervals for the reliability of a single measurement campaign, nothing else.

Now let us speak about the standard deviation of the mean value of arbitrary functions which have ##N## measured values each.

Definition of this mean value: ##\bar{y}=f\left(y_1,\dots,y_N\right)=\frac{1}{N}\left(y_1+y_2+\dots+y_N\right)##

The derivatives are: ##\frac{\partial f}{\partial {y_i}}=\frac{1}{N}##

Please note that the "functions" ##y_i## each consist of ##N## measured values ##x_k##. So for all the measured ##y_i## you can individually create mean values etc.

Now with the variances ##s^2(y_i)## we get $$s^2(\bar{y})=\left(\frac{1}{N}\right)^2 \left( s^2\left(y_1\right)+s^2\left(y_2\right)+\dots+s^2\left(y_N\right)\right).$$

Let us get to the last step, which is the one that made me wonder a lot. We take all ##s^2(y_i)## to be literally the same, as they are made of the same input values. It is, tell me if I am wrong, as if you expect the measurement campaign to be ##N## times identical, correct?

This is why you suddenly have such a simple formula. This is why suddenly $$s(\bar{y})=\frac{s(y_i)}{\sqrt{N}}$$

From the derivation there is only one way to get the final result: You will need (mathematically, theoretically, however, ...) ##N## times the same result, correct?

To understand why this is true requires studying probability theory. If you are studying "uncertainties" merely as a set of procedures to be followed in reporting lab data, you might not study the probability theory that underlies the procedures. If you are studying probability theory, we can discuss why the normal distribution arises.
I am pretty sure I understand why the normal distribution arises; I even show it to my students in simulations of coupled random functions, but I may lose against you in a dogfight.