Measurements - how can they be that precise?

In summary: it is just a statistical calculation based on the assumption that the errors are independent and have a mean of 0. It is not about trust or belief; it is just a mathematical tool to estimate the "true value" from a limited number of measurements.
  • #1
Omega0
TL;DR Summary
How can even a few measured values be "honored" by a factor of ##1/\sqrt{N}##?
I have a question about statistics or measurement technology.

Let ##K## and ##N## be natural numbers. We measure a variable ##x_{kn}##, where ##k\in\{1,\dots,K\}## and ##n\in\{1,\dots,N\}##.

Let us say that you have ##K## measurement series, each measuring ##N## values ##x_{kn}##, no matter how big ##N## is.

The measured value is the mean value of the ##x_{kn}##, plus or minus the uncertainty ##u_k##.

The uncertainty is ##s_{xk}##, and we can correct the result by a Student's t factor ##t_k## to be more precise for ##N \leq 200##, as a rule of thumb (sorry, it is engineering I am speaking about).

Now the miracle begins (honestly, I just don't understand it). The saying goes: the value we measured is much more accurate than we might believe. If we measure ##k \rightarrow \infty## series of ##N## measurements each, then we can estimate the "true value" much more accurately - which is obviously true. We basically calculate the standard deviation of the mean value and get a factor ##1/\sqrt{N}##, which makes the uncertainty way smaller.
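To make the claim concrete, here is a small Python sketch of what the saying promises (the "true value" 5.0 and the noise level 0.3 are numbers I made up; the spread of the averages over many imagined repetitions really does shrink like ##1/\sqrt{N}##):

```python
import numpy as np

rng = np.random.default_rng(0)
true_value, noise_sd = 5.0, 0.3   # made-up numbers, just for illustration
runs = 100_000                    # many imagined repetitions of a series

for N in (5, 10, 50, 200):
    # Simulate `runs` series of N noisy readings each and average every series.
    means = rng.normal(true_value, noise_sd, size=(runs, N)).mean(axis=1)
    print(f"N={N:4d}: spread of the mean = {means.std():.4f}, "
          f"sigma/sqrt(N) = {noise_sd / np.sqrt(N):.4f}")
```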
And here I need your help: this is done even for ##K=1##. The story is "imagine that we did it several times, very often"... but we didn't! In measurement theory we divide by ##\sqrt{10}## for even ##10## single values.
How does this work?

Thanks,
Jens
 
  • #2
Omega0 said:
Summary:: How can even a few measured values be "honored" by a factor of ##1/\sqrt{N}##?

How does this work?
If you have 1 measurement the error may be large or small. If it is large then the next measurement is very likely to reduce the average error. If it is small then the next measurement will still probably reduce the average error, although the probability will be closer to 50% than in the case of a large first error.
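A rough numerical illustration of this (my own sketch, assuming standard normal errors; the one-sigma cutoff for a "large" first error is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
e1, e2 = rng.standard_normal(n), rng.standard_normal(n)   # assumed N(0,1) errors

# Is the error of the two-measurement average smaller than the first error alone?
improved = np.abs((e1 + e2) / 2) < np.abs(e1)

large = np.abs(e1) > 1.0   # call a first error "large" if it exceeds one sigma
print("P(average closer | large first error):", improved[large].mean())
print("P(average closer | small first error):", improved[~large].mean())
```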
 
  • #3
Dale said:
If you have 1 measurement the error may be large or small. If it is large then the next measurement is likely to reduce the average error. The more measurements you have the more likely you are to have reduced the measurement error of a typical measurement.
Sure, this is how it works - but you do the same even if you have only one run. Why?
 
  • #4
I am confused by the problem statement. How many measurements from a single distribution are being averaged into a single estimate? Your use of the two variables N and K has me confused. If your question holds even for K=1, why not restate it without mentioning K? I think I could understand it more easily.
 
  • Like
Likes Dale
  • #5
I am using ##K## to indicate that we need a bunch of measurement series to have a mean of the means.
But this seems not to be needed. Even if you have one series of data, you simply assume that the data stream from your sensors would have gone on like this forever - if you believe it would do so forever.
 
  • #6
Dale said:
If it is small then the next measurement will still probably reduce the average error, although the probability will be closer to 50% than in the case of a large first error.
So I basically believe in the future? I reduce the uncertainty because I believe that the same will happen in the future?
 
  • #7
Of course, averaging N values does not necessarily mean that the average is more accurate. It just means that the average is probably more accurate given the assumption that the errors, ##\epsilon_i##, (not "uncertainties") are independent random variables with mean 0.
Perhaps the confusion is in the use of "uncertainty". The more standard, defined description of an error probability distribution in this context is standard deviation.
 
Last edited:
  • Like
Likes etotheipi
  • #8
Omega0 said:
Sure, this is how it works - but you do the same even if you have only one run. Why?
I am not sure why you are worried about N=1. ##1/\sqrt{1}=1## means no reduction for 1 measurement.
 
  • #9
FactChecker said:
Perhaps the confusion is in the use of "uncertainty". The more standard, defined description of an error probability distribution in this context is standard deviation.
Nope, the term "error" is a zombie term that needs to die out. Nature is not an "error".
See the GUM terms. https://www.iso.org/sites/JCGM/GUM-introduction.htm
FactChecker said:
Of course, averaging N values does not necessarily mean that the average is more accurate. It just means that the average is probably more accurate given the assumption that the errors, ##\epsilon_i##, (not "uncertainties") are independent random variables with mean 0.
Perhaps the confusion is exactly in not using "uncertainty", see GUM.
 
  • #10
So what is the question now? Are we focusing on N=1 and K=1? What is the concern?
 
  • #11
Dale said:
I am not sure why you are worried about N=1. ##1/\sqrt{1}=1## means no reduction for 1 measurement.
Let us speak about 20 measurements with t=2.97 for a confidence interval of 95.5%.
The next step is that we simply multiply the uncertainty by ##1/\sqrt{20}##.
This is enormous! It is an enormous amount of trust to say "no problem, just apply ##1/\sqrt{20}## for the future", isn't it?
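For concreteness, this is the calculation I mean, as a Python sketch with invented readings (the exact t factor depends on the confidence level and on which table one reads; here scipy's t quantile supplies the Student factor):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(10.0, 0.5, size=20)   # 20 invented readings

N = len(x)
mean, s = x.mean(), x.std(ddof=1)
conf = 0.955                                   # two-sided confidence level
t = stats.t.ppf(1 - (1 - conf) / 2, df=N - 1)  # Student factor, N-1 dof
u = t * s / np.sqrt(N)                         # the 1/sqrt(20) enters here
print(f"mean = {mean:.3f} +/- {u:.3f}  (t = {t:.2f})")
```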
 
  • #12
Omega0 said:
Nope, the term "error" is a zombie term that needs to die out. Nature is not an "error".
See the GUM terms. https://www.iso.org/sites/JCGM/GUM-introduction.htm

Perhaps the confusion is exactly in not using "uncertainty", see GUM.
IMHO, that is not the same subject as the one where you can average several readings and reduce the standard deviation by a factor of ##1/\sqrt N##. Uncertainty in measurements does not necessarily imply the required properties: independent samples with a random error term of mean zero.

In that context, you are correct that you cannot say there is a reduction of the error. But then you are left with very little applicable theory. There is a great advantage in designing experiments and measurements so that the powerful and profound theories of probability and statistics can be applied.
 
Last edited:
  • #13
Omega0 said:
Let us speak about 20 measurements with t=2.97 for a confidence interval of 95.5%.
The next step is that we simply multiply the uncertainty by ##1/\sqrt{20}##.
This is enormous! It is an enormous amount of trust to say "no problem, just apply ##1/\sqrt{20}## for the future", isn't it?
What are you talking about? What is the trust for the future? You are making no sense. Please explain very clearly what your concern is.

We have 20 measurements with some uncertainty ##\sigma## so the uncertainty of the mean is ##\sigma/\sqrt{20}##. What is at all confusing about that? And where does trust or the future enter in?
 
  • #14
This question is not well stated so I will give my interpretation.
For any set of, say, N=10 measurements of "x", repeated many times, there will be a distribution of calculated values of the sample variance ##s^2##. The scaled quantity ##(N-1)s^2/\sigma^2## (for, say, a Gaussian distribution of x... it may be more general, but I forget) follows a chi-squared distribution with N-1 degrees of freedom. So you get (N-1) in the denominator, and nonsense for N=1, as you should. I believe that is the crux here.
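A quick simulation sketch of this (Gaussian x, with toy values N=10 and ##\sigma=1## of my own choosing):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
N, sigma, runs = 10, 1.0, 200_000   # toy values

x = rng.normal(0.0, sigma, size=(runs, N))
s2 = x.var(axis=1, ddof=1)          # sample variance with the N-1 denominator
q = (N - 1) * s2 / sigma**2         # should follow chi-squared with N-1 dof

for p in (0.1, 0.5, 0.9):
    print(f"p={p}: empirical quantile {np.quantile(q, p):.3f}, "
          f"chi-squared(df={N - 1}) quantile {stats.chi2.ppf(p, df=N - 1):.3f}")
```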
 
  • #15
Omega0 said:
Let us speak about 20 measurements with t=2.97 for a confidence interval of 95.5%.
The next step is that we simply multiply the uncertainty by ##1/\sqrt{20}##.
This is enormous! It is an enormous amount of trust to say "no problem, just apply ##1/\sqrt{20}## for the future", isn't it?
That is the confidence that we have in the true location of the mean of a random variable. That is not the confidence of where any single sample of that random variable will be.
Consider the toss of a fair coin and give numerical values, H=0, T=1. The true mean is 1/2 and the variance is 1/4. Consider the experiment of a hundred tosses, taking the average of the results. The average will be very close to 1/2, with a variance of (1/4)/100 = 1/400, i.e. a standard deviation of 1/20. That does not mean that any single result will be that close, since single results are 0 or 1. But it does mean that if I did 100 coin tosses and took the average, it would be within (1/2 - 1/20, 1/2 + 1/20) with a probability of about 0.68.
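A sketch of that experiment in Python (my own toy run counts):

```python
import numpy as np

rng = np.random.default_rng(4)
runs, tosses = 200_000, 100

# H=0, T=1; average the 100 tosses of each run.
means = rng.integers(0, 2, size=(runs, tosses)).mean(axis=1)

print("sd of the average:", means.std())   # close to (1/2)/sqrt(100) = 0.05
print("fraction within 1/2 +/- 1/20:",
      (np.abs(means - 0.5) <= 0.05).mean())
# The binomial is discrete, so with the endpoints included this comes out
# somewhat above the normal-curve value of about 0.68.
```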
 
  • #16
Dale said:
What are you talking about? What is the trust for the future? You are making no sense. Please explain very clearly what your concern is.

We have 20 measurements with some uncertainty ##\sigma## so the uncertainty of the mean is ##\sigma/\sqrt{20}##. What is at all confusing about that? And where does trust or the future enter in?
Okay, please read it carefully and take your time.

If you have measured a value 20 times, then you have measured a value 20 times. This gives you exactly one (1) mean value. It gives you exactly one (1) sample standard deviation. On a time line: it is done. It is over, okay?

The next step is the following: if the conditions for the measurement are pretty much the same, we can measure again and again to make the measurement more accurate.
We will get new samples. This allows us to achieve new precision levels, which is done not by mathematics but by physics, okay?

In short: to speak about the uncertainty of the uncertainty obviously only makes sense if the future is taken into account; obviously the stability of the setup plays a role.
 
  • #17
Omega0 said:
Okay, please read it carefully and take your time.
Similarly, please take your time to write carefully. Your posts are very disorganized and unclear.

Omega0 said:
If you have measured a value 20 times, then you have measured a value 20 times. This gives you exactly one (1) mean value. It gives you exactly one (1) sample standard deviation.
There is one sample mean and one sample standard deviation, but the uncertainty in the sample mean is not equal to the sample standard deviation. What is so difficult to understand about that?

Omega0 said:
In short: to speak about the uncertainty of the uncertainty obviously only makes sense if the future is taken into account
If it is obvious then it should be easy to provide a professional scientific reference that makes this “obvious” claim. Please provide such a reference.
 
Last edited:
  • #18
Omega0 said:
We will get new samples. This allows us to achieve new precision levels, which is done not by mathematics but by physics, okay?
What do you mean by this? Are you taking the same sample size? The first sample gave a number, the sample variance. The process of obtaining a sample and calculating a sample variance is, itself, a random process that has a mean and variance. The variance of the variance is given by a complicated equation. See https://mathworld.wolfram.com/SampleVarianceDistribution.html.
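For normally distributed data it reduces to something simple; here is a quick simulation sketch with toy values (the comparison formula ##2\sigma^4/(N-1)## holds for normal data):

```python
import numpy as np

rng = np.random.default_rng(5)
N, sigma, runs = 10, 1.0, 200_000   # toy values

# Many repetitions of "draw N normal values, compute the sample variance".
s2 = rng.normal(0.0, sigma, size=(runs, N)).var(axis=1, ddof=1)

print("mean of s^2:    ", s2.mean())   # close to sigma^2 = 1 (s^2 is unbiased)
print("variance of s^2:", s2.var())    # the "variance of the variance"
print("2*sigma^4/(N-1) =", 2 * sigma**4 / (N - 1))   # theory for normal data
```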
Omega0 said:
In short: to speak about the uncertainty of the uncertainty obviously only makes sense if the future is taken into account;
Not really. One can treat the variance estimating process as a random process and (in theory) figure out its properties ahead of time. It will generate a number that has a mean and its own variance. That can be used to explain and better understand past experimental results. There is no reason to insist on future experiments or that the experimental setup still exists.
Omega0 said:
obviously the stability of the setup plays a role.
Throughout this, I would have to stipulate that the process remains completely unchanged so that it is a random process with an unchanging probability density function. There are subjects like Markov processes and time series, where the situation changes over time, but that is another can of worms.
 
Last edited:
  • Like
Likes Omega0 and Dale
  • #19
Dale said:
There is one sample mean and one sample standard deviation, but the uncertainty in the mean is not equal to the sample standard deviation. What is so difficult to understand about that?
To calculate a mean of means requires more than one (1) measurement. What is so difficult to understand about that?
 
  • #20
Omega0 said:
To calculate a mean of means requires more than one (1) measurement. What is so difficult to understand about that?
We are not calculating a mean of means. We are calculating the uncertainty of one sample mean.
 
  • Like
Likes FactChecker
  • #21
Dale, yes, we are calculating the uncertainty of a single sample mean. Sorry. Of course not the mean of the means.
 
  • #23
Omega0 said:
Now the miracle begins (honestly, I just don't understand it). The saying goes: the value we measured is much more accurate than we might believe.

Saying "the value" is "more accurate" is a poetic way of describing facts about mathematical probability. In the common-language interpretation of "accurate", there is no guarantee that "the value" becomes more accurate. In fact there is no guarantee that "the value" refers to a single value since the mean value of a collection of samples can change as more samples are taken.

A less poetic way of stating the mathematical facts is to say that the probability distribution for the sample mean of N independent identically distributed random variables is approximately a normal distribution whose standard deviation (aka "uncertainty") is approximately ##1/\sqrt{N}## times the standard deviation of an individual random variable.
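A simulation sketch of that statement (assuming, purely for illustration, exponentially distributed individual values with mean and standard deviation 1 - deliberately non-normal inputs):

```python
import numpy as np

rng = np.random.default_rng(6)
N, runs = 50, 200_000

# Deliberately non-normal inputs: exponential with mean 1 and sd 1.
means = rng.exponential(1.0, size=(runs, N)).mean(axis=1)

print("sd of the sample mean:", means.std())   # close to 1/sqrt(50) ~ 0.141

z = (means - 1.0) * np.sqrt(N)                 # standardized sample means
print("P(|Z| < 1):", (np.abs(z) < 1).mean())   # ~0.68 if approximately normal
print("P(|Z| < 2):", (np.abs(z) < 2).mean())   # ~0.95 if approximately normal
```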

To understand why this is true requires studying probability theory. If you are studying "uncertainties" merely as a set of procedures to be followed in reporting lab data, you might not study the probability theory that underlies the procedures. If you are studying probability theory, we can discuss why the normal distribution arises.
 
  • Like
Likes Omega0
  • #24
Okay, let's speak briefly about the mathematics, and sorry, Dale etc., if this sounds too stupid; I just want to explain my point.

I learned that the mean value of a measurement is a pretty good estimate of the "true value" of one single measurement campaign. The standard deviation gives you confidence intervals for the reliability of a single measurement campaign, nothing else.

Now let us speak about the standard deviation of the mean value of arbitrary functions which have ##N## measured values each.

Definition of this mean value: ##\bar{y}=f\left(y_1,\dots,y_N\right)=\frac{1}{N}\left(y_1+y_2+\dots+y_N\right)##

The derivatives are: ##\frac{\partial f}{\partial {y_i}}=\frac{1}{N}##

Please note that the "functions" ##y_i## each consist of ##N## measured values ##x_k##. So for each of the measured ##y_i## you can individually compute mean values etc.

Now with the variances ##s^2(y_i)## we get $$s^2(\bar{y})=\left(\frac{1}{N}\right)^2 \left( s^2\left(y_1\right)+s^2\left(y_2\right)+\dots+s^2\left(y_N\right)\right).$$

Let us get to the last step, which is the one that made me wonder a lot. We take all ##s^2(y_i)## to be literally the same, as they are made of the same input values. It is - tell me if I am wrong - as if you expect the measurement campaign to be ##N## times identical, correct?

This is why you suddenly have such a simple formula. This is why suddenly $$s(\bar{x})=\frac{s(x_i)}{\sqrt{N}}.$$

From the derivation there is only one way to get the final result: you need (mathematically, theoretically, however you like) ##N## times the same result, correct?
 
Last edited:
  • #25
Stephen Tashi said:
To understand why this is true requires studying probability theory. If you are studying "uncertainties" merely as a set of procedures to be followed in reporting lab data, you might not study the probability theory that underlies the procedures. If you are studying probability theory, we can discuss why the normal distribution arises.
I am pretty sure I understand why the normal distribution arises; I even show it to my students in simulations of coupled random functions. But I might lose against you in a dogfight. :wink:
 
  • #26
Omega0 said:
I learned that the mean value of a measurement is a pretty good estimate of the "true value" of one single measurement campaign. The standard deviation gives you confidence intervals for the reliability of a single measurement campaign, nothing else.
What is a measurement "campaign"?
Omega0 said:
Now with the variances ##s^2(y_i)## we get $$s^2(\bar{y})=\left(\frac{1}{N}\right)^2 \left( s^2\left(y_1\right)+s^2\left(y_2\right)+\dots+s^2\left(y_N\right)\right).$$
Where did this come from?

In its simplest form, aren't you saying that this formula for a sample standard deviation is wrong?
## s = \sqrt {\frac {\sum_{i=1}^{N} {(x_i-\bar x)^2}}{N-1}} ##
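(For concreteness, a two-line check of that formula in Python, with invented readings:)

```python
import numpy as np

x = np.array([9.8, 10.1, 10.0, 9.9, 10.3, 10.0])   # invented readings

# The formula above, spelled out with the N-1 denominator:
s_manual = np.sqrt(((x - x.mean()) ** 2).sum() / (len(x) - 1))

print(s_manual, np.std(x, ddof=1))   # identical; ddof=1 selects the N-1 form
```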
 
  • Like
Likes Dale
  • #27
FactChecker said:
Where did this come from?
I am not sure how to cite here, but:
Book: "Elektrische Messtechnik"
Authors:
- Prof. em. Dr. rer. nat. Dr. h.c. mult. Elmar Schrüfer (TU Munich)
- Prof. Dr. techn. Leonhard Reindl (University of Freiburg)
- Prof. Dr. techn. Bernhard Zagar (University of Linz)
2014, Hanser Verlag

This is where I got those equations from - but I thought they were well known.
 
  • #28
Are you sure that the fraction ## 1/N ## should be squared? If one considers the standard deviations of single samples, I think that your equation should reduce to the standard one for the sample standard deviation.
 
  • #29
Omega0 said:
Okay, let's speak briefly about the mathematics, and sorry, Dale etc., if this sounds too stupid; I just want to explain my point.

I learned that the mean value of a measurement is a pretty good estimate of the "true value" of one single measurement campaign. The standard deviation gives you confidence intervals for the reliability of a single measurement campaign, nothing else.

Now let us speak about the standard deviation of the mean value of arbitrary functions which have ##N## measured values each.

Definition of this mean value: ##\bar{y}=f\left(y_1,\dots,y_N\right)=\frac{1}{N}\left(y_1+y_2+\dots+y_N\right)##

The derivatives are: ##\frac{\partial f}{\partial {y_i}}=\frac{1}{N}##

Please note that the "functions" ##y_i## each consist of ##N## measured values ##x_k##. So for each of the measured ##y_i## you can individually compute mean values etc.

Now with the variances ##s^2(y_i)## we get $$s^2(\bar{y})=\left(\frac{1}{N}\right)^2 \left( s^2\left(y_1\right)+s^2\left(y_2\right)+\dots+s^2\left(y_N\right)\right).$$

Let us get to the last step, which is the one that made me wonder a lot. We take all ##s^2(y_i)## to be literally the same, as they are made of the same input values. It is - tell me if I am wrong - as if you expect the measurement campaign to be ##N## times identical, correct?

This is why you suddenly have such a simple formula. This is why suddenly $$s(\bar{x})=\frac{s(x_i)}{\sqrt{N}}.$$

From the derivation there is only one way to get the final result: you need (mathematically, theoretically, however you like) ##N## times the same result, correct?
What is a “measurement campaign” and why do you need one at all?
 
  • #30
FactChecker said:
In its simplest form, aren't you saying that this formula for a sample standard deviation is wrong?
## s = \sqrt {\frac {\sum_{i=1}^{N} {(x_i-\bar x)^2}}{N-1}} ##
What? Excuse me... what??
 
  • #31
Dale said:
What is a “measurement campaign” and why do you need one at all?
Well, what do you call it when you measure things, in US English? A measurement? We have different words in German, I guess. A "measurement" would be a single value, and this "campaign" would be a bunch of values. So please don't ask me about the correct English terms of measurement technology.
A measurement campaign is, for me, measuring several values in a row (under certain conditions etc.).
 
  • #32
Omega0 said:
It is - tell me if I am wrong - as if you expect the measurement campaign to be N times identical, correct?
Yes, the formula only works if the ##N## measurements are independent and identically distributed. That is usually an explicit assumption in the derivation, often phrased in terms of random sampling of a fixed large population.
 
Last edited:
  • Like
Likes Omega0 and FactChecker
  • #33
Omega0 said:
Let us get to the last step, which is the one that made me wonder a lot. We take all ##s^2(y_i)## to be literally the same, as they are made of the same input values. It is - tell me if I am wrong - as if you expect the measurement campaign to be ##N## times identical, correct?

This is why you suddenly have such a simple formula. This is why suddenly $$s(\bar{x})=\frac{s(x_i)}{\sqrt{N}}.$$
That equation does not require that the values of the sample are identical. It requires that they are all independent samples from the same distribution. Aren't you talking about independent samples? If they are from distributions that are correlated, then the formula must include the correlation coefficients.
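To illustrate that last point, a sketch with an assumed lag-one correlation of 0.8 between successive samples (AR(1) noise, my own choice of numbers); the spread of the mean comes out much larger than the ##1/\sqrt{N}## rule alone would suggest:

```python
import numpy as np

rng = np.random.default_rng(7)
N, runs, rho = 20, 100_000, 0.8   # rho: an assumed lag-1 correlation

# AR(1) noise with unit marginal variance: successive "measurements" correlate.
x = np.empty((runs, N))
x[:, 0] = rng.standard_normal(runs)
for i in range(1, N):
    x[:, i] = rho * x[:, i - 1] + np.sqrt(1 - rho**2) * rng.standard_normal(runs)

print("sd of the mean, correlated data:", x.mean(axis=1).std())
print("what 1/sqrt(N) alone would give:", 1 / np.sqrt(N))
```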
 
  • Like
Likes Omega0
  • #34
Omega0 said:
Now let us speak about the standard deviation of the mean value of arbitrary functions which have ##N## measured values each.
Let us get to the last step, which is the one that made me wonder a lot. We take all ##s^2(y_i)## to be literally the same, as they are made of the same input values.

You must distinguish between the mean value and standard deviation of a random variable versus the mean value and the standard deviation of a particular set of samples of that random variable.

Suppose the mean of random variable X is ##\mu_X## and its variance is ##\sigma^2_X##. The mean ##M## of N independent samples of that random variable is also a random variable. The mean of ##M## is ##\mu_X## and the variance of ##M## is ##\sigma^2_X/N##.

However, for a particular N samples of X, there is no guarantee that the mean of those N values will be ##\mu_X##, and there is no guarantee that the variance of those N values will be ##\sigma^2_X/N##. In fact, for typical random variables, it is unlikely that sample statistics will match population parameters.

So you can't formulate a proof that the variance of ##M## is ##\sigma_X^2/N## by imagining that you are working with ##N## particular values of ##X##.
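A small sketch of that distinction (with assumed population values ##\mu_X = 5## and ##\sigma_X = 0.3##):

```python
import numpy as np

rng = np.random.default_rng(8)
mu, sigma, N = 5.0, 0.3, 10   # assumed population parameters

for trial in range(5):
    x = rng.normal(mu, sigma, N)
    print(f"trial {trial}: sample mean = {x.mean():.3f}, "
          f"sample variance = {x.var(ddof=1):.4f}")
# Each line scatters around mu = 5.0 and sigma^2 = 0.09; no particular
# sample reproduces the population parameters exactly.
```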
 
  • Like
Likes Omega0
  • #35
Look at the Central Limit Theorem.
 
