# Measurements - How can it be that precise?

FactChecker
Gold Member
I learned that the mean value of a measurement is a pretty good estimate of the "true value" of one single measurement campaign. The standard deviation gives you trust intervals for the reliability of a single measurement campaign, nothing else.
What is a measurement "campaign"?
Now with the variances ##s^2(y_i)## we get $$s^2(\bar{y})=\left(\frac{1}{N}\right)^2 \left( s^2\left(y_1\right)+s^2\left(y_2\right)+\dots+s^2\left(y_N\right)\right).$$
Where did this come from?

In its simplest form, aren't you saying that this formula for a sample standard deviation is wrong?
## s = \sqrt {\frac {\sum_{i=1}^{N} {(x_i-\bar x)^2}}{N-1}} ##

Dale
Where did this come from?
I am not sure how to cite here, but:
Book: "Elektrische Messtechnik"
Authors:
- Prof. em. Dr. rer. nat. Dr. hc. mult. Elmar Schrüfer
(TU Munich)
- Prof. Dr. techn. Leonhard Reindl
(University of Freiburg)
- Prof. Dr. techn. Bernard Zagar
(University of Linz)

2014, Hanser Verlag

This is where those equations come from, but I thought they were well known anyway.

FactChecker
Gold Member
Are you sure that the fraction ## 1/N ## should be squared? If one considers the standard deviations of single samples, I think that your equation should reduce to the standard one for the sample standard deviation.

Dale
Mentor
Okay, let's talk briefly about the mathematics, and sorry, Dale etc., if this sounds too stupid; I just want to explain my point.

I learned that the mean value of a measurement is a pretty good estimate of the "true value" of one single measurement campaign. The standard deviation gives you trust intervals for the reliability of a single measurement campaign, nothing else.

Now let us speak about the standard deviation of the mean value of arbitrary functions which have ##N## measured values each.

Definition of this mean value: ##\bar{y}=f\left(y_1,\dots,y_N\right)=\frac{1}{N}\left(y_1+y_2+\dots+y_N\right)##

The derivatives are: ##\frac{\partial f}{\partial {y_i}}=\frac{1}{N}##

Please note that the "functions" ##y_i## each consist of ##N## measured values ##x_k##. So for each of the measured ##y_i## you can form individual mean values, etc.

Now with the variances ##s^2(y_i)## we get $$s^2(\bar{y})=\left(\frac{1}{N}\right)^2 \left( s^2\left(y_1\right)+s^2\left(y_2\right)+\dots+s^2\left(y_N\right)\right).$$

Let us get to the last step, which is the one that made me wonder a lot. We take all ##s^2(y_k)## to be literally the same, as they are made of the same input values. It is, tell me if I am wrong, as if you expect the measurement campaign to be ##N## times identical, correct?

This is why you suddenly have such a simple formula. This is why suddenly $$s(\bar{x})=\frac{s(x_i)}{\sqrt{N}}$$

From the derivation there is only one way to get the final result: you will need (mathematically, theoretically, however you put it) ##N## times the same result, correct?
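The ##s(\bar{x})=s(x_i)/\sqrt{N}## claim is easy to check numerically. Here is a minimal Python sketch (my own illustration, not from the book; the Gaussian population, seed, and counts are arbitrary assumptions): it repeats a "campaign" of ##N## measurements many times and compares the scatter of the campaign means with the predicted ##\sigma/\sqrt{N}##.

```python
import math
import random
import statistics

random.seed(42)
N = 100           # measurements per campaign
CAMPAIGNS = 2000  # number of repeated campaigns
SIGMA = 2.0       # true population standard deviation (assumed Gaussian)

# Collect the mean of each campaign
means = []
for _ in range(CAMPAIGNS):
    sample = [random.gauss(0.0, SIGMA) for _ in range(N)]
    means.append(statistics.fmean(sample))

# Empirical scatter of the campaign means vs. the sigma/sqrt(N) prediction
sd_of_means = statistics.stdev(means)
predicted = SIGMA / math.sqrt(N)
print(sd_of_means, predicted)  # both close to 0.2
```

The two printed numbers agree to within a few percent, which is exactly the ##1/\sqrt{N}## narrowing of the distribution of the mean.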
What is a “measurement campaign” and why do you need one at all?

In its simplest form, aren't you saying that this formula for a sample standard deviation is wrong?
## s = \sqrt {\frac {\sum_{i=1}^{N} {(x_i-\bar x)^2}}{N-1}} ##
What? Excuse me.... what??

What is a “measurement campaign” and why do you need one at all?
Well, what do you call it when you measure things? In US English? A measurement? We have different words in German, I guess. A "measurement" would be a single value, and this "campaign" would be a bunch of values. So please don't ask me about the correct English terms of measurement technology.
A measurement campaign is, for me, measuring several values in a row (under certain conditions, etc.).

Dale
Mentor
It is, tell me if I am wrong, as if you expect the measurement campaign to be N times identical, correct?
Yes, the formula only works if the N measurements are independent and identically distributed. That is usually an explicit assumption in the derivation, often phrased in terms of random sampling from a fixed large population.

FactChecker
Gold Member
Let us get to the last step, which is the one that made me wonder a lot. We take all ##s^2(y_k)## to be literally the same, as they are made of the same input values. It is, tell me if I am wrong, as if you expect the measurement campaign to be ##N## times identical, correct?

This is why you suddenly have such a simple formula. This is why suddenly $$s(\bar{x})=\frac{s(x_i)}{\sqrt{N}}$$
That equation does not require that the values of the sample are identical. It requires that they are all independent samples from the same distribution. Aren't you talking about independent samples? If they are from distributions that are correlated, then the formula must include the correlation coefficients.
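The effect of correlation can be made concrete with a toy model (my own construction, not from the thread: each campaign shares a common random offset, which correlates the measurements). In this sketch ##s(\bar x)## comes out much larger than ##s(x_i)/\sqrt{N}## and instead follows the formula that keeps the covariance contribution.

```python
import math
import random
import statistics

random.seed(1)
N = 50
CAMPAIGNS = 4000
SIG_C = 1.0  # shared offset per campaign -> correlation between measurements
SIG_E = 1.0  # independent noise per measurement

means = []
for _ in range(CAMPAIGNS):
    c = random.gauss(0.0, SIG_C)  # same offset for the whole campaign
    sample = [c + random.gauss(0.0, SIG_E) for _ in range(N)]
    means.append(statistics.fmean(sample))

sd_of_means = statistics.stdev(means)
naive = math.sqrt(SIG_C**2 + SIG_E**2) / math.sqrt(N)  # sigma/sqrt(N): wrong here
exact = math.sqrt(SIG_C**2 + SIG_E**2 / N)             # keeps the covariance term
print(sd_of_means, naive, exact)
```

With the shared offset, averaging hardly helps at all: the spread of the mean stays near ##\sigma_c## no matter how large ##N## gets.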

Omega0
Stephen Tashi
Now let us speak about the standard deviation of the mean value of arbitrary functions which have ##N## measured values each.
Let us get to the last step which is the one which made we wonder a lot. We take all s2(yk) to be literally the same, as they are made of the same input values.
You must distinguish between the mean value and standard deviation of a random variable versus the mean value and the standard deviation of a particular set of samples of that random variable.

Suppose the mean of random variable ##X## is ##\mu_X## and its variance is ##\sigma^2_X##. The mean ##M## of ##N## independent samples of that random variable is also a random variable. The mean of ##M## is ##\mu_X## and the variance of ##M## is ##\sigma^2_X/N##.

However, for a particular ##N## samples of ##X##, there is no guarantee that the mean of those ##N## values will be ##\mu_X## and no guarantee that the variance of those ##N## values will be ##\sigma^2_X##. In fact, for typical random variables, it's unlikely that sample statistics will exactly match population parameters.

So you can't formulate a proof that the variance of ##M## is ##\sigma_X^2/N## by imagining that you are working with ##N## particular values of ##X##.
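This distinction between population parameters and sample statistics is easy to see numerically. A hypothetical sketch (population, seed, and sizes are my own choices): five independent draws of ##N=20## samples from a population with ##\mu=5## and ##\sigma^2=4## give five different sample means and five different sample variances, none equal to the population values.

```python
import random
import statistics

random.seed(7)
MU, SIGMA = 5.0, 2.0
N = 20

sample_means = []
sample_vars = []
for _ in range(5):
    xs = [random.gauss(MU, SIGMA) for _ in range(N)]
    sample_means.append(statistics.fmean(xs))    # scatters around MU
    sample_vars.append(statistics.variance(xs))  # scatters around SIGMA**2

print(sample_means)
print(sample_vars)
```

Each run of 20 values produces its own statistics; only the *distribution* of the sample mean has standard deviation ##\sigma/\sqrt{N}##.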

Omega0
Look at the Central Limit Theorem.

Guys, could you give me an online source where the factor $$1/\sqrt{N}$$ for the standard deviation of the mean of one data sample is derived correctly?
Thanks.

Dale
Mentor
FactChecker
Gold Member
I'm afraid that that reference refers to other charts that prove the essential facts. I could not find those charts. I think this is more self-contained and complete: http://www.milefoot.com/math/stat/rv-sums.htm
(For continuous random variables, the summations signs are replaced by integrals.)
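The additivity of variances for independent variables, which that page derives, can also be checked with a short sketch (the distributions and sample size are my arbitrary choices):

```python
import random
import statistics

random.seed(3)
M = 20000
x = [random.gauss(0.0, 1.0) for _ in range(M)]     # Var(X) = 1
y = [random.uniform(-1.0, 1.0) for _ in range(M)]  # Var(Y) = 1/3, independent of x
s = [a + b for a, b in zip(x, y)]

vx = statistics.variance(x)
vy = statistics.variance(y)
vs = statistics.variance(s)
print(vx, vy, vs)  # vs is close to vx + vy
```

Because `x` and `y` are generated independently, the cross term is negligible and the variance of the sum matches the sum of the variances to within sampling noise.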

Omega0
Dale, that's it. Now I've got it. Thanks to you and the other guys, Stephen, FactChecker, etc.

To justify my stupid mistake a bit: in the (very good) book it says (see the derivation above; my translation): "Here the ##s^2\left(y_i\right)## are all the same because they are each built from the same input values."
What I absolutely didn't understand is that, naturally, they don't need to be built from the same input values; they only need to have the same ##s^2##.
This was my big mistake.
Having said this, I would have found it fair if the professors had remarked on this. It seems they expected knowledge of statistics above my level.

Thanks and sorry for my ignorance.

Stephen Tashi
Which formulae for population parameters also work for sample statistics that estimate them?

If we have random samples ##S_x##, ##S_y## of N things taken from each of two random variables ##X## and ##Y##, we can imagine the values in ##S_x## and ##S_y## to define an empirical joint probability distribution. So the sample mean of the pairwise sums of values in ##S_x## and ##S_y## should be the sum of their sample means, analogous (but not identical) to the fact that ##E(X+Y) = E(X) + E(Y)##.

However, if ##X## and ##Y## are independent random variables, there is no guarantee that the empirical distribution of N pairs of numbers of the form ##(x,y), x \in S_x, y\in S_y## will factor as a distribution of x-values times an (independent) distribution of y-values. So we can't conclude that the sample variance of ##x+y## is the sample variance of the x-values plus the sample variance of the y-values.

If, instead of N pairs of values, we looked at all ##N^2## possible values ##(x,y)##, we would get an empirical distribution where the x- and y-values are independent. Then something analogous to ##Var(X+Y) = Var(X) + Var(Y)## should work out. We would have to be specific about which definition of "sample variance" is used; for example, can we use the unbiased estimators of the population variances?
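That construction can be verified exactly with tiny lists, provided one uses the 1/N-normalized (population-style) empirical variance, which is exactly the "which definition of sample variance" caveat. A minimal sketch (the numbers are arbitrary):

```python
import statistics

x = [1.0, 2.0, 4.0, 8.0]
y = [0.5, 1.5, 2.5]

# All len(x) * len(y) pairwise sums: in this empirical joint distribution
# the x-values and y-values are independent by construction.
sums = [a + b for a in x for b in y]

# pvariance is the 1/N-normalized variance; with the unbiased (N-1)
# version the identity below would not hold exactly.
lhs = statistics.pvariance(sums)
rhs = statistics.pvariance(x) + statistics.pvariance(y)
print(lhs, rhs)  # equal up to float rounding
```

The identity is exact here (not just approximate) because the cross terms sum to zero over the full grid of pairs.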

FactChecker
Gold Member
I prefer to think that (using zero-mean variables to keep the equations simple):
$$\sigma^2_{X+Y} = E( (X+Y)^2 ) = E( X^2 + 2XY + Y^2) = E(X^2) + 2E(XY) + E(Y^2)$$ $$= \sigma^2_X + 2\,\mathrm{cov}(X,Y) + \sigma^2_Y$$
So the independence of ##X## and ##Y## implies the desired result.
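The same decomposition holds exactly for sample statistics too, as long as the variance and the covariance use the same normalization. A quick sketch with deliberately correlated data (my own construction; the covariance is computed by hand with the (M-1) normalization to match `statistics.variance`):

```python
import random
import statistics

random.seed(5)
M = 10000
x = [random.gauss(0.0, 1.0) for _ in range(M)]
y = [0.5 * a + random.gauss(0.0, 1.0) for a in x]  # correlated with x
s = [a + b for a, b in zip(x, y)]

mx, my = statistics.fmean(x), statistics.fmean(y)
cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (M - 1)

# Var(x + y) = Var(x) + 2 cov(x, y) + Var(y), term by term
lhs = statistics.variance(s)
rhs = statistics.variance(x) + 2.0 * cov + statistics.variance(y)
print(lhs, rhs, cov)
```

Dropping the ##2\,\mathrm{cov}## term is only justified when the variables are independent (or at least uncorrelated).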

Stephen Tashi
So the independence of ##X## and ##Y## implies the desired result.
##E(X^2) = \sigma_X^2 ## in the case of a random variable with zero mean. For a set of data ##x_i## realized from such a random variable, we don't necessarily have a zero sample mean.
