How Does Sampling Strategy Impact Measurement Accuracy in Statistics?

fog37
Hello Forum,

I am taking a lab course and we are learning about measurement and uncertainty. Suppose we have to measure the length L of an object. Once the data have been collected, we can calculate the mean (average) and the standard deviation s. The result would then be reported as [mean +- s/sqrt(N)] unit, where N is the number of measurements collected.
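
As a minimal sketch of the calculation described above, the following Python snippet reports a length as mean +- s/sqrt(N). The readings are made-up illustrative values, and `report_measurement` is a hypothetical helper name, not something taken from the lab.

```python
import math
import statistics

def report_measurement(readings):
    """Return (mean, standard error of the mean) for a list of readings."""
    n = len(readings)
    mean = statistics.mean(readings)
    s = statistics.stdev(readings)  # sample standard deviation (N - 1 in the denominator)
    return mean, s / math.sqrt(n)

# Hypothetical length readings in cm (illustrative values only).
lengths_cm = [12.31, 12.28, 12.35, 12.30, 12.33, 12.29, 12.32, 12.34]
mean, sem = report_measurement(lengths_cm)
print(f"L = {mean:.3f} +- {sem:.3f} cm")
```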

In statistics, it is common practice to collect multiple samples and obtain statistics from the sampling distribution. But this approach does not seem to be used when measuring a physical variable. Why?
Would it be better, from a statistical standpoint, to collect a single sample of N=100 measurements or M=10 samples each containing N=10 measurements? In both cases the total number of measurements is 100...

Thanks,
fog37
 
I'm not 100% sure I understand what you're asking but here goes...

In statistics it may be helpful to consider multiple samples to better understand the sampling distribution, but as you say, one sample of N=100 and ten samples of N=10 are equivalent amounts of data. The rule in statistics is that when we add independent random variables, we add their variances (the squares of the standard deviations). Thus when we average (1/N times the sum), the variance of the sum, N S^2, gets divided by N^2, and the standard deviation of the mean is S_{mean} = \frac{S}{\sqrt{N}}.
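
To spell out that step, here is a short derivation, assuming the N measurements X_i are independent and each has the same standard deviation S:

```latex
\mathrm{Var}\left(\overline{X}\right)
  = \mathrm{Var}\left(\frac{1}{N}\sum_{i=1}^{N} X_i\right)
  = \frac{1}{N^2}\sum_{i=1}^{N}\mathrm{Var}(X_i)
  = \frac{N S^2}{N^2}
  = \frac{S^2}{N}
\quad\Longrightarrow\quad
S_{mean} = \frac{S}{\sqrt{N}}
```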

Consider then your single N=100 case vs. your ten N=10 cases. Assuming the sample standard deviations of the ten samples are about equal and close to the population standard deviation S, each sample mean has standard deviation S/\sqrt{10}, and averaging the ten means gives:
S_{mean} = \frac{ \frac{S}{ \sqrt{10}} }{\sqrt{10}} = \frac{S}{\sqrt{10}\cdot \sqrt{10}}= \frac{S}{\sqrt{100}}
so the standard deviation calculation is the same.
[There's additional statistical analysis we could do with the variations in the sample standard deviations, but it should all balance out as your intuition suggests. Fundamentally, what your intuition is telling you is correct: partitioning the sample doesn't change the information it contains.]
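
A quick numerical check of this equivalence, as a sketch: simulated readings are drawn from a normal distribution, then analyzed once as a single sample of 100 and once as ten samples of 10. The true length, the noise level, and all variable names are illustrative assumptions.

```python
import math
import random
import statistics

rng = random.Random(0)            # fixed seed so the run is repeatable
TRUE_LENGTH, NOISE = 10.0, 0.5    # assumed values, for illustration only
data = [rng.gauss(TRUE_LENGTH, NOISE) for _ in range(100)]

# Option 1: one sample of N = 100.
grand_mean = statistics.mean(data)
sem_single = statistics.stdev(data) / math.sqrt(100)

# Option 2: M = 10 samples of N = 10, then average the ten means.
groups = [data[i:i + 10] for i in range(0, 100, 10)]
group_means = [statistics.mean(g) for g in groups]
mean_of_means = statistics.mean(group_means)
# The spread of the ten means estimates S/sqrt(10); dividing by sqrt(10)
# again gives the uncertainty of their average.
sem_grouped = statistics.stdev(group_means) / math.sqrt(10)

print(f"single sample : {grand_mean:.4f} +- {sem_single:.4f}")
print(f"ten samples   : {mean_of_means:.4f} +- {sem_grouped:.4f}")
```

With equal group sizes the two means agree up to rounding. The two uncertainties are both estimates of S/sqrt(100), but they will differ somewhat from run to run because the grouped one is built from only ten numbers.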

As you move deeper into the probability theory of the sampling distribution, you'll find that the Central Limit Theorem applies, and you can describe the shape of the distribution of the sample mean you use for your measurement (for large N it is approximately normal, with the familiar bell-shaped density curve).

What may be confusing you is that to talk about the sampling distribution of your mean measurement, one must speak of how it behaves under many samplings. One is abstracting the random experiment of measuring the system once to a higher-level random experiment: taking N measurements and calculating their mean value. To describe this second random experiment we need a (meta)sample, i.e. many N-measurement samples. Once we consider this, we can describe the aggregate behavior of such samples and thus make quantitative statements about the probability that your mean value deviates by a certain amount from the actual physical value it seeks to measure.

This is where the S/sqrt(N) formula comes from... in particular:
\sigma_{\overline{X}} = \frac{1}{\sqrt{N}} \sigma_X
where \overline{X} is the mean value of N independent random variables each with standard deviation \sigma_X.

One is here describing how the mean value behaves over many samples in terms of how the original variable behaves over a single sample.
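
The relation can be checked with a small simulation, sketched below under the assumption of normally distributed measurement noise. Many N-measurement samples are generated, and the spread of their means is compared with \sigma_X / \sqrt{N}. The function name `spread_of_sample_means` and the parameter values are hypothetical choices for illustration.

```python
import math
import random
import statistics

def spread_of_sample_means(n, trials, sigma=1.0, mu=0.0, seed=1):
    """Standard deviation of the mean over many simulated n-measurement samples."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)

sigma = 1.0  # assumed spread of a single measurement
for n in (4, 16, 64):
    observed = spread_of_sample_means(n, trials=2000, sigma=sigma)
    predicted = sigma / math.sqrt(n)
    print(f"N={n:3d}  observed={observed:.4f}  predicted={predicted:.4f}")
```

In a real lab only one such sample is usually taken, so the spread of the mean is not observed directly; it is inferred from the single sample's standard deviation through the formula above.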
 