Measurement and Statistics

  1. Jan 19, 2016 #1
    Hello Forum,

    I am taking a lab and we are learning about measurement and uncertainty. Suppose we have to measure the length L of an object. Once the data has been collected, we can calculate the mean (average) and the standard deviation s. The resulting measurement would be expressed as (mean ± s/sqrt(N)) unit, where N is the number of collected measurements.

    In statistics, it is common practice to collect multiple samples and obtain statistics from the sampling distribution. But this approach does not seem to apply in the context of measuring a physical variable. Why?
    Would it be better, from a statistical standpoint, to collect a single sample of N=100 measurements or M=10 samples each containing N=10 measurements? In both cases the total number of measurements is 100...

    Thanks,
    fog37
     
  2. Jan 20, 2016 #2

    jambaugh

    Science Advisor
    Gold Member

    I'm not 100% sure I understand what you're asking but here goes...

    In statistics it can help to consider multiple samples in order to understand the sampling distribution, but as you say, one sample of N=100 and ten samples of N=10 contain the same amount of data. The rule is that when we add independent random variables, their variances (the squares of their standard deviations) add. Thus when we average (multiply the sum by 1/N), the standard deviation of the mean comes out as [itex]S_{mean} = \frac{S}{\sqrt{N}}[/itex].
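
    To spell out the step from "variances add" to the square-root factor, here is a short derivation. It assumes the N measurements [itex]X_i[/itex] are independent and share one standard deviation, written [itex]\sigma_X[/itex] here (the quantity that S estimates):
    [tex] \mathrm{Var}(\overline{X}) = \mathrm{Var}\left(\frac{1}{N}\sum_{i=1}^{N} X_i\right) = \frac{1}{N^2}\sum_{i=1}^{N}\mathrm{Var}(X_i) = \frac{N\sigma_X^2}{N^2} = \frac{\sigma_X^2}{N}[/tex]
    Taking the square root gives [itex]\sigma_X/\sqrt{N}[/itex]. The factor [itex]1/N^2[/itex] appears because scaling a random variable by a constant scales its variance by the square of that constant.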

    Consider then your one N=100 case versus ten N=10 cases. Assuming the sample standard deviations of the ten samples are about equal and close to the population standard deviation S, each sample mean has standard deviation S/sqrt(10), and averaging the ten means divides by sqrt(10) once more:
    [tex] S_{mean} = \frac{ \frac{S}{ \sqrt{10}} }{\sqrt{10}} = \frac{S}{\sqrt{10}\cdot \sqrt{10}}= \frac{S}{\sqrt{100}}[/tex]
    so the standard deviation of the final mean is the same either way.
    [There's additional statistical analysis we could do with the variation among the sample standard deviations, but it should all balance out as your intuition suggests. Fundamentally, your intuition is correct: partitioning the sample doesn't change the information it contains.]
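
    A quick numerical check of this equivalence, as a minimal sketch rather than a prescribed lab procedure. The simulated readings (true length 25.0, scatter 0.5, normally distributed) and all variable names are assumptions made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 100 length readings with assumed true value 25.0
# and assumed single-measurement scatter 0.5 (arbitrary units).
data = rng.normal(loc=25.0, scale=0.5, size=100)

# Option A: treat everything as one sample of N = 100.
mean_a = data.mean()
sem_a = data.std(ddof=1) / np.sqrt(data.size)

# Option B: split into ten samples of N = 10 and average the ten means.
groups = data.reshape(10, 10)
group_means = groups.mean(axis=1)
mean_b = group_means.mean()

# Each group mean has standard error s_i / sqrt(10). The variances of the
# ten independent group means add, and averaging divides the sum by 10.
group_sems = groups.std(axis=1, ddof=1) / np.sqrt(10)
sem_b = np.sqrt(np.sum(group_sems**2)) / 10

print(f"one sample of 100 : {mean_a:.4f} +/- {sem_a:.4f}")
print(f"ten samples of 10 : {mean_b:.4f} +/- {sem_b:.4f}")
```

    The two means agree exactly (up to rounding), because the mean of ten equal-sized group means is the overall mean. The two uncertainties agree only approximately, since option B pools the within-group scatter while option A also includes the scatter between groups.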

    As you move deeper into the probability theory of the sampling distribution, you'll note that the Central Limit Theorem applies, and you can describe the shape of the distribution of the sample mean you use for your measurement: for large enough N it is approximately normal, with the familiar bell-shaped density curve, even if the individual measurements are not normally distributed.
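
    The Central Limit Theorem can be illustrated with a small simulation. This is only a sketch: the exponential distribution stands in for some strongly skewed measurement process (an assumption, not anything from the lab), and `skewness` is a helper defined here for the purpose:

```python
import numpy as np

def skewness(x):
    """Sample skewness: the mean cubed z-score (0 for a symmetric distribution)."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z**3))

rng = np.random.default_rng(2)
n = 50             # measurements per sample
n_samples = 20000  # number of repeated samples

# Strongly skewed single measurements (exponential, chosen for illustration).
raw = rng.exponential(scale=1.0, size=(n_samples, n))
sample_means = raw.mean(axis=1)

raw_skew = skewness(raw.ravel())
mean_skew = skewness(sample_means)

print(f"skewness of single measurements: {raw_skew:.2f}")
print(f"skewness of sample means (N={n}): {mean_skew:.2f}")
```

    The single measurements are far from symmetric, while the sample means are much closer to the symmetric bell shape. Increasing `n` should push the second number further toward zero.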

    What may be confusing you is that to talk about the sampling distribution of your mean measurement, one must speak of how it behaves under many samplings. One is abstracting the random experiment of measuring the system once, one level higher, to the random experiment of making N measurements and calculating their mean value. To describe this second random experiment we need a (meta)sample, i.e. many N-measurement samples. Once we consider this, we can describe the aggregate behavior of such samples and thus make quantitative statements about the probability that your mean value deviates by a certain amount from the actual physical value it seeks to measure.

    This is where the S/sqrt(N) formula comes from... in particular:
    [tex] \sigma_{\overline{X}} = \frac{1}{\sqrt{N}} \sigma_X[/tex]
    where [itex]\overline{X}[/itex] is the mean value of N independent random variables each with standard deviation [itex]\sigma_X[/itex].

    One is here describing how the mean value behaves over many samples in terms of how the original variable behaves over a single sample.
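
    The "many samples" picture can be tried directly by simulation. A minimal sketch, assuming normally distributed measurements with made-up values for the true mean (10.0) and single-measurement standard deviation (2.0):

```python
import numpy as np

rng = np.random.default_rng(1)

sigma_x = 2.0      # assumed standard deviation of a single measurement
n = 25             # measurements per sample
n_samples = 20000  # number of repeated N-measurement samples (the meta-sample)

# Each row is one N-measurement sample; each row yields one mean value.
samples = rng.normal(loc=10.0, scale=sigma_x, size=(n_samples, n))
sample_means = samples.mean(axis=1)

observed = sample_means.std(ddof=1)  # spread of the means over many samples
predicted = sigma_x / np.sqrt(n)     # sigma_X / sqrt(N)

print(f"observed spread of the mean : {observed:.4f}")
print(f"predicted sigma_X / sqrt(N) : {predicted:.4f}")
```

    In a real lab one usually has only a single sample, so [itex]\sigma_X[/itex] is replaced by the sample standard deviation s, which gives the s/sqrt(N) quoted in the first post.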
     