
Data repeatability (statistics question)

  1. Apr 13, 2013 #1
    1. The problem statement, all variables and given/known data
    I am trying to see if two sets of data represent the same values or not. I have:
    Mean1 = 9.3155, stdev1 = 0.1334; mean2 = 9.3040, stdev2 = 0.1248;
    N1 = N2 = 1000;
    I got these values from my data using MATLAB (std() and mean());

    2. Relevant equations

    [itex]z = \frac{(mean1-mean2)}{\sqrt{stdev1^{2}/N1^{2}+stdev2^{2}/N2^{2}}}[/itex]

    3. The attempt at a solution

    Null hypothesis: Sets are different.
    Alternative: Sets are the same.

    Using the formula above I get z score of 63, which accepts my Null hypothesis that the two series are different.

    However, I don't really understand why they would be considered different, given the fairly large standard deviations and the close means. The way I think of it: the second mean lies within mean1 +/- stdev1, so shouldn't the z score be smaller?

    Statistics isn't my strong suit, and this is for an electronics thing, but I'm curious where exactly my thinking goes wrong.
     
  2. Apr 14, 2013 #2
    The standard deviation s of a distribution describes how individual entries scatter. It does not describe the uncertainty u with which the mean is determined. The mean is determined from N numbers, not a single one, so its statistical uncertainty u is smaller than s, namely u² = s²/N (*). What you essentially want to do is compare the difference between the means with the statistical inaccuracies you expect for them, not with the width of the distributions. Comparing against the width would instead tell you to what extent you could take a single number and tell which of the two probability distributions it probably belongs to (assuming the distributions are different, of course).
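
    As a rough sketch of that comparison, here is the calculation in Python using the numbers from the first post. The function names are made up for illustration, and the sketch assumes the two sets are independent, so that the uncertainties of the two means add in quadrature.

```python
import math


def standard_error(stdev, n):
    """Uncertainty u of a mean built from n entries with scatter stdev (u^2 = s^2 / n)."""
    return stdev / math.sqrt(n)


def difference_in_uncertainties(mean1, stdev1, n1, mean2, stdev2, n2):
    """Difference of the two means, in units of its combined uncertainty.

    Assumes the two data sets are independent, so the uncertainties
    of the means add in quadrature.
    """
    u1 = standard_error(stdev1, n1)
    u2 = standard_error(stdev2, n2)
    combined = math.sqrt(u1**2 + u2**2)
    return (mean1 - mean2) / combined


# Values quoted in the first post
mean1, stdev1, n1 = 9.3155, 0.1334, 1000
mean2, stdev2, n2 = 9.3040, 0.1248, 1000

u1 = standard_error(stdev1, n1)
u2 = standard_error(stdev2, n2)
z = difference_in_uncertainties(mean1, stdev1, n1, mean2, stdev2, n2)

print(f"uncertainty of mean1: {u1:.5f}")
print(f"uncertainty of mean2: {u2:.5f}")
print(f"difference of means: {z:.2f} combined uncertainties")
```

    With these inputs the means differ by roughly two combined uncertainties, even though the difference is less than a tenth of either standard deviation.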

    Btw.: Excellent question. It's nice to see when students question results that seem wrong to them.


    (*): Note by comparison that your equation for the not-further-specified "z" is a bit fishy: it divides each variance by N², whereas u² = s²/N has a single N. Also note that it is a good habit to define or explain the terms you use. Just because everyone in your class knows what your teacher means by "z" does not imply that everyone around the world does.
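
    For concreteness, and assuming "z" is meant to be the usual two-sample statistic (the difference of the means divided by its combined uncertainty), putting u² = s²/N under the square root would give

    [itex]z = \frac{(mean1-mean2)}{\sqrt{stdev1^{2}/N1+stdev2^{2}/N2}}[/itex]

    With N1 = N2 = N, the posted form with N² in the denominators is larger than this by a factor of √N:

    [itex]z_{posted} = \sqrt{N} \cdot z \approx 31.6 \cdot z[/itex]

    Under that assumption, the posted value of 63 corresponds to z ≈ 2.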
     