
Am I punctual?

  1. May 25, 2010 #1

    I have the electronic record of my time sheets at my work, for a two year period. Every day when I reach and leave the factory, I swipe my identity card through a machine which records the time.

    I generally reach work at 8.00 am, sometimes early, sometimes late. I wish to know if I am a punctual person, statistically – on average, do I reach the office at 8.00 am or before?

    I have about 500 data points. I am thinking of the following three approaches. I am not sure which one is the best.

    1. Take a simple average of all the 500 arrival times. The average will tell me if I am punctual.

    2. Take a sample of 50 data points, calculate the sample mean, calculate the sample variance and estimate the population variance. Assume that the population mean is 8.00 am. And see if the sample mean is within 3 standard deviations (population’s SD) from the population mean.

    3. Treat the 500 data points as a sample and follow the steps outlined in the #2 approach.

    Which one do you think is the best way and why?
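
For reference, approach 1 could be sketched in Python roughly as follows. This is only an illustration: the five swipe-in times are made up, and the helper name `minutes_after_eight` is hypothetical. Each arrival is converted to minutes relative to 8:00 (negative means early), so the sign of the average answers the "on average" question.

```python
from statistics import mean

def minutes_after_eight(hhmm: str) -> int:
    """Convert an 'HH:MM' string to signed minutes relative to 08:00."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes) - 8 * 60

# Hypothetical swipe-in times; the real record would have about 500 entries.
arrivals = ["07:55", "08:03", "07:58", "08:10", "07:50"]

offsets = [minutes_after_eight(t) for t in arrivals]
average_offset = mean(offsets)

# A negative or zero average means "on time or early, on average".
print(f"average offset: {average_offset:+.1f} minutes")
```

The average alone says nothing about how much day-to-day variation lies behind it, which is the gap approaches 2 and 3 try to fill.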


  3. May 25, 2010 #2
    Hmm, I'd go with option 3. It will give you the most accurate result.
  4. May 26, 2010 #3
    Do you mind explaining a bit more? Thanks.
  5. May 26, 2010 #4
    Well, let's look at option 1. You are finding the mean, but this is strongly affected by outliers in the data. In option 2 you are essentially taking a sample of size 50 and doing a hypothesis test based on that sample, and in option 3 you are doing the same thing with a larger sample. Either way, the hypothesis test tells you more than just finding the mean, because it also accounts for how much your arrival times vary. However, because we do not know whether your times are normally distributed, we have to apply the central limit theorem. (If you do not know about this I suggest you read up on it somewhere; it's a simple concept.) The theorem implies that the larger the sample, the closer the distribution of the sample mean comes to a normal curve, whatever the distribution of the individual times. Hence the larger the sample you take, the more accurate your result will be. Of course it will be more time-consuming to find your variance etc., but if accuracy is your aim then by all means.
    (I apologize if my explanation seems a little haywire, but my English isn't the best. I'll clear up whatever is unclear, just ask.)
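
A minimal sketch of the test behind option 3, assuming the arrival times have already been converted to minutes relative to 8:00 (negative means early). The function name `late_test`, the 5% significance level, and the sample data are all assumptions made for illustration. It uses the normal approximation that the central limit theorem justifies for a large sample, and it compares the sample mean against the standard error of the mean (standard deviation divided by the square root of n), not against the raw standard deviation.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def late_test(offsets, alpha=0.05):
    """One-sided large-sample z-test.

    H0: mean offset <= 0 (punctual on average)
    H1: mean offset > 0  (late on average)
    Returns (z, p_value, reject_h0). Needs at least two values that are not all equal.
    """
    n = len(offsets)
    sample_mean = mean(offsets)
    standard_error = stdev(offsets) / sqrt(n)  # spread of the sample mean, not of the data
    z = sample_mean / standard_error
    p_value = 1.0 - NormalDist().cdf(z)  # upper-tail probability under H0
    return z, p_value, p_value < alpha

# Hypothetical offsets in minutes (repeated only to get a larger sample for the demo).
offsets = [-5, 3, -2, 10, -10] * 20
z, p_value, reject = late_test(offsets)
print(f"z = {z:.2f}, p = {p_value:.3f}, evidence of lateness: {reject}")
```

With real data the 500 recorded offsets would replace the made-up list. Failing to reject H0 only means the data show no clear evidence of being late on average.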