Comparing Approaches for Determining Punctuality

AI Thread Summary
To determine punctuality from a two-year record of arrival times, three statistical approaches are considered. The first computes a simple average of all 500 data points, but this can be skewed by outliers. The second performs a hypothesis test on a sample of 50 data points, which is more informative than a raw average. The third, recommended by participants, treats all 500 data points as the sample, leveraging the central limit theorem for a more accurate estimate of the mean and variance. Using the full sample is deemed the best approach for a reliable assessment of punctuality.
musicgold
Hi,

I have the electronic record of my time sheets at work for a two-year period. Every day when I arrive at and leave the factory, I swipe my identity card through a machine that records the time.

I generally arrive at work at 8:00 am, sometimes early, sometimes late. I wish to know whether I am statistically punctual: on average, do I reach the office at or before 8:00 am?

I have about 500 data points and am considering the following three approaches, but I am not sure which one is best.

1. Take a simple average of all 500 arrival times; the average will tell me if I am punctual (see the sketch after this list).

2. Take a sample of 50 data points, calculate the sample mean and the sample variance, and use the latter to estimate the population variance. Assume the population mean is 8:00 am, and see whether the sample mean is within 3 standard errors (the population SD divided by √50) of the population mean.

3. Treat all 500 data points as the sample and follow the steps outlined in approach 2.
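
For concreteness, here is a minimal Python sketch of the bookkeeping all three approaches share, applied to approach 1. The HH:MM string format, the helper name, and the four example times are illustrative assumptions, not taken from the actual records.

from datetime import datetime

def minutes_past_eight(timestamp_str):
    # Parse a time like "08:07" and return minutes after 8:00 am
    # (negative means early).
    t = datetime.strptime(timestamp_str, "%H:%M")
    return (t.hour - 8) * 60 + t.minute

# Approach 1: a simple average of the arrival offsets.
arrivals = ["07:55", "08:03", "07:58", "08:10"]  # stand-in for the 500 records
offsets = [minutes_past_eight(t) for t in arrivals]
mean_offset = sum(offsets) / len(offsets)
print(f"average arrival: {mean_offset:+.1f} minutes relative to 8:00 am")

With the offsets in hand, approaches 2 and 3 differ only in how many of them are fed into the hypothesis test.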

Which one do you think is the best way, and why?

Thanks,

MG.
 
Hmm, I'd go with option 3. It will give you the most accurate result.
 
Do you mind explaining a bit more? Thanks.
 
Well, let's look at option 1: you are finding the mean, but the mean is strongly affected by outliers in the data. In option 2 you are essentially taking a sample of size 50 and doing a hypothesis test on it, and in option 3 you are doing the same thing with a larger sample. Either way, the hypothesis test tells you more than just computing the mean. However, because we do not know whether your arrival times are normally distributed, we have to rely on the central limit theorem (if you do not know it, I suggest reading up on it somewhere; it's a simple concept). The theorem implies that the larger the sample, the closer the distribution of the sample mean gets to a normal curve, so the larger sample gives you the more accurate result. Of course it will be more time-consuming to find your variance over 500 points, but if accuracy is your aim, then by all means use them all. (A quick numerical sketch follows below.)
(I apologize if my explanation seems a little haywire; my English isn't the best. I'll clear up whatever, just ask.)
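
To make the option-3 recipe concrete, here is a minimal Python sketch using the 3-standard-error cutoff proposed in the question. The offsets here are simulated stand-ins; in practice the 500 real arrival offsets (minutes relative to 8:00 am, as above) would be used.

import math
import random

random.seed(0)
offsets = [random.gauss(-1.0, 6.0) for _ in range(500)]  # simulated stand-in data

n = len(offsets)
mean = sum(offsets) / n
# Sample variance with Bessel's correction estimates the population variance.
var = sum((x - mean) ** 2 for x in offsets) / (n - 1)
std_err = math.sqrt(var / n)  # standard error of the mean, per the CLT

# z-statistic against the hypothesized mean of 0, i.e. arriving exactly at 8:00 am.
z = (mean - 0.0) / std_err
print(f"mean offset = {mean:+.2f} min, z = {z:.2f}")
if abs(z) <= 3:
    print("within 3 standard errors of 8:00 am: consistent with punctuality")
else:
    print("more than 3 standard errors away: average arrival differs from 8:00 am")

With n = 500, the central limit theorem makes the normal approximation for the sample mean quite safe even if the individual arrival times are skewed, which is exactly why option 3 beats option 2.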
 