Calculating uncertainty for a small sample

In summary, the author is trying to determine how to calculate the uncertainty associated with the average time for a set of data. He is unsure about how to do this for such a small data set.
  • #1
Taylor_1989

Homework Statement


I am currently doing a lab experiment on viscosity: dropping a glass ball through a tube full of sugar solution and recording the time for the ball to travel a distance of 100 mm. My issue is, for such a small data set (i.e. 3 trials), how do I calculate the uncertainty associated with the average? I need to plot the results on a graph, and this uncertainty will be my error bars.

A sample of my first set of data is as follows:

Measurement No. | Time (s) | Reaction time (s)
1               | 10.0491  | 0.3
2               | 10.0581  | 0.3
3               | 9.3521   | 0.3

So my average time is 9.8198 s. For such a small data set, should I just use the reaction time as the uncertainty associated with the average time?

Could someone please explain the best course of action to take here?

 
  • #2
I think the uncertainty is greater than 0.3, since the difference between the measurements is more like 0.7. Maybe you can find a formula for the standard deviation in terms of the squares of the differences from the mean.
 
  • #3
Gene Naden said:
I think the uncertainty is greater than 0.3, since the difference between the measurements is more like 0.7. Maybe you can find a formula for the standard deviation in terms of the squares of the differences from the mean.

Yes, I can see what you mean by the 0.7. The issue is that temperature is playing a role, heating the sugar solution, so on my last timed trial I think the temperature increased slightly, resulting in a shorter time interval.

Can you please elaborate on the standard deviation method you mentioned? I am kind of learning stats as I go.
 
  • #4
Thanks for your reply. The standard deviation is a measure of the spread in data values. It compares all the data to the mean. The first step is to calculate the mean (the simple average). Then you look at how each data point differs from the mean. It doesn't matter whether a data point is higher than the mean or lower; what counts is the size of the difference (in the formula, the differences are squared). Wikipedia says the standard deviation, or sigma, is "a measure that is used to quantify the amount of variation or dispersion of a set of data values." You shouldn't have to dig very deep to find the formula.
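The steps described above can be sketched in a few lines of Python, using the three trial times from the opening post (reading the table's colons as decimal points, which matches the quoted average):

```python
import math
import statistics

# Trial times from the opening post, in seconds (assuming the colons
# in the table are decimal points)
times = [10.0491, 10.0581, 9.3521]

mean = statistics.mean(times)            # step 1: the simple average
deviations = [t - mean for t in times]   # step 2: differences from the mean
# The standard deviation squares the deviations, averages them
# (dividing by n - 1 for a sample), and takes the square root.
s = statistics.stdev(times)

print(f"mean = {mean:.4f} s, sample standard deviation = {s:.3f} s")
```

Note that the sample standard deviation comes out around 0.4 s, larger than the 0.3 s reaction-time estimate, consistent with the comment in post #2.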
 
  • #5
I have just been doing some quick reading myself. Would the average standard deviation be more appropriate, because I can only say with a 50% confidence level that my results will be within the given range? I mean, the average deviation gave me 0.31 s. Or have I done what many newbies do in stats and completely missed the point?
 
  • #6
Thanks for your reply. I don't think "average standard deviation" means anything. I have never seen the term in my statistics books. You could compute the "average deviation" but that is perhaps more crude than the standard deviation. What you want to capture is the variation in your data. Remember what I quoted from Wikipedia, standard deviation "is a measure that is used to quantify the amount of variation or dispersion of a set of data values."
 
  • #7
Okay, sorry for the misspelling, I meant average deviation. Erm, so what I think you are saying, from what you quoted, is in essence the precision; this is what the standard deviation would give me?
 
  • #8
Assuming the data is in the cells A1, A2, and A3 of a spreadsheet I usually recommend using the spreadsheet formula:

=stdev(A1:A3)/sqrt(count(A1:A3))

This computes the standard error of the mean (SEM). Using stdev() rather than stdevp() is based on the idea that the data is a sample rather than the whole population; stdev() is also larger than stdevp(), so it is a bit more conservative if you are unsure.

See: https://en.wikipedia.org/wiki/Standard_error
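The spreadsheet formula above has a direct Python equivalent: stdev() corresponds to statistics.stdev (sample, n - 1 divisor) and stdevp() to statistics.pstdev (population, n divisor). A sketch using the thread's three times:

```python
import math
import statistics

# Trial times from the opening post (decimal-point reading assumed)
times = [10.0491, 10.0581, 9.3521]

# Equivalent of =stdev(A1:A3)/sqrt(count(A1:A3))
sem = statistics.stdev(times) / math.sqrt(len(times))

# pstdev() is the stdevp() counterpart; it gives a smaller, less
# conservative value
sem_pop = statistics.pstdev(times) / math.sqrt(len(times))

print(f"SEM (sample std) = {sem:.3f} s")
```

For these numbers the SEM works out to roughly 0.23 s, smaller than the standard deviation itself, as noted in the next post.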
 
  • #9
So Dr. Courtney's formula is different from what I was hinting at. It gives a smaller value. Dr. Courtney says you want standard error of the mean rather than standard deviation. So that is what you should use.
 
  • #10
Taylor_1989 said:
the uncertainty associated with the average, as I need to plot the results on a graph and this uncertainty will be my error bars.
No, that's not how you get error bars.

"the uncertainty associated with the average" is the standard error of the mean, and Dr Courtney has told you how to get that. However, that process assumes each individual datapoint is completely accurate. E.g., if you conduct a poll on attitudes to a proposal, each individual response is considered accurate, and the SEM tells you how surely the average response reflects the population as a whole.

Error bars should show the uncertainty inherent in the individual measurement. It is not affected by other measurements in your dataset. If you are measuring a distance with a ruler, there is some uncertainty from visual parallax, and some from the granularity of the ruler.
In your particular experiment you mention temperature variation. That is awkward. You can estimate what the temperature variation might be, but without other data you cannot tell how that translates into variations in descent time.

Assuming you can figure out the inherent uncertainty in individual measurements, this can be factored into the calculation of the SEM.
A pathological example: your underling provides a set of a thousand measurements of weights of packets of biscuits coming off a production line. They are supposed to be 100g each. The variation is rather high, but with a thousand measurements the SEM is quite small, so the average weight seems to be accurately determined. However, you discover that the numbers came from a machine that only measures in whole units of 30g. That makes the error bars on the individual measurements rather long, and leads to a much larger SEM.
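One common simplification of the point above, assuming the per-measurement uncertainty is independent of the trial-to-trial scatter (and not already reflected in it, which is an assumption that can fail), is to combine the two contributions in quadrature. This is a sketch, not a rigorous prescription:

```python
import math

def combined_sem(sample_std, n, per_measurement_u):
    """Combine trial-to-trial scatter with a known per-measurement
    uncertainty; both contributions average down as 1/sqrt(n)."""
    sem_stat = sample_std / math.sqrt(n)         # scatter between trials
    sem_inst = per_measurement_u / math.sqrt(n)  # instrument/observer term
    # Quadrature sum of independent contributions (assumption)
    return math.sqrt(sem_stat**2 + sem_inst**2)

# Illustrative numbers from this thread: s ~ 0.405 s over 3 trials,
# plus a 0.3 s reaction-time uncertainty per measurement
print(round(combined_sem(0.405, 3, 0.3), 3))
```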
 
  • #11
I thought this was a simple question but it now seems rather complicated. It sounds to me like the mentors are implying that your original idea to use reaction time as the error bar was the best choice. And it seems clear to me that my assumption that you should use some kind of measure of the variation of your data to determine the error bars was, well, not what the experimenters would do. I apologize for misleading you.
 
  • #12
Gene Naden said:
use reaction time as the error bar
Not sure about that either. It depends exactly how the experiment is being conducted, and what the 0.3 seconds means.
If the reaction time applies equally to the start and end of the timing then it is only the variation in reaction time that matters, probably a lot less than 0.3 s.
Similarly, if the reaction time applies only to, say, stopping the timer, and is a reasonable estimate, then it could be subtracted from all the timings. Thus the error would be the difference between 0.3s and the actual reaction time.
 
  • #13
In experiments with small sample sizes, estimating uncertainties is as much art as science.

The question of "how accurate are the error bars?" is much harder than "what is an estimate of the uncertainty in the mean?"

My advice for newbies is to use several plausible approaches to estimate the uncertainty and then consider that the uncertainty probably lies within the range of those estimates, or at worst within a factor of two of the worst case. But this is based more on decades of experience than on rigorous certainty.
 
  • #14
First of all, thank you for the advice, much appreciated; I have tried to take on board what you have said. After thinking about it and reading the comments, the only thing I can be sure of is the variation in my reaction time, as that is the only thing I could calculate for certain. The method I used to obtain the value for the variation in my reaction time was as follows.

I took 40 trials on an online app measuring my reaction time. I then removed any outliers via a box plot (it was trivial, but I wanted to practice and improve my stats). I then took the standard deviation and calculated the standard error, which came to 5 ms, or 0.005 s.
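The outlier-removal step described here can be sketched with the standard box-plot (1.5 × IQR) rule; the reaction-time numbers below are made up for illustration, not the poster's actual data:

```python
import math
import statistics

def iqr_filter(data):
    """Drop points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] (box-plot rule)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if lo <= x <= hi]

def sem(data):
    """Standard error of the mean from a sample."""
    return statistics.stdev(data) / math.sqrt(len(data))

# Hypothetical reaction times in ms; 700 is an obvious lapse
reactions = [250, 260, 255, 700, 248, 252]
kept = iqr_filter(reactions)
print(kept, round(sem(kept), 1))
```

With the 700 ms lapse removed, the SEM of the remaining points drops to a few milliseconds, the same order as the 5 ms quoted above.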
 
  • #15
Taylor_1989 said:
First of all, thank you for the advice, much appreciated; I have tried to take on board what you have said. After thinking about it and reading the comments, the only thing I can be sure of is the variation in my reaction time, as that is the only thing I could calculate for certain. The method I used to obtain the value for the variation in my reaction time was as follows.

I took 40 trials on an online app measuring my reaction time. I then removed any outliers via a box plot (it was trivial, but I wanted to practice and improve my stats). I then took the standard deviation and calculated the standard error, which came to 5 ms, or 0.005 s.
Sounds good to me.
 

1. What is uncertainty in a small sample?

Uncertainty in a small sample refers to the range of possible values for a measurement or data point, due to the limited number of observations or participants in the sample. It represents the potential error or variation in the sample and can affect the accuracy and reliability of the results.

2. How is uncertainty calculated for a small sample?

Uncertainty in a small sample is typically calculated using statistical methods, such as standard error or confidence intervals. These calculations take into account the sample size, the variability of the data, and the level of confidence desired for the results.
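As a concrete illustration of a small-sample confidence interval, the t distribution is used rather than the normal. A sketch using the thread's three times and a critical value taken from a standard t-table:

```python
import math
import statistics

times = [10.0491, 10.0581, 9.3521]   # trial times from the thread, seconds
n = len(times)
mean = statistics.mean(times)
sem = statistics.stdev(times) / math.sqrt(n)

# Two-sided 95% critical value of Student's t for n - 1 = 2 degrees of
# freedom, from a standard t-table
t_crit = 4.303
half_width = t_crit * sem

print(f"mean = {mean:.3f} s, 95% CI half-width = {half_width:.2f} s")
```

With only three points the t critical value is large, so the 95% interval is roughly four times wider than the SEM alone, which is exactly the small-sample penalty this answer describes.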

3. Why is it important to calculate uncertainty for a small sample?

Calculating uncertainty for a small sample is important because it allows researchers to determine the reliability and generalizability of their results. It also helps to identify any potential biases or limitations in the sample, and can inform the interpretation and implications of the findings.

4. Can uncertainty be reduced in a small sample?

While it is not possible to completely eliminate uncertainty in a small sample, there are some methods that can help reduce it. These include increasing the sample size, using more precise measurement tools, and controlling for potential confounding variables in the study design.

5. How does the size of a sample affect the level of uncertainty?

The size of a sample has a direct impact on the level of uncertainty. Generally, the larger the sample size, the lower the uncertainty will be. This is because a larger sample provides more data points, allowing for a more accurate and representative estimate of the true population value.
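The 1/√n scaling behind this answer can be seen directly; here using the thread's sample standard deviation of about 0.405 s as an example value:

```python
import math

s = 0.405  # sample standard deviation in seconds (value from this thread)
for n in (3, 10, 100):
    # SEM shrinks as 1/sqrt(n): more trials give a tighter mean estimate
    print(n, round(s / math.sqrt(n), 4))
```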
