# Margin of error in a t-distribution

This is not a homework problem.

I am working on an experiment and I need to know how many samples (n) I need to achieve a margin of error (e) below 2%.

Looking through a statistics textbook, I found a calculation for e using the z-distribution, but not the t-distribution.

Replacing variables, I concluded that $e = (t_{\alpha/2} S)^2/\sqrt{n}$, where $t_{\alpha/2}$ is the upper critical value and $S$ is the sample standard deviation.
Is this correct? Also, if so, is the value of e given as a percentage?

Lastly, from some preliminary tests (2 tests), the closer the initial test results are to each other, the smaller the error value (obviously). But I am concerned that a sample size of two is simply too small to definitively conclude that I am safely within my desired margin of error.

I need to conduct tests under a variety of conditions, and the final number of tests performed may run into the hundreds, if not thousands, so it is vital that I do not perform more tests for any particular set of conditions than absolutely necessary. Any advice in this regard would be greatly appreciated.

Stephen Tashi
What is your definition of "margin of error"?

Do you want to say something like "the estimated value, based on the sample, is 28.93 and there is a 95% chance that the true value is within plus or minus 1.30 of this value"? Then give up, if you are using ordinary ("frequentist") statistics. It won't tell you that.

You can get something called a "confidence" interval from frequentist statistics. It doesn't tell you the probability that the true value (of whatever you are estimating) is in a specific numerical interval.

You didn't say what you are estimating and what estimator you are using. Once those are specified, you can estimate the standard deviation of the estimator. Some people call an interval that is plus or minus two (or three or four) standard deviations around the estimated value the "margin of error". Is that what you mean?

What is your definition of "margin of error"?

Do you want to say something like "the estimated value, based on the sample, is 28.93 and there is a 95% chance that the true value is within plus or minus 1.30 of this value"?

That is exactly what I am trying to say. If "frequentist" statistics are not the right route, what theories/methods should I be looking at?

I am trying to find the coefficient of friction, and I plan to use the sample mean as the estimator.

Stephen Tashi
If you want that kind of statement, you should look at Bayesian statistics and "prediction intervals". The mathematical facts of life are that unless you are willing to hypothesize a distribution for the thing you are estimating prior to analyzing the data, it is impossible to quantify a probability distribution for that thing after you have the data. (This is analogous to the fact that you can't find the sides and angles of a triangle when you are given only one side and one angle. It isn't a matter of philosophy. It's just the nature of what constitutes sufficient information to solve the problem.)

However, if you are thinking about publishing a report, consider that there are some areas of engineering and science where frequentist statistics is traditional. Frequentist statistics emphasizes "confidence intervals" for estimators. A confidence interval approach can make a statement like "When we base our estimate on 50 samples, there is a 95% probability that the true value will be within plus or minus 1.3 of our estimate." This statement is similar to what you want, but it cannot be applied to a particular estimate, such as 28.93. Laymen often incorrectly apply it to read like the statement you want.

Bayesian and frequentist statistical methods often use substantially the same formulae. There is a distinct difference in the problems they are solving with these formulae.
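To make the role of the prior concrete, here is a minimal sketch of the simplest Bayesian case: a normal prior on the unknown mean, with the measurement noise standard deviation assumed known. The function names, the prior, `sigma`, and the friction readings are all made up for illustration; none of them come from the thread.

```python
import math
from statistics import NormalDist, mean


def normal_posterior(data, sigma, prior_mean, prior_sd):
    """Posterior mean and sd of a normal mean, given known noise sd `sigma`
    and a normal prior (the conjugate normal-normal model)."""
    n = len(data)
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / sigma ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + data_precision * mean(data)) / post_precision
    return post_mean, math.sqrt(1.0 / post_precision)


def credible_interval(post_mean, post_sd, level=0.95):
    """Central interval holding `level` posterior probability."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return post_mean - z * post_sd, post_mean + z * post_sd


# Hypothetical friction-coefficient readings and a hypothetical prior.
readings = [0.41, 0.43, 0.40, 0.44, 0.42]
post_mean, post_sd = normal_posterior(readings, sigma=0.02,
                                      prior_mean=0.5, prior_sd=0.2)
low, high = credible_interval(post_mean, post_sd)
print(f"95% credible interval: ({low:.3f}, {high:.3f})")
```

With a very diffuse prior (large `prior_sd`), the interval approaches the sample mean plus or minus $z\,\sigma/\sqrt{n}$, which is the familiar frequentist expression. That is one instance of the two approaches sharing a formula while answering different questions.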

Thank you for your help. It is greatly appreciated.

Homework Helper
"When we base our estimate on 50 samples, there is a 95% probability that the true value will be within plus or minus 1.3 of our estimate."

Except it is not stated that way, as this makes it seem the true value is the quantity that is random.
"When we repeat this process a large number of times, and create a confidence interval each time, about 95% of those intervals will contain the true value" is the appropriate interpretation. Notice that this does not attach any information to a specific instance of an interval.
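The repeated-sampling reading can be checked with a small simulation. The sketch below assumes normally distributed measurements with a made-up true mean and standard deviation, draws many samples of size 10, builds a 95% t-interval from each, and counts how often the interval contains the true mean. The critical value 2.262 is the usual table value for 9 degrees of freedom.

```python
import math
import random
from statistics import mean, stdev

N = 10               # sample size used for every simulated experiment
T_CRIT_DF9 = 2.262   # table value of t_{0.025} for N - 1 = 9 degrees of freedom


def coverage(true_mean=0.42, true_sd=0.02, trials=2000, seed=1):
    """Fraction of simulated 95% t-intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(N)]
        half_width = T_CRIT_DF9 * stdev(sample) / math.sqrt(N)
        if abs(mean(sample) - true_mean) <= half_width:
            hits += 1
    return hits / trials


print(f"fraction of intervals containing the true mean: {coverage():.3f}")
```

The fraction should land near 0.95, but any single interval from the loop either contains the true mean or does not, which is the point being made above.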

Homework Helper
A final comment: "Replacing variables I concluded that $e = (t_{\alpha/2} S)^2/\sqrt{n}$ where $t_{\alpha/2}$ is the upper critical value, S is the sample standard deviation.
Is this correct? Also, if so, is the value of e given as a percentage?"

won't work. In order to know which t-value to use in this formula, you need to select a number of degrees of freedom, and as soon as you do that you've selected a sample size. The original formula uses z because (say, for 95% confidence) a single z value suffices for every sample size. The downside: you must assume normality (as you do even for a t-interval).
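For reference, the relationship under discussion can be written out as follows. This is the standard textbook form rather than a quotation from the poster's book, and it assumes approximately normal data. Here $\sigma$ denotes the population standard deviation and $t_{\alpha/2,\,n-1}$ the critical value with $n-1$ degrees of freedom.

$$e = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \quad\Longrightarrow\quad n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^{2}$$

$$e = t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}} \quad\Longrightarrow\quad n = \left(\frac{t_{\alpha/2,\,n-1}\,S}{e}\right)^{2}$$

Two things follow from writing it this way. The square belongs to the sample-size form, not to the expression for $e$ itself. And in the t-based version $n$ appears on both sides, which is the circularity described above. As written, $e$ carries the units of the measured quantity; it becomes a percentage only if it is divided by a reference value such as the sample mean.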

I did some research, and one study suggested taking a couple of samples, finding the minimum number of samples using the z-distribution, and then using that value of n to determine the degrees of freedom for the t-distribution. Then I solve for the new n. With each new sample everything is recalculated: S, n for the z-distribution, and n for the t-distribution.
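A procedure along these lines can be sketched as below. This is an illustration under stated assumptions, not the method from the study mentioned: it starts from the z-based n and then steps n upward until the t-based margin of error falls below the target, rather than re-solving for n on each pass. The function names are made up, and the t critical value is computed numerically (Simpson's rule plus bisection) only to keep the sketch free of third-party libraries.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t with `df` degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0, integrating the density on [0, x] with Simpson's rule."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_crit(conf, df):
    """Two-sided critical value t_{alpha/2, df}, found by bisection."""
    p = 1 - (1 - conf) / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:   # widen the bracket until it contains the quantile
        hi *= 2
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def required_n(s, e, conf=0.95):
    """Smallest n with t_{alpha/2, n-1} * s / sqrt(n) <= e.

    `s` is the current sample standard deviation and `e` the target margin
    of error, both in the units of the measurement.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    n = max(2, math.ceil((z * s / e) ** 2))   # z-based starting value
    while t_crit(conf, n - 1) * s / math.sqrt(n) > e:
        n += 1
    return n


# Hypothetical numbers: S = 0.02 and a target margin of error of 0.01.
print(required_n(s=0.02, e=0.01))
```

Because the t critical value always exceeds the z value, the z-based n is a lower bound, so stepping upward from it finds the smallest adequate n. The result is only as good as the estimate `s`, so it would need recomputing as new measurements arrive, as described above.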

I can't tell whether you have exactly two data points, or two runs with n data points. If n is 30 or more then you have nothing to worry about. If n is two then you have a lot to worry about.

The problem is that with n=2 the sample standard deviation is likely to be quite an inaccurate estimate of the standard deviation of the population. So you have an additional uncertainty.
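To get a feel for how unreliable S is at n=2, the following sketch simulates many samples from a normal population and reports the 5th and 95th percentiles of the resulting sample standard deviations. The population standard deviation of 0.02 is a made-up value, and normality is an assumption of the sketch.

```python
import random
from statistics import quantiles, stdev


def sd_spread(n, true_sd=0.02, trials=5000, seed=7):
    """5th and 95th percentiles of S over many simulated samples of size n."""
    rng = random.Random(seed)
    estimates = [
        stdev([rng.gauss(0.0, true_sd) for _ in range(n)])
        for _ in range(trials)
    ]
    cuts = quantiles(estimates, n=20)   # 19 cut points: 5%, 10%, ..., 95%
    return cuts[0], cuts[-1]


for size in (2, 30):
    low, high = sd_spread(size)
    print(f"n={size:2d}: 90% of S values fall between {low:.4f} and {high:.4f}")
```

The range for n=2 comes out far wider than for n=30, so a margin of error computed from two readings inherits a large uncertainty from S alone.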

I don't know whether in your many runs you can assume that the population standard deviation is close to the same in each run. If you can then you can use a pooled sample standard deviation and your problems are over. If you can't and your n for each possible population standard deviation is low then you need to use the t distribution instead of z.
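The pooled estimate mentioned here can be sketched as follows, assuming every set of conditions shares the same population standard deviation. The function name and the readings are hypothetical.

```python
import math
from statistics import variance


def pooled_sd(groups):
    """Pooled sample standard deviation and its degrees of freedom.

    Each group is a list of readings taken under one set of conditions;
    all groups are assumed to share a single population standard deviation.
    """
    dof = sum(len(g) - 1 for g in groups)
    weighted = sum((len(g) - 1) * variance(g) for g in groups)
    return math.sqrt(weighted / dof), dof


# Hypothetical readings from three sets of conditions.
runs = [[0.41, 0.43], [0.55, 0.52], [0.30, 0.33, 0.31]]
s_pooled, dof = pooled_sd(runs)
print(f"pooled S = {s_pooled:.4f} with {dof} degrees of freedom")
```

The pooled estimate has as many degrees of freedom as all the groups combined, so with many conditions the matching t critical value approaches the z value even when each individual group is small.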
