What is the relationship between confidence intervals and t-distributions?

  • Context: Graduate
  • Thread starter: bhoover05
  • Tags: intervals
Discussion Overview

The discussion centers around the relationship between confidence intervals (CIs) and t-distributions, particularly focusing on the degrees of freedom associated with different confidence levels. Participants explore the implications of sample sizes on the critical values used in calculating CIs and how these relate to hypothesis testing.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • The original poster expected the 95% CI critical value to come from a t-distribution with 25 degrees of freedom, while the classroom answer was 23 degrees of freedom, prompting the question of which is correct and why.
  • Another participant mentions that the degrees of freedom are typically calculated as n_1 + n_2 - 2 for two samples, but this is contingent on the sample sizes and assumptions about the populations.
  • There is a discussion about the implications of not having a specified population mean (mu initial) for hypothesis testing, with one participant suggesting that a null hypothesis can be set to any value for testing purposes.
  • One participant provides a detailed procedure for calculating a confidence interval when mu initial is unknown, emphasizing the need to estimate standard deviations and adjust degrees of freedom accordingly.

Areas of Agreement / Disagreement

Participants initially express differing views on the correct degrees of freedom for the two-sample confidence interval, though the thread converges on n_1 + n_2 - 2. There is less agreement on the implications of not having a specified population mean for hypothesis testing: one reply treats the problem as constructing a confidence interval for the mean, while another frames it as a test against a null value the analyst chooses.

Contextual Notes

Participants mention the need for assumptions about population distributions and the implications of sample sizes on degrees of freedom, but these assumptions remain unresolved and are not universally accepted.

bhoover05
I was under the impression that a 95% C.I. requires that the critical value in the error term come from a t-distribution with 25 degrees of freedom.

I was taught in class today that a 95% CI requires that the critical value in the error term come from a t-distribution with 23 degrees of freedom. I am unsure as to why this is.

~For example, let's use this. . .
Average lengths of workweeks of 15 randomly selected employees in the mining industry and 10 randomly selected employees in the manufacturing industry were obtained.
Miners (x) = 15 observations, mean of 47.5, std. dev. of 5.5
Manufacturers (y) = 10 observations, mean of 42.5, std. dev. of 4.9
With this data I obtained Sp = 5.27, t = 2.069, and a 95% CI = (0.55, 9.45).

~I never used 23 degrees of freedom to obtain those answers, which are correct. Why should the 95% CI for the difference x - y require that the critical value in the error term come from a t-distribution with 23 degrees of freedom?
 
I thought that ν was directly related to n? Either n − 1 or n − 2 (can't remember which; statistics wasn't my favorite class).
 
bhoover05 said:
I was under the impression that a 95% C.I. requires that the critical value in the error term come from a t-distribution with 25 degrees of freedom. I was taught in class today that it requires 23 degrees of freedom. I am unsure as to why this is.

The critical value depends on the number of degrees of freedom, which depends on the sample sizes. For a 95% CI (a t-distribution is used for small samples) the number of degrees of freedom will typically be [tex]n_1 + n_2 - 2[/tex] (here 15 + 10 - 2 = 23), if I remember correctly. This should be in your statistics book somewhere, and you have to make some assumptions about the populations (roughly Normal, with equal variances, for the pooled estimate).

CS
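The numbers in the original post can be checked with a short script (a sketch, not from the thread, using scipy for the critical value):

```python
import math
from scipy import stats

# Thread's numbers: miners (x) and manufacturers (y)
n1, xbar1, s1 = 15, 47.5, 5.5
n2, xbar2, s2 = 10, 42.5, 4.9

# Pooled standard deviation, with n1 + n2 - 2 = 23 degrees of freedom
df = n1 + n2 - 2
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)

# 95% CI for the difference of means
t_crit = stats.t.ppf(0.975, df)            # upper 2.5% point at 23 df
se = sp * math.sqrt(1 / n1 + 1 / n2)
diff = xbar1 - xbar2
lo, hi = diff - t_crit * se, diff + t_crit * se

print(round(sp, 2), round(t_crit, 3), round(lo, 2), round(hi, 2))
# → 5.27 2.069 0.55 9.45
```

Note that the t = 2.069 in the original post is exactly the 23-degree-of-freedom critical value, so the interval (0.55, 9.45) was in fact already computed with 23 degrees of freedom.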
 
Oh, that is so easy! I way overthought that. . . Thanks, guys
 
Ok. . . Now to expand from CIs to hypothesis testing. . .

I understand that for small n and approximately Normal data, you use the formula
[tex]T = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}[/tex]

Now, what if I have no [itex]\mu_0[/itex] given. . .

Example:
Sample mean = 0.8
Std. dev. = 0.1789
n = 6
 
"T= (x-bar - mu initial)/(s/ sqr n)

Now, what If i have no mu initial given. . .

Example-
Sample mean= 0.8
St. D= 0.1789
n=6"

In this case, I think what you want to do is _find_ [itex]\mu_0[/itex] (your "mu initial"). So, plug the values you have into the above formula. This gives

[tex]T = \frac{0.8 - \mu_0}{0.0730}[/tex]

You also need a size for the confidence level. Usually it is 95%, so I will use that.

Now, T is distributed as a t-distribution with _5_ degrees of freedom. (This is because you don't know the _TRUE_ sigma. You must estimate it by s, so you subtract one from the number of data points in the sample to get the degrees of freedom.) In your table of t-distributions, find the number, call it [itex]t_{\alpha/2}[/itex], with the following property:

The probability of T being larger than [itex]t_{\alpha/2}[/itex] is 0.025.

(I think the above is an important phrase to remember.)


I looked up t in my table and found 2.571.

Then, 95% of the time,

[tex]-t_{\alpha/2} \le T \le t_{\alpha/2}[/tex]

(The idea here is that the probability of being greater than [itex]t_{\alpha/2}[/itex] is 0.025 and the probability of being less than [itex]-t_{\alpha/2}[/itex] is also 0.025. Therefore, 95% of the time, T will be between the two.)

This now says:

[tex]-2.571 \le \frac{0.8 - \mu_0}{0.0730} \le 2.571[/tex]

Now solve this inequality for [itex]\mu_0[/itex]. You will then have a confidence interval for [itex]\mu_0[/itex].

So, the procedure is:
1) Take the size of your confidence level and subtract it from 1.
2) Divide that number by two; call the result [itex]\alpha/2[/itex].
3) Look up [itex]t_{\alpha/2}[/itex] in a table with n - 1 degrees of freedom.
4) Next plug your other numbers into the formula [itex](\bar{x} - \mu_0)/(s/\sqrt{n})[/itex]. You will now have an expression involving only [itex]\mu_0[/itex].
5) Put that expression between [itex]-t_{\alpha/2}[/itex] and [itex]+t_{\alpha/2}[/itex].
6) Solve for [itex]\mu_0[/itex].
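The steps above can be sketched with a short script (an illustration using scipy, not from the thread; `mu0` stands for the "mu initial" being solved for):

```python
import math
from scipy import stats

# Thread's numbers: n = 6, sample mean 0.8, sample std. dev. 0.1789
n, xbar, s = 6, 0.8, 0.1789
level = 0.95

# Steps 1-3: alpha/2 and the critical value at n - 1 = 5 degrees of freedom
alpha_2 = (1 - level) / 2
t_crit = stats.t.ppf(1 - alpha_2, n - 1)   # about 2.571

# Steps 4-6: solving -t_crit <= (xbar - mu0)/(s/sqrt(n)) <= t_crit for mu0
# gives the interval xbar -/+ t_crit * s/sqrt(n)
se = s / math.sqrt(n)                      # about 0.0730
lo, hi = xbar - t_crit * se, xbar + t_crit * se
print(round(lo, 3), round(hi, 3))
# → 0.612 0.988
```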

----------------------
Now, in your previous question, you asked why you needed 23 degrees of freedom rather than 25. That question looks like a two-sample t-test to me. You are trying to find out whether there is a non-random difference between the lengths of the workweeks of manufacturers and of miners.
In this case, there are two standard deviations you don't know. (You have an estimate s for each sample, but you do not know the _TRUE_ standard deviations, sigma.) You must

(1) estimate them by the sample standard deviations, and
(2) combine these estimates to get an estimate of the standard deviation of the distribution of the difference between the means.

Because you are using two estimates, you will need to subtract two degrees of freedom.

The rule of thumb is that you must subtract one degree of freedom for each standard deviation you do not know for certain.
-------------------------

I hope this helps.
 
bhoover05 said:
Now, what if I have no mu initial given. . . Sample mean = 0.8, St. D = 0.1789, n = 6

If [itex]\mu_0[/itex] is not given or assumed, then you have nothing to test. A hypothesis test is used to infer something about the population mean relative to [itex]\mu_0[/itex]. For example your null hypothesis may be that [itex]\mu = 0[/itex] and your alternative hypothesis [itex]\mu > 0[/itex]. If your test statistic (T) falls in the rejection region, then you can infer that the population mean, [itex]\mu[/itex], is greater than 0. Note that we tested the hypothesis that the population mean was equal to 0 (i.e. you selected the value for [itex]\mu_0[/itex] which was 0 in this example).

Alternatively, [itex]\mu_0[/itex] can be any value you wish to test the population mean against.

Refer to page 6 of http://www.sjsu.edu/faculty/gerstman/StatPrimer/hyp-test.pdf for more info.

Hope that helps.

CS
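Following the point above that [itex]\mu_0[/itex] must be chosen before anything can be tested, a short sketch of the test itself (the value mu0 = 0.7 is an arbitrary illustration, not from the thread):

```python
import math
from scipy import stats

# Thread's numbers; mu0 is the null value the analyst chooses to test against
n, xbar, s = 6, 0.8, 0.1789
mu0 = 0.7

# Test statistic T = (xbar - mu0) / (s / sqrt(n)), t-distributed with
# n - 1 = 5 degrees of freedom under the null hypothesis mu = mu0
T = (xbar - mu0) / (s / math.sqrt(n))

# Two-sided p-value from the t-distribution's upper tail
p = 2 * stats.t.sf(abs(T), n - 1)
print(round(T, 3), round(p, 3))
```

If p falls below the chosen significance level (say 0.05), the test rejects the hypothesis that the population mean equals mu0; otherwise it does not.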
 