Mathematical Statistics: two-sample t-test

Homework Help Overview

The discussion revolves around a two-sample t-test to determine if the means of two datasets are equal at a 5% significance level. The original poster presents their findings, including a t-value, degrees of freedom, and a p-value, while expressing confusion regarding the implications of their results.

Discussion Character

  • Exploratory, Assumption checking, Problem interpretation

Approaches and Questions Raised

  • The original poster attempts to interpret their t-test results, questioning the validity of their p-value and the confidence interval not including zero. Some participants suggest considering the normal distribution due to the large sample size, while others express confusion about the implications of the confidence interval in relation to the null hypothesis.

Discussion Status

Participants are actively engaging with the original poster's findings, offering insights and questioning assumptions about the statistical methods used. There is a recognition of potential issues with the p-value and the interpretation of the confidence interval, but no explicit consensus has been reached.

Contextual Notes

The original poster mentions having two sets of data with 2000 random variables each, and there is uncertainty regarding the distribution used to generate these random variables. The discussion also reflects on the appropriateness of using the t-test versus a normal distribution due to the sample size.

Roni1985

Homework Statement


At the 5% significance level, are the two means equal?


Homework Equations





The Attempt at a Solution



I tested the variances and found out that it's very likely that the variances are equal, so this is an assumption we make when we do the second test.
Now, I'm trying to test the means and see if they are equal.
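For reference, the equal-variance (pooled) two-sample t statistic described here can be sketched as below. This is only a minimal illustration: the function name `pooled_t` is hypothetical, and the five rows are taken from the sample data posted in this thread, so the result will not match the full 2000-row output.

```python
import math
from statistics import mean, variance

def pooled_t(x, y):
    """Equal-variance (pooled) two-sample t statistic and its degrees of freedom."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    # Standard error of the difference in sample means.
    se = math.sqrt(sp2 * (1 / nx + 1 / ny))
    return (mean(x) - mean(y)) / se, df

# Five rows of the posted sample data, for illustration only (not the full data set).
var1 = [-1.67695968151127, -0.642441727404665, 1.03286440763869,
        2.02667778667435, 0.417407406923095]
var2 = [0.577070066857405, 0.53974331907915, -1.31999608111349,
        -0.0361000625995362, 1.23635840784727]

t_stat, df = pooled_t(var1, var2)
print(f"t = {t_stat:.4f}, df = {df}")
```

With 2000 observations per sample, the same formula gives df = 2000 + 2000 - 2 = 3998, which matches the df in the output below.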
I'm getting confused because this is what I get:

Two Sample t-test

data: Var1 and Var2
t = -2.1372, df = 3998, p-value = 0.03264
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.128821801 -0.005552541
sample estimates:
mean of x mean of y
-0.0670763877 0.0001107835

Zero is not included in the confidence interval, but my p-value is greater than 2.5% (for a two-sided test)...
What do I do in such a case?
My t value is more extreme than the critical value at the 2.5% level.
I think the p-value I'm given is wrong. Is that possible?

Thanks,
Roni.
 
It seems very odd to me that your confidence interval doesn't include 0. Also, since df = 3998, your sample size must be about 4000. With that large a sample, you don't need to use the Student's t test, but could instead use a normal distribution.

There is a lot you don't show, but I infer that you have a binomial distribution that you can approximate with a normal distribution.
 
Mark44 said:
It seems very odd to me that your confidence interval doesn't include 0. Also, since df = 3998, your sample size must be about 4000. With that large a sample, you don't need to use the Student's t test, but could instead use a normal distribution.

There is a lot you don't show, but I infer that you have a binomial distribution that you can approximate with a normal distribution.

You are right, I have 2 different sets of data with 2000 r.v's in each of them.

I think if it works with the normal dist test, it should also work with the t-test (the df is large)
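One way to check this is to rebuild the 95% confidence interval from the reported numbers using a normal critical value instead of a t critical value. The sketch below assumes the standard error can be backed out as (difference in means) / t. It uses only the values printed in the t-test output, and the variable names are illustrative.

```python
from statistics import NormalDist

# Values reported in the t-test output in this thread.
t_stat = -2.1372
mean_x, mean_y = -0.0670763877, 0.0001107835

diff = mean_x - mean_y
se = diff / t_stat                      # standard error implied by t = diff / se
z_crit = NormalDist().inv_cdf(0.975)    # normal 97.5% quantile, about 1.96

ci_low, ci_high = diff - z_crit * se, diff + z_crit * se
print(f"normal-based 95% CI: ({ci_low:.4f}, {ci_high:.4f})")
print("interval contains 0:", ci_low <= 0 <= ci_high)
```

The normal-based interval should agree with the reported t-based interval to about four decimal places, since the two critical values are nearly identical at df = 3998.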

here is some of the data:

1 -1.67695968151127 0.577070066857405
2 -0.642441727404665 0.53974331907915
3 1.03286440763869 -1.31999608111349
4 2.02667778667435 -0.0361000625995362
5 0.417407406923095 1.23635840784727
6 0.338215656007098 0.295867822353842
7 -0.831289368212899 -0.419309627121533
8 0.90774100682453 1.09654263070674
9 -1.22308033810869 0.474269912619356
10 0.201242609350191 0.793994235426449
11 -0.349209983009311 -0.620406980101651
12 0.303036015889208 -1.01786836015372
13 0.946824776561656 1.61722911478799
14 1.04405267789264 0.95021579495309
15 -0.5631803909092 -0.773840047711912
16 -0.618697005188519 0.877219620279467
17 -1.14813286274261 0.774274810083378
18 0.361139276781852 1.29857639982538
19 -1.69816166131597 -0.132765129167876
20 -1.85142475578048 2.01112325343722
21 -0.75348529082793 0.125903773730686
22 0.373202072550282 0.210826733939696

I don't know which distribution is used to generate these r.v's...
 
Well, something is not working, it seems. If your null hypothesis is that the difference of the two means is zero (i.e. they are equal), your confidence interval should straddle 0.

As you have things, 0 is outside the confidence interval, so if your test statistic came out as 0 you would reject the null hypothesis! That doesn't seem reasonable at all.
 
Mark44 said:
Well, something is not working, it seems. If your null hypothesis is that the difference of the two means is zero (i.e. they are equal), your confidence interval should straddle 0.

As you have things, 0 is outside the confidence interval, so if your test statistic came out as 0 you would reject the null hypothesis! That doesn't seem reasonable at all.

I think the p-value that's given to me is already multiplied by two.
I just did the z-test for two samples with Excel and I got the same p-value.
Now that the p-value is less than 5%, we can reject the null hypothesis.
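A quick check of the "already multiplied by two" reading, using the normal approximation (reasonable here since df = 3998). This is a sketch based only on the reported t value, not a rerun of the original test.

```python
import math

t_stat = -2.1372  # reported test statistic

# One tail: P(Z >= |t|) for a standard normal Z.
one_sided = 0.5 * math.erfc(abs(t_stat) / math.sqrt(2))
# Two-sided p-value: both tails together.
two_sided = 2 * one_sided

print(f"one-sided p = {one_sided:.4f}, two-sided p = {two_sided:.4f}")
# Compare the two-sided p-value with 0.05, or equivalently the one-sided tail with 0.025.
print("reject at 5%:", two_sided < 0.05)
```

The two-sided value lands very close to the reported 0.03264, so the reported p-value should be compared with 5%, not 2.5%. That is consistent with the confidence interval excluding zero.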

Thanks for the help.
 