Student's t-test with a small sample size

  • Context: Undergrad 
  • Thread starter: Matheus del Valle
  • Tags: Student Test
SUMMARY

The discussion centers on the application of the Student's t-test for comparing two methods with a small sample size (N=3). The one-tailed t-test yielded a p-value of 0.11, indicating no significant difference, despite apparent differences in the data. Participants recommend using a nonparametric test, such as the Wilcoxon Rank Sum, and emphasize the importance of increasing sample size for reliable results. Additionally, they suggest employing a Bland-Altman plot and establishing a region of practical equivalence to assess method equivalence effectively.

PREREQUISITES
  • Understanding of Student's t-test and its limitations with small sample sizes
  • Knowledge of nonparametric tests, specifically the Wilcoxon Rank Sum test
  • Familiarity with Bland-Altman plots for method comparison
  • Concept of region of practical equivalence in statistical analysis
NEXT STEPS
  • Learn how to construct and interpret Bland-Altman plots
  • Research the implications of sample size on statistical significance
  • Study the Wilcoxon Rank Sum test and its application in small sample scenarios
  • Explore the concept of region of practical equivalence and how to define it
USEFUL FOR

Researchers, statisticians, and data analysts involved in method comparison studies, particularly those working with small sample sizes and seeking to validate statistical significance in their findings.

Matheus del Valle
Hello,

I'm checking the similarity of two methods for my research (a gold-standard method and another one that I need to check is efficient compared to the gold standard) using Student's t-test.

I have the following data (N = 3):
method 1 (gold-standard): 120, 347, 116;
method 2: 2603, 5203, 25011;

The result of a one-tailed, independent-samples Student's t-test is p = 0.11, which is greater than 0.05.
So the test says there is no significant difference between the two methods, but they are clearly different.
Is the t-test giving me a false result because of the small N? Should I use another statistical test? Thanks.
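
For concreteness, here is a minimal Python sketch of the calculation described above (a pooled-variance, one-tailed test via scipy.stats.ttest_ind; the variable names are just for illustration):

import numpy as np
from scipy import stats

gold = np.array([120.0, 347.0, 116.0])       # method 1 (gold standard)
new = np.array([2603.0, 5203.0, 25011.0])    # method 2

# One-tailed, independent-samples Student's t-test (equal variances assumed),
# testing whether method 2 tends to give larger values than method 1.
t_stat, p_value = stats.ttest_ind(new, gold, equal_var=True, alternative="greater")
print(f"t = {t_stat:.3f}, one-tailed p = {p_value:.3f}")   # should land near the 0.11 quoted above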
 
The problem is that if your second method's data IS normally distributed, then it's clear your variance will be huge. This is a problem if you want to obtain significance.

It is also rather doubtful that your second method's data are normally distributed; you have too few data points to check this anyway.

So either you continue to believe that your data points come from a normal distribution, in which case you'll need a hell of a lot more data points. Or you use a nonparametric test.
 
You should get more data if you can. Three values from each is a very small sample, no matter how obvious the differences look. The huge variation of the second set weakens the statistical results. I tried using a nonparametric test (Wilcoxon Rank Sum) and it was not significant. Out of curiosity, I added one made-up typical data point to each set and the results became significant.
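
For reference, a minimal sketch of that nonparametric comparison (using scipy.stats.mannwhitneyu, which is equivalent to the Wilcoxon Rank Sum test; note that with only three points per group the smallest achievable two-sided p is 0.1, so significance at the 0.05 level is impossible here):

import numpy as np
from scipy import stats

gold = np.array([120.0, 347.0, 116.0])       # method 1 (gold standard)
new = np.array([2603.0, 5203.0, 25011.0])    # method 2

# Mann-Whitney U / Wilcoxon rank-sum test (two-sided).
# With N = 3 per group there are only C(6, 3) = 20 possible rank assignments,
# so even the most extreme ordering gives a two-sided p of 2/20 = 0.1.
u_stat, p_value = stats.mannwhitneyu(new, gold, alternative="two-sided")
print(f"U = {u_stat}, two-sided p = {p_value:.3f}")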

In general, there are real concerns with shopping around for a test that makes your results look significant. A lot of insignificant results will look significant in some way if you examine them from every possible angle.
 
I agree with the comments above, but in addition, if your goal is to demonstrate equivalence between the alternative method and the gold-standard method, then this is the wrong approach.

The first step would be to do a Bland-Altman plot. This is just a graphical method, but it is very commonly used in this type of research.
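
A minimal sketch of such a plot, assuming paired measurements of the same specimens by both methods (the numbers below are made up purely for illustration):

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical paired measurements from the two methods (made-up values).
gold = np.array([120.0, 347.0, 116.0, 210.0, 180.0])
new = np.array([128.0, 335.0, 122.0, 223.0, 171.0])

mean = (gold + new) / 2          # x-axis: average of the two methods
diff = new - gold                # y-axis: difference between the methods

bias = diff.mean()               # mean difference (systematic bias)
loa = 1.96 * diff.std(ddof=1)    # 95% limits of agreement

plt.scatter(mean, diff)
plt.axhline(bias, color="k", label="mean difference (bias)")
plt.axhline(bias + loa, color="k", linestyle="--", label="+/- 1.96 SD")
plt.axhline(bias - loa, color="k", linestyle="--")
plt.xlabel("Mean of the two methods")
plt.ylabel("Difference (new - gold standard)")
plt.legend()
plt.show()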

The next thing you want to do is decide on a region of practical equivalence. For example, the first gold-standard measurement was 120; if the other method gave 121, would you consider that practically equivalent? How about 130, or 150, or 200?

Once you have chosen a region of practical equivalence, then you take your data and construct a 95% confidence interval. If it lies entirely within the region of practical equivalence then you have good evidence of equivalence. Otherwise you do not have good evidence.
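
A minimal sketch of that check, assuming paired measurements and a made-up region of practical equivalence of +/- 10 units (both the data and the ROPE bounds are purely illustrative):

import numpy as np
from scipy import stats

# Hypothetical paired measurements (same specimens measured by both methods).
gold = np.array([120.0, 347.0, 116.0])
new = np.array([118.0, 352.0, 121.0])    # made-up values, for illustration only

diff = new - gold
n = len(diff)
mean_d = diff.mean()
se_d = diff.std(ddof=1) / np.sqrt(n)

# 95% confidence interval for the mean difference (t distribution, n - 1 df).
ci = stats.t.interval(0.95, n - 1, loc=mean_d, scale=se_d)

# Hypothetical region of practical equivalence: differences within +/- 10 units.
rope = (-10.0, 10.0)
equivalent = rope[0] <= ci[0] and ci[1] <= rope[1]
print(f"95% CI for the mean difference: ({ci[0]:.1f}, {ci[1]:.1f})")
print("Entirely within the ROPE: good evidence of equivalence" if equivalent
      else "CI extends outside the ROPE: no good evidence of equivalence")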
 
Thank you all for the help. I managed to improve my samples and now I'm analysing based on your tips.

I'm also using the ICC (intraclass correlation coefficient) to compare two different methods and it seems to be pretty satisfactory.
 
The ICC isn't really appropriate here. The regular correlation is more appropriate, with the gold standard as the independent variable and the new method as the dependent variable. However, correlation is not a good measure for this.

You should read Bland and Altman's highly influential paper on this subject. A paper that does not at least provide a Bland-Altman plot will likely be rejected in peer review at any decent journal.
 
