When does difference in sample size become an issue?

  • Context: Undergrad 
  • Thread starter: 80past2
  • Tags: Difference, Sample size
SUMMARY

The discussion centers on whether unequal sample sizes affect statistical significance, specifically when comparing groups of 1600 and about 700. The thread starter found significant differences on nearly every comparison and wondered whether they were an artifact of sample size. Randomly subsampling the larger group down to about 700, and bootstrapping within one of the restricted sets, gave roughly the same numbers and significance. A respondent offers to plot the data in quantiles with their own plotting program, to check it for normality and help identify the distribution type.

PREREQUISITES
  • Understanding of statistical significance and sample size effects
  • Familiarity with bootstrapping techniques in statistics
  • Knowledge of data distribution types, particularly normal distribution
  • Experience with data visualization tools for plotting distributions
NEXT STEPS
  • Learn about the impact of sample size on statistical power and significance
  • Explore advanced bootstrapping methods for statistical analysis
  • Investigate techniques for checking normality, such as the Shapiro-Wilk test
  • Study data visualization libraries like Matplotlib or Seaborn for plotting distributions
USEFUL FOR

Statisticians, data analysts, researchers comparing group differences, and anyone interested in understanding the implications of sample size on statistical results.

80past2
I was comparing two different groups: in one my n was 1600, and in the other n was around 700. Nearly all of the differences I found were significant, but could that be due to sample size? I tried randomly subsampling to make the sample sizes equal (both around 700) and got more or less the same numbers and significance every time. Should I do anything else, or is this fine?
I also bootstrapped within one of these restricted sets and got about the same numbers.
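The checks described above (subsampling to equal sizes, then bootstrapping) can be sketched roughly as follows. This is only an illustration under stated assumptions: the post does not say which test was used, so the sketch assumes a comparison of means, and the data, the helper names (`welch_t`, `cohens_d`, `bootstrap_diff_ci`) and all parameter values are hypothetical. It reports an effect size next to the test statistic, since with samples this large even a small difference can come out significant.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch's t statistic; does not require equal sizes or variances."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se


def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(pooled_var)


def bootstrap_diff_ci(a, b, n_boot=1000, alpha=0.05, rng=None):
    """Percentile bootstrap interval for mean(a) - mean(b).

    Each group is resampled with replacement at its own original size.
    """
    rng = rng or random.Random(0)
    diffs = sorted(
        statistics.fmean(rng.choices(a, k=len(a)))
        - statistics.fmean(rng.choices(b, k=len(b)))
        for _ in range(n_boot)
    )
    k = int(alpha / 2 * n_boot)  # number of resamples trimmed from each tail
    return diffs[k], diffs[n_boot - 1 - k]


# Synthetic stand-in data (an assumption; not the poster's data).
rng = random.Random(42)
group_a = [rng.gauss(10.0, 2.0) for _ in range(1600)]
group_b = [rng.gauss(10.5, 2.0) for _ in range(700)]

# Subsample the larger group without replacement to match the smaller one.
sub_a = rng.sample(group_a, len(group_b))

print("full data    t =", round(welch_t(group_a, group_b), 2),
      " d =", round(cohens_d(group_a, group_b), 3))
print("equal sizes  t =", round(welch_t(sub_a, group_b), 2),
      " d =", round(cohens_d(sub_a, group_b), 3))
print("bootstrap 95% interval for the mean difference:",
      bootstrap_diff_ci(group_a, group_b, rng=rng))
```

If the effect size stays about the same after subsampling while only the test statistic shrinks, the difference itself is stable and the larger n is mainly adding precision.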
 
Do you need help checking distribution type by plotting it?

80past2 said:
I was comparing two different groups: in one my n was 1600, and in the other n was around 700. Nearly all of the differences I found were significant, but could that be due to sample size? I tried randomly subsampling to make the sample sizes equal (both around 700) and got more or less the same numbers and significance every time. Should I do anything else, or is this fine?
I also bootstrapped within one of these restricted sets and got about the same numbers.

I am writing a plotting program that looks at data sets and checks them for normality.
If you'd like, and your data is "near" normal, you can attach a text file that just lists the data values (or a scaled version of them, to obscure what they are), e.g.:
7
3.5
11.0

etc.
I could then plot the data in 1% or 0.05% quantiles, like this:
"Converting binomial/normal distribution into quantiles and comparing against normal"
and post the graphs for you. :smile:
The plot shows skewness and some other information that can help identify what type of distribution the data really follows, but it is mostly meant for checking Gaussian data.
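A quantile comparison of this kind could look roughly like the sketch below. It is not the plotting program mentioned above, whose internals are not shown in the thread. It assumes the data is a plain list of numbers, uses only the Python standard library (3.8+), and fits the reference normal by the sample mean and standard deviation. The function names and the synthetic data are hypothetical.

```python
import random
import statistics


def normal_quantile_comparison(data, n=100):
    """Pair sample quantiles with those of a normal fitted to the data.

    Returns (probability, sample quantile, normal quantile) tuples for the
    n - 1 cut points; n=100 gives 1% steps.
    """
    fitted = statistics.NormalDist(statistics.fmean(data), statistics.stdev(data))
    sample_q = statistics.quantiles(data, n=n)
    probs = [i / n for i in range(1, n)]
    return [(p, s, fitted.inv_cdf(p)) for p, s in zip(probs, sample_q)]


def skewness(data):
    """Moment-based sample skewness; close to 0 for symmetric data."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    return sum(((x - mu) / sigma) ** 3 for x in data) / len(data)


# Synthetic stand-in for the values that would be read from the text file.
rng = random.Random(7)
data = [rng.gauss(50.0, 5.0) for _ in range(2000)]

rows = normal_quantile_comparison(data)
for p, sample_value, normal_value in rows[9::10]:  # every 10th percentile
    print(f"{p:4.2f}  sample={sample_value:8.3f}  normal={normal_value:8.3f}")
print("skewness:", round(skewness(data), 3))
```

Plotting the sample quantiles against the normal quantiles gives the usual Q-Q plot: points close to a straight line suggest near-normal data, and systematic bending in the tails points to skew or heavy tails.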
 