Kolmogorov-Smirnov in R - cannot compute correct p-values with ties

  • Context: Graduate
  • Thread starter: joanne34567
  • Tags: Kolmogorov
SUMMARY

The discussion centers on computing p-values with the Kolmogorov-Smirnov test in R when the data contain ties. The user gets the message "cannot compute correct p-values with ties" because the two datasets, of roughly 2,000 and 50,000 observations, contain duplicate values. One suggestion is to test the sensitivity of the p-value empirically: add small random perturbations to the data and see how much the p-value changes, which indicates how far the p-value can be trusted in the presence of ties.

PREREQUISITES
  • Understanding of the Kolmogorov-Smirnov test
  • Familiarity with R programming language
  • Knowledge of statistical significance and p-values
  • Experience with data manipulation in R
NEXT STEPS
  • Explore the R documentation for the ks.test function
  • Learn about handling ties in statistical tests
  • Investigate methods for perturbing data to assess p-value sensitivity
  • Research alternative statistical tests for comparing distributions with ties
USEFUL FOR

Statisticians, data analysts, and R programmers who are working with distribution comparisons and need to understand the implications of ties on p-value calculations.

joanne34567
Hi,
I'm trying to use the Kolmogorov-Smirnov test in R to compare one distribution with another. I'm getting the following error: cannot compute correct p-values with ties
I think this is because the first dataset I am using has around 2000 values and the second has around 50000. A number of these are inherently going to be replicates of another number in the series (i.e. "ties"). I'm just wondering if this has implications for the p-value I have? Is the p-value valid?
Cheers all
 
Did you read the R documentation for the ks.test function?

You could test the sensitivity of the p-value empirically. Add small random perturbations to your data to generate new data sets that have more digits. See how much the p value changes.
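
A minimal sketch of that perturbation check, in case it helps. Everything specific here is an assumption for illustration: the data are synthetic stand-ins (normal draws rounded to one decimal so that ties appear), the rounding step of 0.1 is made up, and the names `x`, `y`, `half_step`, `n_rep` and `p_jitter` are hypothetical. With real data you would pick a perturbation smaller than the measurement resolution. As far as I know, ks.test raises the ties message as a warning and still returns a p-value, which is why the baseline call below wraps it in suppressWarnings.

```r
# Synthetic stand-in data (hypothetical): rounding to one decimal forces ties.
set.seed(42)
x <- round(rnorm(2000, mean = 0, sd = 1), 1)
y <- round(rnorm(50000, mean = 0.05, sd = 1), 1)

# Baseline test on the tied data; the ties warning is suppressed here.
base <- suppressWarnings(ks.test(x, y))

# Perturb each value by at most half the assumed rounding step (0.1), so the
# added noise stays within the resolution of the data, then rerun the test.
half_step <- 0.05
n_rep <- 100
p_jitter <- replicate(n_rep, {
  xj <- x + runif(length(x), -half_step, half_step)
  yj <- y + runif(length(y), -half_step, half_step)
  ks.test(xj, yj)$p.value
})

# Compare the baseline p-value with the spread of the perturbed p-values.
print(base$p.value)
print(summary(p_jitter))
```

If the perturbed p-values all land on the same side of your significance threshold as the baseline, the ties are probably not driving the conclusion. If they straddle it, the p-value from the tied data should be treated with caution.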
 
