Understanding the Kolmogorov–Smirnov test

  • Context: Undergrad 
  • Thread starter: JoePhysicsNut
  • Tags: Test
SUMMARY

The discussion centers on interpreting the output of a two-sample Kolmogorov–Smirnov (KS) test used to compare histograms. The function in question returns 1 when a histogram is compared with itself and values between 0.05 and 0.25 for histograms the poster considers reasonably compatible. The confusion concerns the documentation's claim that the returned PROB value is uniformly distributed between zero and one for compatible histograms. The reply suggests the documentation may be alluding to the fact that, when both samples come from the same distribution, the distribution of the KS statistic does not depend on the shape of that distribution, so it can be computed by assuming a uniform distribution.

PREREQUISITES
  • Understanding of the Kolmogorov–Smirnov test
  • Familiarity with histogram analysis
  • Basic knowledge of statistical distributions
  • Experience with statistical software for hypothesis testing
NEXT STEPS
  • Study the mathematical foundations of the Kolmogorov–Smirnov test
  • Learn about the implications of uniform distribution in statistical tests
  • Explore software options for performing the Kolmogorov–Smirnov test, such as R or Python's SciPy library
  • Investigate alternative methods for histogram comparison
USEFUL FOR

Statisticians, data analysts, and researchers involved in hypothesis testing and data comparison, particularly those working with histogram data.

JoePhysicsNut
Dear all,

I am using some software to perform a two-sample Kolmogorov–Smirnov test. Specifically, I am testing the compatibility of two histograms.

The function returns a single number that is 1 for a perfect match (when I compare the histogram to itself) and somewhere between 0.05 and 0.25 for histograms that show reasonable compatibility.

The method seems to work as I expect, but there is a sentence in the description of the function that I do not understand:

"The returned value PROB is calculated such that it will be uniformly distributed between zero and one for compatible histograms".


Each test yields a single value, not many, so it can't be distributed in any way. If the statement refers to many tests, then compatible histograms should yield a high value of PROB and incompatible ones a low value. Why, or how, would the distribution be uniform?

Note that I'm not a statistics expert, so a friendly explanation would be very welcome.
 
Perhaps the documentation is a mangled attempt to say that the distribution of the two-sample KS statistic is the same whenever both samples come from the same distribution, regardless of the shape of that common distribution. Hence the distribution of the KS statistic can be calculated by assuming, for the sake of simplicity, that the common distribution is uniform. (So says a poster at http://stats.stackexchange.com/questions/17495/kolmogorov-smirnov-two-sample-p-values)
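
A small simulation may help with the "uniformly distributed" wording. The statement is about repeating the test on many pairs of compatible samples: PROB behaves like a p-value, and when both samples really come from the same distribution, a p-value below 0.05 should turn up in roughly 5% of tests, one below 0.5 in roughly half of them, and so on. Compatible samples therefore do not systematically give high values. The sketch below is only an illustration under stated assumptions, not the code of the software in the question: `ks_statistic`, `ks_pvalue` and `simulate_pvalues` are hypothetical helper names, the p-value uses the standard large-sample Kolmogorov series, and the data are unbinned Gaussian samples rather than histograms.

```python
import math
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Step both empirical CDFs past every value <= x.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Large-sample approximation to P(D >= d) when both samples share one distribution.

    Assumption: the asymptotic Kolmogorov series is used here; the software in the
    question may compute PROB differently.
    """
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        # The series is unusable at lam = 0, and the probability is 1 to many digits here.
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))


def simulate_pvalues(trials=2000, n=200, m=200, seed=0):
    """Run the test on many pairs of samples drawn from the SAME distribution."""
    rng = random.Random(seed)
    pvals = []
    for _ in range(trials):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(m)]
        pvals.append(ks_pvalue(ks_statistic(a, b), n, m))
    return pvals


if __name__ == "__main__":
    pvals = simulate_pvalues()
    for cut in (0.05, 0.25, 0.5, 0.75):
        frac = sum(p < cut for p in pvals) / len(pvals)
        print(f"fraction of tests with PROB < {cut}: {frac:.3f}")
```

Each printed fraction should come out close to its cut value, which is what a uniform distribution between zero and one means in practice. The agreement is only approximate because the series is a large-sample formula. Comparing a sample with itself gives a statistic of 0 and hence a PROB of 1, matching the perfect-match case in the question. Since `ks_statistic` depends only on the ordering of the pooled values, replacing `rng.gauss` with any other continuous distribution leaves the distribution of the statistic unchanged, which is the point made in the reply above.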
 
