Distances in KNN: Are they metric functions in the mathematical sense?

SUMMARY

The discussion centers on whether distance functions used in K-Nearest Neighbors (KNN) must adhere to the mathematical properties of a metric. It concludes that while many commonly used distances, such as Euclidean, taxicab (Manhattan), and Minkowski distances, satisfy these properties, others, like cosine distance and certain boolean dissimilarities, do not. The axiom d(a,a) = 0 is highlighted as unnecessary for effective classification, since Euclidean distance plus one yields exactly the same classifier, and cosine distance is widely used despite violating the triangle inequality. Ultimately, the choice of distance function should be based on performance rather than strict adherence to the metric axioms.

PREREQUISITES
  • Understanding of K-Nearest Neighbors (KNN) algorithm
  • Familiarity with distance metrics in machine learning
  • Knowledge of mathematical properties of metrics
  • Basic concepts of binary classification
NEXT STEPS
  • Research the properties of distance metrics in machine learning
  • Explore the implementation of KNN using different distance functions
  • Learn about the implications of using cosine similarity in clustering
  • Investigate the performance of non-metric distance functions in classification tasks
USEFUL FOR

Data scientists, machine learning practitioners, and anyone involved in implementing or optimizing KNN algorithms will benefit from this discussion.

WWGD
TL;DR
The term 'distance' is used in KNN. Are these distance functions required to satisfy the properties of a metric?
Hi,
Just curious as to whether distances d, used in KNN (K-nearest neighbors) in machine learning, are required to be metrics in the mathematical sense, i.e., whether they are required to satisfy, on a space A:
##d: A \times A \rightarrow \mathbb R^{+} \cup \{0\};##
##d(a,a) = 0;##
##d(a,b) = d(b,a);##
##d(a,c) < d(a,b) + d(b,c)##
?
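For concreteness, such axioms can be spot-checked numerically for a candidate d. A minimal sketch (the helper name violates_metric_axioms is mine; a finite spot-check can only disprove an axiom, never prove it):

    import itertools
    import numpy as np

    rng = np.random.default_rng(0)
    pts = rng.normal(size=(15, 3))  # 15 random points in R^3

    def violates_metric_axioms(d, pts, tol=1e-12):
        """Return the first axiom the candidate distance d violates
        on the sample points, or None if the spot-check passes."""
        for a, b, c in itertools.product(pts, repeat=3):
            if d(a, b) < -tol:
                return "non-negativity"
            if abs(d(a, a)) > tol:
                return "d(a,a) = 0"
            if abs(d(a, b) - d(b, a)) > tol:
                return "symmetry"
            if d(a, c) > d(a, b) + d(b, c) + tol:
                return "triangle inequality"
        return None

    euclidean = lambda x, y: np.linalg.norm(x - y)
    manhattan = lambda x, y: np.abs(x - y).sum()
    print(violates_metric_axioms(euclidean, pts))  # None
    print(violates_metric_axioms(manhattan, pts))  # None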
 
Suggestion: Expand on what you are talking about.
 
mathman said:
Suggestion: Expand on what you are talking about.
Sure. We're trying to do a binary classification (0/1) of test subjects with the following method. I am simplifying somewhat.

Basically an object belongs to the class of objects it is closest to, by some fixed criterion d.

0) Choose a positive integer k
1) We have objects known to belong to both class 0 and class 1
2) We have a collection of objects to test on whether they belong to class 0 or class 1.
3) We have a real-valued, non-negative binary function ##d: A \times B \rightarrow \mathbb R##, where A is the collection in 2) and B the collection in 1).
4) For a fixed test object a, we find the values ##d(a,B) := \{ d(a,b) : b \in B \}##.
5) a is assigned to the class of the object b for which d(a,b) is smallest (this is the k = 1 case; in general, a goes to the majority class among the k objects b with smallest d(a,b)); see the sketch just below.
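Here is a minimal sketch of the procedure in Python (the helper name knn_classify and the toy data are mine, just to fix ideas; note that the distance function is a plug-in parameter, which is exactly why the question of what properties d must satisfy arises):

    import numpy as np

    def knn_classify(a, B, labels, d, k=1):
        """Assign a the majority label among its k nearest labelled
        points in B, with nearness measured by the plug-in function d."""
        dists = [d(a, b) for b in B]             # step 4: d(a,b) for all b in B
        nearest = np.argsort(dists)[:k]          # indices of the k smallest distances
        votes = [labels[i] for i in nearest]
        return max(set(votes), key=votes.count)  # step 5: majority class

    # toy usage with Euclidean d: two labelled points per class
    euclidean = lambda x, y: np.linalg.norm(np.asarray(x) - np.asarray(y))
    B = [(0, 0), (0, 1), (5, 5), (5, 6)]
    labels = [0, 0, 1, 1]
    print(knn_classify((1, 0), B, labels, euclidean, k=3))  # -> 0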

I suspect @pbuk, @Office_Shredder or @Dale may know.
 
WWGD said:
... ##d(a,c) < d(a,b)+d(b,c) ##
I think you mean ## d(a,c) \le d(a,b)+d(b,c) ##.

I am not sure we can say that the distance functions are required to satisfy any conditions; however, we are only interested in functions that perform well, and it may be possible to show that certain conditions must be satisfied in order for this to be the case.

Many distances in ML do satisfy the triangle inequality; however, I don't think this is true for the cosine distance, and I have a hunch this is also the case for some of the distances on boolean spaces, e.g. the Kulsinski distance.
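Here is a quick check of that hunch for the boolean case (my own sketch: I implement the Kulsinski dissimilarity directly from the formula given in SciPy's documentation, (c_TF + c_FT - c_TT + n) / (c_FT + c_TF + n), rather than importing it; the brute-force search simply reports whatever it finds over short vectors):

    import itertools

    def kulsinski(u, v):
        """Kulsinski dissimilarity between boolean vectors, per the
        formula in SciPy's docs."""
        n = len(u)
        ctt = sum(x and y for x, y in zip(u, v))      # both true
        ctf = sum(x and not y for x, y in zip(u, v))  # true in u, false in v
        cft = sum(not x and y for x, y in zip(u, v))  # false in u, true in v
        return (ctf + cft - ctt + n) / (cft + ctf + n)

    a = (1, 0, 1, 0)
    print(kulsinski(a, a))  # 0.5, not 0: the axiom d(a,a) = 0 already fails

    # brute-force search for triangle-inequality violations on length-4 vectors
    vecs = list(itertools.product((0, 1), repeat=4))
    bad = [(x, y, z) for x in vecs for y in vecs for z in vecs
           if kulsinski(x, z) > kulsinski(x, y) + kulsinski(y, z) + 1e-12]
    print(len(bad), "triangle-inequality violations found")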

Taking the other contemplated 'requirements':
  • ## d: A \times A \rightarrow \mathbb R^{+} \cup \{0\} ##: clearly not the case for boolean metrics.
  • ## d(a,a)=0 ##: I don't see this as necessary: Euclidean distance + 1 would work almost as well as Euclidean distance in most cases.
  • ## d(a,b)=d(b,a)## I'll leave a counterexample for this to you :-p
 
I agree with the clever observation that Euclidean distance + 1 results in the exact same algorithm even though it is not a metric: it fails ##d(a,a) = 0##. (It does still satisfy the triangle inequality, since adding 1 throughout ##d(a,c) \le d(a,b) + d(b,c)## adds 1 to the left side but 2 to the right.)
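To illustrate (a sketch of my own): shifting every Euclidean distance by 1 is a strictly increasing transform, so the nearest-neighbour ordering, and hence the classifier, is unchanged, yet the shifted function fails d(a,a) = 0:

    import numpy as np

    euclid = lambda x, y: np.linalg.norm(np.asarray(x) - np.asarray(y))
    shifted = lambda x, y: euclid(x, y) + 1.0  # not a metric: shifted(a, a) == 1

    a = np.array([1.0, 0.0])
    B = [np.array(p, dtype=float) for p in [(0, 0), (0, 1), (5, 5), (5, 6)]]

    # identical orderings, since x -> x + 1 is strictly increasing
    print(np.argsort([euclid(a, b) for b in B]))   # [0 1 2 3]
    print(np.argsort([shifted(a, b) for b in B]))  # [0 1 2 3]
    print(shifted(a, a))                           # 1.0, so d(a,a) = 0 fails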

Cosine distance (one minus cosine similarity) does not satisfy the triangle inequality, as noted above, but it does have the property that if A and B are similar, then the similarity of A and C is about the same as the similarity of B and C. Otherwise you couldn't really get clusters of points that are similar to each other and dissimilar from the other clusters.
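A concrete instance with numbers of my own: take unit vectors at 0°, 45° and 90°; the direct cosine distance exceeds the two-hop sum, so the triangle inequality fails:

    import numpy as np

    def cosine_distance(u, v):
        """1 - cosine similarity."""
        return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

    a = np.array([1.0, 0.0])               # 0 degrees
    b = np.array([1.0, 1.0]) / np.sqrt(2)  # 45 degrees
    c = np.array([0.0, 1.0])               # 90 degrees

    print(cosine_distance(a, c))                          # 1.0
    print(cosine_distance(a, b) + cosine_distance(b, c))  # ~0.586 < 1.0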
 
