# I Understanding Mahalanobis distance

Tags:
1. Nov 1, 2016

### Avatrin

I am currently taking a course in pattern recognition, and several times I have encountered the multivariable normal distribution and thus, Mahalanobis distance. I want to understand Mahalanobis distance; Primarily for understanding the normal distribution, but also to understand the measure itself.

I have read several intuitive explanations here and in books, but how can I do this rigorously? I have had some, but not much, measure theory (in the part of a real analysis course that covered integration theory). I have had one introductory statistics and probability theory course.

What should I read to understand the Mahalanobis distance? Clearly it is related to ellipoids, but how? Again, I don't want a rough intuitive explanation.

2. Nov 1, 2016

3. Nov 1, 2016

### Avatrin

Well, I did write I want to understand it in regard to the N dimensional normal distribution. According to Wikipedia, it tells me how many standard deviations a point x is from the mean of the deviation. The books on pattern recognition don't even really mention it.

So, I guess, what I am looking for is the property of the metric. Why is it used in statistics rather than some other metric?

4. Nov 1, 2016

### MarneMath

Imagine you have units on different scales, then the euclidian distance doesn't really make sense, since you're simply adding the squared units of that measurement. So if it's all the same, then we're good. However, if you have units in different scales and types, the idea of distance becomes a bit more complicated. In fact, I don't really like to think about the Mahalonbis distance as a distance but rather a measurement of intensity.

The number one answer here does a good job of explaining how Mahalonbis does a good job at transforming the data into something reasonable: http://stats.stackexchange.com/questions/62092/bottom-to-top-explanation-of-the-mahalanobis-distance

Overall, though the main advantages are that it considers variance, covariances and unitizes uncorrelated variables for the Euclidian distance.