# Newbie question: Algebra of Mahalanobis distance

1. Nov 11, 2013

### anja.ende

Hello,

The Mahalanobis distance, or rather its square, is defined as:

$(X-\mu)^2/\Sigma$, which is then written as

$(X-\mu)^{T} \Sigma^{-1}(X-\mu)$

$\Sigma$ is the covariance matrix. My silly question is: why is $\Sigma^{-1}$ placed in the middle of the dot product of the $(X-\mu)$ vector with itself? I am sure this makes sense mathematically (this reduces the output to a scalar), but I would like to know the intuitive reason behind it.

Thanks a lot!
Anja

2. Nov 11, 2013

### Office_Shredder

Staff Emeritus
The idea behind the Mahalanobis distance is that, in the one-dimensional case, you are measuring how many standard deviations X is from the mean. In the multidimensional case, $\Sigma$ is a positive (semi)definite matrix (it has to be positive definite for $\Sigma^{-1}$ to exist), which has a unique positive (semi)definite square root that I will call S, so that $S^2 = \Sigma$. S serves the same role as the standard deviation. Then the expression above is the same as

$$\left( S^{-1}(X-\mu) \right)^T \left(S^{-1}(X-\mu) \right)$$

basically, you scale the random vector $X-\mu$ by the inverse of the standard deviation, the same as you would divide by the standard deviation in the one-dimensional case.
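To spell out why the two forms agree: S is symmetric and $S^2 = \Sigma$, so

$$\left( S^{-1}(X-\mu) \right)^T \left(S^{-1}(X-\mu) \right) = (X-\mu)^T S^{-1} S^{-1} (X-\mu) = (X-\mu)^T \Sigma^{-1} (X-\mu)$$

A quick numerical check of this, as a minimal sketch with NumPy. The values of `mu`, `Sigma` and `x` are made up purely for illustration, and the square root S is built from an eigendecomposition, which is one of several ways to get it.

```python
import numpy as np

# Hypothetical 2-D example; these numbers are illustrative assumptions only.
mu = np.array([1.0, 2.0])
Sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])  # symmetric positive definite covariance
x = np.array([3.0, 1.0])

d = x - mu

# Direct form: (x - mu)^T Sigma^{-1} (x - mu), using a linear solve
# instead of forming the inverse explicitly.
d2_direct = d @ np.linalg.solve(Sigma, d)

# Symmetric square root S of Sigma, from Sigma = V diag(lambda) V^T.
eigvals, eigvecs = np.linalg.eigh(Sigma)
S = eigvecs @ np.diag(np.sqrt(eigvals)) @ eigvecs.T

# "Standardized" vector z = S^{-1}(x - mu); its ordinary dot product with
# itself is the squared Mahalanobis distance.
z = np.linalg.solve(S, d)
d2_whitened = z @ z

print(d2_direct, d2_whitened)
```

Both printed numbers should agree up to floating-point rounding. So the $\Sigma^{-1}$ in the middle is just what is left over after you standardize both copies of $X-\mu$ and take a plain dot product.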

3. Nov 11, 2013

### D H

Staff Emeritus
The expression $(X-\mu)^T \Sigma^{-1}(X-\mu) = \sigma^2$ defines a family of hyperellipsoids in the N-dimensional space in which X and μ live, characterized by the scalar parameter σ. I used σ intentionally. Think of σ as representing "standard deviations". For example, $(X-\mu)^T \Sigma^{-1}(X-\mu) = 1$ is the one sigma hyperellipsoid.

The Mahalanobis distance is essentially a measure of how many standard deviations a point X is from the mean μ.
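Here is a small sketch of the hyperellipsoid picture, assuming NumPy. The helper `mahalanobis` and the numbers in `mu` and `Sigma` are hypothetical, chosen only for illustration. Stepping from μ along each principal axis of Σ by k times the square root of the corresponding eigenvalue should land on the k-sigma ellipsoid, and in one dimension the distance reduces to $|x-\mu|/\sigma$.

```python
import numpy as np

def mahalanobis(x, mu, Sigma):
    """Mahalanobis distance of x from mu for covariance Sigma (illustrative helper)."""
    d = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    return float(np.sqrt(d @ np.linalg.solve(Sigma, d)))

# Hypothetical mean and covariance, used only for illustration.
mu = np.array([1.0, 2.0])
Sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])

# Columns of eigvecs are the principal axes; eigvals are variances along them.
eigvals, eigvecs = np.linalg.eigh(Sigma)

for k in (1.0, 2.0):
    for lam, v in zip(eigvals, eigvecs.T):
        # A point k "standard deviations" out along one principal axis.
        point = mu + k * np.sqrt(lam) * v
        print(k, mahalanobis(point, mu, Sigma))

# One-dimensional case: the covariance matrix is just [[sigma**2]].
sigma = 3.0
print(mahalanobis([10.0], [4.0], np.array([[sigma**2]])))
```

Each line of the loop should print a distance equal to its k (up to rounding), and the last line is the familiar count of standard deviations from the mean.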

4. Nov 11, 2013

### anja.ende

Thank you guys!