Mahalanobis Distance using Eigen-Values of the Covariance Matrix

orajput
Given the formula of Mahalanobis Distance:

D^2_M = (\mathbf{x} - \mathbf{\mu})^T \mathbf{S}^{-1} (\mathbf{x} - \mathbf{\mu})

If I simplify the above expression using Eigen-value decomposition (EVD) of the Covariance Matrix:

\mathbf{S} = \mathbf{P} \Lambda \mathbf{P}^T

Then,

D^2_M = (\mathbf{x} - \mathbf{\mu})^T \mathbf{P} \Lambda^{-1} \mathbf{P}^T (\mathbf{x} - \mathbf{\mu})

Let \mathbf{b} be the vector of projections of (\mathbf{x} - \mathbf{\mu}) onto the eigen-vectors in \mathbf{P}; then:

\mathbf{b} = \mathbf{P}^T(\mathbf{x} - \mathbf{\mu})

And,

D^2_M = \mathbf{b}^T \Lambda^{-1} \mathbf{b}

D^2_M = \sum_i{\frac{b^2_i}{\lambda_i}}
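As a quick sanity check of the derivation above, here is a minimal NumPy sketch comparing the direct form with the eigen-value form. The data, sizes, and variable names are made up for illustration, and the dataset is chosen with more observations than variables so that \mathbf{S} is full rank.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 observations of 4 variables, so S is full rank.
X = rng.normal(size=(200, 4))
mu = X.mean(axis=0)
S = np.cov(X, rowvar=False)

x = rng.normal(size=4)
d = x - mu

# Direct form: (x - mu)^T S^{-1} (x - mu), computed with a linear solve.
d2_direct = d @ np.linalg.solve(S, d)

# EVD form: S = P diag(lam) P^T, b = P^T (x - mu), D^2 = sum(b_i^2 / lam_i).
lam, P = np.linalg.eigh(S)
b = P.T @ d
d2_evd = np.sum(b**2 / lam)

print(d2_direct, d2_evd)
```

When all eigen-values are strictly positive, the two numbers should agree up to floating-point rounding.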

The problem that I am facing right now is as follows:

The covariance matrix \mathbf{S} is calculated from a dataset in which the number of observations is less than the number of variables. This produces some zero-valued eigen-values in the EVD of \mathbf{S}.

In these cases the above simplified expression does not result in the same Mahalanobis Distance as the original expression, i.e.:

(\mathbf{x} - \mathbf{\mu})^T \mathbf{S}^{-1} (\mathbf{x} - \mathbf{\mu}) \neq \sum_{i:\, \lambda_i \neq 0}{\frac{b^2_i}{\lambda_i}}

My question is: does the simplified expression still functionally represent the Mahalanobis Distance?
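One way to probe the rank-deficient case numerically is sketched below. It assumes a made-up dataset with fewer observations than variables and an arbitrary relative cutoff (`rel_tol`) for deciding which eigen-values count as zero; neither comes from the original problem. The sketch compares the sum over non-zero \lambda_i with the quadratic form built from the Moore-Penrose pseudo-inverse of \mathbf{S}, since a true inverse does not exist here.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 5 observations of 10 variables, so rank(S) <= 4.
X = rng.normal(size=(5, 10))
mu = X.mean(axis=0)
S = np.cov(X, rowvar=False)

d = X[0] - mu  # a training point, so d lies in the span of the data

lam, P = np.linalg.eigh(S)
b = P.T @ d

# Keep only eigenvalues that are non-zero relative to the largest one.
rel_tol = 1e-10  # assumed cutoff, not from the thread
keep = lam > rel_tol * lam.max()

# Truncated sum: only the non-zero eigenvalues contribute.
d2_truncated = np.sum(b[keep] ** 2 / lam[keep])

# Same quantity via the pseudo-inverse with a matching cutoff.
d2_pinv = d @ np.linalg.pinv(S, rcond=rel_tol) @ d

print(keep.sum(), d2_truncated, d2_pinv)
```

The truncated sum matches the pseudo-inverse form, not a numerically computed \mathbf{S}^{-1}, which is unreliable for a singular matrix. Note also that if (\mathbf{x} - \mathbf{\mu}) has a component along a zero-variance direction, the truncated sum simply ignores it.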

P.S.: The motivation for using the simplified expression of the Mahalanobis Distance is to calculate its gradient with respect to \mathbf{b}.
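For reference, differentiating D^2_M = \sum_i b^2_i / \lambda_i term by term gives \partial D^2_M / \partial b_i = 2 b_i / \lambda_i. A minimal sketch that checks this against central finite differences follows; the function names and the numeric values of \lambda_i and b_i are hypothetical, chosen only for illustration.

```python
import numpy as np

def d2(b, lam):
    """Squared Mahalanobis distance in eigen-coordinates: sum(b_i^2 / lam_i)."""
    return np.sum(b**2 / lam)

def grad_d2(b, lam):
    """Analytic gradient with respect to b: 2 * b_i / lam_i."""
    return 2.0 * b / lam

# Hypothetical values for illustration only.
lam = np.array([4.0, 1.0, 0.25])
b = np.array([1.0, -2.0, 0.5])

# Central finite differences as an independent check.
eps = 1e-6
numeric = np.array([
    (d2(b + eps * e, lam) - d2(b - eps * e, lam)) / (2 * eps)
    for e in np.eye(len(b))
])

analytic = grad_d2(b, lam)
print(analytic, numeric)
```

The gradient is only defined component-wise for \lambda_i > 0, so any zero eigen-values would have to be excluded before using it.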
 
Hello,

For S to be invertible, it must have no zero eigen-values. Since a covariance matrix is always positive semi-definite, this means S must be positive definite. Apart from that, the expression should work...

All the best

GoodSpirit
 
Hey orajput and welcome to the forums.

For your problem, if you have a singular or ill-conditioned covariance matrix, I would try something like Principal Components, or remove the offending variable from your system and redo the analysis.
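A rough sketch of the Principal Components route, assuming made-up data and an arbitrary number of retained components `k`: the principal directions are taken from an SVD of the centred data, and the distance is computed only in the retained subspace. The helper name `mahalanobis_sq_reduced` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: 6 observations of 12 variables.
X = rng.normal(size=(6, 12))
mu = X.mean(axis=0)
Xc = X - mu

# PCA via SVD of the centred data; rows of Vt are principal directions.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
var = s**2 / (X.shape[0] - 1)  # variance along each direction

k = 3  # number of components kept; an arbitrary choice for illustration
components = Vt[:k]

def mahalanobis_sq_reduced(x):
    """Squared Mahalanobis distance of x in the k-dimensional PCA subspace."""
    scores = components @ (x - mu)
    return np.sum(scores**2 / var[:k])

print(mahalanobis_sq_reduced(X[0]))
```

Working from the SVD avoids forming the full covariance matrix, and choosing `k` no larger than the rank of the data keeps every retained variance strictly positive.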
 