# Question about sample covariance matrix

1. May 23, 2012

### sanctifier

Suppose X1, X2, ..., XN are mutually independent random vectors (I mean each Xi is a vector of random variables indexed by component, all distributed like a common random vector X, so the Xi are just samples of such an X). The sample mean is $\hat{M} = \frac{1}{N}\sum_{i=1}^{N} X_i$, and the true mean of each Xi is M. Then, to estimate the covariance matrix of the Xi, we employ the following formula:
$\hat{\Sigma} = \frac{1}{N}\sum_{i=1}^{N} (X_i - \hat{M})(X_i - \hat{M})^T$
$= \frac{1}{N}\sum_{i=1}^{N} \{((X_i - M) - (\hat{M} - M))((X_i - M) - (\hat{M} - M))^T\}$
$= \frac{1}{N}\sum_{i=1}^{N} (X_i - M)(X_i - M)^T - (\hat{M} - M)(\hat{M} - M)^T$
My question is how does the equal sign hold in the last step?
I did some work on this question. First I note that the transpose is a linear transformation, i.e., for two vectors V and U, $(V + U)^T = V^T + U^T$, so I realize that the following expansion is valid:
$(V - U)(V - U)^T = VV^T - VU^T - UV^T + UU^T$
Let V = $(X_i - M)$ and U = $(\hat{M} - M)$. The terms missing in the last step of $\hat{\Sigma}$ are $-VU^T$ and $-UV^T$. OK, I know the entries of $E[VU^T]$ are actually covariances of $X_i$ and $\hat{M}$, and if I assume they are all zero, then $-VU^T$ and $-UV^T$ would disappear as a consequence of taking the expectation of $\hat{\Sigma}$; but in the last step they vanished before taking any expectation! Why?
Finally, I also notice that the sign of $UU^T = (\hat{M} - M)(\hat{M} - M)^T$ has changed from + to -; how does this happen?
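For what it's worth, the equality in the last step can at least be confirmed numerically; it holds exactly for any sample, not just in expectation. A quick sketch with NumPy (the sample size, dimension, and mean below are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 50, 3
M = np.array([1.0, -2.0, 0.5])      # true mean (chosen arbitrarily)
X = rng.normal(size=(N, d)) + M     # rows are the samples X_i
M_hat = X.mean(axis=0)              # sample mean

# Left side: (1/N) * sum_i (X_i - M_hat)(X_i - M_hat)^T
lhs = (X - M_hat).T @ (X - M_hat) / N

# Right side: (1/N) * sum_i (X_i - M)(X_i - M)^T - (M_hat - M)(M_hat - M)^T
rhs = (X - M).T @ (X - M) / N - np.outer(M_hat - M, M_hat - M)

print(np.allclose(lhs, rhs))  # → True
```

The check passes for any seed, which suggests the step is pure algebra rather than something that relies on expectations.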

Last edited: May 23, 2012
2. May 23, 2012

### sanctifier

OK, if 1/N can be viewed as an approximate probability attached to each entry of $VU^T$, that could explain the vanishing of $-VU^T$ and $-UV^T$ without taking an expectation, but how can the sign change of $UU^T$ in the last step be explained?
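One thing that can be checked numerically is the sample average of the cross terms themselves. Since U = $\hat{M} - M$ is the same for every i, averaging $V_i U^T$ over i gives $(\frac{1}{N}\sum_i V_i)U^T$, and $\frac{1}{N}\sum_i V_i = \hat{M} - M = U$. A sketch of this check with NumPy (setup values are again arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
N, d = 40, 2
M = np.array([0.3, -1.2])           # true mean (arbitrary)
X = rng.normal(size=(N, d)) + M     # rows are the samples X_i
M_hat = X.mean(axis=0)

U = M_hat - M                       # U is the same for every i
UUT = np.outer(U, U)

# Average of the cross terms V_i U^T, where V_i = X_i - M:
# (1/N) * sum_i (X_i - M) U^T = (M_hat - M) U^T = U U^T, exactly
avg_VUT = np.outer((X - M).mean(axis=0), U)

print(np.allclose(avg_VUT, UUT))  # → True
```

So each of the averaged cross terms equals $UU^T$ exactly, for every sample, with no expectation involved.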