# B Elementary eigenvector question

1. Mar 27, 2017

### jonjacson

Hi
If I have this matrix:
$\begin{pmatrix}0&1\\1&0\end{pmatrix}$
and I want to find its eigenvectors and eigenvalues, I can try it using the definition of an eigenvector which is:

$A\mathbf{x} = \lambda \mathbf{x}$, where $\mathbf{x}$ is an eigenvector

But if I try this directly with a column eigenvector (a b), I fail to get the right answer; instead I get:

(b a) = λ (a b) , (These are column vectors.)

There is no lambda able to make this correct, unless it is zero, which is not the right answer. Why didn't this approach work?

I have to use the identity matrix and the determinant of $A - \lambda I$ to get the right result.

Thanks!

2. Mar 27, 2017

### Staff: Mentor

The claim that no lambda can make this correct is false. The system of two equations in two unknowns is not hard to solve. Try it!

3. Mar 27, 2017

### BvU

You can check that this must be wrong by comparing it with the outcome of the determinant approach, where you find $\lambda^2-1 = 0 \Rightarrow \lambda = \pm 1$ and can then determine the corresponding eigenvectors from (b a) = λ (a b)!

4. Mar 27, 2017

### jonjacson

Oops, ok, you are right. I get the same results, and then for the eigenvectors I get:

$\lambda=1$ ---> (a a): I simply get a = b, so the eigenvector is (1 1)

$\lambda=-1$ ---> (a -a): I get a = -b, so the eigenvector is (1 -1)
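
For anyone who wants to double-check this numerically, here is a minimal NumPy sketch (just an illustration of the result above; NumPy returns the eigenvectors normalized to unit length):

```python
import numpy as np

# The matrix from the original post.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])

# Eigenvalues and eigenvectors; the eigenvectors are the columns of `vecs`.
vals, vecs = np.linalg.eig(A)
print(vals)   # 1 and -1 (order may vary)
print(vecs)   # columns proportional to (1 1) and (1 -1)

# Verify the defining equation A x = lambda x for each eigenpair.
for lam, v in zip(vals, vecs.T):
    assert np.allclose(A @ v, lam * v)
```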

5. Mar 27, 2017

### BvU

Well done! Apart from normalization, you have found the eigenvectors of the first of the Pauli matrices; they play an important role in quantum mechanics for particles with spin.

6. Mar 27, 2017

### jonjacson

Thanks.

7. Mar 30, 2017

### mathwonk

You could just picture the map: from the matrix columns it is obvious that it interchanges the standard unit vectors on the x and y axes. Hence it is a reflection of the plane in the line x = y, so it leaves invariant both that line and its orthocomplement, acting as the identity on the line x = y and as minus the identity on the line x = -y. But the algebraic method is more sure, if less geometric.
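
A quick numerical illustration of that geometric picture (the sample points are arbitrary): reflection across the line x = y fixes vectors along x = y and flips vectors along x = -y.

```python
import numpy as np

A = np.array([[0, 1],
              [1, 0]])

# The map swaps the standard unit vectors on the x and y axes ...
print(A @ np.array([1, 0]))   # -> [0 1]
print(A @ np.array([0, 1]))   # -> [1 0]

# ... acts as the identity on the line x = y (eigenvalue +1) ...
print(A @ np.array([3, 3]))   # -> [3 3]

# ... and as minus the identity on the line x = -y (eigenvalue -1).
print(A @ np.array([2, -2]))  # -> [-2 2]
```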

8. Mar 30, 2017

### WWGD

This is a good idea for transformations but it seems harder for other situations. Can you (or anyone else, of course) see some geometric interpretations for e.g. a correlation or covariance matrix?

9. Apr 4, 2017

### WWGD

Sorry, I don't mean to hijack the thread; it is just that I am curious about a related issue: the interpretation of eigenvalues in correlation/covariance matrices. These supposedly describe directions of maximal variability of the data, but I just cannot see it at this point. I thought that, since the OP seems satisfied with the answers given, it may make sense to extend the thread beyond the original scope.

10. Apr 5, 2017

### StoneTemplePython

I'm not totally sure I understand your question, as there are a lot of interpretations here. In all cases I assume we're working with centered (read: zero mean, by column) data. I also assume we're operating over the reals.

If you have your data in a matrix $\mathbf A$, and you have some arbitrary vector $\mathbf x$ with $\big \vert \big \vert \mathbf x \big \vert \big \vert_2^{2} = 1$, then to maximize $\big \vert \big \vert \mathbf{Ax} \big \vert \big \vert_2^{2}$ you'd allocate entirely to $(\lambda_1, \mathbf v_1)$, the largest eigenpair of $\mathbf A^T \mathbf A$ (aka the largest singular value (squared) of $\mathbf A$ and the associated right singular vector). This is a quadratic-form interpretation of your answer. Typically people prove this with a diagonalization argument or a Lagrange multiplier argument. I assume the eigenvalues of this symmetric positive (semi)definite covariance matrix are ordered, so $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_n \geq 0$. Write
$\mathbf A = \bigg[\begin{array}{c|c|c|c} \mathbf a_1 & \mathbf a_2 &\cdots & \mathbf a_{n} \end{array}\bigg]$
That is, $\mathbf a_j$ refers to the $j$th feature column of $\mathbf A$.

Using the interpretation of matrix-vector multiplication as a weighted sum of the columns of the matrix, we see that $\mathbf {Ax} = x_1 \mathbf a_1 + x_2 \mathbf a_2 + \dots + x_n \mathbf a_n$.

Thus when someone asks for a constrained maximization of $\big \vert \big \vert \mathbf{Ax} \big \vert \big \vert_2^{2}$, what they are saying is: come up with the linear combination of the features of the data matrix $\mathbf A$ that has maximal length, subject to the constraint that $x_1^2 + x_2^2 + \dots + x_n^2 = 1$ (or some other constant $> 0$, but we use one for simplicity here). Since all features are zero mean (i.e. you centered your data), what you have done is extract the vector with the highest second moment / variance obtainable from your features, again subject to the constraint $x_1^2 + x_2^2 + \dots + x_n^2 = 1$.
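
If it helps, here is a small NumPy sketch of that first interpretation (the data matrix is random and purely illustrative): after centering the columns, the unit vector $\mathbf x$ maximizing $\big \vert \big \vert \mathbf{Ax} \big \vert \big \vert_2^{2}$ is the top eigenvector of $\mathbf A^T \mathbf A$, and the maximum value attained is the top eigenvalue $\lambda_1$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: 200 samples, 3 features, centered by column.
A = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 0.3])
A -= A.mean(axis=0)

# Eigendecomposition of A^T A (symmetric PSD); eigh sorts eigenvalues
# in ascending order, so the largest eigenpair is the last one.
vals, vecs = np.linalg.eigh(A.T @ A)
lam1, v1 = vals[-1], vecs[:, -1]

# ||A v1||^2 attains the largest eigenvalue ...
assert np.isclose(np.linalg.norm(A @ v1) ** 2, lam1)

# ... and no other unit vector does better (random probes as a sanity check).
for _ in range(1000):
    x = rng.normal(size=3)
    x /= np.linalg.norm(x)
    assert np.linalg.norm(A @ x) ** 2 <= lam1 + 1e-9
```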

Here is another interpretation:

If you wanted to make a low-rank approximation -- say rank one -- of your matrix $\mathbf A$, and you were using $\big \vert \big \vert \mathbf A \big \vert \big \vert_F^{2}$ as your ruler (i.e. the sum of the squared values of everything in $\mathbf A$, which is a generalization of the L2 norm on vectors), you'd also allocate entirely to $\lambda_1$. Here $\big \vert \big \vert \mathbf A \big \vert \big \vert_F^{2} = \text{trace}\big(\mathbf A^T \mathbf A\big) = \lambda_1 + \lambda_2 + \dots + \lambda_n$, where the associated eigenvectors are mutually orthonormal, so we have a clean partition. For each eigenvalue you allocate to, you increase the rank of your approximation; thus for a rank-2 approximation you'd allocate to $\lambda_1$ and $\lambda_2$, and so forth.
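
And a sketch of this second interpretation, again with an arbitrary random matrix: the squared Frobenius norm splits across the eigenvalues of $\mathbf A^T \mathbf A$ (equivalently the squared singular values of $\mathbf A$), and the best rank-1 approximation in that norm keeps only the largest one, leaving the rest as the squared error.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 4))

# Squared Frobenius norm equals trace(A^T A) = sum of its eigenvalues.
vals = np.linalg.eigvalsh(A.T @ A)            # ascending order
assert np.isclose(np.linalg.norm(A, 'fro') ** 2, vals.sum())

# The best rank-1 approximation (in Frobenius norm) keeps the top singular triple.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A1 = s[0] * np.outer(U[:, 0], Vt[0])

# The leftover squared error is the sum of all the remaining eigenvalues.
err = np.linalg.norm(A - A1, 'fro') ** 2
assert np.isclose(err, vals[:-1].sum())
```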