Why Didn't the Eigenvector Approach Work for This Matrix?

  • Context: High School
  • Thread starter: jonjacson
  • Tags: Eigenvector, Elementary

Discussion Overview

The discussion revolves around the process of finding eigenvectors and eigenvalues for a specific matrix, particularly the matrix \(\begin{pmatrix}0&1\\1&0\end{pmatrix}\). Participants explore the application of the eigenvector definition and the determinant method, addressing challenges encountered during the calculations. The conversation also touches on geometric interpretations and extends to the interpretation of eigenvalues in correlation and covariance matrices.

Discussion Character

  • Technical explanation
  • Conceptual clarification
  • Debate/contested
  • Mathematical reasoning
  • Experimental/applied

Main Points Raised

  • One participant attempts to find eigenvectors and eigenvalues using the definition \(A x = \lambda x\) but encounters difficulties, suggesting that no suitable \(\lambda\) exists unless it is zero.
  • Another participant challenges this assertion, indicating that the resulting equations can be solved and that the initial claim is incorrect.
  • A participant later confirms the eigenvalues found using the determinant method, stating \(\lambda^2 - 1 = 0\) leads to \(\lambda = \pm 1\) and provides corresponding eigenvectors.
  • One participant notes the significance of the eigenvectors in quantum mechanics, specifically relating to the Pauli matrices.
  • There is a discussion about the geometric interpretation of the matrix, describing it as a reflection in the plane and its effects on standard unit vectors.
  • Another participant expresses curiosity about extending the discussion to the interpretation of eigenvalues in correlation and covariance matrices, questioning their meaning in terms of data variability.
  • A detailed explanation is provided regarding the maximization of variance in data represented by a covariance matrix, including the role of eigenvalues and eigenvectors in this context.

Areas of Agreement / Disagreement

Participants generally agree on the eigenvalues and eigenvectors derived from the determinant method, but there is initial disagreement regarding the validity of the direct approach to finding eigenvectors. The discussion on the interpretation of eigenvalues in correlation matrices remains unresolved, with differing perspectives on their significance.

Contextual Notes

Some assumptions about the definitions and properties of eigenvalues and eigenvectors are not explicitly stated, and the discussion includes various interpretations that may depend on specific contexts, such as the treatment of data in correlation matrices.

Who May Find This Useful

Readers interested in linear algebra, particularly eigenvalues and eigenvectors, as well as those exploring applications in quantum mechanics and data analysis, may find this discussion beneficial.

jonjacson
Hi
If I have this matrix:
##\begin{pmatrix}0&1\\1&0\end{pmatrix}##
and I want to find its eigenvectors and eigenvalues, I can try it using the definition of an eigenvector, which is:

A x = λ x, where x is an eigenvector.

But if I try this directly I fail to get the right answer. For example, using a column eigenvector (a b), I get:

(b a) = λ (a b). (These are column vectors.)

There is no lambda able to make this correct, unless it is zero, which is not the right answer. Why didn't this approach work?

I have to use the identity matrix and the determinant of A − λI to get the right result.

Thanks!
 
jonjacson said:
(b a) = λ (a b). (These are column vectors.)

There is no lambda able to make this correct, unless it is zero, which is not the right answer. Why didn't this approach work?
The part I bolded is false. The set of two equations in two unknowns is not hard to solve. Try it!
 
jonjacson said:
There is no lambda able to make this correct, unless it is zero
You can check that this must be wrong by comparing it with the outcome of
jonjacson said:
I have to use the identity matrix, and the determinant of A - λ I, to get the right result
where you find ##\lambda^2-1 = 0 \Rightarrow \lambda = \pm 1## and can then determine the corresponding eigenvectors from (b a) = λ (a b) !
 
Oops, OK, you are right. I get the same results, and then for the eigenvectors I get:

Lambda = 1 ---> (a a): I simply get a = b, so the eigenvector is (1 1).

Lambda = -1 ---> (a -a): I get a = -b, so the eigenvector is (1 -1).
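These values are easy to confirm numerically. Here is a quick illustrative check (a sketch added for the reader, not part of the original exchange), using numpy's `eigh`, which is appropriate since the matrix is symmetric:

```python
import numpy as np

# The matrix from the original question.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])

# eigh handles symmetric matrices and returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(A)
print(eigenvalues)  # [-1.  1.]

# Each column of `eigenvectors` is a normalized eigenvector:
# a multiple of (1, -1) for lambda = -1 and of (1, 1) for lambda = +1.
for lam, v in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ v, lam * v)  # A v = lambda v
```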
 
Well done! Apart from normalization, you have found the eigenvectors of the first of the Pauli matrices; they play an important role in quantum mechanics for particles with spin.
 
Yes, that is what I was reading about.

Thanks.
 
You could just picture the map: from the matrix columns, it is obvious that it interchanges the standard unit vectors on the x and y axes. Hence it is a reflection of the plane in the line x = y, so it leaves invariant both that line and its orthocomplement, acting as the identity on the line x = y and as minus the identity on the line x = -y. But the algebraic method is more sure, if less geometric.
 
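The geometric picture is easy to verify directly (an illustrative sketch, not from the thread): the matrix swaps the standard basis vectors, fixes every vector on the line x = y, and negates every vector on the line x = -y.

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])

# The columns show that the map interchanges the standard unit vectors.
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
assert np.allclose(A @ e1, e2) and np.allclose(A @ e2, e1)

# A reflection in the line x = y: identity on that line ...
assert np.allclose(A @ np.array([1.0, 1.0]), [1.0, 1.0])
# ... and minus the identity on the orthogonal line x = -y.
assert np.allclose(A @ np.array([1.0, -1.0]), [-1.0, 1.0])
```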
mathwonk said:
You could just picture the map: from the matrix columns, it is obvious that it interchanges the standard unit vectors on the x and y axes. Hence it is a reflection of the plane in the line x = y, so it leaves invariant both that line and its orthocomplement, acting as the identity on the line x = y and as minus the identity on the line x = -y. But the algebraic method is more sure, if less geometric.

This is a good idea for transformations but it seems harder for other situations. Can you (or anyone else, of course) see some geometric interpretations for e.g. a correlation or covariance matrix?
 
Sorry, I don't mean to hijack the thread, it is just that I am curious about a related issue: the interpretation of eigenvalues in correlation/covariance matrices. These supposedly describe directions of maximal variability of the data, but I just cannot see it at this point. I thought since the OP seems satisfied with the answers given, it may make sense to extend the thread beyond the original scope.
 
WWGD said:
Sorry, I don't mean to hijack the thread, it is just that I am curious about a related issue: the interpretation of eigenvalues in correlation/covariance matrices. These supposedly describe directions of maximal variability of the data, but I just cannot see it at this point. I thought since the OP seems satisfied with the answers given, it may make sense to extend the thread beyond the original scope.

I'm not totally sure I understand your question, as there are a lot of possible interpretations here. In all cases I assume we're working with centered (read: zero mean, by column) data. I also assume we're operating over the reals.

If you have your data in a matrix ##\mathbf A##, and you have some arbitrary vector ##\mathbf x## with ##\big \vert \big \vert \mathbf x \big \vert \big \vert_2^{2} = 1##, then to maximize ##\big \vert \big \vert \mathbf{Ax} \big \vert \big \vert_2^{2}## you'd allocate entirely to ##(\lambda_1, \mathbf v_1)##, the largest eigenpair of ##\mathbf A^T \mathbf A## (a.k.a. the largest singular value, squared, of ##\mathbf A## and the associated right singular vector). This is a quadratic form interpretation of your question. Typically people prove this with a diagonalization argument or a Lagrange multiplier argument. I assume the eigenvalues of this symmetric positive (semi)definite covariance matrix are ordered, so ##\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_n \geq 0##. Here
##\mathbf A =
\bigg[\begin{array}{c|c|c|c}
\mathbf a_1 & \mathbf a_2 &\cdots & \mathbf a_{n}
\end{array}\bigg]
##
That is, ##\mathbf a_j## refers to the ##j##th feature column of ##\mathbf A##.

Using the interpretation of matrix-vector multiplication as a scaled sum over the columns of the matrix, we see that ##\mathbf {Ax} = x_1 \mathbf a_1 + x_2 \mathbf a_2 + \dots + x_n \mathbf a_n##.

Thus when someone asks you to do a constrained maximization of ##\big \vert \big \vert \mathbf{Ax} \big \vert \big \vert_2^{2}##, what they are saying is: come up with the linear combination of features from data matrix ##\mathbf A## that has maximal length, subject to the constraint ##x_1^2 + x_2^2 + \dots + x_n^2 = 1## (or some other constant > 0, but we use one for simplicity here). Since all features are zero mean (i.e. you centered your data), what you have done is extract the vector with the highest second moment / variance from your features, again subject to the constraint ##x_1^2 + x_2^2 + \dots + x_n^2 = 1##.
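As a concrete sketch of this maximization (my own illustration with synthetic random data, not from the original post): for a centered data matrix, no unit vector beats the top right singular vector of ##\mathbf A##, and the maximum of ##\big \vert \big \vert \mathbf{Ax} \big \vert \big \vert_2^{2}## equals ##\sigma_1^2 = \lambda_1##.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data with deliberately unequal spread per feature, then centered.
A = rng.normal(size=(200, 3)) @ np.diag([3.0, 1.0, 0.5])
A = A - A.mean(axis=0)

# v1, the top right singular vector, maximizes ||A x||^2 over unit vectors x;
# the maximum equals sigma_1^2, the top eigenvalue of A^T A.
_, s, Vt = np.linalg.svd(A, full_matrices=False)
v1 = Vt[0]
best = np.linalg.norm(A @ v1) ** 2
assert np.isclose(best, s[0] ** 2)

# No random unit vector does better.
for _ in range(1000):
    x = rng.normal(size=3)
    x /= np.linalg.norm(x)
    assert np.linalg.norm(A @ x) ** 2 <= best + 1e-9
```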

Here is another interpretation.

If you wanted a low-rank approximation of your matrix ##\mathbf A## -- say rank one -- and you were using ##\big \vert \big \vert \mathbf A \big \vert \big \vert_F^{2}## as your ruler (i.e. sum up the squared value of everything in ##\mathbf A##, which is a generalization of the L2 norm on vectors), you'd also allocate entirely to ##\lambda_1##, where ##\big \vert \big \vert \mathbf A \big \vert \big \vert_F^{2} = \operatorname{trace}\big(\mathbf A^T \mathbf A\big) = \lambda_1 + \lambda_2 + \dots + \lambda_n##. The associated eigenvectors are mutually orthonormal, so we have a clean partition, and each eigenvalue you allocate to increases the rank of your approximation by one; thus for a rank-2 approximation you'd allocate to ##\lambda_1## and ##\lambda_2##, and so forth.
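This partition can also be checked numerically (an illustrative sketch with arbitrary random data): truncating the SVD after the top singular value gives a rank-one approximation whose squared Frobenius error is exactly the sum of the discarded ##\lambda_i = \sigma_i^2##.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 4))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Best rank-1 approximation: keep only the top singular triple.
A1 = s[0] * np.outer(U[:, 0], Vt[0])

# The squared Frobenius error equals the sum of the discarded
# eigenvalues of A^T A, i.e. the remaining squared singular values.
err2 = np.linalg.norm(A - A1, 'fro') ** 2
assert np.isclose(err2, np.sum(s[1:] ** 2))

# And ||A||_F^2 = trace(A^T A) = sum of all the squared singular values.
assert np.isclose(np.linalg.norm(A, 'fro') ** 2, np.sum(s ** 2))
```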
 
