Understanding Markov Chains and Different Types of Approximation

Homework Help Overview

The discussion revolves around singular value decomposition (SVD) and its properties, and around Markov chains and their applications in probability. Participants examine the relationships between matrix products and their eigenvalues, as well as the distinctions between different approximation methods in statistics.

Discussion Character

  • Mixed

Approaches and Questions Raised

  • Participants clarify the components of the SVD, including the ordering of singular values and the relationship between the eigenvalues of different matrix products. Questions are also raised about the ordering of eigenvectors and the implications of matrix dimensions. In addition, the definitions of several approximation methods are compared, and a Markov chain is applied to predicting weather probabilities.

Discussion Status

The discussion includes various attempts to clarify concepts related to SVD and Markov chains. Some participants offer a geometric interpretation of the SVD and explain how its factors are arranged, while others are unsure of the difference between data fitting and global approximation. The probability calculations behind Markov chains are still being worked through, with participants questioning their understanding and seeking further clarification.

Contextual Notes

Participants note the complexity of transitioning from linear algebra concepts to statistical applications, indicating a potential gap in foundational knowledge. There is also a suggestion to create separate threads for distinct topics to facilitate more focused discussions.

Jamin2112

Homework Statement



Quick question, bros.

Homework Equations



Tell me if I'm right.

The Attempt at a Solution



[itex]A = U\Sigma V^T[/itex]

where [itex]\Sigma[/itex] contains on its diagonal the singular values of [itex]A^TA[/itex],
[itex]U[/itex] contains the corresponding eigenvectors for those singular values, and
[itex]V[/itex] contains the eigenvectors corresponding to the singular values of [itex]AA^T[/itex].
 
Close but not exactly right.

[itex]\Sigma[/itex] contains on its diagonal the singular values of A. These in turn are the square roots of the eigenvalues of [itex]A^TA[/itex]. You can take square roots because [itex]A^TA[/itex] is positive semidefinite, hence has nonnegative eigenvalues.

The columns of U are the eigenvectors of [itex]AA^T[/itex].

The columns of V are the eigenvectors of [itex]A^TA[/itex].

If A is complex, you have to use Hermitian transposes above.

In addition to what you wrote, U and V are orthogonal matrices. (Or unitary matrices, if A is complex.)

The singular value decomposition has a nice geometric interpretation if A is real and square. It says that A can be decomposed into the following steps: rotate by some angle, then stretch or shrink along the standard axes, then rotate by some other angle. (The "rotation" can also include a reflection if U or V has determinant -1 instead of 1.)
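As a quick numerical sanity check, here is a minimal NumPy sketch (using a small random matrix, purely for illustration) verifying these facts:

[code]
import numpy as np

# Illustrative example: a small random square matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

U, s, Vt = np.linalg.svd(A)

# The singular values are the square roots of the eigenvalues of A^T A.
eig_AtA = np.linalg.eigvalsh(A.T @ A)            # returned in ascending order
assert np.allclose(np.sort(s**2), eig_AtA)

# U and V are orthogonal.
assert np.allclose(U.T @ U, np.eye(3))
assert np.allclose(Vt @ Vt.T, np.eye(3))

# The factors reconstruct A.
assert np.allclose(U @ np.diag(s) @ Vt, A)
[/code]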
 
jbunniii said:
Close but not exactly right. [...]

Thanks!
 
jbunniii said:
The columns of U are the eigenvectors of [itex]AA^T[/itex].

How do I know which order to arrange them in?
 
Jamin2112 said:
How do I know which order to arrange them in?

The standard (canonical) arrangement is to sort the singular values from high to low, so the upper left corner of [itex]\Sigma[/itex] has the largest singular value, and the lower right corner (assuming a square matrix) has the smallest singular value.

Then arrange the rows of [itex]V^T[/itex] and the columns of U in the same order: the left column of U goes with the largest singular value, the right column with the smallest. The top row of [itex]V^T[/itex] goes with the largest singular value, the bottom row with the smallest.
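For what it's worth, NumPy's np.linalg.svd already returns this canonical ordering, which you can check with a small sketch like this (random matrix for illustration):

[code]
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
U, s, Vt = np.linalg.svd(A)

# Singular values come back sorted from high to low.
assert np.all(s[:-1] >= s[1:])

# Column k of U pairs with row k of V^T and singular value s[k]:
# A v_k = s_k u_k.
for k in range(4):
    assert np.allclose(A @ Vt[k], s[k] * U[:, k])
[/code]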
 
jbunniii said:
The standard (canonical) arrangement is to sort the singular values from high to low [...]

Will [itex]AA^T[/itex] and [itex]A^TA[/itex] have the same eigenvalues?
 
Jamin2112 said:
Will [itex]AA^T[/itex] and [itex]A^TA[/itex] have the same eigenvalues?

Yes. You can actually prove this using the SVD.

[tex]AA^T = (U \Sigma V^T)(U \Sigma V^T)^T = U \Sigma V^T V \Sigma^T U^T = U \Sigma \Sigma^T U^T = UDU^T[/tex]

where D is a diagonal matrix whose diagonal elements are the squares of the diagonal elements of [itex]\Sigma[/itex]. Similarly, you can show that

[tex]A^TA = V D V^T[/tex]

These two equations tell you that the diagonal elements of D are the eigenvalues of both [itex]AA^T[/itex] and [itex]A^T A[/itex], and that the eigenvectors of these matrices are the columns of U and V, respectively.
 
P.S. The above assumes that A is square. If A is not square, then [itex]AA^T[/itex] and [itex]A^TA[/itex] have the same nonzero eigenvalues, but they can have different multiplicities of 0 as an eigenvalue.
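Here is a minimal NumPy check of that last point, with a deliberately non-square matrix (made up for illustration):

[code]
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 5))        # non-square on purpose

ev_AAt = np.linalg.eigvalsh(A @ A.T)   # 3 eigenvalues, ascending
ev_AtA = np.linalg.eigvalsh(A.T @ A)   # 5 eigenvalues, ascending

# The nonzero eigenvalues agree; A^T A just has two extra zeros.
assert np.allclose(ev_AAt, ev_AtA[2:])
assert np.allclose(ev_AtA[:2], 0.0, atol=1e-10)
[/code]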
 
Thanks for the info, brah. I have a final tomorrow. Maybe you can answer a few more questions for me, since you seem to like this stuff.

(1) My professor says, "We study three kinds of approximation: Interpolation, Data fitting, and Global approximation." The definitions of data fitting and global approximation he then gives are eerily similar; in fact, I cannot tell the difference. Perhaps you could explain that.

(2) Markov chains! I'm trying to use my basic knowledge of statistics to understand these. We have a problem that says the probability of rain or sun tomorrow depends on the weather today: P(R→R) = .2, P(R→S) = .8, P(S→R) = .5, P(S→S) = .5. The Markov chain looks like

[tex]\begin{bmatrix} .2 & .5 \\ .8 & .5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} .2x_1 + .5x_2 \\ .8x_1 + .5x_2 \end{bmatrix}[/tex]

I'm trying to get a grasp of this. It says the probability of rain tomorrow is .2x1 + .5x2. I think of it as P(rain tomorrow) = P(rain tomorrow | rain today) + P(rain tomorrow | sun today) ... Am I right?

Oh, wait ... is it P(rain tomorrow) = P(rain tomorrow & rain today) + P(rain tomorrow & sun today) = P(rain tomorrow | sun today) * P(sun today) + P(rain tomorrow | rain today) * P(rain today) = .5x2 + .2x1?
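In matrix form, that computation is just one step of the chain. A quick NumPy check (the starting distribution x1 = .3, x2 = .7 is made up, purely for illustration):

[code]
import numpy as np

# Column j holds the transition probabilities out of state j
# (column 1 = rain today, column 2 = sun today).
P = np.array([[0.2, 0.5],    # row 1: rain tomorrow
              [0.8, 0.5]])   # row 2: sun tomorrow

x = np.array([0.3, 0.7])     # hypothetical: P(rain today), P(sun today)

tomorrow = P @ x
print(tomorrow)              # [0.41 0.59]: 0.41 = .2*.3 + .5*.7, as above
[/code]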
 
Jamin2112 said:
Thanks for the info, brah. I have a final tomorrow. Maybe you can answer a few more questions for me [...]

Offhand I'm not sure about these; we have now jumped from linear algebra/matrix theory to some sort of statistics. I studied Markov chains, but it was many moons ago and I don't remember most of the details anymore. You should start a new thread for each question, with an appropriate title, so others with expertise in this area will have a better chance of seeing the questions.
 
