Can't understand a step in an LU decomposition proof

Shirish · Jan 30, 2019

I'm reading about the LU decomposition on this page and cannot understand one of the final steps in the proof to the following:
----------------
Let ##A## be a ##K\times K## matrix. Then, there exists a permutation matrix ##P## such that ##PA## has an LU decomposition: $$PA=LU$$ where ##L## is a lower triangular ##K\times K## matrix and ##U## is an upper triangular ##K\times K## matrix.
----------------
I'll reproduce the proof here:
----------------
Through Gaussian elimination, ##A## can be reduced to a row-echelon (hence upper triangular) matrix ##U## via a series of ##n## elementary operations: $$E_n\bullet\ldots\bullet E_1\bullet A = U$$
Any elementary matrix ##E_i## used in Gaussian elimination is either a permutation matrix ##P_i## or a matrix ##L_i## used to add a multiple of one row to a row below it. Thus, ##L_i## will be lower triangular with non-zero diagonal ##\implies## invertible. Suppose that the first permutation matrix we encounter is in position ##j##, so that we have:
$$E_n\bullet\ldots\bullet E_{j+1}\bullet P_j\bullet L_{j-1}\bullet\ldots\bullet L_1\bullet A = U$$
Since a permutation matrix is orthogonal, ##P_j^T P_j = I## and hence
$$E_n\bullet\ldots\bullet E_{j+1}\bullet P_j\bullet L_{j-1}\bullet P_j^TP_j \bullet\ldots\bullet P_j^TP_j\bullet L_1\bullet P_j^TP_j\bullet A = U$$ or
$$E_n\bullet\ldots\bullet E_{j+1}\bullet\tilde{L}_{j-1}\bullet\ldots\bullet\tilde{L}_1\bullet P_j\bullet A = U$$
for ##i=1,\ldots,j-1##. A matrix ##L_i##, used to add ##\alpha## times the ##i##-th row to the ##s##-th row (in this case ##s>i##), can be written as a rank one update to the identity matrix: ##L_i=I+M##, where ##M## is a matrix whose entries are all zeros except ##M_{si} = \alpha##. We have that
$$\tilde{L}_i = P_jL_iP_j^T = P_j(I+M)P_j^T = P_jP_j^T + P_jMP_j^T = I+P_jMP_j^T$$
--------------------
So far so good. Then we have:
--------------------
The permutations performed on the rows and columns of ##M## by ##P_j## and ##P_j^T## can move the only non-zero element of ##M##, but that element remains below the main diagonal (because ##j>i##).
--------------------
I can't understand this. ##M## is derived from ##L_i##, which means that ##M_{si} = L_i - I = \alpha \neq 0## and all other elements of ##M## are ##0##. Nothing has been said about ##P_j##. I'm guessing ##P_j## swaps ##j##-th row with ##r##-th row, where ##r>j##. What do the indices ##s,i## have anything to do with the indices ##r,j##?

I'd be grateful if anyone could clear this up!

StoneTemplePython · Jan 30, 2019

Shirish said:

So far so good. Then we have:
--------------------
The permutations performed on the rows and columns of ##M## by ##P_j## and ##P_j^T## can move the only non-zero element of ##M##, but that element remains below the main diagonal (because ##j>i##).
--------------------
I can't understand this.

Please give more details on ##P_j##. What does it look like / what exactly does it do? I'm guessing it swaps row ##j## with ##j -1## (a basic transposition going backward). Put differently what if we have

##A=
\begin{bmatrix}
\tilde{ \mathbf a_1}^T \\
\tilde{ \mathbf a_2}^T \\
\vdots\\
\tilde{ \mathbf a}_{j-1}^T \\
\tilde{ \mathbf a}_{j}^T \\
\tilde{ \mathbf a}_{j+1}^T \\
\vdots\\
\tilde{ \mathbf a}_{n-1}^T \\
\tilde{ \mathbf a}_n^T
\end{bmatrix}##

what does ##P_j \cdot A## look like? Maybe you said it and I missed it, but I didn't see that in here and it is key.
- - - -
A better way to write out what is going on is as follows, using standard basis vectors:

##M = \alpha \mathbf e_s \mathbf e_i^T##
so
##P_j MP_j^T ##
##= \alpha P_j\mathbf e_s \mathbf e_i^TP_j ^T ##
##= \alpha \big(P_j \mathbf e_s\big)\big( P_j \mathbf e_i\big)^T ##

This is the point where I need you to fill in some details -- in particular what exactly does ##P_j## do? Can you write out the possible outcomes of ##P_j \mathbf e_s## and ##P_j \mathbf e_i##?

- - - -

Shirish said:

##M## is derived from ##L_i##, which means that ##M_{si} = L_i - I = \alpha \neq 0##

This is a bit troubling. You defined ##\alpha ## to be a scalar earlier on and somehow ##M_{si}## is a scalar (more common notation would be ##m_{si}##) but ##L_i## and ##I## are both matrices.

Can't understand a step in an LU decomposition proof

1. What is LU decomposition?

2. Why is LU decomposition important?

3. How does LU decomposition work?

4. What is the purpose of LU decomposition proof?

5. What can I do if I can't understand a step in the LU decomposition proof?

Similar threads

Hot Threads

Recent Insights