I'm reading about the LU decomposition on this page and cannot understand one of the final steps in the proof of the following:

----------------

Let ##A## be a ##K\times K## matrix. Then, there exists a permutation matrix ##P## such that ##PA## has an LU decomposition: $$PA=LU$$ where ##L## is a lower triangular ##K\times K## matrix and ##U## is an upper triangular ##K\times K## matrix.

----------------
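Before I get to my question: the statement itself is easy to check numerically. Here is a small sketch I wrote myself (not from the page) that runs Gaussian elimination with partial pivoting on an example matrix and verifies ##PA = LU##:

```python
import numpy as np

def plu(A):
    """LU with partial pivoting: returns P, L, U such that P @ A = L @ U."""
    A = A.astype(float).copy()
    K = A.shape[0]
    P = np.eye(K)
    L = np.eye(K)
    for j in range(K - 1):
        # pivot: swap row j with the row holding the largest entry in column j
        r = j + np.argmax(np.abs(A[j:, j]))
        if r != j:
            A[[j, r], :] = A[[r, j], :]
            P[[j, r], :] = P[[r, j], :]
            L[[j, r], :j] = L[[r, j], :j]  # swap the already-built multipliers too
        # eliminate the entries below the pivot
        for s in range(j + 1, K):
            alpha = A[s, j] / A[j, j]
            A[s, :] -= alpha * A[j, :]
            L[s, j] = alpha
    return P, L, A  # A has been reduced to U

# A needs a row swap up front, since A[0, 0] = 0
A = np.array([[0., 2., 1.],
              [1., 1., 1.],
              [2., 0., 3.]])
P, L, U = plu(A)
assert np.allclose(P @ A, L @ U)
```

(The pivoting strategy and the example matrix are my own choices, just to exercise the permutation case.)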

I'll reproduce the proof here:

----------------

Through Gaussian elimination, ##A## can be reduced to a row-echelon (hence upper triangular) matrix ##U## via a series of ##n## elementary operations: $$E_n\bullet\ldots\bullet E_1\bullet A = U$$

Any elementary matrix ##E_i## used in Gaussian elimination is either a permutation matrix ##P_i## or a matrix ##L_i## used to add a multiple of one row to a row **below** it. Thus, ##L_i## will be lower triangular with non-zero diagonal ##\implies## invertible. Suppose that the first permutation matrix we encounter is in position ##j##, so that we have: $$E_n\bullet\ldots\bullet E_{j+1}\bullet P_j\bullet L_{j-1}\bullet\ldots\bullet L_1\bullet A = U$$

Since a permutation matrix is orthogonal, ##P_j^T P_j = I## and hence

$$E_n\bullet\ldots\bullet E_{j+1}\bullet P_j\bullet L_{j-1}\bullet P_j^TP_j \bullet\ldots\bullet P_j^TP_j\bullet L_1\bullet P_j^TP_j\bullet A = U$$ or

$$E_n\bullet\ldots\bullet E_{j+1}\bullet\tilde{L}_{j-1}\bullet\ldots\bullet\tilde{L}_1\bullet P_j\bullet A = U$$

where ##\tilde{L}_i = P_jL_iP_j^T## for ##i=1,\ldots,j-1##. A matrix ##L_i##, used to add ##\alpha## times the ##i##-th row to the ##s##-th row (in this case ##s>i##), can be written as a rank-one update to the identity matrix: ##L_i=I+M##, where ##M## is a matrix whose entries are all zeros except ##M_{si} = \alpha##. We then have

$$\tilde{L}_i = P_jL_iP_j^T = P_j(I+M)P_j^T = P_jP_j^T + P_jMP_j^T = I+P_jMP_j^T$$

--------------------
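As a sanity check, I verified this last identity numerically (the size and indices below are arbitrary picks of mine, not from the text):

```python
import numpy as np

K = 4
alpha = 5.0

# L_1 adds alpha * (row 1) to (row 3): L_1 = I + M with M_{31} = alpha
# (0-based: M[2, 0])
M = np.zeros((K, K))
M[2, 0] = alpha
L1 = np.eye(K) + M

# P_2 swaps rows 2 and 4 (0-based: rows 1 and 3)
P2 = np.eye(K)[[0, 3, 2, 1], :]

lhs = P2 @ L1 @ P2.T                      # P_j L_i P_j^T
rhs = np.eye(K) + P2 @ M @ P2.T           # I + P_j M P_j^T
assert np.allclose(lhs, rhs)              # the identity from the proof
assert np.allclose(P2.T @ P2, np.eye(K))  # P_j is orthogonal
```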

So far so good. Then we have:

--------------------

**The permutations performed on the rows and columns of ##M## by ##P_j## and ##P_j^T## can move the only non-zero element of ##M##, but that element remains below the main diagonal (because ##j>i##).**

--------------------

I can't understand this. ##M## is derived from ##L_i##, i.e. ##M = L_i - I##, so ##M_{si} = \alpha \neq 0## and all other elements of ##M## are ##0##. Nothing has been said about ##P_j##. I'm guessing ##P_j## swaps the ##j##-th row with the ##r##-th row, where ##r>j##. What do the indices ##s,i## have to do with the indices ##r,j##?
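To make my question concrete, here is the experiment I tried (the indices ##i=1##, ##s=2##, ##j=2##, ##r=4## are my own guesses at a possible configuration, which may not match what the text intends):

```python
import numpy as np

K = 4

# L_1 adds 5 times row 1 to row 2, so M has its one non-zero at (2, 1)
# (0-based: M[1, 0])
M = np.zeros((K, K))
M[1, 0] = 5.0

# my guess: P_j swaps row j = 2 with row r = 4 (0-based: rows 1 and 3)
Pj = np.eye(K)[[0, 3, 2, 1], :]

Mt = Pj @ M @ Pj.T
print(np.argwhere(Mt != 0))  # → [[3 0]], i.e. the entry moved from (2,1)
                             #   to (4,1) in 1-based terms
```

In this one case the entry does stay below the diagonal, but I don't see how the argument works in general.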

I'd be grateful if anyone could clear this up!
