# I can't understand a step in an LU decomposition proof

#### Shirish

I'm reading about the LU decomposition on this page and cannot understand one of the final steps in the proof of the following:
----------------
Let $A$ be a $K\times K$ matrix. Then, there exists a permutation matrix $P$ such that $PA$ has an LU decomposition: $$PA=LU$$ where $L$ is a lower triangular $K\times K$ matrix and $U$ is an upper triangular $K\times K$ matrix.
----------------
I'll reproduce the proof here:
----------------
Through Gaussian elimination, $A$ can be reduced to a row-echelon (hence upper triangular) matrix $U$ via a series of $n$ elementary operations: $$E_n\bullet\ldots\bullet E_1\bullet A = U$$
Any elementary matrix $E_i$ used in Gaussian elimination is either a permutation matrix $P_i$ or a matrix $L_i$ that adds a multiple of one row to a row below it. Thus each $L_i$ is lower triangular with a non-zero diagonal, and hence invertible. Suppose that the first permutation matrix we encounter is in position $j$, so that we have:
$$E_n\bullet\ldots\bullet E_{j+1}\bullet P_j\bullet L_{j-1}\bullet\ldots\bullet L_1\bullet A = U$$
Since a permutation matrix is orthogonal, $P_j^T P_j = I$ and hence
$$E_n\bullet\ldots\bullet E_{j+1}\bullet P_j\bullet L_{j-1}\bullet P_j^TP_j \bullet\ldots\bullet P_j^TP_j\bullet L_1\bullet P_j^TP_j\bullet A = U$$ or
$$E_n\bullet\ldots\bullet E_{j+1}\bullet\tilde{L}_{j-1}\bullet\ldots\bullet\tilde{L}_1\bullet P_j\bullet A = U$$
where $\tilde{L}_i = P_j L_i P_j^T$ for $i=1,\ldots,j-1$. A matrix $L_i$, used to add $\alpha$ times the $i$-th row to the $s$-th row (in this case $s>i$), can be written as a rank-one update to the identity matrix: $L_i=I+M$, where $M$ is a matrix whose entries are all zeros except $M_{si} = \alpha$. We have that
$$\tilde{L}_i = P_jL_iP_j^T = P_j(I+M)P_j^T = P_jP_j^T + P_jMP_j^T = I+P_jMP_j^T$$
--------------------
So far so good. Then we have:
--------------------
The permutations performed on the rows and columns of $M$ by $P_j$ and $P_j^T$ can move the only non-zero element of $M$, but that element remains below the main diagonal (because $j>i$).
--------------------
I can't understand this. $M$ is derived from $L_i$, which means that $M_{si} = L_i - I = \alpha \neq 0$ and all other elements of $M$ are $0$. Nothing has been said about $P_j$. I'm guessing $P_j$ swaps the $j$-th row with the $r$-th row, where $r>j$. What do the indices $s,i$ have to do with the indices $r,j$?

I'd be grateful if anyone could clear this up!
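For reference, here is a small numpy sketch of the objects involved, with illustrative indices I picked myself (they are not from the page — $K=4$, $i=1$, $s=3$, and $P_j$ taken to swap rows 3 and 4):

```python
import numpy as np

K = 4
alpha = 5.0

# M has a single non-zero entry at (s, i) = (3, 1), i.e. 0-based (2, 0)
M = np.zeros((K, K))
M[2, 0] = alpha
L_i = np.eye(K) + M          # elementary matrix: add alpha * row 1 to row 3

# One possible P_j: swap rows 3 and 4 (0-based rows 2 and 3)
P_j = np.eye(K)[[0, 1, 3, 2]]

M_tilde = P_j @ M @ P_j.T
print(M_tilde)
```

In this particular example the single non-zero entry moves from position $(3,1)$ to $(4,1)$, so $I + \tilde{M}$ is still lower triangular — but I don't see why this must hold in general.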


#### StoneTemplePython

So far so good. Then we have:
--------------------
The permutations performed on the rows and columns of $M$ by $P_j$ and $P_j^T$ can move the only non-zero element of $M$, but that element remains below the main diagonal (because $j>i$).
--------------------
I can't understand this.
Please give more details on $P_j$. What does it look like / what exactly does it do? I'm guessing it swaps row $j$ with row $j-1$ (a basic transposition going backward). Put differently, what if we have

$A= \begin{bmatrix} \tilde{\mathbf a}_1^T \\ \tilde{\mathbf a}_2^T \\ \vdots \\ \tilde{\mathbf a}_{j-1}^T \\ \tilde{\mathbf a}_j^T \\ \tilde{\mathbf a}_{j+1}^T \\ \vdots \\ \tilde{\mathbf a}_{n-1}^T \\ \tilde{\mathbf a}_n^T \end{bmatrix}$

what does $P_j \cdot A$ look like? Maybe you said it and I missed it, but I didn't see that in here and it is key.
- - - -
A better way to write out what is going on is as follows, using standard basis vectors:

$M = \alpha \mathbf e_s \mathbf e_i^T$
so
$P_j MP_j^T$
$= \alpha P_j\mathbf e_s \mathbf e_i^TP_j ^T$
$= \alpha \big(P_j \mathbf e_s\big)\big( P_j \mathbf e_i\big)^T$

This is the point where I need you to fill in some details -- in particular what exactly does $P_j$ do? Can you write out the possible outcomes of $P_j \mathbf e_s$ and $P_j \mathbf e_i$?
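If it helps to experiment, here is a tiny numpy sketch with one possible choice of $P_j$ — a single transposition I picked for illustration (your $P_j$ may differ), with $K=4$, $i=1$, $s=3$:

```python
import numpy as np

K = 4
e = np.eye(K)                  # e[:, k] is the standard basis vector e_{k+1}
P_j = np.eye(K)[[0, 1, 3, 2]]  # transposition swapping rows 3 and 4

e_i = e[:, 0]  # i = 1
e_s = e[:, 2]  # s = 3

print(P_j @ e_s)  # e_s gets moved to position 4: [0. 0. 0. 1.]
print(P_j @ e_i)  # e_i is untouched, since the permutation acts on rows >= j > i
```

Working out which basis vectors $P_j \mathbf e_s$ and $P_j \mathbf e_i$ can possibly be is exactly the step the quoted sentence is compressing.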

- - - -
$M$ is derived from $L_i$, which means that $M_{si} = L_i - I = \alpha \neq 0$
This is a bit troubling. You defined $\alpha$ to be a scalar earlier on, and $M_{si}$ is indeed a scalar (more common notation would be $m_{si}$), but $L_i$ and $I$ are both matrices, so $M_{si} = L_i - I$ doesn't parse. Presumably you mean $M_{si} = (L_i - I)_{si} = \alpha$.
