Understanding a proof of Cholesky's factorization

fluidistic · Jun 30, 2011

Homework Statement

I must understand the following proof.
Let [itex]A \in \mathbb{R}^{n \times n}[/itex] be a symmetric and positive definite matrix.
Thus there exist a unique factorization of [itex]A[/itex] such that [itex]A=LL^t[/itex] where [itex]L[/itex] is a lower triangular matrix whose diagonal is positive ([itex]l_{ii}>0[/itex])
Demonstration:
[itex]A=A^t[/itex] and [itex]x ^t A x>0[/itex] for [itex]x \neq 0[/itex]. This is the hypothesis on [itex]A[/itex].
It follows that [itex]A[/itex] is invertible, so [itex]A^{-1}[/itex] exist.
Furthermore, considering vectors of the type [itex]x=(x_1, x_2, ..., x_k, 0, ... , 0)[/itex] one can see that the minors of [itex]A[/itex] are also definite positive.
It follows from the [itex]LU[/itex] decomposition theorem that [itex]A[/itex] admits a [itex]LU[/itex] decomposition. In other words I can write [itex]A=LU \Rightarrow A^t=U^t L^t \Rightarrow U^t L^t (L^t)^{-1}=U^t[/itex] and [itex]U(L^t)^{-1}=L^{-1}U^t[/itex].
Where [itex]U(L^t)^{-1}[/itex] is an upper triangular matrix and [itex]L^{-1}U^t[/itex] is a lower triangular matrix. Thus there exist a diagonal matrix [itex]D[/itex] such that [itex]U(L^t)^{-1}=D[/itex].
From this, we have that [itex]U=DL^t[/itex]. Since [itex]A=LU[/itex], [itex]A=LDL^t[/itex]. Therefore [itex]D[/itex] is definite positive so its diagonal elements are positive.
So I can write [itex]A= L'L'^t[/itex] with [itex]L'=L \sqrt D[/itex].
Now in order to complete the proof, I must show that [itex]L'[/itex] is definite positive (I think it will imply that its diagonal entries are all positive). This is where I'm stuck.
I couldn't even show that [itex]\sqrt D[/itex] is positive definite...

Uniqueness proof:
Suppose that there exist [itex]L_1[/itex] such that [itex]A=LL^t=L_1L_1^t[/itex]. Let's show that [itex]L=L_1[/itex].
1)Multiplying by [itex]L^{-1}[/itex] we get [itex]L^t= L^{-1}L_1L_1^t[/itex].
2)Multiplying by [itex](L_1)^{-1}[/itex] we get that [itex]L^t (L_1^t)^{-1}=L_1 ^{-1}L_1L_1^t (L_1^t)^{-1}=L^{-1}L_1=D*[/itex] where [itex]D*[/itex] is diagonal with positive entries.
It follows that:
3)[itex]L_1=LD*[/itex]
4)[itex]L^t =D* L_1 ^t[/itex] I don't understand how the proof reaches this
From 3) and 4), we get (I don't understand the following implication) [itex]D*(LD*)^t=D*D*^t L^t=(D*)^2L^t \Leftrightarrow D*=I \Rightarrow L=L_1[/itex] which complete the proof.

Any help is greatly appreciated.
P.S.:D* is a matrix. * isn't a multiplying sign.

micromass · Jun 30, 2011

Hi fluidistic!

fluidistic said:

Homework Statement

I must understand the following proof.
Let [itex]A \in \mathbb{R}^{n \times n}[/itex] be a symmetric and positive definite matrix.
Thus there exist a unique factorization of [itex]A[/itex] such that [itex]A=LL^t[/itex] where [itex]L[/itex] is a lower triangular matrix whose diagonal is positive ([itex]l_{ii}>0[/itex])
Demonstration:
[itex]A=A^t[/itex] and [itex]x ^t A x>0[/itex] for [itex]x \neq 0[/itex]. This is the hypothesis on [itex]A[/itex].
It follows that [itex]A[/itex] is invertible, so [itex]A^{-1}[/itex] exist.
Furthermore, considering vectors of the type [itex]x=(x_1, x_2, ..., x_k, 0, ... , 0)[/itex] one can see that the minors of [itex]A[/itex] are also definite positive.
It follows from the [itex]LU[/itex] decomposition theorem that [itex]A[/itex] admits a [itex]LU[/itex] decomposition. In other words I can write [itex]A=LU \Rightarrow A^t=U^t L^t \Rightarrow U^t L^t (L^t)^{-1}=U^t[/itex] and [itex]U(L^t)^{-1}=L^{-1}U^t[/itex].
Where [itex]U(L^t)^{-1}[/itex] is an upper triangular matrix and [itex]L^{-1}U^t[/itex] is a lower triangular matrix. Thus there exist a diagonal matrix [itex]D[/itex] such that [itex]U(L^t)^{-1}=D[/itex].
From this, we have that [itex]U=DL^t[/itex]. Since [itex]A=LU[/itex], [itex]A=LDL^t[/itex]. Therefore [itex]D[/itex] is definite positive so its diagonal elements are positive.
So I can write [itex]A= L'L'^t[/itex] with [itex]L'=L \sqrt D[/itex].
Now in order to complete the proof, I must show that [itex]L'[/itex] is definite positive (I think it will imply that its diagonal entries are all positive). This is where I'm stuck.
I couldn't even show that [itex]\sqrt D[/itex] is positive definite...

You know that [itex]\sqrt{D}[/itex] is a diagonal matrix with positive entries. What happens if you actually perform the multiplication:

[tex]\left(\begin{array}{cccc} x_1 & x_2 & ... & x_n\\ \end{array}\right) =
\left(\begin{array}{cccc} \sqrt{d_1} & 0 & ... & 0\\ 0 & \sqrt{d_2} & ... & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & ... & \sqrt{d_n}\\ \end{array}\right)
\left(\begin{array}{c} x_1\\ x_2\\ ...\\ x_n\\ \end{array}\right)[/tex]

Do we get something >0?

Uniqueness proof:
Suppose that there exist [itex]L_1[/itex] such that [itex]A=LL^t=L_1L_1^t[/itex]. Let's show that [itex]L=L_1[/itex].
1)Multiplying by [itex]L^{-1}[/itex] we get [itex]L^t= L^{-1}L_1L_1^t[/itex].
2)Multiplying by [itex](L_1)^{-1}[/itex] we get that [itex]L^t (L_1^t)^{-1}=L_1 ^{-1}L_1L_1^t (L_1^t)^{-1}=L^{-1}L_1=D*[/itex] where [itex]D*[/itex] is diagonal with positive entries.
It follows that:
3)[itex]L_1=LD*[/itex]
4)[itex]L^t =D* L_1 ^t[/itex] I don't understand how the proof reaches this

Remember the equality from in (2):

[tex]]L^t (L_1^t)^{-1}=L_1 ^{-1}L_1L_1^t (L_1^t)^{-1}=L^{-1}L_1=D*[/tex]

The left and right sides say

[tex]]L^t (L_1^t)^{-1}=D*[/tex]

Thus

[tex]L^t=D*L_1^t[/tex]

From 3) and 4), we get (I don't understand the following implication) [itex]D*(LD*)^t=D*D*^t L^t=(D*)^2L^t \Leftrightarrow D*=I \Rightarrow L=L_1[/itex] which complete the proof.

Well, we know that

[tex]D*(LD*)^t=(D*)^2L^t[/tex]

but from 3 and 4, we also get

[tex]D*(LD*)^t=D*L_1^t=L^t[/tex]

Thus we get

[tex]L^t=(D*)^2L^t[/tex]

eliminate both sides gives us

[tex](D*)^2=I[/tex]

and because the entries of D* are positive, we get

[tex]D*=I[/tex]

Does that help?

fluidistic · Jun 30, 2011

Hi micromass :D

micromass said:

Hi fluidistic!
You know that [itex]\sqrt{D}[/itex] is a diagonal matrix with positive entries.

Hmm I understand that [itex]\sqrt a_i \cdot \sqrt a_i >0[/itex] for [itex]a_i \neq 0[/itex] but [itex](-\sqrt a_i)\cdot (-\sqrt a_i)>0[/itex] too. So why would [itex]\sqrt D[/itex] have positive entries? I mean, [itex]D[/itex] could still have positive entries even though [itex]\sqrt D[/itex] doesn't. Or I'm wrong?

What happens if you actually perform the multiplication:

[tex]\left(\begin{array}{cccc} x_1 & x_2 & ... & x_n\\ \end{array}\right) =
\left(\begin{array}{cccc} \sqrt{d_1} & 0 & ... & 0\\ 0 & \sqrt{d_2} & ... & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & ... & \sqrt{d_n}\\ \end{array}\right)
\left(\begin{array}{c} x_1\\ x_2\\ ...\\ x_n\\ \end{array}\right)[/tex]

Do we get something >0?

Yes :) assuming of course that [itex]\sqrt D[/itex] has positive entries, something I'm not yet convinced of.

Remember the equality from in (2):

[tex]]L^t (L_1^t)^{-1}=L_1 ^{-1}L_1L_1^t (L_1^t)^{-1}=L^{-1}L_1=D*[/tex]

The left and right sides say

[tex]]L^t (L_1^t)^{-1}=D*[/tex]

Thus

[tex]L^t=D*L_1^t[/tex]

Ah right! Thanks a lot.

Well, we know that

[tex]D*(LD*)^t=(D*)^2L^t[/tex]

Right.

but from 3 and 4, we also get

[tex]D*(LD*)^t=D*L_1^t=L^t[/tex]

Hmm I don't get this. What did you do to 3) and 4) to reach this?

Thus we get

[tex]L^t=(D*)^2L^t[/tex]

eliminate both sides gives us

[tex](D*)^2=I[/tex]

and because the entries of D* are positive, we get

[tex]D*=I[/tex]

Does that help?

I follow you on all this. I didn't realize that [itex]D*^t=D*[/itex]. Now I do.

micromass · Jun 30, 2011

fluidistic said:

Hi micromass :D
Hmm I understand that [itex]\sqrt a_i \cdot \sqrt a_i >0[/itex] for [itex]a_i \neq 0[/itex] but [itex](-\sqrt a_i)\cdot (-\sqrt a_i)>0[/itex] too. So why would [itex]\sqrt D[/itex] have positive entries? I mean, [itex]D[/itex] could still have positive entries even though [itex]\sqrt D[/itex] doesn't. Or I'm wrong?

Yes :) assuming of course that [itex]\sqrt D[/itex] has positive entries, something I'm not yet convinced of.

I see your problem. The thing is that the square root of a positive definite D matrix is defined as the unique positive definite matrix such that it's square is D.
So [itex]\sqrt{D}[/itex] is actually positive definite by definition.

The thing is that we can choose many different square roots of D, for every diagonal entry we can choose [itex]\sqrt{d_i}[/itex] or [itex]-\sqrt{d_i}[/itex], but by definiton, we take the positive root.

Hmm I don't get this. What did you do to 3) and 4) to reach this?

Well, from step (3) we know that [itex]L_1=LD*[/itex]. So exchange [itex]LD*[/itex] by [itex]L_1[/itex] in the equation [itex]D*(LD*)^t[/itex]. This gives [itex]D*L_1^t[/itex]. And (4) excactly says that this is [itex]L^t[/itex].

fluidistic · Jun 30, 2011

Thanks once again for the help!

micromass said:

I see your problem. The thing is that the square root of a positive definite D matrix is defined as the unique positive definite matrix such that it's square is D.
So [itex]\sqrt{D}[/itex] is actually positive definite by definition.

The thing is that we can choose many different square roots of D, for every diagonal entry we can choose [itex]\sqrt{d_i}[/itex] or [itex]-\sqrt{d_i}[/itex], but by definiton, we take the positive root.

I see. I blame myself for not knowing well the definitions!

Well, from step (3) we know that [itex]L_1=LD*[/itex]. So exchange [itex]LD*[/itex] by [itex]L_1[/itex]

I follow you until here

in the equation [itex]D*(LD*)^t[/itex]

Sorry to ask again, but in which equation? I don't find it.

This gives [itex]D*L_1^t[/itex]. And (4) excactly says that this is [itex]L^t[/itex].

Ok I understand this.

micromass · Jul 1, 2011

fluidistic said:

Sorry to ask again, but in which equation?

In the equation

[tex]D*(LD*)^t=(D*)^2L^t[/tex]

that you made in the beginning of (4)...

fluidistic · Jul 1, 2011

micromass said:

In the equation

[tex]D*(LD*)^t=(D*)^2L^t[/tex]

that you made in the beginning of (4)...

Thanks for all your help! Yeah I had realized this when I last studied this this morning before running to university and take the final exam (woke up at 7:39 am and I'm back from there, it's 7:15 pm. 4 hours exam in the morning+2.5 hours of Fortran exam in the afternoon. Then waiting for my grade :/). I had no time to reply here but I understood the proof, thanks to you I must say :D

Understanding a proof of Cholesky's factorization

Homework Statement

Homework Statement

1. What is Cholesky's factorization?

2. Why is Cholesky's factorization important?

3. How does Cholesky's factorization work?

4. What are the benefits of using Cholesky's factorization?

5. Are there any limitations to Cholesky's factorization?

Similar threads

Hot Threads

Recent Insights