Understanding Invertible Matrices and Homogeneous Systems

In summary: part 1 states that ##A## is invertible; part 2, that the homogeneous system ##A\mathbf X = 0## has only the trivial solution; part 3, that the system of equations ##A\mathbf X = \mathbf Y## has a solution for each ##n\times 1## matrix ##\mathbf Y##. If ##A## is invertible, the solution of ##A\mathbf X = \mathbf Y## is ##\mathbf X = A^{-1}\mathbf Y##. Conversely, if ##A\mathbf X = \mathbf Y## has a solution for each given ##\mathbf Y##, then the row reduced echelon matrix ##R## that is row equivalent to ##A## must be the identity.
  • #1
Buffu
For an ##n\times n## matrix ##A##, the following are equivalent.

1) ##A## is invertible.
2) The homogeneous system ##A\bf X = 0## has only the trivial solution ##\mathbf X = 0##.
3) The system of equations ##A\bf X = \bf Y## has a solution for each ##n\times 1 ## matrix ##\bf Y##.
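
As a quick numerical sanity check (an illustration of the claim, not part of the proof), all three conditions can be verified together for a concrete invertible matrix; the matrix below is an arbitrary choice.

```python
import numpy as np

# An invertible 2x2 example, chosen for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
n = A.shape[0]

# 1) A is invertible: np.linalg.inv would raise LinAlgError if A were singular.
A_inv = np.linalg.inv(A)

# 2) A X = 0 has only the trivial solution: A has full rank,
#    so its null space contains only the zero vector.
assert np.linalg.matrix_rank(A) == n

# 3) A X = Y is solvable for every Y; we spot-check one arbitrary Y.
Y = np.array([3.0, 5.0])
X = np.linalg.solve(A, Y)          # the unique solution X = A^{-1} Y
assert np.allclose(A @ X, Y)
assert np.allclose(X, A_inv @ Y)
```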

I have a problem with the third part of the question, so I skip the proof of the equivalence of the first two parts, as it is trivial.

If ##A## is invertible, the solution of ##A \bf X = \bf Y## is ##\mathbf X = A^{-1} \mathbf Y##. Conversely, suppose ##A\bf X = \bf Y## has a solution for each given ##\bf Y##. Let ##R## be a row reduced echelon matrix which is row equivalent to ##A##. We wish to show that ##R = I##. That amounts to showing that the last row of ##R## is not ##0##. Let ##E = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}##

If the system ##R \bf X = \bf E## can be solved for ##X##, the last row of ##R## cannot be ##0##. We know that ##R = PA## where ##P## is invertible. Thus ##R\bf X = \bf E## if and only if ##A \mathbf X = P^{-1}\mathbf E##. According to (3) the latter system has a solution.
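
The objects in this argument can be computed explicitly. Here is a small sympy sketch (an illustration, assuming an invertible ##A## of our own choosing): row-reducing the augmented matrix ##[A \mid I]## yields ##R## on the left and an invertible ##P## with ##R = PA## on the right, and solving ##A\mathbf X = P^{-1}\mathbf E## indeed solves ##R\mathbf X = \mathbf E##.

```python
import sympy as sp

A = sp.Matrix([[2, 1],
               [1, 1]])             # invertible, chosen for illustration
n = A.rows

# Row-reduce [A | I]: the left block is R (= I here, since A is invertible)
# and the right block records the row operations, i.e. P with R = P A.
aug, _ = sp.Matrix.hstack(A, sp.eye(n)).rref()
R, P = aug[:, :n], aug[:, n:]
assert R == P * A

# Solve A X = P^{-1} E; this same X solves R X = E.
E = sp.zeros(n, 1)
E[n - 1] = 1
X = A.solve(P.inv() * E)
assert R * X == E
```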

I get why the last row should not be zero; if it is, then ##A## is not invertible. But I don't see what is special about the last row. If any row is zero, ##A## is not invertible, invalidating our claim.
 
  • #2
Why so complicated? 1) says ##A## is an isomorphism (bijective), 2) says ##A## is a monomorphism (injective) and 3) says ##A## is an epimorphism (surjective). The equivalence of all three follows immediately from the dimension formula
$$n=\dim \operatorname{ker} A + \dim \operatorname{im} A = \operatorname{def} A +\operatorname{rk} A$$
If it really has to be coordinates, you could take a basis ##\{Y_1,\ldots , Y_n\}## and vectors ##AX_i=Y_i## and show that a linear dependency of ##\{X_1,\ldots ,X_n\}## leads to a linear dependency of ##\{Y_1,\ldots , Y_n\}##.
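
For a numerical illustration of the dimension formula (not a proof), one can check ##n = \dim \ker A + \dim \operatorname{im} A## with numpy/scipy; the singular matrix below is the example that comes up later in the thread.

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 1.0],
              [3.0, 3.0]])          # singular example from post #13
n = A.shape[0]

rank = np.linalg.matrix_rank(A)     # dim im A (rk A)
defect = null_space(A).shape[1]     # dim ker A (def A), via a kernel basis
assert rank + defect == n           # here: 1 + 1 == 2
```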
 
  • #3
fresh_42 said:
Why so complicated? 1) says ##A## is an isomorphism (bijective), 2) says ##A## is a monomorphism (injective) and 3) says ##A## is an epimorphism (surjective). The equivalence of all three follows immediately from the dimension formula
$$n=\dim \operatorname{ker} A + \dim \operatorname{im} A = \operatorname{def} A +\operatorname{rk} A$$
If it really has to be coordinates, you could take a basis ##\{Y_1,\ldots , Y_n\}## and vectors ##AX_i=Y_i## and show that a linear dependency of ##\{X_1,\ldots ,X_n\}## leads to a linear dependency of ##\{Y_1,\ldots , Y_n\}##.

Honestly, I don't know any of this. I know why 1 means bijective, but I don't know why 2 and 3 mean injective and surjective. Not to mention I don't know what dimension means, let alone the dimension formula.
 
  • #4
Buffu said:
Honestly, I don't know any of this. I know why 1 means bijective, but I don't know why 2 and 3 mean injective and surjective. Not to mention I don't know what dimension means, let alone the dimension formula.

For part 3), I wonder if you could simply choose a suitable ##Y##?

Alternatively, are you allowed to use the properties of determinants?
 
  • #5
Buffu said:
Honestly, I don't know any of this. I know why 1 means bijective, but I don't know why 2 and 3 mean injective and surjective. Not to mention I don't know what dimension means, let alone the dimension formula.
Sorry. In this case you have to go the rough way by equations, which I'm not really good at; that's why I suggested a more general approach. But since I've mentioned them, I'll explain the terms.
Injective means no two values of ##X## can be mapped to the same value ##Y##; that is, ##AX_1=Y=AX_2## implies ##X_1=X_2##. In our case, since ##A## is a linear function, ##AX_1-AX_2=A(X_1-X_2)=0## must imply ##X_1-X_2=0##. If we now substitute ##X=X_1-X_2##, we get exactly condition two.
Surjective means every vector ##Y## is hit by ##A##; that is, there exists a vector ##X## with ##AX=Y##, which is exactly condition three.
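
To see injectivity fail concretely when condition 2 fails, here is a short numpy sketch (an illustration with the singular matrix from later in the thread): any nonzero kernel vector produces two distinct inputs with the same image.

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 1.0],
              [3.0, 3.0]])          # singular, so A X = 0 has nontrivial solutions
N = null_space(A)[:, 0]             # a nonzero vector with A N = 0

X1 = np.array([1.0, 2.0])
X2 = X1 + N                         # X2 differs from X1 by a kernel vector
assert not np.allclose(X1, X2)
assert np.allclose(A @ X1, A @ X2)  # same image: A is not injective
```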
 
  • #6
PeroK said:
For part 3), I wonder if you could simply choose a suitable ##Y##?

Alternatively, are you allowed to use the properties of determinants?

I think that's because it is for each ##\bf Y##, not for some ##\bf Y##.

No determinants.
 
  • #7
fresh_42 said:
Sorry. In this case you have to go the rough way by equations, which I'm not really good at; that's why I suggested a more general approach. But since I've mentioned them, I'll explain the terms.
Injective means no two values of ##X## can be mapped to the same value ##Y##; that is, ##AX_1=Y=AX_2## implies ##X_1=X_2##. In our case, since ##A## is a linear function, ##AX_1-AX_2=A(X_1-X_2)=0## must imply ##X_1-X_2=0##. If we now substitute ##X=X_1-X_2##, we get exactly condition two.
Surjective means every vector ##Y## is hit by ##A##; that is, there exists a vector ##X## with ##AX=Y##, which is exactly condition three.

Oh thanks, that made the statement of the proof a bit clearer. :)
 
  • #8
Buffu said:
I get why the last row should not be zero; if it is, then ##A## is not invertible. But I don't see what is special about the last row. If any row is zero, ##A## is not invertible, invalidating our claim.
You can do the same for every row, say the ##i##-th row. Simply move the ##1## in ##E## to position ##i##.
 
  • #9
Buffu said:
I think that's because it is for each ##\bf Y##, not for some ##\bf Y##.

No determinants.

Suppose you had to prove the following, for ##a, x, y \in \mathbb{R}##:

If ##ax = y## has a solution for all ##y##, then show that ##a \ne 0##.

You could show this by considering ##y = 1##.

Whether it helps with your problem or not, you need to think about what "for all ##Y##" means and that you can logically choose any specific ##Y## you like and know that there is a solution.
 
  • #10
fresh_42 said:
You can do the same for every row, say the ##i##-th row. Simply move the ##1## in ##E## to position ##i##.

I also want to know why ##E## has a single ##1## and the rest zeros. Why this choice of ##E##? Wouldn't any other ##E## also work?
 
  • #11
PeroK said:
Suppose you had to prove the following, for ##a, x, y \in \mathbb{R}##:

If ##ax = y## has a solution for all ##y##, then show that ##a \ne 0##.

You could show this by considering ##y = 1##.

Whether it helps with your problem or not, you need to think about what "for all ##Y##" means and that you can logically choose any specific ##Y## you like and know that there is a solution.

But many inhomogeneous systems don't have any solution, so how can I be sure that my chosen ##\mathbf Y## has a solution?
 
  • #12
Buffu said:
But many inhomogeneous systems don't have any solution, so how can I be sure that my chosen ##\mathbf Y## has a solution?

Let me show you. It may not help you to do the problem but ...

Let ##Y = I##. Then ##\exists \ X## such that ##AX = I##.

Now, there is still some work to show that ##XA = I## and hence ##X = A^{-1}##.

But, that is an alternative approach to this problem.

Except, of course, ##X## and ##Y## are "vectors", not ##n \times n## matrices. My apologies, but I'll leave this post, as this idea can still be used for "vectors" ##Y##.
 
  • #13
PeroK said:
Let me show you. It may not help you to do the problem but ...

Let ##Y = I##. Then ##\exists \ X## such that ##AX = I##.

Now, there is still some work to show that ##XA = I## and hence ##X = A^{-1}##.

But, that is an alternative approach to this problem.

No, you took a nice ##\bf Y##.

Let ##A := \begin{bmatrix} 1 & 1 \\ 3 & 3 \end{bmatrix}##

Now say I am proving this question for this ##A##. If I take an arbitrary ##\bf Y##, I might not get any solution. Then what?
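
Indeed, for this ##A## the column space is spanned by ##(1,3)^T##, so ##A\mathbf X = \mathbf Y## is solvable only when ##y_2 = 3y_1##; that is exactly why such an ##A## fails hypothesis 3). A quick numpy check (an illustration, using the standard rank criterion for solvability):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [3.0, 3.0]])

# A X = Y is solvable iff Y lies in the column space of A,
# i.e. iff rank([A | Y]) == rank(A).
def solvable(Y):
    return np.linalg.matrix_rank(np.column_stack([A, Y])) == np.linalg.matrix_rank(A)

print(solvable(np.array([1.0, 3.0])))   # True:  (1, 3) satisfies y2 = 3*y1
print(solvable(np.array([1.0, 1.0])))   # False: (1, 1) does not
```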
 
  • #14
Buffu said:
No, you took a nice ##\bf Y##.

Let ##A := \begin{bmatrix} 1 & 1 \\ 3 & 3 \end{bmatrix}##

Now say I am proving this question for this ##A##. If I take an arbitrary ##\bf Y##, I might not get any solution. Then what?

Your logic is all back to front here. Your ##A## does not have the stated property, so it meets none of the criteria.

But, if you have an ##A## that does have the property in 3), then you can apply this property to each of the columns of the identity matrix and generate a right-inverse matrix for ##A##.
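
A sketch of that column-by-column construction (an illustration with an invertible ##A## of our own choosing): solving ##A\mathbf x_i = \mathbf e_i## for each column ##\mathbf e_i## of ##I## and stacking the solutions yields a right inverse.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])                 # has property 3), chosen for illustration
n = A.shape[0]

# Solve A x_i = e_i for each column e_i of the identity matrix ...
cols = [np.linalg.solve(A, e) for e in np.eye(n)]
B = np.column_stack(cols)                  # ... and stack the solutions.

assert np.allclose(A @ B, np.eye(n))       # B is a right inverse of A
assert np.allclose(B @ A, np.eye(n))       # here it is in fact the inverse
```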
 
  • #15
Buffu said:
I also want to know why ##E## has a single ##1## and the rest zeros. Why this choice of ##E##? Wouldn't any other ##E## also work?
If you consider ##E_1=(1,0,\ldots ,0)^T\; , \;E_2=(0,1,\ldots ,0)^T\; , \;\ldots \; , \;E_n=(0,0,\ldots ,1)^T## then you have all vectors needed to construct an arbitrary ##Y=y_1E_1+\ldots +y_nE_n##.

Anyway, the proof above uses the reduced row echelon matrix ##R## of ##A##. Row reduction produces an all-zero last row if ##A## is not regular (invertible), and a last row that cannot be all zero if ##A## is regular. This means we may concentrate on the last row of ##R## alone, and especially on its very last entry, since this entry decides whether the row reduction produced an all-zero row.

Therefore it's sufficient to look at the last row as the decisive indicator. All that is needed is a vector ##E \neq 0##. Choosing its last entry different from zero, for simplicity ##1##, is the natural choice, because we expect the last row of ##R## to be ##(0,0,\ldots ,1)## (by the procedure of row reduction).

Row (the last of ##R\,##) times column (##X=(x_1,\ldots ,x_n)^T\,##) gives us in the last row ##R_{n1}x_1 + \ldots + R_{nn} x_n = E_{n}##.
We are free to choose any vector ##E##. With ##E=(0,0,\ldots ,1)^T## we have ##R_{n1}x_1 + \ldots + R_{nn} x_n \stackrel{(*)}{=}R_{nn} x_n= 1##, and condition (3) guarantees us a solution ##X##. So ##R_{nn} \neq 0##. The equality ##(*)## holds because of the row reduction.
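
A tiny sympy check of this picture (an illustration, reusing the singular matrix from post #13): the reduced row echelon form of an invertible matrix ends in a last row ##(0,\ldots ,0,1)##, while a singular one ends in an all-zero last row, so ##R_{nn}x_n = 1## would have no solution there.

```python
import sympy as sp

invertible = sp.Matrix([[2, 1], [1, 1]])   # regular (invertible) example
singular   = sp.Matrix([[1, 1], [3, 3]])   # singular example from post #13

R1, _ = invertible.rref()
R2, _ = singular.rref()

print(R1)   # Matrix([[1, 0], [0, 1]]): last row (0, 1), so R_nn = 1 != 0
print(R2)   # Matrix([[1, 1], [0, 0]]): all-zero last row, R_nn x_n = 1 unsolvable
```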
 
  • #16
fresh_42 said:
If you consider ##E_1=(1,0,\ldots ,0)^T\; , \;E_2=(0,1,\ldots ,0)^T\; , \;\ldots \; , \;E_n=(0,0,\ldots ,1)^T## then you have all vectors needed to construct an arbitrary ##Y=y_1E_1+\ldots +y_nE_n##.

Anyway, the proof above uses the reduced row echelon matrix ##R## of ##A##. Row reduction produces an all-zero last row if ##A## is not regular (invertible), and a last row that cannot be all zero if ##A## is regular. This means we may concentrate on the last row of ##R## alone, and especially on its very last entry, since this entry decides whether the row reduction produced an all-zero row.

Therefore it's sufficient to look at the last row as the decisive indicator. All that is needed is a vector ##E \neq 0##. Choosing its last entry different from zero, for simplicity ##1##, is the natural choice, because we expect the last row of ##R## to be ##(0,0,\ldots ,1)## (by the procedure of row reduction).

Row (the last of ##R\,##) times column (##X=(x_1,\ldots ,x_n)^T\,##) gives us in the last row ##R_{n1}x_1 + \ldots + R_{nn} x_n = E_{n}##.
We are free to choose any vector ##E##. With ##E=(0,0,\ldots ,1)^T## we have ##R_{n1}x_1 + \ldots + R_{nn} x_n \stackrel{(*)}{=}R_{nn} x_n= 1##, and condition (3) guarantees us a solution ##X##. So ##R_{nn} \neq 0##. The equality ##(*)## holds because of the row reduction.

Yes, thank you very much. :smile:
 

1. What is the purpose of a proof?

A proof is a logical argument or series of steps that demonstrates the validity of a statement or theorem. The purpose of a proof is to show that a statement is true using well-established mathematical or scientific principles.

2. How do I read and understand a proof?

To understand a proof, it is important to have a basic understanding of the relevant mathematical or scientific concepts and notation. You should also carefully read each step of the proof and make sure that you understand the logical connections between them. If necessary, you can also consult additional resources or seek help from a teacher or tutor.

3. What are the different types of proofs?

There are several different types of proofs, including direct proofs, proof by contradiction, and proof by induction. Direct proofs involve starting with known or accepted facts and using logical steps to arrive at the desired conclusion. Proof by contradiction involves assuming the opposite of the statement and showing that this leads to a contradiction. Proof by induction involves proving that a statement is true for a base case and then showing that whenever it holds for one case it holds for the next, which together establish it for all cases.

4. How can I check if a proof is correct?

To check the validity of a proof, you can follow the steps yourself and make sure that each step logically follows from the previous one. You can also try to find counterexamples or exceptions to the statement being proved. It is also helpful to have someone else review the proof and provide feedback.

5. What should I do if I am struggling to understand a proof?

If you are having trouble understanding a proof, it can be helpful to break it down into smaller, more manageable parts. You can also try to find additional resources or seek help from a teacher or tutor. It is important to keep in mind that understanding proofs takes practice and patience, so don't get discouraged if it doesn't come easily at first.
