Understanding Invertible Matrices and Homogeneous Systems

In summary: part 1 states that ##A## is invertible; part 2, that the homogeneous system ##A\mathbf X = 0## has only the trivial solution; part 3, that the system of equations ##A\mathbf X = \mathbf Y## has a solution for each ##n\times 1## matrix ##\mathbf Y##. If ##A## is invertible, the solution of ##A\mathbf X = \mathbf Y## is ##\mathbf X = A^{-1}\mathbf Y##. Conversely, if ##A\mathbf X = \mathbf Y## has a solution for each given ##\mathbf Y##, then the row reduced echelon matrix ##R## that is row equivalent to ##A## must be the identity.
  • #1
Buffu
For an ##n\times n## matrix ##A##, the following are equivalent.

1) ##A## is invertible.
2) The homogeneous system ##A\bf X = 0## has only the trivial solution ##\mathbf X = 0##.
3) The system of equations ##A\bf X = \bf Y## has a solution for each ##n\times 1 ## matrix ##\bf Y##.
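
As a quick numerical sanity check (an illustration of the claim, not part of the proof), all three conditions can be verified together for a concrete invertible matrix; the matrix below is an arbitrary choice.

```python
import numpy as np

# An invertible 2x2 example, chosen for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
n = A.shape[0]

# 1) A is invertible: np.linalg.inv would raise LinAlgError if A were singular.
A_inv = np.linalg.inv(A)

# 2) A X = 0 has only the trivial solution: A has full rank,
#    so its null space contains only the zero vector.
assert np.linalg.matrix_rank(A) == n

# 3) A X = Y is solvable for every Y; we spot-check one arbitrary Y.
Y = np.array([3.0, 5.0])
X = np.linalg.solve(A, Y)          # the unique solution X = A^{-1} Y
assert np.allclose(A @ X, Y)
assert np.allclose(X, A_inv @ Y)
```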

I have a problem with the third part of the question, so I skip the proof of the equivalence of the first two parts, as it is trivial.

If ##A## is invertible, the solution of ##A \bf X = \bf Y## is ##\mathbf X = A^{-1} \mathbf Y##. Conversely, suppose ##A\bf X = \bf Y## has a solution for each given ##\bf Y##. Let ##R## be a row reduced echelon matrix which is row equivalent to ##A##. We wish to show that ##R = I##. That amounts to showing that the last row of ##R## is not ##0##. Let ##E = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}##

If the system ##R \bf X = \bf E## can be solved for ##X##, the last row of ##R## cannot be ##0##. We know that ##R = PA## where ##P## is invertible. Thus ##R\bf X = \bf E## if and only if ##A \mathbf X = P^{-1}\mathbf E##. According to (3) the latter system has a solution.
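
The objects in this argument can be computed explicitly. Here is a small sympy sketch (an illustration, assuming an invertible ##A## of our own choosing): row-reducing the augmented matrix ##[A \mid I]## yields ##R## on the left and an invertible ##P## with ##R = PA## on the right, and solving ##A\mathbf X = P^{-1}\mathbf E## indeed solves ##R\mathbf X = \mathbf E##.

```python
import sympy as sp

A = sp.Matrix([[2, 1],
               [1, 1]])             # invertible, chosen for illustration
n = A.rows

# Row-reduce [A | I]: the left block is R (= I here, since A is invertible)
# and the right block records the row operations, i.e. P with R = P A.
aug, _ = sp.Matrix.hstack(A, sp.eye(n)).rref()
R, P = aug[:, :n], aug[:, n:]
assert R == P * A

# Solve A X = P^{-1} E; this same X solves R X = E.
E = sp.zeros(n, 1)
E[n - 1] = 1
X = A.solve(P.inv() * E)
assert R * X == E
```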

I get why the last row should not be zero; if it is, then ##A## is not invertible. But I don't see what is special about the last row. If any row is zero, ##A## is not invertible, invalidating our claim.
 
  • #2
Why so complicated? 1) says ##A## is an isomorphism (bijective), 2) says ##A## is a monomorphism (injective) and 3) says ##A## is an epimorphism (surjective). The equivalence of all three follows immediately from the dimension formula
$$n=\dim \operatorname{ker} A + \dim \operatorname{im} A = \operatorname{def} A +\operatorname{rk} A$$
If it really has to be coordinates, you could take a basis ##\{Y_1,\ldots , Y_n\}## and vectors ##AX_i=Y_i## and show that a linear dependency of ##\{X_1,\ldots ,X_n\}## leads to a linear dependency of ##\{Y_1,\ldots , Y_n\}##.
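
For a numerical illustration of the dimension formula (not a proof), one can check ##n = \dim \ker A + \dim \operatorname{im} A## with numpy/scipy; the singular matrix below is the example that comes up later in the thread.

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 1.0],
              [3.0, 3.0]])          # singular example from post #13
n = A.shape[0]

rank = np.linalg.matrix_rank(A)     # dim im A (rk A)
defect = null_space(A).shape[1]     # dim ker A (def A), via a kernel basis
assert rank + defect == n           # here: 1 + 1 == 2
```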
 
  • #3
fresh_42 said:
Why so complicated? 1) says ##A## is an isomorphism (bijective), 2) says ##A## is a monomorphism (injective) and 3) says ##A## is an epimorphism (surjective). The equivalence of all three follows immediately from the dimension formula
$$n=\dim \operatorname{ker} A + \dim \operatorname{im} A = \operatorname{def} A +\operatorname{rk} A$$
If it really has to be coordinates, you could take a basis ##\{Y_1,\ldots , Y_n\}## and vectors ##AX_i=Y_i## and show that a linear dependency of ##\{X_1,\ldots ,X_n\}## leads to a linear dependency of ##\{Y_1,\ldots , Y_n\}##.

Honestly, I don't know any of this. I know why 1 means bijective, but I don't know why 2 and 3 mean injective and surjective. Not to mention I don't know what dimension means, let alone the dimension formula.
 
  • #4
Buffu said:
Honestly, I don't know any of this. I know why 1 means bijective, but I don't know why 2 and 3 mean injective and surjective. Not to mention I don't know what dimension means, let alone the dimension formula.

For part 3), I wonder if you could simply choose a suitable ##Y##?

Alternatively, are you allowed to use the properties of determinants?
 
  • #5
Buffu said:
Honestly, I don't know any of this. I know why 1 means bijective, but I don't know why 2 and 3 mean injective and surjective. Not to mention I don't know what dimension means, let alone the dimension formula.
Sorry. In this case you have to go the rough way by equations, which I'm not really good at; that's why I suggested a more general approach. But since I've mentioned them, I'll explain the terms.
Injective means no two values of ##X## can be mapped to the same value ##Y##; that is, ##AX_1=Y=AX_2## implies ##X_1=X_2##. In our case, since ##A## is a linear function, ##AX_1-AX_2=A(X_1-X_2)=0## must imply ##X_1-X_2=0##. If we now substitute ##X=X_1-X_2##, we get exactly condition two.
Surjective means every vector ##Y## is hit by ##A##; that is, there exists a vector ##X## with ##AX=Y##, which is exactly condition three.
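
To see injectivity fail concretely when condition 2 fails, here is a short numpy sketch (an illustration with the singular matrix from later in the thread): any nonzero kernel vector produces two distinct inputs with the same image.

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 1.0],
              [3.0, 3.0]])          # singular, so A X = 0 has nontrivial solutions
N = null_space(A)[:, 0]             # a nonzero vector with A N = 0

X1 = np.array([1.0, 2.0])
X2 = X1 + N                         # X2 differs from X1 by a kernel vector
assert not np.allclose(X1, X2)
assert np.allclose(A @ X1, A @ X2)  # same image: A is not injective
```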
 
  • #6
PeroK said:
For part 3), I wonder if you could simply choose a suitable ##Y##?

Alternatively, are you allowed to use the properties of determinants?

I think that's because it is for each ##\bf Y##, not for some ##\bf Y##.

No determinants.
 
  • #7
fresh_42 said:
Sorry. In this case you have to go the rough way by equations, which I'm not really good at; that's why I suggested a more general approach. But since I've mentioned them, I'll explain the terms.
Injective means no two values of ##X## can be mapped to the same value ##Y##; that is, ##AX_1=Y=AX_2## implies ##X_1=X_2##. In our case, since ##A## is a linear function, ##AX_1-AX_2=A(X_1-X_2)=0## must imply ##X_1-X_2=0##. If we now substitute ##X=X_1-X_2##, we get exactly condition two.
Surjective means every vector ##Y## is hit by ##A##; that is, there exists a vector ##X## with ##AX=Y##, which is exactly condition three.

Oh thanks, that made the statement of the proof a bit clearer. :)
 
  • #8
Buffu said:
I get why the last row should not be zero; if it is, then ##A## is not invertible. But I don't see what is special about the last row. If any row is zero, ##A## is not invertible, invalidating our claim.
You can do the same for every row, say the ##i##-th row. Simply move the ##1## in ##E## to position ##i##.
 
  • #9
Buffu said:
I think that's because it is for each ##\bf Y##, not for some ##\bf Y##.

No determinants.

Suppose you had to prove the following, for ##a, x, y \in \mathbb{R}##:

If ##ax = y## has a solution for all ##y##, then show that ##a \ne 0##.

You could show this by considering ##y = 1##.

Whether it helps with your problem or not, you need to think about what "for all ##Y##" means and that you can logically choose any specific ##Y## you like and know that there is a solution.
 
  • #10
fresh_42 said:
You can do the same for every row, say the ##i##-th row. Simply move the ##1## in ##E## to position ##i##.

I also want to know why ##E## has a single ##1## and the rest zeros. Why this choice of ##E##? Wouldn't any other ##E## also work?
 
  • #11
PeroK said:
Suppose you had to prove the following, for ##a, x, y \in \mathbb{R}##:

If ##ax = y## has a solution for all ##y##, then show that ##a \ne 0##.

You could show this by considering ##y = 1##.

Whether it helps with your problem or not, you need to think about what "for all ##Y##" means and that you can logically choose any specific ##Y## you like and know that there is a solution.

But many inhomogeneous systems don't have any solution, so how can I be sure that my chosen ##\mathbf Y## has a solution?
 
  • #12
Buffu said:
But many inhomogeneous systems don't have any solution, so how can I be sure that my chosen ##\mathbf Y## has a solution?

Let me show you. It may not help you to do the problem but ...

Let ##Y = I##. Then ##\exists \ X## such that ##AX = I##.

Now, there is still some work to show that ##XA = I## and hence ##X = A^{-1}##.

But, that is an alternative approach to this problem.

Except, of course, ##X## and ##Y## are "vectors", not ##n \times n## matrices. My apologies, but I'll leave this post, as this idea can still be used for "vectors" ##Y##.
 
  • #13
PeroK said:
Let me show you. It may not help you to do the problem but ...

Let ##Y = I##. Then ##\exists \ X## such that ##AX = I##.

Now, there is still some work to show that ##XA = I## and hence ##X = A^{-1}##.

But, that is an alternative approach to this problem.

No, you took a nice ##\bf Y##.

Let ##A := \begin{bmatrix} 1 & 1 \\ 3 & 3 \end{bmatrix}##

Now say I am proving this question for this ##A##. If I take an arbitrary ##\bf Y##, I might not get any solution. Then what?
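
Indeed, for this ##A## the column space is spanned by ##(1,3)^T##, so ##A\mathbf X = \mathbf Y## is solvable only when ##y_2 = 3y_1##; that is exactly why such an ##A## fails hypothesis 3). A quick numpy check (an illustration, using the standard rank criterion for solvability):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [3.0, 3.0]])

# A X = Y is solvable iff Y lies in the column space of A,
# i.e. iff rank([A | Y]) == rank(A).
def solvable(Y):
    return np.linalg.matrix_rank(np.column_stack([A, Y])) == np.linalg.matrix_rank(A)

print(solvable(np.array([1.0, 3.0])))   # True:  (1, 3) satisfies y2 = 3*y1
print(solvable(np.array([1.0, 1.0])))   # False: (1, 1) does not
```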
 
  • #14
Buffu said:
No, you took a nice ##\bf Y##.

Let ##A := \begin{bmatrix} 1 & 1 \\ 3 & 3 \end{bmatrix}##

Now say I am proving this question for this ##A##. If I take an arbitrary ##\bf Y##, I might not get any solution. Then what?

Your logic is all back to front here. Your ##A## does not have the stated property, so it meets none of the criteria.

But, if you have an ##A## that does have the property in 3), then you can apply this property to each of the columns of the identity matrix and generate a right-inverse matrix for ##A##.
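
A sketch of that column-by-column construction (an illustration with an invertible ##A## of our own choosing): solving ##A\mathbf x_i = \mathbf e_i## for each column ##\mathbf e_i## of ##I## and stacking the solutions yields a right inverse.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])                 # has property 3), chosen for illustration
n = A.shape[0]

# Solve A x_i = e_i for each column e_i of the identity matrix ...
cols = [np.linalg.solve(A, e) for e in np.eye(n)]
B = np.column_stack(cols)                  # ... and stack the solutions.

assert np.allclose(A @ B, np.eye(n))       # B is a right inverse of A
assert np.allclose(B @ A, np.eye(n))       # here it is in fact the inverse
```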
 
  • #15
Buffu said:
I also want to know why ##E## has a single ##1## and the rest zeros. Why this choice of ##E##? Wouldn't any other ##E## also work?
If you consider ##E_1=(1,0,\ldots ,0)^T\; , \;E_2=(0,1,\ldots ,0)^T\; , \;\ldots \; , \;E_n=(0,0,\ldots ,1)^T## then you have all vectors needed to construct an arbitrary ##Y=y_1E_1+\ldots +y_nE_n##.

Anyway, the proof above uses the reduced row echelon matrix ##R## of ##A##. Row reduction produces an all-zero last row if ##A## is not regular (invertible), and a last row that cannot be all zero if ##A## is regular. This means we may concentrate on the last row of ##R## alone, and especially on its very last entry, since this entry decides whether the row reduction produced an all-zero row.

Therefore it's sufficient to look at the last row as the decisive indicator. All that is needed is a vector ##E \neq 0##. Choosing its last entry different from zero, for simplicity ##1##, is the natural choice, because we expect the last row of ##R## to be ##(0,0,\ldots ,1)## (by the procedure of row reduction).

Row (the last of ##R\,##) times column (##X=(x_1,\ldots ,x_n)^T\,##) gives us in the last row ##R_{n1}x_1 + \ldots + R_{nn} x_n = E_{n}##.
We are free to choose any vector ##E##. With ##E=(0,0,\ldots ,1)^T## we have ##R_{n1}x_1 + \ldots + R_{nn} x_n \stackrel{(*)}{=}R_{nn} x_n= 1##, and condition (3) guarantees us a solution ##X##. So ##R_{nn} \neq 0##. The equality ##(*)## holds because of the row reduction.
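
A tiny sympy check of this picture (an illustration, reusing the singular matrix from post #13): the reduced row echelon form of an invertible matrix ends in a last row ##(0,\ldots ,0,1)##, while a singular one ends in an all-zero last row, so ##R_{nn}x_n = 1## would have no solution there.

```python
import sympy as sp

invertible = sp.Matrix([[2, 1], [1, 1]])   # regular (invertible) example
singular   = sp.Matrix([[1, 1], [3, 3]])   # singular example from post #13

R1, _ = invertible.rref()
R2, _ = singular.rref()

print(R1)   # Matrix([[1, 0], [0, 1]]): last row (0, 1), so R_nn = 1 != 0
print(R2)   # Matrix([[1, 1], [0, 0]]): all-zero last row, R_nn x_n = 1 unsolvable
```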
 
  • #16
fresh_42 said:
If you consider ##E_1=(1,0,\ldots ,0)^T\; , \;E_2=(0,1,\ldots ,0)^T\; , \;\ldots \; , \;E_n=(0,0,\ldots ,1)^T## then you have all vectors needed to construct an arbitrary ##Y=y_1E_1+\ldots +y_nE_n##.

Anyway, the proof above uses the reduced row echelon matrix ##R## of ##A##. Row reduction produces an all-zero last row if ##A## is not regular (invertible), and a last row that cannot be all zero if ##A## is regular. This means we may concentrate on the last row of ##R## alone, and especially on its very last entry, since this entry decides whether the row reduction produced an all-zero row.

Therefore it's sufficient to look at the last row as the decisive indicator. All that is needed is a vector ##E \neq 0##. Choosing its last entry different from zero, for simplicity ##1##, is the natural choice, because we expect the last row of ##R## to be ##(0,0,\ldots ,1)## (by the procedure of row reduction).

Row (the last of ##R\,##) times column (##X=(x_1,\ldots ,x_n)^T\,##) gives us in the last row ##R_{n1}x_1 + \ldots + R_{nn} x_n = E_{n}##.
We are free to choose any vector ##E##. With ##E=(0,0,\ldots ,1)^T## we have ##R_{n1}x_1 + \ldots + R_{nn} x_n \stackrel{(*)}{=}R_{nn} x_n= 1##, and condition (3) guarantees us a solution ##X##. So ##R_{nn} \neq 0##. The equality ##(*)## holds because of the row reduction.

Yes, thank you very much. :smile:
 

1. What is the purpose of a proof?

A proof is a logical argument or series of steps that demonstrates the validity of a statement or theorem. The purpose of a proof is to show that a statement is true using well-established mathematical or scientific principles.

2. How do I read and understand a proof?

To understand a proof, it is important to have a basic understanding of the relevant mathematical or scientific concepts and notation. You should also carefully read each step of the proof and make sure that you understand the logical connections between them. If necessary, you can also consult additional resources or seek help from a teacher or tutor.

3. What are the different types of proofs?

There are several different types of proofs, including direct proofs, proof by contradiction, and proof by induction. Direct proofs involve starting with known or accepted facts and using logical steps to arrive at the desired conclusion. Proof by contradiction involves assuming the opposite of the statement and showing that this leads to a contradiction. Proof by induction involves proving that a statement is true for a base case and then showing that whenever it holds for one case it holds for the next, which together establish it for all cases.

4. How can I check if a proof is correct?

To check the validity of a proof, you can follow the steps yourself and make sure that each step logically follows from the previous one. You can also try to find counterexamples or exceptions to the statement being proved. It is also helpful to have someone else review the proof and provide feedback.

5. What should I do if I am struggling to understand a proof?

If you are having trouble understanding a proof, it can be helpful to break it down into smaller, more manageable parts. You can also try to find additional resources or seek help from a teacher or tutor. It is important to keep in mind that understanding proofs takes practice and patience, so don't get discouraged if it doesn't come easily at first.
