Left and right inverses of a non-square matrix

  • Thread starter: PainterGuy
  • Tags: Matrix
AI Thread Summary
The discussion centers on when expressions built from the left and right inverses of a non-square matrix A are solutions of the equation Ax=y. It is established that if A is m×n with rank m and n>m, then A′A is singular, so (A′A)⁻¹A′y is undefined and cannot be a solution, while A′(AA′)⁻¹y is a valid solution. The conversation also addresses the distinction between writing Ax=y and AX=B, clarifying that the notation reflects conventions in which uppercase letters typically denote matrices and lowercase letters denote vectors. Participants express confusion over the application of right and left inverses in these contexts, particularly regarding their placement in equations. Ultimately, the discussion emphasizes the importance of understanding the implications of matrix rank and the conventions used in mathematical notation.
PainterGuy
Homework Statement
I was working with right inverse and left inverse of a matrix.
Relevant Equations
Please check my work below.
Hi,

It's actually not a homework problem but I still decided to post it here.

Problem:
Consider Ax=y, where A is m×n and has rank m. Is (A′A)⁻¹A′y a solution? If not, under what condition will it be a solution? Is A′(AA′)⁻¹y a solution?

The given solution is:
Consider Ax=y with A m×n and rank(A)=m, which implies n≥m. If n>m then A′A is n×n and singular. Thus (A′A)⁻¹ is not defined and (A′A)⁻¹A′y is not a solution. Because AA′ is m×m and nonsingular, substituting A′(AA′)⁻¹y into Ax yields y. Thus A′(AA′)⁻¹y is a solution. If n=m then both reduce to A⁻¹y and are solutions of Ax=y.
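As a quick sanity check of the given solution, here is a minimal NumPy sketch. The matrix A and the vector y are made-up example values (they are not part of the problem), and ′ is read as the ordinary transpose, assuming real entries.

```python
import numpy as np

# Hypothetical example data: A is 2x3 (m=2, n=3) with rank m=2.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
y = np.array([2.0, 3.0])

# A'A is 3x3 but only has rank 2, so it is singular and has no inverse.
rank_AtA = np.linalg.matrix_rank(A.T @ A)
print("rank of A'A:", rank_AtA, "but its size is", (A.T @ A).shape)

# AA' is 2x2 and nonsingular, so x = A'(AA')^{-1} y is well defined.
# solve() is used instead of forming the inverse explicitly.
x = A.T @ np.linalg.solve(A @ A.T, y)

# Substituting x into Ax gives (AA')(AA')^{-1} y = y.
print("x   =", x)
print("A x =", A @ x)
assert np.allclose(A @ x, y)
```

The substitution works because A multiplies the candidate from the left, so the product AA′ meets its own inverse and cancels.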
Question 1:
A′(AA′)⁻¹ is a right inverse, but the system in the problem statement is written as "Ax=y". The yellow highlighted text below says that a right inverse, A′(AA′)⁻¹, is useful for solving a system of the form XA=Y. You can see the contradiction. Which source is correct: the given solution or the yellow highlighted text? Could you please help me?
1616399245942.png

Source: https://math.stackexchange.com/a/1335707/775285

Question 2:
Mostly it's written AX=B where B is a constant matrix. In the case above the system is written instead as Ax=y. Is B=y? This link is relevant here. Could you please help me with it?
 
I cannot interpret this:
PainterGuy said:
substituting A′(AA′)⁻¹y into Ax yields y.
If they mean premultiplying by A′(AA′)⁻¹, it yields A′(AA′)⁻¹Ax=A′(AA′)⁻¹y. If A were nonsingular we could expand that inverse and get x=A′(AA′)⁻¹y, but not otherwise.
The veracity of the statement in yellow is easily checked.

For q2, the usual is to use uppercase for a matrix and lowercase for a vector, but of course the latter is just a special case of the former.
 
Thank you.

I'm sorry but I still don't get it.

Of course, a right inverse is useful for solving an equation of the form XA=Y and a left inverse is useful for solving an equation of the form AX=Y.
Source: https://math.stackexchange.com/a/1335707/775285

In the given solution a right inverse is used, but the given system has the form AX=Y. The choice of a right inverse is correct, in my opinion, which means that the StackExchange source is wrong.

Consider Ax=y with A m×n and rank(A)=m, which implies n≥m. If n>m then A′A is n×n and singular. Thus (A′A)⁻¹ is not defined and (A′A)⁻¹A′y is not a solution. Because AA′ is m×m and nonsingular, substituting A′(AA′)⁻¹y into Ax yields y. Thus A′(AA′)⁻¹y is a solution. If n=m then both reduce to A⁻¹y and are solutions of Ax=y.

haruspex said:
For q2, the usual is to use uppercase for a matrix and lowercase for a vector, but of course the latter is just a special case of the former.

Once again sorry but I think my question wasn't understood. Mostly, it's AX=B and not AX=Y. To me, AX=Y doesn't make much sense.

Question 2:
Mostly it's written AX=B where B is a constant matrix. In the case above the system is written instead as Ax=y. Is B=y?
Note to self:
1616489810622.png

...
1616489827889.png

Source: https://en.wikipedia.org/wiki/Generalized_inverse
 
These formulas you give for left and right inverses work over ##\mathbb R## but do not work over other fields, so the thread is a bit misleading; I assume we are working over ##\mathbb R## from here forward. I also wouldn't read too much into the linked Stack Exchange post.

If ##A## is surjective and you are trying to solve for an ##\mathbf x## such that ##A\mathbf x = \mathbf b##, then you know a solution exists (by surjectivity), so you actually want to use the right inverse formula. The 'left inverse formula' corresponds to the case of ##A## being injective: if a solution exists, it is unique and you will find it, but a solution may not exist. In that case you have ##A\mathbf x\neq \mathbf b## for all ##\mathbf x##; working over ##\mathbb R## you cannot 'solve' such an equation, but you can find the unique candidate ##\mathbf x## that minimizes ##\big\Vert A\mathbf x- \mathbf b\big\Vert_2##.
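To illustrate the injective case, here is a small NumPy sketch. The tall matrix A and both right-hand sides are made-up example values, and the comparison against np.linalg.lstsq is only a cross-check, not something used in the thread.

```python
import numpy as np

# Hypothetical example data: A is 3x2 with rank 2, so it is injective.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

# Left inverse (A^T A)^{-1} A^T; A^T A is 2x2 and nonsingular here.
left_inv = np.linalg.inv(A.T @ A) @ A.T
assert np.allclose(left_inv @ A, np.eye(2))

# Case 1: b lies in the range of A, so the unique solution is recovered.
b_in = A @ np.array([1.0, 2.0])
x_exact = left_inv @ b_in
assert np.allclose(A @ x_exact, b_in)

# Case 2: b is outside the range (the third entry is not the sum of the
# first two), so no x satisfies A x = b exactly.
b_out = np.array([1.0, 1.0, 0.0])
x_ls = left_inv @ b_out
print("residual A x - b:", A @ x_ls - b_out)

# The candidate agrees with NumPy's least-squares solver.
x_ref, *_ = np.linalg.lstsq(A, b_out, rcond=None)
assert np.allclose(x_ls, x_ref)
```

In the second case the left inverse still returns a vector, but it is only the minimizer of the 2-norm residual, not a solution.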
 
Thank you!

StoneTemplePython said:
If ##A## is surjective and you are trying to solve for an ##\mathbf x## such that ##A\mathbf x = \mathbf b## then you know a solution exists (by surjectivity) and so you actually want to use the right inverse formula.

Possibly this is a dumb question, but we are writing the right inverse on the left of A. Isn't the right inverse of A supposed to be written on the right of A?

1616548987242.png
Why is the original system written as AX=Y and not as AX=B? Could you please guide me?
 
You've written AX = Y vs AX = B many times, and I see a distinction without a difference; it seems irrelevant to me, so I don't know what or why you are asking. It is a bit concerning though.

As for the other part of your question:
Suppose ##A\in \mathbb R^{m\times n}## is surjective and you want to find some ##\mathbf x## such that ##A\mathbf x = \mathbf b##, where of course ##A## and ##\mathbf b## are specified beforehand.

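Then one valid solution is ##\mathbf x := A^T(AA^T)^{-1}\mathbf b## because ##A^T(AA^T)^{-1}## is a right inverse of ##A##, so
##A\big(A^T(AA^T)^{-1}\big)= (AA^T)(AA^T)^{-1} = I_m## (which is the definition of a right inverse)
##\implies A\mathbf x = A\big(A^T(AA^T)^{-1}\mathbf b\big)= \big(AA^T(AA^T)^{-1}\big)\mathbf b = I_m \mathbf b=\mathbf b##
The right inverse sits to the right of ##A##, which puts it to the left of ##\mathbf b## in the formula for ##\mathbf x##.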

It turns out that over the reals this is the minimum-length (2-norm) solution ##\mathbf x## of ##A\mathbf x = \mathbf b##. But based on the level of questioning I think this is way outside the scope.
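For anyone who wants to see the minimum-length claim numerically, here is a rough NumPy sketch. The matrix A, the vector b, and the null-space vector z are made-up example values, and the right inverse is taken to be ##A^T(AA^T)^{-1}##.

```python
import numpy as np

# Hypothetical example data: A is 2x3 with rank 2, so it is surjective.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
b = np.array([2.0, 3.0])

# Right-inverse solution x = A^T (A A^T)^{-1} b.
x_min = A.T @ np.linalg.solve(A @ A.T, b)
assert np.allclose(A @ x_min, b)

# z spans the null space of this A, so x_min + t*z also solves A x = b.
z = np.array([1.0, 1.0, -1.0])
assert np.allclose(A @ z, 0.0)

# Every other solution tried here is strictly longer in the 2-norm.
for t in (-1.0, 0.5, 2.0):
    x_other = x_min + t * z
    assert np.allclose(A @ x_other, b)
    assert np.linalg.norm(x_other) > np.linalg.norm(x_min)

# The Moore-Penrose pseudoinverse gives the same vector for full row rank A.
assert np.allclose(np.linalg.pinv(A) @ b, x_min)
print("minimum-norm solution:", x_min)
```

The reason behind it is that x_min is a combination of the rows of A and is therefore orthogonal to the null space, so adding any null-space component can only increase the length.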
 
PainterGuy said:
Why is the original system written as AX=Y and not as AX=B? Could you please guide me?
I mentioned before that a common convention is to use uppercase for matrices and lowercase for vectors. Another is to use A, B, ... for matrices and X, Y, ... for vectors; but, confusingly, X is also used for an unknown matrix. So AX=B suggests matrices, whereas in AX=Y, X and Y may well be vectors.
But these are only conventions. In any given usage the meanings should be made clear by the context.
 
Thank you very much!

I understand now why the right inverse was written on the left in the given case.

1616657757111.png
 