Solving for a Surjective Matrix

In summary, the thread discusses a proposition stating that if a matrix A is surjective, then its transpose A^T is injective and the matrix A^TA is invertible. The participants agree that the second claim is incorrect, as a counterexample shows. They also discuss how this error propagates into the solution of the associated minimum-norm optimization problem, and conclude that the correct statement involves AA^T rather than A^TA.
  • #1
lauratyso11n
I saw this in a book as a Proposition but I think it's an error:

Assume that the (n-by-k) matrix, [tex]A[/tex], is surjective as a mapping,

[tex]A:\mathbb{R}^{k}\rightarrow \mathbb{R}^{n}[/tex].

For any [tex]y \in \mathbb{R}^{n} [/tex], consider the optimization problem

[tex]\min_{x \in \mathbb{R}^{k}} \|x\|^{2}[/tex]

such that [tex] Ax = y[/tex].

Then, the following hold:
(i) The transpose of [tex]A[/tex], call it [tex]A^{T}[/tex], is injective.
(ii) The matrix [tex]A^{T}A[/tex] is invertible.
(iii) etc etc etc...

I have a problem with point (ii), take as an example the (2-by-3) surjective matrix
[tex]A = \begin{pmatrix}
1 & 0 & 0\\
0 & 1 & 0
\end{pmatrix}[/tex]

[tex]A^{T}A[/tex] in this case is not invertible.

Can anyone confirm that part (ii) of this Proposition is indeed incorrect ?
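(For anyone who wants to check the counterexample numerically, here is a quick NumPy sketch; the arrays are just the matrix from this post.)

```python
import numpy as np

# The 2-by-3 surjective matrix from the post.
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

AtA = A.T @ A  # 3-by-3, equals diag(1, 1, 0)
AAt = A @ A.T  # 2-by-2, equals the identity

print(np.linalg.matrix_rank(AtA))  # 2 < 3, so A^T A is singular
print(np.linalg.det(AAt))          # 1.0, so A A^T is invertible
```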
 
  • #2
I would agree with you that part (ii) of the proposition is incorrect (unless matrices are not acting on vectors from the left).

If you look at bijection part
http://en.wikipedia.org/wiki/Bijection,_injection_and_surjection
it reads: "If g o f is a bijection, then it can only be concluded that f is injective and g is surjective."

Working right to left with matrices, composition of functions says that if A^{T}A were invertible (i.e. a bijection), then A would be injective and A^{T} would be surjective. Thus something is wrong!

[edit]
P.S. I didn't see the bit where it clearly said the matrices were acting from the left so I would say that it is definitely wrong.
[edit]
 
  • #3
lauratyso11n said:
Can anyone confirm that part (ii) of this Proposition is indeed incorrect ?
Yes, you are right. (i) is true but (ii) is false. But (ii) is true if A^TA is replaced by AA^T, so maybe it's a typo? I don't understand what "the optimization problem" has to do with this, or are the other parts of the proposition about this?
 
  • #4
Landau said:
Yes, you are right. (i) is true but (ii) is false. But (ii) is true if A^TA is replaced by AA^T, so maybe it's a typo? I don't understand what "the optimization problem" has to do with this, or are the other parts of the proposition about this?

The full Proposition is as follows:

Assume that the (n-by-k) matrix, [tex]A[/tex], is surjective as a mapping,

[tex]A:\mathbb{R}^{k}\rightarrow \mathbb{R}^{n}[/tex].

For any [tex]y \in \mathbb{R}^{n} [/tex], consider the optimization problem

[tex]\min_{x \in \mathbb{R}^{k}} \|x\|^{2}[/tex]

such that [tex] Ax = y[/tex].

Then, the following hold:
(i) The transpose of [tex]A[/tex], call it [tex]A^{T}[/tex], is injective.
(ii) The matrix [tex]A^{T}A[/tex] is invertible.
(iii) The unique optimal solution of the minimum norm problem is given by
[tex](A^TA)^{-1}A^Ty[/tex]

I have a problem with point (ii), take as an example the (2-by-3) surjective matrix
[tex]A = \begin{pmatrix}
1 & 0 & 0\\
0 & 1 & 0
\end{pmatrix}[/tex]

[tex]A^{T}A[/tex] in this case is not invertible.
 
  • #5
Landau said:
Yes, you are right. (i) is true but (ii) is false. But (ii) is true if A^TA is replaced by AA^T, so maybe it's a typo?

I don't think it's a typo as he uses this result to carry some other analysis further. The crux of it is:

[tex]\sigma\lambda = \alpha - \bf{r}[/tex]

where [tex]\sigma \in \mathbb{R}^{n \times k}[/tex] is surjective, and [tex]\lambda \in \mathbb{R}^{k}[/tex], [tex]\alpha , \bf{r} \in \mathbb{R}^{n}[/tex].

How would you solve for [tex]\lambda[/tex] ? Isn't it critical that the 'typo' be correct in order to solve for this ?

The author's solution as you might expect is [tex]\lambda = \left(\sigma^{T}\sigma\right)^{-1}\sigma^{T}\left[\alpha - \bf{r}\right][/tex].

So it's either a HUGE mistake on his part or I'm missing something. The author is actually quite insightful, and this error would be quite out of character for him.

BTW, thanks for making the effort to look at the problem. Much appreciated.
 
  • #6
Looking at dimension counting with your example, A^TA is a 3-by-3 matrix, which means (A^TA)^{-1}A wouldn't make sense even if the inverse were defined, because the sizes of the matrices don't match up. On the other hand, AA^T and A are compatible matrices, which suggests that he just put the transpose on the wrong factor and carried the error through.
 
  • #7
lauratyso11n said:
The author's solution as you might expect is [tex]\lambda = \left(\sigma^{T}\sigma\right)^{-1}\sigma^{T}[\alpha - \bf{r}][/tex].
This is correct provided that [itex]\sigma^{T}\sigma[/itex] is invertible, i.e. provided that [itex]\left(\sigma^{T}\sigma\right)^{-1}[/itex] makes sense.

The least square solution of Ax=y satisfies the normal equation:
[tex]A^TAx=A^Ty.[/tex]
This solution is unique if and only if [itex]A^TA[/itex] is invertible. In this case, it is given by
[tex]x=(A^TA)^{-1}A^Ty,[/tex]
just like the author asserts.
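(As a concrete illustration of the normal equation, here is a small NumPy sketch; the matrix and right-hand side are made up for the example, and this A has full column rank so that A^TA is invertible.)

```python
import numpy as np

# Normal equations A^T A x = A^T y for a least-squares problem.
# This A is 3-by-2 with full column rank, so A^T A is invertible
# and the least-squares solution is unique.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
y = np.array([1.0, 2.0, 4.0])

x = np.linalg.solve(A.T @ A, A.T @ y)

# Agrees with NumPy's own least-squares solver.
print(np.allclose(x, np.linalg.lstsq(A, y, rcond=None)[0]))  # True
```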

But [itex]A^TA[/itex] is not necessarily invertible, contrary to the proposition. Note that [itex]A^TA:\mathbb{R}^k\to\mathbb{R}^k[/itex] is bijective if and only if its rank equals k. But since [itex]A^TA[/itex] and A always have equal rank, this happens if and only if A has rank k. Since [itex]A:\mathbb{R}^k\to\mathbb{R}^n[/itex] is assumed to be surjective, it has rank n and we must have [itex]k\geq n[/itex]. So in fact, [itex]A^TA[/itex] is invertible if and only if k=n! Indeed, in your counter-example k and n are not equal.
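(The rank fact used here, that [itex]A^TA[/itex] and A always have equal rank, is easy to spot-check numerically; a sketch with random matrices, which is only a sanity check, not a proof:)

```python
import numpy as np

rng = np.random.default_rng(0)

# Spot-check that rank(A^T A) == rank(A) for a few shapes (n, k).
for n, k in [(2, 3), (3, 3), (5, 2), (4, 7)]:
    A = rng.standard_normal((n, k))
    assert np.linalg.matrix_rank(A.T @ A) == np.linalg.matrix_rank(A)

print("rank(A^T A) == rank(A) for all tested shapes")
```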
Office_Shredder said:
which means (A^TA)^{-1}A wouldn't make sense even if the inverse were defined, because the sizes of the matrices don't match up.
This is not the expression the author uses; it is (A^TA)^{-1}A^T.
 
  • #8
Perhaps you could post a link to the book?
 
  • #9
Landau said:
This is not the expression the author uses; it is (A^TA)^{-1}A^T.

I assumed the * was just referring to matrix multiplication
 
  • #10
I assumed * means adjoint, so (in this real case) the transpose. This is also what lauratyso11n writes in post #5 (where A is called sigma).
 
  • #11
Office_Shredder said:
I assumed the * was just referring to matrix multiplication

Landau said:
I assumed * means adjoint, so (in this real case) the transpose. This is also what lauratyso11n writes in post #5 (where A is called sigma).

SORRYYYYYY, the [tex]A^*[/tex] is actually an [tex]A^T[/tex]. I've corrected it.
 
  • #12
The correct statement is that if [tex]A[/tex] is surjective then [tex]A^T[/tex] is injective and [tex]AA^T[/tex] is invertible.

The formula for the optimal [tex]x[/tex] is

[tex]\hat{x}=A^T(AA^T)^{-1}y[/tex]
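(A numerical check of this formula on the counterexample matrix from this thread; the vector y is arbitrary, and np.linalg.solve is used rather than forming the inverse explicitly.)

```python
import numpy as np

# Minimum-norm solution of Ax = y for surjective (full row rank) A:
# x_hat = A^T (A A^T)^{-1} y.
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
y = np.array([2.0, -1.0])

x_hat = A.T @ np.linalg.solve(A @ A.T, y)

print(np.allclose(A @ x_hat, y))                  # True: constraint satisfied
print(np.allclose(x_hat, np.linalg.pinv(A) @ y))  # True: matches the pseudoinverse
```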
 
  • #13
lauratyso11n said:
The correct statement is that if [tex]A[/tex] is surjective then [tex]A^T[/tex] is injective and [tex]AA^T[/tex] is invertible.
So it was a typo :)
Landau said:
Yes, you are right. (i) is true but (ii) is false. But (ii) is true if A^TA is replaced by AA^T, so maybe it's a typo?
 

What is a surjective matrix?

A surjective matrix is one whose associated linear map is onto: every vector in the output space (codomain) is the image of at least one vector in the input space (domain). For an n-by-k real matrix, this is equivalent to the matrix having full row rank n.

How do you solve for a surjective matrix?

To check whether a matrix is surjective, compute its rank, for example by row reduction: an n-by-k matrix is surjective if and only if its rank equals n, i.e. its rows are linearly independent. Equivalently, the equation Ax = y has at least one solution for every y in the codomain.
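(In practice this rank test can be sketched in a few lines of NumPy; the helper name below is made up for illustration.)

```python
import numpy as np

def is_surjective(A: np.ndarray) -> bool:
    """An n-by-k matrix is surjective iff it has full row rank n."""
    n, _ = A.shape
    return np.linalg.matrix_rank(A) == n

print(is_surjective(np.array([[1.0, 0.0, 0.0],
                              [0.0, 1.0, 0.0]])))  # True: rank 2 = n
print(is_surjective(np.array([[1.0, 2.0],
                              [2.0, 4.0]])))       # False: rank 1 < 2
```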

What is the importance of solving for a surjective matrix?

Establishing that a matrix is surjective tells you that the corresponding linear transformation is onto, i.e. that Ax = y is solvable for every y. This is useful in applications such as data analysis, computer graphics, and engineering, where guaranteed solvability of linear systems is needed.

What are the applications of surjective matrices?

Surjective matrices have various applications in linear algebra, engineering, and computer science. They appear in data compression, image processing, and cryptography, among other areas, as well as in solving systems of linear equations and in determining the invertibility of matrices.

Are there any limitations to solving for a surjective matrix?

One limitation is that rank computations can be computationally intensive and numerically delicate for large matrices. Additionally, surjectivity guarantees the existence of a solution to Ax = y but not its uniqueness: when k > n the solution set is a whole affine subspace, which is exactly why a minimum-norm criterion is needed to single out one solution.
