Problem involving matrix multiplication and dot product in one proof

SUMMARY

The discussion centers on proving that for a real m x n matrix A, and vectors x in R^n and y in R^m, the dot product of Ax with y equals the dot product of x with A^Ty. Participants clarify the proof by using matrix multiplication and properties of transposes, demonstrating that (Ax)·y = y^TAx = x^TA^Ty. The proof is established through index manipulation and the properties of dot products in real inner-product spaces, confirming that A^T is the unique matrix satisfying this relationship.

PREREQUISITES
  • Understanding of matrix multiplication and its properties
  • Familiarity with the concept of dot products in vector spaces
  • Knowledge of matrix transposition and its implications
  • Basic linear algebra concepts, particularly in real vector spaces
NEXT STEPS
  • Study the properties of matrix transposition in linear algebra
  • Learn about the implications of the dot product in real inner-product spaces
  • Explore proofs involving matrix identities and their applications
  • Investigate the role of linear transformations in relation to matrix representations
USEFUL FOR

Mathematicians, students of linear algebra, and anyone interested in understanding matrix operations and their proofs in real vector spaces.

gucci1
The problem is:

Let A be a real m x n matrix and let x be in R^n and y be in R^m (n and m dimensional real vector spaces, respectively). Show that the dot product of Ax with y equals the dot product of x with A^Ty (A^T is the transpose of A).

The way I went about starting this problem is to use the definitions: the dot product on $\mathbb{R}^n$ is $\mathbf{a}\cdot\mathbf{b} = \sum_{k=1}^{n} a_k b_k$, and by the definition of matrix multiplication each entry of $A\mathbf{x}$ has the form $(A\mathbf{x})_i = \sum_{k=1}^{n} A_{ik}x_k$.

Hopefully that was clear enough, but what I come up with when I plug the second definition into the first is one sum inside another and it seems like either I'm missing something that I can simplify or I went in the wrong direction! Does anyone have any suggestions for what I could try here? Any help is appreciated :D
 
Have you tried relabelling the indices of your nested sum? That and using the fact that $A^T$ is the transpose of $A$ should lead you to a proof that the two expressions are the same.

What I mean is: in your nested sum, swap the order of the two summations over $i$ and $j$ (or whatever your indices are called), and then use the fact that $A_{ij} = (A^T)_{ji}$; that should give you the desired result.
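If it helps to see the relabelling concretely, here is a small NumPy sketch (the sizes and random entries are arbitrary test values, not from the thread) that writes $(A\mathbf{x})\cdot\mathbf{y}$ as a double sum and then swaps the order of summation:

[CODE]
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)
y = rng.standard_normal(m)

# (Ax) . y written as the double sum  sum_i sum_j A_ij x_j y_i
lhs = sum(A[i, j] * x[j] * y[i] for i in range(m) for j in range(n))

# Swap the order of summation and use A_ij = (A^T)_ji, giving x . (A^T y)
AT = A.T
rhs = sum(AT[j, i] * y[i] * x[j] for j in range(n) for i in range(m))

print(np.isclose(lhs, rhs))  # True
[/CODE]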
 
gucci said:
The problem is:

Let A be a real m x n matrix and let x be in R^n and y be in R^m (n and m dimensional real vector spaces, respectively). Show that the dot product of Ax with y equals the dot product of x with A^Ty (A^T is the transpose of A).

The way I went about starting this problem is to use the definitions: the dot product on $\mathbb{R}^n$ is $\mathbf{a}\cdot\mathbf{b} = \sum_{k=1}^{n} a_k b_k$, and by the definition of matrix multiplication each entry of $A\mathbf{x}$ has the form $(A\mathbf{x})_i = \sum_{k=1}^{n} A_{ik}x_k$.

Hopefully that was clear enough, but what I come up with when I plug the second definition into the first is one sum inside another and it seems like either I'm missing something that I can simplify or I went in the wrong direction! Does anyone have any suggestions for what I could try here? Any help is appreciated :D

Here's an alternative to index labeling; it involves using matrix multiplication and noting that you can express the dot product as follows:

\[(A\mathbf{x})\cdot \mathbf{y} = \mathbf{y}^TA\mathbf{x}.\]

Likewise, $(A^T\mathbf{y})\cdot\mathbf{x} = \mathbf{x}^TA^T\mathbf{y}=(\mathbf{y}^TA\mathbf{x})^T$. Since $\mathbf{x}^TA^T\mathbf{y}$ and $\mathbf{y}^TA\mathbf{x}$ are $1\times 1$ matrices, we must have
\[(\mathbf{x}^TA^T\mathbf{y})^T =\mathbf{x}^TA^T\mathbf{y}\quad\text{and} \quad (\mathbf{y}^TA\mathbf{x})^T = \mathbf{y}^TA\mathbf{x}\]
and thus
\[\mathbf{x}^TA^T\mathbf{y} = (\mathbf{y}^TA\mathbf{x})^T=\mathbf{y}^TA\mathbf{x}.\]

Therefore, it follows that
\[(A\mathbf{x})\cdot\mathbf{y} = \mathbf{y}^TA\mathbf{x} = (\mathbf{x}^T A^T\mathbf{y})^T = \mathbf{x}^TA^T\mathbf{y} = (A^T\mathbf{y})\cdot\mathbf{x}\]
which is what you were after.

I hope this makes sense!
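If a numerical sanity check helps, here is a small NumPy sketch of these identities (the dimensions and entries are arbitrary test values, and $\mathbf{x},\mathbf{y}$ are stored as column matrices so the transposes line up):

[CODE]
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 5
A = rng.standard_normal((m, n))
x = rng.standard_normal((n, 1))   # n x 1 column matrix
y = rng.standard_normal((m, 1))   # m x 1 column matrix

yTAx = y.T @ A @ x       # 1x1 matrix representing (Ax) . y = y^T A x
xTATy = x.T @ A.T @ y    # 1x1 matrix representing x . (A^T y) = x^T A^T y

print(np.allclose(yTAx, xTATy))    # True
print(np.allclose(xTATy, yTAx.T))  # True: x^T A^T y = (y^T A x)^T
[/CODE]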
 
I would simplify the presentation by noting that for any 1x1 matrix $M$, $M^T = M$

(there is only the single diagonal element).

Thus:

$Ax \cdot y = (y^T)(Ax) = [(y^T)(Ax)]^T = (Ax)^T(y^T)^T = (x^TA^T)y = x^T(A^Ty) = A^Ty \cdot x = x \cdot A^Ty$

(this only works with REAL inner-product spaces, by the way)

This is pretty much the same as what Chris L T521 posted, but has fewer steps.

Working with just elements, we have, for:

$A = (a_{ij}), x = (x_1,\dots,x_n), y = (y_1,\dots,y_m)$

$\displaystyle Ax \cdot y = \sum_{i = 1}^m \left(\sum_{j = 1}^n a_{ij}x_j\right) y_i$

$ = (a_{11}x_1 + \cdots + a_{1n}x_n)y_1 + \cdots + (a_{m1}x_1 + \cdots + a_{mn}x_n)y_m$

$ = (a_{11}y_1 + \cdots + a_{m1}y_m)x_1 + \cdots + (a_{1n}y_1 + \cdots + a_{mn}y_m)x_n$

(make sure you understand that each term $a_{ij}x_jy_i = a_{ij}y_ix_j$ only occurs once in each sum, we're just grouping them differently...in the first sum we're matching the row entry index of $A$ with the index of $y$, in the second sum we're matching the column index of $A$ with the index of $x$, and the transpose just switches rows with columns).

$\displaystyle = \sum_{j = 1}^n \left(\sum_{i = 1}^m a_{ij}y_i \right)x_j = \sum_{j = 1}^n \left(\sum_{i = 1}^m (A^T)_{ji}\,y_i \right)x_j$

$= A^Ty \cdot x = x \cdot A^Ty$
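As a concrete check of the regrouping (again with arbitrary test sizes and random entries), the coefficient collected in front of each $x_j$ is exactly the $j$-th entry of $A^Ty$:

[CODE]
import numpy as np

rng = np.random.default_rng(2)
m, n = 4, 3
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)
y = rng.standard_normal(m)

# coefficient collected in front of x_j:  sum_i a_ij y_i
coeffs = np.array([sum(A[i, j] * y[i] for i in range(m)) for j in range(n)])

print(np.allclose(coeffs, A.T @ y))         # True: these are the entries of A^T y
print(np.isclose(coeffs @ x, (A @ x) @ y))  # True: both groupings give the same number
[/CODE]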
 
Chris L T521 said:
Here's an alternative to index labeling; it involves using matrix multiplication and noting that you can express the dot product as follows:

\[(A\mathbf{x})\cdot \mathbf{y} = \mathbf{y}^TA\mathbf{x}.\]

So I get why everything after this step would follow, and thank you all very much for that. I am just missing why this is a true assumption :-/
 
gucci said:
So I get why everything after this step would follow, and thank you all very much for that. I am just missing why this is a true assumption :-/

Well if $\mathbf{u},\mathbf{v}\in\mathbb{R}^m$ where $\mathbf{u}=\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_m\end{pmatrix}$ and $\mathbf{v}=\begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_m\end{pmatrix}$, then we know that $\mathbf{u}\cdot \mathbf{v} = u_1v_1 + u_2v_2 + \ldots + u_mv_m$
(note that as a matrix, a scalar quantity is a $1\times 1$ matrix).

However, $\mathbf{u}$ and $\mathbf{v}$ are $m\times 1$ matrices; thus, if we were to express the dot product in terms of matrix multiplication, we must have one $m\times 1$ and one $1\times m$ matrix. Hence, we need to take the transpose of one of the vectors to accomplish this. With that said, we can then say that

\[\mathbf{u}\cdot\mathbf{v} = \mathbf{v}^T\mathbf{u} = \begin{pmatrix} v_1 & v_2 & \cdots & v_m \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_m\end{pmatrix} = \begin{pmatrix}v_1u_1 + v_2u_2 + \ldots + v_mu_m\end{pmatrix} = \begin{pmatrix} u_1 & u_2 & \cdots & u_m\end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_m\end{pmatrix} = \mathbf{u}^T\mathbf{v} = \mathbf{v}\cdot\mathbf{u}.\]

I hope this clarifies things!
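A quick NumPy check of this, with an arbitrary length $m$ and random entries:

[CODE]
import numpy as np

rng = np.random.default_rng(3)
m = 4
u = rng.standard_normal((m, 1))   # m x 1 column matrices
v = rng.standard_normal((m, 1))

dot = np.dot(u.ravel(), v.ravel())   # the ordinary dot product u . v
print(np.allclose(dot, v.T @ u))     # True: u . v equals the 1x1 matrix v^T u
print(np.allclose(dot, u.T @ v))     # True: and likewise u^T v
[/CODE]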
 
I'd like to point out that $B = A^T$ is the ONLY $n\times m$ matrix that makes:

$Ax \cdot y = x \cdot By$ true for ALL $x \in \Bbb R^n, y \in \Bbb R^m$.

For suppose $B$ is such a matrix. Since it holds for ALL $x,y$ it certainly must hold when:

$x = e_j = (0,\dots,1,\dots,0)$ (1 is in the $j$-th place) <--this vector is in $\Bbb R^n$
$y = e_i = (0,\dots,1,\dots,0)$ (1 is in the $i$-th place) <--this vector is in $\Bbb R^m$

Now $Ae_j = (a_{1j},\dots,a_{mj})$ (the $j$-th column of $A$) so

$Ae_j \cdot e_i = a_{ij}$ (the $i$-th entry of the $j$-th column of $A$).

By the same token, $Be_i = (b_{1i},\dots,b_{ni})$ and

$e_j \cdot Be_i = b_{ji}$.

Comparing the two (and using that $Ae_j \cdot e_i = e_j \cdot Be_i$), we see that: $b_{ji} = a_{ij}$, that is $B = A^T$.

So the equation $Ax \cdot y = x \cdot A^Ty$ can be used to DEFINE the transpose. This is actually useful later on when one is trying to define things in a "coordinate-free" way. (Vector spaces don't come equipped with a "preferred" basis; we have to pick one, which is somewhat arbitrary. If we can prove things without using a particular chosen basis, our proof is more general, which is typically seen as a GOOD thing in mathematics.)
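To see the uniqueness argument in action, here is a small NumPy sketch (dimensions and entries are arbitrary test values) that rebuilds $B$ entry by entry from $b_{ji} = Ae_j \cdot e_i$ and confirms that the relation forces $B = A^T$:

[CODE]
import numpy as np

rng = np.random.default_rng(4)
m, n = 3, 4
A = rng.standard_normal((m, n))

B = np.zeros((n, m))
for j in range(n):
    e_j = np.zeros(n); e_j[j] = 1.0          # standard basis vector of R^n
    for i in range(m):
        e_i = np.zeros(m); e_i[i] = 1.0      # standard basis vector of R^m
        # Ax . y = x . By with x = e_j, y = e_i forces b_ji = Ae_j . e_i = a_ij
        B[j, i] = (A @ e_j) @ e_i

print(np.allclose(B, A.T))  # True
[/CODE]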
 
