Problem involving matrix multiplication and dot product in one proof


Discussion Overview

The discussion revolves around a mathematical problem involving matrix multiplication and the dot product. Participants are tasked with proving that the dot product of the matrix-vector product \(Ax\) with a vector \(y\) is equal to the dot product of the vector \(x\) with the transpose of the matrix \(A\) applied to \(y\). The conversation includes various approaches to the proof, as well as clarifications on the definitions involved.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant begins by outlining their approach using definitions of the dot product and matrix multiplication, expressing confusion about simplifying nested sums.
  • Another participant suggests relabeling indices in the nested sum and using properties of the transpose to help prove the equivalence of the two expressions.
  • A different participant presents an alternative method involving expressing the dot product in terms of matrix multiplication, leading to a series of equalities that demonstrate the desired result.
  • Another contribution emphasizes the simplification of the proof by noting properties of \(1 \times 1\) matrices and rearranging terms in the sums to show equivalence.
  • One participant expresses uncertainty about the validity of a specific assumption regarding the dot product expressed as a matrix multiplication.
  • A later reply clarifies the reasoning behind expressing the dot product in terms of transposed matrices, reinforcing the necessity of transposing one of the vectors for proper matrix multiplication.
  • Another participant asserts that the transpose \(B = A^T\) is the only matrix that satisfies the equality for all vectors \(x\) and \(y\), providing a proof by considering standard basis vectors.

Areas of Agreement / Disagreement

Participants present multiple approaches to the proof, with some expressing uncertainty about specific steps or assumptions along the way. The result itself is not in dispute; rather, several techniques (index relabelling, matrix identities for $1 \times 1$ matrices, and a basis-vector argument) are explored without converging on a single preferred method.

Contextual Notes

Some participants note the importance of understanding the definitions and properties of matrix operations and dot products, while others highlight the need for careful handling of indices and transpositions in proofs.

gucci1
The problem is:

Let $A$ be a real $m \times n$ matrix, and let $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$ ($n$- and $m$-dimensional real vector spaces, respectively). Show that the dot product of $Ax$ with $y$ equals the dot product of $x$ with $A^Ty$ ($A^T$ is the transpose of $A$).

The way I went about starting this problem is to use the definitions: the dot product of real vectors is $a \cdot b = \sum_{k=1}^{n} a_k b_k$, and each entry of the matrix-vector product $Ax$ has the form $(Ax)_i = \sum_{k=1}^{n} A_{ik} x_k$.

Hopefully that was clear enough, but what I come up with when I plug the second definition into the first is one sum inside another, and it seems like either I'm missing something I can simplify or I went in the wrong direction! Does anyone have any suggestions for what I could try here? Any help is appreciated :D
 
Have you tried relabelling the indices of your nested sum? That and using the fact that $A^T$ is the transpose of $A$ should lead you to a proof that the two expressions are the same.

What I mean is: in your nested sum, reverse the two indices $i$ and $j$ (or whatever their names are) and then use the fact that $A_{ij} = A^T_{ji}$; that should give you the desired result.
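Written out with explicit sums (a sketch of the relabelling step, using the definitions quoted above):

\[(Ax)\cdot y = \sum_{i=1}^{m}\left(\sum_{j=1}^{n} A_{ij}x_j\right)y_i = \sum_{j=1}^{n} x_j\left(\sum_{i=1}^{m} A^T_{ji}\,y_i\right) = \sum_{j=1}^{n} x_j\,(A^Ty)_j = x\cdot(A^Ty).\]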
 
gucci said:
Show that the dot product of $Ax$ with $y$ equals the dot product of $x$ with $A^Ty$. [...] Does anyone have any suggestions for what I could try here?

Here's an alternative to index labeling; it involves using matrix multiplication and noting that you can express the dot product as follows:

\[(A\mathbf{x})\cdot \mathbf{y} = \mathbf{y}^TA\mathbf{x}.\]

Likewise, $(A^T\mathbf{y})\cdot\mathbf{x} = \mathbf{x}^TA^T\mathbf{y}=(\mathbf{y}^TA\mathbf{x})^T$. Since $\mathbf{x}^TA^T\mathbf{y}$ and $\mathbf{y}^TA\mathbf{x}$ are $1\times 1$ matrices, we must have
\[(\mathbf{x}^TA^T\mathbf{y})^T =\mathbf{x}^TA^T\mathbf{y}\quad\text{and} \quad (\mathbf{y}^TA\mathbf{x})^T = \mathbf{y}^TA\mathbf{x}\]
and thus
\[\mathbf{x}^TA^T\mathbf{y} = (\mathbf{y}^TA\mathbf{x})^T=\mathbf{y}^TA\mathbf{x}.\]

Therefore, it follows that
\[(A\mathbf{x})\cdot\mathbf{y} = \mathbf{y}^TA\mathbf{x} = (\mathbf{x}^T A^T\mathbf{y})^T = \mathbf{x}^TA^T\mathbf{y} = (A^T\mathbf{y})\cdot\mathbf{x}\]
which is what you were after.

I hope this makes sense!
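As a quick numerical sanity check of the identity (a NumPy sketch with arbitrary test data, not a substitute for the proof):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4
A = rng.standard_normal((m, n))  # real m x n matrix
x = rng.standard_normal(n)       # vector in R^n
y = rng.standard_normal(m)       # vector in R^m

lhs = np.dot(A @ x, y)      # (Ax) . y
rhs = np.dot(x, A.T @ y)    # x . (A^T y)
assert np.isclose(lhs, rhs)
```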
 
I would simplify the presentation by noting that for any $1\times 1$ matrix $M$, $M^T = M$ (there is only the single diagonal element).

Thus:

$Ax \cdot y = (y^T)(Ax) = [(y^T)(Ax)]^T = (Ax)^T(y^T)^T = (x^TA^T)y = x^T(A^Ty) = A^Ty \cdot x = x \cdot A^Ty$

(this only works with REAL inner-product spaces, by the way)

This is pretty much the same as what Chris L T521 posted, but has fewer steps.

Working with just elements, we have, for:

$A = (a_{ij}),\quad x = (x_1,\dots,x_n),\quad y = (y_1,\dots,y_m)$

$\displaystyle Ax \cdot y = \sum_{i = 1}^m \left(\sum_{j = 1}^n a_{ij}x_j\right) y_i$

$ = (a_{11}x_1 + \cdots + a_{1n}x_n)y_1 + \cdots + (a_{m1}x_1 + \cdots + a_{mn}x_n)y_m$

$ = (a_{11}y_1 + \cdots + a_{m1}y_m)x_1 + \cdots + (a_{1n}y_1 + \cdots + a_{mn}y_m)x_n$

(Make sure you understand that each term $a_{ij}x_jy_i = a_{ij}y_ix_j$ occurs exactly once in each sum; we're just grouping the terms differently. In the first sum we match the row index of $A$ with the index of $y$; in the second sum we match the column index of $A$ with the index of $x$; and the transpose just switches rows with columns.)

$\displaystyle = \sum_{j = 1}^n \left(\sum_{i = 1}^m a_{ij}y_i \right)x_j = \sum_{j = 1}^n (A^Ty)_j\, x_j$

$= A^Ty \cdot x = x \cdot A^Ty$
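To see the two groupings executed literally, here is a small loop-based sketch of the same computation (NumPy is assumed only for generating test data):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 4
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)
y = rng.standard_normal(m)

# First grouping: sum over rows i, inner sum over columns j -- (Ax) . y
lhs = sum(sum(A[i, j] * x[j] for j in range(n)) * y[i] for i in range(m))

# Second grouping: sum over columns j, inner sum over rows i -- x . (A^T y)
rhs = sum(sum(A[i, j] * y[i] for i in range(m)) * x[j] for j in range(n))

assert np.isclose(lhs, rhs)
```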
 
Chris L T521 said:
Here's an alternative to index labeling; it involves using matrix multiplication and noting that you can express the dot product as follows:

\[(A\mathbf{x})\cdot \mathbf{y} = \mathbf{y}^TA\mathbf{x}.\]

So I get why everything after this step would follow, and thank you all very much for that. I am just missing why this is a true assumption :-/
 
gucci said:
So I get why everything after this step would follow, and thank you all very much for that. I am just missing why this is a true assumption :-/

Well, if $\mathbf{u},\mathbf{v}\in\mathbb{R}^m$ where $\mathbf{u}=\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_m\end{pmatrix}$ and $\mathbf{v}=\begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_m\end{pmatrix}$, then we know that $\mathbf{u}\cdot \mathbf{v} = u_1v_1 + u_2v_2 + \ldots + u_mv_m$
(note that as a matrix, a scalar quantity is a $1\times 1$ matrix).

However, $\mathbf{u}$ and $\mathbf{v}$ are $m\times 1$ matrices; thus, to express the dot product in terms of matrix multiplication, we need a $1\times m$ matrix times an $m\times 1$ matrix. Hence, we take the transpose of one of the vectors. With that said, we can then say that

\[\mathbf{u}\cdot\mathbf{v} = \mathbf{v}^T\mathbf{u} = \begin{pmatrix} v_1 & v_2 & \cdots & v_m \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_m\end{pmatrix} = \begin{pmatrix}v_1u_1 + v_2u_2 + \ldots + v_mu_m\end{pmatrix} = \begin{pmatrix} u_1 & u_2 & \cdots & u_m\end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_m\end{pmatrix} = \mathbf{u}^T\mathbf{v} = \mathbf{v}\cdot\mathbf{u}.\]

I hope this clarifies things!
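For instance, a quick NumPy check of the row-times-column identity (`reshape` is used just to make the $1\times m$ and $m\times 1$ shapes explicit):

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

dot = np.dot(u, v)  # u . v as a scalar

# v^T u as a (1 x m)(m x 1) = 1 x 1 matrix product
vTu = (v.reshape(1, -1) @ u.reshape(-1, 1))[0, 0]

assert np.isclose(dot, vTu)
```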
 
I'd like to point out that $B = A^T$ is the ONLY $n\times m$ matrix that makes:

$Ax \cdot y = x \cdot By$ true for ALL $x \in \Bbb R^n, y \in \Bbb R^m$.

For suppose $B$ is such a matrix. Since it holds for ALL $x,y$ it certainly must hold when:

$x = e_j = (0,\dots,1,\dots,0)$ (1 is in the $j$-th place) <--this vector is in $\Bbb R^n$
$y = e_i = (0,\dots,1,\dots,0)$ (1 is in the $i$-th place) <--this vector is in $\Bbb R^m$

Now $Ae_j = (a_{1j},\dots,a_{mj})$ (the $j$-th column of $A$) so

$Ae_j \cdot e_i = a_{ij}$ (the $i$-th entry of the $j$-th column of $A$).

By the same token, $Be_i = (b_{1i},\dots,b_{ni})$ and

$e_j \cdot Be_i = b_{ji}$.

Comparing the two (and using that $Ae_j \cdot e_i = e_j \cdot Be_i$), we see that: $b_{ji} = a_{ij}$, that is $B = A^T$.

So the equation $Ax \cdot y = x \cdot A^Ty$ can be used to DEFINE the transpose. This is actually useful later on, when one is trying to define things in a "coordinates-free" way. (Vector spaces don't come equipped with a "preferred" basis; we have to pick one, which is somewhat arbitrary. If we can prove things without using a particular chosen basis, our proof is "more general", which is typically seen as a GOOD thing in mathematics.)
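Here is a short sketch of this uniqueness argument carried out numerically: we recover $B$ entry by entry from $b_{ji} = Ae_j \cdot e_i$ and confirm it equals $A^T$ (NumPy is assumed only for test data):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 4
A = rng.standard_normal((m, n))

# Recover B from b_{ji} = (A e_j) . e_i, using standard basis vectors
B = np.zeros((n, m))
for j in range(n):
    e_j = np.zeros(n); e_j[j] = 1.0      # e_j in R^n
    for i in range(m):
        e_i = np.zeros(m); e_i[i] = 1.0  # e_i in R^m
        B[j, i] = np.dot(A @ e_j, e_i)   # = a_{ij}

assert np.allclose(B, A.T)  # the only such B is A^T
```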
 
