Problem involving matrix multiplication and dot product in one proof


Discussion Overview

The discussion revolves around a mathematical problem involving matrix multiplication and the dot product. Participants are tasked with proving that the dot product of the matrix-vector product \(Ax\) with a vector \(y\) is equal to the dot product of the vector \(x\) with the transpose of the matrix \(A\) applied to \(y\). The conversation includes various approaches to the proof, as well as clarifications on the definitions involved.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant begins by outlining their approach using definitions of the dot product and matrix multiplication, expressing confusion about simplifying nested sums.
  • Another participant suggests relabeling indices in the nested sum and using properties of the transpose to help prove the equivalence of the two expressions.
  • A different participant presents an alternative method involving expressing the dot product in terms of matrix multiplication, leading to a series of equalities that demonstrate the desired result.
  • Another contribution emphasizes the simplification of the proof by noting properties of \(1 \times 1\) matrices and rearranging terms in the sums to show equivalence.
  • One participant expresses uncertainty about the validity of a specific assumption regarding the dot product expressed as a matrix multiplication.
  • A later reply clarifies the reasoning behind expressing the dot product in terms of transposed matrices, reinforcing the necessity of transposing one of the vectors for proper matrix multiplication.
  • Another participant asserts that the transpose \(B = A^T\) is the only matrix that satisfies the equality for all vectors \(x\) and \(y\), providing a proof by considering standard basis vectors.

Areas of Agreement / Disagreement

Participants present multiple approaches to the proof, with some expressing uncertainty about specific steps or assumptions along the way. The result itself is not in dispute; rather, several techniques (index relabelling, matrix identities for $1 \times 1$ matrices, and a basis-vector argument) are explored without converging on a single preferred method.

Contextual Notes

Some participants note the importance of understanding the definitions and properties of matrix operations and dot products, while others highlight the need for careful handling of indices and transpositions in proofs.

gucci1
The problem is:

Let $A$ be a real $m \times n$ matrix, and let $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$ ($n$- and $m$-dimensional real vector spaces, respectively). Show that the dot product of $Ax$ with $y$ equals the dot product of $x$ with $A^Ty$ ($A^T$ is the transpose of $A$).

The way I went about starting this problem is to use the definitions: the dot product of real vectors is $a \cdot b = \sum_{k=1}^{n} a_k b_k$, and each entry of the matrix-vector product $Ax$ has the form $(Ax)_i = \sum_{k=1}^{n} A_{ik} x_k$.

Hopefully that was clear enough, but what I come up with when I plug the second definition into the first is one sum inside another, and it seems like either I'm missing something I can simplify or I went in the wrong direction! Does anyone have any suggestions for what I could try here? Any help is appreciated :D
 
Have you tried relabelling the indices of your nested sum? That and using the fact that $A^T$ is the transpose of $A$ should lead you to a proof that the two expressions are the same.

What I mean is: in your nested sum, reverse the two indices $i$ and $j$ (or whatever their names are) and then use the fact that $A_{ij} = A^T_{ji}$; that should give you the desired result.
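Written out with explicit sums (a sketch of the relabelling step, using the definitions quoted above):

\[(Ax)\cdot y = \sum_{i=1}^{m}\left(\sum_{j=1}^{n} A_{ij}x_j\right)y_i = \sum_{j=1}^{n} x_j\left(\sum_{i=1}^{m} A^T_{ji}\,y_i\right) = \sum_{j=1}^{n} x_j\,(A^Ty)_j = x\cdot(A^Ty).\]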
 
gucci said:
Show that the dot product of $Ax$ with $y$ equals the dot product of $x$ with $A^Ty$. [...] Does anyone have any suggestions for what I could try here?

Here's an alternative to index labeling; it involves using matrix multiplication and noting that you can express the dot product as follows:

\[(A\mathbf{x})\cdot \mathbf{y} = \mathbf{y}^TA\mathbf{x}.\]

Likewise, $(A^T\mathbf{y})\cdot\mathbf{x} = \mathbf{x}^TA^T\mathbf{y}=(\mathbf{y}^TA\mathbf{x})^T$. Since $\mathbf{x}^TA^T\mathbf{y}$ and $\mathbf{y}^TA\mathbf{x}$ are $1\times 1$ matrices, we must have
\[(\mathbf{x}^TA^T\mathbf{y})^T =\mathbf{x}^TA^T\mathbf{y}\quad\text{and} \quad (\mathbf{y}^TA\mathbf{x})^T = \mathbf{y}^TA\mathbf{x}\]
and thus
\[\mathbf{x}^TA^T\mathbf{y} = (\mathbf{y}^TA\mathbf{x})^T=\mathbf{y}^TA\mathbf{x}.\]

Therefore, it follows that
\[(A\mathbf{x})\cdot\mathbf{y} = \mathbf{y}^TA\mathbf{x} = (\mathbf{x}^T A^T\mathbf{y})^T = \mathbf{x}^TA^T\mathbf{y} = (A^T\mathbf{y})\cdot\mathbf{x}\]
which is what you were after.

I hope this makes sense!
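As a quick numerical sanity check of the identity (a NumPy sketch with arbitrary test data, not a substitute for the proof):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4
A = rng.standard_normal((m, n))  # real m x n matrix
x = rng.standard_normal(n)       # vector in R^n
y = rng.standard_normal(m)       # vector in R^m

lhs = np.dot(A @ x, y)      # (Ax) . y
rhs = np.dot(x, A.T @ y)    # x . (A^T y)
assert np.isclose(lhs, rhs)
```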
 
I would simplify the presentation by noting that for any $1\times 1$ matrix $M$, $M^T = M$ (there is only the single diagonal element).

Thus:

$Ax \cdot y = (y^T)(Ax) = [(y^T)(Ax)]^T = (Ax)^T(y^T)^T = (x^TA^T)y = x^T(A^Ty) = A^Ty \cdot x = x \cdot A^Ty$

(this only works with REAL inner-product spaces, by the way)

This is pretty much the same as what Chris L T521 posted, but has fewer steps.

Working with just elements, we have, for:

$A = (a_{ij}),\quad x = (x_1,\dots,x_n),\quad y = (y_1,\dots,y_m)$

$\displaystyle Ax \cdot y = \sum_{i = 1}^m \left(\sum_{j = 1}^n a_{ij}x_j\right) y_i$

$ = (a_{11}x_1 + \cdots + a_{1n}x_n)y_1 + \cdots + (a_{m1}x_1 + \cdots + a_{mn}x_n)y_m$

$ = (a_{11}y_1 + \cdots + a_{m1}y_m)x_1 + \cdots + (a_{1n}y_1 + \cdots + a_{mn}y_m)x_n$

(Make sure you understand that each term $a_{ij}x_jy_i = a_{ij}y_ix_j$ occurs exactly once in each sum; we're just grouping the terms differently. In the first sum we match the row index of $A$ with the index of $y$; in the second sum we match the column index of $A$ with the index of $x$; and the transpose just switches rows with columns.)

$\displaystyle = \sum_{j = 1}^n \left(\sum_{i = 1}^m a_{ij}y_i \right)x_j = \sum_{j = 1}^n (A^Ty)_j\, x_j$

$= A^Ty \cdot x = x \cdot A^Ty$
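To see the two groupings executed literally, here is a small loop-based sketch of the same computation (NumPy is assumed only for generating test data):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 4
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)
y = rng.standard_normal(m)

# First grouping: sum over rows i, inner sum over columns j -- (Ax) . y
lhs = sum(sum(A[i, j] * x[j] for j in range(n)) * y[i] for i in range(m))

# Second grouping: sum over columns j, inner sum over rows i -- x . (A^T y)
rhs = sum(sum(A[i, j] * y[i] for i in range(m)) * x[j] for j in range(n))

assert np.isclose(lhs, rhs)
```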
 
Chris L T521 said:
Here's an alternative to index labeling; it involves using matrix multiplication and noting that you can express the dot product as follows:

\[(A\mathbf{x})\cdot \mathbf{y} = \mathbf{y}^TA\mathbf{x}.\]

So I get why everything after this step would follow, and thank you all very much for that. I am just missing why this is a true assumption :-/
 
gucci said:
So I get why everything after this step would follow, and thank you all very much for that. I am just missing why this is a true assumption :-/

Well, if $\mathbf{u},\mathbf{v}\in\mathbb{R}^m$ where $\mathbf{u}=\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_m\end{pmatrix}$ and $\mathbf{v}=\begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_m\end{pmatrix}$, then we know that $\mathbf{u}\cdot \mathbf{v} = u_1v_1 + u_2v_2 + \ldots + u_mv_m$
(note that as a matrix, a scalar quantity is a $1\times 1$ matrix).

However, $\mathbf{u}$ and $\mathbf{v}$ are $m\times 1$ matrices; thus, to express the dot product in terms of matrix multiplication, we need a $1\times m$ matrix times an $m\times 1$ matrix. Hence, we take the transpose of one of the vectors. With that said, we can then say that

\[\mathbf{u}\cdot\mathbf{v} = \mathbf{v}^T\mathbf{u} = \begin{pmatrix} v_1 & v_2 & \cdots & v_m \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_m\end{pmatrix} = \begin{pmatrix}v_1u_1 + v_2u_2 + \ldots + v_mu_m\end{pmatrix} = \begin{pmatrix} u_1 & u_2 & \cdots & u_m\end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_m\end{pmatrix} = \mathbf{u}^T\mathbf{v} = \mathbf{v}\cdot\mathbf{u}.\]

I hope this clarifies things!
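For instance, a quick NumPy check of the row-times-column identity (`reshape` is used just to make the $1\times m$ and $m\times 1$ shapes explicit):

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

dot = np.dot(u, v)  # u . v as a scalar

# v^T u as a (1 x m)(m x 1) = 1 x 1 matrix product
vTu = (v.reshape(1, -1) @ u.reshape(-1, 1))[0, 0]

assert np.isclose(dot, vTu)
```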
 
I'd like to point out that $B = A^T$ is the ONLY $n\times m$ matrix that makes:

$Ax \cdot y = x \cdot By$ true for ALL $x \in \Bbb R^n, y \in \Bbb R^m$.

For suppose $B$ is such a matrix. Since it holds for ALL $x,y$ it certainly must hold when:

$x = e_j = (0,\dots,1,\dots,0)$ (1 is in the $j$-th place) <--this vector is in $\Bbb R^n$
$y = e_i = (0,\dots,1,\dots,0)$ (1 is in the $i$-th place) <--this vector is in $\Bbb R^m$

Now $Ae_j = (a_{1j},\dots,a_{mj})$ (the $j$-th column of $A$) so

$Ae_j \cdot e_i = a_{ij}$ (the $i$-th entry of the $j$-th column of $A$).

By the same token, $Be_i = (b_{1i},\dots,b_{ni})$ and

$e_j \cdot Be_i = b_{ji}$.

Comparing the two (and using that $Ae_j \cdot e_i = e_j \cdot Be_i$), we see that: $b_{ji} = a_{ij}$, that is $B = A^T$.

So the equation $Ax \cdot y = x \cdot A^Ty$ can be used to DEFINE the transpose. This is actually useful later on, when one is trying to define things in a "coordinates-free" way. (Vector spaces don't come equipped with a "preferred" basis; we have to pick one, which is somewhat arbitrary. If we can prove things without using a particular chosen basis, our proof is "more general", which is typically seen as a GOOD thing in mathematics.)
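Here is a short sketch of this uniqueness argument carried out numerically: we recover $B$ entry by entry from $b_{ji} = Ae_j \cdot e_i$ and confirm it equals $A^T$ (NumPy is assumed only for test data):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 4
A = rng.standard_normal((m, n))

# Recover B from b_{ji} = (A e_j) . e_i, using standard basis vectors
B = np.zeros((n, m))
for j in range(n):
    e_j = np.zeros(n); e_j[j] = 1.0      # e_j in R^n
    for i in range(m):
        e_i = np.zeros(m); e_i[i] = 1.0  # e_i in R^m
        B[j, i] = np.dot(A @ e_j, e_i)   # = a_{ij}

assert np.allclose(B, A.T)  # the only such B is A^T
```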
 
