Why Do Tensor Products Seem So Challenging?

Math Amateur · Feb 17, 2014

I have recently begun the task of trying to understand tensor products, but I must admit, I have found the going difficult.

I have been working (mainly) from Dummit and Foote, which I have previously found to be a fairly "friendly" text for the person engaged in self study ... but the treatment of tensor products I have found, as I mentioned above, a challenge ...

I wonder if others have found this topic challenging?

Maybe I have found the topic difficult because I do not have enough agility and knowledge with the basic algebra that tensor products rely upon?

It would be truly wonderful if there was a math note/tutorial on tensor products ...

Peter

Note: I have also looked at the online note: Tensor Products I by Keith Conrad and the treatments of tensor products in the following books

1. Paul B. Garrett: Abstract Algebra
2. Paolo Aluffi: Algebra: Chapter 0
3. Joseph J. Rotman: Advanced Modern Algebra (Second Edition)

In Aluffi and Rotman I keep bumping into notions from category theory that worry me a bit!

ThePerfectHacker · Feb 18, 2014

It is challenging when you think of it abstractly.

Maybe this is helpful. Say you have a set $\{a,b\}$. You can use this set to form a free-abelian group.

Here is the naive (easy) way of thinking. The free-abelian group of $\{a,b\}$ is simply all possible commutative sums you can form. So the group consists of the following elements:
$$0,a,2a,3a,\ldots,b,a+b,2a+b,3a+b,\ldots,2b,a+2b,2a+2b,2a+3b,\ldots,-a,-b,-2a,\ldots$$
You add them in the ordinary way, so for instance, $(a+2b) + (3a+b) = 4a+3b$. This addition is commutative and every element has an inverse, so this is an abelian group. Call it the free-abelian group formed by the set $\{a,b\}$. (It is called "free" because there is no relationship between $a$ and $b$: they never cancel out in any way, so the terminology is that $a$ and $b$ are free from any relation between them.)

But saying "all possible combinations of the form" is not exactly rigorous mathematics. So mathematicians have to make things complicated for everyone by doing something like this:

Consider all functions $f:\{a,b\} \to \mathbb{Z}$. If $f(a) = n$ and $f(b) = m$, we use the notation $na + mb$ to denote such a function, where $n,m\in \mathbb{Z}$. For example, $3a-b$ is the function $f:\{a,b\}\to \mathbb{Z}$ such that $f(a) =3$ and $f(b) = -1$. Now let $G$ be the set of all such functions under pointwise addition. This is what we formally mean by the free-abelian group of $\{a,b\}$ above.

See how complicated mathematicians make things with their rigor? Now say $S$ is an arbitrary set then how do we define $G(S)$, the free-abelian group generated by $S$?

The naive idea is simple: we just consider all sums $n_1a_1 + \cdots + n_ka_k$ where $a_j \in S$ and $n_j\in \mathbb{Z}$ and add them like we add ordinary variables. But again this is not precise. So mathematicians get crazy and define:

Definition: Given a non-empty set $S$, the free-abelian group generated by $S$, denoted by $G(S)$, is
$$ G(S) = \left\{ f \in \mathbb{Z}^S \mid f(x) = 0 \text{ for almost all }x\in S\right\}$$
We give $G(S)$ a group structure by pointwise addition.

[To avoid confusion, $\mathbb{Z}^S$ is notation for all functions from $S$ to $\mathbb{Z}$, and "almost all" means that $f(x) = 0$ for every point $x$ in $S$ except for finitely many.]
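
As a rough illustration of the formal definition (a sketch only; the names `add`, `negate`, and `show` are made up for this example), an element of $G(S)$ can be modelled in Python as a dictionary that stores only the finitely many nonzero values of $f$:

```python
# Sketch: an element of G(S) is a finitely supported function S -> Z,
# stored as a dict {generator: nonzero integer coefficient}.
# Generators missing from the dict have coefficient 0.

def add(f, g):
    """Pointwise sum of two finitely supported functions."""
    total = dict(f)
    for x, n in g.items():
        total[x] = total.get(x, 0) + n
    # Drop zero coefficients so each element has exactly one representation.
    return {x: n for x, n in total.items() if n != 0}

def negate(f):
    """Pointwise negation, i.e. the inverse in G(S)."""
    return {x: -n for x, n in f.items()}

def show(f):
    """Render an element in the informal 'na + mb' notation."""
    if not f:
        return "0"
    return " + ".join(f"{n}{x}" for x, n in sorted(f.items()))

u = {"a": 1, "b": 2}   # a + 2b
v = {"a": 3, "b": 1}   # 3a + b
print(show(add(u, v)))          # 4a + 3b
print(add(u, negate(u)) == {})  # True: u + (-u) is the zero function
```

The empty dictionary plays the role of the zero function, and the "almost all" condition is automatic because a dictionary only ever holds finitely many keys.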

The concept of a tensor product is very similar. You have two modules $M,N$ over a (commutative) ring $R$. The tensor product is built in essentially the same spirit as a free-abelian group. You basically consider all expressions of the form $m\otimes n$ where $m\in M$ and $n\in N$, with the rules that $r(m\otimes n) = (rm)\otimes n = m\otimes(rn)$ and that $(m_1+m_2)\otimes n = m_1\otimes n + m_2\otimes n$ (and a similar distributive law on the other side). You are also allowed to have something like $m_1\otimes n_1 + m_2\otimes n_2$. In general there is no way to combine those two terms, so you just leave a formal sum between them. And that is basically it.

This naive way of thinking about it isn't bad. And that is how I, and I imagine just about everyone else, think about it.

But the secret is that mathematicians are embarrassed to admit that they think of the tensor product in this way. So they make things insanely complicated by saying, "Consider the free-module formed by $M\times N$ and the submodule generated by ... mod out by the submodule ...".

Math Amateur · Feb 18, 2014

Thanks so much ThePerfectHacker ...

Will work through this carefully now ...

Peter

EDIT - Just read through it - very helpful! Thanks again!

ThePerfectHacker · Feb 18, 2014

I suggest ignoring the formal definition for now and trying to solve some exercises involving the tensor product in this naive way.

Here is an exercise. Consider $\mathbb{Z}_n$ and $\mathbb{Z}_m$ as $\mathbb{Z}$-modules in the usual sense. Prove that if $\gcd(n,m) = 1$ then
$$ \mathbb{Z}_n \otimes \mathbb{Z}_m = 0 $$
Of course, what is meant by $0$ here? It just means that the tensor product, which is a module over $\mathbb{Z}$, is a zero-module.

Def: A module $M$ over a ring $R$ is called a zero-module if $M = \{ 0 \}$, where $0$ is its additive identity.

Try to prove this naively, using the hint that there exist integers $x,y$ such that $nx + my = 1$.
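
The hint itself can be checked numerically. The following is only a sketch (the function name `bezout` is made up here) of the extended Euclidean algorithm, which produces integers $x,y$ with $nx + my = \gcd(n,m)$. It does not prove the exercise; it just shows where such $x,y$ come from.

```python
def bezout(n, m):
    """Return (g, x, y) with g = gcd(n, m) and n*x + m*y == g."""
    # Extended Euclidean algorithm: alongside each remainder we track
    # coefficients expressing it as an integer combination of n and m.
    old_r, r = n, m
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

g, x, y = bezout(4, 9)
print(g, x, y, 4 * x + 9 * y)  # the last number equals g, which is 1 here
```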

Math Amateur · Feb 18, 2014

Hi ThePerfectHacker,

I will be finishing work on maths for the day as it is about 8pm here in Tasmania ... will get onto this problem tomorrow ... (generally work through the day on maths ...)

Excellent to have your help ... any other problems/exercises that you post I will work on ... most helpful!

Peter

Deveno · Feb 19, 2014

I think that even Keith Conrad points out in his notes that the "explicit" construction of the tensor product is not all that helpful (it's not easy to work in free objects, and it's even harder to work in the quotient via representatives and then look for "cancellation").

The point is: the tensor product EXISTS (the whole point of the explicit construction is solely to show this), and that it is a BILINEAR product. Now forget "what it is" and just use its PROPERTIES.

For example, to show that a cyclic group of order $n$ exists, one would need to produce one first. So one takes a generating set (which one is "secretly" thinking of as being related to the generator of the cyclic group), and forms the free group on that generator (which turns out to be isomorphic to $\Bbb Z$, but that's irrelevant right now).

Now we take the quotient by all "formal products" of our generator (let's call it $a$) of our free group of the form $w^n$ (where $w = a^k$ for any integer $k$). Call this subgroup $H$.

Just as our free group is large, so is this subgroup. But from the quotient, out pops:

$[a^n] = [e]$ (since $a^nH = H$ in the quotient).

Now we have a FINITE group, generated by $[a]$, and we can prove that $[a]$ has order $n$. Voila! We've created a cyclic group of order $n$.

But...no one actually does this: instead we create the group $\Bbb Z/n\Bbb Z$.

Or, we just say:

Let $G = \langle a : a^n = e\rangle$ (generator and relation form).

It really DOES NOT MATTER HOW (by hook or by crook) you wind up "creating" a cyclic group of order $n$. They all act the same, so who cares?

Tensor products are the same way: if you create a bilinear map $B:M\times N \to T$ from two modules $M,N$, such that any OTHER bilinear map $F:M\times N \to P$ induces a unique linear map $L:T\to P$ with:

$L(B(m,n)) = F(m,n)$

then "$B$" IS the tensor product.

ANY WAY YOU DO THIS IS OK.

For example, take the (bilinear) map:

$B:\Bbb R \times \Bbb R \to \Bbb R$ given by: $B(x,y) = xy$.

Suppose $F$ is any other bilinear map. Since $F$ is bilinear:

$F(a,b) = aF(1,b) = abF(1,1)$.

Say $F(1,1) = r \in \Bbb R$. We then have:

$F(x,y) = (xy)r = r(B(x,y))$.

And the map $L_r:s \mapsto rs$ is a linear map $\Bbb R \to \Bbb R$ (and the only one that works, since every real number $s$ equals $B(s,1)$), so we have:

$F = L_r \circ B$.

This means that $\Bbb R \otimes_{\Bbb R} \Bbb R = \Bbb R$

with $x \otimes y = xy$, because this mapping satisfies our universal bilinearity property (we didn't even have to figure out what $F$ actually was!).

A little thought will reveal that $\Bbb R$ is indeed spanned by the simple tensor $1\otimes1 = (1)(1) = 1$ over itself.

ThePerfectHacker · Feb 19, 2014

Here is a non-constructive way of defining the tensor product. Maybe this will be easier. I think the constructive way (the free module on $M\times N$ modulo the submodule generated by the bilinear relations) is confusing.

Definition 1: Let $A,B,C$ be $R$-modules. A map $b:A\times B\to C$ is called a bilinear map iff it satisfies the following properties:
(i) $b(x_1+x_2,y) = b(x_1,y) + b(x_2,y)$
(ii) $b(x,y_1+y_2) = b(x,y_1) + b(x,y_2)$
(iii) $b(rx,y) = rb(x,y)$
(iv) $b(x,ry) = rb(x,y)$
Sometimes it is also called a bilinear $R$-module homomorphism.

Theorem 1: Given two $R$-modules, $M$ and $N$, there exists an $R$-module $T$ together with a bilinear map $b:M\times N\to T$ so that given any $R$-module $P$ and a bilinear map $f:M\times N \to P$ there exists a unique $R$-module homomorphism $g:T\to P$ such that the diagram commutes, i.e. $f = g\circ b$.

Definition 2: Given two $R$-modules, $M$ and $N$, the tensor product of $M$ and $N$ is defined to be a module $T$ (together with its bilinear map $b$) guaranteed by the theorem above.

One should ask the natural question. Perhaps there are many different modules $T$ that satisfy the properties of Theorem 1? If so then we cannot call it THE tensor product between $M$ and $N$, but rather A tensor product between $M$ and $N$. It turns out that if there are two such $R$-modules which satisfy the conditions of the theorem then they are isomorphic as $R$-modules. So in a sense we are justified to call it the tensor product.

Notation: Given two $R$-modules $M$ and $N$, let $T$ be the module guaranteed by the theorem together with the bilinear map $b$ which satisfies the universal property of the theorem. We write $m\otimes n$ for the image of $(m,n)$ under $b$, in other words, $m\otimes n = b(m,n)$.

Now since $b$ is a bilinear map we know, for example, $b(m_1+m_2,n) = b(m_1,n) + b(m_2,n)$. Using our fancy notation, what we are saying is that $(m_1+m_2)\otimes n = m_1\otimes n + m_2\otimes n$. [To avoid confusion, the $+$ sign on the LHS is taking place in the module $M$ and the $+$ sign on the RHS is taking place in the module $T$.]

Based on the same reasoning it also follows that $(rm)\otimes n = r(m\otimes n)$. [Again, the scalar multiplication on the LHS is taking place in $M$ and the scalar multiplication on the RHS is taking place in $T$.]
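
For a concrete feel for Theorem 1, here is a small sketch in Python. It assumes the standard identification (not proved in this thread) of $\mathbb{R}^2\otimes_{\mathbb{R}}\mathbb{R}^3$ with $\mathbb{R}^6$, where $x\otimes y$ is the list of all products $x_iy_j$. The names `tensor`, `f`, `g` and the matrix `A` are made up for the illustration. A bilinear map $f(x,y)=\sum_{i,j}A_{ij}x_iy_j$ then factors as $f = g\circ b$ with $g$ linear:

```python
# Hypothetical coefficient matrix defining a bilinear map f: R^2 x R^3 -> R.
A = [[1, 0, 2],
     [3, -1, 4]]

def tensor(x, y):
    """b(x, y) = x (tensor) y, modelled as the flat list of products x_i * y_j."""
    return [xi * yj for xi in x for yj in y]

def f(x, y):
    """Bilinear map f(x, y) = sum over i, j of A[i][j] * x[i] * y[j]."""
    return sum(A[i][j] * x[i] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def g(t):
    """Linear map R^6 -> R whose coefficients are the entries of A, row by row."""
    coeffs = [a for row in A for a in row]
    return sum(c * ti for c, ti in zip(coeffs, t))

x1, x2, y = [1, 2], [0, 5], [3, 1, -2]

# f factors through the tensor product: f = g o b.
print(f(x1, y) == g(tensor(x1, y)))

# Bilinearity of b in the first slot:
# (x1 + x2) (tensor) y = x1 (tensor) y + x2 (tensor) y.
xs = [p + q for p, q in zip(x1, x2)]
print(tensor(xs, y) == [s + t for s, t in zip(tensor(x1, y), tensor(x2, y))])
```

Integer inputs are used so that the equality checks are exact. Note that `g` is determined entirely by the values of `f` on pairs of basis vectors, which is the uniqueness part of the theorem in this small case.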
