# Tensor product of vector spaces

1. Jun 12, 2008

### Fredrik

Staff Emeritus
I'm reading the Wikipedia article, trying to understand the definition of the tensor product $V\otimes W$ of two vector spaces V and W. The first step is to take the cartesian product $V\times W$. The next step is to define the "free vector space" $F(V\times W)$ as the set of all linear combinations of members of $V\times W$. But how does that make sense when we haven't even defined the sum of two members of $V\times W$?

I'm tempted to interpret the linear combination as just a string of text at this point, but then I can't make sense of the claim that the $e_{v\times w}$ are taken to be linearly independent for distinct $v\times w$.

Can someone help me make sense of this definition?

2. Jun 12, 2008

### eastside00_99

Remember that a vector space always involves specifying a field F first. If V and W are vector spaces over a field F, then indeed the Cartesian product of V and W is a vector space over F via the following rule: let v,v' be in V and w,w' be in W, then define (v,w) + (v',w')=(v+v',w+w') where v+v' is the addition operation in V and w+w' is the addition operation in W.

Let h be a bilinear functional from VxW to F. The tensor product of V and W is the unique space $$V \otimes W$$ such that there exists a linear functional $$h_{*} : V \otimes W \rightarrow F$$ such that $$h_{*} (x \otimes y) =h(x,y)$$. This space always exists and as said before is unique. The construction of the space is what you are considering. I should mention that my definition is a little bit restrictive. I chose to work with F but you could extend the definition to replace F with any vector space over F (note F is a vector space over itself). The construction is simple:

Consider the sub space Z generated by all elements of the form (v+v', w)-(v,w) -(v',w') and (v,w+w')-(v,w)-(v,w'). Then take the tensor product of V and W to be VxW/Z.

3. Jun 12, 2008

### Fredrik

Staff Emeritus
Is that really the correct rule here? It's a very natural and obvious way to define a vector space structure on the Cartesian product, and it is the way to do it when we're defining the "direct sum" of two vector spaces. But this is the tensor product. This is supposed to be the way we use the Hilbert space of one-particle states in quantum mechanics to construct the Hilbert space of two-particle states (by taking the tensor product of the space of one-particle states with itself). And doesn't this hold for two-particle states?

$$(|v\rangle+|v'\rangle)(|w\rangle+|w'\rangle) =|v\rangle|w\rangle+|v\rangle|w'\rangle+|v'\rangle|w\rangle+|v'\rangle|w'\rangle$$

The left-hand side is what corresponds to (v+v',w+w'), but the right-hand side doesn't look like (v,w) + (v',w'). It has two extra terms.
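To make the contrast concrete in finite dimensions, here is a minimal sketch in coordinates. It is an illustration only, not part of the definition under discussion: the helper names `direct_sum`, `tensor` and `add` are made up for this example, and vectors are plain lists of integers. It checks that the direct-sum rule holds for concatenated coordinates, while the coordinates of a tensor product (all products of components) obey the four-term expansion instead.

```python
# Illustrative sketch: coordinates only, finite dimensions, hypothetical helper names.

def direct_sum(v, w):
    """Coordinates of (v, w) in the direct sum: concatenate the two lists."""
    return list(v) + list(w)

def tensor(v, w):
    """Coordinates of v (x) w in the product basis: all products v[i] * w[j]."""
    return [vi * wj for vi in v for wj in w]

def add(x, y):
    """Componentwise sum of two coordinate lists of equal length."""
    return [a + b for a, b in zip(x, y)]

v, v2 = [1, 2], [3, 5]
w, w2 = [7, 11, 13], [2, 4, 6]

# Direct sum: (v+v', w+w') equals (v, w) + (v', w').
ds_lhs = direct_sum(add(v, v2), add(w, w2))
ds_rhs = add(direct_sum(v, w), direct_sum(v2, w2))

# Tensor product: (v+v') (x) (w+w') needs all four terms.
tp_lhs = tensor(add(v, v2), add(w, w2))
four_terms = add(add(tensor(v, w), tensor(v, w2)),
                 add(tensor(v2, w), tensor(v2, w2)))
two_terms = add(tensor(v, w), tensor(v2, w2))

print(ds_lhs == ds_rhs)      # direct-sum rule
print(tp_lhs == four_terms)  # bilinear (four-term) expansion
print(tp_lhs == two_terms)   # two-term rule applied to the tensor product
```

Note also the dimensions: the direct sum here has 2 + 3 = 5 coordinates, while the tensor product has 2 × 3 = 6.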

Thanks for replying by the way. I haven't given much thought to the other things you said yet, but I will.

4. Jun 12, 2008

### Hurkyl

Staff Emeritus
That's a good way to think of it -- one of the most general notions of 'algebraic structure' consists of two parts:

(1) Given any set S, a definition of all valid 'algebraic expressions' generated from S

In this particular case, you can (almost) view F(S) as the set of all textual strings of the form $r_1 s_1 + r_2 s_2 + \cdots + r_n s_n$, where the r's are scalars and the s's are elements of S, together with the textual string 0. You just have to take some care to avoid redundancy; for example, you could insist each of the s's are distinct, none of the r's are zero, and that two strings with permuted terms are the same.

(2) A rule for simplifying 'expressions of expressions'

For example, if v and w are two elements of F(S), you could form the 'formal' expression v + w (i.e. an element of F(F(S))), and there is an obvious way to 'simplify' such an expression back into F(S). The same is true for scalar multiplication, and this gives you a vector space structure on F(S).

Last edited: Jun 12, 2008

5. Jun 12, 2008

### eastside00_99

Sorry, my last sentence should read:

Consider the sub space Z generated by all elements of the form (v+v', w)-(v,w) -(v',w)
[not (v+v', w)-(v,w) -(v',w')] and (v,w+w')-(v,w)-(v,w'). Then take the tensor product of V and W to be VxW/Z.

Your question is what addition in VxW is, so that you could form F(VxW). F(VxW) is the vector space over F whose basis is {(v,w) | v in V and w in W}. So, I was wrong about addition. But the two definitions will coincide in the finite case and where V intersect W = empty set.

6. Jun 12, 2008

### Hurkyl

Staff Emeritus
I guess I forgot to add how what I was talking about (monads) connects with the usual way of specifying a vector space.

The main point is that to say V is a vector space with underlying set V amounts to specifying a map F(V) --> V -- i.e. a way to 'evaluate' a linear combination of elements of V. The expressions v + w and r v are both elements of F(V), so this map does indeed tell you how to add vectors and how to perform scalar multiplication.
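A small sketch of such an evaluation map, under assumptions made only for illustration: V is taken to be the plane with vectors stored as tuples, an element of F(V) is stored as a dictionary from vectors to coefficients, and the function name `evaluate` is hypothetical.

```python
# Hypothetical illustration: V = R^2 with vectors stored as tuples.

def evaluate(formal_sum):
    """Map a formal combination {vector: coefficient} in F(V) to a vector in V."""
    x = sum(c * v[0] for v, c in formal_sum.items())
    y = sum(c * v[1] for v, c in formal_sum.items())
    return (x, y)

v, w = (1, 2), (3, 4)

# The formal expression v + w is an element of F(V) ...
formal_v_plus_w = {v: 1, w: 1}
# ... and so is the formal expression 5 v.
formal_5v = {v: 5}

print(evaluate(formal_v_plus_w))  # the actual sum in V
print(evaluate(formal_5v))        # the actual scalar multiple in V
```

The dictionary `{v: 1, w: 1}` is only a formal object; it becomes the vector v + w in V once the map is applied.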

7. Jun 12, 2008

### Fredrik

Staff Emeritus
Thanks for the contributions so far, both of you. I still don't get it, but I'll take another look at it tomorrow.

8. Jun 13, 2008

### Fredrik

Staff Emeritus
OK, I think I understand the definition now. This comment was helpful:
This means that F(S) (and I assume that by S you mean VxW) isn't the set of all strings that look like linear combinations of members of S. The members of F(S) are some equivalence classes of such strings. In particular s+s' is considered equivalent to s'+s, and (s+s')+s'' is considered equivalent to s+(s'+s''). Wikipedia didn't even mention this equivalence relation. Instead they just said something cryptic about linear independence as an explanation of why they wrote $e_{u\times v}$ instead of $(u,v)$.

Wikipedia then defines a second equivalence relation, ~, and defines the tensor product space to be F(UxV)/~. I think I understand that part (now, not yesterday).

I looked up the word "monad" and it was one of those "head asplode" moments. It would probably take a week or more to understand the definition, so I'm not going to try any time soon. I actually find this stuff interesting, but it's just too much work.

9. Jun 13, 2008

### Fredrik

Staff Emeritus
I don't understand this. How can we even make sense of the notation $x\otimes y$ when we define $V \otimes W$ as "the unique space such that..."? Hm, I suppose you must have meant "the unique space with members that are equivalence classes of strings of text that look like linear combinations of members of UxV", or something like that.

I don't understand this either. I'm not sure what you mean by "generated by" in this context. Is it the same as "spanned by", i.e. that you take elements of that form to be basis vectors of Z? Is this method less complicated than Wikipedia's method?

10. Jun 13, 2008

### eastside00_99

Technically you can write $$x\otimes y$$. The tensor product is the bilinear map denoted $$\otimes$$ from VxW to some vector space H over F such that given another vector space N over F and a bilinear map f: VxW --> N, you have a homomorphism f_* from H to N such that $$f = f_{*} \circ \otimes$$. The image of a point (x,y) in VxW under $$\otimes$$ is denoted by $$x\otimes y$$. So technically it is okay to write $$x\otimes y$$ and is the correct thing to say, but I probably should have stressed that the tensor product is a surjective bilinear function from VxW to this mysterious space H though.

Anyway, the two definitions are equivalent -- i.e., you can show this construction you are considering satisfies the above and is UNIQUE -- so do it whichever way you like, but I would start by thinking functorially about it and then do the construction.

Yes, by generated, I mean the span.

Last edited: Jun 13, 2008

11. Jun 13, 2008

### Fredrik

Staff Emeritus
So a formal definition would look something like this?

If V,W,X,Y are vector spaces over the same field F and there exist two bilinear surjections $f:V\times W\rightarrow X$ and $\otimes:V\times W\rightarrow Y$ and a function $f_*:Y\rightarrow X$ such that $f=f_*\circ\otimes$, then Y is said to be the tensor product space of V and W, and we write $V\otimes W$ instead of Y.

And the uniqueness would mean something like this?

If, in addition to the above, X',Y' are vector spaces over the same field F and there exist two bilinear surjections $f':V\times W\rightarrow X'$ and $\otimes':V\times W\rightarrow Y'$ and a function $f'_*:Y'\rightarrow X'$ such that $f'=f'_*\circ\otimes'$, then Y' is isomorphic to Y.

12. Jun 14, 2008

### eastside00_99

This is very close.

Let V, W, X be vector spaces over F. Let $\tau: V \times W \rightarrow X$ be a bilinear map. We say that $\tau$ is a tensor product if given any vector space Y over F and any bilinear map $f: V \times W \rightarrow Y$, there exists a unique linear map $f_{*}: X \rightarrow Y$ such that $f=f_{*}\circ\tau$.

It is easy to show, abstractly--i.e., without constructing the tensor product of V and W--that $\tau$ and X are unique up to isomorphism. And so, when we have such a bilinear function $\tau$ and an F-vector space X, we may write $\otimes$ instead of $\tau$ and $V\otimes W$ instead of X. Instead of $\otimes(x,y)$, we write $x\otimes y$.

Last edited: Jun 14, 2008

13. Jun 15, 2008

### George Jones

Staff Emeritus
I'm going to make things a little more concrete (but still abstract) by outlining the construction of a tensor product space.

First, a little about vector spaces.

A (Hamel) basis for a vector space $V$ is a subset $B$ of $V$ that is linearly independent, and that spans $V$. Even if $V$ is infinite-dimensional, the concepts of linear independence and span involve only linear combinations of a finite number of vectors. In fact, without extra structure on $V$, it doesn't even make sense to talk about the sum of an infinite number of vectors. An infinite sum is a limit of the sequence of partial sums, and, without extra structure, the concept of limit can't be defined.

If $S$ is a set (of, e.g., distinguishable oranges), then the free vector space $F \left( S \right)$ is the vector space that has $S$ as a basis, i.e., the set of all (formal) finite linear combinations of elements of $S$.

What is a linear combination of oranges?

A concrete, rigorous realization of the above follows.

Let $S$ be a set. Define $F \left( S \right)$ to be the set of functions that map $S$ into a field, say $\mathbb{R}$, such that each function is non-zero for only finitely many elements of $S$ (in general, different elements for different functions). This finiteness will be used to reflect the fact that only sums of finite numbers of vectors are allowed. The definitions

$$\left( f + g \right) \left( s \right) := f \left( s \right) + g \left( s \right)$$

$$\left( \alpha f \right) \left( s \right) := \alpha f \left( s \right)$$

for $f$ and $g$ in $F \left( S \right)$ and $\alpha \in \mathbb{R}$ give $F \left( S \right)$ vector space structure.

$F \left( S \right)$ is the free vector space on set $S$. To see how $F \left( S \right)$ captures the idea of linear combinations of set $S$, consider the following functions.

For each $s \in S$, define an element $e_s \in F \left( S \right)$ by

$$e_s \left( s' \right) = \left\{ \begin{array}{cc} 1 & s=s'\\ 0 & s \neq s' \end{array} \right.$$

Clearly, there is a bijection from $S$ to the set of all such functions. But each of these functions does live in vector space $F \left( S \right)$, so when we talk about linear combinations of elements of $S$, we really mean linear combinations of the appropriate functions $e_s$. It is fairly easy to show that the set of all such $e_s$ is a basis for $F \left( S \right)$.
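The construction above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the field is taken to be the rationals (via `Fraction`) rather than $\mathbb{R}$ so that arithmetic is exact, a finitely supported function is stored as a dictionary holding only its non-zero values, and the helper names `e`, `add` and `scale` are invented for this example.

```python
from fractions import Fraction

def e(s):
    """Basis function e_s: 1 at s, 0 everywhere else (zeros are not stored)."""
    return {s: Fraction(1)}

def add(f, g):
    """Pointwise sum (f + g)(s) = f(s) + g(s), dropping zero values."""
    out = dict(f)
    for s, c in g.items():
        out[s] = out.get(s, Fraction(0)) + c
    return {s: c for s, c in out.items() if c != 0}

def scale(alpha, f):
    """Pointwise multiple (alpha f)(s) = alpha * f(s), dropping zero values."""
    return {s: alpha * c for s, c in f.items() if alpha * c != 0}

# "Linear combinations of oranges": S can be any set of hashable labels.
combo = add(scale(Fraction(2), e("orange_1")), scale(Fraction(-3), e("orange_2")))

# Adding the negative of a vector gives the zero function, stored as {}.
zero = add(combo, scale(Fraction(-1), combo))

# With S = V x W, e_(v,w) + e_(v',w) and e_(v+v',w) are different elements of F(S).
lhs = add(e(((1, 0), (0, 1))), e(((0, 1), (0, 1))))
rhs = e(((1, 1), (0, 1)))

print(combo)
print(zero)
print(lhs == rhs)
```

The last comparison shows why a quotient is needed afterwards: in the free vector space itself, no relation ties $e_{(v,w)} + e_{(v',w)}$ to $e_{(v+v',w)}$.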

If $V$ and $W$ are vector spaces, then applying the above to set $V \times W$ produces vector space $F \left( V \times W \right)$. The tensor product space $V \otimes W$ is found by forming a quotient vector space of the free vector space $F \left( V \times W \right)$ with an appropriate subspace. The subspace acts as the zero vector in the quotient space.

Since $\left( v , w \right)$, $\left( \alpha v' , w \right)$, and $\left( v + \alpha v' , w \right)$ are all distinct elements of $V \times W$, $e_{\left( v , w \right)}$, $e_{\left( \alpha v' , w \right)}$, and $e_{\left( v + \alpha v' , w \right)}$ are linearly independent in $F \left( V \times W \right)$, since they are all basis elements. Consequently,

$$e_{\left( v , w \right)} + e_{\left( \alpha v' , w \right)} - e_{\left( v + \alpha v' , w \right)}$$

and, similarly,

$$e_{\left( v , w \right)} + e_{\left( v , \alpha w' \right)} - e_{\left( v , w + \alpha w' \right)}$$

are non-zero in $F \left( V \times W \right)$. But, we want

$$v \otimes w + \alpha v' \otimes w - \left( v + \alpha v' \right) \otimes w = 0$$

$$v \otimes w + v \otimes \alpha w' - v \otimes \left( w + \alpha w' \right) = 0.$$

Consequently, use

$$e_{\left( v , w \right)} + e_{\left( \alpha v' , w \right)} - e_{\left( v + \alpha v' , w \right)}$$

$$e_{\left( v , w \right)} + e_{\left( v , \alpha w' \right)} - e_{\left( v , w + \alpha w' \right)}$$

to generate a subspace $U$ of $F \left( V \times W \right)$.

Then $V \otimes W$ is $F \left( V \times W \right) / U$.

Another way to think of quotient vector spaces is in terms of groups. Any vector space is an abelian group with vector addition as the group product and the zero vector as the group identity. Any subspace is a normal subgroup, and thus can be used to form a quotient group, with the subspace as the identity of the quotient group.

If, as in relativity, $V$ and $W$ are both finite-dimensional spaces, then $V \otimes W$ is (naturally) isomorphic to the vector space of bilinear maps from $V* \otimes W*$ to $\mathbb{R}$. For infinite-dimensional spaces $V \otimes W$ is isomorphic to a proper subspace of bilinear maps from $V* \otimes W*$ to $\mathbb{R}$. Therefore, this space of bilinear mapping is often taken to be the tensor product space.

14. Jun 15, 2008

### Fredrik

Staff Emeritus
Thanks guys. I haven't tried to understand all the details yet, but I will. George, that's a very good explanation.

15. Jun 16, 2008

### George Jones

Staff Emeritus
I just noticed this.

If, as in relativity, $V$ and $W$ are both finite-dimensional spaces, then $V \otimes W$ is (naturally) isomorphic to the vector space of bilinear maps from $V* \times W*$ to $\mathbb{R}$. For infinite-dimensional spaces $V \otimes W$ is isomorphic to a proper subspace of bilinear maps from $V* \times W*$ to $\mathbb{R}$. Therefore, this space of bilinear mapping is often taken to be the tensor product space.

16. Jun 16, 2008

### Fredrik

Staff Emeritus
I don't see how to prove the uniqueness. Maybe I'm just overlooking something really trivial. Is the idea to prove that the condition $f=f_*\circ\tau=f'_*\circ\tau'$ implies that there exists a linear bijection (i.e. a vector space isomorphism) from X' into X? How do I even begin?

It would be trivial if $f_*$ and $f'_*$ were bijections, but we haven't assumed that they are. Is that implied by something else?

17. Jun 16, 2008

### Fredrik

Staff Emeritus
I don't really understand this part. (I understand everything before it though, even the details you omitted). This is what you seem to be doing:

We choose one specific member of the field and two specific members of each vector space, and use them to define a two-dimensional subspace U of F(VxW). Then we define two members x and y of F(VxW) to be equivalent if x-y is in U. This means that the equivalence class that x belongs to is the set [x]={x+u|u in U}. Now we define the tensor product space as the set of all equivalence classes, with multiplication by a scalar and addition defined by a[x]=[ax], [x]+[y]=[x+y].

How can I use this to verify e.g. that

$$(ax)\otimes y=a(x\otimes y)$$ ?

I guess I would have to do it by showing that

$$e_{(ax,y)}-ae_{(x,y)}\in U$$

but that doesn't seem possible since x,y are completely unrelated to the specific v,v',w,w' used in the construction of U.

18. Jun 17, 2008

### George Jones

Staff Emeritus
I used poor wording, and I wrote the relations down incorrectly. It should read:

Consequently, use *all* elements of the form

$$e_{\left( v , w \right)} + \alpha e_{\left( v' , w \right)} - e_{\left( v + \alpha v' , w \right)}$$

$$e_{\left( v , w \right)} + \alpha e_{\left( v , w' \right)} - e_{\left( v , w + \alpha w' \right)}$$

to generate (by taking all possible linear combinations of all such elements) a subspace $U$ of $F \left( V \times W \right)$. In particular, any of $v$, $v'$, $w$, $w'$ can be zero.

Then $V \otimes W$ is $F \left( V \times W \right) / U$.

Note that if the field is $\mathbb{R}$ or $\mathbb{C}$, then $U$ is infinite-dimensional even if $V$ and $W$ are finite-dimensional.
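As a check that these generators give the scalar rule $(\alpha x) \otimes y = \alpha (x \otimes y)$, here is a short sketch that uses only the first family of elements above; the particular choices of zero vectors are one convenient option, not the only one. Taking $v = v' = 0$ and $\alpha = 1$ gives

$$e_{\left( 0 , w \right)} + e_{\left( 0 , w \right)} - e_{\left( 0 , w \right)} = e_{\left( 0 , w \right)} \in U.$$

Taking $v = 0$ with arbitrary $\alpha$ and $v'$ gives

$$e_{\left( 0 , w \right)} + \alpha e_{\left( v' , w \right)} - e_{\left( \alpha v' , w \right)} \in U.$$

Since $U$ is a subspace, subtracting the first element from the second gives

$$\alpha e_{\left( v' , w \right)} - e_{\left( \alpha v' , w \right)} \in U,$$

so in the quotient $\alpha \left( v' \otimes w \right) = \left( \alpha v' \right) \otimes w$ for every $v'$, $w$ and $\alpha$. The same steps applied to the second family give $\alpha \left( v \otimes w' \right) = v \otimes \left( \alpha w' \right)$.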

19. Jun 17, 2008

### eastside00_99

It's easy because the tensor product is what is known as a universal repelling object in the category of vector spaces (or more generally, the category of modules). In fact, you could reformulate the definition using terminology from category theory, which is to be expected. Anyway, the way you do it is to assume that P and P' both satisfy the definition I gave before. Remember this means that there exist bilinear maps $\tau: V\times W \rightarrow P$ and $\tau^{\prime}: V\times W \rightarrow P'$. Since P and P' both satisfy the definition, there must be a homomorphism f from P to P' such that $f \circ \tau = \tau^{\prime}$ and a homomorphism g from P' to P such that $g \circ \tau^{\prime} = \tau$, which implies $f\circ g \circ \tau^{\prime}= f \circ \tau =\tau^{\prime} \implies f \circ g = Id_{P'}$ and $g \circ f \circ \tau = g \circ \tau^{\prime} = \tau \implies g \circ f = Id_{P}$. (Each implication uses the fact that the linear map factoring $\tau^{\prime}$ through itself, or $\tau$ through itself, is unique, and the identity is one such map.) This says that P and P' are isomorphic. So, perhaps it would be best to say the tensor product is unique up to isomorphism. Really, though, it would be best to draw the commutative diagram(s) in the proof above and it will become immediately obvious that we have an isomorphism. I don't really feel like going through the effort to create the diagram and then posting it here though.

Last edited: Jun 17, 2008

20. Jun 17, 2008

### Fredrik

Staff Emeritus
That doesn't sound so easy.

This is not at all obvious to me (even after drawing the diagram). I agree that there must exist a function $f:P\rightarrow P'$ of course, but I don't see how we can choose it so that $f\circ\tau=\tau'$ unless $\tau$ is injective.

If $\tau$ takes two different points of $V\times W$ to the same point $p\in P$, then we can't define f at that point by $f(\tau(v,w))=\tau'(v,w)$ because we don't know which (v,w) to use on the right-hand side.

It seems to me that this is hopeless unless we change the definition so that we require some or all of the functions to be injective.