What is the issue with tensor terminology and variable slot order?

In summary: definitions of tensors vary, but they are often accompanied by the comment that even if the order of the variable slots changes, the map is still considered a tensor of the same type. Not everyone considers this rigorous, and some prefer to define tensors with a fixed slot order. If the order doesn't matter, the set of tensors can be described in terms of equivalence classes; if it does matter, defining tensor products becomes more complicated. The choice also matters when a tensor field is represented as a section of a bundle, since changing the slot order changes the section. One suggested resolution is the abstract-index notation, which specifies which slot is which regardless of the order in which the indices are written.
  • #1
Fredrik
Definitions like this one are common in books:
For all ##k,l\in\mathbb N##, a multilinear map $$T:\underbrace{V^*\times\cdots\times V^*}_{k\text{ factors}}\times \underbrace{V\times\cdots\times V}_{l\text{ factors}}\to\mathbb R$$ is said to be a tensor of type ##(k,l)## on ##V##.​
Lee calls this a tensor of type ##l\choose k## instead of type ##(k,l)##, but that's not the issue I want to talk about. These definitions are often followed by a comment that says that even if we change the order of the variable slots, we would still consider the map a tensor of type ##(k,l)##. In other words,
For all ##k,l\in\mathbb N##, a multilinear map ##T:W_1\times\cdots \times W_{k+l}\to\mathbb R## is said to be a tensor of type ##(k,l)## if

(a) ##W_i=V^*## for ##k## different values of ##i##, all of which are in ##\{1,\dots,k+l\}##.
(b) ##W_i=V## for ##l## different values of ##i##, all of which are in ##\{1,\dots,k+l\}##.​
This creates a problem. If we had assigned the term "tensor of type ##(k,l)##" only to multilinear maps with all the ##V^*## variable slots to the left of all the ##V## variable slots, then we could have defined a vector space structure on the set of tensors of type ##(k,l)##, and used such vector spaces to define vector bundles over a manifold. Then we could have defined tensor fields as sections of such bundles. But when we assign the term "tensor of type ##(k,l)##" to multilinear maps with the slots in any weird order, we can't define a vector space structure on the set of tensors of a given type.
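To make the obstruction concrete, here is a minimal sketch in Python (the (signature, map) representation is just my illustration, not taken from any of the books): pointwise addition is only defined when the two maps have the same domain, i.e. the same slot order.

[code]
# Hypothetical representation of a "tensor": a pair (signature, map), where
# the signature records the slot order, e.g. ('V*', 'V') vs ('V', 'V*').

def add_tensors(t1, t2):
    sig1, f1 = t1
    sig2, f2 = t2
    # Pointwise addition only makes sense when the domains agree,
    # i.e. when the slot signatures are identical.
    if sig1 != sig2:
        raise TypeError(f"cannot add maps with slot orders {sig1} and {sig2}")
    return (sig1, lambda *args: f1(*args) + f2(*args))

T = (('V*', 'V'), lambda f, v: f(v))      # a (1,1) tensor, V* slot first
S = (('V', 'V*'), lambda v, f: 2 * f(v))  # also "type (1,1)", slots swapped

try:
    add_tensors(T, S)
except TypeError as e:
    print(e)  # the loose set of (1,1) tensors is not closed under addition
[/code]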

We obviously want the term "tensor" to apply to multilinear maps with the variable slots in any weird order. What I'm wondering is if there's a terminology for this sort of thing that makes it easier to talk about. Something like "Two tensors of type ##(k,l)## are said to be equiflubbery if they have their variable slots in the same order".

Note that if we choose to use that first definition I mentioned (without the supplementary comment), so that the variable slots must be in a specific order, then we can't even say that the tensor product of a tensor of type (m,n) with a tensor of type (m',n') is a tensor of type (m+m',n+n'). It would just be some multilinear map that isn't a tensor, or at least isn't a tensor that has a type.
 
  • #2
micromass
From a pure math point of view:

Fredrik said:
These definitions are often followed by a comment that says that even if we change the order of the variable slots, we would still consider the map a tensor of type ##(k,l)##.

People do that, but I don't consider it rigorous. I would much rather define a tensor as something with a fixed order. In principle, every tensor with the variable slots in a mixed order can be rewritten as one with the slots in a fixed order, so we don't lose anything.

If you don't like this, then perhaps you like the following approach. Let ##\mathcal{A}## be the set of all ##(k,l)## tensors in the loose sense where the order doesn't matter. Define ##T\sim S## if and only if ##T## and ##S## differ only by a permutation of their variable slots. Then the set of tensors you want is ##\mathcal{A}/\sim##. But I don't know a name for the equivalence ##\sim##.
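A rough way to test that equivalence numerically (my own sketch; it ignores the ##V## versus ##V^*## distinction and just checks agreement up to a permutation of the slots on random arguments):

[code]
import numpy as np
from itertools import permutations

def related_by_slot_permutation(f, g, arity, dim=3, trials=20):
    # Test whether g(x_1,...,x_n) == f(x_{p(1)},...,x_{p(n)}) for some
    # fixed permutation p, on randomly sampled vector arguments.
    rng = np.random.default_rng(0)
    samples = [tuple(rng.normal(size=dim) for _ in range(arity))
               for _ in range(trials)]
    for p in permutations(range(arity)):
        if all(np.isclose(g(*args), f(*(args[i] for i in p)))
               for args in samples):
            return True
    return False

# Two bilinear maps built from the same array, differing only in slot order:
A = np.arange(9.0).reshape(3, 3)
T = lambda u, v: u @ A @ v
S = lambda v, u: u @ A @ v
print(related_by_slot_permutation(T, S, arity=2))  # True
[/code]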

Fredrik said:
Note that if we choose to use that first definition I mentioned (without the supplementary comment), so that the variable slots must be in a specific order, then we can't even say that the tensor product of a tensor of type (m,n) with a tensor of type (m',n') is a tensor of type (m+m',n+n'). It would just be some multilinear map that isn't a tensor, or at least isn't a tensor that has a type.

We can say it if we define the tensor product the right way. Let ##T:\underbrace{V^*\times\cdots\times V^*}_{k\text{ factors}}\times \underbrace{V\times\cdots\times V}_{l\text{ factors}}\to\mathbb R## and ##S:\underbrace{V^*\times\cdots\times V^*}_{a\text{ factors}}\times \underbrace{V\times\cdots\times V}_{b\text{ factors}}\to\mathbb R##. Then define

[tex]T\otimes S:\underbrace{V^*\times\cdots\times V^*}_{k+a\text{ factors}}\times \underbrace{V\times\cdots\times V}_{l+b\text{ factors}}\to\mathbb R[/tex]

by

[tex](T\otimes S)(\omega_1,...,\omega_k,\nu_1,...,\nu_a,v_1,...,v_l,w_1,...,w_b) = T(\omega_1,...,\omega_k,v_1,...,v_l)S(\nu_1,...,\nu_a,w_1,...,w_b)[/tex]

This definition is a bit more involved, but it does make things work.
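Here is a sketch of that bookkeeping in numpy (my own illustration, representing a tensor of type ##(k,l)## as an array whose first ##k## axes are the ##V^*## slots and whose last ##l## axes are the ##V## slots): the product is an outer product followed by an axis shuffle that moves all the upper slots to the front.

[code]
import numpy as np

def tensor_product(T, k, l, S, a, b):
    # T has k "upper" axes then l "lower" axes; S has a, then b.
    P = np.multiply.outer(T, S)
    # Axes of P are (T-upper, T-lower, S-upper, S-lower); move the a upper
    # axes of S in front of the l lower axes of T, so that the result has
    # all the upper slots first, as in the definition above.
    src = list(range(k + l, k + l + a))
    dst = list(range(k, k + a))
    return np.moveaxis(P, src, dst)

T = np.random.rand(3, 3)  # a (1,1) tensor
S = np.random.rand(3, 3)  # another (1,1) tensor
TS = tensor_product(T, 1, 1, S, 1, 1)  # a (2,2) tensor
# Componentwise: (T otimes S)^{ij}_{mn} = T^i_m S^j_n
print(np.isclose(TS[0, 1, 2, 0], T[0, 2] * S[1, 0]))  # True
[/code]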
 
  • #4
Ben Niehoff
I'm a pretty hardline advocate of index-free notation, but even I think that if a tensor has many slots, then the abstract-index notation is best. Then one can just say, "a tensor of the form ##T_a{}^{bc}{}_e{}^f##", and there is no ambiguity.

Fortunately, in real-world problems one rarely encounters a tensor with more than 2 indices. The Riemann tensor, with 4 indices, is best thought of as a Lie-algebra-valued 2-form.
 
  • #5
The resolution of this issue seems more a matter of algebra than geometry. Identify the tensors with representations of an appropriate group acting on ##V##. Then we can identify the decomposition of any tensor into a direct sum of irreducible representations. This decomposition specifies the equivalence of two representations. The order in which we choose to write down the indices is not relevant to the algebraic properties of the representation.
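For the simplest instance of that decomposition: under ##O(n)##, a rank-2 tensor splits into trace, symmetric-traceless, and antisymmetric irreducible pieces, none of which refers to an index ordering. A quick numerical sketch:

[code]
import numpy as np

n = 3
B = np.random.default_rng(1).normal(size=(n, n))  # an arbitrary rank-2 tensor

trace_part = (np.trace(B) / n) * np.eye(n)
symmetric_traceless = 0.5 * (B + B.T) - trace_part
antisymmetric = 0.5 * (B - B.T)

# The three pieces recover B, and each spans an O(n)-invariant subspace.
print(np.allclose(B, trace_part + symmetric_traceless + antisymmetric))  # True
print(np.isclose(np.trace(symmetric_traceless), 0.0))                    # True
[/code]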
 
  • #6
Ben Niehoff said:
I'm a pretty hardline advocate of index-free notation, but even I think that if a tensor has many slots, then the abstract-index notation is best. Then one can just say, "a tensor of the form ##T_a{}^{bc}{}_e{}^f##", and there is no ambiguity.
As long as it doesn't get used in the usual unbearably annoying way for spinors :wink:
 
  • #7
Thanks guys.

micromass said:
In principle, every tensor with the variable slots in a mixed order can be rewritten as one with the slots in a fixed order, so we don't lose anything.
I agree, but it feels weird to use the term "tensor" only for that, almost as if those other multilinear maps are fundamentally different objects. It's like defining the term "sequence" to be a function with domain ##\mathbb N##...but of course people do that too.

micromass said:
If you don't like this, then perhaps you like the following approach. Let ##\mathcal{A}## be the set of all ##(k,l)## tensors in the loose sense where the order doesn't matter. Define ##T\sim S## if and only if ##T## and ##S## differ only by a permutation of their variable slots. Then the set of tensors you want is ##\mathcal{A}/\sim##. But I don't know a name for the equivalence ##\sim##.
I guess it would make sense to say that two tensors in the same equivalence class are of the same "ordered type", or something like that, but since no one has mentioned a standard term, I'm going to assume there isn't one.

micromass said:
We can say it if we define the tensor product the right way.
Ah, yes. This is not the definition I once learned from Wald (who uses the abstract index notation), but it certainly seems appropriate, if we use the simpler definition of "tensor of type ##(k,l)##". Edit: I remembered this wrong. He does define it this way, in the section before the one that introduces the abstract index notation. And he calls it an "outer product", not a tensor product as I did.

robphy said:

Ben Niehoff said:
I'm a pretty hardline advocate of index-free notation, but even I think that if a tensor has many slots, then the abstract-index notation is best. Then one can just say, "a tensor of the form ##T_a{}^{bc}{}_e{}^f##", and there is no ambiguity.

Fortunately, in real-world problems one rarely encounters a tensor with more than 2 indices. The Riemann tensor, with 4 indices, is best thought of as a Lie-algebra-valued 2-form.
It appears that the abstract index notation was invented specifically to deal with this sort of thing, and with the generalization to multilinear functionals whose domains are Cartesian products of possibly different vector spaces and their duals.

The "labeling set" terminology comes from the definition of the abstract index notation. The labeling set is specifically the set $$L=\{a,b,\dots,a_0,b_0,\dots,a_1,\dots\}.$$ (I got that from one of the links that turned up in robphy's search).
 
  • #8
WannabeNewton
Fredrik, check out section 2.4 in Wald. He basically voices your same concerns and explains why the abstract index notation helps "resolve" it.
 
  • #9
WannabeNewton said:
Fredrik, check out section 2.4 in Wald. He basically voices your same concerns and explains why the abstract index notation helps "resolve" it.
I checked it out. He talks about some other things that I have mentioned in defense of the abstract index notation in other threads, but not about the specific issue I brought up here. In fact, he doesn't seem to consider other ways to order the variable slots at all. I'm extremely surprised to discover that, because that means I must have misunderstood section 2.3 the first time I read it, and then never noticed. I always thought e.g. that by his definitions, the domain of
$$\mathrm dx^\mu|_p\otimes \frac{\partial}{\partial x^\nu}\bigg|_p$$ is ##T_pM\times T_pM^*##, but it's not. It's ##T_pM^*\times T_pM##.
 
  • #10
Fredrik said:
It appears that the abstract index notation was invented specifically to deal with this sort of thing,
I have to correct myself again. Even when the abstract index notation is used, the ##V^*##-variable slots should be to the left of all the ##V##-variable slots. For example, the notation ##T^{ab}_{c}## denotes a map from ##V^*\times V^*\times V## into ##\mathbb R##. Penrose wouldn't even write this as ##T^{ab}{}_c##. As far as I can tell, the only time we need to position the indices differently than in ##T^{ab}_c## is when the metric is used to raise or lower indices.
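Incidentally, numpy's einsum behaves a lot like the abstract index notation (a loose analogy of mine, not anything from Penrose or Wald): slots are matched by label rather than by position, so the order in which the arguments are fed in doesn't matter.

[code]
import numpy as np

rng = np.random.default_rng(3)
T = rng.normal(size=(4, 4, 4))  # components T^{ab}_c
omega = rng.normal(size=4)      # a covector
nu = rng.normal(size=4)         # another covector
v = rng.normal(size=4)          # a vector

# The labels a, b, c do the bookkeeping, not the slot positions:
x = np.einsum('abc,a,b,c->', T, omega, nu, v)
# Feeding the same arguments in a different order gives the same number,
# as long as the labels are matched up correctly:
y = np.einsum('abc,c,b,a->', T, v, nu, omega)
print(np.isclose(x, y))  # True
[/code]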
 
  • #11
This is not specific to tensors; any function can manifest this difficulty. Suppose we have a function on ##A^m\times B^n##, where ##A## and ##B## are sets. It might be even worse: we might have an infinite number of copies of an infinite number of sets. Let's just suppose we have three copies of ##A## and two copies of ##B##:
$$f:A\times A\times A\times B\times B\to C,\qquad g:A\times B\times A\times B\times A\to C.$$
Then ##f## and ##g## might be more or less the same thing, in the sense that ##f(x)=g(h(x))##, where ##h## is a permutation of the variable slots. Order matters even among copies of the same set: for example, ##f(x,\ast,y,\ast,z)## and ##f(y,x,\ast,z,\ast)## are different things, where ##\ast## marks an open slot and a letter means the slot is fixed at that value. The best thing to do is to label all the slots and not mix them up.
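A minimal sketch of that in Python (the particular functions and the permutation ##h## are made up for illustration):

[code]
# f : A×A×A×B×B → C and g : A×B×A×B×A → C can be "more or less the same
# thing": f(x) = g(h(x)) for a fixed permutation h of the argument slots.

def g(a1, b1, a2, b2, a3):
    return (a1, a2, a3, b1, b2)   # some function of the five arguments

def h(args):
    a1, a2, a3, b1, b2 = args     # f's slot order
    return (a1, b1, a2, b2, a3)   # reordered into g's slot order

def f(*args):
    return g(*h(args))

print(f('x', 'y', 'z', '1', '2'))  # ('x', 'y', 'z', '1', '2')
[/code]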
 
  • #12
I think I found a somewhat ugly consequence of the convention to keep all the ##V^*##-variable slots to the left of all the ##V##-variable slots. I would like to know if I'm doing something wrong, or if this is just a price we have to pay for having all the ##V^*##-variable slots to the left.

Let g be an arbitrary symmetric nondegenerate bilinear form, i.e. a metric tensor. The map ##v\mapsto g(v,\cdot)## from V into V* is an isomorphism. I will denote it by ##\phi## and its inverse by ##\sigma##. We have ##\phi(e_i)=g_{ij}e^j## and ##\sigma(e^i)=g^{ij}e_j##.
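In components, ##\phi## lowers an index and ##\sigma## raises one. A quick numpy sketch with arbitrary test data:

[code]
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.normal(size=(n, n))
g = A + A.T + n * np.eye(n)  # symmetric and (almost surely) nondegenerate
ginv = np.linalg.inv(g)      # components g^{ij}

phi = lambda v: g @ v        # phi(v)_j = g_{ij} v^i   (lower an index)
sigma = lambda f: ginv @ f   # sigma(f)^j = g^{ij} f_i (raise an index)

v = rng.normal(size=n)
print(np.allclose(sigma(phi(v)), v))  # True: sigma is the inverse of phi
[/code]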

If ##B:V\times V\to\mathbb R## is a (0,2) tensor, then the maps ##(f,v)\mapsto B(\sigma(f),v)## and ##(f,v)\mapsto B(v,\sigma(f))##, both from ##V^*\times V## into ##\mathbb R##, are tensors of type (1,1). Let's denote them by C and D respectively. The components of these tensors are
\begin{align}
&C^i{}_j=C(e^i,e_j)=B(\sigma(e^i),e_j)=B(g^{ik}e_k,e_j)=g^{ik}B_{kj}=B^i{}_j\\
&D^i{}_j=D(e^i,e_j)=B(e_j,\sigma(e^i))=B(e_j,g^{ik}e_k) =g^{ik}B_{jk}=B_j{}^i.
\end{align} At this point, it looks like my definitions of C and D together form a basis-independent definition of how to "raise an index" of B. But look at what happens when we raise the other index in the same way. Define ##E,F:V^*\times V^*\to\mathbb R## by ##E(f,f')=C(f,\sigma(f'))## and ##F(f,f')=D(f,\sigma(f'))##.
\begin{align}
& E^{ij}=E(e^i,e^j)=C(e^i,\sigma(e^j)) =C(e^i,g^{jk}e_k) =g^{jk}C^i{}_k =g^{jk}B^i{}_k =B^{ij}\\
& F^{ij}=F(e^i,e^j) =D(e^i,\sigma(e^j)) =D(e^i,g^{jk}e_k) =g^{jk}D^i{}_k =g^{jk}B_k{}^i =B^{ji}
\end{align} This means that the final result of the operation of "raising both indices" depends on which index we raise first. We don't get the result we want if we start by raising the second index.

We can obviously define "B with both indices raised" by ##B^{ij}=g^{ik}g^{jl}B_{kl}##, but I would prefer to have a nice basis-independent definition.

D'oh, I was hoping that I would see the answer when I typed this up. I didn't, so I guess I'll hit the submit button and let you guys point and laugh.

By the way, the reason why I think of this as a consequence of the convention to keep the ##V^*##-variable slots to the left is that things look better if I change the definition of D so that the variables are in the opposite order. I'll use the notation ##\overline D## instead. We define ##\overline D:V\times V^*\to\mathbb R## by ##\overline D(v,f)=B(v,\sigma(f))##. We have
$$\overline D_i{}^j =\overline D(e_i,e^j) =B(e_i,\sigma(e^j)) =B(e_i,g^{jk}e_k) =g^{jk}B_{ik}=B_i{}^j.$$ Now we define ##\overline F:V^*\times V^*\to\mathbb R## by ##\overline F(f,f')=\overline D(\sigma(f),f')##, and we have
$$\overline F^{ij} =\overline F(e^i,e^j) =\overline D(\sigma(e^i),e^j)=\overline D(g^{ik}e_k,e^j) =g^{ik}\overline D_k{}^j =g^{ik}B_k{}^j =B^{ij}.$$ This time the indices end up in the right order.
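For what it's worth, here is a numerical check of all of the above (a numpy sketch with arbitrary test data; the matrix expressions mirror the component formulas):

[code]
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.normal(size=(n, n))
g = A + A.T + n * np.eye(n)  # a symmetric nondegenerate bilinear form
ginv = np.linalg.inv(g)      # components g^{ij}
B = rng.normal(size=(n, n))  # components B_{ij}

B_up = np.einsum('ik,jl,kl->ij', ginv, ginv, B)  # B^{ij} = g^{ik} g^{jl} B_{kl}

# Raise the first index first: C^i_j = g^{ik} B_{kj}, then E^{ij} = g^{jk} C^i_k.
C = ginv @ B
E = C @ ginv
print(np.allclose(E, B_up))    # True: E^{ij} = B^{ij}

# Raise the second index first, placing the new slot on the left:
# D^i_j = g^{ik} B_{jk}, then F^{ij} = g^{jk} D^i_k.
D = ginv @ B.T
F = D @ ginv
print(np.allclose(F, B_up.T))  # True: F^{ij} = B^{ji}, the transpose

# The fixed version: Dbar_i^j = g^{jk} B_{ik}, then Fbar^{ij} = g^{ik} Dbar_k^j.
Dbar = B @ ginv
Fbar = ginv @ Dbar
print(np.allclose(Fbar, B_up))  # True: the indices end up in the right order
[/code]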
 

