What is the issue with tensor terminology and variable slot order?

In summary: definitions of tensors vary, but they are often accompanied by the comment that even if the order of the variable slots changes, the map is still considered a tensor of the same type. Not everyone considers this rigorous, and some prefer to define tensors with a fixed slot order. If the order doesn't matter, the set of tensors can be described in terms of equivalence classes; if it does matter, defining tensor products becomes more complicated. The choice also matters when a tensor field is represented as a section of a bundle, since changing the slot order changes the section. One suggested resolution is the abstract-index notation, which specifies which slot is which regardless of the order in which the indices are written.
  • #1
Fredrik
Definitions like this one are common in books:
For all ##k,l\in\mathbb N##, a multilinear map $$T:\underbrace{V^*\times\cdots\times V^*}_{k\text{ factors}}\times \underbrace{V\times\cdots\times V}_{l\text{ factors}}\to\mathbb R$$ is said to be a tensor of type ##(k,l)## on ##V##.​
Lee calls this a tensor of type ##l\choose k## instead of type ##(k,l)##, but that's not the issue I want to talk about. These definitions are often followed by a comment that says that even if we change the order of the variable slots, we would still consider the map a tensor of type ##(k,l)##. In other words,
For all ##k,l\in\mathbb N##, a multilinear map ##T:W_1\times\cdots \times W_{k+l}\to\mathbb R## is said to be a tensor of type ##(k,l)## if

(a) ##W_i=V^*## for ##k## different values of ##i##, all of which are in ##\{1,\dots,k+l\}##.
(b) ##W_i=V## for ##l## different values of ##i##, all of which are in ##\{1,\dots,k+l\}##.​
This creates a problem. If we had assigned the term "tensor of type ##(k,l)##" only to multilinear maps with all the ##V^*## variable slots to the left of all the ##V## variable slots, then we could have defined a vector space structure on the set of tensors of type ##(k,l)##, and used such vector spaces to define vector bundles over a manifold. Then we could have defined tensor fields as sections of such bundles. But when we assign the term "tensor of type ##(k,l)##" to multilinear maps with the slots in any weird order, we can't define a vector space structure on the set of tensors of a given type.
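To make the obstruction concrete, here is a minimal sketch in Python (the (signature, map) representation is just my illustration, not taken from any of the books): pointwise addition is only defined when the two maps have the same domain, i.e. the same slot order.

[code]
# Hypothetical representation of a "tensor": a pair (signature, map), where
# the signature records the slot order, e.g. ('V*', 'V') vs ('V', 'V*').

def add_tensors(t1, t2):
    sig1, f1 = t1
    sig2, f2 = t2
    # Pointwise addition only makes sense when the domains agree,
    # i.e. when the slot signatures are identical.
    if sig1 != sig2:
        raise TypeError(f"cannot add maps with slot orders {sig1} and {sig2}")
    return (sig1, lambda *args: f1(*args) + f2(*args))

T = (('V*', 'V'), lambda f, v: f(v))      # a (1,1) tensor, V* slot first
S = (('V', 'V*'), lambda v, f: 2 * f(v))  # also "type (1,1)", slots swapped

try:
    add_tensors(T, S)
except TypeError as e:
    print(e)  # the loose set of (1,1) tensors is not closed under addition
[/code]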

We obviously want the term "tensor" to apply to multilinear maps with the variable slots in any weird order. What I'm wondering is if there's a terminology for this sort of thing that makes it easier to talk about. Something like "Two tensors of type ##(k,l)## are said to be equiflubbery if they have their variable slots in the same order".

Note that if we choose to use that first definition I mentioned (without the supplementary comment), so that the variable slots must be in a specific order, then we can't even say that the tensor product of a tensor of type (m,n) with a tensor of type (m',n') is a tensor of type (m+m',n+n'). It would just be some multilinear map that isn't a tensor, or at least isn't a tensor that has a type.
 
  • #2
micromass
From a pure math point of view:

Fredrik said:
These definitions are often followed by a comment that says that even if we change the order of the variable slots, we would still consider the map a tensor of type ##(k,l)##.

People do that, but I don't consider it rigorous. I would much rather define a tensor as something with a fixed order. In principle, every tensor with the variable slots in a mixed order can be rewritten as one with the slots in a fixed order, so we don't lose anything.

If you don't like this, then perhaps you like the following approach. Let ##\mathcal{A}## be the set of all ##(k,l)## tensors in the loose sense where the order doesn't matter. Define ##T\sim S## if and only if ##T## and ##S## differ only by a permutation of their variable slots. Then the set of tensors you want is ##\mathcal{A}/\sim##. But I don't know a name for the equivalence ##\sim##.
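A rough way to test that equivalence numerically (my own sketch; it ignores the ##V## versus ##V^*## distinction and just checks agreement up to a permutation of the slots on random arguments):

[code]
import numpy as np
from itertools import permutations

def related_by_slot_permutation(f, g, arity, dim=3, trials=20):
    # Test whether g(x_1,...,x_n) == f(x_{p(1)},...,x_{p(n)}) for some
    # fixed permutation p, on randomly sampled vector arguments.
    rng = np.random.default_rng(0)
    samples = [tuple(rng.normal(size=dim) for _ in range(arity))
               for _ in range(trials)]
    for p in permutations(range(arity)):
        if all(np.isclose(g(*args), f(*(args[i] for i in p)))
               for args in samples):
            return True
    return False

# Two bilinear maps built from the same array, differing only in slot order:
A = np.arange(9.0).reshape(3, 3)
T = lambda u, v: u @ A @ v
S = lambda v, u: u @ A @ v
print(related_by_slot_permutation(T, S, arity=2))  # True
[/code]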

Fredrik said:
Note that if we choose to use that first definition I mentioned (without the supplementary comment), so that the variable slots must be in a specific order, then we can't even say that the tensor product of a tensor of type (m,n) with a tensor of type (m',n') is a tensor of type (m+m',n+n'). It would just be some multilinear map that isn't a tensor, or at least isn't a tensor that has a type.

We can say it if we define the tensor product the right way. Let ##T:\underbrace{V^*\times\cdots\times V^*}_{k\text{ factors}}\times \underbrace{V\times\cdots\times V}_{l\text{ factors}}\to\mathbb R## and ##S:\underbrace{V^*\times\cdots\times V^*}_{a\text{ factors}}\times \underbrace{V\times\cdots\times V}_{b\text{ factors}}\to\mathbb R##. Then define

[tex]T\otimes S:\underbrace{V^*\times\cdots\times V^*}_{k+a\text{ factors}}\times \underbrace{V\times\cdots\times V}_{l+b\text{ factors}}\to\mathbb R[/tex]

by

[tex](T\otimes S)(\omega_1,...,\omega_k,\nu_1,...,\nu_a,v_1,...,v_l,w_1,...,w_b) = T(\omega_1,...,\omega_k,v_1,...,v_l)S(\nu_1,...,\nu_a,w_1,...,w_b)[/tex]

This definition is a bit more involved, but it does make things work.
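Here is a sketch of that bookkeeping in numpy (my own illustration, representing a tensor of type ##(k,l)## as an array whose first ##k## axes are the ##V^*## slots and whose last ##l## axes are the ##V## slots): the product is an outer product followed by an axis shuffle that moves all the upper slots to the front.

[code]
import numpy as np

def tensor_product(T, k, l, S, a, b):
    # T has k "upper" axes then l "lower" axes; S has a, then b.
    P = np.multiply.outer(T, S)
    # Axes of P are (T-upper, T-lower, S-upper, S-lower); move the a upper
    # axes of S in front of the l lower axes of T, so that the result has
    # all the upper slots first, as in the definition above.
    src = list(range(k + l, k + l + a))
    dst = list(range(k, k + a))
    return np.moveaxis(P, src, dst)

T = np.random.rand(3, 3)  # a (1,1) tensor
S = np.random.rand(3, 3)  # another (1,1) tensor
TS = tensor_product(T, 1, 1, S, 1, 1)  # a (2,2) tensor
# Componentwise: (T otimes S)^{ij}_{mn} = T^i_m S^j_n
print(np.isclose(TS[0, 1, 2, 0], T[0, 2] * S[1, 0]))  # True
[/code]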
 
  • #4
Ben Niehoff
I'm a pretty hardline advocate of index-free notation, but even I think that if a tensor has many slots, then the abstract-index notation is best. Then one can just say, "a tensor of the form ##T_a{}^{bc}{}_e{}^f##", and there is no ambiguity.

Fortunately, in real-world problems one rarely encounters a tensor with more than 2 indices. The Riemann tensor, with 4 indices, is best thought of as a Lie-algebra-valued 2-form.
 
  • #5
The resolution of this issue seems more a matter of algebra than geometry. Identify the tensors with representations of an appropriate group acting on ##V##. Then we can identify the decomposition of any tensor into a direct sum of irreducible representations. This decomposition specifies the equivalence of two representations. The order in which we choose to write down the indices is not relevant to the algebraic properties of the representation.
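For the simplest instance of that decomposition: under ##O(n)##, a rank-2 tensor splits into trace, symmetric-traceless, and antisymmetric irreducible pieces, none of which refers to an index ordering. A quick numerical sketch:

[code]
import numpy as np

n = 3
B = np.random.default_rng(1).normal(size=(n, n))  # an arbitrary rank-2 tensor

trace_part = (np.trace(B) / n) * np.eye(n)
symmetric_traceless = 0.5 * (B + B.T) - trace_part
antisymmetric = 0.5 * (B - B.T)

# The three pieces recover B, and each spans an O(n)-invariant subspace.
print(np.allclose(B, trace_part + symmetric_traceless + antisymmetric))  # True
print(np.isclose(np.trace(symmetric_traceless), 0.0))                    # True
[/code]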
 
  • #6
Ben Niehoff said:
I'm a pretty hardline advocate of index-free notation, but even I think that if a tensor has many slots, then the abstract-index notation is best. Then one can just say, "a tensor of the form ##T_a{}^{bc}{}_e{}^f##", and there is no ambiguity.
As long as it doesn't get used in the usual unbearably annoying way for spinors :wink:
 
  • #7
Thanks guys.

micromass said:
In principle, every tensor with the variable slots in a mixed order can be rewritten as one with the slots in a fixed order, so we don't lose anything.
I agree, but it feels weird to use the term "tensor" only for that, almost as if those other multilinear maps are fundamentally different objects. It's like defining the term "sequence" to be a function with domain ##\mathbb N##...but of course people do that too.

micromass said:
If you don't like this, then perhaps you like the following approach. Let ##\mathcal{A}## be the set of all ##(k,l)## tensors in the loose sense where the order doesn't matter. Define ##T\sim S## if and only if ##T## and ##S## differ only by a permutation of their variable slots. Then the set of tensors you want is ##\mathcal{A}/\sim##. But I don't know a name for the equivalence ##\sim##.
I guess it would make sense to say that two tensors in the same equivalence class are of the same "ordered type", or something like that, but since no one has mentioned a standard term, I'm going to assume there isn't one.

micromass said:
We can say it if we define the tensor product the right way.
Ah, yes. This is not the definition I once learned from Wald (who uses the abstract index notation), but it certainly seems appropriate, if we use the simpler definition of "tensor of type ##(k,l)##". Edit: I remembered this wrong. He does define it this way, in the section before the one that introduces the abstract index notation. And he calls it an "outer product", not a tensor product as I did.

robphy said:

Ben Niehoff said:
I'm a pretty hardline advocate of index-free notation, but even I think that if a tensor has many slots, then the abstract-index notation is best. Then one can just say, "a tensor of the form ##T_a{}^{bc}{}_e{}^f##", and there is no ambiguity.

Fortunately, in real-world problems one rarely encounters a tensor with more than 2 indices. The Riemann tensor, with 4 indices, is best thought of as a Lie-algebra-valued 2-form.
It appears that the abstract index notation was invented specifically to deal with this sort of thing, and with the generalization to multilinear functionals whose domains are Cartesian products of possibly different vector spaces and their duals.

The "labeling set" terminology comes from the definition of the abstract index notation. The labeling set is specifically the set $$L=\{a,b,\dots,a_0,b_0,\dots,a_1,\dots\}.$$ (I got that from one of the links that turned up in robphy's search).
 
  • #8
WannabeNewton
Fredrik, check out section 2.4 in Wald. He basically voices your same concerns and explains why the abstract index notation helps "resolve" it.
 
  • #9
WannabeNewton said:
Fredrik, check out section 2.4 in Wald. He basically voices your same concerns and explains why the abstract index notation helps "resolve" it.
I checked it out. He talks about some other things that I have mentioned in defense of the abstract index notation in other threads, but not about the specific issue I brought up here. In fact, he doesn't seem to consider other ways to order the variable slots at all. I'm extremely surprised to discover that, because that means I must have misunderstood section 2.3 the first time I read it, and then never noticed. I always thought e.g. that by his definitions, the domain of
$$\mathrm dx^\mu|_p\otimes \frac{\partial}{\partial x^\nu}\bigg|_p$$ is ##T_pM\times T_pM^*##, but it's not. It's ##T_pM^*\times T_pM##.
 
  • #10
Fredrik said:
It appears that the abstract index notation was invented specifically to deal with this sort of thing,
I have to correct myself again. Even when the abstract index notation is used, the ##V^*##-variable slots should be to the left of all the ##V##-variable slots. For example, the notation ##T^{ab}_{c}## denotes a map from ##V^*\times V^*\times V## into ##\mathbb R##. Penrose wouldn't even write this as ##T^{ab}{}_c##. As far as I can tell, the only time we need to position the indices differently than in ##T^{ab}_c## is when the metric is used to raise or lower indices.
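Incidentally, numpy's einsum behaves a lot like the abstract index notation (a loose analogy of mine, not anything from Penrose or Wald): slots are matched by label rather than by position, so the order in which the arguments are fed in doesn't matter.

[code]
import numpy as np

rng = np.random.default_rng(3)
T = rng.normal(size=(4, 4, 4))  # components T^{ab}_c
omega = rng.normal(size=4)      # a covector
nu = rng.normal(size=4)         # another covector
v = rng.normal(size=4)          # a vector

# The labels a, b, c do the bookkeeping, not the slot positions:
x = np.einsum('abc,a,b,c->', T, omega, nu, v)
# Feeding the same arguments in a different order gives the same number,
# as long as the labels are matched up correctly:
y = np.einsum('abc,c,b,a->', T, v, nu, omega)
print(np.isclose(x, y))  # True
[/code]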
 
  • #11
This is not specific to tensors; any function can manifest this difficulty. Suppose we have a function on ##A^m\times B^n##, where ##A## and ##B## are sets. It might be even worse: we might have an infinite number of copies of an infinite number of sets. Let's just suppose we have three copies of ##A## and two copies of ##B##:
$$f:A\times A\times A\times B\times B\to C,\qquad g:A\times B\times A\times B\times A\to C.$$
Then ##f## and ##g## might be more or less the same thing, in the sense that ##f(x)=g(h(x))##, where ##h## is a permutation of the variable slots. Order matters even among copies of the same set: for example, ##f(x,\ast,y,\ast,z)## and ##f(y,x,\ast,z,\ast)## are different things, where ##\ast## marks an open slot and a letter means the slot is fixed at that value. The best thing to do is to label all the slots and not mix them up.
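A minimal sketch of that in Python (the particular functions and the permutation ##h## are made up for illustration):

[code]
# f : A×A×A×B×B → C and g : A×B×A×B×A → C can be "more or less the same
# thing": f(x) = g(h(x)) for a fixed permutation h of the argument slots.

def g(a1, b1, a2, b2, a3):
    return (a1, a2, a3, b1, b2)   # some function of the five arguments

def h(args):
    a1, a2, a3, b1, b2 = args     # f's slot order
    return (a1, b1, a2, b2, a3)   # reordered into g's slot order

def f(*args):
    return g(*h(args))

print(f('x', 'y', 'z', '1', '2'))  # ('x', 'y', 'z', '1', '2')
[/code]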
 
  • #12
I think I found a somewhat ugly consequence of the convention to keep all the ##V^*##-variable slots to the left of all the ##V##-variable slots. I would like to know if I'm doing something wrong, or if this is just a price we have to pay for having all the ##V^*##-variable slots to the left.

Let g be an arbitrary symmetric nondegenerate bilinear form, i.e. a metric tensor. The map ##v\mapsto g(v,\cdot)## from V into V* is an isomorphism. I will denote it by ##\phi## and its inverse by ##\sigma##. We have ##\phi(e_i)=g_{ij}e^j## and ##\sigma(e^i)=g^{ij}e_j##.
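In components, ##\phi## lowers an index and ##\sigma## raises one. A quick numpy sketch with arbitrary test data:

[code]
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.normal(size=(n, n))
g = A + A.T + n * np.eye(n)  # symmetric and (almost surely) nondegenerate
ginv = np.linalg.inv(g)      # components g^{ij}

phi = lambda v: g @ v        # phi(v)_j = g_{ij} v^i   (lower an index)
sigma = lambda f: ginv @ f   # sigma(f)^j = g^{ij} f_i (raise an index)

v = rng.normal(size=n)
print(np.allclose(sigma(phi(v)), v))  # True: sigma is the inverse of phi
[/code]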

If ##B:V\times V\to\mathbb R## is a (0,2) tensor, then the maps ##(f,v)\mapsto B(\sigma(f),v)## and ##(f,v)\mapsto B(v,\sigma(f))##, both from ##V^*\times V## into ##\mathbb R##, are tensors of type (1,1). Let's denote them by C and D respectively. The components of these tensors are
\begin{align}
&C^i{}_j=C(e^i,e_j)=B(\sigma(e^i),e_j)=B(g^{ik}e_k,e_j)=g^{ik}B_{kj}=B^i{}_j\\
&D^i{}_j=D(e^i,e_j)=B(e_j,\sigma(e^i))=B(e_j,g^{ik}e_k) =g^{ik}B_{jk}=B_j{}^i.
\end{align} At this point, it looks like my definitions of C and D together form a basis-independent definition of how to "raise an index" of B. But look at what happens when we raise the other index in the same way. Define ##E,F:V^*\times V^*\to\mathbb R## by ##E(f,f')=C(f,\sigma(f'))## and ##F(f,f')=D(f,\sigma(f'))##.
\begin{align}
& E^{ij}=E(e^i,e^j)=C(e^i,\sigma(e^j)) =C(e^i,g^{jk}e_k) =g^{jk}C^i{}_k =g^{jk}B^i{}_k =B^{ij}\\
& F^{ij}=F(e^i,e^j) =D(e^i,\sigma(e^j)) =D(e^i,g^{jk}e_k) =g^{jk}D^i{}_k =g^{jk}B_k{}^i =B^{ji}
\end{align} This means that the final result of the operation of "raising both indices" depends on which index we raise first. We don't get the result we want if we start by raising the second index.

We can obviously define "B with both indices raised" by ##B^{ij}=g^{ik}g^{jl}B_{kl}##, but I would prefer to have a nice basis-independent definition.

D'oh, I was hoping that I would see the answer when I typed this up. I didn't, so I guess I'll hit the submit button and let you guys point and laugh.

By the way, the reason why I think of this as a consequence of the convention to keep the ##V^*##-variable slots to the left is that things look better if I change the definition of D so that the variables are in the opposite order. I'll use the notation ##\overline D## instead. We define ##\overline D:V\times V^*\to\mathbb R## by ##\overline D(v,f)=B(v,\sigma(f))##. We have
$$\overline D_i{}^j =\overline D(e_i,e^j) =B(e_i,\sigma(e^j)) =B(e_i,g^{jk}e_k) =g^{jk}B_{ik}=B_i{}^j.$$ Now we define ##\overline F:V^*\times V^*\to\mathbb R## by ##\overline F(f,f')=\overline D(\sigma(f),f')##, and we have
$$\overline F^{ij} =\overline F(e^i,e^j) =\overline D(\sigma(e^i),e^j)=\overline D(g^{ik}e_k,e^j) =g^{ik}\overline D_k{}^j =g^{ik}B_k{}^j =B^{ij}.$$ This time the indices end up in the right order.
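For what it's worth, here is a numerical check of all of the above (a numpy sketch with arbitrary test data; the matrix expressions mirror the component formulas):

[code]
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.normal(size=(n, n))
g = A + A.T + n * np.eye(n)  # a symmetric nondegenerate bilinear form
ginv = np.linalg.inv(g)      # components g^{ij}
B = rng.normal(size=(n, n))  # components B_{ij}

B_up = np.einsum('ik,jl,kl->ij', ginv, ginv, B)  # B^{ij} = g^{ik} g^{jl} B_{kl}

# Raise the first index first: C^i_j = g^{ik} B_{kj}, then E^{ij} = g^{jk} C^i_k.
C = ginv @ B
E = C @ ginv
print(np.allclose(E, B_up))    # True: E^{ij} = B^{ij}

# Raise the second index first, placing the new slot on the left:
# D^i_j = g^{ik} B_{jk}, then F^{ij} = g^{jk} D^i_k.
D = ginv @ B.T
F = D @ ginv
print(np.allclose(F, B_up.T))  # True: F^{ij} = B^{ji}, the transpose

# The fixed version: Dbar_i^j = g^{jk} B_{ik}, then Fbar^{ij} = g^{ik} Dbar_k^j.
Dbar = B @ ginv
Fbar = ginv @ Dbar
print(np.allclose(Fbar, B_up))  # True: the indices end up in the right order
[/code]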
 

