Confusion About Notation with Tensors

In summary, the conversation discusses confusion over the Einstein summation convention and the Levi-Civita symbol εijk in the context of tensors and the curl operation. The concept of "dummy indices" is explained, it is noted that advanced texts expect the reader to fill in intermediate steps and verify identities for themselves, and the role of m and l as indices to be summed over is clarified.
  • #1
B3NR4Y
Gold Member
I wasn't sure where to post this, and I hope this is the right place. I've been reading ahead of my lectures, and I've gotten a book that introduces tensors. It very quickly introduces the Einstein summation convention, which I think I understand, [itex] \sum_{i=1}^{3} x_{i} y_{i} = x_{i} y_{i} = x \cdot y [/itex], or, for instance, the gradient of a scalar function φ, [itex] \nabla \varphi = \sum_{i=1}^{3} \frac{1}{h_{i}} \partial_{i} \varphi = \frac{1}{h_{i}} \partial_{i} \varphi [/itex]. I feel like I understand this well enough to use it, but when it introduces the curl it uses a symbol εijk that is 1 when ijk is an even permutation of (1,2,3), -1 when ijk is an odd permutation of (1,2,3), and zero if two or more indices are equal. I thought I understood this well enough, for instance [itex]\epsilon_{213} = -1[/itex], but I guess I don't, or I'm too confused. Directly after that it says [itex]\epsilon_{ijk} = \delta_{il}\delta_{jm} - \delta_{jl}\delta_{lm}[/itex], and I'm confused from here on. Where do m and l come from? It mentions these are just "dummy indices", but I'm unsure what even that means. Am I summing the repeated indices or something? I'm sorry if I sound dumb, but I'm teaching myself ahead and don't really have a professor to ask these questions. It uses this epsilon to define the cross product symbolically, and here is where another confusion hits.

[itex] \vec{e_{ i }} \times \vec{e_{ j }} = \epsilon_{ijk} \vec{e_{k}} [/itex], I get this: when ijk is an even permutation the sign is positive, so it makes sense. Then it goes on to define a left-handed basis as follows: [itex] \vec{e_{ i }} \times \vec{e_{ j }} = -\epsilon_{ijk} \vec{e_{k}} [/itex]. Why not just replace ijk with jik or some other odd permutation and drop the negative sign? Or is the epsilon there for some other reason, with the negative applied to the k basis vector?

And the last thing is when it uses this to define the curl, it does so as follows: [itex] (\nabla \times \vec{V})_{ i } = \epsilon_{ijk} \partial_{j} \vec{V_{k}} [/itex], but I know from previous work that the curl has three terms and the signs alternate. I assumed the alternating sign would come from the epsilon symbol, and more confusion hits me. Am I summing over multiple indices? So

[tex] \sum_{ i = 1}^{3} \sum_{ j = 1}^{3} \sum_{ k = 1}^{3} \epsilon_{ijk} \partial_{j} \vec{V}_k =\epsilon_{ijk} \partial_{j} \vec{V}_k = [\partial_{2}(\vec{V_{3}})-\partial_{3}(\vec{V_{2}})]\vec{e_{1}} + [\partial_{3}(\vec{V_{1}})-\partial_{1}(\vec{V_{3}})]\vec{e_{2}} + [\partial_{1}(\vec{V_{2}})-\partial_{2}(\vec{V_{1}})]\vec{e_{3}}[/tex]

Or am I way off? Sorry for the lengthy post
 
Last edited:
  • #2
B3NR4Y said:
I wasn't sure where to post this, and I hope this is the right place. I've been reading ahead of my lectures, and I've gotten a book that introduces tensors. It very quickly introduces the Einstein summation convention, which I think I understand, [itex] \sum_{i=1}^{3} x_{i} y_{i} = x_{i} y_{i} = x \cdot y [/itex], or, for instance, the gradient of a scalar function φ, [itex] \nabla \varphi = \sum_{i=1}^{3} \frac{1}{h_{i}} \partial_{i} \varphi = \frac{1}{h_{i}} \partial_{i} \varphi [/itex]. I feel like I understand this well enough to use it, but when it introduces the curl it uses a symbol εijk that is 1 when ijk is an even permutation of (1,2,3), -1 when ijk is an odd permutation of (1,2,3), and zero if two or more indices are equal. I thought I understood this well enough, for instance [itex]\epsilon_{213} = -1[/itex], but I guess I don't, or I'm too confused. Directly after that it says [itex]\epsilon_{ijk} = \delta_{il}\delta_{jm} - \delta_{jl}\delta_{lm}[/itex], and I'm confused from here on. Where do m and l come from? It mentions these are just "dummy indices", but I'm unsure what even that means. Am I summing the repeated indices or something? I'm sorry if I sound dumb, but I'm teaching myself ahead and don't really have a professor to ask these questions. It uses this epsilon to define the cross product symbolically, and here is where another confusion hits.

That's fine; the thing about many advanced texts is that they expect the reader to fill in the intermediate arguments so that they can get more quickly to the main topics. In this case, you are meant to question that identity and justify it yourself by finding a concrete example that you personally understand, and then proving the identity in the general case. This is pretty much the expected way to read any advanced academic text that has advanced mathematical content. So confusion is good. It means you are learning something new and not just rehashing something you've already mastered. :-)

In detail, the term "dummy index" means a letter that is only present in order to be summed over. It has no relative on the other side of the equation to give it meaning. You are already familiar with this concept. For example, in definite integration, we can write [tex]\int_0^1 f(x) \, dx = \int_0^1 f(y) \, dy[/tex]. The variables "x" and "y" have no extrinsic meaning: "x" can be replaced with any other symbol, and the expression will still be equal to the right side. So "x" is a dummy variable, and by the same argument, so is "y".
As you may have surmised, in your expression, i, j and k are not dummy variables. Replace i on the right side with any other letter, and the two sides will no longer be equal. You would have to replace i with the same letter on both sides in order to preserve the equality. These types of variables are sometimes called "free variables", and you may see the term "free index" in the text.
So what to do with m and l? In the summation convention, when repeated indices occur, we must sum over every value of the repeated index. In 3-dimensional space, it is customary to work with index values of 1, 2, and 3, so you would have 3 possible values for m and 3 possible values for l. If you go ahead and write out the sum for a particular triple of values (i, j, k), such as (1, 2, 3), you should get one term for each pair of values of (l, m). Be sure you understand this by writing out the terms in full if necessary.
Edit: That being said, I think your text may have a misprint, or the context is not apparent, as I just noticed the equation does not have k on both sides, which makes it invalid. The closest identity I know to that one is the contracted epsilon identity:
[tex]\epsilon_{ijk}\epsilon_{ist} = \delta_{js}\delta_{kt} - \delta_{jt}\delta_{ks}[/tex]
Is this the one you meant?
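If it helps, here is a quick brute-force check of that contracted identity. This is just a throwaway sketch of mine (nothing from your book), using NumPy to run over every value of the free indices and sum the dummy index explicitly:
[code]
import numpy as np
from itertools import permutations

# Levi-Civita symbol as a 3x3x3 array (0-based indices here)
eps = np.zeros((3, 3, 3))
for p in permutations(range(3)):
    # determinant of the permuted identity matrix gives the sign of the permutation
    eps[p] = np.linalg.det(np.eye(3)[list(p)])

delta = np.eye(3)  # Kronecker delta

# Check eps_ijk eps_ist = delta_js delta_kt - delta_jt delta_ks
# for every value of the free indices j, k, s, t; the dummy index i is summed.
for j in range(3):
    for k in range(3):
        for s in range(3):
            for t in range(3):
                lhs = sum(eps[i, j, k] * eps[i, s, t] for i in range(3))
                rhs = delta[j, s] * delta[k, t] - delta[j, t] * delta[k, s]
                assert np.isclose(lhs, rhs)
print("contracted epsilon identity holds for all index values")
[/code]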
B3NR4Y said:
[itex] \vec{e_{ i }} \times \vec{e_{ j }} = \epsilon_{ijk} \vec{e_{k}} [/itex], I get this: when ijk is an even permutation the sign is positive, so it makes sense. Then it goes on to define a left-handed basis as follows: [itex] \vec{e_{ i }} \times \vec{e_{ j }} = -\epsilon_{ijk} \vec{e_{k}} [/itex]. Why not just replace ijk with jik or some other odd permutation and drop the negative sign? Or is the epsilon there for some other reason, with the negative applied to the k basis vector?
That's perfectly valid reasoning. The book just chose to present it that way. There tend to be many ways to present the same idea.

B3NR4Y said:
And the last thing is when it uses this to define the curl, it does so as follows: [itex] (\nabla \times \vec{V})_{ i } = \epsilon_{ijk} \partial_{j} \vec{V_{k}} [/itex], but I know from previous work that the curl has three terms and the signs alternate. I assumed the alternating sign would come from the epsilon symbol, and more confusion hits me. Am I summing over multiple indices? So

[tex] \sum_{ i = 1}^{3} \sum_{ j = 1}^{3} \sum_{ k = 1}^{3} \epsilon_{ijk} \partial_{j} \vec{V}_k =\epsilon_{ijk} \partial_{j} \vec{V}_k = [\partial_{2}(\vec{V_{3}})-\partial_{3}(\vec{V_{2}})]\vec{e_{1}} + [\partial_{3}(\vec{V_{1}})-\partial_{1}(\vec{V_{3}})]\vec{e_{2}} + [\partial_{1}(\vec{V_{2}})-\partial_{2}(\vec{V_{1}})]\vec{e_{3}}[/tex]

Or am I way off? Sorry for the lengthy post
We do not sum over indices that appear on both sides of an equation (free indices); we only sum over dummy indices, which are repeated on just one side of the equation. The way they define the curl defines only the ith component of the curl. For example, your equation says that the 1st component of the curl is:
[tex](\nabla \times \vec{V})_1 = \epsilon_{1jk} \partial_{j} \vec{V_{k}}[/tex]
[tex]= \epsilon_{123} \partial_{2} \vec{V_{3}} + \epsilon_{132} \partial_{3} \vec{V_{2}}[/tex]
[tex]= \partial_{2}\vec{V_{3}} - \partial_{3}\vec{V_{2}}[/tex]
Each value of i will give you only that ith component.
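If you want to see the index bookkeeping done mechanically, here is a little symbolic sketch of my own (not from any text) using SymPy; it writes out [itex]\epsilon_{ijk} \partial_{j} V_{k}[/itex] for each value of the free index i and recovers the three familiar components of the curl:
[code]
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
coords = (x1, x2, x3)
# a generic vector field with components V1, V2, V3
V = [sp.Function(f'V{k}')(*coords) for k in (1, 2, 3)]

def eps(i, j, k):
    # Levi-Civita symbol for indices in {1, 2, 3}
    return sp.Rational((i - j) * (j - k) * (k - i), 2)

# i is a free index: one expression per component of the curl.
# j and k are dummy indices: they are summed over.
for i in (1, 2, 3):
    curl_i = sum(eps(i, j, k) * sp.diff(V[k - 1], coords[j - 1])
                 for j in (1, 2, 3) for k in (1, 2, 3))
    print(f"(curl V)_{i} =", curl_i)
[/code]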
 
Last edited:
  • #3
Okay, I feel like I understand a bit more now. The dummy indices thing seems so obvious, and yeah I did get the identity wrong now that I have my book to look at.

So, to check my understanding of the cross product fully, the notation is
[tex] (\nabla \times \vec{V})_{ i } = \epsilon_{ijk} \partial_{j}V_{k} [/tex]
which means the i-th component of the curl of the vector field V is equal to [itex]\epsilon_{ijk} \partial_{j}V_{k}[/itex], with j and k summed over. So, for example, the first component is
[tex] (\nabla \times \vec{V})_{ 1 } = \epsilon_{1jk} \partial_{j}V_{k} = \sum_{j=1}^{3} \sum_{k=1}^{3} \epsilon_{1jk}\partial_{j}V_{k} = \partial_{2} V_{3}-\partial_3 V_{2} [/tex]

This makes more sense to me now, and I suppose I see why Einstein called it his greatest contribution to mathematics. I also read on Wikipedia that the indices you're summing over, for the dot product for example, should have one low and one high, as in [itex]x_{i} y^{i}[/itex].

And a question about tensors themselves: the book says Cartesian tensors are the most important for physics, so it will neglect other types. It defines a Cartesian tensor like this
[tex]
T=
\left[ {\begin{array}{ccc}
{T_{xx}} & {T_{xy}} & {T_{xz}} \\
{T_{yx}} & {T_{yy}} & {T_{yz}} \\
{T_{zx}} & {T_{zy}} & {T_{zz}} \\
\end{array} } \right] = \sum_{ij} T_{ij} \vec{e_{i}} \otimes \vec{e_{j}}
= T_{ij}\vec{e_{ij}}[/tex]
It also says to note, which I did and thought was very interesting, that if you multiply this tensor (in matrix form) by a vector in column matrix form, you get another column vector. So this had me thinking: if multiplying a tensor by a vector gives a transformation of a vector, why not just call tensors transformations? I know this is a shallow look at tensors, and I realize that you can multiply tensors together and get something that is not a vector, so they aren't just transformations, but I can't quite conceptualize what tensors (of rank higher than 1) "mean", like I can with vectors.

Thanks for your time
 
  • #4
B3NR4Y said:
Okay, I feel like I understand a bit more now. The dummy indices thing seems so obvious, and yeah I did get the identity wrong now that I have my book to look at.

So, to check my understanding of the cross product fully, the notation is
[tex] (\nabla \times \vec{V})_{ i } = \epsilon_{ijk} \partial_{j}V_{k} [/tex]
which means the i-th component of the curl of the vector field V is equal to [itex]\epsilon_{ijk} \partial_{j}V_{k}[/itex], with j and k summed over. So, for example, the first component is
[tex] (\nabla \times \vec{V})_{ 1 } = \epsilon_{1jk} \partial_{j}V_{k} = \sum_{j=1}^{3} \sum_{k=1}^{3} \epsilon_{1jk}\partial_{j}V_{k} = \partial_{2} V_{3}-\partial_3 V_{2} [/tex]

This makes more sense to me now, and I suppose I see why Einstein called it his greatest contribution to mathematics. I also read on Wikipedia that the indices you're summing over, for the dot product for example, should have one low and one high, as in [itex]x_{i} y^{i}[/itex].

That's perfectly right. The convention of summing only over high-low pairs of indices is meant to emphasize that the proper setting for these operations is dual vector spaces. In basic courses, the dot product is usually just written the way you wrote it, as a sum of products of components of two vectors from the same vector space. However, if we look at matrix multiplication, that computation only makes sense when one of the vectors is written as a row matrix and the other as a column matrix. We can also view the action of a matrix on a column vector as a stack of row vectors, each acting on that single column vector to give a list of scalars. So we get the idea of row vectors being operators on column vectors: each row vector is a function that maps column vectors to real numbers. We may therefore write the dot product as an interaction between a row vector and a column vector, hence the lowered and raised indices.
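As a throwaway numerical illustration of that last point (my own sketch, nothing from the book), the same number comes out whether you view the dot product as a sum over components or as a row vector acting on a column vector:
[code]
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# dot product as a sum over components, x_i y_i
print(np.sum(x * y))          # 32.0

# the same number, viewed as a row vector (a linear functional) acting on a column vector
row = x.reshape(1, 3)
col = y.reshape(3, 1)
print((row @ col)[0, 0])      # 32.0
[/code]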
The question then arises as to whether it makes sense to talk about dot products as being related to the actions of these "linear functionals", which themselves form a vector space, on vectors. The Riesz representation theorem, which applies to far more general vector spaces than matrix algebra can handle, makes this correspondence precise and is the starting point for the whole notion of dual vector spaces. However, it is more than mere convention that we use dual vector spaces. It becomes increasingly unavoidable to distinguish vectors of different types, living in different places, once we start talking about curved spaces and infinite-dimensional spaces, such as the curved manifolds of general relativity and the rigged Hilbert spaces of quantum mechanics. I think I will let the book or someone more qualified introduce that notion, though, as it is very involved, and I will probably mess it up. :-)
B3NR4Y said:
And a question about tensors themselves: the book says Cartesian tensors are the most important for physics, so it will neglect other types. It defines a Cartesian tensor like this
[tex]
T=
\left[ {\begin{array}{ccc}
{T_{xx}} & {T_{xy}} & {T_{xz}} \\
{T_{yx}} & {T_{yy}} & {T_{yz}} \\
{T_{zx}} & {T_{zy}} & {T_{zz}} \\
\end{array} } \right] = \sum_{ij} T_{ij} \vec{e_{i}} \otimes \vec{e_{j}}
= T_{ij}\vec{e_{ij}}[/tex]
It also says to note, which I did and thought was very interesting, that if you multiply this tensor (in matrix form) by a vector in column matrix form, you get another column vector. So this had me thinking: if multiplying a tensor by a vector gives a transformation of a vector, why not just call tensors transformations? I know this is a shallow look at tensors, and I realize that you can multiply tensors together and get something that is not a vector, so they aren't just transformations, but I can't quite conceptualize what tensors (of rank higher than 1) "mean", like I can with vectors.

Thanks for your time
I have to say, I don't really like texts that introduce tensors in terms of coordinates before talking about either geometry or linear algebra. The coordinate representation of a tensor is secondary to its nature. It is as if a physics text told you that a vector is an ordered list of real numbers and then never mentioned the fact that it could be regarded as a direction and magnitude in space, but then went on to talk about forces, derivatives, and so forth using only the algebra of ordered lists of numbers. What a nightmare that would be.
A tensor, as far as physics is concerned, is a multilinear functional. That means it takes as input an ordered list of vectors, which may be from different vector spaces, and outputs a single real number. It is linear in each vector argument separately, in the same way you are familiar with from linear transformations acting on single vectors.
A simple example of a tensor, therefore, is the one which takes two vectors as input and outputs their dot product. It is important that we regard the dot product as attached to the geometric vectors, not their coordinates: in a different coordinate system, the formula for the dot product must be adjusted so that it still equals the product of the magnitudes of the vectors times the cosine of the angle between them, not merely the sum of products of components we get in a rectangular coordinate system. The tensor is therefore a geometric object, independent of any particular coordinate system, which is the point of using tensors in the first place. As you already know, the dot product is linear in each input, and its output is a real number. A tensor which has only two arguments from the same vector space V (or a vector space and a copy of the same space, depending on how pedantic you want to be) is a map from V×V into R. As you should verify yourself, any tensor like this can be written as an n×n matrix if the vector space V has dimension n. You should also verify that the matrix form of the dot-product tensor with respect to rectangular coordinates for V is the n×n identity matrix. Use a particular value of n, like n = 2 or n = 3, to start.
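Here is a quick sketch of that suggested exercise for n = 3 (my own, assuming ordinary rectangular coordinates): the components [itex]g_{ij} = \vec{e_i} \cdot \vec{e_j}[/itex] form the identity matrix, and feeding the tensor two vectors reproduces their dot product.
[code]
import numpy as np

# standard basis of R^3 in rectangular coordinates
e = np.eye(3)

# components of the dot-product tensor: g_ij = e_i . e_j
g = np.array([[np.dot(e[i], e[j]) for j in range(3)] for i in range(3)])
print(g)  # the 3x3 identity matrix, as claimed

# the tensor acting on two vectors: g_ij u_i v_j reproduces the dot product
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
print(np.einsum('ij,i,j->', g, u, v))  # 32.0, same as np.dot(u, v)
[/code]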
An example of a tensor with three inputs from the same vector space is the triple scalar product, which returns the signed volume of the parallelepiped constructed from its three input vectors, assuming they live in flat Euclidean space. Unlike a tensor with only two inputs, it needs 3 indices, one for each vector, to write out its components with respect to a particular coordinate system. So this is an example of a tensor that cannot be reduced to a simple matrix transformation. The determinant is another tensor, if we regard a matrix as an ordered list of row vectors or column vectors. As you study more, you will learn about the Riemann curvature tensor, which has 4 inputs, and no doubt many more.
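A small numerical check of that statement (again just my own sketch): contracting the Levi-Civita symbol with three vectors, [itex]\epsilon_{ijk} u_i v_j w_k[/itex], gives the same signed volume as the determinant of the matrix whose rows are u, v, and w.
[code]
import numpy as np
from itertools import permutations

# Levi-Civita symbol as a 3x3x3 array (0-based indices)
eps = np.zeros((3, 3, 3))
for p in permutations(range(3)):
    eps[p] = np.linalg.det(np.eye(3)[list(p)])  # +1 for even, -1 for odd permutations

u = np.array([1.0, 0.0, 0.0])
v = np.array([1.0, 2.0, 0.0])
w = np.array([0.0, 1.0, 3.0])

# the triple scalar product as a 3-index tensor acting on three vectors
print(np.einsum('ijk,i,j,k->', eps, u, v, w))   # 6.0 (signed volume)
print(np.linalg.det(np.vstack([u, v, w])))      # 6.0, the same thing
[/code]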
I have greatly simplified this by only showing tensors that act on vectors, and not on linear functionals. It is quite easy to turn the previous argument around and regard vectors as acting on linear functionals, which makes vectors tensors as well. It turns out that, just as a dot product between two vectors is related to the action of a particular linear functional on the "second vector", we can consistently pair up elements of the dual vector space with particular elements of the vector space. When we do this, we either raise or lower the index belonging to that space to show that we are considering the related quantity (related through the Riesz representation theorem). Since we still want to regard the tensor as essentially the same geometric object, regardless of whether it is acting on vector spaces or their duals, and regardless of its coordinate representation, we call these identifications the musical isomorphisms. "Musical" because raising and lowering is whimsically related to sharpening or flattening a musical note.
So tensors encapsulate scalars, vectors, linear transformations, and multilinear transformations into one theory. I will try to find a good free intro text for you if you want to read one on the side.
Edit: Okay, this looks nice, and is probably much better than my rambling: https://www.math.ethz.ch/education/bachelor/lectures/fs2014/other/mla/ma.pdf . :-)
Edit 2: This looks slightly better: http://rbowen.tamu.edu/ . Of course, you can just search for "Multilinear algebra" and read as much as you want about tensors.
 
Last edited:
  • #5
Woah, that second link looks perfect. I'll definitely give it a read; I also have a book called "Tensor Analysis on Manifolds" now, which I bought yesterday. My university doesn't offer a multilinear algebra course, but they do offer a differential geometry course. I'm double majoring in math and physics and have taught myself reasonably far ahead.
By this I mean, I am currently taking linear algebra/differential equations and calculus III. However, I've always been interested in math and haven't really slowed down teaching myself, and I feel I have a confident foundation in math up to linear algebra (which includes vector spaces, at least here). The sequence suggested by the university is abstract algebra and topology; then Algebra I (group actions and Sylow theorems, finitely generated abelian groups; rings and modules: PIDs, UFDs, finitely generated modules over a PID, applications to Jordan canonical form, exact sequences); and from there differential geometry. I also plan on taking PDEs and the advanced calc sequence (three semesters).
There's also a finite dimensional vector spaces course that I may or may not find interesting.
Which of these do you think I'd encounter tensors in first? Or not at all? Maybe in physics courses before I encounter them in math. In the electrodynamics book we use in our junior year (Griffiths) I notice he uses tensors in the last chapter; maybe that will be my first taste of them in a formal class setting.
 
  • #6
B3NR4Y said:
Woah, that second link looks perfect. I'll definitely give it a read; I also have a book called "Tensor Analysis on Manifolds" now, which I bought yesterday. My university doesn't offer a multilinear algebra course, but they do offer a differential geometry course. I'm double majoring in math and physics and have taught myself reasonably far ahead.
...
Which of these do you think I'd encounter tensors in first? Or not at all? Maybe in physics courses before I encounter them in math. In the electrodynamics book we use in our junior year (Griffiths) I notice he uses tensors in the last chapter; maybe that will be my first taste of them in a formal class setting.

With a double major in math/physics (I did the same!) you will definitely encounter tensors informally at first in electrodynamics, especially if you elect to take an early course in general relativity as I did. For the full formal theory, you will usually not find a course devoted solely to multilinear algebra; their natural home is differential geometry, which is their native introductory habitat, where the study of curved spaces requires a natural way of talking about tangent spaces, cotangent spaces, and invariant operators on those spaces. In fact, differential geometry (and topology) is the native habitat of many of the basic tools used in modern theoretical physics, so you may want to study math up to at least that level of comfort if you're more on the physical side of things, although there is no barrier other than time to studying further. :-)
 

1. What is a tensor?

A tensor is a mathematical object that describes linear relationships between physical quantities such as vectors and scalars. In a given coordinate system it is represented by a multi-dimensional array of numbers, and it can be used to model complex systems in physics, engineering, and other fields.

2. How is tensor notation different from regular vector notation?

Tensor notation uses a combination of upper and lower indices to represent the different components of a tensor, while vector notation typically uses a single index. This allows for a more compact representation of higher-dimensional tensors and simplifies calculations involving tensor operations.

3. What is the difference between a covariant and a contravariant tensor?

Covariant tensors transform in the same way as the basis vectors of a space (the gradient of a scalar is the standard example), while contravariant tensors transform the way the coordinate differentials do, i.e. in the opposite (inverse) way. This distinction is important in general relativity and other settings where the coordinate system, and hence the metric, varies from point to point.

4. How are tensors used in machine learning and artificial intelligence?

Tensors are used in machine learning and AI to represent and manipulate high-dimensional data, such as images and text. They are also used in deep learning networks to perform operations such as convolution and pooling.

5. What are some common applications of tensors in physics?

Tensors are used in physics to model and analyze a wide range of systems, including fluids, electromagnetism, and general relativity. They are also used in quantum mechanics to represent the state of a quantum system and its observables.
