Confusion About Notation with Tensors

In summary, the conversation discusses confusion over the Einstein summation convention and the Levi-Civita symbol εijk in the context of tensors and the curl operation. The concept of "dummy indices" is explained, it is noted that advanced texts expect the reader to fill in intermediate steps and verify identities for themselves, and the role of m and l as indices to be summed over is clarified.
  • #1
B3NR4Y
Gold Member
I wasn't sure where to post this, and I hope this is the right place. I've been reading ahead of my lectures, and I've gotten a book that introduces tensors. It very quickly introduces the Einstein summation convention, which I think I understand, [itex] \sum_{i=1}^{3} x_{i} y_{i} = x_{i} y_{i} = x \cdot y [/itex], or, for instance, the gradient of a scalar function φ, [itex] \nabla \varphi = \sum_{i=1}^{3} \frac{1}{h_{i}} \partial_{i} \varphi = \frac{1}{h_{i}} \partial_{i} \varphi [/itex]. I feel like I understand this well enough to use it, but when it introduces the curl it uses a symbol εijk that is 1 when ijk is an even permutation of (1,2,3), -1 when ijk is an odd permutation of (1,2,3), and zero if two or more indices are equal. I thought I understood this well enough, for instance [itex]\epsilon_{213} = -1[/itex], but I guess I don't, or I'm too confused. Directly after that it says [itex]\epsilon_{ijk} = \delta_{il}\delta_{jm} - \delta_{jl}\delta_{lm}[/itex], and I'm confused from here on. Where do m and l come from? It mentions these are just "dummy indices", but I'm unsure what even that means. Am I summing the repeated indices or something? I'm sorry if I sound dumb, but I'm teaching myself ahead and don't really have a professor to ask these questions. It uses this epsilon to define the cross product symbolically, and here is where another confusion hits.

[itex] \vec{e_{ i }} \times \vec{e_{ j }} = \epsilon_{ijk} \vec{e_{k}} [/itex], I get this: when ijk is an even permutation the sign is positive, so it makes sense. Then it goes on to define a left-handed basis as follows: [itex] \vec{e_{ i }} \times \vec{e_{ j }} = -\epsilon_{ijk} \vec{e_{k}} [/itex]. Why not just replace ijk with jik or some other odd permutation and drop the negative sign? Or is the epsilon there for some other reason, with the negative applied to the k basis vector?

And the last thing is when it uses this to define the curl, it does so as follows: [itex] (\nabla \times \vec{V})_{ i } = \epsilon_{ijk} \partial_{j} \vec{V_{k}} [/itex], but I know from previous work that the curl has three terms and the signs alternate. I assumed the alternating sign would come from the epsilon symbol, and more confusion hits me. Am I summing over multiple indices? So

[tex] \sum_{ i = 1}^{3} \sum_{ j = 1}^{3} \sum_{ k = 1}^{3} \epsilon_{ijk} \partial_{j} \vec{V}_k =\epsilon_{ijk} \partial_{j} \vec{V}_k = [\partial_{2}(\vec{V_{3}})-\partial_{3}(\vec{V_{2}})]\vec{e_{1}} + [\partial_{3}(\vec{V_{1}})-\partial_{1}(\vec{V_{3}})]\vec{e_{2}} + [\partial_{1}(\vec{V_{2}})-\partial_{2}(\vec{V_{1}})]\vec{e_{3}}[/tex]

Or am I way off? Sorry for the lengthy post
 
Last edited:
  • #2
B3NR4Y said:
I wasn't sure where to post this, and I hope this is the right place. I've been reading ahead of my lectures, and I've gotten a book that introduces tensors. It very quickly introduces the Einstein summation convention, which I think I understand, [itex] \sum_{i=1}^{3} x_{i} y_{i} = x_{i} y_{i} = x \cdot y [/itex], or, for instance, the gradient of a scalar function φ, [itex] \nabla \varphi = \sum_{i=1}^{3} \frac{1}{h_{i}} \partial_{i} \varphi = \frac{1}{h_{i}} \partial_{i} \varphi [/itex]. I feel like I understand this well enough to use it, but when it introduces the curl it uses a symbol εijk that is 1 when ijk is an even permutation of (1,2,3), -1 when ijk is an odd permutation of (1,2,3), and zero if two or more indices are equal. I thought I understood this well enough, for instance [itex]\epsilon_{213} = -1[/itex], but I guess I don't, or I'm too confused. Directly after that it says [itex]\epsilon_{ijk} = \delta_{il}\delta_{jm} - \delta_{jl}\delta_{lm}[/itex], and I'm confused from here on. Where do m and l come from? It mentions these are just "dummy indices", but I'm unsure what even that means. Am I summing the repeated indices or something? I'm sorry if I sound dumb, but I'm teaching myself ahead and don't really have a professor to ask these questions. It uses this epsilon to define the cross product symbolically, and here is where another confusion hits.

That's fine; the thing about many advanced texts is that they expect the reader to fill in the intermediate arguments so that they can get more quickly to the main topics. In this case, you are meant to question that identity and justify it yourself by finding a concrete example that you personally understand, and then proving the identity in the general case. This is pretty much the expected way to read any advanced academic text that has advanced mathematical content. So confusion is good. It means you are learning something new and not just rehashing something you've already mastered. :-)

In detail, the term "dummy index" means a letter that is only present in order to be summed over. It has no relative on the other side of the equation to give it meaning. You are already familiar with this concept. For example, in definite integration, we can write [tex]\int_0^1 f(x) \, dx = \int_0^1 f(y) \, dy[/tex]. The variables "x" and "y" have no extrinsic meaning: "x" can be replaced with any other symbol, and the expression will still be equal to the right side. So "x" is a dummy variable, and by the same argument, so is "y".
As you may have surmised, in your expression, i, j and k are not dummy variables. Replace i on the right side with any other letter, and the two sides will no longer be equal. You would have to replace i with the same letter on both sides in order to preserve the equality. These types of variables are sometimes called "free variables", and you may see the term "free index" in the text.
So what to do with m and l? In the summation convention, when repeated indices occur, we must sum over every value of the repeated index. In 3-dimensional space, it is customary to work with index values of 1, 2, and 3, so you would have 3 possible values for m and 3 possible values for l. If you go ahead and write out the sum for a particular triple of values (i, j, k), such as (1, 2, 3), you should get one term for each pair of values of (l, m). Be sure you understand this by writing out the terms in full if necessary.
Edit: That being said, I think your text may have a misprint, or the context is not apparent, as I just noticed the equation does not have k on both sides, which makes it invalid. The closest identity I know to that one is the contracted epsilon identity:
[tex]\epsilon_{ijk}\epsilon_{ist} = \delta_{js}\delta_{kt} - \delta_{jt}\delta_{ks}[/tex]
Is this the one you meant?
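If it helps, here is a quick brute-force check of that contracted identity. This is just a throwaway sketch of mine (nothing from your book), using NumPy to run over every value of the free indices and sum the dummy index explicitly:
[code]
import numpy as np
from itertools import permutations

# Levi-Civita symbol as a 3x3x3 array (0-based indices here)
eps = np.zeros((3, 3, 3))
for p in permutations(range(3)):
    # determinant of the permuted identity matrix gives the sign of the permutation
    eps[p] = np.linalg.det(np.eye(3)[list(p)])

delta = np.eye(3)  # Kronecker delta

# Check eps_ijk eps_ist = delta_js delta_kt - delta_jt delta_ks
# for every value of the free indices j, k, s, t; the dummy index i is summed.
for j in range(3):
    for k in range(3):
        for s in range(3):
            for t in range(3):
                lhs = sum(eps[i, j, k] * eps[i, s, t] for i in range(3))
                rhs = delta[j, s] * delta[k, t] - delta[j, t] * delta[k, s]
                assert np.isclose(lhs, rhs)
print("contracted epsilon identity holds for all index values")
[/code]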
B3NR4Y said:
[itex] \vec{e_{ i }} \times \vec{e_{ j }} = \epsilon_{ijk} \vec{e_{k}} [/itex], I get this: when ijk is an even permutation the sign is positive, so it makes sense. Then it goes on to define a left-handed basis as follows: [itex] \vec{e_{ i }} \times \vec{e_{ j }} = -\epsilon_{ijk} \vec{e_{k}} [/itex]. Why not just replace ijk with jik or some other odd permutation and drop the negative sign? Or is the epsilon there for some other reason, with the negative applied to the k basis vector?
That's perfectly valid reasoning. The book just chose to present it that way. There tend to be many ways to present the same idea.

B3NR4Y said:
And the last thing is when it uses this to define the curl, it does so as follows: [itex] (\nabla \times \vec{V})_{ i } = \epsilon_{ijk} \partial_{j} \vec{V_{k}} [/itex], but I know from previous work that the curl has three terms and the signs alternate. I assumed the alternating sign would come from the epsilon symbol, and more confusion hits me. Am I summing over multiple indices? So

[tex] \sum_{ i = 1}^{3} \sum_{ j = 1}^{3} \sum_{ k = 1}^{3} \epsilon_{ijk} \partial_{j} \vec{V}_k =\epsilon_{ijk} \partial_{j} \vec{V}_k = [\partial_{2}(\vec{V_{3}})-\partial_{3}(\vec{V_{2}})]\vec{e_{1}} + [\partial_{3}(\vec{V_{1}})-\partial_{1}(\vec{V_{3}})]\vec{e_{2}} + [\partial_{1}(\vec{V_{2}})-\partial_{2}(\vec{V_{1}})]\vec{e_{3}}[/tex]

Or am I way off? Sorry for the lengthy post
We do not sum over indices that appear on both sides of an equation (free indices); we only sum over dummy indices, which are repeated on just one side of the equation. The way they define the curl defines only the ith component of the curl. For example, your equation says that the 1st component of the curl is:
[tex](\nabla \times \vec{V})_1 = \epsilon_{1jk} \partial_{j} \vec{V_{k}}[/tex]
[tex]= \epsilon_{123} \partial_{2} \vec{V_{3}} + \epsilon_{132} \partial_{3} \vec{V_{2}}[/tex]
[tex]= \partial_{2}\vec{V_{3}} - \partial_{3}\vec{V_{2}}[/tex]
Each value of i will give you only that ith component.
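If you want to see the index bookkeeping done mechanically, here is a little symbolic sketch of my own (not from any text) using SymPy; it writes out [itex]\epsilon_{ijk} \partial_{j} V_{k}[/itex] for each value of the free index i and recovers the three familiar components of the curl:
[code]
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
coords = (x1, x2, x3)
# a generic vector field with components V1, V2, V3
V = [sp.Function(f'V{k}')(*coords) for k in (1, 2, 3)]

def eps(i, j, k):
    # Levi-Civita symbol for indices in {1, 2, 3}
    return sp.Rational((i - j) * (j - k) * (k - i), 2)

# i is a free index: one expression per component of the curl.
# j and k are dummy indices: they are summed over.
for i in (1, 2, 3):
    curl_i = sum(eps(i, j, k) * sp.diff(V[k - 1], coords[j - 1])
                 for j in (1, 2, 3) for k in (1, 2, 3))
    print(f"(curl V)_{i} =", curl_i)
[/code]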
 
Last edited:
  • #3
Okay, I feel like I understand a bit more now. The dummy indices thing seems so obvious, and yeah I did get the identity wrong now that I have my book to look at.

So, to check my understanding of the cross product fully, the notation is
[tex] (\nabla \times \vec{V})_{ i } = \epsilon_{ijk} \partial_{j}V_{k} [/tex]
which means the i-th component of the curl of the vector field V is equal to [itex]\epsilon_{ijk} \partial_{j}V_{k}[/itex], with j and k summed over. So, for example, the first component is
[tex] (\nabla \times \vec{V})_{ 1 } = \epsilon_{1jk} \partial_{j}V_{k} = \sum_{j=1}^{3} \sum_{k=1}^{3} \epsilon_{1jk}\partial_{j}V_{k} = \partial_{2} V_{3}-\partial_3 V_{2} [/tex]

This makes more sense to me now, and I suppose I see why Einstein called it his greatest contribution to mathematics. I also read on Wikipedia that the indices you're summing over, for the dot product for example, should have one low and one high, as in [itex]x_{i} y^{i}[/itex].

And a question about tensors themselves: the book says Cartesian tensors are the most important for physics, so it will neglect other types. It defines a Cartesian tensor like this
[tex]
T=
\left[ {\begin{array}{ccc}
{T_{xx}} & {T_{xy}} & {T_{xz}} \\
{T_{yx}} & {T_{yy}} & {T_{yz}} \\
{T_{zx}} & {T_{zy}} & {T_{zz}} \\
\end{array} } \right] = \sum_{ij} T_{ij} \vec{e_{i}} \otimes \vec{e_{j}}
= T_{ij}\vec{e_{ij}}[/tex]
It also says to note, which I did and thought was very interesting, that if you multiply this tensor (in matrix form) by a vector in column matrix form, you get another column vector. So this had me thinking: if multiplying a tensor by a vector gives a transformation of a vector, why not just call tensors transformations? I know this is a shallow look at tensors, and I realize that you can multiply tensors together and get something that is not a vector, so they aren't just transformations, but I can't quite conceptualize what tensors (of rank higher than 1) "mean", like I can with vectors.

Thanks for your time
 
  • #4
B3NR4Y said:
Okay, I feel like I understand a bit more now. The dummy indices thing seems so obvious, and yeah I did get the identity wrong now that I have my book to look at.

So, to check my understanding of the cross product fully, the notation is
[tex] (\nabla \times \vec{V})_{ i } = \epsilon_{ijk} \partial_{j}V_{k} [/tex]
which means the i-th component of the curl of the vector field V is equal to [itex]\epsilon_{ijk} \partial_{j}V_{k}[/itex], with j and k summed over. So, for example, the first component is
[tex] (\nabla \times \vec{V})_{ 1 } = \epsilon_{1jk} \partial_{j}V_{k} = \sum_{j=1}^{3} \sum_{k=1}^{3} \epsilon_{1jk}\partial_{j}V_{k} = \partial_{2} V_{3}-\partial_3 V_{2} [/tex]

This makes more sense to me now, and I suppose I see why Einstein called it his greatest contribution to mathematics. I also read on Wikipedia that the indices you're summing over, for the dot product for example, should have one low and one high, as in [itex]x_{i} y^{i}[/itex].

That's perfectly right. The convention of summing only over high-low pairs of indices is meant to emphasize that the proper setting for these operations is dual vector spaces. In basic courses, the dot product is usually just written the way you wrote it, as a sum of products of components of two vectors from the same vector space. However, if we look at matrix multiplication, that computation only makes sense when one of the vectors is written as a row matrix and the other as a column matrix. We can also view the action of a matrix on a column vector as a stack of row vectors, each acting on that single column vector to give a list of scalars. So we get the idea of row vectors being operators on column vectors: each row vector is a function that maps column vectors to real numbers. We may therefore write the dot product as an interaction between a row vector and a column vector, hence the lowered and raised indices.
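As a throwaway numerical illustration of that last point (my own sketch, nothing from the book), the same number comes out whether you view the dot product as a sum over components or as a row vector acting on a column vector:
[code]
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# dot product as a sum over components, x_i y_i
print(np.sum(x * y))          # 32.0

# the same number, viewed as a row vector (a linear functional) acting on a column vector
row = x.reshape(1, 3)
col = y.reshape(3, 1)
print((row @ col)[0, 0])      # 32.0
[/code]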
The question then arises as to whether it makes sense to talk about dot products as being related to the actions of these "linear functionals", which themselves form a vector space, on vectors. The Riesz representation theorem, which applies to far more general vector spaces than matrix algebra can handle, makes this correspondence precise and is the starting point for the whole notion of dual vector spaces. However, it is more than mere convention that we use dual vector spaces. It becomes increasingly unavoidable to distinguish vectors of different types, living in different places, once we start talking about curved spaces and infinite-dimensional spaces, such as the curved manifolds of general relativity and the rigged Hilbert spaces of quantum mechanics. I think I will let the book or someone more qualified introduce that notion, though, as it is very involved, and I will probably mess it up. :-)
B3NR4Y said:
And a question about tensors themselves: the book says Cartesian tensors are the most important for physics, so it will neglect other types. It defines a Cartesian tensor like this
[tex]
T=
\left[ {\begin{array}{ccc}
{T_{xx}} & {T_{xy}} & {T_{xz}} \\
{T_{yx}} & {T_{yy}} & {T_{yz}} \\
{T_{zx}} & {T_{zy}} & {T_{zz}} \\
\end{array} } \right] = \sum_{ij} T_{ij} \vec{e_{i}} \otimes \vec{e_{j}}
= T_{ij}\vec{e_{ij}}[/tex]
It also says to note, which I did and thought was very interesting, that if you multiply this tensor (in matrix form) by a vector in column matrix form, you get another column vector. So this had me thinking: if multiplying a tensor by a vector gives a transformation of a vector, why not just call tensors transformations? I know this is a shallow look at tensors, and I realize that you can multiply tensors together and get something that is not a vector, so they aren't just transformations, but I can't quite conceptualize what tensors (of rank higher than 1) "mean", like I can with vectors.

Thanks for your time
I have to say, I don't really like texts that introduce tensors in terms of coordinates before talking about either geometry or linear algebra. The coordinate representation of a tensor is secondary to its nature. It is as if a physics text told you that a vector is an ordered list of real numbers and then never mentioned the fact that it could be regarded as a direction and magnitude in space, but then went on to talk about forces, derivatives, and so forth using only the algebra of ordered lists of numbers. What a nightmare that would be.
A tensor, as far as physics is concerned, is a multilinear functional. That means it takes as input an ordered list of vectors, which may be from different vector spaces, and outputs a single real number. It is linear in each vector argument separately, in the same way you are familiar with from linear transformations acting on single vectors.
A simple example of a tensor, therefore, is the one which takes two vectors as input and outputs their dot product. It is important that we regard the dot product as attached to the geometric vectors, not their coordinates: in a different coordinate system, the formula for the dot product must be adjusted so that it still equals the product of the magnitudes of the vectors times the cosine of the angle between them, not merely the sum of products of components we get in a rectangular coordinate system. The tensor is therefore a geometric object, independent of any particular coordinate system, which is the point of using tensors in the first place. As you already know, the dot product is linear in each input, and its output is a real number. A tensor which has only two arguments from the same vector space V (or a vector space and a copy of the same space, depending on how pedantic you want to be) is a map from V×V into R. As you should verify yourself, any tensor like this can be written as an n×n matrix if the vector space V has dimension n. You should also verify that the matrix form of the dot-product tensor with respect to rectangular coordinates for V is the n×n identity matrix. Use a particular value of n, like n = 2 or n = 3, to start.
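Here is a quick sketch of that suggested exercise for n = 3 (my own, assuming ordinary rectangular coordinates): the components [itex]g_{ij} = \vec{e_i} \cdot \vec{e_j}[/itex] form the identity matrix, and feeding the tensor two vectors reproduces their dot product.
[code]
import numpy as np

# standard basis of R^3 in rectangular coordinates
e = np.eye(3)

# components of the dot-product tensor: g_ij = e_i . e_j
g = np.array([[np.dot(e[i], e[j]) for j in range(3)] for i in range(3)])
print(g)  # the 3x3 identity matrix, as claimed

# the tensor acting on two vectors: g_ij u_i v_j reproduces the dot product
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
print(np.einsum('ij,i,j->', g, u, v))  # 32.0, same as np.dot(u, v)
[/code]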
An example of a tensor with three inputs from the same vector space is the triple scalar product, which returns the signed volume of the parallelepiped constructed from its three input vectors, assuming they live in flat Euclidean space. Unlike a tensor with only two inputs, it needs 3 indices, one for each vector, to write out its components with respect to a particular coordinate system. So this is an example of a tensor that cannot be reduced to a simple matrix transformation. The determinant is another tensor, if we regard a matrix as an ordered list of row vectors or column vectors. As you study more, you will learn about the Riemann curvature tensor, which has 4 inputs, and no doubt many more.
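A small numerical check of that statement (again just my own sketch): contracting the Levi-Civita symbol with three vectors, [itex]\epsilon_{ijk} u_i v_j w_k[/itex], gives the same signed volume as the determinant of the matrix whose rows are u, v, and w.
[code]
import numpy as np
from itertools import permutations

# Levi-Civita symbol as a 3x3x3 array (0-based indices)
eps = np.zeros((3, 3, 3))
for p in permutations(range(3)):
    eps[p] = np.linalg.det(np.eye(3)[list(p)])  # +1 for even, -1 for odd permutations

u = np.array([1.0, 0.0, 0.0])
v = np.array([1.0, 2.0, 0.0])
w = np.array([0.0, 1.0, 3.0])

# the triple scalar product as a 3-index tensor acting on three vectors
print(np.einsum('ijk,i,j,k->', eps, u, v, w))   # 6.0 (signed volume)
print(np.linalg.det(np.vstack([u, v, w])))      # 6.0, the same thing
[/code]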
I have greatly simplified this by only showing tensors that act on vectors, and not on linear functionals. It is quite easy to turn the previous argument around and regard vectors as acting on linear functionals, which makes vectors tensors as well. It turns out that, just as a dot product between two vectors is related to the action of a particular linear functional on the "second vector", we can consistently pair up elements of the dual vector space with particular elements of the vector space. When we do this, we either raise or lower the index belonging to that space to show that we are considering the related quantity (related through the Riesz representation theorem). Since we still want to regard the tensor as essentially the same geometric object, regardless of whether it is acting on vector spaces or their duals, and regardless of its coordinate representation, we call these identifications the musical isomorphisms. "Musical" because raising and lowering is whimsically related to sharpening or flattening a musical note.
So tensors encapsulate scalars, vectors, linear transformations, and multilinear transformations into one theory. I will try to find a good free intro text for you if you want to read one on the side.
Edit: Okay, this looks nice, and is probably much better than my rambling: https://www.math.ethz.ch/education/bachelor/lectures/fs2014/other/mla/ma.pdf . :-)
Edit 2: This looks slightly better: http://rbowen.tamu.edu/ . Of course, you can just search for "Multilinear algebra" and read as much as you want about tensors.
 
Last edited:
  • #5
Woah, that second link looks perfect. I'll definitely give it a read; I also have a book called "Tensor Analysis on Manifolds" now, which I bought yesterday. My university doesn't offer a multilinear algebra course, but they do offer a differential geometry course. I'm double majoring in math and physics and have taught myself reasonably far ahead.
By this I mean, I am currently taking linear algebra/differential equations and calculus III. However, I've always been interested in math and haven't really slowed down teaching myself, and I feel I have a confident foundation in math up to linear algebra (which includes vector spaces, at least here). The sequence suggested by the university is abstract algebra and topology; then Algebra I (group actions and Sylow theorems, finitely generated abelian groups; rings and modules: PIDs, UFDs, finitely generated modules over a PID, applications to Jordan canonical form, exact sequences); and from there differential geometry. I also plan on taking PDEs and the advanced calc sequence (three semesters).
There's also a finite dimensional vector spaces course that I may or may not find interesting.
Which of these do you think I'd encounter tensors in first? Or not at all? Maybe in physics courses before I encounter them in math. In the electrodynamics book we use in our junior year (Griffiths) I notice he uses tensors in the last chapter; maybe that will be my first taste of them in a formal class setting.
 
  • #6
B3NR4Y said:
Woah, that second link looks perfect. I'll definitely give it a read; I also have a book called "Tensor Analysis on Manifolds" now, which I bought yesterday. My university doesn't offer a multilinear algebra course, but they do offer a differential geometry course. I'm double majoring in math and physics and have taught myself reasonably far ahead.
...
Which of these do you think I'd encounter tensors in first? Or not at all? Maybe in physics courses before I encounter them in math. In the electrodynamics book we use in our junior year (Griffiths) I notice he uses tensors in the last chapter; maybe that will be my first taste of them in a formal class setting.

With a double major in math/physics (I did the same!) you will definitely encounter tensors informally at first in electrodynamics, especially if you elect to take an early course in general relativity as I did. For the full formal theory, you will usually not find a course devoted solely to multilinear algebra; their natural home is differential geometry, which is their native introductory habitat, where the study of curved spaces requires a natural way of talking about tangent spaces, cotangent spaces, and invariant operators on those spaces. In fact, differential geometry (and topology) is the native habitat of many of the basic tools used in modern theoretical physics, so you may want to study math up to at least that level of comfort if you're more on the physical side of things, although there is no barrier other than time to studying further. :-)
 

1. What is a tensor?

A tensor is a mathematical object that describes linear relationships between physical quantities such as vectors and scalars. In a given coordinate system it is represented by a multi-dimensional array of numbers, and it can be used to model complex systems in physics, engineering, and other fields.

2. How is tensor notation different from regular vector notation?

Tensor notation uses a combination of upper and lower indices to represent the different components of a tensor, while vector notation typically uses a single index. This allows for a more compact representation of higher-dimensional tensors and simplifies calculations involving tensor operations.

3. What is the difference between a covariant and a contravariant tensor?

Covariant tensors transform in the same way as the basis vectors of a space (the gradient of a scalar is the standard example), while contravariant tensors transform the way the coordinate differentials do, i.e. in the opposite (inverse) way. This distinction is important in general relativity and other settings where the coordinate system, and hence the metric, varies from point to point.

4. How are tensors used in machine learning and artificial intelligence?

Tensors are used in machine learning and AI to represent and manipulate high-dimensional data, such as images and text. They are also used in deep learning networks to perform operations such as convolution and pooling.

5. What are some common applications of tensors in physics?

Tensors are used in physics to model and analyze a wide range of systems, including fluids, electromagnetism, and general relativity. They are also used in quantum mechanics to represent the state of a quantum system and its observables.
