Contravariant on the first index, covariant on the second, or vice versa?


Discussion Overview

The discussion revolves around the transformation properties of tensors, specifically the Lorentz transformation, and the implications of covariant and contravariant indices. Participants explore the definitions and roles of transformation coefficients in the context of tensor analysis, particularly as they relate to the Dirac matrix transformation properties.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Conceptual clarification

Main Points Raised

  • One participant is deriving the Dirac matrix transformation properties and questions whether a tensor that is covariant on the first index and contravariant on the second is the same as one that is vice versa.
  • Another participant asserts that the Lorentz transformation coefficients are not tensor elements but describe how tensor components transform between coordinate systems.
  • Some participants discuss the definition of the transformation matrix ##\Lambda## and whether it qualifies as a tensor, with differing opinions on its classification.
  • There is a suggestion that the reference material being used may not adequately teach relativity concepts, as it focuses more on applications in particle physics.
  • Participants debate the notation of indices in the transformation matrix, questioning whether it is clearer to use different placements for indices to denote their roles.
  • Some participants express uncertainty about the implications of index placement and its relevance to the transformation properties of vectors and covectors.

Areas of Agreement / Disagreement

Participants do not reach a consensus on whether the transformation coefficients can be classified as tensors. There are multiple competing views regarding the notation and implications of covariant and contravariant indices, as well as the appropriateness of the reference material being discussed.

Contextual Notes

Some participants note that the transformation matrix is not generally symmetric, which may affect the interpretation of index order. There are also discussions about the conventions used in writing tensor indices and the potential confusion that may arise from different authors' notations.

Gene Naden
I am working through a derivation of the Dirac matrix transformation properties. I have a tensor for the Lorentz transformation that is covariant on the first index and contravariant on the second index. For the derivation, I need vice versa, i.e. covariant on the second index and contravariant on the first index. Is it the same? (There are only two indices). Thanks.
 
Gene Naden said:
I have a tensor for the Lorentz transformation that is covariant on the first index and contravariant on the second index.
First things first. The Lorentz transformation coefficients are not the elements of a tensor. They describe how different types of tensor components transform when you change the coordinate system.

In general, you will have for the Lorentz transformation of a 4-vector
$$
V^{\mu'} = \Lambda^{\mu'}_{\phantom\mu\mu} V^\mu,
$$
where ##\Lambda^{\mu'}_{\phantom\mu\mu} = \partial x'^{\mu'}/\partial x^\mu##. It is generally not the same as ##\partial x^\mu/\partial x'^{\mu'}##.
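The last point can be checked numerically; a minimal sketch (my illustration, assuming a boost along ##x## with ##v = 0.6## and units ##c = 1##):

```python
import numpy as np

def boost_x(v):
    """Matrix of coefficients dx'^{mu'}/dx^mu for a boost along x (c = 1)."""
    gamma = 1.0 / np.sqrt(1.0 - v**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * v
    return L

Lam = boost_x(0.6)             # Lambda^{mu'}_mu = dx'^{mu'}/dx^mu
Lam_inv = np.linalg.inv(Lam)   # dx^mu/dx'^{mu'}

print(np.allclose(Lam, Lam_inv))            # False: the two sets of coefficients differ
print(np.allclose(Lam_inv, boost_x(-0.6)))  # True: the inverse is the reverse boost
```

The two coefficient matrices agree only for the identity; for any nontrivial boost they are the boosts with ##v## and ##-v##.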
 
ok. My ##\Lambda## is a tensor as defined by the authors. Thanks for your critical response. I need all the corrections I can get; I am really struggling.
 
Gene Naden said:
ok. My ##\Lambda## is a tensor as defined by the authors. Thanks for your critical response. I need all the corrections I can get; I am really struggling.
Where are you reading this? Please provide a reference. The transformation coefficients are certainly not the elements of a tensor.
 
Here is how they define ##\Lambda##
##x^{\prime\mu}=\Lambda^{\mu}_{\nu}x^\nu##, except that the index ##\nu## is shifted off slightly to the right, as if it were in the "second" position. So I have two questions: first, is ##\Lambda## as the authors are using it a tensor? Second, does it matter that the ##\nu## is shifted off to the right a little bit (compared to the ##\mu##)?
 
Again. Please provide the actual reference you are reading.

Gene Naden said:
Here is how they define ##\Lambda##
##x^{\prime\mu}=\Lambda^{\mu}_{\nu}x^\nu##, except that the index ##\nu## is shifted off slightly to the right, as if it were in the "second" position. So I have two questions: first, is ##\Lambda## as the authors are using it a tensor? Second, does it matter that the ##\nu## is shifted off to the right a little bit (compared to the ##\mu##)?
1. No. It is not a tensor.
2. Yes, it matters.
 
To be honest, you would be better off trying to learn relativity from a textbook or lecture notes that are written for a relativity course. This is not the case with the text you are reading. The discussion they have on the Lorentz group essentially focuses on its application to particle physics and QFT, not on teaching relativity.
 
Gene Naden said:
Sure, the reference is the online pdf, Lessons in Particle Physics, by Luis Anchordoqui and Francis Halzen. The link is https://arxiv.org/PS_cache/arxiv/pdf/0906/0906.1271v2.pdf

Thanks again

I didn't see ##\Lambda^\nu_\mu## being called a tensor anywhere.

The important thing about the Lorentz transformation is that it is a transformation between two different coordinate systems. To keep that straight, it's common to use different alphabets for the indices of the two coordinate systems. So for example, we let ##x^\alpha, x^\beta, x^\gamma## refer to coordinates in one system, and ##x^\mu, x^\nu, x^\lambda## refer to coordinates in the other system. Then ##\Lambda^\alpha_\mu## is used to transform the components of a vector from one system to the other: ##V^\alpha \equiv \sum_\mu \Lambda^\alpha_\mu V^\mu##. Since ##\Lambda## is not a tensor, it's not strictly speaking correct to call the indices contravariant or covariant. But it is true that you can use the same matrix to transform covectors in the other direction:

$$P_\mu = \sum_\alpha \Lambda^\alpha_\mu P_\alpha$$

You use the inverse to transform back:
  • ##V^\mu = \sum_\alpha (\Lambda^{-1})^\mu_\alpha V^\alpha##
  • ##P_\alpha = \sum_\mu (\Lambda^{-1})^\mu_\alpha P_\mu##
where the inverse matrix is defined by:

$$\sum_\mu (\Lambda^{-1})^\alpha_\mu \Lambda^\mu_\beta = \delta^\alpha_\beta$$
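As a sanity check of the claim that the same matrix, summed over its first index, carries covector components back the other way, here is a numerical sketch (my illustration, assuming an x-boost with ##v = 0.6## and the metric ##\mathrm{diag}(1,-1,-1,-1)##):

```python
import numpy as np

v = 0.6
g = 1.0 / np.sqrt(1.0 - v**2)
Lam = np.array([[   g, -g*v, 0, 0],     # Lambda^alpha_mu for an x-boost
                [-g*v,    g, 0, 0],
                [   0,    0, 1, 0],
                [   0,    0, 0, 1]])
Lam_inv = np.linalg.inv(Lam)
eta = np.diag([1.0, -1.0, -1.0, -1.0])  # Minkowski metric (assumed signature +---)

V = np.array([2.0, 1.0, 0.5, -1.0])     # contravariant components V^mu
P = eta @ V                              # covariant components P_mu

V_new = Lam @ V                          # V^alpha = Lambda^alpha_mu V^mu
P_new = eta @ V_new                      # P_alpha in the new system
P_back = Lam.T @ P_new                   # P_mu = sum_alpha Lambda^alpha_mu P_alpha

print(np.allclose(P_back, P))                 # True: same matrix, other direction
print(np.allclose(Lam_inv @ Lam, np.eye(4)))  # True: the defining inverse relation
```

The first check works precisely because ##\Lambda^T \eta \Lambda = \eta## for a Lorentz transformation.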
 
  • #10
I should point out that the same equation, ##V^\alpha = \Lambda^\alpha_\mu V^\mu##, is sometimes used to mean the same vector described in two different coordinate systems, or two different vectors described in the same coordinate system. That's the "passive" versus "active" distinction.
 
  • #11
  • #12
While we're discussing fine points, shouldn't we really write ##\Lambda^{\mu}{}_{\nu}## rather than ##\Lambda^{\mu}_{\nu}##, so that we know that ##\mu## is the first index and ##\nu## is the second?
 
  • #13
Gene Naden said:
I am working through a derivation of the Dirac matrix transformation properties. I have a tensor for the Lorentz transformation that is covariant on the first index and contravariant on the second index. For the derivation, I need vice versa, i.e. covariant on the second index and contravariant on the first index. Is it the same? (There are only two indices). Thanks.

I'm not sure why you think you need that. My recollection is that the transformations are ALWAYS written as ##\Lambda^{\mu}{}_{\nu}##. MTW calls this convention "northwest to southeast".

So we might have ##\Lambda^{\mu}{}_{\nu}## or ##\Lambda^{\nu}{}_{\mu}##, but never ##\Lambda_{\mu}{}^{\nu}##.

As I recall, the relation between ##\Lambda^{\mu}{}_{\nu}## and ##\Lambda^{\nu}{}_{\mu}## is that the product ##\Lambda^{\mu}{}_{\nu} \,\Lambda^{\nu}{}_{\mu}## [edit] contracts on the repeated index ##\nu## to ##\Lambda^{\mu}{}_{\mu}##, which then further contracts on the repeated index ##\mu## to a scalar value of 1. Thus one transformation is the inverse of the other.

For usage, we might transform a vector using the first transformation matrix, i.e. ##x^{\mu} = \Lambda^{\mu}{}_{\nu} \, x^{\nu}##, and covectors using the second, ##x_{\mu} = \Lambda^{\nu}{}_{\mu} x_{\nu}##. Which would be my guess as to what you wanted to do.
 
  • #14
pervect said:
While we're discussing fine points, shouldn't we really write ##\Lambda^{\mu}{}_{\nu}## rather than ##\Lambda^{\mu}_{\nu}##, so that we know that ##\mu## is the first index and ##\nu## is the second?

But since there is only one upper index and one lower index, then I don't see how the order makes a difference.
 
  • #15
stevendaryl said:
But since there is only one upper index and one lower index, then I don't see how the order makes a difference.
Some authors will use the reverse order to denote the transformation coefficients for dual vectors, i.e., ##V_{\mu'} = \Lambda^{\phantom{\mu}\mu}_{\mu'} V_\mu##. Placement then matters although one could also solve the issue by using different types of indices for the different systems, i.e., ##\Lambda^\mu_{\mu'}## vs ##\Lambda^{\mu'}_\mu##. However, this latter option places a bit more strain on the reader in remembering where many indices go.

For typesetting multiple indices where order matters, I much prefer using phantoms rather than subscripts of empty {}, i.e., ##\Lambda^{\mu'}_{\phantom\mu\nu}## over ##\Lambda^{\mu'}{}_\nu##. (The same goes for tensor indices, i.e., ##R^{a\phantom b c}_{\phantom a b}## vs ##R^a{}_b{}^c##).

Of course, all of those issues are conventions and any author should explain their conventions at some point.
 
  • #16
stevendaryl said:
But since there is only one upper index and one lower index, then I don't see how the order makes a difference.

The transformation matrix isn't in general symmetric, so the order of the indices can matter. It is true that if one knows the standard convention of northwest-southeast, one can understand the correct order even though the notation doesn't tell us what it is. Note, however, that the OP wanted to NOT use the standard conventions (if I understand his post correctly). Which strikes me as a recipe for confusion.

We can understand the tensor manipulations through a standard process, suitable for carrying out by a computer or a student working by rote, whereby we first form the tensor product, then contract repeated indices. (I recommend the computer approach with symbolic algebra, it's much less tedious).

Consider the example in my post #13. In ##x^{\mu} = \Lambda^{\mu}{}_{\nu} \, x^{\nu}## the repeated index is ##\nu##, which occurs in the second and third positions. So we form a rank three tensor product of the rank 2 transformation matrix and the rank 1 tensors, and then contract the second and third slots of the resulting entity to get the result, a rank 1 tensor. When transforming ##x_{\mu} = \Lambda^{\nu}{}_{\mu} x_{\nu}##, the repeated index is in the first and third positions, so we compute the rank 3 entity via the tensor product and then contract the first and third slots. Note that we contract different slots in the different cases.

As a further example, when we write ##\Lambda^{\mu}{}_{\nu} \,\Lambda^{\nu}{}_{\mu}##, we have two repeated indices. The notation tells us that we form a rank four entity via the tensor product, then perform a contraction on the first and third slots and the second and fourth slots (The order in which we perform the contraction doesn't matter, IIRC).

With symmetric tensors, I've seen papers (usually on linearized gravity) that stack one index directly over the other, rather than making the order of the indices clear. In that case it doesn't matter, because the tensors are symmetric. With non-symmetric tensors, though, the order of the indices does matter, and if one wishes to specify the calculations unambiguously by the mechanism I outlined earlier, one needs to know the correct order. When we write the indices such that each index is in its own sequential slot and no two indices occupy the same slot, the notation specifies the order unambiguously.
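The "tensor product, then contract slots" recipe really can be carried out mechanically, e.g. with `numpy.einsum`; a sketch (my illustration, assuming a generic non-symmetric 4×4 coefficient matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
Lam = rng.normal(size=(4, 4))    # generic, non-symmetric coefficients
x_up = rng.normal(size=4)        # components x^nu
x_dn = rng.normal(size=4)        # components x_nu

# x'^mu = Lam^mu_nu x^nu: form the rank-3 product, then contract slots 2 and 3
T = np.einsum('mn,r->mnr', Lam, x_up)
res1 = np.einsum('mnn->m', T)

# x'_mu = Lam^nu_mu x_nu: same product shape, contract slots 1 and 3 instead
T2 = np.einsum('nm,r->nmr', Lam, x_dn)
res2 = np.einsum('nmn->m', T2)

print(np.allclose(res1, Lam @ x_up))    # True: slots 2,3 give ordinary matrix action
print(np.allclose(res2, Lam.T @ x_dn))  # True: slots 1,3 give the transposed action
```

Contracting different slots yields ##\Lambda## acting versus ##\Lambda^T## acting, which is exactly why the slot order must be unambiguous.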
 
  • #17
pervect said:
When transforming ##x_{\mu} = \Lambda^{\nu}{}_{\mu} x_{\nu}##, the repeated index is in the first and third positions, so we compute the rank 3 entity via the tensor product and then contract the first and third slots. Note that we contract different slots in the different cases.
This is not the correct transformation rule for dual vectors. It would imply
$$
V_{\mu'}W^{\mu'} = \Lambda^\mu_{\phantom\mu\mu'} \Lambda^{\mu'}_{\phantom\mu\nu} V_\mu W^\nu,
$$
but generally ##\Lambda^\mu_{\phantom\mu\mu'} \Lambda^{\mu'}_{\phantom\mu\nu} \neq \delta^\mu_\nu## (it is correct only for rotations). Instead, you can get the correct transformation rule by first raising the index, applying the transformation rule for contravariant indices, and then lowering the index, i.e.,
$$
V_{\mu'} = \eta_{\mu'\nu'}V^{\nu'} = \eta_{\mu'\nu'} \Lambda^{\nu'}_{\phantom\nu\nu} V^{\nu} = \eta_{\mu'\nu'} \Lambda^{\nu'}_{\phantom\nu\nu} \eta^{\nu\rho} V_{\rho}.
$$
We also note that
$$
\Lambda^{\mu'}_{\phantom\mu\gamma}\eta_{\mu'\nu'} \Lambda^{\nu'}_{\phantom\nu\nu} \eta^{\nu\rho} = \eta_{\gamma\nu}\eta^{\nu\rho} = \delta^\rho_\gamma,
$$
which is just the defining property of the Lorentz transformation, that it preserves the metric tensor; thus the appropriate inner product is recovered.

For convenience we can apply the lowering and raising of the indices using the metric to write ##\eta_{\mu'\nu'} \Lambda^{\nu'}_{\phantom\nu\nu} \eta^{\nu\rho} = \Lambda_{\mu'}^{\phantom\mu\rho}##.
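Numerically, the relation ##\eta_{\mu'\nu'} \Lambda^{\nu'}_{\phantom\nu\nu} \eta^{\nu\rho} = \Lambda_{\mu'}^{\phantom\mu\rho}## says the covector coefficients are the inverse transpose of the vector coefficients; a sketch (my illustration, assuming an x-boost with ##v = 0.6##):

```python
import numpy as np

v = 0.6
g = 1.0 / np.sqrt(1.0 - v**2)
Lam = np.array([[   g, -g*v, 0, 0],
                [-g*v,    g, 0, 0],
                [   0,    0, 1, 0],
                [   0,    0, 0, 1]])
eta = np.diag([1.0, -1.0, -1.0, -1.0])
eta_inv = np.linalg.inv(eta)   # numerically the same matrix for this metric

# Defining property of a Lorentz transformation: it preserves the metric
print(np.allclose(Lam.T @ eta @ Lam, eta))         # True

# eta Lam eta^{-1} is the inverse transpose: the covector coefficients
Lam_cov = eta @ Lam @ eta_inv
print(np.allclose(Lam_cov, np.linalg.inv(Lam).T))  # True
print(np.allclose(Lam_cov, Lam))                   # False for a boost
```

The last line is the numerical content of the earlier remark that ##\Lambda^\mu_{\phantom\mu\mu'}\Lambda^{\mu'}_{\phantom\mu\nu} = \delta^\mu_\nu## holds only for rotations.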
 
  • #18
pervect said:
The transformation matrix isn't in general symmetric, so the order of the indices can matter. It is true that if one knows the standard convention of northwest-southeast, one can understand the correct order even though the notation doesn't tell us what it is.
With symmetric tensors, I've seen papers (usually on linearized gravity) that stack one index directly over the other, rather than making the order of the indices clear. In that case it doesn't matter, because the tensors are symmetric. With non-symmetric tensors, though, the order of the indices does matter, and if one wishes to specify the calculations unambiguously by the mechanism I outlined earlier, one needs to know the correct order. When we write the indices such that each index is in its own sequential slot and no two indices occupy the same slot, the notation specifies the order unambiguously.

This is probably not worth spending more time on, but could you give an example where the order makes a difference? I can certainly see that if a tensor has multiple upper indices or multiple lower indices, then if it's asymmetric, there is a difference between

##A^{\mu\nu} B_\mu## and ##A^{\nu\mu} B_\mu##

If ##A## is asymmetric, then those are different quantities.

But I don't understand what kind of ambiguity can result from the relative order of an upper index with a lower index.
 
  • #19
stevendaryl said:
This is probably not worth spending more time on, but could you give an example where the order makes a difference? I can certainly see that if a tensor has multiple upper indices or multiple lower indices, then if it's asymmetric, there is a difference between

##A^{\mu\nu} B_\mu## and ##A^{\nu\mu} B_\mu##

If ##A## is asymmetric, then those are different quantities.

But I don't understand what kind of ambiguity can result from the relative order of an upper index with a lower index.
Take the electromagnetic field tensor as an example. There is a difference between ##F^\mu_{\phantom\mu\nu}## and ##F^{\phantom\nu\mu}_{\nu}## as ##F^{\mu\nu}## is anti-symmetric. This ambiguity only appears when you have a metric to raise and lower indices, but you do have that in relativity.
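This can be seen numerically; a sketch (my illustration, assuming a generic antisymmetric ##F^{\mu\nu}## and the metric ##\mathrm{diag}(1,-1,-1,-1)##):

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])
rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))
F = A - A.T                      # generic antisymmetric F^{mu nu}

F_up_dn = F @ eta                # F^mu_nu,  stored as [mu, nu]
F_dn_up = eta @ F                # F_nu^mu,  stored as [nu, mu]

# Compare the component F^mu_nu with F_nu^mu (same index values, swapped order):
print(np.allclose(F_up_dn, -F_dn_up.T))  # True: they differ by a sign
print(np.allclose(F_up_dn, F_dn_up.T))   # False: index order matters here
```

For an antisymmetric tensor the two mixed forms differ by an overall sign, so sloppy index placement silently flips signs.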
 
  • #20
Orodruin said:
First things first. The Lorentz transformation coefficients are not the elements of a tensor. They describe how different types of tensor components transform when you change the coordinate system.

In general, you will have for the Lorentz transformation of a 4-vector
$$
V^{\mu'} = \Lambda^{\mu'}_{\phantom\mu\mu} V^\mu,
$$
where ##\Lambda^{\mu'}_{\phantom\mu\mu} = \partial x'^{\mu'}/\partial x^\mu##. It is generally not the same as ##\partial x^\mu/\partial x'^{\mu'}##.
I strongly warn against this notation, which usually leads to confusion, and mathematically it makes no sense! It's among the few really bad habits some physicists have when writing textbooks ;-))): One should distinguish the components of a vector (or tensor of any rank) with respect to different bases with some "ornament" at the symbol and not at the index (or indices), i.e., one should write the transformation of contravariant vector components under a Lorentz transformation as
$$V^{\prime \mu}={\Lambda^{\mu}}_{\nu} V^{\nu}.$$
The notation in the latter part is correct, where you make the prime at the symbol, i.e., it's indeed fine to write
$${\Lambda^{\mu}}_{\nu} = \frac{\partial x^{\prime \mu}}{\partial x^{\nu}}.$$
Of course, it is of utmost importance, for any tensor or other object written with indices, to keep track of both the vertical position of the indices (i.e., distinguishing co- and contravariant objects) and their horizontal position. The only exception is symmetric 2nd-rank tensor components, where you can sloppily write ##S_{\mu}^{\nu}##, since in this case anyway
$${S_{\mu}}^{\nu}={S^{\nu}}_{\mu} \quad \text{(true only and really only for symmetric 2nd-rank tensors!)}$$
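The symmetric exception can be checked the same way as the antisymmetric case; a sketch (my illustration, with a generic symmetric ##S^{\mu\nu}##):

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])
rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4))
S = A + A.T                      # generic symmetric S^{mu nu}

S_up_dn = S @ eta                # S^mu_nu,  stored as [mu, nu]
S_dn_up = eta @ S                # S_nu^mu,  stored as [nu, mu]

print(np.allclose(S_up_dn, S_dn_up.T))  # True: only because S is symmetric
```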
 
  • #21
vanhees71 said:
I strongly warn against this notation, which usually leads to confusion, and it's mathematically not making any sense!
This might be one of the (rare) cases where we disagree! :biggrin:

I used to think this way until I thought about it thoroughly and decided to use the notation with adornment on indices only (apart from on coordinates). My argument for the notation is that, given a tensor ##T##, a fundamental fact which is often lost on students is that the tensor itself does not depend on the coordinates - only its components do. Thus, priming the symbol for the tensor (i.e., ##T'##) may give the impression that the tensor changes as the coordinates change. Instead, I prefer to leave the tensor symbol itself unadorned and instead adorn only the indices, which are what specifies the coordinate system. (You will note in my book that indices in primed systems are numbered ##1', 2', \ldots## instead of ##1, 2, \ldots## precisely for this reason.)

As another example, if using two coordinate systems on Euclidean space, one Cartesian (##xyz##) and one spherical (##r\theta\varphi##), I would call the components of the vector ##V## ##V^x, V^y, V^z## in the Cartesian system and ##V^r, V^\theta, V^\varphi## in the spherical system (i.e., without any additional adornments).
 
  • #22
Well, for the tensor as an invariant object, I'd use an own type of symbol. For print I prefer to set them in bold face, i.e.,
$$\boldsymbol{T}=T_{\mu \nu} \boldsymbol{e}^{\mu} \otimes \boldsymbol{e}^{\nu}=T_{\mu \nu}' \boldsymbol{e}^{\prime \mu} \otimes \boldsymbol{e}^{\prime \nu}.$$
I'd also never ever use the notation for your other example. I know, it's common in textbooks but I know also that it leads to common misunderstanding!
 
  • #23
Orodruin said:
Take the electromagnetic field tensor as an example. There is a difference between ##F^\mu_{\phantom\mu\nu}## and ##F^{\phantom\nu\mu}_{\nu}## as ##F^{\mu\nu}## is anti-symmetric. This ambiguity only appears when you have a metric to raise and lower indices, but you do have that in relativity.

Okay, but there is a sense in which the ambiguity is an artifact of using the same name for a tensor, ##F## in this case, after it has been contracted with ##g##. ##F^{\mu \nu}##, ##F^\nu{}_{\mu}##, and ##F_\mu{}^{\nu}## are really three different tensors.
 
  • #24
stevendaryl said:
Okay, but there is a sense in which the ambiguity is an artifact of using the same name for a tensor, ##F## in this case, after it has been contracted with ##g##. ##F^{\mu \nu}##, ##F^\nu{}_{\mu}##, and ##F_\mu{}^{\nu}## are really three different tensors.
I agree, but to use different notation to denote those different tensors would lead to horrific amounts of notation in relativity and other disciplines. Hence, for this purpose it is necessary to use index placement.
 
  • #25
Orodruin said:
This might be one of the (rare) cases where we disagree! :biggrin:

I used to think this way until I thought about it thoroughly and decided to use the notation with adornment on indices only (apart from on coordinates). My argumentation for the notation is that given a tensor ##T##, a fundamental fact which is often lost on students is that the tensor itself does not depend on the coordinates - only its components do. Thus, priming the symbol for the tensor (i.e., ##T'##) may give the impression that the tensor changes as the coordinates changes. Instead, I prefer to leave the tensor symbobl itself unadorned and instead adorn only the indices, which are what specifies the coordinate system. (You will note in my book that indices in primed systems are numbered ##1', 2', \ldots## instead of ##1, 2, \ldots## precisely for this reason.)

As another example, if using two coordinate systems on Euclidean space, one Cartesian (##xyz##) and one spherical (##r\theta\varphi##), I would call the components of the vector ##V## ##V^x, V^y, V^z## in the Cartesian system and ##V^r, V^\theta, V^\varphi## in the spherical system (i.e., without any additional adornments).

On the topic of peeves about vector notation, what grates on me is the covariant derivative written as ##\nabla_\mu V^\nu##. It really should be ##(\nabla_\mu V)^\nu##, although that's ugly. Some conventions use ##V^\nu## to mean the vector, not a component of the vector, but I find that confusing.
 
  • #26
stevendaryl said:
On the topic of peeves about vector notation, what grates on me is the covariant derivative written as ##\nabla_\mu V^\nu##. It really should be ##(\nabla_\mu V)^\nu##, although that's ugly. Some conventions use ##V^\nu## to mean the vector, not a component of the vector, but I find that confusing.
Well, there are two things you might mean by ##\nabla_\mu V^\nu##, either ##(\nabla_\mu V)^\nu## or ##\nabla_\mu (V^\nu)##. It is rather natural to choose a convention such that the former is what is intended since the latter can just be written ##\partial_\mu V^\nu##.
 
  • #27
Orodruin said:
I agree, but to use different notation to denote those different tensors would lead to horrific amounts of notation in relativity and other disciplines. Hence, for this purpose it is necessary to use index placement.

That's actually the reason notation is so bad so often. You can always make things unambiguous by making more careful distinctions, but at the cost of making expressions uglier and more complicated. So you have the choice of simple notation or precise notation---you can't have both.
 
  • #28
I would argue that the notation is precise as long as you have defined exactly what you mean by it. If I define ##\nabla_\mu V^\nu## to mean ##(\nabla_\mu V)^\nu##, then there is no problem and I can just use ##\partial_\mu V^\nu## for the other quantity (or ##\nabla_\mu(V^\nu)## if I need to specify). Likewise I can define the distinction between the components of the different tensors through index placement and it should be clear what is intended.
 
  • #29
Orodruin said:
I would argue that the notation is precise as long as you have defined exactly what you mean by it. If I define ##\nabla_\mu V^\nu## to mean ##(\nabla_\mu V)^\nu##, then there is no problem and I can just use ##\partial_\mu V^\nu## for the other quantity (or ##\nabla_\mu(V^\nu)## if I need to specify). Likewise I can define the distinction between the components of the different tensors through index placement and it should be clear what is intended.

I suppose. But for people just learning about tensors, the need for covariant derivatives rather than partial derivatives seems mysterious. But I think part of the mystery is due to the notation. It looks like ##\partial_\mu## and ##\nabla_\mu## are two different operations that can be performed on the vector component ##V^\nu##. But it's a little clearer (to me, anyway) if you write
$$
(\nabla_\mu V)^\nu
$$
because then you can substitute equals for equals: ##V = e_\lambda V^\lambda##. The expression then becomes
$$
(\nabla_\mu V)^\nu = (\nabla_\mu (e_\lambda V^\lambda))^\nu.
$$
Then, using the Leibniz rule for derivatives,
$$
(\nabla_\mu V)^\nu = (\nabla_\mu e_\lambda)^\nu V^\lambda + e_\lambda (\nabla_\mu V^\lambda).
$$
The noobie doesn't know how to evaluate the first expression, but it's also not clear to him that it should be zero. Then we can, for now, just say that by definition
$$
(\nabla_\mu e_\lambda)^\nu \equiv \Gamma^\nu_{\mu\lambda}.
$$
 
  • #30
stevendaryl said:
Then using Leibniz rules for derivatives,

$$(\nabla_\mu V)^\nu = (\nabla_\mu e_\lambda)^\nu V^\lambda + e_\lambda (\nabla_\mu V^\lambda)$$

Just to point out that if you really want to be clear here, you should write the second term as ##e_\lambda \nabla_\mu(V^\lambda)##. If not, you again end up with a presupposition about how ##\nabla_\mu V^\lambda## should be interpreted.

When you start with tensors in a Euclidean space, before you introduce any general forms of connection, it is natural to just define ##\nabla_\mu V^\nu## as a placeholder for ##\partial_\mu V^\nu + V^\lambda (\vec e^\nu \cdot \partial_\mu \vec e_\lambda)##, as the partial derivatives of ##\vec e_\lambda## can always be related to an underlying Cartesian system. Of course, this imposes a particular connection on the space, with the Cartesian basis vectors being considered "constant". However, I find that it is the "softest" way of introducing students to Christoffel symbols, because they generally understand what the derivatives of the basis vectors mean and it has a direct geometrical interpretation in Euclidean space.
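The recipe ##\Gamma^\nu_{\mu\lambda} = \vec e^\nu \cdot \partial_\mu \vec e_\lambda## can be carried out explicitly; a sketch (my choice of example, polar coordinates on the Euclidean plane, using sympy):

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
coords = (r, th)
X = sp.Matrix([r*sp.cos(th), r*sp.sin(th)])  # Cartesian position in polar coords

# Coordinate basis vectors e_mu = dX/dx^mu and the dual basis e^nu
e = [X.diff(c) for c in coords]
J = sp.Matrix.hstack(*e)
e_dual = J.inv()                             # row nu holds the components of e^nu

def Gamma(nu, mu, lam):
    """Gamma^nu_{mu lambda} = e^nu . d_mu e_lambda."""
    return sp.simplify(e_dual[nu, :].dot(e[lam].diff(coords[mu])))

print(Gamma(0, 1, 1))   # Gamma^r_{theta theta} = -r
print(Gamma(1, 0, 1))   # Gamma^theta_{r theta} = 1/r
```

The nonzero results reproduce the familiar polar-coordinate Christoffel symbols, entirely from derivatives of the basis vectors.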
 
