# Tensor Confusion: Lambda & Partial Derivatives

In summary, the thread discusses the relation $\Lambda^{\mu}_{\hspace{3 mm}\nu} = \partial_{\nu}x'^{\mu} = \frac{\partial x'^{\mu}}{\partial x^{\nu}}$ and its counterpart $\Lambda_{\mu}^{\hspace{3 mm}\nu} = \partial^{\nu}x'_{\mu} = \frac{\partial x'_{\mu}}{\partial x_{\nu}}$, the fact that $\Lambda_{\mu}^{\hspace{3 mm}\nu}$ gives the components of the inverse of $\Lambda^{\nu}_{\hspace{3 mm}\mu}$, the contractions that yield the Kronecker delta, and the conventions governing the horizontal ordering of indices and what swapping that order means.
barnflakes
If $\Lambda^{\mu}_{\hspace{3 mm}\nu} = \partial_{\nu}x'^{\mu} = \frac{\partial x'^{\mu}}{\partial x^{\nu}}$

does that mean $\Lambda_{\mu}^{\hspace{3 mm}\nu} = \partial^{\nu}x'_{\mu} = \frac{\partial x'_{\mu}}{\partial x_{\nu}}$ ?

Doesn't

$$\Lambda^{\mu}_{\hspace{3 mm}\nu} = \partial_{\nu}x'^{\mu} = \frac{\partial x'^{\mu}}{\partial x^{\nu}}$$

mean that

$$\Lambda_{\mu}^{\hspace{3 mm}\nu} = \partial^{\nu}x'_{\mu} = \frac{\partial x'_{\mu}}{\partial x_{\nu}}?$$

Yes George that's what I meant to write, sorry about that. Is that correct?

Does that also mean that $$\Lambda_{\mu}^{\hspace{3 mm}\nu} = \Lambda^{\nu}_{\hspace{3 mm} \mu}$$ ?

barnflakes said:
Yes George that's what I meant to write, sorry about that. Is that correct?

I think so.

barnflakes said:
Does that also mean that $$\Lambda_{\mu}^{\hspace{3 mm}\nu} = \Lambda^{\nu}_{\hspace{3 mm} \mu}$$ ?

No.

$$\Lambda_{\mu}^{\hspace{3 mm}\nu} = \eta_{\mu \alpha} \Lambda^{\alpha \nu} = \eta_{\mu \alpha} \eta^{\beta \nu} \Lambda^\alpha{}_\beta$$

So $\Lambda_{\mu}^{\hspace{3 mm}\nu} = (\Lambda^{-1})^\nu{}_\mu$? I.e. $$\Lambda_{\mu}^{\hspace{3 mm}\nu}$$ is the inverse of $$\Lambda^{\nu}_{\hspace{3 mm} \mu}$$ ?

So if I wanted to multiply two Lambdas together, it's only in certain cases that we get the kronecker delta?

For instance: $$\Lambda^{\mu}_{\hspace{3 mm} \alpha} \Lambda_{\nu}^{\hspace{3 mm}\alpha}= \delta^\mu_\nu$$

and $$\Lambda^{\mu}_{\hspace{3 mm} \alpha} \Lambda_{\mu}^{\hspace{3 mm}\nu}= \delta^\nu_\alpha$$

Is that right?

That's right. See this post for a little bit more.
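These contractions can be checked numerically. Here is a quick sanity check (a sketch using numpy; the boost along $x$ is my arbitrary choice of example Lorentz transformation, and all variable names are mine):

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # Minkowski metric, signature (-,+,+,+)
eta_inv = np.linalg.inv(eta)           # equals eta for this signature

beta = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)
# Components Lambda^mu_nu of a boost along x: row = mu, column = nu
L = np.array([[gamma, -gamma * beta, 0, 0],
              [-gamma * beta, gamma, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]])

# Lambda_mu^nu = eta_{mu alpha} eta^{beta nu} Lambda^alpha_beta,
# i.e. the matrix eta @ L @ eta^{-1}
L_lowered = eta @ L @ eta_inv

# Lambda^mu_alpha Lambda_nu^alpha = delta^mu_nu (sum over the second index of each)
check = np.einsum('ma,na->mn', L, L_lowered)
print(np.allclose(check, np.eye(4)))   # True
```

The `einsum` string mirrors the index notation directly: repeated letter `a` is summed, free letters `m`, `n` label the Kronecker delta.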

George Jones said:
I think so. ... No.

$$\Lambda_{\mu}^{\hspace{3 mm}\nu} = \eta_{\mu \alpha} \Lambda^{\alpha \nu} = \eta_{\mu \alpha} \eta^{\beta \nu} \Lambda^\alpha{}_\beta$$

OK some more conceptual problems I'm having:

$$\Lambda_{\mu}^{\hspace{3 mm}\nu} = \eta_{\mu \alpha} \Lambda^{\alpha \nu} = \eta_{\mu \alpha} \eta^{\beta \nu} \Lambda^\alpha{}_\beta$$

But matrix multiplication is associative, so $$(\eta_{\mu \alpha} \eta^{\beta \nu}) \Lambda^\alpha{}_\beta = \eta_{\mu \alpha} (\eta^{\beta \nu} \Lambda^\alpha{}_\beta)$$ but surely $$(\eta_{\mu \alpha} \eta^{\beta \nu})$$ is equal to the identity matrix?

In

$$\eta_{\mu \alpha} \eta^{\beta \nu}$$

you basically take the tensor product of an n-dimensional covariant metric and a contravariant metric, and you end up with a type (2,2) tensor. Normally this is not represented by an n×n matrix (though you could adopt a convention in which you build an n²×n² matrix, 16×16 for n = 4, by multiplying every element of the first matrix by the second matrix, but that's not what is useful here).

If you took a contraction then of course you could write things down in terms of simple matrix multiplication. But here you don't.

barnflakes said:
...but surely $$(\eta_{\mu \alpha} \eta^{\beta \nu})$$ is equal to the identity matrix?
It's not. It's the product of one component of $\eta$ and one component of $\eta^{-1}$. Recall that the definition of matrix multiplication is $(AB)^i_j=A^i_k B^k_j$ and that the right-hand side actually means $\sum_k A^i_k B^k_j$. There is no summation in $$\eta_{\mu \alpha} \eta^{\beta \nu}$$.

Did you understand my calculation of the components of $\Lambda^{-1}$ in the thread I linked to?
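Fredrik's point, that $\eta_{\mu \alpha} \eta^{\beta \nu}$ involves no summation and is therefore a four-index object rather than a matrix product, can be illustrated with numpy's `einsum` (a sketch; the index letters are arbitrary):

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])
eta_inv = np.linalg.inv(eta)

# No repeated index: a type (2,2) object with four free indices, NOT a matrix product
outer = np.einsum('ma,bn->mabn', eta, eta_inv)
print(outer.shape)  # (4, 4, 4, 4)

# Only when an index IS repeated (contracted) do we get the identity:
# eta_{mu alpha} eta^{alpha nu} = delta_mu^nu
contracted = np.einsum('ma,an->mn', eta, eta_inv)
print(np.allclose(contracted, np.eye(4)))  # True
```

The first `einsum` has no repeated letter, so nothing is summed; the second repeats `a`, which is exactly the summation convention in code.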

Fredrik said:
It's not. It's the product of one component of $\eta$ and one component of $\eta^{-1}$. Recall that the definition of matrix multiplication is $(AB)^i_j=A^i_k B^k_j$ and that the right-hand side actually means $\sum_k A^i_k B^k_j$. There is no summation in $$\eta_{\mu \alpha} \eta^{\beta \nu}$$.

And the other reason why it's not an identity matrix is because we're working in Minkowski space with signature (-,+,+,+), and therefore $\eta$'s are not identity matrices.

hamster143 said:
And the other reason why it's not an identity matrix is because we're working in Minkowski space with signature (-,+,+,+), and therefore $\eta$'s are not identity matrices.
You're right that they're not, but the result would still be (the components of) an identity matrix if the indices had matched. See the post I linked to.

George Jones said:
$$\Lambda_{\mu}^{\hspace{3 mm}\nu} = \eta_{\mu \alpha} \Lambda^{\alpha \nu} = \eta_{\mu \alpha} \eta^{\beta \nu} \Lambda^\alpha{}_\beta$$

Could we write this in matrix notation as

$$\left [ \Lambda_{\mu}^{\enspace \nu} \right ] = \eta \left [ \Lambda^{\mu}_{\enspace \nu} \right ] \eta^{-1} = \eta \Lambda \eta = \left [ \Lambda^{\mu}_{\enspace \nu} \right ]^{-1}$$

And am I right in thinking this equation only applies to boosts? The more general equation including boosts and rotations being:

$$\eta \Lambda^{T} \eta = \Lambda^{-1}$$

George's equations (one for each value of the indices) are just the components of a matrix equation that holds for all Lorentz transformations. See the post I linked to in #6.

How's this for index juggling?

$$\Lambda^{\mu}_{\enspace\rho} \left ( \Lambda^{-1} \right )^{\rho}_{\enspace\nu} = \delta^{\mu}_{\nu}$$

And substituting your equation for the components of $$\Lambda^{-1}$$, from post #2 of the thread you linked to:

$$\Lambda^{\mu}_{\enspace\rho} \eta^{\thinspace \rho\tau} \Lambda^{\sigma}_{\enspace\tau} \eta_{\sigma\nu} = \delta^{\mu}_{\nu}$$

$$\Lambda^{\mu}_{\enspace\rho} \Lambda_{\nu}^{\enspace\rho} = \delta^{\mu}_{\nu}$$

Or in matrix format:

$$\Lambda \eta^{-1} \Lambda^{T} \eta = I \Leftrightarrow \Lambda^{-1} = \eta^{-1} \Lambda^{T} \eta$$
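The matrix identity above can be verified numerically for any particular Lorentz transformation (a sketch assuming numpy; the boost velocity is an arbitrary choice of mine):

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])

beta = 0.8
gamma = 1.0 / np.sqrt(1.0 - beta**2)
# A boost along x as a sample Lorentz transformation
L = np.array([[gamma, -gamma * beta, 0, 0],
              [-gamma * beta, gamma, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]])

# Lambda^{-1} = eta^{-1} Lambda^T eta
L_inv = np.linalg.inv(eta) @ L.T @ eta
print(np.allclose(L @ L_inv, np.eye(4)))  # True
print(np.allclose(L_inv, np.linalg.inv(L)))  # True
```

This checks both that $\eta^{-1}\Lambda^{T}\eta$ is a left/right inverse and that it agrees with the numerically computed inverse.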

I suppose what this shows is that the rules of index manipulation imply the convention that when there's a pair of indices--one up, and one down--then swapping their horizontal order (moving the leftmost index to the right, and the rightmost index to the left) inverts a Lorentz transformation. Does swapping the horizontal order of indices indicate inversion in general, or does this only work for a Lorentz transformation?

$$\left [ T^{\mu}_{\enspace\nu} \right ]^{-1} = \left [ T_{\nu}^{\enspace\mu} \right ]$$

And if so, since the indices are arbitrary:

$$\left [ T^{\mu}_{\enspace\nu} \right ]^{-1} = \left [ T_{\alpha}^{\enspace\beta} \right ]$$ ?

In #18 of this thread https://www.physicsforums.com/showthread.php?t=353536&page=2 Haushofer concludes with a formula similar to George's. In fact, I think it's equivalent to George's, except that $$T$$ is used instead of $$\Lambda$$. If I was going to try to write this in matrix notation, I'd write:

$$\left [ T^{\mu}_{\enspace\nu} \right ] = \eta^{-1} \left [ T_{\alpha}^{\enspace\beta} \right ]^{T} \eta$$

Is that correct? Then if $$T$$ was a Lorentz transformation, I guess we'd know that $$\left [ T^{\mu}_{\enspace\nu} \right ]$$ is the inverse of $$\left [ T_{\alpha}^{\enspace\beta} \right ]$$. But since not everything is a Lorentz transformation, I'm guessing maybe it's not true in general that

$$\left [ T^{\mu}_{\enspace\nu} \right ]^{-1} = \left [ T_{\alpha}^{\enspace\beta} \right ]$$

Rasalhague said:
How's this for index juggling?
It's all good.

Rasalhague said:
$$\Lambda \eta^{-1} \Lambda^{T} \eta = I \Leftrightarrow \Lambda^{-1} = \eta^{-1} \Lambda^{T} \eta$$
That's right. I like to take $\Lambda^T\eta\Lambda=\eta$ as the definition of a Lorentz transformation. If we multiply this with $\eta^{-1}$ from the left, we get $\eta^{-1}\Lambda^T\eta\Lambda=I$, which implies that $\Lambda^{-1}$ is what you said.

To go from any of the nice and simple matrix equations to the corresponding result with lots of mostly pointless and annoying indices, you simply use the definition of matrix multiplication stated above, the summation convention, and the notational convention described in the other thread.

Rasalhague said:
I suppose what this shows is that the rules of index manipulation imply the convention that when there's a pair of indices--one up, and one down--then swapping their horizontal order (moving the leftmost index to the right, and the rightmost index to the left) inverts a Lorentz transformation. Does swapping the horizontal order of indices indicate inversion in general, or does this only work for a Lorentz transformation?
Only for Lorentz transformations, because it follows from the formula for $\Lambda^{-1}$ that you found (which only holds when $\Lambda$ is a Lorentz transformation), and those other things I just mentioned.

Rasalhague said:
$$\left [ T^{\mu}_{\enspace\nu} \right ]^{-1} = \left [ T_{\alpha}^{\enspace\beta} \right ]$$ ?
What you need to understand is that while $T^\alpha{}_\beta$ is defined as the component of T on row $\alpha$, column $\beta$, $T_\alpha{}^\beta$ is defined as the component on row $\alpha$, column $\beta$ of $\eta T\eta^{-1}$. (This is just the convention to use the metric to raise and lower indices). So your equation says that $T^{-1}=\eta T\eta^{-1}$, or maybe that $T^{-1}=(\eta T\eta^{-1})^T=\eta^{-1}T^T\eta$. The second alternative makes more sense (since it would be true for Lorentz transformations), so that suggests that if we use that bracket notation to indicate "the matrix with these components", we should actually interpret it as "the transpose of the matrix with these components" when the indices are "first one downstairs, second one upstairs". (A better option is probably to avoid that notation when you can).
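Fredrik's warning about the bracket notation can be made concrete: for a generic (non-Lorentz) matrix $T$, the matrix $\eta T\eta^{-1}$ is not symmetric, so reading $[T_\alpha{}^\beta]$ as that matrix or as its transpose gives genuinely different answers (a sketch; the choice of $T$ is arbitrary):

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])
eta_inv = np.linalg.inv(eta)

T = np.arange(16.0).reshape(4, 4)  # a generic, non-symmetric, non-Lorentz matrix

# The array of components T_alpha^beta (row alpha, column beta)
A = eta @ T @ eta_inv
# The two possible readings of the bracket notation disagree:
print(np.allclose(A, A.T))  # False
```

With a diagonal metric, `A[i, j]` is just `T[i, j]` with some signs flipped, so `A` is symmetric only when `T` is; the ambiguity is real for generic matrices.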

Rasalhague said:
In #18 of this thread https://www.physicsforums.com/showthread.php?t=353536&page=2 Haushofer concludes with a formula similar to George's.
His formula is just an equivalent way to define what we mean by $T_\alpha{}^\beta$.

Fredrik said:
It's all good.

Phew! Thanks, that's a relief to know.

Fredrik said:
What you need to understand is that while $T^\alpha{}_\beta$ is defined as the component of T on row $\alpha$, column $\beta$, $T_\alpha{}^\beta$ is defined as the component on row $\alpha$, column $\beta$ of $\eta T\eta^{-1}$. (This is just the convention to use the metric to raise and lower indices).

Ah, another source of confusion... This differs from the convention explained by Ruslan Sharipov in his Quick Introduction to Tensor Analysis, which I'd assumed was the rule everyone followed:

"For any double indexed array with indices on the same level (both upper or both lower) the first index is a row number, while the second index is a column number. If indices are on different levels (one upper and one lower), then the upper index is a row number, while lower one is a column number."

I gather that some people follow a convention whereby upper indices are always written first, $$T^{\alpha}_{\enspace\beta}$$, or where an arbitrary type-(1,1) tensor is written $$T^{\alpha}_{\beta}$$ (this is what Sharipov does), and only the order of indices on the same level as each other is significant, whereas others use a convention whereby changing the horizontal order of a pair of indices on a type-(1,1) tensor does make a difference (indicating inversion of a Lorentz transformation, and I don't know what--if anything--it indicates more generally). So maybe Sharipov's rule shouldn't be applied to the system of index manipulation in which $$T^{\alpha}_{\enspace\beta}$$ doesn't necessarily equal $$T_{\beta}^{\enspace\alpha}$$.

Rasalhague said:
I gather that some people follow a convention whereby upper indices are always written first, $$T^{\alpha}_{\enspace\beta}$$, or where an arbitrary type-(1,1) tensor is written $$T^{\alpha}_{\beta}$$ (this is what Sharipov does), and only the order of indices on the same level as each other is significant, whereas others use a convention whereby changing the horizontal order of a pair of indices on a type-(1,1) tensor does make a difference (indicating inversion of a Lorentz transformation, and I don't know what--if anything--it indicates more generally). So maybe Sharipov's rule shouldn't be applied to the system of index manipulation in which $$T^{\alpha}_{\enspace\beta}$$ doesn't necessarily equal $$T_{\beta}^{\enspace\alpha}$$.

In GR, if you raise and lower indices, then the ordering, including leaving appropriate spaces for upstairs and downstairs indices matters. In SR, there are some tricks where you don't have to keep track of this, because of the fixed background, and restriction to Lorentz inertial frames, but I don't remember the rules off the top of my head.

Rasalhague said:
Ah, another source of confusion... This differs from the convention explained by Ruslan Sharipov in his Quick Introduction to Tensor Analysis, which I'd assumed was the rule everyone followed:

"For any double indexed array with indices on the same level (both upper or both lower) the first index is a row number, while the second index is a column number. If indices are on different levels (one upper and one lower), then the upper index is a row number, while lower one is a column number."
I don't know what "everyone" is using, but his convention does make sense. It seems that if we use his convention, we can use the [] notation consistently regardless of whether the left or the right index is upstairs. Either way we have $$T_\alpha{}^\beta=\eta_{\alpha\gamma} T^\gamma{}_\delta\,\eta^{\delta\beta}$$. The question is, do we want to interpret that as the components of $\eta T\eta^{-1}$ or as the components of the transpose of that?

Adding to what atyy said, $$T_\alpha{}^\beta$$ would be the result of having a tensor $T:V\times V^*\rightarrow\mathbb R$ act on basis vectors, and $$S^\alpha{}_\beta$$ would be the result of having a tensor $S:V^*\times V\rightarrow\mathbb R$ act on basis vectors. Here V is a vector space (usually the tangent space at some point of a manifold) and V* its dual space. So the positions of the indices determine what type of tensor we're dealing with. In SR, there's no reason to even think about tensors (at least not in a situation I can think of right now), so I would really prefer to just write components of matrices as $A^\mu{}_\nu$ or $A_{\mu\nu}$. The notational convention we've been discussing in this thread just makes everything more complicated without having any significant benefits. It ensures that we never have to write $^{-1}$ or $^T$ on a Lorentz transformation matrix, but I think that's it. I really don't get why so many (all?) authors choose to write equations like $$\Lambda^T\eta\Lambda=\eta$$ in component form.

Sean Carroll, in his GR lecture notes, ch. 1, p. 10, writes, breaking another of Sharipov's rules (that indices should match at the same height on opposite sides of an equation),

We will [...] introduce a somewhat subtle notation by using the same symbol for both matrices [a Lorentz transformation and its inverse], just with primed and unprimed indices adjusted. That is,

$$\left(\Lambda^{-1} \right)^{\nu'}_{\enspace \mu} = \Lambda_{\nu'}^{\enspace \mu}$$

or

$$\Lambda_{\nu'}^{\enspace\mu} \Lambda^{\sigma'}_{\enspace\mu} = \delta^{\sigma'}_{\nu'} \qquad \Lambda_{\nu'}^{\enspace\mu} \Lambda^{\nu'}_{\enspace\rho} = \delta^{\mu}_{\rho}$$

(Note that Schutz uses a different convention, always arranging the two indices northwest/southeast; the important thing is where the primes go.)

http://preposterousuniverse.com/grnotes/

I haven't seen Schutz's First Course in General Relativity, so I don't know any more about that, but in Blandford and Thorne's Applications of Classical Physics, 1.7.2, where they introduce Lorentz tramsformations, they write

$$L^{\overline{\mu}}_{\enspace \alpha} L^{\alpha}_{\enspace \overline{\nu}} = \delta^{\overline{\mu}}_{\enspace \overline{\nu}} \qquad L^{\alpha}_{\enspace \overline{\mu}} L^{\overline{\nu}}_{\enspace \beta} = \delta^{\alpha}_{\enspace \beta}$$

Notice the up/down placement of indices on the elements of the transformation matrices: the first index is always up, and the second is always down.

Perhaps this is similar to Schutz's notation. Is the role of the left-right ordering, in other people's notation, fulfilled here in Blandford and Thorne's notation by the position of the bar, or would left-right ordering still be needed for a more general treatment?

In other sources I've looked at, such as Bowen and Wang's Introduction to Vectors and Tensors, tensors are just said to be of type, or valency, (p,q), p-times contravariant and q-times covariant, requiring p up indices and q down indices:

$$T : V^{*}_{1} \times ... \times V^{*}_{p} \times V_{1} \times ... \times V_{q} \to \mathbb{R}$$

...with up and down indices ordered separately. So apparently it's more complicated than I realized. Is there any way of explaining or hinting at why leaving spaces for up and down indices becomes important in GR to someone just starting out and battling with the basics of terminology, definitions and notational conventions?

Rasalhague said:
In other sources I've looked at, such as Bowen and Wang's Introduction to Vectors and Tensors, tensors are just said to be of type, or valency, (p,q), p-times contravariant and q-times covariant, requiring p up indices and q down indices:

$$T : V^{*}_{1} \times ... \times V^{*}_{p} \times V_{1} \times ... \times V_{q} \to \mathbb{R}$$

...with up and down indices ordered separately. So apparently it's more complicated than I realized. Is there any way of explaining or hinting at why leaving spaces for up and down indices becomes important in GR to someone just starting out and battling with the basics of terminology, definitions and notational conventions?

A tensor eats a bunch of one forms and tangent vectors and spits out a number. As you can see from the above definition, it matters which one forms and vectors go into which mouth of a tensor, so the order of the up and down indices matter. It is a good idea to keep in mind that one forms and vectors are separate objects.

However, when there is a metric tensor, each one form can be associated with a vector as follows. A one form eats a vector and spits out a number. The metric tensor eats two vectors and spits out a number. So if a metric tensor eats one vector, it's still hungry and can eat another vector, so the half-full metric tensor is a one form. This is defined as the one form associated with the vector that the half-full metric tensor has just eaten, and is denoted by the same symbol as that vector, but with its index lowered or raised (I don't remember which one). So the combined requirement of keeping track of which mouth of a tensor eats what, and the ability to raise or lower indices means that we have to keep track of the combined order of up and down indices.
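The "half-full metric tensor" picture corresponds to a simple contraction: lowering the index of a vector is just $V_\mu = \eta_{\mu\nu}V^\nu$ (a minimal numpy sketch; the sample components are made up):

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])  # Minkowski metric
V = np.array([2.0, 1.0, 0.0, 3.0])    # contravariant components V^mu

# Feed the metric one vector; what remains is the associated one-form:
# V_mu = eta_{mu nu} V^nu
V_low = np.einsum('mn,n->m', eta, V)
print(V_low)  # [-2.  1.  0.  3.]
```

With this diagonal metric, lowering an index just flips the sign of the time component, which is why the distinction is easy to miss in SR but essential in GR.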

Aha, so, if I've understood this, it's actually the existence of a metric tensor, which is symmetric and lets us raise and lower indices like this

$$g_{\alpha \beta} V^{\alpha} U^{\beta} = V_{\beta} U^{\beta} = V^{\alpha} U_{\alpha} = g_{\beta \alpha} V^{\alpha} U^{\beta},$$

that makes it necessary to keep track of the order of upper indices relative to lower indices because, for example

$$g_{\alpha \beta} g_{\gamma \delta} T^{\beta}{}_{\epsilon}{}^{\gamma}{}_{\zeta} = T_{\alpha \epsilon \delta \zeta}$$

won't in general be equal to

$$g_{\alpha \beta} g_{\gamma \delta} T^{\beta \gamma}_{\enspace \enspace \epsilon \zeta} = T_{\alpha \delta \epsilon \zeta}.$$
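The point that slot order matters once indices are raised and lowered can also be seen numerically: lower the two upper indices of a four-index array of components, and the result changes if the slots are reordered (a sketch; a diagonal Minkowski metric and seeded random components are my assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

# Components T^beta_epsilon^gamma_zeta, with the slot order (up, down, up, down) fixed
T = rng.standard_normal((4, 4, 4, 4))

# Lower both upper indices, preserving slot order: T_{alpha epsilon delta zeta}
T_low = np.einsum('ab,gd,begz->aedz', eta, eta, T)

# Reordering the middle slots gives a different array for a generic tensor
print(np.allclose(T_low, T_low.transpose(0, 2, 1, 3)))  # False
```

Only a tensor with extra symmetry among its slots would survive the reordering unchanged, which is exactly why the horizontal position of each index has to be tracked.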

## 1. What is Tensor Confusion?

Tensor Confusion refers to the confusion that can arise when working with tensors in mathematics or computer science. Tensors are multidimensional arrays that are commonly used to represent data in fields such as machine learning and physics. However, their complex nature can make them difficult to understand and work with, leading to confusion.

## 2. What is Lambda in relation to Tensor Confusion?

In this thread, Lambda ($\Lambda$) denotes a Lorentz transformation matrix, not a partial derivative. Its components can, however, be written as partial derivatives of the primed coordinates with respect to the unprimed ones, $\Lambda^{\mu}_{\hspace{3 mm}\nu} = \partial x'^{\mu}/\partial x^{\nu}$, and keeping track of the index placement in expressions like this is a major source of the confusion discussed above.

## 3. How do partial derivatives relate to Tensor Confusion?

Partial derivatives are an important tool in understanding and working with tensors. They measure how a quantity changes when one coordinate is varied while the others are held fixed, and they supply the components of coordinate transformations, as in $\Lambda^{\mu}_{\hspace{3 mm}\nu} = \partial x'^{\mu}/\partial x^{\nu}$. This is useful in many applications, from changing frames in special relativity to transforming tensor components between coordinate systems.

## 4. What are some common sources of Tensor Confusion?

There are several common sources of Tensor Confusion, including the complex nature of tensors, the use of abstract mathematical notation, and the use of tensors in different fields with varying definitions and conventions. Additionally, the use of partial derivatives and other mathematical concepts can also contribute to confusion when working with tensors.

## 5. How can one overcome Tensor Confusion?

One way to overcome Tensor Confusion is to gain a deeper understanding of tensors and their properties. This can be achieved through studying tensor calculus, practicing with various tensor operations, and learning from examples and tutorials. Additionally, seeking help from experts or joining online communities focused on tensors can also be helpful in overcoming confusion.
