# On the index notation used in Lorentz transformations

1. Apr 2, 2014

### Rococo

I understand how contravariant 4-vectors transform under a Lorentz transformation, that is:

$x'^\mu = \Lambda^\mu{}_\nu x^\nu$ [1]

and how covariant 4-vectors transform:

$x'_\mu = (\Lambda^{-1})^\nu{}_\mu x_\nu$. [2]

Now, I have come across the following relations:

$\Lambda^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\nu} = \frac{\partial x_\nu}{\partial x'_\mu}$

and

$(\Lambda^{-1})^\nu{}_\mu=\frac{\partial x'_\mu}{\partial x_\nu}=\frac{\partial x^\nu}{\partial x'^\mu}$.

It is clear to me from [1] that $\Lambda^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\nu}$. But I cannot see how this is then also equal to $\frac{\partial x_\nu}{\partial x'_\mu}$.

Similarly, it is easy to see from [2] that $(\Lambda^{-1})^\nu{}_\mu=\frac{\partial x'_\mu}{\partial x_\nu}$. But I cannot understand how this is also equal to $\frac{\partial x^\nu}{\partial x'^\mu}$.

Any help on understanding how the indices are manipulated would be appreciated.

2. Apr 2, 2014

### Matterwave

The Lorentz transformations form a restricted group of transformations; they are not as general as the arbitrary coordinate transformations that are possible.

Therefore, in general, the Lambda matrices will not equal the Jacobians you have written, unless the coordinate transformations are restricted to be Lorentz transformations in the first place.

That being said, a general (contravariant) vector will transform under a coordinate transformation as (Einstein summation implied):

$$a'^\mu=\frac{\partial x'^\mu}{\partial x^\nu}a^\nu$$

A one form (covariant vector) will transform as:

$$a'_\mu=\frac{\partial x^\nu}{\partial x'^\mu}a_\nu$$

One can think of these as the defining properties of vectors and one forms (but this is an older way of thinking).

Now, if we restrict ourselves to the Lorentz transformation, then the first equation becomes, by definition:

$$a'^\mu=(\Lambda)^\mu_\nu a^\nu$$

And it is now easy to see:

$$(\Lambda)^\mu_\nu=\frac{\partial x'^\mu}{\partial x^\nu}$$

From simple differential calculus (the chain rule) one can see:

$$\frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\nu}=\delta^\mu_\nu$$

Where that's just the Kronecker-delta (identity matrix). Therefore:

$$\frac{\partial x'^\mu}{\partial x^\nu}=(\frac{\partial x^\mu}{\partial x'^\nu})^{-1}$$

Where by inverse I mean the matrix inverse. Therefore, by our transformation rule above, we can see that if we limited ourselves to Lorentz transformations:

$$a'_\mu=(\Lambda^{-1})^\nu_\mu a_\nu$$

And immediately it is apparent:

$$(\Lambda^{-1})^\nu_\mu=\frac{\partial x^\nu}{\partial x'^\mu}$$

This is the result (from [2]) you are unsure of.

I'm not sure of the utility of using the lower-index $x$'s, since the Cartesian coordinates in flat spacetime are contravariant vectors, not covariant vectors.
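The restriction to Lorentz transformations can be sketched numerically. Here is a minimal example of my own (assuming units with $c = 1$ and a boost along $x$ with $\beta = 0.6$): the Jacobian $\partial x'^\mu/\partial x^\nu$ of a Lorentz transformation is just the constant matrix $\Lambda$ itself, and $\partial x^\nu/\partial x'^\mu$ is its matrix inverse.

```python
# Hedged sketch: a beta = 0.6 boost along x, units with c = 1.
import numpy as np

beta = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)

# Lambda^mu_nu: row mu, column nu
Lam = np.array([
    [gamma,        -gamma * beta, 0.0, 0.0],
    [-gamma * beta, gamma,        0.0, 0.0],
    [0.0,           0.0,          1.0, 0.0],
    [0.0,           0.0,          0.0, 1.0],
])
Lam_inv = np.linalg.inv(Lam)

# dx'^mu/dx^alpha * dx^alpha/dx'^nu = delta^mu_nu (Kronecker delta)
assert np.allclose(Lam @ Lam_inv, np.eye(4))

# For a boost, the inverse is simply the boost with -beta.
Lam_minus_beta = np.array([
    [gamma,        gamma * beta, 0.0, 0.0],
    [gamma * beta, gamma,        0.0, 0.0],
    [0.0,          0.0,          1.0, 0.0],
    [0.0,          0.0,          0.0, 1.0],
])
assert np.allclose(Lam_inv, Lam_minus_beta)
```

Because the transformation is linear, the Jacobian is constant and equal to the boost matrix everywhere.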

Last edited: Apr 2, 2014
3. Apr 2, 2014

### Rococo

Thanks for the response.

I'm having difficulty seeing how you went from the equation

$$\frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\nu}=\delta^\mu_\nu$$

to:

$$\frac{\partial x'^\mu}{\partial x^\nu}=(\frac{\partial x^\mu}{\partial x'^\nu})^{-1}$$

If I multiply both sides by $\left(\frac{\partial x^\alpha}{\partial x'^\nu}\right)^{-1}$ like so:

$$\frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\nu}\left(\frac{\partial x^\alpha}{\partial x'^\nu}\right)^{-1}=\delta^\mu_\nu \left(\frac{\partial x^\alpha}{\partial x'^\nu}\right)^{-1}$$

this gives me:

$$\frac{\partial x'^\mu}{\partial x^\alpha}=\left(\frac{\partial x^\alpha}{\partial x'^\nu}\right)^{-1}$$

which is different to what you have, so I must be going wrong somewhere.

4. Apr 2, 2014

### Matterwave

All I'm saying is that the matrix with the prime below is simply the matrix inverse of the matrix with the prime above. When you multiply the two together, you get the identity matrix (the Kronecker delta).

Be careful when you are doing matrix summations, always remember that one index should be up and one index should be down.

Notice in my first equation that you quoted, there's an alpha down and an alpha up, which makes it an index to be summed over. It's equivalent to matrix multiplication.

When you are multiplying both sides by a matrix, keep your indices consistent! You have three alphas on the left, so none of the summations work out correctly.

To do what you wanted to do you would do this:

$$\frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\tau}(\frac{\partial x'^\tau}{\partial x^\nu})^{-1}=\delta^\mu_\tau(\frac{\partial x'^\tau}{\partial x^\nu})^{-1}$$

Which gives:

$$\frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\tau}(\frac{\partial x'^\tau}{\partial x^\nu})^{-1}=(\frac{\partial x'^\mu}{\partial x^\nu})^{-1}$$

But we really haven't gotten anywhere, and I don't know if this will help you see it... It should just be obvious that if you multiply two matrices together and get the identity, then the two matrices are inverses of each other.
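If it helps, here is the "multiply to the identity, hence inverses" point as a short numerical sketch, using an arbitrary invertible 4x4 matrix of my own (random, not a Lorentz matrix):

```python
# Hedged sketch: if A @ B is the identity, B is the matrix inverse of A.
import numpy as np

rng = np.random.default_rng(42)
A = rng.normal(size=(4, 4)) + 4.0 * np.eye(4)  # comfortably invertible
B = np.linalg.inv(A)

# A^mu_rho B^rho_nu = delta^mu_nu is exactly the statement AB = I ...
assert np.allclose(A @ B, np.eye(4))
# ... and for square matrices BA = I then holds too, so A and B are
# (two-sided) inverses of each other.
assert np.allclose(B @ A, np.eye(4))
```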

Last edited: Apr 2, 2014
5. Apr 3, 2014

### Rococo

I'm having trouble understanding how matrix multiplication comes into this. We have the equation

$\frac{\partial x'^\mu}{\partial x'^\nu}=\delta^\mu_\nu$

since this expression equals 1 when $\mu=\nu$ and 0 when $\mu\neq\nu$. Is this correct?

And hence by the chain rule,

$$\frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\nu}=\delta^\mu_\nu$$

Am I right in saying that, because we have $\alpha$ repeated as an upper and a lower index, it is summed over 0, 1, 2 and 3? And so would this mean:

$$\frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\nu}= \frac{\partial x'^\mu}{\partial x^0}\frac{\partial x^0}{\partial x'^\nu} + \frac{\partial x'^\mu}{\partial x^1}\frac{\partial x^1}{\partial x'^\nu} + \frac{\partial x'^\mu}{\partial x^2}\frac{\partial x^2}{\partial x'^\nu} + \frac{\partial x'^\mu}{\partial x^3}\frac{\partial x^3}{\partial x'^\nu}$$

I'm having difficulty seeing how this can be seen as the result of multiplying two matrices together. It might help me to understand if I were to see the explicit matrix multiplications with the individual elements of the matrices shown.

6. Apr 3, 2014

### Fredrik

Staff Emeritus
If you denote the entry on row $\mu$, column $\nu$ of an arbitrary matrix X by $X^\mu{}_\nu$, the definition of matrix multiplication says that $(AB)^\mu{}_\nu=A^\mu{}_\rho B^\rho{}_\nu$.

Let A be the 4x4 matrix with $\frac{\partial x'^\mu}{\partial x^\alpha}$ on row $\mu$, column $\alpha$. Let B be the 4x4 matrix with $\frac{\partial x^\alpha}{\partial x'^\nu}$ on row $\alpha$, column $\nu$. The last equality in the quote above is simply row $\mu$, column $\nu$ of the matrix equality $AB=I$. Note that this has to mean that A is the inverse of B and vice versa.
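The definition above can be checked mechanically. In this sketch (with arbitrary random 4x4 matrices of my own), the component formula $(AB)^\mu{}_\nu=A^\mu{}_\rho B^\rho{}_\nu$ is computed by an explicit loop and agrees with the matrix product:

```python
# Hedged sketch: the explicit index sum IS matrix multiplication.
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4))

C = np.zeros((4, 4))
for mu in range(4):
    for nu in range(4):
        # explicit sum over the repeated index rho
        C[mu, nu] = sum(A[mu, rho] * B[rho, nu] for rho in range(4))

assert np.allclose(C, A @ B)
```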

It's useful to also be aware of the stuff discussed in this post.

7. Apr 3, 2014

### Rococo

Ok, so we have:

$A=B^{-1}$

where A is the matrix with the $\mu,\alpha$ element equal to $\frac{\partial x'^\mu}{\partial x^\alpha}$ and B is the matrix with the $\alpha,\nu$ element equal to $\frac{\partial x^\alpha}{\partial x'^\nu}$.

But how then is the following obtained:

$$\frac{\partial x'^\mu}{\partial x^\nu}=(\frac{\partial x^\mu}{\partial x'^\nu})^{-1}$$

Am I right in saying that the partial derivatives are elements of matrices, not the matrices themselves? If so, I am confused as to the meaning of the above equation.

8. Apr 3, 2014

### Fredrik

Staff Emeritus
Yes, they are matrix elements, not matrices. However, because of the definition of matrix multiplication, to say that $A^\mu{}_\rho B^\rho{}_\nu=\delta^\mu_\nu$ for all $\mu,\nu$, is to say that $AB=I$. So people who use the index notation a lot but never write any "for all" statements tend to think of a notation like $A^\mu{}_\rho$ as representing a matrix rather than a matrix element. I suspect that some of them don't even understand that they're just using the definition of matrix multiplication.

The notation
$$\frac{\partial x'^\mu}{\partial x^\nu}=\left(\frac{\partial x^\mu}{\partial x'^\nu}\right)^{-1}$$ doesn't really make sense. To me it's like saying "nucular" instead of "nuclear". It's ugly as hell, but if enough people do it, you just have to get used to it.

9. Apr 3, 2014

### Rococo

If one were to use that notation, would it not then be:

$\frac{\partial x'^\mu}{\partial x^\alpha} = (\frac{\partial x^\alpha}{\partial x'^\nu})^{-1}$

due to the fact that $A=B^{-1}$. Obviously this is different to the equation:

$$\frac{\partial x'^\mu}{\partial x^\nu}=\left(\frac{\partial x^\mu}{\partial x'^\nu}\right)^{-1}$$

since the alphas remain.

In any case, I would probably like to avoid using statements of the form $$\frac{\partial x'^\mu}{\partial x^\nu}=\left(\frac{\partial x^\mu}{\partial x'^\nu}\right)^{-1}$$ seeing as the partial derivatives are matrix elements, and not matrices. Is there any other way that I could prove the original relations in my opening post, without doing this? There must be a different way I can go about manipulating the indices.

10. Apr 3, 2014

### Fredrik

Staff Emeritus
OK, I see what you mean. The equality $\frac{\partial x'^\mu}{\partial x^\nu}=\left(\frac{\partial x^\mu}{\partial x'^\nu}\right)^{-1}$ is however correct if the left-hand side is interpreted as the matrix that for all $\mu,\nu$ has $\frac{\partial x'^\mu}{\partial x^\nu}$ on row $\mu$, column $\nu$, and the right-hand side is interpreted as the inverse of the matrix that for all $\mu,\nu$ has $\frac{\partial x^\mu}{\partial x'^\nu}$ on row $\mu$, column $\nu$. The "for all" makes both indices dummy variables. So you could really use any indices you want.

I wouldn't use this notation myself. I prefer to work with matrices when I can, and with their components when I can't. I never use notations that mix things up like this.

I have to leave the computer for a while. I'll take a look at what you wanted to prove later.

11. Apr 3, 2014

### Matterwave

Fredrik is right, there is quite a bit of abuse of notation going on, but it was the most convenient way I saw to show the equality mentioned in your first post.

When we say:

$$\frac{\partial x'^\mu}{\partial x^\nu}=\left(\frac{\partial x^\mu}{\partial x'^\nu}\right)^{-1}$$

We mean that the matrix on the left, with elements indexed $(\mu,\nu)$, is the inverse of the matrix on the right, with elements indexed $(\mu,\nu)$. Your first equation can't be correct because the indices don't match on the two sides. You will notice that on the left is a matrix with indices $(\mu,\alpha)$ and on the right is a matrix with indices $(\alpha,\nu)$. Since there is no longer anything to be summed over, you have three different indices for two rank-2 matrices, and so you can't equate any elements with each other.

A less abusive notation would be:

Let:

$$\bf{A}^\mu_\nu=\frac{\partial x'^\mu}{\partial x^\nu}$$

and let:

$$\bf{B}^\mu_\nu=\frac{\partial x^\mu}{\partial x'^\nu}$$

Where the boldface simply denotes matrices (not tensors). And then we can make the statement:

$$\bf{A}=\bf{B}^{-1}$$

But as you can see, this is a little bit more cumbersome. There is no way to express this equality in tensor form, as those matrices do not form tensors. Since they are coordinate transformations, they can't, by definition, be coordinate independent, and so there is no better geometrical way of expressing that equality.

One CAN do away with ever having to do coordinate transformations in our definitions of vectors and one forms and work in a completely coordinate-free language (in which case coordinate maps are diffeomorphisms between open sets of $\mathbb{R}^n$, with $n=\dim(\text{manifold})$, and open sets in our manifold). That requires the full power of differential geometry, but in order to actually do everyday calculations, coordinates are often necessary at some point.
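The statement $\bf{A}=\bf{B}^{-1}$ can be verified numerically for a concrete $\beta = 0.6$ boost ($c = 1$). The helper names `forward`, `backward` and `jacobian` below are my own; the Jacobians are computed by finite differences rather than read off, to emphasise that A and B really are $\partial x'/\partial x$ and $\partial x/\partial x'$:

```python
# Hedged sketch: finite-difference Jacobians of a boost and its inverse.
import numpy as np

beta = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)
Lam = np.array([[gamma, -gamma * beta, 0.0, 0.0],
                [-gamma * beta, gamma, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0],
                [0.0, 0.0, 0.0, 1.0]])

def forward(x):    # x'(x) = Lam x
    return Lam @ x

def backward(xp):  # x(x') = Lam^{-1} x'
    return np.linalg.solve(Lam, xp)

def jacobian(f, x, h=1e-6):
    """Central-difference Jacobian J[mu, nu] = df^mu / dx^nu."""
    J = np.zeros((len(x), len(x)))
    for nu in range(len(x)):
        dx = np.zeros(len(x))
        dx[nu] = h
        J[:, nu] = (f(x + dx) - f(x - dx)) / (2.0 * h)
    return J

x0 = np.array([1.0, 2.0, 3.0, 4.0])
A = jacobian(forward, x0)            # elements dx'^mu/dx^nu
B = jacobian(backward, forward(x0))  # elements dx^mu/dx'^nu

assert np.allclose(A, Lam, atol=1e-5)            # A is Lambda itself
assert np.allclose(A @ B, np.eye(4), atol=1e-4)  # A = B^{-1}
```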

Edit: There's a bug (or so I'm told) that is making the x's boldface in the above expressions. They should not be, as they are not matrices, but simply coordinates.

12. Apr 3, 2014

### Rococo

I'm still not understanding how you get that $\bf{A}=\bf{B}^{-1}$. If this is the case, then we should have $\bf{A}\bf{B}=\bf{I}$. If we say that the product of A and B is equal to another matrix C, then the elements of C along the leading diagonal should all be 1. However:

$\bf{C}=\bf{A}\bf{B}$

$C^\mu_\nu=A^\mu_\alpha B^\alpha_\nu$

$C^0_0=A^0_0 B^0_0 + A^0_1 B^1_0 + A^0_2 B^2_0 + A^0_3 B^3_0$

$C^0_0=\frac{\partial x'^0}{\partial x^0}\frac{\partial x^0}{\partial x'^0} + \frac{\partial x'^0}{\partial x^1}\frac{\partial x^1}{\partial x'^0} + \frac{\partial x'^0}{\partial x^2}\frac{\partial x^2}{\partial x'^0} + \frac{\partial x'^0}{\partial x^3}\frac{\partial x^3}{\partial x'^0}$

$C^0_0= \frac{\partial x'^0}{\partial x'^0} + \frac{\partial x'^0}{\partial x'^0} + \frac{\partial x'^0}{\partial x'^0} + \frac{\partial x'^0}{\partial x'^0}$

$C^0_0= 1 + 1 + 1 + 1$

$C^0_0\neq 1$

Hence the matrix C cannot be the identity matrix, and so it cannot be the case that $\bf{A}=\bf{B}^{-1}$.

Where am I going wrong?

13. Apr 3, 2014

### Rococo

Why is that equality correct? If it is correct then the product of the two matrices you have defined must be the identity matrix. But I have worked out the element on row 0, column 0 of the product matrix (denoted C) to be the following:

$C^0_0= \frac{\partial x'^0}{\partial x'^0} + \frac{\partial x'^0}{\partial x'^0} + \frac{\partial x'^0}{\partial x'^0} + \frac{\partial x'^0}{\partial x'^0}$

as shown in more detail in post #12.

As $C^0_0\neq 1$, C cannot be the identity matrix, if I am not mistaken.

Putting that aside, I would be interested to see any other different approaches to proving the relationships in my original post.

14. Apr 3, 2014

### Bill_K

This is not correct. If $x'^0$ is a function of $x^0$, $x^1$, $x^2$, $x^3$, then the chain rule says:

$\frac{\partial x'^0}{\partial x'^0} = \frac{\partial x'^0}{\partial x^0}\frac{\partial x^0}{\partial x'^0} + \frac{\partial x'^0}{\partial x^1}\frac{\partial x^1}{\partial x'^0} + \frac{\partial x'^0}{\partial x^2}\frac{\partial x^2}{\partial x'^0} + \frac{\partial x'^0}{\partial x^3}\frac{\partial x^3}{\partial x'^0}$

Just one term.
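This can be seen numerically for a $\beta = 0.6$ boost (my own example, $c = 1$): the four chain-rule terms for $\partial x'^0/\partial x'^0$ sum to 1, even though no individual term equals 1, so the naive term-by-term "cancellation" cannot be right.

```python
# Hedged sketch: the chain-rule terms of dx'^0/dx'^0 for a boost.
import numpy as np

beta = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)
Lam = np.array([[gamma, -gamma * beta, 0.0, 0.0],
                [-gamma * beta, gamma, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0],
                [0.0, 0.0, 0.0, 1.0]])
Lam_inv = np.linalg.inv(Lam)

# terms[alpha] = (dx'^0/dx^alpha) * (dx^alpha/dx'^0)
terms = [Lam[0, a] * Lam_inv[a, 0] for a in range(4)]

assert np.isclose(terms[0], gamma**2)  # first term alone is gamma^2, not 1
assert np.isclose(sum(terms), 1.0)     # but the full sum is exactly 1
```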

15. Apr 3, 2014

### Fredrik

Staff Emeritus
Let's start with the matrix equation $x'=\Lambda x$. Multiply by $\Lambda^{-1}$ from the left, and we get $x=\Lambda^{-1}x'$. The $\mu$ component of $x'=\Lambda x$ is $x'^\mu=\Lambda^\mu{}_\rho x^\rho$. From this we get
$$\frac{\partial x'^\mu}{\partial x^\nu} =\Lambda^\mu{}_\rho\delta^\rho_\nu =\Lambda^\mu{}_\nu.$$ Similarly, we can use $x=\Lambda^{-1}x'$ to get
$$\frac{\partial x^\mu}{\partial x'^\nu} =(\Lambda^{-1})^\mu{}_\nu.$$ This right-hand side is often written as $\Lambda_\nu{}^\mu$. This is explained in the post I linked to in post #6.

The step from the first line to the second in this quote is wrong. You can't just cancel the dx's. If you could, this would contradict the chain rule. For example:
$$\frac{\partial}{\partial x}f(g(x,y),h(x,y))=\frac{\partial f}{\partial g}\frac{\partial g}{\partial x}+\frac{\partial f}{\partial h}\frac{\partial h}{\partial x}\neq \frac{\partial f}{\partial x} +\frac{\partial f}{\partial x}.$$
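The counterexample above, made concrete (my own choice of functions: $g(x,y)=x+y$, $h(x,y)=xy$, $f(g,h)=gh$): the derivative of $f(g(x,y),h(x,y))$ with respect to $x$ needs BOTH chain-rule terms, and a finite-difference check confirms the two-term formula.

```python
# Hedged sketch: two-term chain rule vs a finite-difference check.
import numpy as np

x, y = 1.3, 0.7
g, h = x + y, x * y

# chain rule: df/dx = (df/dg)(dg/dx) + (df/dh)(dh/dx), with f = g*h
df_dg, df_dh = h, g
dg_dx, dh_dx = 1.0, y
chain = df_dg * dg_dx + df_dh * dh_dx

# central finite difference of the composed function
eps = 1e-7
def f_composed(xx):
    return (xx + y) * (xx * y)
fd = (f_composed(x + eps) - f_composed(x - eps)) / (2.0 * eps)

assert np.isclose(chain, fd, atol=1e-5)
```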

16. Apr 3, 2014

### Rococo

Thanks, I'd forgotten the chain rule for partial derivatives.

17. Apr 3, 2014

### Rococo

I see, so at the moment we have the two equations:

$\Lambda^\mu{}_\nu=\frac{\partial x'^\mu}{\partial x^\nu}$

and

$(\Lambda^{-1})^\mu{}_\nu=\frac{\partial x^\mu}{\partial x'^\nu}$

and I am trying to prove the following two relations:

$\Lambda^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\nu} = \frac{\partial x_\nu}{\partial x'_\mu}$

and

$(\Lambda^{-1})^\nu{}_\mu=\frac{\partial x'_\mu}{\partial x_\nu}=\frac{\partial x^\nu}{\partial x'^\mu}$

but I'm still not sure how to go about it. It is clear that the middle term in these two equations satisfies the relation, as I showed in post #1. But it is the terms on the far right which I am having trouble obtaining.

18. Apr 3, 2014

### Fredrik

Staff Emeritus
I didn't look at this closely enough before. I see now that the indices on the right are subscripts. OK, we need to prove that $x_\nu=\Lambda^\mu{}_\nu x'_\mu$. This will imply that your right-hand side is equal to your left-hand side.

The notation $x_\mu$ is defined by $x_\mu=\eta_{\mu\nu}x^\nu =(\eta x)^\mu$. Note that it's conventional to define the matrix $\eta$ so that its component on row $\mu$, column $\nu$ is denoted by $\eta_{\mu\nu}$. It's when the matrix $\eta$ is defined this way that an arbitrary Lorentz transformation $\Lambda$ satisfies $\Lambda^T\eta\Lambda=\eta$. We will need a formula for $\Lambda^{-1}$. If we multiply this equality by $\eta^{-1}$ from the left, we get $\eta^{-1}\Lambda^T\eta\Lambda=I$, which implies that $\Lambda^{-1}=\eta^{-1}\Lambda^T\eta$.
\begin{align}
&\eta x=\eta \Lambda^{-1} x' =\eta(\eta^{-1}\Lambda^T\eta) x'=\Lambda^T\eta x'\\
&x_\nu=(\eta x)^\nu =(\Lambda^T\eta x')^\nu =(\Lambda^T)^\nu{}_\mu (\eta x')^\mu=\Lambda^\mu{}_\nu x'_\mu.
\end{align}
Edit: I think most people would do this using the component version of $\Lambda^T\eta\Lambda=\eta$, which is
$$\eta_{\mu\nu}=(\Lambda^T)^\mu{}_\rho\eta_{\rho\sigma}\Lambda^\sigma{}_\nu =\eta_{\rho\sigma}\Lambda^\rho{}_\mu\Lambda^\sigma{}_\nu.$$
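This derivation can be checked numerically. Here is a sketch of my own (a $\beta = 0.6$ boost along $x$, with the metric convention $\eta = \mathrm{diag}(-1,1,1,1)$ assumed; the argument works the same for the opposite signature):

```python
# Hedged sketch: Lorentz condition, the Lambda^{-1} formula, and the
# lowered-index transformation rule, all for a concrete boost.
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])
beta = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)
Lam = np.array([[gamma, -gamma * beta, 0.0, 0.0],
                [-gamma * beta, gamma, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0],
                [0.0, 0.0, 0.0, 1.0]])

# Lambda^T eta Lambda = eta (definition of a Lorentz transformation)
assert np.allclose(Lam.T @ eta @ Lam, eta)

# Lambda^{-1} = eta^{-1} Lambda^T eta
assert np.allclose(np.linalg.inv(Lam), np.linalg.inv(eta) @ Lam.T @ eta)

# x_nu = Lambda^mu_nu x'_mu, i.e. (eta x) = Lambda^T (eta x')
x = np.array([1.0, 2.0, 3.0, 4.0])
xp = Lam @ x
assert np.allclose(eta @ x, Lam.T @ (eta @ xp))
```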

Last edited: Apr 3, 2014
19. Apr 3, 2014

### Rococo

Thanks, so now we have shown:

$x_\nu=\Lambda^\mu{}_\nu x'_\mu$

Hence: $\frac{\partial x_\nu}{\partial x'_\mu} = \Lambda^\mu{}_\nu$

and so the first relation: $\Lambda^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\nu} = \frac{\partial x_\nu}{\partial x'_\mu}$

has been shown.

20. Apr 3, 2014

### Rococo

Now, the second relation, $(\Lambda^{-1})^\nu{}_\mu=\frac{\partial x'_\mu}{\partial x_\nu}=\frac{\partial x^\nu}{\partial x'^\mu}$, is to be proved. I've tried to do it in a similar way.

We need to show: $x^\nu = (\Lambda^{-1})^\nu{}_\mu x'^\mu$

So:

$x^\nu = \eta^{\mu\nu}x_\mu = (\eta x)_\nu$

But we have: $\eta x=\Lambda^T\eta x'$

Hence: $x^\nu = (\Lambda^T\eta x')_\nu$

$x^\nu = (\Lambda^T)_\nu{}^\mu (\eta x')_\mu$

$x^\nu = (\Lambda^T)_\nu{}^\mu x'^\mu$

$\frac{\partial x^\nu}{\partial x'^\mu} = \Lambda_\mu{}^\nu$

$\Lambda_\mu{}^\nu = \frac{\partial x^\nu}{\partial x'^\mu}$

$(\Lambda^{-1})^\nu{}_\mu = \frac{\partial x^\nu}{\partial x'^\mu}$

Which proves the second relation. Are all the above steps valid?
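As a closing numerical sanity check (my own construction: a $\beta = 0.6$ boost composed with a rotation, metric $\eta = \mathrm{diag}(-1,1,1,1)$ assumed), the identity behind both relations in post #1 is $\eta\Lambda\eta^{-1}=(\Lambda^{-1})^T$: raised components transform with $\Lambda$, lowered components with $(\Lambda^{-1})^T$.

```python
# Hedged sketch: both relations from post #1 for a generic
# (non-symmetric) Lorentz matrix = rotation times boost.
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])
beta, theta = 0.6, 0.3
gamma = 1.0 / np.sqrt(1.0 - beta**2)
boost = np.array([[gamma, -gamma * beta, 0.0, 0.0],
                  [-gamma * beta, gamma, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0]])
rot = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.0, np.cos(theta), -np.sin(theta), 0.0],
                [0.0, np.sin(theta), np.cos(theta), 0.0],
                [0.0, 0.0, 0.0, 1.0]])
Lam = rot @ boost                 # still a Lorentz transformation
Lam_inv = np.linalg.inv(Lam)

assert np.allclose(Lam.T @ eta @ Lam, eta)            # Lorentz condition
assert np.allclose(eta @ Lam @ np.linalg.inv(eta), Lam_inv.T)

x = np.array([1.0, 2.0, 3.0, 4.0])  # x^mu
xp = Lam @ x                        # x'^mu
# raised components: dx^nu/dx'^mu is Lam_inv[nu, mu]
assert np.allclose(x, Lam_inv @ xp)
# lowered components: x'_mu = (Lam_inv^T eta x)_mu, so dx'_mu/dx_nu is
# Lam_inv[nu, mu] too, matching (Lambda^{-1})^nu_mu from post #1
assert np.allclose(eta @ xp, Lam_inv.T @ (eta @ x))
```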