On the index notation used in Lorentz transformations

Rococo · Apr 2, 2014

I understand how contravariant 4-vectors transform under a Lorentz transformation, that is:

##x'^\mu = \Lambda^\mu~_\nu x^\nu## [1]

and how covariant 4-vectors transform:

##x'_\mu=(\Lambda^{-1})^\nu~_\mu x_\nu##. [2]

Now, I have come across the following relations:

##\Lambda^\mu~_\nu = \frac{∂x'^\mu}{∂x^\nu} = \frac{∂x_\nu}{∂x'_\mu}##

and

##(\Lambda^{-1})^\nu~_\mu=\frac{∂x'_\mu}{∂x_\nu}=\frac{∂x^\nu}{∂x'^\mu}##.

It is clear to me from [1] that ##\Lambda^\mu~_\nu = \frac{∂x'^\mu}{∂x^\nu} ##. But I cannot see how this is then also equal to ##\frac{∂x_\nu}{∂x'_\mu}##.

Similarly, it is easy to see from [2] that ##(\Lambda^{-1})^\nu~_\mu=\frac{∂x'_\mu}{∂x_\nu}##. But I cannot understand how this is also equal to ##\frac{∂x^\nu}{∂x'^\mu}##.

Any help on understanding how the indices are manipulated would be appreciated.

Matterwave · Apr 2, 2014

The Lorentz transformations are a limited group of transformations, and as such they are not as general as the potential coordinate transformations that are possible.

Therefore, in general, the Lambda matrices will not equal what you have given, unless the coordinate transformations were restricted to be Lorentz transformations in the first place.

That being said, a general (contravariant) vector will transform under a coordinate transformation as (Einstein summation implied):

[tex]a'^\mu=\frac{\partial x'^\mu}{\partial x^\nu}a^\nu[/tex]

A one form (covariant vector) will transform as:

[tex]a'_\mu=\frac{\partial x^\nu}{\partial x'^\mu}a_\nu[/tex]

One can think of these as the defining properties of vectors and one forms (but this is an older way of thinking).

Now, if we restrict ourselves to the Lorentz transformation, then the first equation becomes, by definition:

[tex]a'^\mu=(\Lambda)^\mu_\nu a^\nu[/tex]

And it is now easy to see:

[tex](\Lambda)^\mu_\nu=\frac{\partial x'^\mu}{\partial x^\nu}[/tex]

From simple differential calculus (the chain rule) one can see:

[tex]\frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\nu}=\delta^\mu_\nu[/tex]

Where that's just the Kronecker-delta (identity matrix). Therefore:

[tex]\frac{\partial x'^\mu}{\partial x^\nu}=(\frac{\partial x^\mu}{\partial x'^\nu})^{-1}[/tex]

Where by inverse I mean the matrix inverse. Therefore, by our transformation rule above, we can see that if we limited ourselves to Lorentz transformations:

[tex]a'_\mu=(\Lambda^{-1})^\nu_\mu a_\nu[/tex]

And immediately it is apparent:

[tex](\Lambda^{-1})^\nu_\mu=\frac{\partial x^\nu}{\partial x'^\mu}[/tex]

This is the result (from [2]) you are unsure of.

I'm not sure of the utility of using the lower-index x's, since the Cartesian coordinates, in flat spacetime, are contravariant vectors, not covariant vectors.
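
As a numerical illustration of the statements above, here is a minimal sketch in plain Python. It assumes a boost along the x-axis with β = 0.6 in units where c = 1 (values chosen here for illustration, not taken from the thread), and the helper names `boost`, `matvec`, `matmul` and `jacobian` are hypothetical. It estimates ##\partial x'^\mu/\partial x^\nu## and ##\partial x^\mu/\partial x'^\nu## by finite differences, so they can be compared with ##\Lambda## and ##\Lambda^{-1}##.

```python
import math

def boost(beta):
    """Lorentz boost along x (c = 1); row index mu, column index nu."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [g, -g * beta, 0.0, 0.0],
        [-g * beta, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def jacobian(f, x, h=1e-6):
    """Central-difference estimate of d f^mu / d x^nu at the point x."""
    J = [[0.0] * 4 for _ in range(4)]
    for nu in range(4):
        xp, xm = list(x), list(x)
        xp[nu] += h
        xm[nu] -= h
        fp, fm = f(xp), f(xm)
        for mu in range(4):
            J[mu][nu] = (fp[mu] - fm[mu]) / (2 * h)
    return J

beta = 0.6                 # assumed boost speed, illustration only
L = boost(beta)
L_inv = boost(-beta)       # the inverse boost reverses the velocity
x = [1.0, 2.0, 3.0, 4.0]   # arbitrary event (t, x, y, z)

J_forward = jacobian(lambda y: matvec(L, y), x)                   # dx'^mu/dx^nu
J_backward = jacobian(lambda y: matvec(L_inv, y), matvec(L, x))   # dx^mu/dx'^nu
product = matmul(J_forward, J_backward)                           # should be ~ identity
```

Within finite-difference accuracy, `J_forward` reproduces `L`, `J_backward` reproduces `L_inv`, and `product` is the identity matrix, which is the chain-rule statement in matrix form.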

Rococo · Apr 2, 2014

Thanks for the response.

I'm having difficulty seeing how you went from the equation

[tex]\frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\nu}=\delta^\mu_\nu[/tex]

to:

[tex]\frac{\partial x'^\mu}{\partial x^\nu}=(\frac{\partial x^\mu}{\partial x'^\nu})^{-1}[/tex]

If I multiply both sides by ##(\frac{∂x^\alpha}{∂x'^\nu})^{-1}## like so:

[tex]\frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\nu}(\frac{∂x^\alpha}{∂x'^\nu})^{-1}=\delta^\mu_\nu (\frac{∂x^\alpha}{∂x'^\nu})^{-1}[/tex]

this gives me:

[tex]\frac{\partial x'^\mu}{\partial x^\alpha}=(\frac{∂x^\alpha}{∂x'^\nu})^{-1}[/tex]

which is different to what you have, so I must be going wrong somewhere.

Matterwave · Apr 2, 2014

All I'm saying is that the matrix with the prime below is simply the matrix inverse of the matrix with the prime above. When you multiply the two together, you get the identity matrix (the Kronecker delta).

Be careful when you are doing matrix summations, always remember that one index should be up and one index should be down.

Notice in my first equation that you quoted, there's an alpha down and an alpha up, which makes it an index to be summed over. It's equivalent to matrix multiplication.

When you are multiplying both sides by a matrix, fix your indices! You have three alphas on the left. And none of the summations are going correctly.

To do what you wanted to do you would do this:

[tex]\frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\tau}(\frac{\partial x'^\tau}{\partial x^\nu})^{-1}=\delta^\mu_\tau(\frac{\partial x'^\tau}{\partial x^\nu})^{-1}[/tex]

Which gives:

[tex]\frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\tau}(\frac{\partial x'^\tau}{\partial x^\nu})^{-1}=(\frac{\partial x'^\mu}{\partial x^\nu})^{-1}[/tex]

But we really haven't gotten anywhere, and I don't know if this will help you see it... It should just be obvious that if you multiply two matrices together and get the identity, then the two matrices are inverses of each other.

Rococo · Apr 3, 2014

I'm having trouble understanding how matrix multiplication comes into this. We have the equation

##\frac{∂x'^\mu}{∂x'^\nu}=\delta^\mu_\nu##

since this expression equals 1 when ##\mu=\nu## and 0 when ##\mu≠\nu##. Is this correct?

And hence by the chain rule,

[tex]\frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\nu}=\delta^\mu_\nu[/tex]

Am I right in saying, that because we have ##\alpha## repeated as an upper and lower index, it is summed over 0, 1, 2 and 3. And so would this mean:

[tex]\frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\nu}= \frac{\partial x'^\mu}{\partial x^0}\frac{\partial x^0}{\partial x'^\nu} + \frac{\partial x'^\mu}{\partial x^1}\frac{\partial x^1}{\partial x'^\nu} + \frac{\partial x'^\mu}{\partial x^2}\frac{\partial x^2}{\partial x'^\nu} + \frac{\partial x'^\mu}{\partial x^3}\frac{\partial x^3}{\partial x'^\nu}[/tex]

I'm having difficulty seeing how this can be seen as the result of multiplying two matrices together. It might help me to understand if I were to see the explicit matrix multiplications with the individual elements of the matrices shown.

Fredrik · Apr 3, 2014

Rococo said:

I'm having trouble understanding how matrix multiplication comes into this. We have the equation

##\frac{∂x'^\mu}{∂x'^\nu}=\delta^\mu_\nu##

since this expression equals 1 when ##\mu=\nu## and 0 when ##\mu≠\nu##. Is this correct?

And hence by the chain rule,

[tex]\frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\nu}=\delta^\mu_\nu[/tex]

If you denote the entry on row ##\mu##, column ##\nu## of an arbitrary matrix X by ##X^\mu{}_\nu##, the definition of matrix multiplication says that ##(AB)^\mu{}_\nu=A^\mu{}_\rho B^\rho{}_\nu##.

Let A be the 4x4 matrix with ##\frac{\partial x'^\mu}{\partial x^\alpha}## on row ##\mu##, column ##\alpha##. Let B be the 4x4 matrix with ##\frac{\partial x^\alpha}{\partial x'^\nu}## on row ##\alpha##, column ##\nu##. The last equality in the quote above is simply row ##\mu##, column ##\nu## of the matrix equality ##AB=I##. Note that this has to mean that A is the inverse of B and vice versa.

It's useful to also be aware of the stuff discussed in this post.
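
To make the row/column bookkeeping concrete, here is a small sketch in plain Python. It assumes a boost along x with β = 0.6 (c = 1), a choice made only for this example, and it writes out the sum over the repeated index explicitly instead of calling a matrix library. The names `A`, `B`, `entry` and `C` are illustrative.

```python
import math

beta = 0.6  # assumed boost speed (c = 1), illustration only
g = 1.0 / math.sqrt(1.0 - beta ** 2)

# A[mu][alpha] = dx'^mu / dx^alpha for a boost along x
A = [[g, -g * beta, 0.0, 0.0],
     [-g * beta, g, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]

# B[alpha][nu] = dx^alpha / dx'^nu for the inverse boost
B = [[g, g * beta, 0.0, 0.0],
     [g * beta, g, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]

def entry(mu, nu):
    """Row mu, column nu of AB: the sum over the repeated index alpha."""
    return sum(A[mu][alpha] * B[alpha][nu] for alpha in range(4))

# Collect all 16 entries; each one is a four-term sum like the one expanded above.
C = [[entry(mu, nu) for nu in range(4)] for mu in range(4)]
```

Each `entry(mu, nu)` is exactly one of the four-term sums, and collecting all sixteen of them gives the matrix `C`, which comes out as the identity up to rounding.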

Rococo · Apr 3, 2014

Ok, so we have:

##A=B^{-1}##

where A is the matrix with the ##\mu,\alpha## element equal to ##\frac{\partial x'^\mu}{\partial x^\alpha}## and B is the matrix with the ##\alpha,\nu## element equal to ##\frac{\partial x^\alpha}{\partial x'^\nu}##.

But how then is the following obtained:

[tex]\frac{\partial x'^\mu}{\partial x^\nu}=(\frac{\partial x^\mu}{\partial x'^\nu})^{-1}[/tex]

Am I right in saying that the partial derivatives are elements of matrices, not the matrices themselves? If so, I am confused as to the meaning of the above equation.

Fredrik · Apr 3, 2014

Yes, they are matrix elements, not matrices. However, because of the definition of matrix multiplication, to say that ##A^\mu{}_\rho B^\rho{}_\nu=\delta^\mu_\nu## for all ##\mu,\nu##, is to say that ##AB=I##. So people who use the index notation a lot but never write any "for all" statements tend to think of a notation like ##A^\mu{}_\rho## as representing a matrix rather than a matrix element. I suspect that some of them don't even understand that they're just using the definition of matrix multiplication.

The notation
$$\frac{\partial x'^\mu}{\partial x^\nu}=\left(\frac{\partial x^\mu}{\partial x'^\nu}\right)^{-1}$$ doesn't really make sense. To me it's like saying "nucular" instead of "nuclear". It's ugly as hell, but if enough people do it, you just have to get used to it.

Rococo · Apr 3, 2014

If one were to use that notation, would it not then be:

##\frac{\partial x'^\mu}{\partial x^\alpha} = (\frac{\partial x^\alpha}{\partial x'^\nu})^{-1}##

due to the fact that ##A=B^{-1}##. Obviously this is different to the equation:

$$\frac{\partial x'^\mu}{\partial x^\nu}=\left(\frac{\partial x^\mu}{\partial x'^\nu}\right)^{-1}$$

since the alphas remain.

In any case, I would probably like to avoid using statements of the form $$\frac{\partial x'^\mu}{\partial x^\nu}=\left(\frac{\partial x^\mu}{\partial x'^\nu}\right)^{-1}$$ seeing as the partial derivatives are matrix elements, and not matrices. Is there any other way that I could prove the original relations in my opening post, without doing this? There must be a different way I can go about manipulating the indices.

Fredrik · Apr 3, 2014

OK, I see what you mean. The equality ##\frac{\partial x'^\mu}{\partial x^\nu}=\left(\frac{\partial x^\mu}{\partial x'^\nu}\right)^{-1}## is however correct if the left-hand side is interpreted as the matrix that for all ##\mu,\nu## has ##\frac{\partial x'^\mu}{\partial x^\nu}## on row ##\mu##, column ##\nu##, and the right-hand side is interpreted as the inverse of the matrix that for all ##\mu,\nu## has ##\frac{\partial x^\mu}{\partial x'^\nu}## on row ##\mu##, column ##\nu##. The "for all" makes both indices dummy variables. So you could really use any indices you want.

I wouldn't use this notation myself. I prefer to work with matrices when I can, and with their components when I can't. I never use notations that mix things up like this.

I have to leave the computer for a while. I'll take a look at what you wanted to prove later.

Matterwave · Apr 3, 2014

Rococo said:

If one were to use that notation, would it not then be:

##\frac{\partial x'^\mu}{\partial x^\alpha} = (\frac{\partial x^\alpha}{\partial x'^\nu})^{-1}##

due to the fact that ##A=B^{-1}##. Obviously this is different to the equation:

$$\frac{\partial x'^\mu}{\partial x^\nu}=\left(\frac{\partial x^\mu}{\partial x'^\nu}\right)^{-1}$$

since the alphas remain.

In any case, I would probably like to avoid using statements of the form $$\frac{\partial x'^\mu}{\partial x^\nu}=\left(\frac{\partial x^\mu}{\partial x'^\nu}\right)^{-1}$$ seeing as the partial derivatives are matrix elements, and not matrices. Is there any other way that I could prove the original relations in my opening post, without doing this? There must be a different way I can go about manipulating the indices.

Fredrik is right, there is quite a bit of abuse of notation going on, but it was the most convenient way that I saw to show the equality mentioned in your first post.

When we say:

$$\frac{\partial x'^\mu}{\partial x^\nu}=\left(\frac{\partial x^\mu}{\partial x'^\nu}\right)^{-1}$$

We mean that the matrix on the left, with elements ##(\mu,\nu)##, is the inverse of the matrix on the right with elements ##(\mu,\nu)##. Your first equation can't be correct because the indices don't match on the two sides. You will notice that on the left is a matrix with indices ##(\mu,\alpha)## and on the right is a matrix with indices ##(\alpha,\nu)##. Since there's no longer anything to be summed over, you have 3 different indices for 2 rank-2 matrices, and so you can't equate any elements with each other.

A lesser abuse of notation would be:

Let:

$$\bf{A}^\mu_\nu=\frac{\partial x'^\mu}{\partial x^\nu}$$

and let:

$$\bf{B}^\mu_\nu=\frac{\partial x^\mu}{\partial x'^\nu}$$

Where the boldface simply denotes matrices (not tensors). And then we can make the statement:

$$\bf{A}=\bf{B}^{-1}$$

But as you can see, this is a little bit more cumbersome. There is no way to express this equality in tensor form as those matrices do not form tensors. Since they are coordinate transformations, they can't, by definition, be coordinate independent, and so there is no better geometrical way of expressing that equality. One CAN do away with ever having to do coordinate transformations in our definitions of vectors and one forms and work in a completely coordinate free language (in which case coordinate maps are called diffeomorphisms of open sets of the real numbers of dimension n=dim(manifold) with open sets in our manifold). That requires the full power of differential geometry, but in order to actually do every day calculations, coordinates are often necessary at some point.

Edit: There's a bug (or so I'm told) that is making the x's boldface in the above expressions. They should not be, as they are not matrices, but simply coordinates.
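
The statement ##\mathbf{A}=\mathbf{B}^{-1}## holds for general coordinate changes, not only Lorentz transformations. The sketch below checks it numerically for a two-dimensional change from Cartesian to polar coordinates. That example and the helper names `to_polar`, `to_cartesian` and `jacobian` are assumptions introduced here for illustration. It also shows that a single entry of `A` is not the reciprocal of the corresponding entry of `B`.

```python
import math

def to_polar(p):
    x, y = p
    return [math.hypot(x, y), math.atan2(y, x)]

def to_cartesian(q):
    r, phi = q
    return [r * math.cos(phi), r * math.sin(phi)]

def jacobian(f, p, h=1e-6):
    """J[i][j] = d f^i / d p^j, estimated by central differences."""
    n = len(p)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pp, pm = list(p), list(p)
        pp[j] += h
        pm[j] -= h
        fp, fm = f(pp), f(pm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

p = [1.0, 2.0]                              # arbitrary point away from the origin
A = jacobian(to_polar, p)                   # A[mu][nu] = dx'^mu / dx^nu
B = jacobian(to_cartesian, to_polar(p))     # B[mu][nu] = dx^mu / dx'^nu

# Matrix product AB, written as an explicit sum over the repeated index k
AB = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]

# Entry-by-entry product of the (0, 0) elements: this is NOT 1
single_entry_product = A[0][0] * B[0][0]
```

Here `AB` is the 2×2 identity within numerical error, while `single_entry_product` is about 0.2, so the inverse really is the matrix inverse and not an element-wise reciprocal.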

Rococo · Apr 3, 2014

I'm still not understanding how you get that ##\mathbf{A}=\mathbf{B}^{-1}##. If this is the case, then we should have ##\mathbf{A}\mathbf{B}=\mathbf{I}##. If we say that the product of A and B is equal to another matrix C, then the elements of C along the leading diagonal should all be 1. However:

##\bf{C}=\bf{A}\bf{B}##

##C^\mu_\nu=A^\mu_\alpha B^\alpha_\nu##

##C^0_0=A^0_0 B^0_0 + A^0_1 B^1_0 + A^0_2 B^2_0 + A^0_3 B^3_0##

##C^0_0=\frac{∂x'^0}{∂x^0}\frac{∂x^0}{∂x'^0} + \frac{∂x'^0}{∂x^1}\frac{∂x^1}{∂x'^0} + \frac{∂x'^0}{∂x^2}\frac{∂x^2}{∂x'^0} + \frac{∂x'^0}{∂x^3}\frac{∂x^3}{∂x'^0}##

##C^0_0= \frac{∂x'^0}{∂x'^0} + \frac{∂x'^0}{∂x'^0} + \frac{∂x'^0}{∂x'^0} + \frac{∂x'^0}{∂x'^0}##

##C^0_0= 1 + 1 + 1 + 1##

##C^0_0≠1##

Hence the matrix C cannot be the identity matrix, and so it cannot be the case that ##\mathbf{A}=\mathbf{B}^{-1}##.

Where am I going wrong?

Rococo · Apr 3, 2014

Fredrik said:

OK, I see what you mean. The equality ##\frac{\partial x'^\mu}{\partial x^\nu}=\left(\frac{\partial x^\mu}{\partial x'^\nu}\right)^{-1}## is however correct if the left-hand side is interpreted as the matrix that for all ##\mu,\nu## has ##\frac{\partial x'^\mu}{\partial x^\nu}## on row ##\mu##, column ##\nu##, and the right-hand side is interpreted as the inverse of the matrix that for all ##\mu,\nu## has ##\frac{\partial x^\mu}{\partial x'^\nu}## on row ##\mu##, column ##\nu##. The "for all" makes both indices dummy variables. So you could really use any indices you want.

I wouldn't use this notation myself. I prefer to work with matrices when I can, and with their components when I can't. I never use notations that mix things up like this.

I have to leave the computer for a while. I'll take a look at what you wanted to prove later.

Why is that equality correct? If it is correct then the product of the two matrices you have defined must be the identity matrix. But I have worked out the element on row 0, column 0 of the product matrix (denoted C) to be the following:

##C^0_0= \frac{∂x'^0}{∂x'^0} + \frac{∂x'^0}{∂x'^0} + \frac{∂x'^0}{∂x'^0} + \frac{∂x'^0}{∂x'^0}##

as shown in more detail in post #12.

As ##C^0_0≠1## then C can not be the identity matrix, if I am not mistaken.

Putting that aside, I would be interested to see any other different approaches to proving the relationships in my original post.

Bill_K · Apr 3, 2014

Rococo said:

##C^0_0=\frac{∂x'^0}{∂x^0}\frac{∂x^0}{∂x'^0} + \frac{∂x'^0}{∂x^1}\frac{∂x^1}{∂x'^0} + \frac{∂x'^0}{∂x^2}\frac{∂x^2}{∂x'^0} + \frac{∂x'^0}{∂x^3}\frac{∂x^3}{∂x'^0}##

##C^0_0= \frac{∂x'^0}{∂x'^0} + \frac{∂x'^0}{∂x'^0} + \frac{∂x'^0}{∂x'^0} + \frac{∂x'^0}{∂x'^0}##

This is not correct. If ##x'^0## is a function of ##x^0, x^1, x^2, x^3##, then the chain rule says:

##\frac{∂x'^0}{∂x'^0} = \frac{∂x'^0}{∂x^0}\frac{∂x^0}{∂x'^0} + \frac{∂x'^0}{∂x^1}\frac{∂x^1}{∂x'^0} + \frac{∂x'^0}{∂x^2}\frac{∂x^2}{∂x'^0} + \frac{∂x'^0}{∂x^3}\frac{∂x^3}{∂x'^0}##

Just one term.
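
For a concrete case, the sketch below evaluates the four products in ##C^0_0## for a boost along x. The value β = 0.6 (with c = 1) and the variable names are assumptions made for this example. The individual products are not each equal to 1; only their sum is.

```python
import math

beta = 0.6  # assumed boost speed (c = 1), illustration only
g = 1.0 / math.sqrt(1.0 - beta ** 2)

# dx'^0/dx^alpha for alpha = 0..3 (first row of the boost matrix)
dxp0_dx = [g, -g * beta, 0.0, 0.0]
# dx^alpha/dx'^0 for alpha = 0..3 (first column of the inverse boost matrix)
dx_dxp0 = [g, g * beta, 0.0, 0.0]

# The four products appearing in the chain-rule sum
terms = [a * b for a, b in zip(dxp0_dx, dx_dxp0)]

# Their sum is the single derivative dx'^0/dx'^0
C00 = sum(terms)
```

With these numbers the terms are approximately 1.5625, −0.5625, 0 and 0, which add up to 1, in agreement with ##\gamma^2(1-\beta^2)=1##.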

Fredrik · Apr 3, 2014

Let's start with the matrix equation ##x'=\Lambda x##. Multiply by ##\Lambda^{-1}## from the left, and we get ##x=\Lambda^{-1}x'##. The ##\mu## component of ##x'=\Lambda x## is ##x'^\mu=\Lambda^\mu{}_\rho x^\rho##. From this we get
$$\frac{\partial x'^\mu}{\partial x^\nu} =\Lambda^\mu{}_\rho\delta^\rho_\nu =\Lambda^\mu{}_\nu.$$ Similarly, we can use ##x=\Lambda^{-1}x'## to get
$$\frac{\partial x^\mu}{\partial x'^\nu} =(\Lambda^{-1})^\mu{}_\nu.$$ This right-hand side is often written as ##\Lambda_\nu{}^\mu##. This is explained in the post I linked to in post #6.

Rococo said:

##C^0_0=\frac{∂x'^0}{∂x^0}\frac{∂x^0}{∂x'^0} + \frac{∂x'^0}{∂x^1}\frac{∂x^1}{∂x'^0} + \frac{∂x'^0}{∂x^2}\frac{∂x^2}{∂x'^0} + \frac{∂x'^0}{∂x^3}\frac{∂x^3}{∂x'^0}##

##C^0_0= \frac{∂x'^0}{∂x'^0} + \frac{∂x'^0}{∂x'^0} + \frac{∂x'^0}{∂x'^0} + \frac{∂x'^0}{∂x'^0}##

The step from the first line to the second in this quote is wrong. You can't just cancel the dx's. If you could, this would contradict the chain rule. For example:
$$\frac{\partial}{\partial x}f(g(x,y),h(x,y))=\frac{\partial f}{\partial g}\frac{\partial g}{\partial x}+\frac{\partial f}{\partial h}\frac{\partial h}{\partial x}\neq \frac{\partial f}{\partial x} +\frac{\partial f}{\partial x}.$$
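
The chain-rule formula above can be checked numerically. In this sketch the functions `g`, `h` and `f` are arbitrary choices made for illustration (they do not come from the thread); the direct derivative of the composite is compared with the two-term chain-rule sum.

```python
def g(x, y):
    return x + 2.0 * y

def h(x, y):
    return x * y

def f(gv, hv):
    return gv * hv ** 2

def F(x, y):
    """The composite f(g(x, y), h(x, y))."""
    return f(g(x, y), h(x, y))

x0, y0, eps = 1.5, 0.5, 1e-6

# Direct central-difference derivative of the composite with respect to x
lhs = (F(x0 + eps, y0) - F(x0 - eps, y0)) / (2 * eps)

# Chain rule with the analytic partial derivatives of the chosen functions
g0, h0 = g(x0, y0), h(x0, y0)
df_dg = h0 ** 2          # d f / d g
df_dh = 2.0 * g0 * h0    # d f / d h
dg_dx = 1.0              # d g / d x
dh_dx = y0               # d h / d x
chain = df_dg * dg_dx + df_dh * dh_dx
```

Both `lhs` and `chain` evaluate to 2.4375 (the former up to finite-difference error), and neither product in the sum can be obtained by cancelling differentials.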

Rococo · Apr 3, 2014

Bill_K said:

This is not correct. If ##x'^0## is a function of ##x^0, x^1, x^2, x^3##, then the chain rule says:

##\frac{∂x'^0}{∂x'^0} = \frac{∂x'^0}{∂x^0}\frac{∂x^0}{∂x'^0} + \frac{∂x'^0}{∂x^1}\frac{∂x^1}{∂x'^0} + \frac{∂x'^0}{∂x^2}\frac{∂x^2}{∂x'^0} + \frac{∂x'^0}{∂x^3}\frac{∂x^3}{∂x'^0}##

Just one term.

Thanks, I'd forgotten the chain rule for partial derivatives.

Rococo · Apr 3, 2014

Fredrik said:

Let's start with the matrix equation ##x'=\Lambda x##. Multiply by ##\Lambda^{-1}## from the left, and we get ##x=\Lambda^{-1}x'##. The ##\mu## component of ##x'=\Lambda x## is ##x'^\mu=\Lambda^\mu{}_\rho x^\rho##. From this we get
$$\frac{\partial x'^\mu}{\partial x^\nu} =\Lambda^\mu{}_\rho\delta^\rho_\nu =\Lambda^\mu{}_\nu.$$ Similarly, we can use ##x=\Lambda^{-1}x'## to get
$$\frac{\partial x^\mu}{\partial x'^\nu} =(\Lambda^{-1})^\mu{}_\nu.$$ This right-hand side is often written as ##\Lambda_\nu{}^\mu##. This is explained in the post I linked to in post #6.

The step from the first line to the second in this quote is wrong. You can't just cancel the dx's. If you could, this would contradict the chain rule. For example:
$$\frac{\partial}{\partial x}f(g(x,y),h(x,y))=\frac{\partial f}{\partial g}\frac{\partial g}{\partial x}+\frac{\partial f}{\partial h}\frac{\partial h}{\partial x}\neq \frac{\partial f}{\partial x} +\frac{\partial f}{\partial x}.$$

I see, so at the moment we have the two equations:

##\Lambda^\mu{}_\nu=\frac{\partial x'^\mu}{\partial x^\nu}##

and

##(\Lambda^{-1})^\mu{}_\nu=\frac{\partial x^\mu}{\partial x'^\nu}##

and I am trying to prove the following two relations:

##\Lambda^\mu~_\nu = \frac{∂x'^\mu}{∂x^\nu} = \frac{∂x_\nu}{∂x'_\mu}##

and

##(\Lambda^{-1})^\nu~_\mu=\frac{∂x'_\mu}{∂x_\nu}=\frac{∂x^\nu}{∂x'^\mu}##

but I'm still not sure how to go about it. It is clear that the middle term in these two equations satisfies the relation, as I showed in post #1. But it is the terms on the far right which I am having trouble obtaining.

Fredrik · Apr 3, 2014

Rococo said:

##\Lambda^\mu~_\nu = \frac{∂x'^\mu}{∂x^\nu} = \frac{∂x_\nu}{∂x'_\mu}##

I didn't look at this closely enough before. I see now that the indices on the right are subscripts. OK, we need to prove that ##x_\nu=\Lambda^\mu{}_\nu x'_\mu##. This will imply that your right-hand side is equal to your left-hand side.

The notation ##x_\mu## is defined by ##x_\mu=\eta_{\mu\nu}x^\nu =(\eta x)^\mu##. Note that it's conventional to define the matrix ##\eta## so that its component on row ##\mu##, column ##\nu## is denoted by ##\eta_{\mu\nu}##. It's when the matrix ##\eta## is defined this way that an arbitrary Lorentz transformation ##\Lambda## satisfies ##\Lambda^T\eta\Lambda=\eta##. We will need a formula for ##\Lambda^{-1}##. If we multiply this equality by ##\eta^{-1}## from the left, we get ##\eta^{-1}\Lambda^T\eta\Lambda=I##, which implies that ##\Lambda^{-1}=\eta^{-1}\Lambda^T\eta##.
\begin{align}
&\eta x=\eta \Lambda^{-1} x' =\eta(\eta^{-1}\Lambda^T\eta) x'=\Lambda^T\eta x'\\
&x_\nu=(\eta x)^\nu =(\Lambda^T\eta x')^\nu =(\Lambda^T)^\nu{}_\mu (\eta x')^\mu=\Lambda^\mu{}_\nu x'_\mu.
\end{align}
Edit: I think most people would do this using the component version of ##\Lambda^T\eta\Lambda=\eta##, which is
$$\eta_{\mu\nu}=(\Lambda^T)^\mu{}_\rho\eta_{\rho\sigma}\Lambda^\sigma{}_\nu =\eta_{\rho\sigma}\Lambda^\rho{}_\mu\Lambda^\sigma{}_\nu.$$
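
The matrix steps above can be verified numerically. The sketch below is only an illustration under stated assumptions: signature (−,+,+,+), a boost along x with β = 0.6 followed by a rotation about z by 0.3 rad (so that ##\Lambda## is not symmetric and the transposes matter), and hypothetical helper names.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]

def transpose(M):
    return [list(col) for col in zip(*M)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(4) for j in range(4))

# Assumed ingredients, chosen only for this example
eta = [[-1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
beta, theta = 0.6, 0.3
g = 1.0 / math.sqrt(1.0 - beta ** 2)
c, s = math.cos(theta), math.sin(theta)
boost = [[g, -g * beta, 0.0, 0.0], [-g * beta, g, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
rot = [[1.0, 0.0, 0.0, 0.0], [0.0, c, -s, 0.0],
       [0.0, s, c, 0.0], [0.0, 0.0, 0.0, 1.0]]
Lam = matmul(boost, rot)
LamT = transpose(Lam)
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# Lambda^T eta Lambda = eta
metric_preserved = close(matmul(LamT, matmul(eta, Lam)), eta)

# Lambda^{-1} = eta^{-1} Lambda^T eta (eta is its own inverse here)
Lam_inv = matmul(eta, matmul(LamT, eta))
is_inverse = close(matmul(Lam_inv, Lam), identity)

# x_nu = Lambda^mu_nu x'_mu, with lowered components x_mu = (eta x)^mu
x = [1.0, 2.0, 3.0, 4.0]
x_low = matvec(eta, x)
xp_low = matvec(eta, matvec(Lam, x))
rhs = [sum(Lam[mu][nu] * xp_low[mu] for mu in range(4)) for nu in range(4)]
```

Note that the sum in `rhs` runs over the first (row) index of `Lam`, which is why the transpose appears in the matrix version of the same statement.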

Rococo · Apr 3, 2014

Thanks, so now we have shown:

##x_\nu=\Lambda^\mu~_\nu x'_\mu##

Hence: ##\frac{∂x_\nu}{∂x'_\mu} = \Lambda^\mu~_\nu##

and so the first relation: ##\Lambda^\mu~_\nu = \frac{∂x'^\mu}{∂x^\nu} = \frac{∂x_\nu}{∂x'_\mu}##

has been shown.

Rococo · Apr 3, 2014

Now, the second relation, ##(\Lambda^{-1})^\nu~_\mu=\frac{∂x'_\mu}{∂x_\nu}=\frac{∂x^\nu}{∂x'^\mu}## is to be proved. I've tried to do it in a similar way.

We need to show: ##x^\nu = (\Lambda^{-1})^\nu~_\mu x'^\mu##

So:

##x^\nu = η^{\mu\nu}x_\mu = (ηx)_\nu##

But we have: ##ηx=(\Lambda)^Tηx'##

Hence: ##x^\nu = (\Lambda^Tηx')_\nu##

##x^\nu = (\Lambda^T)_\nu~^\mu (ηx')_\mu##

##x^\nu = (\Lambda^T)_\nu~^\mu x'^\mu##

##\frac{∂x^\nu}{∂x'^\mu} = \Lambda_\mu~^\nu##

##\Lambda_\mu~^\nu = \frac{∂x^\nu}{∂x'^\mu}##

##(\Lambda^{-1})^\nu~_\mu = \frac{∂x^\nu}{∂x'^\mu}##

Which proves the second relation. Are all the above steps valid?

Fredrik · Apr 3, 2014

Rococo said:

Now, the second relation, ##(\Lambda^{-1})^\nu~_\mu=\frac{∂x'_\mu}{∂x_\nu}=\frac{∂x^\nu}{∂x'^\mu}## is to be proved. I've tried to do it in a similar way.

We need to show: ##x^\nu = (\Lambda^{-1})^\nu~_\mu x'^\mu##

This one follows almost immediately from our starting point ##x'=\Lambda x##, which implies ##x=\Lambda^{-1}x'##. The ##\nu## component of this equality is ##x^\nu=(\Lambda^{-1})^\nu{}_\mu x'^\mu##.

The equality that you might want to prove in a similar way is ##x'_\mu=(\Lambda^{-1})^\nu{}_\mu x_\nu##.

Rococo said:

##x^\nu = η^{\mu\nu}x_\mu = (ηx)_\nu##

You have to be careful here. One issue is that ##\eta^{\mu\nu}## by convention denotes the entry on row ##\mu##, column ##\nu## of ##\eta^{-1}##, not ##\eta##. This probably doesn't matter here though, since ##\eta=\eta^{-1}## when we're dealing with the Minkowski metric.

A bigger problem is that ##x_\mu## is not the ##\mu## component of ##x##. (##x^\mu## is). If it had been, we would have had ##\eta^{\mu\nu}x_\mu=(\eta x)^\nu##, with the ##\nu## upstairs, not downstairs. What does ##(\eta x)_\nu## denote exactly? I guess the natural definition would be ##(\eta x)_\nu=\eta_{\nu\mu}(\eta x)^\mu##, but then the second equality in your quote doesn't hold.

Rococo said:

But we have: ##ηx=(\Lambda)^Tηx'##

Hence: ##x^\nu = (\Lambda^Tηx')_\nu##

The ##\nu## component of the equality in the first line is ##x_\nu=(\Lambda^T\eta x')^\nu##. So you appear to be taking the ##\nu## component of something else, perhaps the transpose of the first line.

A comment about the transpose operation is in order. The transpose of a matrix equation ##y=Mx## is ##y^T=x^TM^T##. If we want to write down the ##\mu## component of the latter equation, it may look like a good idea to write it as ##y_\mu=x_\nu(M^T)^\nu{}_\mu##, and it is, if we're just using the definition of matrix multiplication, and not the lowering convention. But since we are using the lowering convention, we can't afford to write the ##\mu## component of ##x^T## as ##x_\mu##, since that notation is already reserved for the ##\mu## component of ##\eta x##. So what do we write instead? We simply use the fact that the ##\mu## component of a 4×1 matrix is the same as the ##\mu## component of its transpose to rewrite ##y_\mu=x_\nu(M^T)^\nu{}_\mu## as ##y^\mu=x^\nu M^\mu{}_\nu=M^\mu{}_\nu x^\nu##. That's right, we can't afford to write the component version of ##y^T=x^TM^T## different from how we write the component version of ##y=Mx##.

An easier way to use the formula for ##\eta x## is to do this:
$$x=\eta^{-1}\eta x=\eta^{-1}\Lambda^T\eta x'=\Lambda^{-1}x',$$ but as already mentioned, we could get this result even faster from the starting point ##x'=\Lambda x##.
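
The step ##\eta^{-1}\Lambda^T\eta=\Lambda^{-1}## can be checked numerically. What follows is only a minimal sketch: the signature (-,+,+,+), the sample speed 0.6 (in units with c = 1) and the helper `boost_x` are assumptions made for illustration, not something fixed by the thread.

```python
import numpy as np

# Minkowski metric; the signature (-,+,+,+) is an assumed convention.
eta = np.diag([-1.0, 1.0, 1.0, 1.0])
eta_inv = np.linalg.inv(eta)


def boost_x(v):
    """Hypothetical helper: boost along x with speed v (units with c = 1)."""
    g = 1.0 / np.sqrt(1.0 - v**2)
    lam = np.eye(4)
    lam[0, 0] = lam[1, 1] = g
    lam[0, 1] = lam[1, 0] = -g * v
    return lam


Lam = boost_x(0.6)

# Defining property of a Lorentz transformation: Lambda^T eta Lambda = eta.
defining_property_holds = np.allclose(Lam.T @ eta @ Lam, eta)

# The matrix used in the text: eta^{-1} Lambda^T eta, which should equal Lambda^{-1}.
lam_inv_from_eta = eta_inv @ Lam.T @ eta
matches_inverse = np.allclose(lam_inv_from_eta, np.linalg.inv(Lam))
```

Both flags should come out `True`. For a pure boost the result should also agree with the boost of opposite velocity, `boost_x(-0.6)`.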

Rococo · Apr 4, 2014

I think the second relation, ##(\lambda^{-1})^\nu~_\mu=\frac{∂x'_\mu}{∂x_\nu}=\frac{∂x^\nu}{∂x'^\mu}## may actually be easier to prove.

Let us decompose this into two equations to consider separately:

##(\lambda^{-1})^\nu~_\mu=\frac{∂x'_\mu}{∂x_\nu}## [3]

##(\lambda^{-1})^\nu~_\mu=\frac{∂x^\nu}{∂x'^\mu}## [4]

[3] follows directly if we start from the definition of how covariant 4-vectors transform under a Lorentz transformation, as in post #1:

##x'_\mu=(\lambda^{-1})^\nu~_\mu x_\nu##

Thus: ##\frac{∂x'_\mu}{∂x_\nu}=(\lambda^{-1})^\nu~_\mu##

##(\lambda^{-1})^\nu~_\mu=\frac{∂x'_\mu}{∂x_\nu}##

which is equation [3].

Now to prove equation [4], we can proceed as in your previous post. Starting with:

##x'=\Lambda x##
##x=\Lambda^{-1}x'##
##x^\nu=(\Lambda^{-1})^\nu{}_\mu x'^\mu##
##\frac{∂x^\nu}{∂x'^\mu}=(\lambda^{-1})^\nu~_\mu##
##(\lambda^{-1})^\nu~_\mu=\frac{∂x^\nu}{∂x'^\mu}##

which is equation [4].

Since we have shown that [3] and [4] are true, the second relation ##(\lambda^{-1})^\nu~_\mu=\frac{∂x'_\mu}{∂x_\nu}=\frac{∂x^\nu}{∂x'^\mu}## has been shown.

Is all of this valid?

One thing I would like to clarify is how you go from ##x=\Lambda^{-1}x'## to ##x^\nu=(\Lambda^{-1})^\nu{}_\mu x'^\mu##
Here ##x## is a 4x1 matrix, with 4 rows and 1 column. The four elements are ##x^0, x^1, x^2, x^3## in different rows. So why is ##\nu## used as the superscript instead of ##\mu## - is ##\mu## not used to signify the row of the matrix?

Fredrik · Apr 4, 2014

Rococo said:

One thing I would like to clarify is how you go from ##x=\Lambda^{-1}x'## to ##x^\nu=(\Lambda^{-1})^\nu{}_\mu x'^\mu##
Here ##x## is a 4x1 matrix, with 4 rows and 1 column. The four elements are ##x^0, x^1, x^2, x^3## in different rows. So why is ##\nu## used as the superscript instead of ##\mu## - is ##\mu## not used to signify the row of the matrix?

No, ##\mu## is the column index of ##\Lambda^{-1}## in that calculation.

For all 4×4 matrices X except ##\eta## and ##\eta^{-1}##, the component on row ##\mu##, column ##\nu## is denoted by ##X^\mu{}_\nu##. For ##\eta##, the notation is ##\eta_{\mu\nu}##, and for ##\eta^{-1}##, the notation is ##\eta^{\mu\nu}##.

For all 4×1 matrices x, the component on row ##\mu## is denoted by ##x^\mu##.
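
Written out with the definition of matrix multiplication, and with the sum made explicit for once, the step from ##x=\Lambda^{-1}x'## to components would read as follows (a sketch using only the conventions just stated):
$$x^\nu=(\Lambda^{-1}x')^\nu=\sum_{\mu=0}^3(\Lambda^{-1})^\nu{}_\mu\, x'^\mu.$$
Here ##\nu## labels the row of ##x## and the row of ##\Lambda^{-1}##, while ##\mu## is the summed index: the column of ##\Lambda^{-1}## and the row of ##x'##.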

For 1×4 matrices x, the obvious notation for the component in column ##\mu## would be ##x_\mu## ...if it hadn't been for the fact that we use ##\eta## to lower indices. This messes things up so badly that we simply don't have a notation for components of 1×4 matrices. If you want to use the definition of matrix multiplication on something like ##x^T\eta y##, you use the fact that the ##\mu## component of ##x^T## is the same as the ##\mu## component of ##x##, to write it as ##x^T\eta y=x^\mu\eta_{\mu\nu} y^\nu##.
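
As a quick numerical illustration of ##x^T\eta y=x^\mu\eta_{\mu\nu} y^\nu##, here is a small NumPy sketch. The signature (-,+,+,+) and the sample components are assumptions chosen only for the example.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])  # assumed signature (-,+,+,+)

x = np.array([2.0, 1.0, 0.0, 3.0])    # components x^mu (arbitrary sample values)
y = np.array([1.0, 4.0, -1.0, 0.5])   # components y^nu (arbitrary sample values)

# Matrix form: x^T eta y. For 1-D arrays NumPy does not distinguish x from x^T.
matrix_form = x @ eta @ y

# Index form: x^mu eta_{mu nu} y^nu, summing over both repeated indices.
index_form = np.einsum("m,mn,n->", x, eta, y)

# Lowered components x_mu = eta_{mu nu} x^nu, i.e. the components of eta x.
x_lower = np.einsum("mn,n->m", eta, x)

# The same scalar written as x_nu y^nu.
contracted = x_lower @ y
```

All three scalars agree. That NumPy's 1-D arrays carry no row/column distinction mirrors the point above: the ##\mu## component of ##x^T## is the same number as the ##\mu## component of ##x##.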

Rococo said:

Is all of this valid?

I didn't see any mistakes. However, if you take ##x'_\mu=(\lambda^{-1})^\nu~_\mu x_\nu## as your starting point, your calculation doesn't show that ##\lambda=\Lambda##.

There are several ways to do this. The simplest is probably to start with the result ##x_\nu=\Lambda^\rho{}_\nu x'_\rho## that I derived, and multiply it by ##(\Lambda^{-1})^\nu{}_\mu##.
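
Spelled out, that multiplication would go roughly like this, using only ##\Lambda\Lambda^{-1}=I## and the index conventions above:
$$(\Lambda^{-1})^\nu{}_\mu x_\nu=(\Lambda^{-1})^\nu{}_\mu\Lambda^\rho{}_\nu x'_\rho=(\Lambda\Lambda^{-1})^\rho{}_\mu x'_\rho=\delta^\rho_\mu x'_\rho=x'_\mu.$$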

The approach that looks the most like my derivation of ##x_\nu=\Lambda^\mu{}_\nu x'_\mu## goes like this:
\begin{align}
&x'_\mu=\eta_{\mu\nu} x'^\nu =(\eta x')^\mu\\
&\eta x'=\eta \Lambda x=\eta\Lambda\eta^{-1}\eta x=(\Lambda^T)^{-1}\eta x =(\Lambda^{-1})^T\eta x\\
&x'_\mu=(\eta x')^\mu =\left((\Lambda^{-1})^T\right)^\mu{}_\nu (\eta x)^\nu =(\Lambda^{-1})^\nu{}_\mu x_\nu
\end{align}
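
The result ##x'_\mu=(\Lambda^{-1})^\nu{}_\mu x_\nu## can also be sanity-checked numerically. The sketch below assumes the signature (-,+,+,+) and uses a boost followed by a rotation, with arbitrary sample values, so that ##\Lambda## is not symmetric and a misplaced index would actually show up.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])  # assumed signature (-,+,+,+)

# Boost along x (speed v, units with c = 1) followed by a rotation about z.
v, theta = 0.6, 0.3  # arbitrary sample values
g = 1.0 / np.sqrt(1.0 - v**2)
boost = np.eye(4)
boost[0, 0] = boost[1, 1] = g
boost[0, 1] = boost[1, 0] = -g * v
rot = np.eye(4)
rot[1, 1] = rot[2, 2] = np.cos(theta)
rot[1, 2] = -np.sin(theta)
rot[2, 1] = np.sin(theta)
Lam = rot @ boost
Lam_inv = np.linalg.inv(Lam)

x = np.array([1.0, 2.0, 3.0, 4.0])  # components x^mu (arbitrary sample values)
x_prime = Lam @ x                   # x'^mu = Lambda^mu_nu x^nu

x_lower = eta @ x                   # x_mu
x_prime_lower = eta @ x_prime       # x'_mu

# (Lambda^{-1})^nu_mu x_nu: the sum runs over the ROW index nu of Lambda^{-1}.
rhs = np.einsum("nm,n->m", Lam_inv, x_lower)

# The earlier result x_nu = Lambda^mu_nu x'_mu, summing over the row index mu.
back = np.einsum("mn,m->n", Lam, x_prime_lower)
```

Here `rhs` should reproduce `x_prime_lower` and `back` should reproduce `x_lower`. Summing over the column index instead (for example `Lam_inv @ x_lower`) would generally give a different vector for this non-symmetric ##\Lambda##.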

Rococo · Apr 4, 2014

Thanks for the clarification. I've accidentally been using ##\lambda## in places where I meant to type ##\Lambda##, including in the original post, by forgetting to capitalise the 'l' in LaTeX. Anywhere I've used ##\lambda## should actually be read as ##\Lambda##.

So my starting point should have been that covariant 4-vectors transform under a Lorentz transformation as follows:

##x'_\mu=(\Lambda^{-1})^\nu~_\mu x_\nu##

and from this it can be seen that ##(\Lambda^{-1})^\nu~_\mu=\frac{∂x'_\mu}{∂x_\nu}## which is Equation [3].
