Energy Formulas in SR: Explained

PWiz
As I understand it, the formula ##E=\gamma m_0 c^2## gives the total energy of a body in any inertial frame. However, the formula ##E=\sqrt{ (m_0 c^2)^2 + (pc)^2}##, which is also supposed to give the total energy of a body (in any inertial frame), does not seem to be equal to the first formula. Why is this? I'm guessing that the answer is very simple, but I'm not able to put my finger on it. Any help is appreciated.
 
##m_0c^2## is the energy of the particle at rest (##m_0## is the rest mass). You are no doubt aware that at relativistic speeds mass increases. ##E=\sqrt{(m_0c^2)^2+(pc)^2}## is the energy of the particle with the mass increase taken into consideration. It's just a restatement of the fact that ##E=m_vc^2##, where ##m_v## is the mass at velocity ##v##. You should have no trouble proving equation 2 if you remember that ##m_v=\frac{m_0}{\sqrt{1-\frac{v^2}{c^2}}}##. I highly suspect that ##\gamma m_0## in equation 1 is just a convenient way to denote ##m_v##.
 
PWiz said:
which is also supposed to give the total energy of a body (in any inertial frame), does not equal to the first formula. Why is this? I'm guessing that the answer is very simple,

The answer is very simple indeed. The two formulas are exactly the same if you use the relativistic expression for momentum.
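To see this numerically, here is a minimal sketch that evaluates both formulas for the same body, using the relativistic momentum ##p=\gamma m_0 v## in the second one. The function names and the sample speeds are just illustrative choices, not anything from the thread.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(v):
    """gamma = 1 / sqrt(1 - v^2/c^2) for a speed v < c."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def energy_from_gamma(m0, v):
    """First formula: E = gamma * m0 * c^2."""
    return lorentz_factor(v) * m0 * C**2


def energy_from_momentum(m0, v):
    """Second formula, using the relativistic momentum p = gamma * m0 * v."""
    p = lorentz_factor(v) * m0 * v
    return math.sqrt((m0 * C**2) ** 2 + (p * C) ** 2)


m_electron = 9.109e-31  # kg, approximate electron rest mass
for beta in (0.1, 0.5, 0.9, 0.99):
    e1 = energy_from_gamma(m_electron, beta * C)
    e2 = energy_from_momentum(m_electron, beta * C)
    print(f"beta={beta}: E1={e1:.6e} J, E2={e2:.6e} J")
```

The two columns agree to floating-point precision. If the Newtonian momentum ##m_0 v## is used instead, they no longer agree, which is probably where the apparent mismatch came from.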
 
certainly said:
You are no doubt aware that at relativistic speeds mass increases.
This is not in line with what most physicists mean when they say "mass". See our FAQ: https://www.physicsforums.com/threads/what-is-relativistic-mass-and-why-it-is-not-used-much.796527/
 
Orodruin said:
It happens that the term relativistic mass is used, in particular in introductory text on special relativity.
This was exactly the case.
Orodruin said:
relativistic mass depends on the frame in which the object is observed
So does this mean that if I accelerate an object of mass ##m## to a high velocity and bring it back to rest, the mass will still be ##m##?
 
@certainly Thanks, but I usually avoid using ##\gamma m_0## for ##m## and instead just stick with ##\vec p=\gamma m_0 \vec v##, since it causes much less confusion.

@Orodruin Ah, got it! I was wondering how ##\sqrt{1+{\gamma}^2 \frac{v^2}{c^2}}## could be equal to gamma, but it turns out I was just being lazy and was not explicitly substituting the expression for gamma to prove the equivalence of the formulae. But I have one more question though - why do we use the second equation when the first one is so elegant, neat and compact? I can understand that in collisions involving multiple bodies one would use four-momentum vectors, find a zero-momentum frame and then use scalar products, but for relativistic K.E. calculations, can a simple ##m_0c^2(\gamma -1)## not suffice? I mean, first calculating the relativistic momentum in the individual spatial dimensions, then finding the resultant momentum and using the scalar product to calculate the total energy of the body in a frame takes a lot more time than simply plugging the resultant velocity and mass into the first formula.
 
The second form is actually much more elegant in my view and the formulation in terms of four vectors is very powerful. You will find very few particle physicists actually compute gamma factors in terms of velocities.
 
PWiz said:
But I have one more question though - why do we use the second equation when the first one is so elegant, neat and compact?

The second formula applies even for zero-mass particles such as the photon.
 
stevendaryl said:
The second formula applies even for zero-mass particles such as the photon.
How so? You have to use ##E=hf## regardless of which of the two formulas you choose, since the spatial momentum of particles with 0 rest mass is equal to their energy as they move on null lines.
 
  • #10
PWiz said:
How so? You have to use ##E=hf## regardless of which of the two formulas you choose, since the spatial momentum of particles with 0 rest mass is equal to their energy as they move on null lines.

Well, ##E=\gamma m_0 c^2## is undefined when ##v=c## and ##m_0 = 0##. But ##E = \sqrt{(m_0)^2 c^4 + p^2 c^2}## is still true.
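As a rough illustration of the massless case, the sketch below feeds ##m_0 = 0## into the momentum form and compares the result with ##E=hf##. The photon momentum ##p = h/\lambda## and the 500 nm wavelength are assumptions brought in for the example; they are not stated in the thread.

```python
import math

C = 299_792_458.0   # speed of light, m/s
H = 6.62607015e-34  # Planck constant, J s


def energy(m0, p):
    """E = sqrt((m0 c^2)^2 + (p c)^2); well defined for m0 = 0 as well."""
    return math.sqrt((m0 * C**2) ** 2 + (p * C) ** 2)


wavelength = 500e-9        # m, an illustrative visible-light photon
p_photon = H / wavelength  # assumed photon momentum, p = h / lambda
frequency = C / wavelength

e_from_invariant = energy(0.0, p_photon)
e_from_planck = H * frequency
print(e_from_invariant, e_from_planck)
```

The first formula cannot be evaluated this way, because ##\gamma## diverges at ##v=c## while ##m_0## vanishes, leaving an undefined product.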
 
  • #11
stevendaryl said:
Well, ##E=\gamma m_0 c^2## is undefined when ##v=c## and ##m_0 = 0##. But ##E = \sqrt{(m_0)^2 c^4 + p^2 c^2}## is still true.
Yes, but it's more work again. For objects with non-zero rest mass, one can use ##E=\gamma m_0 c^2## and for massless particles, ##E=hf## can be used. I still don't find a reason to use the length of a four-momentum vector. Can you give an example where it is more convenient to use the second formula?
 
  • #12
PWiz said:
Can you give an example where it is more convenient to use the second formula?

Basically any kinematics problem in special relativity.

PWiz said:
I still don't find a reason to use the length of a four-momentum vector.

The "length" of the four-momentum is ##m_0 c##. The power of the four-vector formalism is that inner products between four-vectors are scalars and you can therefore evaluate them in any frame.

I can tell you from experience in correcting exams in special relativity: The people who try to apply conservation of energy and momentum separately rather than learning the four-vector formalism generally make mistakes and get lost in mathematical issues that they really would not need to if they applied four-vector formalism.

Here is an example from the latest exam I put to my students:
Consider the particle collision ##e^- + e^- \to e^- + e^- + e^- + e^+##. Compute the necessary total energy of one of the initial electrons in the rest frame of the other for this process to occur. Also compute the ratio between this energy and the total required energy in the center of momentum frame.
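One possible way to attack this with invariants, sketched under the assumption that threshold means all four final particles are at rest in the center of momentum frame, so that the invariant ##s=(p_1+p_2)^2## equals ##(4m_e)^2## in units with ##c=1##. In the target rest frame ##s = 2m_e^2 + 2m_e E##, which gives ##E = 7m_e##. The helper names below are hypothetical, and the ratio is read as beam energy over total center-of-momentum energy.

```python
M_E = 0.51099895  # electron rest energy in MeV (units with c = 1)


def minkowski_dot(a, b):
    """Inner product with signature (+, -, -, -); vectors are (E, px, py, pz)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def threshold_beam_energy(m_beam, m_target, final_masses):
    """Beam energy in the target rest frame at which s = (sum of final masses)^2."""
    s_threshold = sum(final_masses) ** 2
    # In the target rest frame: s = m_beam^2 + m_target^2 + 2 * m_target * E_beam
    return (s_threshold - m_beam**2 - m_target**2) / (2.0 * m_target)


e_beam = threshold_beam_energy(M_E, M_E, [M_E] * 4)
e_cm_total = 4 * M_E  # all four final particles at rest in the CM frame

# Cross-check: build the lab-frame four-momenta explicitly and recompute s.
p_beam = (e_beam, 0.0, 0.0, (e_beam**2 - M_E**2) ** 0.5)
p_target = (M_E, 0.0, 0.0, 0.0)
total = tuple(x + y for x, y in zip(p_beam, p_target))
s_lab = minkowski_dot(total, total)

print("beam energy / m_e:", e_beam / M_E)
print("beam energy / total CM energy:", e_beam / e_cm_total)
print("s / m_e^2 from lab four-momenta:", s_lab / M_E**2)
```

No velocity or gamma factor is ever computed here, which is the point being made about the four-vector approach.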
 
  • #13
PWiz said:
Yes, but it's more work again. For objects with non-zero rest mass, one can use ##E=\gamma m_0 c^2## and for massless particles, ##E=hf## can be used. I still don't find a reason to use the length of a four-momentum vector. Can you give an example where it is more convenient to use the second formula?

Vectors are much more convenient to work with than non-vector quantities such as ##\gamma##. For one thing, an expression written in terms of 4-vectors is true in any reference frame, so you can pick a reference frame in which the vectors have a particularly simple form to evaluate them.

For example, the expression

$$E^2 - p^2 c^2$$

is the squared magnitude of the 4-vector ##(E, p^x c, p^y c, p^z c)##. It can be evaluated for a massive particle by looking in its rest frame, where

$$E = m_0 c^2, \qquad p = 0$$

So we conclude that ##E^2 - p^2 c^2 = m_0^2 c^4## is true in every frame.

Here's another example of using 4-vectors. Suppose you have a rocket that travels in such a way that, as measured by accelerometers aboard the rocket, the acceleration is constant. How do you compute the rocket's position as a function of time?

That's enormously complicated to do using Lorentz transformations. But in terms of 4-vectors, we can let ##U## be the 4-velocity, and let ##A## be the 4-acceleration, and then the condition of constant acceleration becomes:

$$A \cdot A = -g^2$$

where ##g## is the magnitude of the acceleration. That equation can readily be integrated to get ##U##, which can be integrated to get ##(ct, x, y, z)## as a function of proper time, ##\tau##.
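For motion along ##x## starting from rest at the origin, the integration leads to the well-known hyperbolic worldline ##ct = \frac{c^2}{g}\sinh\frac{g\tau}{c}##, ##x = \frac{c^2}{g}\left(\cosh\frac{g\tau}{c}-1\right)##. The sketch below, which assumes the signature ##(+,-)## and an illustrative ##g = 9.81\ \mathrm{m/s^2}##, checks that this solution satisfies ##U\cdot U = c^2## and ##A\cdot A=-g^2##.

```python
import math

C = 299_792_458.0  # m/s
G = 9.81           # m/s^2, proper acceleration (illustrative value)
YEAR = 365.25 * 24 * 3600.0  # seconds


def dot(a, b):
    """Inner product of (ct, x) two-vectors with signature (+, -)."""
    return a[0] * b[0] - a[1] * b[1]


def worldline(tau):
    """(ct, x) of a rocket that starts at rest at the origin."""
    phi = G * tau / C
    return (C**2 / G * math.sinh(phi), C**2 / G * (math.cosh(phi) - 1.0))


def four_velocity(tau):
    """U = d(ct, x)/dtau."""
    phi = G * tau / C
    return (C * math.cosh(phi), C * math.sinh(phi))


def four_acceleration(tau):
    """A = dU/dtau."""
    phi = G * tau / C
    return (G * math.sinh(phi), G * math.cosh(phi))


for years in (0.5, 1.0, 2.0):
    tau = years * YEAR
    ct, x = worldline(tau)
    u, a = four_velocity(tau), four_acceleration(tau)
    print(f"tau={years} yr: t={ct / C / YEAR:.3f} yr, x={x / (C * YEAR):.3f} ly, "
          f"U.U/c^2={dot(u, u) / C**2:.6f}, A.A/g^2={dot(a, a) / G**2:.6f}")
```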
 
  • #14
@Orodruin Did your students have to include the repulsive force between the electrons in their calculations?
Orodruin said:
The "length" of the four-momentum is ##m_0 c##.
Um shouldn't it be ##m_0 c^2## instead?
@stevendaryl I liked the rocket example. I suppose one could use hyperbolic functions there and avoid vectors, but that would indeed complicate matters.
 
  • #15
PWiz said:
@Orodruin Did your students have to include the repulsive force between the electrons in their calculations?

No. This type of computation generally assumes that the end products are well separated.

Um shouldn't it be ##m_0 c^2## instead?

No, ##m c^2## has units of energy, not momentum.
 
  • #16
Orodruin said:
No, ##m c^2## has units of energy, not momentum.
Wait, are you defining the four-momentum as ##p^{\mu}=(\frac E{c},\vec p)## or as ##(E,\vec pc)##? The first one will have a magnitude of ##m_0 c## whereas the second will have ##m_0 c^2##.
 
  • #17
PWiz said:
Wait, are you defining the four-momentum as ##p^{\mu}=(\frac E{c},\vec p)## or as ##(E,\vec pc)##? The first one will have a magnitude of ##m_0 c## whereas the second will have ##m_0 c^2##.

Like most theoretical physicists, I usually set ##c = 1## and define the four-momentum as ##(E,\vec p)##. :)
If I am forced to use units where ##c \neq 1## I would go for the definition ##(E/c,\vec p)##. After all, it is called four-momentum and not four-energy.
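The difference between the two conventions is only a factor of ##c##, as this small sketch suggests. The function name and the sample mass and speed are illustrative assumptions.

```python
import math

C = 299_792_458.0  # m/s


def four_momentum_norms(m0, v):
    """Return the invariant norm of (E/c, p) and of (E, p c) for motion along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    energy = gamma * m0 * C**2
    p = gamma * m0 * v
    norm_momentum_units = math.sqrt((energy / C) ** 2 - p**2)  # convention (E/c, p)
    norm_energy_units = math.sqrt(energy**2 - (p * C) ** 2)    # convention (E, p c)
    return norm_momentum_units, norm_energy_units


m0 = 1.0  # kg, illustrative
norm_p, norm_e = four_momentum_norms(m0, 0.8 * C)
print(norm_p / (m0 * C), norm_e / (m0 * C**2))
```

Both ratios come out as 1 up to rounding, independent of the speed chosen.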
 
  • #18
Orodruin said:
As most theoretical physicists, I usually set ##c = 1## and define the four-momentum as ##(E,\vec p)##. :)
If I am forced to use units where ##c \neq 1## I would go for the definition ##(E/c,\vec p)##. After all, it is called four-momentum and not four-energy.
It's the same thing with spacetime coordinates as well. Some use ##(t,x,y,z)## and some use ##(ct,x,y,z)##. The problem is that it becomes painful solving questions which are framed in SI units because you have to remember where to insert the c's and constantly multiply/divide. I usually like to use the ##(ict,x,y,z)## and add ##i## to the 0th component of pretty much every four vector so that I don't have to "untune" my Euclidean metric sense. Is this unwise?
 
  • #19
PWiz said:
It's the same thing with spacetime coordinates as well. Some use ##(t,x,y,z)## and some use ##(ct,x,y,z)##. The problem is that it becomes painful solving questions which are framed in SI units because you have to remember where to insert the c's and constantly multiply/divide. I usually like to use the ##(ict,x,y,z)## and add ##i## to the 0th component of pretty much every four vector so that I don't have to "untune" my Euclidean metric sense. Is this unwise?

The use of ict is discouraged these days. It makes some formulas in SR seem simpler, but when you go on to General Relativity, using ict no longer works.

The preferred way to deal with vectors in General Relativity is to explicitly use the metric tensor. Do you know what the metric tensor is?
 
  • #20
PWiz said:
It's the same thing with spacetime coordinates as well. Some use ##(t,x,y,z)## and some use ##(ct,x,y,z)##. The problem is that it becomes painful solving questions which are framed in SI units because you have to remember where to insert the c's and constantly multiply/divide. I usually like to use the ##(ict,x,y,z)## and add ##i## to the 0th component of pretty much every four vector so that I don't have to "untune" my Euclidean metric sense. Is this unwise?
It is old fashioned, but if it helps you, that is all that matters. I think it has fewer potential problems than relativistic mass. In fact, the only major limitation is the generalization to arbitrary coordinates in SR or to GR - it only remains simple in SR with standard coordinates. For example, Dirac and Bondi developed nice approaches in SR that use coordinates that have two lightlike coordinates and two spatial coordinates, with no time coordinate at all. That works just fine with metric approaches, but I don't see how you deal with such formulations using ict.
 
  • #21
stevendaryl said:
but when you go on to General Relativity, using ict doesn't work, anymore.
Isn't that because in SR spacetime is represented by flat Minkowski space, whereas in GR spacetime is a curved pseudo-Riemannian manifold (which is locally flat)?
stevendaryl said:
Do you know what the metric tensor is?
Yes, but only the Euclidean metric tensor $$g=
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
\end{bmatrix}$$
and the Lorentzian metric tensor (with c=1)
$$η_{μν}=
\begin{bmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\end{bmatrix}$$
used for taking the inner product of two vectors (and hence the norm of a vector) in their respective manifolds.
I don't know the mathematical details of GR, but I'm pretty sure a different metric tensor is used with polar coordinates (I think the Schwarzschild metric), although ##ds^2=η_{μν} dx^μ dx^ν## still holds true (I guess).
@PAllen Which alternate coordinate systems in SR are you talking about?
 
  • #22
PWiz said:
Isn't that because in SR spacetime is represented by flat Minkowski space, whereas in GR spacetime is a curved pseudo-Riemannian manifold (which is locally flat)?

Yes, but only the Euclidean metric tensor $$g=
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
\end{bmatrix}$$
and the Lorentzian metric tensor (with c=1)
$$η_{μν}=
\begin{bmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\end{bmatrix}$$
used for taking the norm of two vectors in their respective manifolds.
I don't know the mathematical details of GR, but I'm pretty sure a different metric tensor is used with polar coordinates (I think the Schwarzschild metric), although ##ds^2=η_{μν} dx^μ dx^ν## still holds true (I guess).
@PAllen Which alternate coordinate systems in SR are you talking about?

Well, for example, you can do SR in polar coordinates, in which case the metric tensor no longer takes on the form of ##\eta_{\mu \nu}##.

The point is that using ict is a way to get the minus sign into expressions such as ##-(ct)^2 + x^2 + y^2 + z^2##, so you can pretend that you are using Euclidean vectors. But the more general approach is to let the ##-1## be part of the metric tensor, and keep all components as real numbers.
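A minimal sketch of the two bookkeeping choices in standard coordinates: a plain sum of squares over ##(ict, x, y, z)## versus real components contracted with ##\eta = \mathrm{diag}(-1,1,1,1)##. The sample coordinate values are arbitrary.

```python
def euclidean_square(vec):
    """Plain sum of squares with no metric; components may be complex."""
    return sum(component * component for component in vec)


def minkowski_square(vec):
    """eta_{mu nu} x^mu x^nu with eta = diag(-1, 1, 1, 1) and real components."""
    eta = (-1.0, 1.0, 1.0, 1.0)
    return sum(sign * component**2 for sign, component in zip(eta, vec))


ct, x, y, z = 5.0, 1.0, 2.0, 3.0
with_ict = euclidean_square((1j * ct, x, y, z))
with_metric = minkowski_square((ct, x, y, z))
print(with_ict, with_metric)
```

The numbers agree, but only because the metric is diagonal with constant entries here; the minus sign has simply been moved from the metric into the components.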
 
  • #23
stevendaryl said:
But the more general approach is to let the ##-1## be part of the metric tensor, and keep all components as real numbers.
The Lorentzian metric changes to a Euclidean one if you use ##ict##. Isn't it better that way? You then have only one metric tensor to deal with.
 
  • #24
Consider ##u = x-ct##, ##v=x+ct##. A line of constant ##u## describes a light ray in the +x direction; a line of constant ##v## describes a light ray in the -x direction. Using these coordinates (ignoring the yz plane), you end up with the amazingly simple metric:

$$ds^2 = du\,dv$$

and I am not setting c=1; it just drops out if you transform to these coordinates.

So where do you put your ict here?
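A quick numerical check of that claim, assuming the sign convention ##ds^2 = -c^2 dt^2 + dx^2##. The sample displacements are arbitrary.

```python
C = 299_792_458.0  # m/s


def interval_tx(dt, dx):
    """ds^2 = -(c dt)^2 + dx^2 in ordinary (t, x) coordinates."""
    return -(C * dt) ** 2 + dx**2


def interval_uv(dt, dx):
    """ds^2 = du dv with u = x - ct and v = x + ct."""
    du = dx - C * dt
    dv = dx + C * dt
    return du * dv


samples = [(1e-8, 5.0), (2e-8, 1.0), (0.0, 3.0), (1e-8, 0.0)]  # (dt in s, dx in m)
for dt, dx in samples:
    print(dt, dx, interval_tx(dt, dx), interval_uv(dt, dx))
```

In these coordinates the metric has only off-diagonal components, so there is no single component that a factor of ##i## could be attached to.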
 
  • #25
PWiz said:
The Lorentzian metric changes to a Euclidean one if you use ##ict##. Isn't it better that way? You then have only one metric tensor to deal with.

No, it's really not the better approach. The approach that you're talking about ignores the distinction between covariant vectors (ones with components ##A^\mu##) and contravariant vectors (ones with components ##A_\mu##). That distinction is extremely important for curved spacetime, but it's also important for flat spacetime, if you're using curvilinear coordinates.

For example, let's just do 3-dimensional vectors in spherical coordinates, nonrelativistically. Then a point in space is described by three numbers, ##r, \theta, \phi##. The motion of a particle would similarly be described by three numbers: ##\frac{dr}{dt}, \frac{d\theta}{dt}, \frac{d\phi}{dt}##, which you can think of as components of the velocity vector. But if you're computing the speed ##v##, it's not simply ##v^2 = \left(\frac{dr}{dt}\right)^2 + \left(\frac{d\theta}{dt}\right)^2 + \left(\frac{d\phi}{dt}\right)^2##. Instead, it's more complicated:

$$v^2 = \left(\frac{dr}{dt}\right)^2 + r^2 \left(\frac{d\theta}{dt}\right)^2 + r^2 \sin^2(\theta) \left(\frac{d\phi}{dt}\right)^2$$

That can be understood as

$$\sum_{i,j} g_{ij} v^i v^j$$

where ##g_{rr} = 1##, ##g_{\theta \theta} = r^2##, ##g_{\phi \phi} = r^2 \sin^2(\theta)##.

Of course, it is possible to get a notion of a velocity vector that uses a trivial metric, by letting the components of the velocity vector be:

$$\frac{dr}{dt}, \quad r \frac{d\theta}{dt}, \quad r \sin(\theta) \frac{d\phi}{dt}$$

That allows the metric to be simple, but in exchange for making the velocity vector more complicated. In a sense, what you'd be doing is incorporating ##\sqrt{g}## into the definition of the velocity vector (which is the same thing that is going on in using ict). That's a convoluted thing to do, and only works in the cases where the metric tensor is diagonal, which isn't always the case.
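To make the spherical example concrete, here is a sketch that computes ##v^2## once with the metric ##g_{ij}## and once by differentiating the Cartesian position numerically. The path itself is a made-up example with constant coordinate rates.

```python
import math


def position(t):
    """Hypothetical path in spherical coordinates (r, theta, phi)."""
    return (1.0 + 0.1 * t, 0.5 + 0.2 * t, 0.3 * t)


def coordinate_velocity(t):
    """(dr/dt, dtheta/dt, dphi/dt) for the path above."""
    return (0.1, 0.2, 0.3)


def speed_squared_metric(t):
    """v^2 = g_ij v^i v^j with the diagonal spherical metric."""
    r, theta, _ = position(t)
    g_diag = (1.0, r**2, (r * math.sin(theta)) ** 2)  # g_rr, g_theta_theta, g_phi_phi
    return sum(g * v**2 for g, v in zip(g_diag, coordinate_velocity(t)))


def to_cartesian(t):
    r, theta, phi = position(t)
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def speed_squared_cartesian(t, h=1e-6):
    """v^2 from a central finite difference of the Cartesian position."""
    before, after = to_cartesian(t - h), to_cartesian(t + h)
    return sum(((b - a) / (2.0 * h)) ** 2 for a, b in zip(before, after))


t = 2.0
naive = sum(v**2 for v in coordinate_velocity(t))  # ignores the metric
print(speed_squared_metric(t), speed_squared_cartesian(t), naive)
```

The metric-based and Cartesian results agree, while the naive sum of squared coordinate rates does not.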
 
  • #26
@stevendaryl OK, I guess the better approach is to use simpler vector components by including the extra terms in the metric.
stevendaryl said:
The approach that you're talking about ignores the distinction between covariant vectors (ones with components ##A^\mu##) and contravariant vectors (ones with components ##A_\mu##).
I think you reversed the order there - upper indices denote contravariant vectors. If the vector ##\vec A = A^{\alpha} e_{\alpha}## is constructed in a different frame, then the basis vector transformation will be represented by ##e_{\alpha}=Λ^{\bar \beta}_{\alpha} e_{\bar \beta}## (covariance), while the vector components transformation will be given by ##A^{\bar \beta}=Λ^{\bar \beta}_{\alpha} A^{\alpha}## (contravariance), so that the same vector is constructed in the other frame as well (using the Lorentz transformation matrix).
stevendaryl said:
##\sum_{i,j} g_{ij} v^i v^j##
If upper indices were covariant, this would not make any sense.
 
  • #27
PWiz said:
If upper indices were covariant, this would not make any sense.
Well, upper indices are contravariant, so "lucky" us ...
 
  • #28
PWiz said:
I think you reversed the order there - upper indices denote contravariant vectors. If the vector ##\vec A = A^{\alpha} e_{\alpha}## is constructed in a different frame, then the basis vector transformation will be represented by ##e_{\alpha}=Λ^{\bar \beta}_{\alpha} e_{\bar \beta}## (covariance), while the vector components transformation will be given by ##A^{\bar \beta}=Λ^{\bar \beta}_{\alpha} A^{\alpha}## (contravariance) so that the same vector is constructed in the other frame as well (using the lorentz transformation matrix).

Sorry for mixing up the terminology. But your explanation of the two prefixes using basis vectors simply shows that basis vector indices have to be the opposite of vector components: If one is "co", then the other is "contra". It doesn't say which should be which.
 
  • #29
This is actually the reason I dislike the term "covariant vector" (and "contravariant vector"). The vectors themselves are either tangent vectors, which may be defined as directional derivatives or equivalence classes of curves, or covectors. Tangent vectors have a coordinate basis which transforms covariantly with components transforming contravariantly and covectors have contravariant coordinate bases and covariant components. The vectors themselves are not dependent on the choice of coordinate system and are either tangent vectors or covectors.
 
  • #30
stevendaryl said:
Sorry for mixing up the terminology. But your explanation of the two prefixes using basis vectors simply shows that basis vector indices have to be the opposite of vector components: If one is "co", then the other is "contra". It doesn't say which should be which.
AFAIK, there is no way to actually "prove" why upper indices were chosen to be contravariant - it just seems like an arbitrary decision to me. All that really matters in the end is that the summation is correctly carried out when two identical upper and lower indices are seen in an expression, and that one-forms and vectors are clearly distinguishable and recognizable when seen together (to prevent tensor algebra from going topsy-turvy).
 
  • #31
Orodruin said:
This is actually the reason I dislike the term "covariant vector" (and "contravariant vector"). The vectors themselves are either tangent vectors, which may be defined as directional derivatives or equivalence classes of curves, or covectors. Tangent vectors have a coordinate basis which transforms covariantly with components transforming contravariantly and covectors have contravariant coordinate bases and covariant components. The vectors themselves are not dependent on the choice of coordinate system and are either tangent vectors or covectors.
Well "covariant" and "contravariant" are old terms - they tend to mix up the expression of new vectors in terms of the old basis vectors and the expression of the same vector in terms of the new basis. I think we can agree that one-forms and vectors sound "nicer".
 
  • #32
I actually prefer tangent vectors and co(tangent)vectors. The reason is that "one form" brings my mind to think of n-forms and not arbitrary tensors.
 
  • #33
Orodruin said:
The reason is that "one form" brings my mind to think of n-forms and not arbitrary tensors.

Is there a difference between n-forms and tensors?
 
  • #34
PWiz said:
AFAIK, there is no way to actually "prove" why upper indices were chosen to be contravariant - it just seems like an arbitrary decision to me. All that really matters in the end is that the summation is correctly carried out when two identical upper and lower indices are seen in an expression, and that one-forms and vectors are clearly distinguishable and recognizable when seen together (to prevent tensor algebra from going topsy-turvy).

Well, in the Einstein convention, it's pretty hard to go wrong, because you always match a raised index with a lowered index. The only additional bit that needs to be remembered is that whatever your convention for variables such as ##x^\alpha##, derivatives count as the opposite: ##\partial_\alpha##.
 
  • #35
ChrisVer said:
Is there a difference between n-forms and tensors?

The way I understand it, an n-form is a special case of a tensor, namely a tensor whose components have all lowered indices.
 
  • #36
I should have written an (0 n)-tensor or (n 0)-tensor (I don't remember right now where the contra/co-variant rank goes in this notation)
 
  • #37
An ##n##-form is (equivalent to) a totally anti-symmetric ##(0,n)## tensor. A typical example of a ##(0,2)## tensor which is not a 2-form is the metric.
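The distinction can be checked on components. In the sketch below, `T` is a made-up ##(0,2)## tensor with no particular symmetry and `F` is its antisymmetric part; the Minkowski metric is used as the standard example of a ##(0,2)## tensor that is not a 2-form.

```python
ETA = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]


def is_antisymmetric(t):
    """True if t_ij = -t_ji for all index pairs."""
    n = len(t)
    return all(t[i][j] == -t[j][i] for i in range(n) for j in range(n))


def antisymmetric_part(t):
    """Components (t_ij - t_ji) / 2."""
    n = len(t)
    return [[0.5 * (t[i][j] - t[j][i]) for j in range(n)] for i in range(n)]


def evaluate(t, u, v):
    """t(u, v) = sum_ij t_ij u^i v^j, linear in each argument."""
    n = len(t)
    return sum(t[i][j] * u[i] * v[j] for i in range(n) for j in range(n))


# Hypothetical (0,2) tensor components, for illustration only.
T = [
    [1.0, 2.0, 0.0, 4.0],
    [0.0, 3.0, 5.0, 1.0],
    [2.0, 1.0, 0.0, 6.0],
    [7.0, 0.0, 2.0, 1.0],
]
F = antisymmetric_part(T)
u = [1.0, 2.0, 3.0, 4.0]
v = [0.5, -1.0, 2.0, 0.0]

print("metric is a 2-form:", is_antisymmetric(ETA))
print("F is a 2-form:", is_antisymmetric(F))
print("F(u, v) + F(v, u):", evaluate(F, u, v) + evaluate(F, v, u))
print("eta(u, v) - eta(v, u):", evaluate(ETA, u, v) - evaluate(ETA, v, u))
```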
 
  • #38
Isn't an n-form a ##\binom {0}{n}## type tensor - a function of ##n## vectors into the real numbers which is linear in each of its ##n## arguments?
 
  • #39
So it takes the antisymmetry of the (0 n)-tensor...
The thing is that I have only encountered 0-forms or 1-forms, and this didn't let me see any distinction.
 
  • #40
Of course, for 1-forms, there is nothing to be anti-symmetric with so they are equivalent to covectors.
 
  • #41
As for the ict...
One useful application of that in SR is when you want to look at the rotations and boosts of the Lorentz group SO(3,1) rather than the pure rotations of SO(4). In fact, it changes the trigonometric functions (rotations) into hyperbolic ones (boosts). It is useful because I think it doesn't run into the problems with GR raised by stevendaryl.
And generally then for Wick rotations (but that's another thing).
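The rotation-to-boost statement can be sketched as follows: an ordinary rotation of the pair ##(x, ict)## through the imaginary angle ##i\varphi## reproduces a boost with rapidity ##\varphi##, because ##\cos(i\varphi)=\cosh\varphi## and ##\sin(i\varphi)=i\sinh\varphi##. The function names and sample numbers are illustrative.

```python
import cmath
import math


def boost(ct, x, rapidity):
    """Standard Lorentz boost along x written with hyperbolic functions."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return (ct * ch - x * sh, x * ch - ct * sh)


def rotate_ict(ct, x, rapidity):
    """Euclidean-style rotation of (x, w = i c t) through the angle i * rapidity."""
    theta = 1j * rapidity
    w = 1j * ct
    x_new = x * cmath.cos(theta) + w * cmath.sin(theta)
    w_new = -x * cmath.sin(theta) + w * cmath.cos(theta)
    # Convert back: w = i c t, so c t = w / i; x stays real.
    return ((w_new / 1j).real, x_new.real)


ct, x, rapidity = 3.0, 1.0, 0.7
print(boost(ct, x, rapidity))
print(rotate_ict(ct, x, rapidity))
```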
 
  • #42
stevendaryl said:
Well, in the Einstein convention, it's pretty hard to go wrong, because you always match a raised index with a lowered index. The only additional bits that need to be remembered is that whatever your convention for variables such as ##x^\alpha##, derivatives count as the opposite: ##\partial_\alpha##.
It can sometimes be difficult to remember which indices go up and which go down. Take the tensor transformation law for example:
$$S^{\mu'}{}_{\nu'\rho'}= \frac{\partial x^{\mu'}}{\partial x^{\mu}} \frac{\partial x^{\nu}}{\partial x^{\nu'}} \frac{\partial x^{\rho}}{\partial x^{\rho'}} S^{\mu}{}_{\nu\rho}$$
I'm trying to memorize this (and it's a little tricky) right now. If I were to mix up the order of even one index, everything would go for a toss.
 
  • #43
PWiz said:
It can sometimes be difficult to remember which indices go up and which go down. Take the tensor transformation law for example:
$$S^{\mu'}{}_{\nu'\rho'}= \frac{\partial x^{\mu'}}{\partial x^{\mu}} \frac{\partial x^{\nu}}{\partial x^{\nu'}} \frac{\partial x^{\rho}}{\partial x^{\rho'}} S^{\mu}{}_{\nu\rho}$$
I'm trying to memorize this (and it's a little tricky) right now. If I were to mix up the order of even one index, everything would go for a toss.

There is really only one way of doing it and the rules to follow are very simple. Free upper indices need to be up on both sides and vice versa. Repeated (summation) indices need to appear one up and one down (partial derivatives count as down in terms of the coordinate they are derivatives with respect to).
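As a sanity check of these rules, the sketch below applies the transformation law to a tensor built as ##S^{\mu}{}_{\nu\rho} = A^\mu B_\nu C_\rho## in 1+1 dimensions under a boost (with ##c=1##), and compares the result with transforming ##A##, ##B## and ##C## individually. The component values and the boost speed are arbitrary, and the names are hypothetical.

```python
import math


def boost_matrices(beta):
    """Return (dx^{mu'}/dx^{mu}, dx^{nu}/dx^{nu'}) for a boost along x, c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    forward = [[gamma, -gamma * beta], [-gamma * beta, gamma]]
    inverse = [[gamma, gamma * beta], [gamma * beta, gamma]]
    return forward, inverse


def transform_mixed(S, forward, inverse):
    """One upper index uses the forward Jacobian; each lower index uses the inverse."""
    idx = range(len(S))
    return [[[sum(forward[mp][m] * inverse[n][np_] * inverse[r][rp] * S[m][n][r]
                  for m in idx for n in idx for r in idx)
              for rp in idx] for np_ in idx] for mp in idx]


A = [1.0, 2.0]    # vector components A^mu
B = [0.5, -1.0]   # covector components B_nu
Cv = [3.0, 4.0]   # covector components C_rho
S = [[[A[m] * B[n] * Cv[r] for r in range(2)] for n in range(2)] for m in range(2)]

forward, inverse = boost_matrices(0.6)
S_new = transform_mixed(S, forward, inverse)

# Transform the factors separately: upper index with forward, lower with inverse.
A_new = [sum(forward[mp][m] * A[m] for m in range(2)) for mp in range(2)]
B_new = [sum(inverse[n][np_] * B[n] for n in range(2)) for np_ in range(2)]
C_new = [sum(inverse[r][rp] * Cv[r] for r in range(2)) for rp in range(2)]

max_diff = max(abs(S_new[m][n][r] - A_new[m] * B_new[n] * C_new[r])
               for m in range(2) for n in range(2) for r in range(2))
print("largest mismatch:", max_diff)
```

The scalar ##B_\mu A^\mu## is also left unchanged by the same matrices, which is the "one up, one down" rule for summation indices in action.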
 
