Energy Formulas in SR: Explained

PWiz · Apr 24, 2015

As I understand it, the formula ##E=\gamma m_0 c^2## gives the total energy of a body in any inertial frame. However, the formula ##E=\sqrt{ (m_0 c^2)^2 + (pc)^2}##, which is also supposed to give the total energy of a body (in any inertial frame), does not appear to be equal to the first formula. Why is this? I'm guessing that the answer is very simple, but I'm not able to put my finger on it. Any help is appreciated.

certainly · Apr 24, 2015

##m_0c^2## is the energy of the particle at rest (##m_0## is the rest mass). You are no doubt aware that at relativistic speeds mass increases. ##E=\sqrt{(m_0c^2)^2+(pc)^2}## is the energy of the particle with the mass increase taken into consideration. It's just a restatement of the fact that ##E=m_vc^2## where ##m_v## is the mass at velocity ##v##. You should have no trouble proving equation 2 if you remember that ##m_v=\frac{m_0}{\sqrt{1-\frac{v^2}{c^2}}}##. I strongly suspect that ##\gamma m_0## in equation 1 is just a convenient way to denote ##m_v##.

Orodruin · Apr 24, 2015

PWiz said:

which is also supposed to give the total energy of a body (in any inertial frame), does not equal to the first formula. Why is this? I'm guessing that the answer is very simple,

The answer is very simple indeed. The two formulas are exactly the same if you use the relativistic expression for momentum.
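
For reference, a short sketch of that equivalence, assuming the standard relativistic momentum ##\vec p = \gamma m_0 \vec v## with ##\gamma = 1/\sqrt{1 - v^2/c^2}##:

$$
\begin{aligned}
(m_0 c^2)^2 + (pc)^2 &= m_0^2 c^4 \left(1 + \gamma^2 \frac{v^2}{c^2}\right), \\
1 + \gamma^2 \frac{v^2}{c^2} &= \frac{\left(1 - v^2/c^2\right) + v^2/c^2}{1 - v^2/c^2} = \gamma^2, \\
\sqrt{(m_0 c^2)^2 + (pc)^2} &= \gamma m_0 c^2 .
\end{aligned}
$$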

Orodruin · Apr 24, 2015

certainly said:

You are no doubt aware that at relativistic speeds mass increases.

This is not in line with what most physicists mean when they say "mass". See our FAQ: https://www.physicsforums.com/threads/what-is-relativistic-mass-and-why-it-is-not-used-much.796527/

certainly · Apr 24, 2015

Orodruin said:

It happens that the term relativistic mass is used, in particular in introductory text on special relativity.

This was exactly the case.

Orodruin said:

relativistic mass depends on the frame in which the object is observed

So does this mean that if I accelerate an object of mass ##m## to a high velocity and bring it back to rest, the mass will still be ##m##?

PWiz · Apr 24, 2015

@certainly Thanks, but I usually avoid using ##\gamma m_0## for ##m## and instead just stick with ##\vec p=\gamma m_0 \vec v##, since it causes much less confusion.

@Orodruin Ah, got it! I was wondering how ##\sqrt{1+{\gamma}^2 \frac{v^2}{c^2}}## could be equivalent to gamma, but it turns out it was just because I was being lazy and was not explicitly using the definition of gamma to prove the equivalence of the formulae. But I have one more question though - why do we use the second equation when the first one is so elegant, neat and compact? I can understand that in collisions involving multiple bodies one would use four-momentum vectors trying to find a zero momentum frame and then use scalar products, but for relativistic K.E. calculations, can a simple ##m_0c^2(\gamma -1)## not suffice? I mean first calculating the relativistic momentum in the individual spatial dimensions, then finding the resultant momentum and using the scalar product to calculate the total energy of the body in a frame takes a lot more time than simply plugging in the resultant velocity and mass in the first formula.

Orodruin · Apr 24, 2015

The second form is actually much more elegant in my view and the formulation in terms of four vectors is very powerful. You will find very few particle physicists actually compute gamma factors in terms of velocities.

stevendaryl · Apr 24, 2015

PWiz said:

But I have one more question though - why do we use the second equation when the first one is so elegant, neat and compact?

The second formula applies even for zero-mass particles such as the photon.

PWiz · Apr 24, 2015

stevendaryl said:

The second formula applies even for zero-mass particles such as the photon.

How so? You have to use ##E=hf## regardless of which of the two formulas you choose, since the spatial momentum of particles with 0 rest mass is equal to their energy as they move on null lines.

stevendaryl · Apr 24, 2015

PWiz said:

How so? You have to use ##E=hf## regardless of which of the two formulas you choose, since the spatial momentum of particles with 0 rest mass is equal to their energy as they move on null lines.

Well, [itex]E=\gamma m_0 c^2[/itex] is undefined when [itex]v=c[/itex] and [itex]m_0 = 0[/itex]. But [itex]E = \sqrt{(m_0)^2 c^4 + p^2 c^2}[/itex] is still true.
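
A minimal numerical sketch of this point. The function names, the approximate electron mass, and the sample photon momentum are illustrative assumptions, not anything taken from the thread:

```python
import math

C = 299_792_458.0  # speed of light in m/s


def energy_from_velocity(m0, v):
    """E = gamma * m0 * c^2; only defined for |v| < c."""
    beta2 = (v / C) ** 2
    if beta2 >= 1.0:
        raise ValueError("gamma is undefined for |v| >= c")
    return m0 * C**2 / math.sqrt(1.0 - beta2)


def energy_from_momentum(m0, p):
    """E = sqrt((m0 c^2)^2 + (p c)^2); defined for any m0 >= 0."""
    return math.hypot(m0 * C**2, p * C)


# Massive particle: both formulas agree once p = gamma * m0 * v is used.
m_e = 9.109e-31                      # approximate electron mass in kg
v = 0.6 * C
gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
p = gamma * m_e * v
print(energy_from_velocity(m_e, v), energy_from_momentum(m_e, p))

# Massless particle: the momentum form reduces to E = p c,
# while the velocity form cannot be evaluated at v = c.
p_photon = 1.0e-27                   # arbitrary sample momentum in kg m/s
print(energy_from_momentum(0.0, p_photon), p_photon * C)
```

The velocity form raises an error at ##v = c## here by design, which mirrors the "undefined" case described above.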

PWiz · Apr 24, 2015

stevendaryl said:

Well, [itex]E=\gamma m_0 c^2[/itex] is undefined when [itex]v=c[/itex] and [itex]m_0 = 0[/itex]. But [itex]E = \sqrt{(m_0)^2 c^4 + p^2 c^2}[/itex] is still true.

Yes, but it's more work again. For objects with non-zero rest mass, one can use ##E=\gamma m_0 c^2## and for massless particles, ##E=hf## can be used. I still don't find a reason to use the length of a four-momentum vector. Can you give an example where it is more convenient to use the second formula?

Orodruin · Apr 24, 2015

PWiz said:

Can you give an example where it is more convenient to use the second formula?

Basically any kinematics problem in special relativity.

PWiz said:

I still don't find a reason to use the length of a four-momentum vector.

The "length" of the four-momentum is ##m_0 c##. The power of the four-vector formalism is that inner products between four-vectors are scalars and you can therefore evaluate them in any frame.

I can tell you from experience in correcting exams in special relativity: The people who try to apply conservation of energy and momentum separately rather than learning the four-vector formalism generally make mistakes and get lost in mathematical issues that they really would not need to if they applied four-vector formalism.

Here is an example from the latest exam I put to my students:

Consider the particle collision ##e^- + e^- \to e^- + e^- + e^- + e^+##. Compute the necessary total energy of one of the initial electrons in the rest frame of the other for this process to occur. Also compute the ratio between this energy and the total required energy in the center of momentum frame.
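
A sketch of how the invariant approach handles this, under two assumptions not stated in the problem: at threshold all four final particles are at rest in the center of momentum frame, and the requested ratio compares the incoming electron's energy with the total center of momentum energy. Here ##m## is the electron rest mass, and ##E## and ##p## are the energy and momentum of the incoming electron in the rest frame of the target. The first line is evaluated in the target rest frame using ##E^2 - (pc)^2 = m^2 c^4##, the second equates the same invariant to its threshold value in the center of momentum frame:

$$
\begin{aligned}
(E + m c^2)^2 - (pc)^2 &= 2 m^2 c^4 + 2 E m c^2, \\
2 m^2 c^4 + 2 E m c^2 &= (4 m c^2)^2, \\
E &= 7 m c^2, \qquad \frac{E}{E_{\text{CM}}} = \frac{7 m c^2}{4 m c^2} = \frac{7}{4} .
\end{aligned}
$$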

stevendaryl · Apr 24, 2015

PWiz said:

Yes, but it's more work again. For objects with non-zero rest mass, one can use ##E=\gamma m_0 c^2## and for massless particles, ##E=hf## can be used. I still don't find a reason to use the length of a four-momentum vector. Can you give an example where it is more convenient to use the second formula?

Vectors are much more convenient to work with than non-vector quantities such as [itex]\gamma[/itex]. For one thing, an expression written in terms of 4-vectors is true in any reference frame, so you can pick a reference frame in which the vectors have a particularly simple form to evaluate them.

For example, the expression

[itex]E^2 - p^2 c^2[/itex]

is the squared magnitude of the 4-vector [itex](E, p^x c, p^y c, p^z c)[/itex]. It can be evaluated for a massive particle by looking in its rest frame, where

[itex]E = m_0 c^2[/itex]
[itex]p = 0[/itex]

So we conclude that [itex]E^2 - p^2 c^2 = m_0^2 c^4[/itex] is true in every frame.
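
As a quick numerical check of that frame independence, here is a small sketch restricted to one spatial dimension. The helper names `boost` and `invariant` and the unit test mass are assumptions made for illustration:

```python
import math

C = 299_792_458.0  # speed of light in m/s


def boost(E, p, u):
    """Transform energy E and x-momentum p to a frame moving at velocity u along x."""
    g = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    E_new = g * (E - u * p)
    p_new = g * (p - u * E / C**2)
    return E_new, p_new


def invariant(E, p):
    """E^2 - (p c)^2, which should equal (m0 c^2)^2 in every frame."""
    return E**2 - (p * C) ** 2


m0 = 1.0                             # sample rest mass in kg
E_rest, p_rest = m0 * C**2, 0.0      # rest frame values
for frac in (0.1, 0.5, 0.9):
    E, p = boost(E_rest, p_rest, frac * C)
    # The ratio stays at 1 up to floating-point rounding.
    print(frac, invariant(E, p) / (m0 * C**2) ** 2)
```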

Here's another example of using 4-vectors. Suppose you have a rocket that travels in such a way that, as measured by accelerometers aboard the rocket, the acceleration is constant. How do you compute the rocket's position as a function of time?

That's enormously complicated to do using Lorentz transformations. But in terms of 4-vectors, we can let [itex]U[/itex] be the 4-velocity, and let [itex]A[/itex] be the 4-acceleration, and then the condition of constant acceleration becomes:

[itex]A \cdot A = -g^2[/itex]

where g is the magnitude of the acceleration. That equation can readily be integrated to get [itex]U[/itex], which can be integrated to get [itex](ct, x, y, z)[/itex] as a function of proper time, [itex]\tau[/itex].
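
The result of that integration can be sketched numerically. This assumes motion along x only, a start from rest at the origin, and the (+, -) sign convention implied by [itex]A \cdot A = -g^2[/itex]; the function names are hypothetical:

```python
import math

C = 299_792_458.0  # speed of light in m/s


def rocket_worldline(g, tau):
    """Return (ct, x) at proper time tau for constant proper acceleration g."""
    phi = g * tau / C                          # rapidity
    ct = (C**2 / g) * math.sinh(phi)
    x = (C**2 / g) * (math.cosh(phi) - 1.0)
    return ct, x


def four_velocity(g, tau):
    """U = d(ct, x)/d(tau)."""
    phi = g * tau / C
    return C * math.cosh(phi), C * math.sinh(phi)


def four_acceleration(g, tau):
    """A = dU/d(tau)."""
    phi = g * tau / C
    return g * math.sinh(phi), g * math.cosh(phi)


def minkowski_dot(a, b):
    """Inner product of (time, x) components with signature (+, -)."""
    return a[0] * b[0] - a[1] * b[1]


g = 9.81                                       # proper acceleration in m/s^2
tau = 365.25 * 24 * 3600.0                     # one year of proper time in s
U = four_velocity(g, tau)
A = four_acceleration(g, tau)
print(rocket_worldline(g, tau))
print(minkowski_dot(U, U) / C**2)              # close to 1
print(minkowski_dot(A, A) / g**2)              # close to -1
```

The hyperbolic functions appear here as the solution of the four-vector condition rather than as a separate method.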

PWiz · Apr 24, 2015

@Orodruin Did your students have to include the repulsive force between the electrons in their calculations?

Orodruin said:

The "length" of the four-momentum is ##m_0 c##.

Um shouldn't it be ##m_0 c^2## instead?
@stevendaryl I liked the rocket example. I suppose one could use hyperbolic functions there and avoid vectors, but that would indeed complicate matters.

Orodruin · Apr 24, 2015

PWiz said:

@Orodruin Did your students have to include the repulsive force between the electrons in their calculations?

No. This type of computation generally assumes that the end products are well separated.

PWiz said:

Um shouldn't it be ##m_0 c^2## instead?

No, ##m c^2## has units of energy, not momentum.

PWiz · Apr 24, 2015

Orodruin said:

No, ##m c^2## has units of energy, not momentum.

Wait, you're defining the four momentum as ##p_{\mu}=(\frac E{c},\vec p)## or as ##(E,\vec pc)##? The first one will have a magnitude of ##m_0 c## whereas the second will have ##m_0 c^2##.

Orodruin · Apr 24, 2015

PWiz said:

Wait, you're defining the four momentum as ##p_{\mu}=(\frac E{c},\vec p)## or as ##(E,\vec pc)##? The first one will have a magnitude of ##m_0 c## whereas the second will have ##m_0 c^2##.

Like most theoretical physicists, I usually set ##c = 1## and define the four-momentum as ##(E,\vec p)##. :)
If I am forced to use units where ##c \neq 1## I would go for the definition ##(E/c,\vec p)##. After all, it is called four-momentum and not four-energy.

PWiz · Apr 24, 2015

Orodruin said:

Like most theoretical physicists, I usually set ##c = 1## and define the four-momentum as ##(E,\vec p)##. :)
If I am forced to use units where ##c \neq 1## I would go for the definition ##(E/c,\vec p)##. After all, it is called four-momentum and not four-energy.

It's the same thing with spacetime coordinates as well. Some use ##(t,x,y,z)## and some use ##(ct,x,y,z)##. The problem is that it becomes painful solving questions which are framed in SI units because you have to remember where to insert the c's and constantly multiply/divide. I usually like to use the ##(ict,x,y,z)## and add ##i## to the 0th component of pretty much every four vector so that I don't have to "untune" my Euclidean metric sense. Is this unwise?

stevendaryl · Apr 24, 2015

PWiz said:

It's the same thing with spacetime coordinates as well. Some use ##(t,x,y,z)## and some use ##(ct,x,y,z)##. The problem is that it becomes painful solving questions which are framed in SI units because you have to remember where to insert the c's and constantly multiply/divide. I usually like to use the ##(ict,x,y,z)## and add ##i## to the 0th component of pretty much every four vector so that I don't have to "untune" my Euclidean metric sense. Is this unwise?

The use of [itex]ict[/itex] is discouraged these days. It makes some formulas involving SR seem simpler, but when you go on to General Relativity, using [itex]ict[/itex] doesn't work anymore.

The preferred way to deal with vectors in General Relativity is to explicitly use the metric tensor. Do you know what the metric tensor is?

PAllen · Apr 24, 2015

PWiz said:

It's the same thing with spacetime coordinates as well. Some use ##(t,x,y,z)## and some use ##(ct,x,y,z)##. The problem is that it becomes painful solving questions which are framed in SI units because you have to remember where to insert the c's and constantly multiply/divide. I usually like to use the ##(ict,x,y,z)## and add ##i## to the 0th component of pretty much every four vector so that I don't have to "untune" my Euclidean metric sense. Is this unwise?

It is old fashioned, but if it helps you that is all that matters. I think it has fewer potential problems than relativistic mass. In fact, the only major limitation is the generalization to arbitrary coordinates in SR or to GR - it only remains simple in SR with standard coordinates. For example, Dirac and Bondi developed nice approaches in SR that use coordinates that have two lightlike coordinates and two spatial coordinates, with no time coordinate at all. That works just fine with metric approaches but I don't see how you deal with such formulations using ict.

PWiz · Apr 24, 2015

stevendaryl said:

but when you go on to General Relativity, using [itex]ict[/itex] doesn't work anymore.

Isn't that because in SR spacetime is represented by flat Minkowski space, whereas in GR spacetime is a curved pseudo-Riemannian manifold (which is locally flat)?

stevendaryl said:

Do you know what the metric tensor is?

Yes, but only the Euclidean metric tensor $$g=
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
\end{bmatrix}$$
and the Lorentzian metric tensor (with c=1)
$$η_{μν}=
\begin{bmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\end{bmatrix}$$
used for taking the inner product of two vectors in their respective manifolds.
I don't know the mathematical details of GR, but I'm pretty sure a different metric tensor is used with polar coordinates (I think the Schwarzschild metric), although ##ds^2=η_{μν} dx^μ dx^ν## still holds true (I guess).
@PAllen Which alternate coordinate systems in SR are you talking about?

stevendaryl · Apr 24, 2015

PWiz said:

Isn't that because in SR spacetime is represented by flat Minkowski space, whereas in GR spacetime is a curved pseudo-Riemannian manifold (which is locally flat)?

Yes, but only the Euclidean metric tensor $$g=
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
\end{bmatrix}$$
and the Lorentzian metric tensor (with c=1)
$$η_{μν}=
\begin{bmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\end{bmatrix}$$
used for taking the inner product of two vectors in their respective manifolds.
I don't know the mathematical details of GR, but I'm pretty sure a different metric tensor is used with polar coordinates (I think the Schwarzschild metric), although ##ds^2=η_{μν} dx^μ dx^ν## still holds true (I guess).
@PAllen Which alternate coordinate systems in SR are you talking about?

Well, for example, you can do SR in polar coordinates, in which case the metric tensor no longer takes on the form of [itex]\eta_{\mu \nu}[/itex].

The point is that using [itex]ict[/itex] is a way to get the minus sign into expressions such as [itex]-(ct)^2 + x^2 + y^2 + z^2[/itex], so you can pretend that you are using Euclidean vectors. But the more general approach is to let the [itex]-1[/itex] be part of the metric tensor, and keep all components as real numbers.
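
The two bookkeeping choices can be compared side by side in a short sketch. It assumes standard Cartesian coordinates and the [itex](-,+,+,+)[/itex] signature; the function names are illustrative only:

```python
C = 299_792_458.0  # speed of light in m/s

# Minkowski metric with signature (-, +, +, +); all components stay real.
ETA = (
    (-1, 0, 0, 0),
    (0, 1, 0, 0),
    (0, 0, 1, 0),
    (0, 0, 0, 1),
)


def interval_ict(t, x, y, z):
    """Sum of squares of (i c t, x, y, z): the minus sign comes from i * i."""
    comps = (1j * C * t, x, y, z)
    return sum(comp * comp for comp in comps).real


def interval_metric(t, x, y, z):
    """eta_{mu nu} x^mu x^nu with real components (c t, x, y, z)."""
    comps = (C * t, x, y, z)
    return sum(
        ETA[m][n] * comps[m] * comps[n]
        for m in range(4)
        for n in range(4)
    )


event = (1.0e-8, 1.0, 2.0, 3.0)    # sample event: t in s, then x, y, z in m
print(interval_ict(*event), interval_metric(*event))
```

Both functions give the same number here. The difference is that `ETA` could be swapped for a non-diagonal or position-dependent metric, whereas the imaginary-component trick has no obvious counterpart in that case.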

PWiz · Apr 24, 2015

stevendaryl said:

But the more general approach is to let the [itex]-1[/itex] be part of the metric tensor, and keep all components as real numbers.

The Lorentzian metric changes to a Euclidean one if you use ##ict##. Isn't it better that way? You then only have one metric tensor to deal with.

PAllen · Apr 24, 2015

Consider u = x-ct, v = x+ct. A line of constant u describes a light ray in the +x direction, and a line of constant v describes a light ray in the -x direction. Using these coordinates (ignoring the YZ plane), you end up with the amazingly simple metric:

ds² = du dv

and I am not setting c=1; it just drops out if you transform to these coordinates.

So where do you put your ict here?
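
For reference, the step behind that metric, assuming the sign convention ##ds^2 = -c^2 dt^2 + dx^2## and ignoring ##y## and ##z##:

$$
du\,dv = (dx - c\,dt)(dx + c\,dt) = dx^2 - c^2\,dt^2 = ds^2 .
$$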

stevendaryl · Apr 24, 2015

PWiz said:

The Lorentzian metric changes to a Euclidean one if you use ##ict##. Isn't it better that way? You then only have one metric tensor to deal with.

No, it's really not the better approach. The approach that you're talking about ignores the distinction between covariant vectors (ones with components [itex]A^\mu[/itex]) and contravariant vectors (ones with components [itex]A_\mu[/itex]). That distinction is extremely important for curved spacetime, but it's also important for flat spacetime, if you're using curvilinear coordinates.

For example, let's just do 3-dimensional vectors in spherical coordinates, nonrelativistically. Then a point in space is described by three numbers, [itex]r, \theta, \phi[/itex]. The path of a particle would similarly be described by three numbers: [itex]\frac{dr}{dt}, \frac{d\theta}{dt}, \frac{d\phi}{dt}[/itex], which you can think of as components of the velocity vector. But if you're computing the speed [itex]v[/itex], it's not simply [itex]v^2 = \left(\frac{dr}{dt}\right)^2 + \left(\frac{d\theta}{dt}\right)^2 + \left(\frac{d\phi}{dt}\right)^2[/itex]. Instead, it's more complicated:
[itex]v^2 = \left(\frac{dr}{dt}\right)^2 + r^2 \left(\frac{d\theta}{dt}\right)^2 + r^2 \sin^2(\theta) \left(\frac{d\phi}{dt}\right)^2[/itex]. That can be understood as

[itex]\sum_{i,j}\ g_{ij} v^i v^j[/itex]

where [itex]g_{rr} = 1[/itex], [itex]g_{\theta \theta} = r^2[/itex], [itex]g_{\phi \phi} = r^2 \sin^2(\theta)[/itex].

Of course, it is possible to get a notion of a velocity vector that uses a trivial metric, by letting the components of the velocity vector be:

[itex]\frac{dr}{dt}, r \frac{d\theta}{dt}, r \sin(\theta) \frac{d\phi}{dt}[/itex]

That allows the metric to be simple, but in exchange for making the velocity vector more complicated. In a sense, what you'd be doing is incorporating [itex]\sqrt{g}[/itex] into the definition of the velocity vector (which is the same thing that is going on in using [itex]ict[/itex]). That's a convoluted thing to do, and only works in the cases where the metric tensor is diagonal, which isn't always the case.
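
A small numerical sketch of the spherical-coordinate formula, checked against the Cartesian speed obtained by differentiating [itex]x = r \sin\theta \cos\phi[/itex], [itex]y = r \sin\theta \sin\phi[/itex], [itex]z = r \cos\theta[/itex]. The function names and sample values are assumptions made for illustration:

```python
import math


def speed_squared_spherical(r, theta, phi, r_dot, theta_dot, phi_dot):
    """v^2 = g_ij v^i v^j with the diagonal spherical metric."""
    g_rr = 1.0
    g_thth = r**2
    g_phph = (r * math.sin(theta)) ** 2
    return g_rr * r_dot**2 + g_thth * theta_dot**2 + g_phph * phi_dot**2


def speed_squared_cartesian(r, theta, phi, r_dot, theta_dot, phi_dot):
    """v^2 from the time derivatives of the Cartesian coordinates (chain rule)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    vx = r_dot * st * cp + r * ct * cp * theta_dot - r * st * sp * phi_dot
    vy = r_dot * st * sp + r * ct * sp * theta_dot + r * st * cp * phi_dot
    vz = r_dot * ct - r * st * theta_dot
    return vx**2 + vy**2 + vz**2


# Sample state: position (r, theta, phi) and coordinate rates.
state = (2.0, 0.7, 1.3, 0.5, -0.2, 0.9)
naive = state[3] ** 2 + state[4] ** 2 + state[5] ** 2   # ignores the metric
print(speed_squared_spherical(*state), speed_squared_cartesian(*state), naive)
```

The first two numbers agree up to rounding, while the naive sum of squared coordinate rates does not.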

PWiz · Apr 25, 2015

@stevendaryl OK, I guess the better approach is to use simpler vector components by including the extra terms in the metric.

stevendaryl said:

The approach that you're talking about ignores the distinction between covariant vectors (ones with components [itex]A^\mu[/itex]) and contravariant vectors (ones with components [itex]A_\mu[/itex]).

I think you reversed the order there - upper indices denote contravariant vectors. If the vector ##\vec A = A^{\alpha} e_{\alpha}## is constructed in a different frame, then the basis vector transformation will be represented by ##e_{\alpha}=Λ^{\bar \beta}_{\alpha} e_{\bar \beta}## (covariance), while the vector components transformation will be given by ##A^{\bar \beta}=Λ^{\bar \beta}_{\alpha} A^{\alpha}## (contravariance) so that the same vector is constructed in the other frame as well (using the Lorentz transformation matrix).

stevendaryl said:

[itex]\sum_{i,j}\ g_{ij} v^i v^j[/itex]

If upper indices were covariant, this would not make any sense.

Orodruin · Apr 25, 2015

PWiz said:

If upper indices were covariant, this would not make any sense.

Well, upper indices are contravariant, so "lucky" us ...

stevendaryl · Apr 25, 2015

PWiz said:

I think you reversed the order there - upper indices denote contravariant vectors. If the vector ##\vec A = A^{\alpha} e_{\alpha}## is constructed in a different frame, then the basis vector transformation will be represented by ##e_{\alpha}=Λ^{\bar \beta}_{\alpha} e_{\bar \beta}## (covariance), while the vector components transformation will be given by ##A^{\bar \beta}=Λ^{\bar \beta}_{\alpha} A^{\alpha}## (contravariance) so that the same vector is constructed in the other frame as well (using the Lorentz transformation matrix).

Sorry for mixing up the terminology. But your explanation of the two prefixes using basis vectors simply shows that basis vector indices have to be the opposite of vector components: If one is "co", then the other is "contra". It doesn't say which should be which.

Orodruin · Apr 25, 2015

This is actually the reason I dislike the term "covariant vector" (and "contravariant vector"). The vectors themselves are either tangent vectors, which may be defined as directional derivatives or equivalence classes of curves, or covectors. Tangent vectors have a coordinate basis which transforms covariantly with components transforming contravariantly and covectors have contravariant coordinate bases and covariant components. The vectors themselves are not dependent on the choice of coordinate system and are either tangent vectors or covectors.

PWiz · Apr 25, 2015

stevendaryl said:

Sorry for mixing up the terminology. But your explanation of the two prefixes using basis vectors simply shows that basis vector indices have to be the opposite of vector components: If one is "co", then the other is "contra". It doesn't say which should be which.

AFAIK, there is no way to actually "prove" why upper indices were chosen to be contravariant - it just seems like an arbitrary decision to me. All that really matters in the end is that the summation is correctly carried out when two identical upper and lower indices are seen in an expression, and that one-forms and vectors are clearly distinguishable and recognizable when seen together (to prevent tensor algebra from going topsy-turvy).
