Energy Formulas in SR: Explained

  • Context: Graduate 
  • Thread starter PWiz
  • Tags
    Energy Formulas SR

Discussion Overview

The discussion revolves around the energy formulas in special relativity, specifically the relationships between the equations ##E=\gamma m_0 c^2## and ##E=\sqrt{(m_0 c^2)^2 + (pc)^2}##. Participants explore the implications of these formulas, their applications in various contexts, and the conceptual understanding of mass and energy in relativistic physics.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants suggest that the two energy formulas are equivalent when using the relativistic expression for momentum.
  • Others argue that the term "relativistic mass" can lead to confusion and is not commonly used among physicists today.
  • A participant questions the necessity of the second formula when the first appears more elegant and compact for certain calculations.
  • Some participants propose that the second formula is more versatile, applicable even for massless particles like photons.
  • There is a discussion about the advantages of using four-vector formalism over traditional methods in special relativity, particularly in simplifying calculations.
  • A later reply emphasizes that using four-vectors can prevent common mistakes in energy and momentum conservation problems.
  • Participants explore examples where four-vectors provide clearer solutions compared to scalar quantities.

Areas of Agreement / Disagreement

Participants express differing views on the use of relativistic mass and the preferred energy formula for various scenarios. There is no consensus on which formula is superior or more appropriate for specific applications, indicating ongoing debate and exploration of the topic.

Contextual Notes

Some participants note that the discussion involves assumptions about the definitions of mass and energy, as well as the conditions under which the formulas apply. There are unresolved questions regarding the implications of using different formulations in practical scenarios.

Who May Find This Useful

This discussion may be of interest to students and professionals in physics, particularly those studying special relativity, energy-momentum relationships, and the mathematical frameworks used in theoretical physics.

PWiz
As I understand it, the formula ##E=\gamma m_0 c^2## gives the total energy of a body in any inertial frame. However, the formula ##E=\sqrt{ (m_0 c^2)^2 + (pc)^2}##, which is also supposed to give the total energy of a body (in any inertial frame), does not appear to equal the first formula. Why is this? I'm guessing that the answer is very simple, but I'm not able to put my finger on it. Any help is appreciated.
 
##m_0c^2## is the energy of the particle at rest (##m_0## is the rest mass). You are no doubt aware that at relativistic speeds mass increases. ##E=\sqrt{(m_0c^2)^2+(pc)^2}## is the energy of the particle with the mass increase taken into consideration. It's just a restatement of the fact that ##E=m_vc^2##, where ##m_v## is the mass at velocity ##v##. You should have no trouble proving equation 2 if you remember that ##m_v=\frac{m_0}{\sqrt{1-\frac{v^2}{c^2}}}##. I strongly suspect that ##\gamma m_0## in equation 1 is just a convenient way to denote ##m_v##.
 
PWiz said:
which is also supposed to give the total energy of a body (in any inertial frame), does not equal to the first formula. Why is this? I'm guessing that the answer is very simple,

The answer is very simple indeed. The two formulas are exactly the same if you use the relativistic expression for momentum.
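
For anyone who wants to see the algebra, here is a short sketch, assuming the usual definitions ##p=\gamma m_0 v## and ##\gamma = 1/\sqrt{1-v^2/c^2}##:

$$
\begin{aligned}
(m_0 c^2)^2 + (pc)^2 &= m_0^2 c^4\left(1 + \gamma^2\frac{v^2}{c^2}\right), \\
1 + \gamma^2\frac{v^2}{c^2} &= \frac{\left(1 - v^2/c^2\right) + v^2/c^2}{1 - v^2/c^2} = \gamma^2, \\
\sqrt{(m_0 c^2)^2 + (pc)^2} &= \gamma m_0 c^2 .
\end{aligned}
$$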
 
certainly said:
You are no doubt aware that at relativistic speeds mass increases.
This is not in line with what most physicists mean when they say "mass". See our FAQ: https://www.physicsforums.com/threads/what-is-relativistic-mass-and-why-it-is-not-used-much.796527/
 
Orodruin said:
It happens that the term relativistic mass is used, in particular in introductory text on special relativity.
This was exactly the case.
Orodruin said:
relativistic mass depends on the frame in which the object is observed
So does this mean that if I accelerate an object of mass ##m## to a high velocity and bring it back to rest, the mass will still be ##m##?
 
@certainly Thanks, but I usually avoid using ##\gamma m_0## for ##m## and instead just stick with ##\vec p=\gamma m_0 \vec v##, since it causes much less confusion.

@Orodruin Ah, got it! I was wondering how ##\sqrt{1+{\gamma}^2 \frac{v^2}{c^2}}## was equivalent to gamma, but it turns out it was just because I was being lazy and was not explicitly using the definition of ##\gamma## to prove the equivalence of the formulae. But I have one more question though - why do we use the second equation when the first one is so elegant, neat and compact? I can understand that in collisions involving multiple bodies one would use four-momentum vectors, trying to find a zero momentum frame and then using scalar products, but for relativistic K.E. calculations, can a simple ##m_0c^2(\gamma -1)## not suffice? I mean, first calculating the relativistic momentum in the individual spatial dimensions, then finding the resultant momentum and using the scalar product to calculate the total energy of the body in a frame takes a lot more time than simply plugging the resultant velocity and mass into the first formula.
 
The second form is actually much more elegant in my view and the formulation in terms of four vectors is very powerful. You will find very few particle physicists actually compute gamma factors in terms of velocities.
 
PWiz said:
But I have one more question though - why do we use the second equation when the first one is so elegant, neat and compact?

The second formula applies even for zero-mass particles such as the photon.
 
stevendaryl said:
The second formula applies even for zero-mass particles such as the photon.
How so? You have to use ##E=hf## regardless of which of the two formulas you choose, since the spatial momentum of particles with 0 rest mass is equal to their energy as they move on null lines.
 
  • #10
PWiz said:
How so? You have to use ##E=hf## regardless of which of the two formulas you choose, since the spatial momentum of particles with 0 rest mass is equal to their energy as they move on null lines.

Well, [itex]E=\gamma m_0 c^2[/itex] is undefined when [itex]v=c[/itex] and [itex]m_0 = 0[/itex]. But [itex]E = \sqrt{(m_0)^2 c^4 + p^2 c^2}[/itex] is still true.
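
A minimal numerical sketch of that point, assuming SI units and a photon with an arbitrarily chosen wavelength of 500 nm. The helper names `energy_from_momentum` and `energy_from_gamma` are made up for illustration:

```python
import math

C = 299_792_458.0    # speed of light in m/s
H = 6.62607015e-34   # Planck constant in J s

def energy_from_momentum(m0, p):
    """E = sqrt((m0 c^2)^2 + (p c)^2); works for m0 = 0 as well."""
    return math.hypot(m0 * C**2, p * C)

def energy_from_gamma(m0, v):
    """E = gamma m0 c^2; only defined for |v| < c."""
    beta2 = (v / C) ** 2
    if beta2 >= 1.0:
        raise ValueError("gamma is undefined for |v| >= c")
    return m0 * C**2 / math.sqrt(1.0 - beta2)

# Massless case: the momentum form reduces to E = p c.
p_photon = H / 500e-9                      # p = h / lambda (assumed wavelength)
E_photon = energy_from_momentum(0.0, p_photon)
print("photon energy in J:", E_photon)

# Massive case: both forms agree once p = gamma m0 v is used.
m_e = 9.1093837015e-31                     # electron mass in kg
v = 0.6 * C
p_e = m_e * v / math.sqrt(1.0 - (v / C) ** 2)
print(energy_from_momentum(m_e, p_e), energy_from_gamma(m_e, v))
```

Calling `energy_from_gamma(0.0, C)` raises an error because the expression is of the form 0/0, whereas the momentum form needs no special case.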
 
  • #11
stevendaryl said:
Well, [itex]E=\gamma m_0 c^2[/itex] is undefined when [itex]v=c[/itex] and [itex]m_0 = 0[/itex]. But [itex]E = \sqrt{(m_0)^2 c^4 + p^2 c^2}[/itex] is still true.
Yes, but it's more work again. For objects with non-zero rest mass, one can use ##E=\gamma m_0 c^2## and for massless particles, ##E=hf## can be used. I still don't find a reason to use the length of a four-momentum vector. Can you give an example where it is more convenient to use the second formula?
 
  • #12
PWiz said:
Can you give an example where it is more convenient to use the second formula?

Basically any kinematics problem in special relativity.

PWiz said:
I still don't find a reason to use the length of a four-momentum vector.

The "length" of the four-momentum is ##m_0 c##. The power of the four-vector formalism is that inner products between four-vectors are scalars and you can therefore evaluate them in any frame.

I can tell you from experience in correcting exams in special relativity: The people who try to apply conservation of energy and momentum separately rather than learning the four-vector formalism generally make mistakes and get lost in mathematical issues that they really would not need to if they applied four-vector formalism.

Here is an example from the latest exam I put to my students:
Consider the particle collision ##e^- + e^- \to e^- + e^- + e^- + e^+##. Compute the necessary total energy of one of the initial electrons in the rest frame of the other for this process to occur. Also compute the ratio between this energy and the total required energy in the center of momentum frame.
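
A sketch of how the invariant approach handles a problem like this, under the assumptions that ##c = 1##, energies are measured in units of the electron mass, and "threshold" means all final particles are at rest in the center of momentum frame. The function name is made up for illustration. The invariant ##s = (p_1 + p_2)^2## equals ##m_1^2 + m_2^2 + 2 E_1 m_2## in the target rest frame and ##(\sum m_f)^2## at threshold:

```python
def threshold_energy_fixed_target(m_beam, m_target, final_masses):
    """Minimum total energy of the beam particle in the target rest frame (c = 1).

    Equates s = m_beam^2 + m_target^2 + 2 E m_target (target at rest)
    with s_min = (sum of final masses)^2 (all products at rest in the CM frame).
    """
    s_min = sum(final_masses) ** 2
    return (s_min - m_beam**2 - m_target**2) / (2.0 * m_target)

m = 1.0                                    # electron mass as the unit of energy
E_lab = threshold_energy_fixed_target(m, m, [m, m, m, m])
E_cm_total = 4 * m                         # total CM energy at threshold
ratio = E_lab / E_cm_total

print("beam electron energy in units of m c^2:", E_lab)
print("ratio to total CM energy:", ratio)
```

Reading "this energy" as the energy of the moving electron alone, the ratio comes out as 7/4. No gamma factor or velocity is ever computed.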
 
  • #13
PWiz said:
Yes, but it's more work again. For objects with non-zero rest mass, one can use ##E=\gamma m_0 c^2## and for massless particles, ##E=hf## can be used. I still don't find a reason to use the length of a four-momentum vector. Can you give an example where it is more convenient to use the second formula?

Vectors are much more convenient to work with than non-vector quantities such as [itex]\gamma[/itex]. For one thing, an expression written in terms of 4-vectors is true in any reference frame, so you can pick a reference frame in which the vectors have a particularly simple form to evaluate them.

For example, the expression

[itex]E^2 - p^2 c^2[/itex]

is the magnitude of the 4-vector [itex](E, p^x c, p^y c, p^z c)[/itex]. It can be evaluated for a massive particle by looking in its rest frame, where

[itex]E = m_0 c^2[/itex]
[itex]p = 0[/itex]

So we conclude that [itex]E^2 - p^2 c^2 = m_0^2 c^4[/itex] is true in every frame.
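
A quick numerical check of that statement, as a sketch assuming ##c = 1## and motion along the x axis only. The helper names `boost_x` and `invariant_mass_squared` are made up for illustration:

```python
import math

def boost_x(E, px, beta):
    """Lorentz boost of (E, px) along x with velocity beta (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (E - beta * px), gamma * (px - beta * E)

def invariant_mass_squared(E, px, py=0.0, pz=0.0):
    """E^2 - |p|^2, the squared magnitude of the four-momentum (c = 1)."""
    return E**2 - (px**2 + py**2 + pz**2)

m0 = 1.0
E_rest, p_rest = m0, 0.0                   # rest frame: E = m0, p = 0

for beta in (0.1, 0.5, 0.9, 0.99):
    E, px = boost_x(E_rest, p_rest, beta)
    print(beta, E, px, invariant_mass_squared(E, px))
```

The energy and momentum change from frame to frame, but the last column stays at ##m_0^2## up to rounding.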

Here's another example of using 4-vectors. Suppose you have a rocket that travels in such a way that, as measured by accelerometers aboard the rocket, the acceleration is constant. How do you compute the rocket's position as a function of time?

That's enormously complicated to do using Lorentz transformations. But in terms of 4-vectors, we can let [itex]U[/itex] be the 4-velocity, and let [itex]A[/itex] be the 4-acceleration, and then the condition of constant acceleration becomes:

[itex]A \cdot A = -g^2[/itex]

where g is the magnitude of the acceleration. That equation can readily be integrated to get [itex]U[/itex], which can be integrated to get [itex](ct, x, y, z)[/itex] as a function of proper time, [itex]\tau[/itex].
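
For concreteness, here is a sketch of the result of that integration in one spatial dimension, assuming ##c = 1##, signature (+, -) so that ##A \cdot A = -g^2##, and a rocket that starts at rest at the origin. The function names are made up for illustration:

```python
import math

def worldline(tau, g):
    """(t, x) as a function of proper time for constant proper acceleration g."""
    return math.sinh(g * tau) / g, (math.cosh(g * tau) - 1.0) / g

def four_velocity(tau, g):
    """U = d(t, x)/d tau."""
    return math.cosh(g * tau), math.sinh(g * tau)

def four_acceleration(tau, g):
    """A = dU/d tau."""
    return g * math.sinh(g * tau), g * math.cosh(g * tau)

def minkowski_dot(a, b):
    """Inner product of (t, x) components with signature (+, -)."""
    return a[0] * b[0] - a[1] * b[1]

g = 1.3
for tau in (0.0, 0.5, 2.0):
    U = four_velocity(tau, g)
    A = four_acceleration(tau, g)
    print(tau, worldline(tau, g), minkowski_dot(U, U), minkowski_dot(A, A))
```

The printed invariants stay at ##U \cdot U = 1## and ##A \cdot A = -g^2## for every ##\tau##, which is the constant-acceleration condition.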
 
  • #14
@Orodruin Did your students have to include the repulsive force between the electrons in their calculations?
Orodruin said:
The "length" of the four-momentum is ##m_0 c##.
Um, shouldn't it be ##m_0 c^2## instead?
@stevendaryl I liked the rocket example. I suppose one could use hyperbolic functions there and avoid vectors, but that would indeed complicate matters.
 
  • #15
PWiz said:
@Orodruin Did your students have to include the repulsive force between the electrons in their calculations?

No. This type of computation generally assumes that the end products are well separated.

Um shouldn't it be ##m_0 c^2## instead?

No, ##m c^2## has units of energy, not momentum.
 
  • #16
Orodruin said:
No, ##m c^2## has units of energy, not momentum.
Wait, you're defining the four momentum as ##p_{\mu}=(\frac E{c},\vec p)## or as ##(E,\vec pc)##? The first one will have a magnitude of ##m_0 c## whereas the second will have ##m_0 c^2##.
 
  • #17
PWiz said:
Wait, you're defining the four momentum as ##p_{\mu}=(\frac E{c},\vec p)## or as ##(E,\vec pc)##? The first one will have a magnitude of ##m_0 c## whereas the second will have ##m_0 c^2##.

Like most theoretical physicists, I usually set ##c = 1## and define the four-momentum as ##(E,\vec p)##. :)
If I am forced to use units where ##c \neq 1## I would go for the definition ##(E/c,\vec p)##. After all, it is called four-momentum and not four-energy.
 
  • #18
Orodruin said:
Like most theoretical physicists, I usually set ##c = 1## and define the four-momentum as ##(E,\vec p)##. :)
If I am forced to use units where ##c \neq 1## I would go for the definition ##(E/c,\vec p)##. After all, it is called four-momentum and not four-energy.
It's the same thing with spacetime coordinates as well. Some use ##(t,x,y,z)## and some use ##(ct,x,y,z)##. The problem is that it becomes painful solving questions which are framed in SI units because you have to remember where to insert the c's and constantly multiply/divide. I usually like to use the ##(ict,x,y,z)## and add ##i## to the 0th component of pretty much every four vector so that I don't have to "untune" my Euclidean metric sense. Is this unwise?
 
  • #19
PWiz said:
It's the same thing with spacetime coordinates as well. Some use ##(t,x,y,z)## and some use ##(ct,x,y,z)##. The problem is that it becomes painful solving questions which are framed in SI units because you have to remember where to insert the c's and constantly multiply/divide. I usually like to use the ##(ict,x,y,z)## and add ##i## to the 0th component of pretty much every four vector so that I don't have to "untune" my Euclidean metric sense. Is this unwise?

The use of [itex]ict[/itex] is discouraged these days. It makes some formulas in SR seem simpler, but when you go on to General Relativity, using [itex]ict[/itex] doesn't work anymore.

The preferred way to deal with vectors in General Relativity is to explicitly use the metric tensor. Do you know what the metric tensor is?
 
  • #20
PWiz said:
It's the same thing with spacetime coordinates as well. Some use ##(t,x,y,z)## and some use ##(ct,x,y,z)##. The problem is that it becomes painful solving questions which are framed in SI units because you have to remember where to insert the c's and constantly multiply/divide. I usually like to use the ##(ict,x,y,z)## and add ##i## to the 0th component of pretty much every four vector so that I don't have to "untune" my Euclidean metric sense. Is this unwise?
It is old fashioned, but if it helps you that is all that matters. I think it has fewer potential problems than relativistic mass. In fact, the only major limitation is the generalization to arbitrary coordinates in SR or to GR - it only remains simple in SR with standard coordinates. For example, Dirac and Bondi developed nice approaches in SR that use coordinates that have two lightlike coordinates and two spatial coordinates, with no time coordinate at all. That works just fine with metric approaches but I don't see how you deal with such formulations using ict.
 
  • #21
stevendaryl said:
but when you go on to General Relativity, using ##ict## doesn't work anymore.
Isn't that because in SR spacetime is represented by flat Minkowski space, whereas in GR spacetime is a curved pseudo-Riemannian manifold (which is locally flat)?
stevendaryl said:
Do you know what the metric tensor is?
Yes, but only the Euclidean metric tensor $$g=
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
\end{bmatrix}$$
and the Lorentzian metric tensor (with c=1)
$$η_{μν}=
\begin{bmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\end{bmatrix}$$
used for taking the norm of two vectors in their respective manifolds.
I don't know the mathematical details of GR, but I'm pretty sure a different metric tensor is used with polar coordinates (I think the Schwarzschild metric), although ##ds^2=η_{μν} dx^μ dx^ν## still holds true (I guess).
@PAllen Which alternate coordinate systems in SR are you talking about?
 
  • #22
PWiz said:
Isn't that because in SR spacetime is represented by flat Minkowski space, whereas in GR spacetime is a curved pseudo-Riemannian manifold (which is locally flat)?

Yes, but only the Euclidean metric tensor $$g=
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
\end{bmatrix}$$
and the Lorentzian metric tensor (with c=1)
$$η_{μν}=
\begin{bmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\end{bmatrix}$$
used for taking the norm of two vectors in their respective manifolds.
I don't know the mathematical details of GR, but I'm pretty sure a different metric tensor is used with polar coordinates (I think the Schwarzschild metric), although ##ds^2=η_{μν} dx^μ dx^ν## still holds true (I guess).
@PAllen Which alternate coordinate systems in SR are you talking about?

Well, for example, you can do SR in polar coordinates, in which case the metric tensor no longer takes on the form of [itex]\eta_{\mu \nu}[/itex].

The point is that using [itex]ict[/itex] is a way to get the minus sign into expressions such as [itex]-(ct)^2 + x^2 + y^2 + z^2[/itex], so you can pretend that you are using Euclidean vectors. But the more general approach is to let the [itex]-1[/itex] be part of the metric tensor, and keep all components as real numbers.
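
A small sketch of the two bookkeeping schemes side by side, assuming signature (-, +, +, +) and ##c = 1## by default. The function names are made up for illustration:

```python
def interval_with_metric(dt, dx, dy, dz, c=1.0):
    """Real components; the minus sign lives in the metric diag(-1, 1, 1, 1)."""
    eta = (-1.0, 1.0, 1.0, 1.0)
    components = (c * dt, dx, dy, dz)
    return sum(g * a * a for g, a in zip(eta, components))

def interval_with_ict(dt, dx, dy, dz, c=1.0):
    """Imaginary time component; plain Euclidean sum of squares, no conjugation."""
    components = (1j * c * dt, dx, dy, dz)
    total = sum(a * a for a in components)
    return total.real

print(interval_with_metric(2.0, 1.0, 2.0, 3.0))
print(interval_with_ict(2.0, 1.0, 2.0, 3.0))
```

Both give the same number here. Note that the [itex]ict[/itex] version relies on squaring without complex conjugation, and it has no obvious counterpart once the metric stops being diagonal with constant entries.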
 
  • #23
stevendaryl said:
But the more general approach is to let the ##-1## be part of the metric tensor, and keep all components as real numbers.
The Lorentzian metric changes to a Euclidean one if you use ##ict##. Isn't it better that way? You then only have one metric tensor to deal with.
 
  • #24
Consider ##u = x - ct##, ##v = x + ct##. A line of constant ##u## describes a light ray in the +x direction, and a line of constant ##v## describes a light ray in the -x direction. Using these coordinates (ignoring the yz-plane), you end up with the amazingly simple metric:

##ds^2 = du\,dv##

and I am not setting c=1; it just drops out if you transform to these coordinates.
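
This is easy to check numerically. A sketch, assuming the sign convention ##ds^2 = -c^2 dt^2 + dx^2## and made-up function names:

```python
import math

C = 299_792_458.0  # speed of light in m/s

def to_null_coords(t, x, c=C):
    """Map (t, x) to the lightlike coordinates u = x - c t, v = x + c t."""
    return x - c * t, x + c * t

def interval_tx(dt, dx, c=C):
    """ds^2 = -(c dt)^2 + dx^2 in the usual (t, x) coordinates."""
    return -(c * dt) ** 2 + dx ** 2

def interval_uv(du, dv):
    """ds^2 = du dv in the (u, v) coordinates; no c appears."""
    return du * dv

dt, dx = 2.0e-9, 1.0                       # a 2 ns, 1 m separation
du, dv = to_null_coords(dt, dx)            # the map is linear, so it applies to differences too
print(interval_tx(dt, dx), interval_uv(du, dv))
```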

So where do you put your ict here?
 
  • #25
PWiz said:
The Lorentzian metric changes to a Euclidean one if you use ##ict##. Isn't it better that way? You then only have one metric tensor to deal with.

No, it's really not the better approach. The approach that you're talking about ignores the distinction between covariant vectors (ones with components [itex]A^\mu[/itex]) and contravariant vectors (ones with components [itex]A_\mu[/itex]). That distinction is extremely important for curved spacetime, but it's also important for flat spacetime, if you're using curvilinear coordinates.

For example, let's just do 3-dimensional vectors in spherical coordinates, nonrelativistically. Then a point in space is described by three numbers, [itex]r, \theta, \phi[/itex]. The velocity of a particle along its path would similarly be described by three numbers: [itex]\frac{dr}{dt}, \frac{d\theta}{dt}, \frac{d\phi}{dt}[/itex], which you can think of as components of the velocity vector. But if you're computing the speed [itex]v[/itex], it's not simply [itex]v^2 = \left(\frac{dr}{dt}\right)^2 + \left(\frac{d\theta}{dt}\right)^2 + \left(\frac{d\phi}{dt}\right)^2[/itex]. Instead, it's more complicated:
[itex]v^2 = \left(\frac{dr}{dt}\right)^2 + r^2 \left(\frac{d\theta}{dt}\right)^2 + r^2 \sin^2(\theta) \left(\frac{d\phi}{dt}\right)^2[/itex]. That can be understood as

[itex]\sum_{i,j}\ g_{ij} v^i v^j[/itex]

where [itex]g_{rr} = 1[/itex], [itex]g_{\theta \theta} = r^2[/itex], [itex]g_{\phi \phi} = r^2 \sin^2(\theta)[/itex].

Of course, it is possible to get a notion of a velocity vector that uses a trivial metric, by letting the components of the velocity vector be:

[itex]\frac{dr}{dt}, r \frac{d\theta}{dt}, r \sin(\theta) \frac{d\phi}{dt}[/itex]

That allows the metric to be simple, but in exchange for making the velocity vector more complicated. In a sense, what you'd be doing is incorporating [itex]\sqrt{g}[/itex] into the definition of the velocity vector (which is the same thing that is going on in using [itex]ict[/itex]). That's a convoluted thing to do, and only works in the cases where the metric tensor is diagonal, which isn't always the case.
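
As a sanity check on the spherical-coordinate formula, here is a sketch that compares [itex]\sum g_{ij} v^i v^j[/itex] with the ordinary Cartesian speed squared. The position and rates are arbitrary test values and the function names are made up for illustration:

```python
import math

def speed_squared_spherical(r, theta, r_dot, theta_dot, phi_dot):
    """g_ij v^i v^j with the diagonal metric (1, r^2, r^2 sin^2 theta)."""
    g = (1.0, r**2, (r * math.sin(theta)) ** 2)
    v = (r_dot, theta_dot, phi_dot)
    return sum(g_ii * v_i**2 for g_ii, v_i in zip(g, v))

def cartesian_velocity(r, theta, phi, r_dot, theta_dot, phi_dot):
    """Chain rule applied to x = r sin(theta) cos(phi), y = r sin(theta) sin(phi), z = r cos(theta)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    vx = r_dot * st * cp + r * ct * cp * theta_dot - r * st * sp * phi_dot
    vy = r_dot * st * sp + r * ct * sp * theta_dot + r * st * cp * phi_dot
    vz = r_dot * ct - r * st * theta_dot
    return vx, vy, vz

r, theta, phi = 2.0, 0.7, 1.9
r_dot, theta_dot, phi_dot = 0.3, -0.4, 1.1

vx, vy, vz = cartesian_velocity(r, theta, phi, r_dot, theta_dot, phi_dot)
print(vx**2 + vy**2 + vz**2)
print(speed_squared_spherical(r, theta, r_dot, theta_dot, phi_dot))
```

The naive sum of squared coordinate rates would not match the first number; the metric-weighted sum does.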
 
  • #26
@stevendaryl OK, I guess the better approach is to use simpler vector components by including the extra terms in the metric.
stevendaryl said:
The approach that you're talking about ignores the distinction between covariant vectors (ones with components ##A^\mu##) and contravariant vectors (ones with components ##A_\mu##).
I think you reversed the order there - upper indices denote contravariant vectors. If the vector ##\vec A = A^{\alpha} e_{\alpha}## is constructed in a different frame, then the basis vector transformation will be represented by ##e_{\alpha}=\Lambda^{\bar \beta}_{\alpha} e_{\bar \beta}## (covariance), while the vector components transformation will be given by ##A^{\bar \beta}=\Lambda^{\bar \beta}_{\alpha} A^{\alpha}## (contravariance), so that the same vector is constructed in the other frame as well (using the Lorentz transformation matrix).
stevendaryl said:
##\sum_{i,j}\ g_{ij} v^i v^j##
If upper indices were covariant, this would not make any sense.
 
  • #27
PWiz said:
If upper indices were covariant, this would not make any sense.
Well, upper indices are contravariant, so "lucky" us ...
 
  • #28
PWiz said:
I think you reversed the order there - upper indices denote contravariant vectors. If the vector ##\vec A = A^{\alpha} e_{\alpha}## is constructed in a different frame, then the basis vector transformation will be represented by ##e_{\alpha}=\Lambda^{\bar \beta}_{\alpha} e_{\bar \beta}## (covariance), while the vector components transformation will be given by ##A^{\bar \beta}=\Lambda^{\bar \beta}_{\alpha} A^{\alpha}## (contravariance), so that the same vector is constructed in the other frame as well (using the Lorentz transformation matrix).

Sorry for mixing up the terminology. But your explanation of the two prefixes using basis vectors simply shows that basis vector indices have to be the opposite of vector components: If one is "co", then the other is "contra". It doesn't say which should be which.
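
To make the "opposite transformation" point concrete, here is a sketch in 1+1 dimensions with ##c = 1##. Upper-index components are transformed with a boost matrix and lower-index components with its inverse, and the contraction of the two is unchanged. The function names and sample components are made up for illustration:

```python
import math

def boost_matrix(beta):
    """2x2 Lorentz boost acting on (t, x) components, c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [[gamma, -gamma * beta], [-gamma * beta, gamma]]

def transform_upper(L, A):
    """A'^mu = L^mu_nu A^nu (components with an upper index)."""
    return [sum(L[m][n] * A[n] for n in range(2)) for m in range(2)]

def transform_lower(L_inv, w):
    """w'_mu = w_nu (L^-1)^nu_mu (components with a lower index)."""
    return [sum(w[n] * L_inv[n][m] for n in range(2)) for m in range(2)]

def contract(w, A):
    """w_mu A^mu, summed over the repeated index."""
    return sum(w_i * A_i for w_i, A_i in zip(w, A))

beta = 0.6
L = boost_matrix(beta)
L_inv = boost_matrix(-beta)                # the inverse of a boost is the opposite boost

A = [2.0, 1.0]                             # upper-index components
w = [3.0, -1.0]                            # lower-index components

print(contract(w, A))
print(contract(transform_lower(L_inv, w), transform_upper(L, A)))
```

Which of the two rules gets called "co" and which "contra" is a naming convention; the code only needs them to be inverse to each other.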
 
  • #29
This is actually the reason I dislike the term "covariant vector" (and "contravariant vector"). The vectors themselves are either tangent vectors, which may be defined as directional derivatives or equivalence classes of curves, or covectors. Tangent vectors have a coordinate basis which transforms covariantly with components transforming contravariantly and covectors have contravariant coordinate bases and covariant components. The vectors themselves are not dependent on the choice of coordinate system and are either tangent vectors or covectors.
 
  • #30
stevendaryl said:
Sorry for mixing up the terminology. But your explanation of the two prefixes using basis vectors simply shows that basis vector indices have to be the opposite of vector components: If one is "co", then the other is "contra". It doesn't say which should be which.
AFAIK, there is no way to actually "prove" why upper indices were chosen to be contravariant - it just seems like an arbitrary decision to me. All that really matters in the end is that the summation is correctly carried out when two identical upper and lower indices are seen in an expression, and that one-forms and vectors are clearly distinguishable and recognizable when seen together (to prevent tensor algebra from going topsy-turvy).
 
