# A short derivation of the relativistic forms of energy and momentum

Staff Emeritus
Summary
This derivation uses collisions in 1 dimension, plus rapidity
I've been noodling around with derivations of the relativistic energy and momentum, and I almost got it down to just a few lines. But not quite.

I'm going to work in one spatial dimension, for simplicity (even though some derivations require a second spatial dimension).

Let's assume that there is an energy associated with a moving object of mass m given by:

##E=m g(v)##

and a corresponding momentum

##p= m f(v)##

We want to recover the nonrelativistic forms for the case ##\frac{v}{c} \rightarrow 0##, so we should have:

##g(v)=g(0)+\frac{1}{2} v^2+ ## higher-order terms in ##\frac{v}{c}##
##f(v)=v+ ## higher-order terms

The key insight is to look at conservation of energy and momentum in different frames. For that reason, it's actually more convenient to switch from velocity v to "rapidity" ##\theta##, related through ##v= c tanh(\theta)## (##tanh## is the hyperbolic tangent). In terms of rapidity, the Lorentz dilation factor ##\gamma = \dfrac{1}{\sqrt{1-\frac{v^2}{c^2}}} = cosh(\theta)##.
The reason rapidity is more convenient is the addition law. Under a change of rest frames, velocity changes as follows: ##v′=\dfrac{v+u}{1+\frac{uv}{c^2}}##, where ##u## is the relative velocity between frames. Kind of messy. But if we let ##v= c tanh(\theta)## and ##u=c tanh(\phi)## and let ##v′= c tanh(\theta')##, then the addition law becomes just: ##\theta' = \theta + \phi##.
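For anyone who wants to see the addition law in numbers, here is a minimal Python sketch. The function names and the sample velocities are just illustrative choices, not part of the derivation. It composes two velocities with the messy formula and compares the result with simply adding the rapidities.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def add_velocities(v, u, c=C):
    """Relativistic velocity addition in one dimension."""
    return (v + u) / (1.0 + u * v / c**2)

def rapidity(v, c=C):
    """Rapidity theta defined by v = c * tanh(theta)."""
    return math.atanh(v / c)

# Illustrative values only
v, u = 0.6 * C, 0.7 * C
v_prime = add_velocities(v, u)
theta_prime = rapidity(v) + rapidity(u)

print(v_prime / C)             # composed velocity as a fraction of c
print(math.tanh(theta_prime))  # same quantity, obtained by adding rapidities
```

The two printed numbers should agree to floating-point precision, which is just the identity for ##tanh(\theta + \phi)## in disguise.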

Since for small values of ##\theta##, ##tanh(\theta)\approx \theta## and so ##v \approx c \theta##, we can rewrite our limiting cases for g and f in terms of ##\theta##:

##g(\theta) \approx g(0)+\frac{1}{2} c^2 \theta^2##
##f(\theta) \approx c \theta##

Let's imagine a collision in which a number of masses collide inelastically to form a large composite mass. Let's choose a frame in which the large mass is at rest after the collision. Then conservation of energy and momentum become:

##\sum_j m_j g(\theta_j) = M g(0)##
##\sum_j m_j f(\theta_j) = 0##

Now, what we do is shift to another frame traveling at speed ##u = c tanh(\phi)## relative to the first. Let's assume ##\phi## is small, so that we can use a Taylor series in ##\phi## (about ##\phi = 0##) and keep just the first few terms. (Note: we are thus assuming that ##M## is moving nonrelativistically, but not the smaller masses that collided.)

##\sum_j m_j g(\theta_j + \phi) = M g(\phi) \approx M g(0) + \frac{1}{2} M c^2 \phi^2##
##\sum_j m_j f(\theta_j + \phi) = M f(\phi) \approx M c \phi##

At this point, let's also expand the left-hand sides in powers of ##\phi##:

##g(\theta_j + \phi) \approx g(\theta_j) + g'(\theta_j) \phi + \frac{1}{2} g''(\theta_j) \phi^2##
##f(\theta_j + \phi) \approx f(\theta_j) + f'(\theta_j) \phi##

where ##'## means take the derivative of the function with respect to its argument.

Equating equal powers of ##\phi## gives:

1. ##\sum_j m_j g(\theta_j) = M g(0)##
2. ##\sum_j m_j g'(\theta_j) = 0##
3. ##\sum_j m_j g''(\theta_j) = M c^2##
4. ##\sum_j m_j f(\theta_j) = 0##
5. ##\sum_j m_j f'(\theta_j) = Mc##

We can plug equation 1 into equation 3 to get:

##\sum_j m_j g''(\theta_j) = \frac{c^2}{g(0)} \sum_j m_j g(\theta_j) ##

This has to be true for any collection of masses, so the only solution is if for each particle,

##g''(\theta) = \frac{c^2}{g(0)} g(\theta)##

Equation 5 is just like equation 3, except for a factor of ##c##, so we also conclude:

##f'(\theta) = \frac{c}{g(0)} g(\theta) = \frac{1}{c} g''(\theta)##

This implies that ##f(\theta) = \frac{1}{c} g'(\theta)##. (There is an arbitrary constant involved in integrating, but it must be zero if ##f(0) = 0##, which must be true for momentum).

So the equation for ##g''(\theta)## has an immediate solution:

##g(\theta) = g(0) cosh(\frac{c}{\sqrt{g(0)}} \theta)##

Since ##cosh(\theta) = \gamma##, we would be home free if we could show that ##g(0) = c^2##. But information about collisions doesn't seem to give enough information to conclude this. (This post is about collisions in one dimension. There is an alternative derivation in 2 dimensions that allows you to deduce this.)

But there is a final trick up my sleeve, which is the work-energy equation, in one dimension:

##\Delta E = F \Delta x = \dfrac{\Delta p}{\Delta t} \Delta x = \Delta p \dfrac{\Delta x}{\Delta t} = \dfrac{\Delta p}{\Delta \theta} \Delta \theta \dfrac{\Delta x}{\Delta t} ##

Or dividing through by ##\Delta \theta## and taking the limit as ##\Delta t \rightarrow 0## and ##\Delta \theta \rightarrow 0##

##\dfrac{dE}{d\theta} = \dfrac{dp}{d\theta} v## where I've used ##\frac{dx}{dt} = v##.

Since in terms of ##\theta##, ##v = c \tanh(\theta)##, this becomes:

##\dfrac{dE}{d\theta} = \dfrac{dp}{d\theta} c tanh(\theta)##.

In terms of our functions ##g## and ##f##, this implies

##m \dfrac{dg}{d\theta} = m \dfrac{df}{d\theta} c tanh(\theta)##.
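As a numerical aside (a sketch, not part of the argument): the collision equations and the low-speed limits are satisfied by a whole family labelled by ##a = g(0)##, namely ##g_a(\theta) = a\, cosh(c\theta/\sqrt{a})## with ##f_a = \frac{1}{c} g_a'##. The snippet below works in units with ##c = 1##, uses function names invented for illustration, and evaluates the work-energy condition for a few values of ##a## using central differences.

```python
import math

C = 1.0  # units with c = 1 (an assumption made for readability)

def g_family(theta, a, c=C):
    """Candidate energy function with g(0) = a, allowed by the collision argument."""
    return a * math.cosh(c * theta / math.sqrt(a))

def f_family(theta, a, c=C):
    """Matching momentum function f = g'/c."""
    return math.sqrt(a) * math.sinh(c * theta / math.sqrt(a))

def work_energy_residual(theta, a, c=C, h=1e-5):
    """dg/dtheta - c*tanh(theta)*df/dtheta, estimated by central differences."""
    dg = (g_family(theta + h, a, c) - g_family(theta - h, a, c)) / (2 * h)
    df = (f_family(theta + h, a, c) - f_family(theta - h, a, c)) / (2 * h)
    return dg - c * math.tanh(theta) * df

# Illustrative values of a = g(0); only a = c^2 should give a vanishing residual
for a in (0.5, 1.0, 2.0):
    print(a, work_energy_residual(1.2, a))
```

Only the member with ##a = c^2## gives a residual consistent with zero, which is the numerical counterpart of the statement that the work-energy relation is what pins down ##g(0)##.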

Since ##f(\theta) = \frac{1}{c} g'(\theta)##, we can rewrite this in terms of ##f##

##c f(\theta) = \dfrac{df}{d\theta} c tanh(\theta)##

Which implies that

##\dfrac{\frac{df}{d\theta}}{f} = \frac{1}{tanh(\theta)} = \dfrac{cosh(\theta)}{sinh(\theta)}##

which has the solution:

##f(\theta) = K sinh(\theta)##

for some constant ##K##. The nonrelativistic limit, ##f(\theta) \approx c \theta##, tells us that ##K = c##.

So there we have it:

##f(\theta) = c sinh(\theta)##

Since ##f(\theta) = \frac{1}{c} g'(\theta)##, this implies:

##g'(\theta) = c^2 sinh(\theta)##, so ##g(\theta) = c^2 cosh(\theta)## (the constant must be zero from the fact that ##g''(\theta) \propto g(\theta)##)

So we conclude the energy-momentum functions for relativity:

##E = m g(\theta) = m c^2 cosh(\theta) = m c^2 \gamma##
##p = m f(\theta) = mc sinh(\theta) = m c tanh(\theta) cosh(\theta) = m v \gamma##
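Here is a small consistency check in Python, under the assumptions of units with ##c = 1## and some made-up masses and rapidities. It builds the composite mass from an inelastic collision in one frame, then confirms that energy and momentum still balance after shifting every rapidity by ##\phi##. The helper names are hypothetical.

```python
import math

C = 1.0  # units with c = 1 (assumed for readability)

def energy(m, theta):
    return m * C**2 * math.cosh(theta)

def momentum(m, theta):
    return m * C * math.sinh(theta)

def totals(particles, phi=0.0):
    """Total (E, p) of (mass, rapidity) pairs, seen from a frame shifted by phi."""
    E = sum(energy(m, th + phi) for m, th in particles)
    p = sum(momentum(m, th + phi) for m, th in particles)
    return E, p

def composite(particles):
    """Mass and rapidity of the single lump formed in a perfectly inelastic collision."""
    E, p = totals(particles)
    M = math.sqrt(E**2 - (p * C)**2) / C**2
    Theta = math.atanh(p * C / E)
    return M, Theta

incoming = [(1.0, 0.8), (2.5, -0.3), (0.7, 1.5)]  # illustrative (mass, rapidity) values
M, Theta = composite(incoming)

for phi in (0.0, 0.4, -1.1):
    E_in, p_in = totals(incoming, phi)
    E_out, p_out = totals([(M, Theta)], phi)
    print(phi, E_in - E_out, p_in - p_out)  # both differences should be ~0
```

The differences vanish (up to rounding) for every ##\phi##, not just small ones, because ##cosh## and ##sinh## of a shifted argument mix linearly into each other.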


## Answers and Replies

Gold Member
2021 Award
Another approach is to use Poincare invariance and ask for a Lagrangian of one particle fulfilling all the 10 conservation laws associated with the one-parameter subgroups.

For both Newtonian and relativistic spacetime from the symmetry under time and space translations as well as spatial rotations it follows that
$$L(\vec{x},\vec{v},t)=F(\vec{v}^2).$$
Now you can use the formalism of Noether's theorem for the symmetry under Lorentz boosts, but this is not even necessary, because the only way to get an invariant action built only from ##\vec{v}^2=\dot{\vec{x}}^2## obviously is
$$S=-m c^2\int_{t_1}^{t_2} \mathrm{d} t \sqrt{1-\dot{\vec{x}}^2/c^2}.$$
From this you get the (canonical) momenta
$$\vec{p}=\frac{\partial L}{\partial \dot{\vec{x}}}=\frac{m \dot{\vec{x}}}{\sqrt{1-\dot{\vec{x}}^2/c^2}}.$$
Since ##L## is not explicitly dependent on time, you also have the energy as a conserved quantity
$$E=\vec{p} \cdot \dot{\vec{x}}-L=\frac{m c^2}{\sqrt{1-\dot{\vec{x}}^2/c^2}}.$$
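A quick numerical illustration of these two formulas, restricted to one spatial dimension and using units with ##c = 1##. The mass value, step size and function names are assumptions made for this sketch. It differentiates the Lagrangian numerically and compares with ##m v \gamma## and ##m c^2 \gamma##.

```python
import math

C = 1.0   # units with c = 1 (assumed)
M = 2.0   # illustrative mass

def lagrangian(v):
    """Free-particle Lagrangian L = -m c^2 sqrt(1 - v^2/c^2), in one dimension."""
    return -M * C**2 * math.sqrt(1.0 - v**2 / C**2)

def canonical_momentum(v, h=1e-6):
    """p = dL/dv, estimated by a central difference."""
    return (lagrangian(v + h) - lagrangian(v - h)) / (2 * h)

def energy_from_lagrangian(v):
    """E = p*v - L."""
    return canonical_momentum(v) * v - lagrangian(v)

v = 0.6 * C
gamma = 1.0 / math.sqrt(1.0 - v**2 / C**2)
print(canonical_momentum(v), M * v * gamma)         # the two values should match
print(energy_from_lagrangian(v), M * C**2 * gamma)  # likewise
```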
Since further
$$\mathrm{d} \tau=\sqrt{1-\vec{v}^2/c^2} \mathrm{d} t$$
is a scalar quantity and we can write
$$\begin{pmatrix} E/c \\ \vec{p} \end{pmatrix}=m \frac{\mathrm{d} x^{\mu}}{\mathrm{d} \tau}$$
with ##x=(x^{\mu})=(c t,\vec{x})## we get the four-momentum in a manifestly covariant way.

Another approach is to use Poincare invariance and ask for a Lagrangian of one particle fulfilling all the 10 conservation laws associated with the one-parameter subgroups.
[...] we get the four-momentum in a manifestly covariant way.
(Nitpick:) what about light-like particles?

Gold Member
2021 Award
Though there are no light-like particles, you can describe them with the Lagrange formalism. Of course you have to change to another Lagrangian.

For a massive particle, if you have the action in the "square-root form" given above, i.e., a parametrization-invariant action,
$$S=\int \mathrm{d} \lambda [-mc^2 \sqrt{\dot{x}_{\mu} \dot{x}^{\mu}}+L_I(x,\dot{x},\lambda)],$$
you can use another physically equivalent action
$$\tilde{S}=\int \mathrm{d} \lambda [-\frac{m}{2} \dot{x}_{\mu} \dot{x}^{\mu}+L_I(x,\dot{x})].$$
Here ##L_I(x,\dot{x})## must be a 1st-rank homogeneous function of ##\dot{x}##, i.e.,
$$L_I=\dot{x}^{\mu} \frac{\partial L_I}{\partial \dot{x}^{\mu}}.$$
While with ##S## you get a parametrization-invariant description, i.e., you can choose whatever world-line parameter you like, with ##\tilde{S}## you get automatically an affine parameter, i.e., since the Lagrangian is not explicitly dependent on ##\lambda##, you get
$$\dot{x}_{\mu} \dot{x}^{\mu}=C=\text{const}.$$
For a massive particle the world line must be timelike, and you can choose ##C=1## (then ##\lambda=s=c \tau##) or ##C=c^2## (then ##\lambda=\tau##, with ##\tau## the particle's proper time).

Now in the latter form with ##\tilde{S}## you can also choose ##C=0##. Then you describe massless particles with some arbitrary affine parameter, though I don't think that massless classical point particles make any sense.

The only example that makes sense is the use of the "naive photon picture" for the propagation of light, though it's utterly misleading to call this "photon". The most famous application is the calculation of Einstein's formula for the bending of light by the Sun within general relativity (where the above Lagrange formalism is of course also valid; you only have to use ##g_{\mu \nu}## describing the Lorentzian space-time pseudometric). What's called by modern slang "photon" here is nothing else than the eikonal approximation for the electromagnetic field (aka geometrical optics), taking its characteristics as worldlines of "massless photons".

For a massive particle, if you have the action in the "square-root form" given above, i.e., a parametrization-invariant action,
$$S=\int \mathrm{d} \lambda [-mc^2 \sqrt{\dot{x}_{\mu} \dot{x}^{\mu}}+L_I(x,\dot{x},\lambda)],$$
you can use another physically equivalent action
$$\tilde{S}=\int \mathrm{d} \lambda [-\frac{m}{2} \dot{x}_{\mu} \dot{x}^{\mu}+L_I(x,\dot{x})].$$
Here ##L_I(x,\dot{x})## must be a 1st-rank homogeneous function of ##\dot{x}##, i.e.,
$$L_I=\dot{x}^{\mu} \frac{\partial L_I}{\partial \dot{x}^{\mu}}.$$
In ##\tilde{S}##, don't you need a 2-homogeneous version of ##L_I## ?

romsofia
If that's short, I wonder what the long way is!

I kid, fun read!

Staff Emeritus
Something that I realized about how relativistic energy and momentum differ from the classical versions, apart from the different transformation rules, is this:
Classical physics assumes that momentum and kinetic energy are proportional to mass. But it doesn’t assume that total energy is proportional to mass. So there can be two objects with the same mass, same momentum, same kinetic energy, but different total energy.

Gold Member
2021 Award
In ##\tilde{S}##, don't you need a 2-homogeneous version of ##L_I## ?
The trick is to keep ##L_I## from the parametrization-independent ("square-root") formulation. Then the conservation law that follows from ##L## not depending explicitly on the parameter in the formulation with ##\tilde{S}## implies that ##\lambda## is an affine parameter. BTW for ##S## this law doesn't lead to anything, because there ##H \equiv 0##.

For a derivation of ##\tilde{S}## in the special-relativistic case, see

https://itp.uni-frankfurt.de/~hees/pf-faq/srt.pdf

Sect. 2.4.3

Gold Member
2021 Award
Something that I realized about how relativistic energy and momentum differ from the classical versions, apart from the different transformation rules, is this:
Classical physics assumes that momentum and kinetic energy are proportional to mass. But it doesn’t assume that total energy is proportional to mass. So there can be two objects with the same mass, same momentum, same kinetic energy, but different total energy.
Also in relativistic physics nobody assumes that total energy is proportional to mass. How do you come to this idea?

Staff Emeritus
Also in relativistic physics nobody assumes that total energy is proportional to mass. How do you come to this idea?

How can you say “nobody” when I implied that I assume that? That’s kind of rude. I’m nobody?

I thought it was common for massive particles to define mass to be the energy as measured in the rest frame. What definition of mass are you using?

Also in relativistic physics nobody assumes that total energy is proportional to mass. How do you come to this idea?
Huh? Total energy is the time component of 4-momentum, which is mass times 4-velocity, so total energy is proportional to mass.

ergospherical
Well, to give an example, the electrostatic energy ##U = \dfrac{q_1 q_2}{r}## of two charged particles [or more generally the electrostatic energy ##U = \frac{1}{8\pi} \int E^2 dV## of any distribution of charged particles] doesn't depend on mass.

Staff Emeritus
Here are some facts about momentum and energy that are true both relativistically and classically. (In one spatial dimension, for simplicity):

First relation

Let ##V(v,u)## be the velocity addition function: If an object has velocity ##v## in one frame ##F##, and that frame is moving at velocity ##-u## relative to another frame, ##F'##, then the object has velocity ##V(v,u)## in frame ##F'##. (I chose the relative velocity to be ##-u## so that ##V(v,u) \gt v## when ##u \gt 0##.)

Classically, ##V(v,u) = v+u##. Relativistically, ##V(v,u) = \dfrac{v+u}{1+\frac{uv}{c^2}}##.

Let ##E(v)## be the energy of a particle traveling at speed ##v##, and let ##p(v)## be its momentum. Then both classically and relativistically,

##\frac{dE}{dv} \frac{\partial V}{\partial u}|_{u=0} = p##

Nonrelativistically, ##V(v,u)## is just ##v+u##, and so this becomes:

##\frac{dE}{dv} = p##

Relativistically ##V(v,u) = \dfrac{v+u}{1+\frac{uv}{c^2}}##. So this becomes:

##\frac{dE}{dv} (1 - \frac{v^2}{c^2}) = p##
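The first relation can be checked numerically for both cases. The following is only a sketch: it plugs in the known classical and relativistic expressions (with ##c = 1## and unit mass, both assumptions) and estimates the derivatives by central differences. All function names are made up for the example.

```python
import math

C = 1.0  # units with c = 1 (assumed)
M = 1.0  # illustrative mass

def deriv(func, x, h=1e-6):
    """Central-difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

def first_relation_residual(E, p, V, v):
    """(dE/dv) * (dV/du at u = 0) - p(v); should vanish if the relation holds."""
    dE_dv = deriv(E, v)
    dV_du = deriv(lambda u: V(v, u), 0.0)
    return dE_dv * dV_du - p(v)

def gamma(v):
    return 1.0 / math.sqrt(1.0 - v**2 / C**2)

# Classical energy, momentum and velocity addition
def E_cl(v): return 0.5 * M * v**2
def p_cl(v): return M * v
def V_cl(v, u): return v + u

# Relativistic counterparts
def E_rel(v): return M * C**2 * gamma(v)
def p_rel(v): return M * v * gamma(v)
def V_rel(v, u): return (v + u) / (1.0 + u * v / C**2)

for v in (0.3, 0.8):
    print(v,
          first_relation_residual(E_cl, p_cl, V_cl, v),
          first_relation_residual(E_rel, p_rel, V_rel, v))
```

Both residuals come out at the level of the finite-difference error, for slow and fast ##v## alike.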

Second relation

The work-energy relationship can be rearranged to give:

##\frac{dE}{dp} = v##

Combining the two relationships through writing ##\frac{dE}{dv} = \frac{dE}{dp} \frac{dp}{dv} = v \frac{dp}{dv}## gives:

Classically:

##v \frac{dp}{dv} = p##

This has the solution: ##p = mv##. (It has to be proportional to ##v##, so ##m## is in a sense, just the name of this constant of proportionality)

Relativistically:

##v (1 - \frac{v^2}{c^2}) \frac{dp}{dv} = p##

This is a more difficult equation to solve, but if you make the substitution ##v = c tanh(\theta)##, it simplifies to:
##\frac{dp}{p} = \frac{cosh(\theta)}{sinh(\theta)} d\theta##, which has the solution

##p = m c sinh(\theta) = m \frac{v}{(1-\frac{v^2}{c^2})^\frac{1}{2}}##
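To see that the factors of ##c## work out, here is a short numerical check. It takes the integration constant to be ##mc## (so that ##p \rightarrow mv## at low speed) and deliberately uses ##c \neq 1## in arbitrary units. The values of ##c## and ##m## and the function names are illustrative assumptions.

```python
import math

C = 3.0  # deliberately not 1, so that factors of c are visible (arbitrary units)
M = 2.0  # illustrative mass

def p_of_v(v):
    """p = m c sinh(theta), with theta defined by v = c tanh(theta)."""
    theta = math.atanh(v / C)
    return M * C * math.sinh(theta)

def ode_residual(v, h=1e-6):
    """v (1 - v^2/c^2) dp/dv - p, with dp/dv from a central difference."""
    dp_dv = (p_of_v(v + h) - p_of_v(v - h)) / (2 * h)
    return v * (1.0 - v**2 / C**2) * dp_dv - p_of_v(v)

v = 0.5 * C
gamma = 1.0 / math.sqrt(1.0 - v**2 / C**2)
print(p_of_v(v), M * v * gamma)  # the two values should match
print(ode_residual(v))           # should be ~0
```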

Staff Emeritus
Well, to give an example, the electrostatic energy ##U = \dfrac{q_1 q_2}{r}## of two charged particles...

Well, in the frame in which the total momentum is 0, the two-particle composite system will have an effective mass that includes that electrostatic energy. The one-particle mass is a little ambiguous.

Well, to give an example, the electrostatic energy ##U = \dfrac{q_1 q_2}{r}## of two charged particles [or more generally the electrostatic energy ##U = \frac{1}{8\pi} \int E^2 dV## of any distribution of charged particles] doesn't depend on mass.
But this electrostatic energy belongs to a system. In the COM frame, the mass of the system is equal to the sum of the energies, including this electrostatic energy.


ergospherical
This is all true, but the energy in the field still does not depend on the masses of the particles

This is all true, but the energy in the field still does not depend on the masses of the particles
Who said anything about particles? Total energy of any system is proportional to its invariant mass.

ergospherical
So long as you are in a reference system where ##\sum \limits_a \mathbf{p}_a = 0## then yeah, that's true by definition.

The point is that in a system of charged particles the energy is ##U = \frac{1}{8\pi}\int (E^2 + B^2) dV + \sum \limits_{a} \dfrac{ m_a }{\sqrt{1-v_a^2}}## and the first term (the energy in the field) has no dependence on the masses ##m_a##.

So long as you are in a reference system where ##\sum \limits_a \mathbf{p}_a = 0## then yeah, that's true by definition.

The point is that in a system of charged particles the energy is ##U = \frac{1}{8\pi}\int (E^2 + B^2) dV + \sum \limits_{a} \dfrac{ m_a }{\sqrt{1-v_a^2}}## and the first term (the energy in the field) has no dependence on the masses ##m_a##.
No, it is true in any frame, not just the COM frame. The invariant mass gives the energy in the COM frame, and then gamma times this is the total energy in any frame (in SR, of course). It is more than just a definition. Effectively, it requires that all forms of matter and energy contribute to inertia in a fixed way.

ergospherical
The invariant mass gives the energy in the COM frame, and then gamma times this is the total energy in any frame (in SR, of course).
What is then ##\gamma## for a system of more than one particle?

What is then ##\gamma## for a system of more than one particle?
It is a function of the speed of the COM in that specified frame.

ergospherical
Ok I checked, you are correct:\begin{align*}
E = \dfrac{\sqrt{E^2-P^2}}{\sqrt{1-\dfrac{P^2}{E^2}}} = \dfrac{M}{\sqrt{1-V^2}}
\end{align*}
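The same identity can be checked numerically for a handful of free particles in one dimension, with ##c = 1##. This sketch covers free particles only (no field energy), and the masses, velocities and function names are made up for illustration.

```python
import math

def four_momentum(m, v):
    """(E, p) of a free particle of mass m and velocity v, in units with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v**2)
    return m * gamma, m * gamma * v

def system_summary(particles):
    """Total energy, total momentum, invariant mass and zero-momentum-frame velocity."""
    E = sum(four_momentum(m, v)[0] for m, v in particles)
    P = sum(four_momentum(m, v)[1] for m, v in particles)
    M = math.sqrt(E**2 - P**2)  # invariant mass of the system
    V = P / E                   # velocity of the frame in which P = 0
    return E, P, M, V

particles = [(1.0, 0.9), (3.0, -0.2), (0.5, 0.5)]  # illustrative (mass, velocity) pairs
E, P, M, V = system_summary(particles)
print(E, M / math.sqrt(1.0 - V**2))  # the two values should match
```

Note that ##M## comes out larger than the sum of the individual masses, since the kinetic energy in the zero-momentum frame contributes to it.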

Effectively, it requires that all forms of matter and energy contribute to inertia in a fixed way.
I think it is better to say: "Effectively, it requires that all forms of energy contribute to inertia in a fixed way." "Matter" doesn't matter.

Gold Member
2021 Award
How can you say “nobody” when I implied that I assume that? That’s kind of rude. I’m nobody?

I thought it was common for massive particles to define mass to be the energy as measured in the rest frame. What definition of mass are you using?
I wanted to say that nobody would assume a particle's energy to be proportional to ##m## in Newtonian physics. Why should one assume that in relativistic physics?

The mass of a particle is defined covariantly as ##p_{\mu} p^{\mu}=m^2 c^2##, where ##(p^{\mu})## is the four-momentum of the particle. This of course implies that in a momentary rest frame ##p^0=m c##.

Gold Member
2021 Award
Huh? Total energy is the time component of 4-momentum, which is mass times 4-velocity, so total energy is proportional to mass.
For non-interacting particles yes...

For non-interacting particles yes...
Is that really an exception? You must include the 4 momentum carried by fields in total 4 momentum of a system.

ergospherical
The equivalent of the "centre of mass" whose derivative enters ##\gamma = \dfrac{1}{\sqrt{1-\dot{X}^2}}## for the interacting system would have to be:
\begin{align*}
\mathbf{X} = \dfrac{\displaystyle\int \dfrac{E^2 + B^2}{8\pi} \mathbf{r} dV + \sum_a \dfrac{m_a}{\sqrt{1-v_a^2}} \mathbf{r}_a}{\displaystyle\int \dfrac{E^2 + B^2}{8\pi} dV + \sum_a \dfrac{m_a}{\sqrt{1-v_a^2}} }
\end{align*}
I don't know if this is useful?

The equivalent of the "centre of mass" whose derivative enters ##\gamma## for the interacting system would have to be:
\begin{align*}
\mathbf{X} = \dfrac{\displaystyle\int \dfrac{E^2 + B^2}{8\pi} \mathbf{r} dV + \sum_a \dfrac{m_a}{\sqrt{1-v_a^2}} \mathbf{r}_a}{\displaystyle\int \dfrac{E^2 + B^2}{8\pi} dV + \sum_a \dfrac{m_a}{\sqrt{1-v_a^2}} }
\end{align*}
I don't know if this is useful?
"COM" means "center of momentum", not "center of mass".

ergospherical
"COM" means "center of momentum", not "center of mass".

Lol why? I wrote it down, I can call it what I want. (It's just alluding to the notion of "the motion of the system as a whole")

Staff Emeritus
Lol why? I wrote it down, I can call it what I want. (It's just alluding to the notion of "the motion of the system as a whole")

Well, you don’t need a center of mass to make sense of the motion of the system as a whole. You find a frame F in which the total momentum is zero. Then you can say that this is the rest frame of the composite system.

ergospherical
It's just a name, I mean there is a radius vector ##\mathbf{X}## whose derivative ##\dot{\mathbf{X}} = \dfrac{\mathbf{P}}{E}## characterises the motion of the system as a whole, in that ##E = \gamma(\dot{X}^2)M## as @PAllen showed me yesterday. Whether you wanna call it centre of mass/momentum/inertia/etc. doesn't really matter for all practical purposes.

I mean there is a radius vector ##\mathbf{X}## whose derivative ##\dot{\mathbf{X}} = \dfrac{\mathbf{P}}{E}## characterises the motion of the system as a whole, in that ##E = \gamma(\dot{X}^2)M## as @PAllen showed me yesterday.

This seems to be correct for an isolated system of free particles. I am not sure if this is also correct for an isolated system of bound particles. Reason:

Rindler: Relativity - Special General Cosmological - Exercise 6.5 said:
By considering two equal particles travelling in opposite directions along parallel lines, show that the CM (center of mass) of a system in one IF does not necessarily coincide with its CM in another IF. Prove that, nevertheless, if the particles of the system suffer collision forces only, the CM in every IF moves with the velocity of the ZM frame.

I should clarify that I was thinking of COM in the sense of center of momentum. Specifically, given a total 4 momentum of an arbitrary isolated system, it is trivially decomposed into its mass times a 4 velocity. The total energy is then mass times gamma, the time component of the 4 velocity. A boost by the corresponding velocity takes you to a frame where gamma is one, i.e., where the total momentum is zero. We can talk about the velocity of this COM frame (in the original frame) without needing to discuss any notion of center of energy computed by analogy to center of mass (using radius vectors).

But note that the Rindler exercise asks you to prove a fact for collisions only; it does not state that the result must be false for more general systems.

Gold Member
2021 Award
Is that really an exception? You must include the 4 momentum carried by fields in total 4 momentum of a system.
Of course, to treat a complete closed system you have to take particles and fields as dynamical quantities (with the usual unsolved problems of interacting point particles) but also then the energy is not proportional to the particle mass.