# A short derivation of the relativistic forms of energy and momentum

Staff Emeritus
Summary:
This derivation uses collisions in 1 dimension, plus rapidity
I've been noodling around with derivations of the relativistic energy and momentum, and I almost got it down to just a few lines. But not quite.

I'm going to work in one spatial dimension, for simplicity (even though some derivations require a second spatial dimension)

Let's assume that there is an energy associated with a moving object of mass m given by:

##E=m g(v)##

and a corresponding momentum

##p= m f(v)##

We want to recover the nonrelativistic forms for the case ##\frac{v}{c} \rightarrow 0##, so we should have:

##g(v)=g(0)+\frac{1}{2} v^2+ ## higher-order terms in ##\frac{v}{c}##
##f(v)=v+ ## higher-order terms

The key insight is to look at conservation of energy and momentum in different frames. For that reason, it's actually more convenient to switch from velocity ##v## to "rapidity" ##\theta##, related through ##v= c \tanh(\theta)## (##\tanh## is the hyperbolic tangent). In terms of rapidity, the Lorentz (time-dilation) factor is ##\gamma = \dfrac{1}{\sqrt{1-\frac{v^2}{c^2}}} = \cosh(\theta)##.
The reason rapidity is more convenient is the addition law. Under a change of rest frames, velocity changes as follows: ##v′=\dfrac{v+u}{1+\frac{uv}{c^2}}##, where ##u## is the relative velocity between frames. Kind of messy. But if we let ##v= c tanh(\theta)## and ##u=c tanh(\phi)## and let ##v′= c tanh(\theta')##, then the addition law becomes just: ##\theta' = \theta + \phi##.
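The addition law can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names and sample velocities are illustrative, and units with ##c = 1## are assumed.

```python
import math

C = 1.0  # speed of light; units with c = 1 are an assumption of this sketch


def add_velocities(v, u, c=C):
    """Relativistic velocity addition: v' = (v + u) / (1 + u v / c^2)."""
    return (v + u) / (1.0 + u * v / c**2)


def rapidity(v, c=C):
    """Rapidity theta defined by v = c tanh(theta)."""
    return math.atanh(v / c)


v, u = 0.6, 0.7  # sample velocities as fractions of c
v_prime = add_velocities(v, u)
theta_sum = rapidity(v) + rapidity(u)

# The rapidity of the combined velocity equals the sum of the rapidities.
assert math.isclose(rapidity(v_prime), theta_sum, rel_tol=1e-12)
print(v_prime, C * math.tanh(theta_sum))
```

The two printed numbers should agree to rounding error, which is the statement ##\theta' = \theta + \phi##.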

Since for small values of ##\theta##, ##tanh(\theta)\approx \theta## and so ##v \approx c \theta##, we can rewrite our limiting cases for g and f in terms of ##\theta##:

##g(\theta) \approx g(0)+\frac{1}{2} c^2 \theta^2##
##f(\theta) \approx c \theta##

Let's imagine a collision in which a number of masses collide inelastically to form a large composite mass. Let's choose a frame in which the large mass is at rest after the collision. Then conservation of energy and momentum become:

##\sum_j m_j g(\theta_j) = M g(0)##
##\sum_j m_j f(\theta_j) = 0##

Now, what we do is shift to another frame traveling at speed ##u = c \tanh(\phi)## relative to the first. Let's assume ##\phi## is small, so that we can use a Taylor series about ##\phi = 0## and keep just the first few terms. (Note: we are thus assuming that ##M## is moving nonrelativistically, but not the smaller masses that collided.)

##\sum_j m_j g(\theta_j + \phi) = M g(\phi) \approx M g(0) + \frac{1}{2} M c^2 \phi^2##
##\sum_j m_j f(\theta_j + \phi) = M f(\phi) \approx M c \phi##

At this point, let's also expand the left-hand sides in powers of ##\phi##:

##g(\theta_j + \phi) \approx g(\theta_j) + g'(\theta_j) \phi + \frac{1}{2} g''(\theta_j) \phi^2##
##f(\theta_j + \phi) \approx f(\theta_j) + f'(\theta_j) \phi##

where ##'## means take the derivative of the function with respect to its argument.

Equating equal powers of ##\phi## gives:

1. ##\sum_j m_j g(\theta_j) = M g(0)##
2. ##\sum_j m_j g'(\theta_j) = 0##
3. ##\sum_j m_j g''(\theta_j) = M c^2##
4. ##\sum_j m_j f(\theta_j) = 0##
5. ##\sum_j m_j f'(\theta_j) = Mc##
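As a sanity check, equations 1 to 5 can be tested on a concrete two-body collision. This sketch assumes the forms ##g = c^2 cosh(\theta)## and ##f = c\, sinh(\theta)## that the derivation arrives at in the end; the masses, rapidities, helper names, and the choice ##c = 1## are all illustrative.

```python
import math

C = 1.0  # units with c = 1 (an assumption of this sketch)


# Assumed forms, taken from the end result of the derivation.
def g(th):    # energy per unit mass
    return C**2 * math.cosh(th)


def dg(th):   # g'
    return C**2 * math.sinh(th)


def d2g(th):  # g''
    return C**2 * math.cosh(th)


def f(th):    # momentum per unit mass
    return C * math.sinh(th)


def df(th):   # f'
    return C * math.cosh(th)


# Two incoming masses; m2 is chosen so that the total momentum vanishes.
m1, th1 = 2.0, 0.8
th2 = -1.5
m2 = -m1 * f(th1) / f(th2)
parts = [(m1, th1), (m2, th2)]


def total(fn):
    """Mass-weighted sum of fn over the incoming particles."""
    return sum(m * fn(th) for m, th in parts)


M = total(g) / g(0.0)  # equation 1, read as the definition of M

assert abs(total(f)) < 1e-12                 # equation 4
assert abs(total(dg)) < 1e-12                # equation 2
assert math.isclose(total(d2g), M * C**2)    # equation 3
assert math.isclose(total(df), M * C)        # equation 5
```

Note that ##M## comes out larger than ##m_1 + m_2##: the kinetic energy of the incoming masses ends up in the composite mass.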

We can plug equation 1 into equation 3 to get:

##\sum_j m_j g''(\theta_j) = \frac{c^2}{g(0)} \sum_j m_j g(\theta_j) ##

This has to be true for any collection of masses, so the only solution is if for each particle,

##g''(\theta) = \frac{c^2}{g(0)} g(\theta)##

Equation 5 is just like equation 3, except for a factor of ##c##, so we also conclude:

##f'(\theta) = \frac{c}{g(0)} g(\theta) = \frac{1}{c} g''(\theta)##

This implies that ##f(\theta) = \frac{1}{c} g'(\theta)##. (There is an arbitrary constant involved in integrating, but it must be zero if ##f(0) = 0##, which must be true for momentum).

So the equation for ##g''(\theta)## has an immediate solution (the ##sinh## term is excluded because ##g'(0) = c f(0) = 0##):

##g(\theta) = g(0) \cosh\left(\frac{c}{\sqrt{g(0)}} \theta\right)##
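A finite-difference check that this cosh form solves ##g'' = \frac{c^2}{g(0)} g## for a ##g(0)## that is not ##c^2##. This is only a sketch: the value of `G0`, the step size, and the helper names are arbitrary choices for illustration, with ##c = 1##.

```python
import math

C = 1.0   # units with c = 1 (an assumption of this sketch)
G0 = 2.5  # arbitrary positive value of g(0), deliberately not equal to c^2


def g(theta):
    """Candidate solution g(theta) = g(0) cosh(c theta / sqrt(g(0)))."""
    return G0 * math.cosh(C * theta / math.sqrt(G0))


def second_derivative(fn, x, h=1e-4):
    """Central-difference estimate of fn''(x)."""
    return (fn(x + h) - 2.0 * fn(x) + fn(x - h)) / h**2


for theta in (0.0, 0.5, 1.3):
    lhs = second_derivative(g, theta)
    rhs = C**2 / G0 * g(theta)
    assert math.isclose(lhs, rhs, rel_tol=1e-5)
```

The check passes for any positive `G0`, which illustrates why the collision argument alone cannot fix ##g(0)##.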

Since ##cosh(\theta) = \gamma##, we would be home free if we could show that ##g(0) = c^2##. But information about collisions doesn't seem to give enough information to conclude this. (This post is about collisions in one dimension. There is an alternative derivation in 2 dimensions that allows you to deduce this.)

But there is a final trick up my sleeve, which is the work-energy equation in one dimension:

##\Delta E = F \Delta x = \dfrac{\Delta p}{\Delta t} \Delta x = \Delta p \dfrac{\Delta x}{\Delta t} = \dfrac{\Delta p}{\Delta \theta} \Delta \theta \dfrac{\Delta x}{\Delta t} ##

Or dividing through by ##\Delta \theta## and taking the limit as ##\Delta t \rightarrow 0## and ##\Delta \theta \rightarrow 0##

##\dfrac{dE}{d\theta} = \dfrac{dp}{d\theta} v## where I've used ##\frac{dx}{dt} = v##.

Since in terms of ##\theta##, ##v = c \tanh(\theta)##, this becomes:

##\dfrac{dE}{d\theta} = \dfrac{dp}{d\theta} c tanh(\theta)##.

In terms of our functions ##g## and ##f##, this implies

##m \dfrac{dg}{d\theta} = m \dfrac{df}{d\theta} c tanh(\theta)##.

Since ##f(\theta) = \frac{1}{c} g'(\theta)##, we can rewrite this in terms of ##f##

##c f(\theta) = \dfrac{df}{d\theta} c tanh(\theta)##

Which implies that

##\dfrac{\frac{df}{d\theta}}{f} = \frac{1}{tanh(\theta)} = \dfrac{cosh(\theta)}{sinh(\theta)}##

which has the solution:

##f(\theta) = K sinh(\theta)##

for some constant ##K##. The nonrelativistic limit ##f(\theta) \approx c \theta## tells us that ##K = c##.

So there we have it:

##f(\theta) = c sinh(\theta)##

Since ##f(\theta) = \frac{1}{c} g'(\theta)##, this implies:

##g'(\theta) = c^2 sinh(\theta)##, so ##g(\theta) = c^2 cosh(\theta)## (the constant must be zero from the fact that ##g''(\theta) \propto g(\theta)##)

So we conclude the energy-momentum functions for relativity:

##E = m g(\theta) = m c^2 cosh(\theta) = m c^2 \gamma##
##p = m f(\theta) = mc sinh(\theta) = m c tanh(\theta) cosh(\theta) = m v \gamma##
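A short numerical check of the final formulas, as a sketch with ##c = 1## and illustrative function names: the rapidity forms agree with the ##\gamma## forms, they satisfy ##E^2 - (pc)^2 = (mc^2)^2##, and the kinetic part reduces to ##\frac{1}{2} m v^2## at low speed.

```python
import math

C = 1.0  # units with c = 1 (an assumption of this sketch)


def energy(m, v, c=C):
    """E = m c^2 cosh(theta) with v = c tanh(theta)."""
    return m * c**2 * math.cosh(math.atanh(v / c))


def momentum(m, v, c=C):
    """p = m c sinh(theta) with v = c tanh(theta)."""
    return m * c * math.sinh(math.atanh(v / c))


m, v = 1.0, 0.8
gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)

# Rapidity forms agree with the gamma forms.
assert math.isclose(energy(m, v), m * C**2 * gamma, rel_tol=1e-12)
assert math.isclose(momentum(m, v), m * v * gamma, rel_tol=1e-12)

# cosh^2 - sinh^2 = 1 gives the invariant E^2 - (p c)^2 = (m c^2)^2.
assert math.isclose(energy(m, v) ** 2 - (momentum(m, v) * C) ** 2,
                    (m * C**2) ** 2, rel_tol=1e-12)

# Nonrelativistic limit: E - m c^2 is close to m v^2 / 2 for v << c.
v_slow = 1e-3
kinetic = energy(m, v_slow) - m * C**2
assert math.isclose(kinetic, 0.5 * m * v_slow**2, rel_tol=1e-5)
assert math.isclose(momentum(m, v_slow), m * v_slow, rel_tol=1e-5)
```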


vanhees71
Gold Member
Another approach is to use Poincare invariance and ask for a Lagrangian of one particle fulfilling all the 10 conservation laws associated with the one-parameter subgroups.

For both Newtonian and relativistic spacetime from the symmetry under time and space translations as well as spatial rotations it follows that
$$L(\vec{x},\vec{v},t)=F(\vec{v}^2).$$
Now you can use the formalism of Noether's theorem for the symmetry under Lorentz boosts, but this is not necessary either, because the only way to get an invariant action built only from ##\vec{v}^2=\dot{\vec{x}}^2## is obviously
$$S=-m c^2\int_{t_1}^{t_2} \mathrm{d} t \sqrt{1-\dot{\vec{x}}^2/c^2}.$$
From this you get the (canonical) momenta
$$\vec{p}=\frac{\partial L}{\partial \dot{\vec{x}}}=\frac{m \dot{\vec{x}}}{\sqrt{1-\dot{\vec{x}}^2/c^2}}.$$
Since ##L## is not explicitly dependent on time, you also have the energy as a conserved quantity
$$E=\vec{p} \cdot \dot{\vec{x}}-L=\frac{m c^2}{\sqrt{1-\dot{\vec{x}}^2/c^2}}.$$
Since further
$$\mathrm{d} \tau=\sqrt{1-\vec{v}^2/c^2} \mathrm{d} t$$
is a scalar quantity and we can write
$$\begin{pmatrix} E/c \\ \vec{p} \end{pmatrix}=m \frac{\mathrm{d} x^{\mu}}{\mathrm{d} \tau}$$
with ##x=(x^{\mu})=(c t,\vec{x})## we get the four-momentum in a manifestly covariant way.
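The momentum and energy that follow from this Lagrangian can be checked numerically. The sketch below restricts to one spatial dimension, so ##\vec{v}## becomes a scalar ##v##, and replaces the analytic derivative ##\partial L/\partial \dot{x}## by a central difference; the function names and ##c = 1## are assumptions made for illustration.

```python
import math

C = 1.0  # units with c = 1 (an assumption of this sketch)


def lagrangian(v, m=1.0, c=C):
    """Free-particle Lagrangian L = -m c^2 sqrt(1 - v^2/c^2), in 1D."""
    return -m * c**2 * math.sqrt(1.0 - v**2 / c**2)


def canonical_momentum(v, m=1.0, c=C, h=1e-6):
    """Central-difference estimate of p = dL/dv."""
    return (lagrangian(v + h, m, c) - lagrangian(v - h, m, c)) / (2.0 * h)


def energy_from_lagrangian(v, m=1.0, c=C):
    """E = p v - L."""
    return canonical_momentum(v, m, c) * v - lagrangian(v, m, c)


v = 0.6
gamma = 1.0 / math.sqrt(1.0 - v**2 / C**2)

assert math.isclose(canonical_momentum(v), gamma * v, rel_tol=1e-8)            # p = gamma m v
assert math.isclose(energy_from_lagrangian(v), gamma * C**2, rel_tol=1e-8)    # E = gamma m c^2
```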

strangerep
Another approach is to use Poincare invariance and ask for a Lagrangian of one particle fulfilling all the 10 conservation laws associated with the one-parameter subgroups.
[...] we get the four-momentum in a manifestly covariant way.
(Nitpick:) what about light-like particles?

vanhees71
Gold Member
Though there are no light-like particles, you can describe them with the Lagrange formalism. Of course you have to change to another Lagrangian.

For a massive particle, if you have the action in the "square-root form" given above, i.e., a parametrization-invariant action,
$$S=\int \mathrm{d} \lambda [-mc^2 \sqrt{\dot{x}_{\mu} \dot{x}^{\mu}}+L_I(x,\dot{x},\lambda)],$$
you can use another physically equivalent action
$$\tilde{S}=\int \mathrm{d} \lambda [-\frac{m}{2} \dot{x}_{\mu} \dot{x}^{\mu}+L_I(x,\dot{x})].$$
Here ##L_I(x,\dot{x})## must be a 1st-rank homogeneous function of ##\dot{x}##, i.e.,
$$L_I=\dot{x}^{\mu} \frac{\partial L_I}{\partial \dot{x}^{\mu}}.$$
While with ##S## you get a parametrization-invariant description, i.e., you can choose whatever world-line parameter you like, with ##\tilde{S}## you get automatically an affine parameter, i.e., since the Lagrangian is not explicitly dependent on ##\lambda##, you get
$$\dot{x}_{\mu} \dot{x}^{\mu}=C=\text{const}.$$
For a massive particle the world line must be timelike, and you can choose ##C=1## (then ##\lambda=s=c \tau##) or ##C=c^2## (then ##\lambda=\tau##, with ##\tau## the particle's proper time).

Now in the latter form with ##\tilde{S}## you can also choose ##C=0##. Then you describe massless particles with some arbitrary affine parameter, though I don't think that massless classical point particles make any sense.

The only example that makes sense is the use of the "naive photon picture" for the propagation of light, though it's utterly misleading to call this "photon". The most famous application is the calculation of Einstein's formula for the bending of light by the Sun within general relativity (where the above Lagrange formalism is of course also valid; you only have to use ##g_{\mu \nu}## describing the Lorentzian space-time pseudometric). What's called by modern slang "photon" here is nothing else than the eikonal approximation for the electromagnetic field (aka geometrical optics), taking its characteristics as worldlines of "massless photons".

strangerep
For a massive particle, if you have the action in the "square-root form" given above, i.e., a parametrization-invariant action,
$$S=\int \mathrm{d} \lambda [-mc^2 \sqrt{\dot{x}_{\mu} \dot{x}^{\mu}}+L_I(x,\dot{x},\lambda)],$$
you can use another physically equivalent action
$$\tilde{S}=\int \mathrm{d} \lambda [-\frac{m}{2} \dot{x}_{\mu} \dot{x}^{\mu}+L_I(x,\dot{x})].$$
Here ##L_I(x,\dot{x})## must be a 1st-rank homogeneous function of ##\dot{x}##, i.e.,
$$L_I=\dot{x}^{\mu} \frac{\partial L_I}{\partial \dot{x}^{\mu}}.$$
In ##\tilde{S}##, don't you need a 2-homogeneous version of ##L_I## ?

If that's short, I wonder what the long way is!

Staff Emeritus
Something that I realize about how relativistic energy and momentum differ from classical, apart from the different transformation rules, is this:
Classical physics assumes that momentum and kinetic energy are proportional to mass. But it doesn’t assume that total energy is proportional to mass. So there can be two objects with the same mass, same momentum, same kinetic energy, but different total energy.

vanhees71
Gold Member
In ##\tilde{S}##, don't you need a 2-homogeneous version of ##L_I## ?
The trick is to keep ##L_I## from the parametrization-independent ("square-root") formulation. Then the conservation law that follows from ##L## not depending explicitly on the parameter in the formulation with ##\tilde{S}## implies that ##\lambda## is an affine parameter. BTW for ##S## this law doesn't lead to anything, because there ##H \equiv 0##.

For a derivation of ##\tilde{S}## in the special-relativistic case, see

https://itp.uni-frankfurt.de/~hees/pf-faq/srt.pdf

Sect. 2.4.3

vanhees71
Gold Member
Something that I realize about how relativistic energy and momentum differ from classical, apart from the different transformation rules, is this:
Classical physics assumes that momentum and kinetic energy are proportional to mass. But it doesn’t assume that total energy is proportional to mass. So there can be two objects with the same mass, same momentum, same kinetic energy, but different total energy.
Also in relativistic physics nobody assumes that total energy is proportional to mass. How do you come to this idea?

Staff Emeritus
Also in relativistic physics nobody assumes that total energy is proportional to mass. How do you come to this idea?

How can you say “nobody” when I implied that I assume that? That’s kind of rude. I’m nobody?

I thought it was common for massive particles to define mass to be the energy as measured in the rest frame. What definition of mass are you using?

PAllen
Also in relativistic physics nobody assumes that total energy is proportional to mass. How do you come to this idea?
Huh? Total energy is the time component of 4-momentum, which is mass times 4-velocity, so total energy is proportional to mass.

ergospherical
Gold Member
Well, to give an example, the electrostatic energy ##U = \dfrac{q_1 q_2}{r}## of two charged particles [or more generally the electrostatic energy ##U = \frac{1}{8\pi} \int E^2 dV## of any distribution of charged particles] doesn't depend on mass.

Staff Emeritus
Here are some facts about momentum and energy that are true both relativistically and classically. (In one spatial dimension, for simplicity):

First relation

Let ##V(v,u)## be the velocity addition function: if an object has velocity ##v## in one frame ##F##, and that frame is moving at velocity ##-u## relative to another frame, ##F'##, then the object has velocity ##V(v,u)## in frame ##F'##. (I chose the relative velocity to be ##-u## so that ##V(v,u) \gt v## when ##u \gt 0##.)

Classically, ##V(v,u) = v+u##. Relativistically, ##V(v,u) = \dfrac{v+u}{1+\frac{uv}{c^2}}##.

Let ##E(v)## be the energy of a particle traveling at speed ##v##, and let ##p(v)## be its momentum. Then both classically and relativistically,

##\frac{dE}{dv} \frac{\partial V}{\partial u}|_{u=0} = p##

Nonrelativistically, ##V(v,u)## is just ##v+u##, and so this becomes:

##\frac{dE}{dv} = p##

Relativistically ##V(v,u) = \dfrac{v+u}{1+\frac{uv}{c^2}}##. So this becomes:

##\frac{dE}{dv} (1 - \frac{v^2}{c^2}) = p##
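The relativistic version of the first relation can be verified with finite differences. This is a sketch only: it plugs in the known relativistic ##E(v)## and ##p(v)## rather than deriving them, uses ##c = 1##, and the helper names are made up for the example.

```python
import math

C = 1.0  # units with c = 1 (an assumption of this sketch)


def V(v, u, c=C):
    """Relativistic velocity addition function V(v, u)."""
    return (v + u) / (1.0 + u * v / c**2)


def E(v, m=1.0, c=C):
    return m * c**2 / math.sqrt(1.0 - v**2 / c**2)


def p(v, m=1.0, c=C):
    return m * v / math.sqrt(1.0 - v**2 / c**2)


def derivative(fn, x, h=1e-6):
    """Central-difference estimate of fn'(x)."""
    return (fn(x + h) - fn(x - h)) / (2.0 * h)


v = 0.5
dV_du = derivative(lambda u: V(v, u), 0.0)  # partial V / partial u at u = 0
dE_dv = derivative(E, v)

assert math.isclose(dV_du, 1.0 - v**2 / C**2, rel_tol=1e-8)
assert math.isclose(dE_dv * dV_du, p(v), rel_tol=1e-8)
```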

Second relation

The work-energy relationship can be rearranged to give:

##\frac{dE}{dp} = v##

Combining the two relationships through writing ##\frac{dE}{dv} = \frac{dE}{dp} \frac{dp}{dv} = v \frac{dp}{dv}## gives:

Classically:

##v \frac{dp}{dv} = p##

This has the solution: ##p = mv##. (It has to be proportional to ##v##, so ##m## is in a sense, just the name of this constant of proportionality)

Relativistically:

##v (1 - \frac{v^2}{c^2}) \frac{dp}{dv} = p##

This is a more difficult equation to solve, but if you make the substitution ##v = c tanh(\theta)##, it simplifies to:
##\frac{dp}{p} = \frac{cosh(\theta)}{sinh(\theta)} d\theta##, which has the solution

##p = m c\, sinh(\theta) = m \frac{v}{(1-\frac{v^2}{c^2})^\frac{1}{2}}## (writing the constant of integration as ##mc##, so that ##p \approx mv## for small ##v##)
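Both the second relation and the combined relativistic equation can be checked against ##p = m v / \sqrt{1 - v^2/c^2}## by finite differences. As before, this is a sketch with ##c = 1## and illustrative helper names.

```python
import math

C = 1.0  # units with c = 1 (an assumption of this sketch)


def energy(v, m=1.0, c=C):
    return m * c**2 / math.sqrt(1.0 - v**2 / c**2)


def momentum(v, m=1.0, c=C):
    return m * v / math.sqrt(1.0 - v**2 / c**2)


def derivative(fn, x, h=1e-6):
    """Central-difference estimate of fn'(x)."""
    return (fn(x + h) - fn(x - h)) / (2.0 * h)


v = 0.5
dp_dv = derivative(momentum, v)
dE_dv = derivative(energy, v)

# Second relation: dE/dp = (dE/dv) / (dp/dv) = v.
assert math.isclose(dE_dv / dp_dv, v, rel_tol=1e-8)

# Combined relativistic equation: v (1 - v^2/c^2) dp/dv = p.
assert math.isclose(v * (1.0 - v**2 / C**2) * dp_dv, momentum(v), rel_tol=1e-8)
```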

Staff Emeritus
Well, to give an example, the electrostatic energy ##U = \dfrac{q_1 q_2}{r}## of two charged particles...

Well, in the frame in which the total momentum is 0, the two-particle composite system will have an effective mass that includes that electrostatic energy. The one-particle mass is a little ambiguous.

Well, to give an example, the electrostatic energy ##U = \dfrac{q_1 q_2}{r}## of two charged particles [or more generally the electrostatic energy ##U = \frac{1}{8\pi} \int E^2 dV## of any distribution of charged particles] doesn't depend on mass.
But this electrostatic energy belongs to a system. In the COM frame, the mass of the system is equal to the sum of the energies, including this electrostatic energy.


ergospherical
Gold Member
This is all true, but the energy in the field still does not depend on the masses of the particles

PAllen
This is all true, but the energy in the field still does not depend on the masses of the particles
Who said anything about particles? Total energy of any system is proportional to its invariant mass.

ergospherical
Gold Member
So long as you are in a reference system where ##\sum \limits_a \mathbf{p}_a = 0## then yeah, that's true by definition.

The point is that in a system of charged particles the energy is ##U = \frac{1}{8\pi}\int (E^2 + B^2) dV + \sum \limits_{a} \dfrac{ m_a }{\sqrt{1-v_a^2}}## and the first term (the energy in the field) has no dependence on the masses ##m_a##.

PAllen
So long as you are in a reference system where ##\sum \limits_a \mathbf{p}_a = 0## then yeah, that's true by definition.

The point is that in a system of charged particles the energy is ##U = \frac{1}{8\pi}\int (E^2 + B^2) dV + \sum \limits_{a} \dfrac{ m_a }{\sqrt{1-v_a^2}}## and the first term (the energy in the field) has no dependence on the masses ##m_a##.
No, it is true in any frame, not just the COM frame. The invariant mass gives the energy in the COM frame, and then gamma times this is the total energy in any frame (in SR, of course). It is more than just a definition. Effectively, it requires that all forms of matter and energy contribute to inertia in a fixed way.

ergospherical
Gold Member
The invariant mass gives the energy in the COM frame, and then gamma times this is the total energy in any frame (in SR, of course).
What is then ##\gamma## for a system of more than one particle?

PAllen
What is then ##\gamma## for a system of more than one particle?
It is a function of the speed of the COM in that specified frame.

ergospherical
Gold Member
Ok I checked, you are correct:\begin{align*}
E = \dfrac{\sqrt{E^2-P^2}}{\sqrt{1-\dfrac{P^2}{E^2}}} = \dfrac{M}{\sqrt{1-V^2}}
\end{align*}
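The same identity for a concrete two-particle system, as a sketch in units with ##c = 1##. The masses and velocities are arbitrary, and interaction (field) energy is left out, so the system here is just two free particles.

```python
import math


def energy_momentum(m, v):
    """(E, p) of a free particle of mass m and velocity v, with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v**2)
    return gamma * m, gamma * m * v


particles = [(1.0, 0.6), (2.0, -0.3)]  # (mass, velocity) pairs, illustrative

E = sum(energy_momentum(m, v)[0] for m, v in particles)
P = sum(energy_momentum(m, v)[1] for m, v in particles)

M = math.sqrt(E**2 - P**2)  # invariant mass of the system
V = P / E                   # velocity of the COM frame

# Total energy is gamma(V) times the invariant mass, in any frame.
assert math.isclose(E, M / math.sqrt(1.0 - V**2), rel_tol=1e-12)
# The invariant mass exceeds the sum of the particle masses.
assert M > sum(m for m, _ in particles)
```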

Effectively, it requires that all forms of matter and energy contribute to inertia in a fixed way.
I think it is better to say: "Effectively, it requires that all forms of energy contribute to inertia in a fixed way." "Matter" doesn't matter.

vanhees71
Gold Member
How can you say “nobody” when I implied that I assume that? That’s kind of rude. I’m nobody?

I thought it was common for massive particles to define mass to be the energy as measured in the rest frame. What definition of mass are you using?
I wanted to say that nobody would assume a particle's energy to be proportional to ##m## in Newtonian physics. Why should one assume that in relativistic physics?

The mass of a particle is defined covariantly by ##p_{\mu} p^{\mu}=m^2 c^2##, where ##(p^{\mu})## is the four-momentum of the particle. This of course implies that in an instantaneous rest frame ##p^0=m c##.

vanhees71