Derivation of E=pc & E=MC2: Which Came First?

atyy · Dec 28, 2020

Nugatory said:

I'm not just completely quibbling here. In this case we end up with the right answer, but consider the similar problem with the Lorentz transformations (and the time dilation and length contraction formulas that follow from them). These are derived under assumptions that are equivalent to ##v<c##, just as your argument is derived under the assumption that ##m_0\ne 0## - but when we try to generalize by dumping that assumption and plugging in other values of ##v## the results are nonsensical in a way that has confused generations of physics students. Why should one procedure be any less invalid than the other?

john t said:

Thanks Nugatory. I understand, and I do not think you are quibbling, and your analogy with the Lorentz situation makes the point clear. Can one say that my logic shows the consistency of the specific with the general equation, given the acceptance (naive?) of relativistic mass?

Relativistic mass is not wrong, as it's found in many good books (eg. by Feynman, Purcell, Rindler) just that one has to be careful with how one uses it. In many cases, it is easier to avoid the relativistic mass for the purpose of obtaining a correct calculation.

Anyway, maybe to rephrase @john t's question, why does using an argument beyond the validity of its assumptions sometimes work? From the point of view of theory, massless photons are completely different from massive particles, as they are not at rest in any of the usual inertial reference frames. However, given that we don't know for sure that neutrinos and photons are massless, it seems that the equations for massless photons and for massive photons give almost the same results, i.e. it would seem that the equations for massive particles should give results "close to" those for massless particles.

robphy · Dec 28, 2020

atyy said:

Relativistic mass is not wrong, as it's found in many good books (eg. by Feynman, Purcell, Rindler) just that one has to be careful with how one uses it. In many cases, it is easier to avoid the relativistic mass for the purpose of obtaining a correct calculation.

I think the issue is that it is more likely that
a non-expert [who likely lacks knowledge of the "fine print"] will misuse "relativistic mass".
So, a better way has been sought to avoid its misuse by the general user.
(Maybe a license is needed to use "relativistic mass".)

[I do the same thing with "centrifugal force".]

In addition, sometimes it's not just about getting the right answer,
but it's about how one gets to that answer...
and hopefully, the method works for other problems... not just that one [or two] problems.
(In retrospect, one can then say [to the experts] "it is in this case, as if the mass were... ".)

atyy said:

Anyway, maybe to rephrase @john t's question, why does using an argument beyond the validity of its assumptions sometimes work? From the point of view of theory, massless photons are completely different from massive particles, as they are not at rest in any of the usual inertial reference frames. However, given that we don't know for sure that neutrinos and photons are massless, it seems that the equations for massless photons and for massive photons give almost the same results, i.e. it would seem that the equations for massive particles should give results "close to" those for massless particles.

There is certainly value in "motivating a result"... say from observational data... then suggesting an extension.

But that is different from a "derivation",
hopefully from a more organized logical presentation of the theory.

My $0.02.

PeterDonis · Dec 29, 2020

Sagittarius A-Star said:

I think you could derive an equation, that is also valid for photons, by avoiding ##\gamma## as a factor

But you didn't do that. Rearranging ##E = \gamma m## to ##m = E / \gamma## is not avoiding the gamma factor.

The best way to get an equation valid for photons as well as massive particles is to start with ##m^2 = E^2 - p^2##, which is obtained by simply taking the norm of the 4-momentum vector. Then you can, as I described in an earlier post, specialize to the cases ##m > 0## and ##m = 0## as needed.
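As a rough numerical illustration (a sketch, not a derivation), the relation can be evaluated in units where ##c = 1##. The function names `energy` and `speed` are made up for this example, and `speed` assumes the standard relation ##v = p/E##, which is not stated above. The same code path handles ##m > 0## and ##m = 0##, with no ##\gamma## anywhere:

```python
import math

def energy(m, p):
    """Energy from E^2 = m^2 + p^2, in units with c = 1."""
    return math.sqrt(m * m + p * p)

def speed(m, p):
    """Speed in units of c, using the assumed relation v = p / E.

    Requires p >= 0 and not both m and p equal to zero.
    """
    return p / energy(m, p)

# Massive case: illustrative values m = 0.511, p = 1.0 (same energy units)
print(energy(0.511, 1.0), speed(0.511, 1.0))

# Massless case: E equals p and the speed is exactly 1
print(energy(0.0, 1.0), speed(0.0, 1.0))
```

For ##m = 0## this gives ##E = p## and ##v = 1##; for any ##m > 0## the speed stays below 1.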

Sagittarius A-Star · Dec 29, 2020

PeterDonis said:

But you didn't do that. Rearranging ##E = \gamma m## to ##m = E / \gamma## is not avoiding the gamma factor.

I am not using ##m = E / \gamma##, which would be undefined for ##v=c##. What I wrote, is different for the case ##v=c##.

Also, the equation ##mc^2 = E \sqrt {1-\frac{v^2} {c^2} } ## shows nicely, that an object with an invariant mass of zero must move with ##c##.

vanhees71 · Dec 29, 2020

If there was not the obsession of even modern textbook writers to insist on interpreting photons as "particles" there'd be no need for massless classical particles to begin with and there'd not be that problem with "deriving" the energy-momentum relation for massless particles.

If you insist on the classical massless-particle picture you have to treat it somehow as a limiting case for massive particles for velocities ##\vec{v}## close to ##|\vec{v}|=c##. The reason is that there is no "natural" measure of time for massless particles, while for massive particles it's the proper time, which is used to define mass as a scalar quantity and make it in fact the same quantity as in the Newtonian limit. That's how the ##\gamma## factor comes into the definition for energy and momentum for classical particles:
$$p^{\mu}=m \mathrm{d}_{\tau} x^{\mu}$$
from which
$$p_{\mu} p^{\mu}=m^2 c^2$$
and now there's no problem to make ##m \rightarrow 0## and get ##E=p^0 c=c|\vec{p}|##.

It also becomes clear, why there is no "natural" time measure for massless particles! It's because there is no natural intrinsic scale present. For massive particles this scale is of course its invariant mass.

Sagittarius A-Star · Dec 29, 2020

PeterDonis said:

The best way to get an equation valid for photons as well as massive particles is to start with ##m^2 = E^2 - p^2##, which is obtained by simply taking the norm of the 4-momentum vector. Then you can, as I described in an earlier post, specialize to the cases ##m > 0## and ##m = 0## as needed.

Yes. Without a ##\gamma##:

(1) Four-momentum ## \mathbf P = (E/c , p_x, p_y, p_z) = (\frac {E}{c^2}c , \frac {E}{c^2} v_x , \frac {E}{c^2} v_y , \frac {E}{c^2} v_z) ##

(2) Pseudo-scalar product ## \mathbf P \cdot \mathbf P = \frac {E^2}{c^4} (c^2-v^2) = \frac {E^2}{c^2} (1-\frac{v^2}{c^2})##

(3) Minkowski-norm ## \left \|\mathbf P\right \| \cdot c = E \sqrt {1-\frac{v^2}{c^2}} =
\begin{cases}
0 & \text{if } v=c \\
E_0 & \text{if } v=0 \\
E_0 & \text{if } 0<v<c \text{ (invariant; } E \text{ depends on } v \text{)}
\end{cases}
##

From (1), (2) and (3) it follows that the pseudo-scalar product is ## \mathbf P \cdot \mathbf P \, c^2 = E^2 - p^2c^2 =
\begin{cases}
0 & \text{if } v=c \\
E_0^2 & \text{if } v<c
\end{cases}
##
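A small numerical sanity check of step (3) and of the last step, as a sketch only: the function names are hypothetical, the speed is passed as ##\beta = v/c##, and ##pc = E\beta## is taken from the definition in (1). Both routes should give the same invariant, and both should give 0 at ##\beta = 1##:

```python
import math

def rest_energy_from_speed(E, beta):
    """Step (3): E * sqrt(1 - v^2/c^2), with beta = v/c in [0, 1]."""
    return E * math.sqrt(1.0 - beta * beta)

def rest_energy_from_momentum(E, beta):
    """Last step: sqrt(E^2 - p^2 c^2), using p c = E * beta from step (1)."""
    pc = E * beta
    # max(...) guards against a tiny negative value from floating-point round-off
    return math.sqrt(max(E * E - pc * pc, 0.0))

for beta in (0.0, 0.6, 1.0):
    print(beta, rest_energy_from_speed(5.0, beta), rest_energy_from_momentum(5.0, beta))
```

The energy value 5.0 is arbitrary; only the agreement between the two columns matters.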

vanhees71 · Dec 29, 2020

If you now just write ##E_0=m c^2##, everything is crystal clear :-).

Sagittarius A-Star · Dec 29, 2020

vanhees71 said:

If you now just write ##E_0=m c^2##, everything is crystal clear :-).

##m := \frac {\left \|\mathbf P\right \|}{c}##

vanhees71 · Dec 29, 2020

I don't like this notation, because it suggests there is a metric in Minkowski space, but there's none. It's an indefinite non-degenerate fundamental form but not a proper scalar product and thus doesn't induce a metric in the vector space.

Sagittarius A-Star · Dec 29, 2020

vanhees71 said:

I don't like this notation

##m_0 := \frac {E_0}{c^2}##

PeterDonis · Dec 29, 2020

Sagittarius A-Star said:

Yes. Without a ##\gamma##:

Much simpler, without a ##\gamma##: ##P^\mu = (E, \vec{p})##, so ##P^\mu P_\mu = E^2 - \vec{p} \cdot \vec{p} = E^2 - p^2 = m^2##. In other words, go directly from your first equation in your step (1), to your step (4).

jbergman · Jan 20, 2021

I asked a similar question here a few weeks ago and using 4-momentum to show that ##E^2=m^2 + p^2## is a circular argument. Just stating that the 4-momentum, ##P^\mu = (E,\vec{p})## obviously gives the relationship we are seeking, but it doesn't say anything about why the zeroth component of 4-momentum is energy.

@vanhees71 gave the best answer on this thread that one must start with the relativistic Lagrangian as derived from Noether's theorem and then derive these relations for massive particles and take the limit as ##m \rightarrow 0##.

Alternatively, there are physical arguments that can be used to justify ##E=mc^2##, like the one given by Feynman in his lectures on physics, from which one can work out the rest. I think too many just start from accepting 4-momentum as a given without understanding why it is true.

lomidrevo · Jan 20, 2021

jbergman said:

Just stating that the 4-momentum, ##P^\mu = (E,\vec{p})## obviously gives the relationship we are seeking, but it doesn't say anything about why the zeroth component of 4-momentum is energy.

In analogy with Newtonian mechanics, it is natural to define four-momentum as mass times the four-velocity:
$$P^\mu=mU^\mu$$
Where four-velocity is (also naturally imo) defined as rate of change of position in spacetime with respect to proper time:
$$U^\mu=\frac{dx^\mu}{d\tau}, ~~~ d\tau^2 = - \eta_{\mu\nu}dx^\mu dx^\nu$$
Putting it all together, as a result you can see that ##\mathbf P = (E, \vec p)##. Note, this is valid only for massive particles.

jbergman · Jan 20, 2021

lomidrevo said:

Putting it all together, as a result you can see that ##\mathbf P = (E, \vec p)##. Note, this is valid only for massive particles.

Well, you didn't explain the most important part. How do you see that ##\mathbf P^0 = E##?

In the frame co-moving with the observer you get ##P^0 = m dt/d\tau = m = E##. So you are using the Energy-Mass equivalence to show that ##\mathbf P = (E, \vec p)##. That's why, IMO, ##\mathbf P = (E, \vec p)## is not fundamental. One must first prove ##E=mc^2## or, as mentioned before, start with the Lagrangian.

Sagittarius A-Star · Jan 20, 2021

jbergman said:

I asked a similar question here a few weeks ago and using 4-momentum to show that ##E^2=m^2 + p^2## is a circular argument. Just stating that the 4-momentum, ##P^\mu = (E,\vec{p})## obviously gives the relationship we are seeking, but it doesn't say anything about why the zeroth component of 4-momentum is energy.

An object moves with speed ##v= \sqrt{v_x^2+v_y^2+v_z^2}## in space and with velocity ##c## in ct-direction:
$$\frac{d}{dt}(ct, x, y, z) = (c, \vec{v})$$
Definition of relativistic momentum (experimentally verified):
$$ \mathbf P = (p_t , \vec{p}) = \frac{E}{c^2} (c, \vec{v}) = \frac{E}{c} (1, \vec{\beta})$$

jbergman · Jan 20, 2021

Sagittarius A-Star said:

An object moves with speed ##v= \sqrt{v_x^2+v_y^2+v_z^2}## in space and with velocity ##c## in ct-direction:
$$\frac{d}{dt}(ct, x, y, z) = (c, \vec{v})$$
Definition of relativistic momentum (experimentally verified):
$$ \mathbf P = (p_t , \vec{p}) = \frac{E}{c^2} (c, \vec{v}) = \frac{E}{c} (1, \vec{\beta})$$

So basically you are stating as an axiom that ##\mathbf P = (E, \vec{p})## and that we believe it because of experimental verification. A more physical way to prove this, IMO, is to first prove ##E=mc^2##. There are physical arguments for why that is true, and then derive the Energy - Momentum relationships.

Sagittarius A-Star · Jan 20, 2021

jbergman said:

A more physical way to prove this, IMO, is to first prove ##E=mc^2##.

That's not physical. You can omit the redundant words "mass" and "energy" and speak only about ##P## (with ##\ \ \ p_t = \frac{c}{v} * p##). Conserved is:
$$ \mathbf P = p_t (1, \vec{\beta})$$

lomidrevo · Jan 20, 2021

jbergman said:

Well, you didn't explain the most important part. How do you see that ##\mathbf P^0 = E##?

Let's keep SI units for now, it is more illustrative. Based on the above post, you can write the time-like component of the 4-momentum like this:
$$P^0 = \frac{mc^2}{\sqrt {1-v^2/c^2}}$$
You can see this quantity has units of energy. Now, we can further examine it by considering the case ##v \ll c##, where it can be written as
$$P^0 \approx mc^2 + \frac{1}{2}mv^2$$
That looks like kinetic energy from Newtonian mechanics + some term that does not depend on the velocity: a rest energy? However, this ##P^0## term is not invariant under Lorentz transformation, nor are the components of the "classical" 3-momentum. But if you put those components together and create a 4-vector (as defined above) you will see that its magnitude (inner product with itself) is a scalar, invariant under Lorentz transformation. Voilà, we discovered conservation of energy and conservation of momentum.
It looks like energy, it has the same units, in the Newtonian limit it includes the well-known kinetic energy, it is conserved when properly combined with 3-momentum... Why not call it energy?!
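For a feel of how good the low-velocity approximation is, here is a minimal numerical sketch. The function names and the sample mass of 1 kg are chosen for illustration only; it compares the exact expression with ##mc^2 + \frac{1}{2}mv^2## at a few speeds:

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)

def p0_exact(m, v):
    """m c^2 / sqrt(1 - v^2/c^2) for mass m (kg) and speed v (m/s), v < c."""
    return m * C**2 / math.sqrt(1.0 - (v / C) ** 2)

def p0_low_speed(m, v):
    """Rest energy plus Newtonian kinetic energy: m c^2 + m v^2 / 2."""
    return m * C**2 + 0.5 * m * v**2

def relative_error(m, v):
    """Fractional difference between the exact and low-speed expressions."""
    exact = p0_exact(m, v)
    return (exact - p0_low_speed(m, v)) / exact

for beta in (0.01, 0.1, 0.5):
    print(f"v = {beta} c: relative error = {relative_error(1.0, beta * C):.3e}")
```

The relative error grows roughly like ##\frac{3}{8}(v/c)^4##, the next term of the binomial expansion, so it is tiny at ##v = 0.01c## and a few percent at ##v = 0.5c##.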

lomidrevo · Jan 20, 2021

jbergman said:

One must first prove ##E=mc^2##

No, you don't have to prove ##E=mc^2##. As I showed above, it appears there automatically.

jbergman said:

or like mentioned before start with the Lagrangian.

Yes, going this way you can show that the Hamiltonian is equal to the time-like component of the 4-momentum. But I think the above works fine too, and it is more illustrative, imo.

jbergman · Jan 20, 2021

lomidrevo said:

Let's keep SI units for now, it is more illustrative. Based on the above post, you can write the time-like component of the 4-momentum like this:
$$P^0 = \frac{mc^2}{\sqrt {1-v^2/c^2}}$$
You can see this quantity has units of energy. Now, we can further examine it by considering the case ##v \ll c##, where it can be written as
$$P^0 \approx mc^2 + \frac{1}{2}mv^2$$
That looks like kinetic energy from Newtonian mechanics + some term that does not depend on the velocity: a rest energy? However, this ##P^0## term is not invariant under Lorentz transformation, nor are the components of the "classical" 3-momentum. But if you put those components together and create a 4-vector (as defined above) you will see that its magnitude (inner product with itself) is a scalar, invariant under Lorentz transformation. Voilà, we discovered conservation of energy and conservation of momentum.
It looks like energy, it has the same units, in the Newtonian limit it includes the well-known kinetic energy, it is conserved when properly combined with 3-momentum... Why not call it energy?!

I know this proof. But again, for me, that isn't very satisfying to say that something having the correct units and in the low velocity limit includes the Newtonian kinetic energy means it is energy. Especially, when we can show that more directly from physics and the postulates of special relativity.

lomidrevo · Jan 20, 2021

jbergman said:

Especially, when we can show that more directly from physics and the postulates of special relativity.

Sure, I agree. Using the principle of least action is more universal, nobody stops you from doing that. See my post #50.
But keep in mind that postulates of special relativity are still the same. I didn't use anything extra in my replies.

Before you said:

jbergman said:

but it doesn't say anything why the zeroth component of 4-momentum is energy.

I think I provided you with several arguments in posts #44 and #49 that lead to the conclusion "it must be an energy". So I find the words "it doesn't say anything" a bit unfair.

Sagittarius A-Star · Jan 20, 2021

jbergman said:

A more physical way to prove this, IMO, is to first prove ##E=mc^2##.

I think you refer to the "relativistic mass" ##m_R##. That is avoided today. Therefore I replaced it by ##E/c^2##, which is the same anyway.

jbergman · Jan 20, 2021

lomidrevo said:

Sure, I agree. Using the principle of least action is more universal, nobody stops you from doing that. See my post #50.
But keep in mind that postulates of special relativity are still the same. I didn't use anything extra in my replies.

Before you said:

I think I provided you with several arguments in posts #44 and #49 that lead to the conclusion "it must be an energy". So I find the words "it doesn't say anything" a bit unfair.

You have a point. I guess each of us has to decide what is a compelling proof. I am only chiming in on this thread because I worked through the same confusion several weeks ago. At the end of the day, the final proof is that it agrees with experiment. But there are other paths there. You provided strong motivation as to why it would be true. I, personally, found Feynman's proof in his lecture notes, based on the conservation of momentum, nice.

It is obviously a complex issue; see Ohanian, who claims that Einstein never successfully proved this.

This site, https://plato.stanford.edu/entries/equivME/, has a great article on the history of this relation and attempts to prove it including disputes as to whether or not various proofs were correct.

Gaussian97 · Jan 20, 2021

I think the whole point here is how we want to define "energy".
One easy way is just to define ##E=p^0##; then there's nothing more to add. #49 has proved that this definition is consistent with the definition of "energy" used in Newtonian mechanics, so that's good.
So now, one can say that "energy" is defined in another way. And that would be also OK, then we would just need to prove that the two definitions are equivalent.
At the end of the day, if you think that ##E=p^0## can only be proved by "experimental agreement", it is because your notion of energy requires some experiment to be defined.

PAllen · Jan 20, 2021

Demystifier said:

According to the book
https://www.amazon.com/dp/0393337685/?tag=pfamazon01-20
(which is a serious book, not a crackpot one), Einstein made 7 different proofs of ##E=mc^2## and all 7 proofs had a mistake.

It is not a crank book, but most claims in it remain disputed by other experts.

suremarc · Jan 20, 2021

I’m not an expert in physics by any means, but if you take ##E## as fixed, it follows that ##\gamma=\frac{E}{mc^2}## and ##p=\gamma mv=\frac{Ev}{c^2}##. Since ##\frac{v^2}{c^2}=1-\gamma^{-2}=1-\frac{(mc^2)^2}{E^2}##, taking ##m=0## yields ##v=c## and ##p=\frac{E}{c}##.

Whether or not this makes physical sense, I wouldn’t know, but for example the photon has a well-defined energy from Maxwell’s equations, no? So taking the limit as ##m## approaches zero is similar to approximating a photon as a point mass with a fixed energy and a very, very low mass.
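To see the arithmetic of that limit, here is a minimal sketch with hypothetical function names. The energy is held fixed and the rest energy ##mc^2## is made smaller and smaller. It only illustrates the numbers; whether the limit is physically meaningful is a separate question:

```python
import math

def beta_at_fixed_energy(E, mc2):
    """v/c from v^2/c^2 = 1 - (m c^2)^2 / E^2; requires 0 <= mc2 <= E."""
    return math.sqrt(1.0 - (mc2 / E) ** 2)

def pc_at_fixed_energy(E, mc2):
    """p c from p = E v / c^2, i.e. p c = E * (v/c)."""
    return E * beta_at_fixed_energy(E, mc2)

# Fixed energy E = 1 (arbitrary units), rest energy shrinking towards zero
for mc2 in (1.0, 1e-2, 1e-4, 0.0):
    print(mc2, beta_at_fixed_energy(1.0, mc2), pc_at_fixed_energy(1.0, mc2))
```

As ##mc^2## shrinks at fixed ##E##, ##v/c## approaches 1 and ##pc## approaches ##E##.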

PeterDonis · Jan 20, 2021

suremarc said:

it follows that ##\gamma=\frac{E}{mc^2}## and ##p=\gamma mv=\frac{Ev}{c^2}##.

But these formulas are only valid for ##m > 0##, so you can't use them to derive anything that applies to ##m = 0##. Or, if the "take the limit as ##m \rightarrow 0##" idea occurs to you, it's still not valid; formulas involving ##\gamma## are only valid for timelike vectors, not null vectors, and there is no continuous way to go from one to the other.

suremarc said:

the photon has a well-defined energy from Maxwell’s equations

Remember that we are talking classical physics, not quantum physics, so there is no such thing as a "photon". When books on classical relativity use the term "photon", what they really mean is "a pulse of light that lasts a very short time, so it can be approximated by a single null worldline in spacetime". In terms of Maxwell's Equations, this would be modeled as a wave packet with a very small "spread" in spacetime, modeled as a single null worldline using the geometric optics approximation. Such a wave packet does have a well-defined energy, yes, but it is frame-dependent.

suremarc said:

taking the limit as ##m## approaches zero is similar to approximating a photon as a point mass with a fixed energy and a very, very low mass.

No, it isn't, because, as I said above, timelike vectors and null vectors are fundamentally different, and you can't just wave your hands and "take a limit" to go from one to the other.

PAllen · Jan 20, 2021

Or put another way, a null vector is as much the limit of a ruler as it is of a trajectory; both are invalid, and there is no reason to favor one as an approximation over the other.

robphy · Jan 21, 2021

With regards to limits,
the diagram below from this old post ( https://www.physicsforums.com/threads/massless-photon.900960/post-5842652 ) might be useful

(an extension of this diagram is here:
https://physics.stackexchange.com/a/551250/148184 )

vanhees71 · Jan 21, 2021

I'd say, if you are at this level of sophistication to ask "why do the physical laws look as they do", the best answer, found in connection with Einstein's works, are symmetry principles. The most general symmetry principles are related with the space-time models used in the different theories (Galilei-Newton, Einstein-Minkowski, Einstein(-Cartan)). The latter case is a bit more complicated, because GR is after all a gauge theory "gauging" the Lorentz group.

Galilei-Newton and Einstein-Minkowski are pretty much at the same level, i.e., given the space-time structure you derive the most generally valid continuous Lie-symmetry group for these space-time models and write down possible dynamical laws for the natural mathematical objects occurring in these space-time models.

For point mechanics (very natural and simple for Newtonian physics but on the edge of being inconsistent in special relativistic physics), assume a closed system of one particle (i.e., automatically a free particle, because there's nothing the particle can interact with) and use the analysis of Noether's theorem within the Lagrangian formulation of the action principle. From the common subgroups of the Galilei and Poincaré groups (homogeneity of space and time, isotropy of space) you end up with the most general form of the Lagrangian (in Cartesian coordinates)
$$L=L(\dot{\vec{x}}^2).$$
The only difference is then in the boosts.

For an infinitesimal Galilei boost,
$$t'=t, \quad \vec{x}'=\vec{x}-\delta v \vec{n} t$$
The "symmetry condition" reads
$$-\vec{n} \cdot \frac{\partial L}{\partial \dot{\vec{x}}}+\frac{\mathrm{d}}{\mathrm{d} t} \Omega(\vec{x},t)=0.$$
From this you get
$$-2 \vec{n} \cdot \dot{\vec{x}} L'(\dot{\vec{x}}^2)+\dot{\vec{x}} \cdot \vec{\nabla} \Omega + \partial_t \Omega=0.$$
This is fulfilled for
$$\Omega=m \vec{n} \cdot \vec{x}$$
and
$$L'(\dot{\vec{x}}^2)=\frac{m}{2} \; \rightarrow\;L=\frac{m}{2} \dot{\vec{x}}^2.$$
with ##m=\text{const}##. The conserved quantity then turns out to be
$$m \vec{n} \cdot \vec{X}:=m \vec{n} \cdot (\vec{x}-\vec{v} t).$$
From Noether's theorem for temporal and spatial translation invariance this leads to the "definition" of the energy and momentum
$$E=H=\dot{\vec{x}} \cdot \vec{p}-L=\frac{m}{2} \dot{\vec{x}}^2, \quad \vec{p}=\frac{\partial L}{\partial \dot{\vec{x}}}=m \dot{\vec{x}}.$$
For an infinitesimal Lorentz boost you have
$$t'=t-\delta \vec{v} \cdot \vec{x}/c^2, \quad \vec{x}'=\vec{x}-\delta \vec{v} t.$$
A similar analysis using the condition for that to be a symmetry of the action indeed leads to
$$L=-m c^2 \sqrt{1-\dot{\vec{x}}^2/c^2},$$
and from the Noether symmetry under temporal and spatial translations you get
$$E=H=\frac{m c^2}{\sqrt{1-\dot{\vec{x}}^2/c^2}}, \quad \vec{p}=\frac{\partial L}{\partial \dot{\vec{x}}}=\frac{m \dot{\vec{x}}}{\sqrt{1-\dot{\vec{x}}^2/c^2}}.$$
From this it's easy to see that ##(E/c,\vec{p})## are four-vector components from realizing that
$$E/c=m \frac{\mathrm{d} c t}{\mathrm{d}\tau}, \quad \vec{p}=m \frac{\mathrm{d} \vec{x}}{\mathrm{d} \tau},$$
where
$$\mathrm{d} \tau=\mathrm{d} t \sqrt{1-\dot{\vec{x}}^2/c^2} = \sqrt{\mathrm{d}t^2 - \mathrm{d} \vec{x}^2/c^2}$$
is the proper time of the (massive) particle. So
$$p^{\mu}=(E/c,\vec{p})=m \mathrm{d}_{\tau} x^{\mu}.$$
Since the proper time is a Lorentz scalar and ##x^{\mu}## are Lorentz-vector components, with ##m## being a scalar, also ##p^{\mu}## are Lorentz-vector components.
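For readers who want the intermediate step, here is how the expressions for ##\vec{p}## and ##E## follow from the relativistic Lagrangian (a worked sketch; the only ingredients are the chain rule and ##H=\dot{\vec{x}} \cdot \vec{p}-L##):
$$\vec{p}=\frac{\partial L}{\partial \dot{\vec{x}}}=-m c^2 \cdot \frac{-\dot{\vec{x}}/c^2}{\sqrt{1-\dot{\vec{x}}^2/c^2}}=\frac{m \dot{\vec{x}}}{\sqrt{1-\dot{\vec{x}}^2/c^2}},$$
$$E=\dot{\vec{x}} \cdot \vec{p}-L=\frac{m \dot{\vec{x}}^2+m c^2 \left(1-\dot{\vec{x}}^2/c^2\right)}{\sqrt{1-\dot{\vec{x}}^2/c^2}}=\frac{m c^2}{\sqrt{1-\dot{\vec{x}}^2/c^2}}.$$
Squaring and subtracting then gives
$$E^2-\vec{p}^2 c^2=\frac{m^2 c^4-m^2 c^2 \dot{\vec{x}}^2}{1-\dot{\vec{x}}^2/c^2}=m^2 c^4,$$
in agreement with ##p_{\mu} p^{\mu}=m^2 c^2##.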
