Derivation of E=pc & E=mc^2: Which Came First?

  • Thread starter: imsmooth
  • Tags: Derivation, E=mc2
The discussion centers on the derivation of the equations E=pc and E=mc^2, questioning which came first and how they relate. Participants argue that both equations stem from the relativistic energy-momentum relation, with E=pc applicable to massless particles and E=mc^2 for particles with mass. There is contention over the validity of using relativistic mass in these derivations, with some asserting that it complicates understanding and is no longer considered useful in modern physics. The conversation highlights the importance of starting from a general equation to derive specific cases, emphasizing that assumptions about mass must be carefully considered. Ultimately, the debate reflects ongoing discussions in physics about the interpretation and application of these foundational equations.
  • #31
Nugatory said:
I'm not just completely quibbling here. In this case we end up with the right answer, but consider the similar problem with the Lorentz transformations (and the time dilation and length contraction formulas that follow from them). These are derived under assumptions that are equivalent to ##v<c##, just as your argument is derived under the assumption that ##m_0\ne 0## - but when we try to generalize by dumping that assumption and plugging in other values of ##v## the results are nonsensical in a way that has confused generations of physics students. Why should one procedure be any less invalid than the other?

john t said:
Thanks Nugatory. I understand, and I do not think you are quibbling, and your analogy with the Lorentz situation makes the point clear. Can one say that my logic shows the consistency of the specific with the general equation, given the acceptance (naive?) of relativistic mass?

Relativistic mass is not wrong, as it's found in many good books (e.g., by Feynman, Purcell, Rindler); it is just that one has to be careful with how one uses it. In many cases, it is easier to avoid the relativistic mass for the purpose of obtaining a correct calculation.

Anyway, maybe to rephrase @john t's question, why does using an argument beyond the validity of its assumptions sometimes work? From the point of view of theory, massless photons are completely different from massive particles, as they are not at rest in any of the usual inertial reference frames. However, given that we don't know for sure that neutrinos and photons are massless, it seems that the equations for massless photons and for massive photons give almost the same results, i.e., it would seem that the equations for massive particles should give results "close to" those for massless particles.
 
  • #32
atyy said:
Relativistic mass is not wrong, as it's found in many good books (e.g., by Feynman, Purcell, Rindler); it is just that one has to be careful with how one uses it. In many cases, it is easier to avoid the relativistic mass for the purpose of obtaining a correct calculation.

I think the issue is that it is more likely that
a non-expert [who likely lacks knowledge of the "fine print"] will misuse "relativistic mass".
So, a better way has been sought to avoid its misuse by the general user.
(Maybe a license is needed to use "relativistic mass".)

[I do the same thing with "centrifugal force".]

In addition, sometimes it's not just about getting the right answer,
but it's about how one gets to that answer...
and hopefully, the method works for other problems... not just that one [or two] problems.
(In retrospect, one can then say [to the experts] "it is in this case, as if the mass were... ".)

atyy said:
Anyway, maybe to rephrase @john t's question, why does using an argument beyond the validity of its assumptions sometimes work? From the point of view of theory, massless photons are completely different from massive particles, as they are not at rest in any of the usual inertial reference frames. However, given that we don't know for sure that neutrinos and photons are massless, it seems that the equations for massless photons and for massive photons give almost the same results, i.e., it would seem that the equations for massive particles should give results "close to" those for massless particles.

There is certainly value in "motivating a result"... say from observational data... then suggesting an extension.

But that is different from a "derivation",
hopefully from a more organized logical presentation of the theory.

My $0.02.
 
  • Like
Likes atyy
  • #33
Sagittarius A-Star said:
I think you could derive an equation, that is also valid for photons, by avoiding ##\gamma## as a factor

But you didn't do that. Rearranging ##E = \gamma m## to ##m = E / \gamma## is not avoiding the gamma factor.

The best way to get an equation valid for photons as well as massive particles is to start with ##m^2 = E^2 - p^2##, which is obtained by simply taking the norm of the 4-momentum vector. Then you can, as I described in an earlier post, specialize to the cases ##m > 0## and ##m = 0## as needed.
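
As a rough illustration of that recipe (not part of the original post), here is a minimal Python sketch in natural units (##c = 1##), as used above. The function name `energy` is an assumption made for this example only.

```python
import math

def energy(m: float, p: float) -> float:
    """Energy from the invariant relation m^2 = E^2 - p^2 (units with c = 1)."""
    return math.sqrt(m * m + p * p)

# Massive particle at rest (m > 0, p = 0): E reduces to m, i.e. E = m c^2.
print(energy(2.0, 0.0))

# Massless case (m = 0): E reduces to |p|, i.e. E = p c.
print(energy(0.0, 3.0))

# General massive case: neither special formula applies on its own.
print(energy(3.0, 4.0))
```

The point of the sketch is that one general relation is written down first, and the two familiar formulas appear only as special cases of the inputs.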
 
  • #34
PeterDonis said:
But you didn't do that. Rearranging ##E = \gamma m## to ##m = E / \gamma## is not avoiding the gamma factor.
I am not using ##m = E / \gamma##, which would be undefined for ##v=c##. What I wrote is different for the case ##v=c##.

Also, the equation ##mc^2 = E \sqrt {1-\frac{v^2} {c^2} } ## shows nicely that an object with an invariant mass of zero must move with ##c##.
 
  • #35
If even modern textbook writers were not so obsessed with interpreting photons as "particles", there'd be no need for massless classical particles to begin with, and there'd be no problem with "deriving" the energy-momentum relation for massless particles.

If you insist on the classical massless-particle picture you have to treat it somehow as a limiting case for massive particles for velocities ##\vec{v}## close to ##|\vec{v}|=c##. The reason is that there is no "natural" measure of time for massless particles, while for massive particles it's the proper time, which is used to define mass as a scalar quantity and make it in fact the same quantity as in the Newtonian limit. That's how the ##\gamma## factor comes into the definition for energy and momentum for classical particles:
$$p^{\mu}=m \mathrm{d}_{\tau} x^{\mu}$$
from which
$$p_{\mu} p^{\mu}=m^2 c^2$$
and now there's no problem to make ##m \rightarrow 0## and get ##E=p^0 c=c|\vec{p}|##.

It also becomes clear, why there is no "natural" time measure for massless particles! It's because there is no natural intrinsic scale present. For massive particles this scale is of course its invariant mass.
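
A small numerical sketch of that limit, assuming SI units and an arbitrarily chosen momentum value (the function name `energy` and the sample numbers are illustrative assumptions, not from the post). It solves ##p_{\mu} p^{\mu}=m^2 c^2## for ##E## and shows the ratio ##E/(c|\vec{p}|)## approaching 1 as ##m \rightarrow 0## at fixed momentum.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def energy(m: float, p: float) -> float:
    """E from (E/c)^2 - |p|^2 = m^2 c^2, i.e. E = c * sqrt(m^2 c^2 + |p|^2)."""
    return C * math.hypot(m * C, p)

p = 1.0e-27  # momentum magnitude in kg m/s (assumed sample value)

# Shrink the mass at fixed momentum and watch E approach c|p|.
for m in (1.0e-35, 1.0e-37, 1.0e-39, 0.0):
    ratio = energy(m, p) / (C * p)
    print(f"m = {m:.1e} kg  ->  E / (c|p|) = {ratio:.12f}")
```

Nothing in the formula breaks down at ##m = 0##, which is why the relation, unlike the ##\gamma##-based expressions, can be evaluated directly there.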
 
  • Like
Likes jbergman, Sagittarius A-Star and atyy
  • #36
PeterDonis said:
The best way to get an equation valid for photons as well as massive particles is to start with ##m^2 = E^2 - p^2##, which is obtained by simply taking the norm of the 4-momentum vector. Then you can, as I described in an earlier post, specialize to the cases ##m > 0## and ##m = 0## as needed.
Yes. Without a ##\gamma##:

(1) Four-momentum ## \mathbf P = (E/c , p_x, p_y, p_z) = (\frac {E}{c^2}c , \frac {E}{c^2} v_x , \frac {E}{c^2} v_y , \frac {E}{c^2} v_z) ##

(2) Pseudo-scalar product ## \mathbf P \cdot \mathbf P = \frac {E^2}{c^4} (c^2-v^2) = \frac {E^2}{c^2} (1-\frac{v^2}{c^2})##

(3) Minkowski-norm ## \left \|\mathbf P\right \| \, c = E \sqrt {1-\frac{v^2}{c^2}} =
\begin{cases}
0 & \text{if } v=c \\
E_0 & \text{if } v=0 \\
E_0 & \text{if } 0<v<c \text{ (invariant, although } E \text{ depends on } v\text{)}
\end{cases}
##

from (2), (1) and (3) => Pseudo-scalar product ## \mathbf P \cdot \mathbf P *c^2 = E^2 - p^2c^2 =
\begin{cases}
0 & \text{if } v=c \\
E_0^2 & \text{if } v<c
\end{cases}
##
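
A quick numerical cross-check of steps (1) to (3), as a sketch only. It assumes units with ##c = 1## by default, and the function names `rest_energy_from_momentum` and `rest_energy_from_norm` are made up for this example.

```python
import math

def rest_energy_from_momentum(E: float, v: float, c: float = 1.0) -> float:
    """sqrt(E^2 - p^2 c^2) with p = E v / c^2, the momentum used in step (1)."""
    p = E * v / c**2
    # max() guards against a tiny negative value from floating-point rounding.
    return math.sqrt(max(E**2 - (p * c) ** 2, 0.0))

def rest_energy_from_norm(E: float, v: float, c: float = 1.0) -> float:
    """E * sqrt(1 - v^2/c^2), the Minkowski norm times c from step (3)."""
    return E * math.sqrt(1.0 - (v / c) ** 2)

# Both expressions agree for v < c and both vanish at v = c.
for v in (0.0, 0.6, 0.99, 1.0):
    a = rest_energy_from_momentum(5.0, v)
    b = rest_energy_from_norm(5.0, v)
    print(f"v = {v:4.2f}:  {a:.6f}  {b:.6f}")
```

No ##\gamma## is ever computed, so the case ##v = c## needs no special handling.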
 
  • Like
Likes vanhees71
  • #37
If you now just write ##E_0=m c^2##, everything is crystal clear :-).
 
  • #38
vanhees71 said:
If you now just write ##E_0=m c^2##, everything is crystal clear :-).
##m := \frac {\left \|\mathbf P\right \|}{c}##
 
  • #39
I don't like this notation, because it suggests there is a metric in Minkowski space, but there's none. It's an indefinite non-degenerate fundamental form but not a proper scalar product and thus doesn't induce a metric in the vector space.
 
  • #40
vanhees71 said:
I don't like this notation
##m_0 := \frac {E_0}{c^2}##
 
  • Like
Likes SiennaTheGr8 and vanhees71
  • #41
Sagittarius A-Star said:
Yes. Without a ##\gamma##:

Much simpler, without a ##\gamma##: ##P^\mu = (E, \vec{p})##, so ##P^\mu P_\mu = E^2 - \vec{p} \cdot \vec{p} = E^2 - p^2 = m^2##. In other words, go directly from your first equation in your step (1), to your step (4).
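
To make the invariance concrete, here is a hedged sketch in one spatial dimension with ##c = 1##. The helper names `boost` and `invariant` are assumptions for illustration, and the ##\gamma## that appears belongs to the change of frame, not to the particle, so the same code handles timelike and null 4-momenta.

```python
import math

def boost(E: float, p: float, v: float) -> tuple[float, float]:
    """Components (E', p') after a Lorentz boost along x with speed |v| < 1 (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)  # gamma of the frame change only
    return g * (E - v * p), g * (p - v * E)

def invariant(E: float, p: float) -> float:
    """P^mu P_mu = E^2 - p^2, which equals m^2."""
    return E * E - p * p

# Massive example (m^2 = 16) and null example (m^2 = 0), each seen from a boosted frame.
for E, p in ((5.0, 3.0), (2.0, 2.0)):
    E2, p2 = boost(E, p, 0.5)
    print(f"before: {invariant(E, p):.6f}   after: {invariant(E2, p2):.6f}")
```

The components change under the boost, but ##E^2 - p^2## does not, which is what allows it to be identified with ##m^2## in every frame.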
 
  • Like
Likes Sagittarius A-Star and vanhees71
  • #42
I asked a similar question here a few weeks ago and using 4-momentum to show that ##E^ 2=m^2 + p^2## is a circular argument. Just stating that the 4-momentum, ##P^\mu = (E,\vec{p})## obviously gives the relationship we are seeking, but it doesn't say anything why the zeroth component of 4-momentum is energy.

@vanhees71 gave the best answer on this thread that one must start with the relativistic Lagrangian as derived from Noether's theorem and then derive these relations for massive particles and take the limit as ##m \rightarrow 0##.

Alternatively, there are physical arguments that can be used to justify ##E=mc^2##, like the one given by Feynman in his Lectures on Physics, from which one can work out the rest. I think too many just start from accepting 4-momentum as a given without understanding why it is true.
 
  • Like
Likes vanhees71
  • #43
jbergman said:
Just stating that the 4-momentum, ##P^\mu = (E,\vec{p})## obviously gives the relationship we are seeking, but it doesn't say anything why the zeroth component of 4-momentum is energy.

In analogy with Newtonian mechanics, it is natural to define four-momentum as mass times the four-velocity:
$$P^\mu=mU^\mu$$
Where four-velocity is (also naturally imo) defined as rate of change of position in spacetime with respect to proper time:
$$U^\mu=\frac{dx^\mu}{d\tau}, ~~~ d\tau^2 = - \eta_{\mu\nu}dx^\mu dx^\nu$$
Putting it all together, as a result you can see that ##\mathbf P = (E, \vec p)##. Note, this is valid only for massive particles.
 
  • Like
Likes vanhees71
  • #44
lomidrevo said:
Putting it all together, as a result you can see that ##\mathbf P = (E, \vec p)##. Note, this is valid only for massive particles.

Well, you didn't explain the most important part. How do you see that ##\mathbf P^0 = E##?

In the frame co-moving with the observer you get ##P^0 = m dt/d\tau = m = E##. So you are using the Energy-Mass equivalence to show that ##\mathbf P = (E, \vec p)##. That's why, IMO, ##\mathbf P = (E, \vec p)## is not fundamental. One must first prove ##E=mc^2## or like mentioned before start with the Lagrangian.
 
  • #45
jbergman said:
I asked a similar question here a few weeks ago and using 4-momentum to show that ##E^ 2=m^2 + p^2## is a circular argument. Just stating that the 4-momentum, ##P^\mu = (E,\vec{p})## obviously gives the relationship we are seeking, but it doesn't say anything why the zeroth component of 4-momentum is energy.

An object moves with speed ##v= \sqrt{v_x^2+v_y^2+v_z^2}## in space and with velocity ##c## in ct-direction:
$$\frac{d}{dt}(ct, x, y, z) = (c, \vec{v})$$
Definition of relativistic momentum (experimentally verified):
$$ \mathbf P = (p_t , \vec{p}) = \frac{E}{c^2} (c, \vec{v}) = \frac{E}{c} (1, \vec{\beta})$$
 
  • #46
Sagittarius A-Star said:
An object moves with speed ##v= \sqrt{v_x^2+v_y^2+v_z^2}## in space and with velocity ##c## in ct-direction:
$$\frac{d}{dt}(ct, x, y, z) = (c, \vec{v})$$
Definition of relativistic momentum (experimentally verified):
$$ \mathbf P = (p_t , \vec{p}) = \frac{E}{c^2} (c, \vec{v}) = \frac{E}{c} (1, \vec{\beta})$$
So basically you are stating as an axiom that ##\mathbf P = (E, \vec{p})## and that we believe it because of experimental verification. A more physical way to prove this, IMO, is to first prove ##E=mc^2##. There are physical arguments for why that is true, and then derive the Energy - Momentum relationships.
 
  • #47
jbergman said:
A more physical way to prove this, IMO, is to first prove ##E=mc^2##.
That's not physical. You can omit the redundant words "mass" and "energy" and speak only about ##P## (with ##\ \ \ p_t = \frac{c}{v} * p##). The conserved quantity is:
$$ \mathbf P = p_t (1, \vec{\beta})$$
 
  • Skeptical
Likes jbergman
  • #48
jbergman said:
Well, you didn't explain the most important part. How do you see that ##\mathbf P^0 = E##?
Let's keep SI units for now, it is more illustrative. Based on the above post, you can write the time-like component of the 4-momentum like this:
$$P^0 = \frac{mc^2}{\sqrt {1-v^2/c^2}}$$
You can see this quantity has units of energy. Now, we can further examine it by considering case ##v \ll c##, it can be written as
$$P^0 \approx mc^2 + \frac{1}{2}mv^2$$
That looks like kinetic energy from Newtonian mechanics + some term that does not depend on the velocity: a rest energy? However, this ##P^0## term is not invariant under Lorentz transformation, nor are the components of the "classical" 3-momentum. But if you put those components together and create a 4-vector (as defined above) you will see that its magnitude (inner product with itself) is a scalar, invariant under Lorentz transformation. Voilà, we discovered conservation of energy and conservation of momentum.
It looks like energy, it has the same units, in the Newtonian limit it includes the well-known kinetic energy, it is conserved when properly combined with 3-momentum... Why not call it energy?!
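
For anyone who wants to see how good the low-velocity approximation is, here is a minimal numerical sketch in SI units. The function name `total_energy` and the sample mass and speed are assumptions chosen for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def total_energy(m: float, v: float) -> float:
    """The exact expression m c^2 / sqrt(1 - v^2/c^2) from the post."""
    return m * C**2 / math.sqrt(1.0 - (v / C) ** 2)

m = 1.0      # kg (assumed sample value)
v = 3.0e5    # m/s, roughly c/1000 (assumed sample value)

exact_ke = total_energy(m, v) - m * C**2   # velocity-dependent part of the exact expression
newton_ke = 0.5 * m * v**2                 # Newtonian kinetic energy
rel_diff = (exact_ke - newton_ke) / newton_ke

print(f"exact kinetic part: {exact_ke:.6e} J")
print(f"Newtonian (1/2)mv^2: {newton_ke:.6e} J")
print(f"relative difference: {rel_diff:.3e}")
```

The leading correction to the kinetic term is of relative order ##\frac{3}{4} v^2/c^2##, so at this speed the two agree to better than one part in a million.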
 
  • #49
jbergman said:
One must first prove ##E=mc^2##
No, you don't have to prove ##E=mc^2##. As I showed above, it appears there automatically.
jbergman said:
or like mentioned before start with the Lagrangian.
Yes, going like this you can show that the Hamiltonian is equal to the time-like component of the 4-momentum. But I think the above works fine too, and it is more illustrative, imo.
 
  • #50
lomidrevo said:
Let's keep SI units for now, it is more illustrative. Based on the above post, you can write the time-like component of the 4-momentum like this:
$$P^0 = \frac{mc^2}{\sqrt {1-v^2/c^2}}$$
You can see this quantity has units of energy. Now, we can further examine it by considering case ##v \ll c##, it can be written as
$$P^0 \approx mc^2 + \frac{1}{2}mv^2$$
That looks like kinetic energy from Newtonian mechanics + some term that does not depend on the velocity: a rest energy? However, this ##P^0## term is not invariant under Lorentz transformation, nor are the components of the "classical" 3-momentum. But if you put those components together and create a 4-vector (as defined above) you will see that its magnitude (inner product with itself) is a scalar, invariant under Lorentz transformation. Voilà, we discovered conservation of energy and conservation of momentum.
It looks like energy, it has the same units, in the Newtonian limit it includes the well-known kinetic energy, it is conserved when properly combined with 3-momentum... Why not call it energy?!
I know this proof. But again, for me, it isn't very satisfying to say that something having the correct units and including the Newtonian kinetic energy in the low velocity limit means it is energy. Especially, when we can show that more directly from physics and the postulates of special relativity.
 
  • #51
jbergman said:
Especially, when we can show that more directly from physics and the postulates of special relativity.
Sure, I agree. Using the principle of least action is more universal, nobody stops you from doing that. See my post #50.
But keep in mind that postulates of special relativity are still the same. I didn't use anything extra in my replies.

Before you said:
jbergman said:
but it doesn't say anything why the zeroth component of 4-momentum is energy.
I think I provided you several arguments in posts #44 and #49 that lead to the conclusion "it must be an energy". So I find the words "it doesn't say anything" a bit unfair :smile:
 
  • Like
Likes jbergman
  • #52
jbergman said:
A more physical way to prove this, IMO, is to first prove ##E=mc^2##.
I think you refer to the "relativistic mass" ##m_R##. That is avoided today. Therefore I replaced it by ##E/c^2##, which is the same anyway.
 
  • #53
lomidrevo said:
Sure, I agree. Using the principle of least action is more universal, nobody stops you from doing that. See my post #50.
But keep in mind that postulates of special relativity are still the same. I didn't use anything extra in my replies.

Before you said:

I think I provided you several arguments in posts #44 and #49 that lead to the conclusion "it must be an energy". So I find the words "it doesn't say anything" a bit unfair :smile:
You have a point. I guess each of us has to decide what is a compelling proof. I am only chiming in on this thread because I worked through the same confusion several weeks ago. At the end of the day, the final proof is that it agrees with experiment. But there are other paths there. You provided strong motivation as to why it would be true. I, personally, found Feynman's proof in his lecture notes, based on the conservation of momentum, nice.

It is obviously a complex issue; see Ohanian, who claims that Einstein never successfully proved this.

This site, https://plato.stanford.edu/entries/equivME/, has a great article on the history of this relation and attempts to prove it including disputes as to whether or not various proofs were correct.
 
  • Like
Likes lomidrevo
  • #54
I think the whole point here is how we want to define "energy".
One easy way is just to define ##E=p^0##; then there's nothing more to add. #49 has proved that this definition is consistent with the definition of "energy" used in Newtonian mechanics, so that's good.
So now, one can say that "energy" is defined in another way. And that would be also OK, then we would just need to prove that the two definitions are equivalent.
At the end of the day, if you think that ##E=p^0## can only be proved by "experimental agreement", it is because your notion of energy requires some experiment to be defined.
 
  • Like
Likes jbergman and lomidrevo
  • #55
Demystifier said:
According to the book
https://www.amazon.com/dp/0393337685/?tag=pfamazon01-20
(which is a serious book, not a crackpot one), Einstein made 7 different proofs of ##E=mc^2## and all 7 proofs had a mistake.
It is not a crank book, but most claims in this book remain disputed by other experts.
 
  • Like
Likes jbergman
  • #56
I’m not an expert in physics by any means, but if you take ##E## as fixed, it follows that ##\gamma=\frac{E}{mc^2}## and ##p=\gamma mv=\frac{Ev}{c^2}##. Since ##\frac{v^2}{c^2}=1-\gamma^{-2}=1-\frac{(mc^2)^2}{E^2}##, taking ##m=0## yields ##v=c## and ##p=\frac{E}{c}##.

Whether or not this makes physical sense, I wouldn’t know, but for example the photon has a well-defined energy from Maxwell’s equations, no? So taking the limit as ##m## approaches zero is similar to approximating a photon as a point mass with a fixed energy and a very, very low mass.
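
Here is a small sketch of that algebra, in units with ##c = 1## by default. It only illustrates the formulas above numerically; whether taking this limit counts as a valid derivation for ##m = 0## is a separate question. The function name `speed_and_momentum` is an assumption for this example.

```python
import math

def speed_and_momentum(E: float, m: float, c: float = 1.0) -> tuple[float, float]:
    """Return (v, p) at fixed energy E, using v^2/c^2 = 1 - (m c^2)^2 / E^2 and p = E v / c^2."""
    beta = math.sqrt(1.0 - (m * c**2 / E) ** 2)  # v / c
    v = beta * c
    p = E * v / c**2
    return v, p

E = 1.0  # fixed energy (assumed sample value)

# Lower the mass at fixed energy: v tends to c and p tends to E/c.
for m in (0.5, 0.1, 0.01, 0.0):
    v, p = speed_and_momentum(E, m)
    print(f"m = {m:5.2f}:  v = {v:.8f}   p = {p:.8f}")
```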
 
  • #57
suremarc said:
it follows that ##\gamma=\frac{E}{mc^2}## and ##p=\gamma mv=\frac{Ev}{c^2}##.

But these formulas are only valid for ##m > 0##, so you can't use them to derive anything that applies to ##m = 0##. Or, if the "take the limit as ##m \rightarrow 0##" idea occurs to you, it's still not valid; formulas involving ##\gamma## are only valid for timelike vectors, not null vectors, and there is no continuous way to go from one to the other.

suremarc said:
the photon has a well-defined energy from Maxwell’s equations

Remember that we are talking classical physics, not quantum physics, so there is no such thing as a "photon". When books on classical relativity use the term "photon", what they really mean is "a pulse of light that lasts a very short time, so it can be approximated by a single null worldline in spacetime". In terms of Maxwell's Equations, this would be modeled as a wave packet with a very small "spread" in spacetime, modeled as a single null worldline using the geometric optics approximation. Such a wave packet does have a well-defined energy, yes, but it is frame-dependent.

suremarc said:
taking the limit as ##m## approaches zero is similar to approximating a photon as a point mass with a fixed energy and a very, very low mass.

No, it isn't, because, as I said above, timelike vectors and null vectors are fundamentally different, and you can't just wave your hands and "take a limit" to go from one to the other.
 
  • Like
Likes suremarc
  • #58
Or put another way, a null vector is as much the limit of a ruler as it is of a trajectory; both are invalid, and there is no reason to favor one as an approximation over the other.
 
  • Like
Likes suremarc
  • #59
With regards to limits,
the diagram below from this old post ( https://www.physicsforums.com/threads/massless-photon.900960/post-5842652 ) might be useful

[Attached diagram: 1611209344163.png]

(an extension of this diagram is here:
https://physics.stackexchange.com/a/551250/148184 )
 
  • Like
Likes suremarc
  • #60
I'd say, if you are at this level of sophistication to ask "why do the physical laws look as they do", the best answer, found in connection with Einstein's works, is given by symmetry principles. The most general symmetry principles are related to the space-time models used in the different theories (Galilei-Newton, Einstein-Minkowski, Einstein(-Cartan)). The latter case is a bit more complicated, because GR is after all a gauge theory "gauging" the Lorentz group.

Galilei-Newton and Einstein-Minkowski are pretty much at the same level, i.e., given the space-time structure you derive the most generally valid continuous Lie-symmetry group for these space-time models and write down possible dynamical laws for the natural mathematical objects occurring in these space-time models.

Consider point mechanics (very natural and simple for Newtonian physics but on the edge of being inconsistent in special relativistic physics) and assume a closed system of one particle (i.e., automatically a free particle, because there's nothing which the particle can interact with). Then use the analysis of Noether's theorem within the Lagrangian formulation of the action principle. From the subgroups common to the Galilei and Poincaré groups (homogeneity of space and time, isotropy of space) you end up with the most general form of the Lagrangian (in Cartesian coordinates)
$$L=L(\dot{\vec{x}}^2).$$
The only difference is then in the boosts.

For an infinitesimal Galilei boost,
$$t'=t, \quad \vec{x}'=\vec{x}-\delta v \vec{n} t$$
The "symmetry condition" reads
$$-\vec{n} \frac{\partial L}{\partial \dot{\vec{x}}}+\frac{\mathrm{d}}{\mathrm{d} t} \Omega(\vec{x},t)=0.$$
From this you get
$$-2 \vec{n} \cdot \dot{\vec{x}} L'(\dot{\vec{x}}^2)+\dot{\vec{x}} \cdot \vec{\nabla} \Omega + \partial_t \Omega=0.$$
This is fulfilled for
$$\Omega=m \vec{n} \cdot \vec{x}$$
and
$$L'(\dot{\vec{x}}^2)=\frac{m}{2} \; \rightarrow\;L=\frac{m}{2} \dot{\vec{x}}^2.$$
with ##m=\text{const}##. The conserved quantity then turns out to be
$$m \vec{n} \cdot \vec{X}:=m \vec{n} \cdot (\vec{x}-\vec{v} t).$$
From Noether's theorem for temporal and spatial translation invariance this leads to the "definition" of the energy and momentum
$$E=H=\dot{\vec{x}} \cdot \vec{p}-L=\frac{m}{2} \dot{\vec{x}}^2, \quad \vec{p}=\frac{\partial L}{\partial \dot{\vec{x}}}=m \dot{\vec{x}}.$$
For an infinitesimal Lorentz boost you have
$$t'=t-\delta \vec{v} \cdot \vec{x}/c^2, \quad \vec{x}'=\vec{x}-\delta \vec{v} t.$$
A similar analysis using the condition for that to be a symmetry of the action indeed leads to
$$L=-m c^2 \sqrt{1-\dot{\vec{x}}^2/c^2},$$
and from the Noether symmetry under temporal and spatial translations you get
$$E=H=\frac{m c^2}{\sqrt{1-\dot{\vec{x}}^2/c^2}}, \quad \vec{p}=\frac{\partial L}{\partial \dot{\vec{x}}}=\frac{m \dot{\vec{x}}}{\sqrt{1-\dot{\vec{x}}^2/c^2}}.$$
From this it's easy to see that ##(E/c,\vec{p})## are four-vector components from realizing that
$$E/c=m \frac{\mathrm{d} c t}{\mathrm{d}\tau}, \quad \vec{p}=m \frac{\mathrm{d} \vec{x}}{\mathrm{d} \tau},$$
where
$$\mathrm{d} \tau=\mathrm{d} t \sqrt{1-\dot{\vec{x}}^2/c^2} = \sqrt{\mathrm{d}t^2 - \mathrm{d} \vec{x}^2/c^2}$$
is the proper time of the (massive) particle. So
$$p^{\mu}=(E/c,\vec{p})=m \mathrm{d}_{\tau} x^{\mu}.$$
Since the proper time is a Lorentz scalar and ##x^{\mu}## are Lorentz-vector components, with ##m## being a scalar, also ##p^{\mu}## are Lorentz-vector components.
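
As a sanity check on the last step, the momentum and energy can be recovered numerically from the relativistic Lagrangian. This is only a sketch, restricted to motion along one axis with ##m = c = 1## by default; the function names and the finite-difference step `h` are assumptions for illustration.

```python
import math

def lagrangian(v: float, m: float = 1.0, c: float = 1.0) -> float:
    """L = -m c^2 sqrt(1 - v^2/c^2) for motion along one axis."""
    return -m * c**2 * math.sqrt(1.0 - (v / c) ** 2)

def momentum(v: float, m: float = 1.0, c: float = 1.0, h: float = 1e-6) -> float:
    """p = dL/dv, approximated by a central finite difference."""
    return (lagrangian(v + h, m, c) - lagrangian(v - h, m, c)) / (2.0 * h)

def hamiltonian(v: float, m: float = 1.0, c: float = 1.0) -> float:
    """E = H = v * p - L."""
    return v * momentum(v, m, c) - lagrangian(v, m, c)

v = 0.6
gamma = 1.0 / math.sqrt(1.0 - v * v)

# Compare with the closed forms p = gamma m v and E = gamma m c^2.
print(f"p numeric = {momentum(v):.8f},  gamma*m*v   = {gamma * v:.8f}")
print(f"E numeric = {hamiltonian(v):.8f},  gamma*m*c^2 = {gamma:.8f}")
```

The same three functions, with `lagrangian` swapped for ##\frac{m}{2} v^2##, would reproduce the Newtonian ##p = m v## and ##E = \frac{m}{2} v^2##, so the only difference between the two cases really is the Lagrangian fixed by the boost symmetry.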
 
  • Like
Likes Sagittarius A-Star, PAllen and lomidrevo
