# Finding the rest energy

snoopies622
I came across a book this afternoon called, "Einstein's Mistakes" by Hans C. Ohanian. If I'm remembering correctly, he said that the first error-free derivation of $E=mc ^2$ did not exist until Max von Laue produced one in 1911, and that it was based on "mathematical properties of the stress-energy tensor".

Is this correct? I know one can arrive at

$$d(KE) = d (\gamma m_0 c^2)$$

with a couple thought experiments and a little calculus, but can one find that the rest energy of an object is $m_0 c^2$ in a similar manner?

## Answers and Replies

netheril96
I don't quite understand what you mean, but the rest energy is just an assumption in relativity. Relativity itself cannot prove that $m_0 c^2$ is a physical entity. It can only be tested by experiments.

yuiop
$$KE = \frac{m_0 c^2}{\sqrt{1-v^2/c^2}}$$

and note that there is a residual energy equal to:

$$KE = {m_0 c^2}$$

when v = 0 ?

To be more rigorous, I guess it would somehow have to be demonstrated that

$$d(KE) = d (\gamma m_0 c^2)$$

implies:

$$(KE) = (\gamma m_0 c^2)$$
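
A sketch of how that step might go, with an integration constant $C$ that is introduced here and does not appear in the posts above: integrating the differential relation fixes the kinetic energy only up to a constant,

$$KE = \gamma m_0 c^2 + C$$

and requiring $KE = 0$ at $v = 0$ (where $\gamma = 1$) gives

$$C = -m_0 c^2, \qquad KE = (\gamma - 1)\, m_0 c^2$$

On this reading, the differential relation by itself does not single out $m_0 c^2$ as a rest energy. It only shows that $KE$ and $\gamma m_0 c^2$ differ by a constant.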

yuiop
Hi, Kev. Check out this short thread..

Yep, I posted too quickly. The relativistic equation for kinetic energy is in fact:

$$KE = \frac{m_0 c^2}{\sqrt{1-v^2/c^2}} - m_0 c^2$$

so KE goes to zero when v=0. Hmm.. Obviously I'll have to give this some more thought.

[EDIT] it is however true that total energy is:

$$TE = \frac{m_0 c^2}{\sqrt{1-v^2/c^2}}$$

and when v goes to zero:

$$TE = {m_0 c^2}$$
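
The two limits above can be checked numerically. The following is a minimal sketch, not part of the original posts; the function names and the 1 kg test mass are hypothetical choices made for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v):
    """Lorentz factor for a speed v given in m/s."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def total_energy(m0, v):
    """TE = gamma * m0 * c^2."""
    return gamma(v) * m0 * C ** 2


def kinetic_energy(m0, v):
    """KE = TE - m0 * c^2, written as (gamma - 1) * m0 * c^2."""
    return (gamma(v) - 1.0) * m0 * C ** 2


m0 = 1.0            # hypothetical 1 kg test mass (assumption)
v_slow = 0.01 * C   # 1% of the speed of light

rest_energy = total_energy(m0, 0.0)       # total energy at rest
ke_at_rest = kinetic_energy(m0, 0.0)      # kinetic energy at rest
# Ratio of relativistic KE to Newtonian (1/2) m v^2; close to 1 at low speed.
ratio = kinetic_energy(m0, v_slow) / (0.5 * m0 * v_slow ** 2)

print(rest_energy, ke_at_rest, ratio)
```

At rest the kinetic energy vanishes while the total energy stays at $m_0 c^2$, and at low speed the relativistic kinetic energy approaches the Newtonian value.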

Staff Emeritus
Gold Member
I came across a book this afternoon called, "Einstein's Mistakes" by Hans C. Ohanian. If I'm remembering correctly, he said that the first error-free derivation of $E=mc ^2$ did not exist until Max von Laue produced one in 1911, and that it was based on "mathematical properties of the stress-energy tensor".

IMO Ohanian is full of baloney.

I don't quite understand what you mean, but the rest energy is just an assumption in relativity. Relativity itself cannot prove that $m_0 c^2$ is a physical entity. It can only be tested by experiments.

No, this is incorrect. There are various axiomatic frameworks that can form the logical basis of SR. All of them are powerful enough to prove $E=mc^2$, provided that you also impose the condition that the theory is consistent with Newtonian physics in the appropriate limit, i.e., it has to obey the correspondence principle.

FAQ: Where does $E=mc^2$ come from?

Einstein found this result in a 1905 paper, titled "Does the inertia of a body depend upon its energy content?" This paper is very short and readable, and is available online. A summary of the argument is as follows. Define a frame of reference A, and let an object O, initially at rest in this frame, emit two flashes of light in opposite directions. Now define another frame of reference B, in motion relative to A along the same axis as the one along which the light was emitted. Then in order to preserve conservation of energy in both frames, we are forced to attribute a different inertial mass to O before and after it emits the light. The interpretation is that mass and energy are equivalent. By giving up a quantity of energy E, the object has reduced its mass by an amount $E/c^2$, where c is the speed of light.

Although Einstein's original derivation happens to involve the speed of light, $E=mc^2$ can be derived without talking about light at all. One can derive the Lorentz transformations using a set of postulates that don't say anything about light (see, e.g., Rindler 1979). The constant c is then interpreted simply as the maximum speed of causality, not necessarily the speed of light. Constructing the mass-energy four-vector of a particle, we find that its norm $E^2-p^2c^2$ is frame-invariant, and can be interpreted as $m^2c^4$, where m is the particle's rest mass. In the case where the particle is at rest, $p=0$, and we recover $E=mc^2$.

A. Einstein, Annalen der Physik. 18 (1905) 639, available online at http://www.fourmilab.ch/etexts/einstein/E_mc2/www/

Rindler, Essential Relativity: Special, General, and Cosmological, 1979, p. 51
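
The frame-invariance of $E^2-p^2c^2$ described in the FAQ can be illustrated numerically. This is only a sketch in one spatial dimension with units where c = 1; the helper names `boost` and `invariant` and the sample mass and speed are assumptions made for the example.

```python
import math


def boost(E, p, u, c=1.0):
    """Energy and momentum as seen from a frame moving at velocity u along x."""
    g = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    E_new = g * (E - u * p)
    p_new = g * (p - u * E / c ** 2)
    return E_new, p_new


def invariant(E, p, c=1.0):
    """The quantity E^2 - p^2 c^2."""
    return E ** 2 - (p * c) ** 2


m, v = 2.0, 0.6  # hypothetical rest mass and speed (c = 1)
g = 1.0 / math.sqrt(1.0 - v ** 2)
E, p = g * m, g * m * v  # energy and momentum in the original frame

# Boost into the particle's own rest frame (u = v).
E_rest, p_rest = boost(E, p, v)

print(invariant(E, p), invariant(E_rest, p_rest), E_rest, p_rest)
```

Both frames give the same value of the invariant, $m^2c^4$, and in the rest frame the momentum vanishes so that the energy alone equals $mc^2$.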

snoopies622
Constructing the mass-energy four-vector of a particle, we find that its norm $E^2-p^2c^2$ is frame-invariant, and can be interpreted as $m^2c^4$, where m is the particle's rest mass. In the case where the particle is at rest, $p=0$, and we recover $E=mc^2$.

This seems a little circular to me since the norm of any four-vector in Minkowski space is frame-invariant.

yuiop
This Wikipedia section (http://en.wikipedia.org/wiki/Special_relativity#Kinetic_energy) suggests that:

$$\Delta(KE) = \gamma_1 m_0 c^2 - \gamma_0 m_0 c^2$$

which, when the body is initially at rest so that $\gamma_0=1$, becomes:

$$\Delta(KE) = \gamma m_0 c^2 - m_0 c^2$$

This strongly hints that the body has a residual energy equal to $m_0 c^2$ when it is at rest. I don't know if that is any help to you.

Staff Emeritus
Gold Member
This seems a little circular to me since the norm of any four-vector in Minkowski space is frame-invariant.

I don't think the reasoning is circular, although it could be expressed better. I'll change the wording to this: "Constructing the mass-energy four-vector of a particle, we find that its frame-invariant norm is $E^2-p^2c^2$,..."

snoopies622
"Constructing the mass-energy four-vector of a particle, we find that its frame-invariant norm is $E^2-p^2c^2$,..."

But doesn't expressing the four-momentum vector in that way (E, p) start off by already assuming that its time component

$$m_{0} c^2 \frac {dt}{d \tau}$$

is the total energy of the particle?

(Edit: Unfortunately I now have to go visit my parents for Easter dinner. I'll be back tonight..)

Staff Emeritus
Gold Member
But doesn't expressing the four-momentum vector in that way (E, p) start off by already assuming that its time component

$$m_{0} c^2 \frac {dt}{d \tau}$$

is the total energy of the particle?

I would run the logic in the following way. We know that conservation of energy and momentum are valid in Newtonian mechanics. Therefore we expect that there will be some kind of similar conservation law(s) in relativity, with the Newtonian version holding in the appropriate limit. Since momentum is a vector, clearly the conserved thing in SR can't just be a scalar; we expect to find some conserved four-vector. It really does have to transform as a four-vector, because otherwise the corresponding conservation law wouldn't be form-invariant when you changed frames of reference.

Now given a particle with rest mass m and velocity four-vector v, there is only one way of satisfying the correspondence principle by putting them together to make something that is consistent with Newtonian E and p conservation in the appropriate limit. Therefore there is only one candidate for the relativistically conserved quantity, which is mv, where m is the rest mass and v is the four-velocity. Either it's conserved or SR isn't a viable theory; otherwise we would have observed macroscopic violations of conservation of energy and momentum in bulk matter, since the typical speeds of electrons are ~1-10% of c.

Given that this thing is indeed conserved, we naturally want to interpret it. The Newtonian limit tells us that we can interpret it as a four-vector made out of the mass-energy and the momentum three-vector.
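
To see what the Newtonian limit looks like in practice, here is a short expansion. It assumes the four-velocity has components $\gamma(c, \mathbf{u})$, where $\mathbf{u}$ is the ordinary three-velocity (a symbol chosen here to avoid clashing with the four-velocity v above):

$$c\,(mv)^0 = \gamma m c^2 = mc^2 + \tfrac{1}{2} m u^2 + \tfrac{3}{8}\frac{m u^4}{c^2} + \ldots$$

$$(mv)^k = \gamma m u^k = m u^k + \tfrac{1}{2}\frac{m u^2}{c^2}\, u^k + \ldots$$

For $u \ll c$ the spatial part reduces to the Newtonian momentum, and the time component reduces to the Newtonian kinetic energy plus the constant $mc^2$.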

Staff Emeritus
Gold Member
You might be interested in this thread, specifically entries 9-11.

Yeah, I'd forgotten that we had that discussion back then. I'm still mystified by what Schutz might have had in mind. Possibly the underlying problem here is that we're trying to apply concepts like "postulate" and "proof" in a physical context, where they can't really mean the same thing they'd mean in a mathematical context. Whatever notions of "postulate" and "proof" Schutz has in mind are probably not ones that I would consider appropriate in a physical context.

Staff Emeritus
Gold Member
Another thought re #12-13: We seem to have at least three different points of view.

(1) Einstein thought his 1905 paper demonstrated $E=mc^2$.
(2) Ohanian thinks there was something wrong with Einstein's proof, but thinks $E=mc^2$ can still be proved rather than introduced as an independent postulate.
(3) Schutz thinks $E=mc^2$ can't be proved, and has to be taken as an independent postulate.

I agree with 1. I'm inclined to discount 3 completely, since Schutz doesn't seem (based on the quotes you've posted) to have formulated a sufficiently complete argument to allow anyone to know what he had in mind.

snoopies622
I suppose a good mathematical challenge would be: prove (or disprove) that the four-momentum vector is the only Minkowski four-vector that adheres to the correspondence principle regarding energy and momentum conservation.

I find Lieber's "argument" about four-momentum conservation, mentioned in entry #28 of that thread, very appealing, but I know it's not a proof.

Staff Emeritus
Gold Member
I suppose a good mathematical challenge would be: prove (or disprove) that the four-momentum vector is the only Minkowski four-vector that adheres to the correspondence principle regarding energy and momentum conservation.

If we're only allowed to use the scalar m and the four-vector $v^i$ as ingredients, then I certainly think it's true that there is no other possibility. There are only certain operations you have available that take tensors and turn them into other tensors: (1) addition of things that have the same units, (2) multiplication (possibly with implied sums over repeated indices), (3) dividing a scalar by a scalar, and (4) arbitrary functions that take dimensionless scalar inputs and give dimensionless scalar outputs.

Starting from m and $v^i$, after one iteration of operations 1-3 above, you get the following new objects:

- 1a: $mv^i$
- 1b: $v^iv_i$
- 1c: $m^2$

But 1b simply equals 1 (in a +--- signature), and 1c, although legal, is clearly pretty pointless, because something with units of kg² is a dead end in terms of all the allowed operations.

You can keep on going this way with more iterations, and I don't think you get anything particularly interesting. You never get a chance to use operation 4, because you can never produce anything that's dimensionless (other than 1). Operation 2 allows you to form all kinds of tensors of arbitrary rank, but the only way to get these back down to rank 1 at the end is to do contractions, so you end up with stuff of the form $v^iv_iv^jv_j\ldots v^k$, which is simply the same as $v^k$.

The above isn't a complete, worked-out, formal proof, but it's enough that I've convinced myself :-)

I can think of a couple of ways of loosening the rules so as to allow alternative expressions to be formed: (i) allow dimensionful constants to occur, i.e., introduce a new scale; (ii) allow higher-order derivatives of position to occur.

Possibility (i) only really allows anything new to happen if the new dimensionful constant has units of mass. Call this new mass scale $m_o$. Then the kinds of things you can do with it are to form expressions such as $m(1+m/m_o)v^i$. The problem with this is that it isn't additive, and we expect conserved quantities to be additive. That is, two particles considered as a single object should not have a different momentum than the same two particles considered as two separate objects.
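
The additivity objection can be made concrete with a toy calculation. The sketch below treats a single velocity component and uses made-up numbers; `candidate_momentum` is a hypothetical name for the alternative expression $m(1+m/m_o)v^i$ discussed above, not anything proposed as real physics.

```python
def standard_momentum(m, v):
    """Ordinary m * v (one component of the velocity)."""
    return m * v


def candidate_momentum(m, v, m_scale):
    """Hypothetical alternative m * (1 + m / m_scale) * v from the post above."""
    return m * (1.0 + m / m_scale) * v


m, v, m_scale = 1.0, 0.1, 10.0  # illustrative values only

# Two identical particles moving together, counted separately...
separate_std = 2 * standard_momentum(m, v)
separate_alt = 2 * candidate_momentum(m, v, m_scale)

# ...versus the same pair treated as one object of mass 2m.
combined_std = standard_momentum(2 * m, v)
combined_alt = candidate_momentum(2 * m, v, m_scale)

print(combined_std - separate_std)  # standard expression is additive
print(combined_alt - separate_alt)  # candidate differs by 2 * m**2 * v / m_scale
```

The standard expression gives the same total either way, while the candidate does not, which is the sense in which it fails to be additive.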

I think possibility (ii) is also unsatisfactory from a physical point of view. It makes initial-value problems not have unique solutions.

snoopies622
Well bc, that all seems very reasonable to me, but it's pretty deep too, so I'll have to think about it some more over the next couple days and absorb it. In the meantime I wonder if we could get a math-heavy or two like Hurkyl or Fredrik to weigh in.

Homework Helper
Gold Member
The key thing that is needed in the derivation/identification of mc² as the rest energy is the assumption that all dynamical laws have Lorentz symmetry. After all, energy and momentum are dynamical quantities which have no meaning in purely kinematical considerations. Once the requirement of Lorentz symmetry is assumed, the most direct path to the relativistic energy and momentum formulas is from the action principle of mechanics. We must construct an action scalar S which is Lorentz invariant. For a free particle with a spacetime trajectory $\gamma = x^{\mu}(\tau)$, the simplest scalar invariant that we can construct is a quantity proportional to the proper time

$$\tau = \int \sqrt{ \eta_{\mu\nu}dx^{\mu}dx^{\nu}} = \int ds$$

so we set

$$S[\gamma] = -m\tau = -m\int ds = -m\int \sqrt{1 - v^2}dt$$

where 'm' is a positive constant characterizing the particle. Thus, the Lagrangian is $L(v) = -m\sqrt{1 - v^2}$ and the energy is

$$E = v(\partial L/ \partial v) - L = \frac{m}{\sqrt{1 - v^2}}$$

Setting v = 0, we see that the rest energy is E = m. By restoring ordinary units for c and comparing the low velocity limit of the relativistic formula with the Newtonian expressions, we see that the 'm' we have introduced here corresponds to the Newtonian mass.
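
The intermediate steps, written out as a sketch in the same units with $c = 1$:

$$\frac{\partial L}{\partial v} = \frac{mv}{\sqrt{1-v^2}}$$

$$E = \frac{mv^2}{\sqrt{1-v^2}} + m\sqrt{1-v^2} = \frac{mv^2 + m(1-v^2)}{\sqrt{1-v^2}} = \frac{m}{\sqrt{1-v^2}}$$

Restoring $c$ and expanding for $v \ll c$:

$$E = \frac{mc^2}{\sqrt{1-v^2/c^2}} \approx mc^2 + \tfrac{1}{2}mv^2$$

The second term matches the Newtonian kinetic energy, which is what identifies 'm' with the Newtonian mass.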

snoopies622
I like it! But I have a question:

Suppose a particle moves with constant velocity from point A to point B. In my frame of reference, the distance it travels is $\Delta x$, the time it takes to make the trip is $\Delta t$, the particle's momentum is p and its kinetic energy is T. In the reference frame of someone who is moving relative to me along the line which includes A and B, all of these values will be different. Why then should we assume that the action will be the same?

Homework Helper
Gold Member
Why then should we assume that the action will be the same?

It is just the assumption of Lorentz invariance, the basic assumption of relativistic mechanics. Changing viewpoint from one inertial observer to another is just a Lorentz transformation.
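
For the free-particle action written earlier, $S = -m\tau$, the frame independence can at least be checked numerically. This sketch uses 1 + 1 dimensions with c = 1; the events, the boost speed, and the function names are assumptions chosen for illustration, and it does not address whether invariance of the action must be postulated in general.

```python
import math


def lorentz_boost(event, u):
    """Coordinates (t, x) of an event in a frame moving at velocity u (c = 1)."""
    t, x = event
    g = 1.0 / math.sqrt(1.0 - u * u)
    return g * (t - u * x), g * (x - u * t)


def free_particle_action(m, event_a, event_b):
    """S = -m * (proper time along the straight worldline from a to b)."""
    dt = event_b[0] - event_a[0]
    dx = event_b[1] - event_a[1]
    return -m * math.sqrt(dt * dt - dx * dx)


m = 1.0                        # hypothetical mass
A, B = (0.0, 0.0), (5.0, 3.0)  # events (t, x); the particle moves at v = 0.6

u = 0.5                        # velocity of the second observer
A2, B2 = lorentz_boost(A, u), lorentz_boost(B, u)

S_first = free_particle_action(m, A, B)
S_second = free_particle_action(m, A2, B2)

# Elapsed time and distance differ between frames, but the action does not.
print(B[0] - A[0], B2[0] - A2[0])
print(B[1] - A[1], B2[1] - A2[1])
print(S_first, S_second)
```

Here $\Delta t$ and $\Delta x$ change under the boost while $-m\sqrt{\Delta t^2 - \Delta x^2}$ stays the same, because this particular action is built from the invariant interval.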

snoopies622
I thought the assumption of Lorentz invariance was that the spacetime distance between two points is the same for all inertial observers, or - to put it another way - that spacetime is described by the Minkowski metric.

Homework Helper
Gold Member
Yes, the Minkowski metric has Lorentz symmetry, but the key point for us is the stronger statement that all laws of nature have Lorentz symmetry. The latter does not follow logically from the former, although they are related. For example, the laws of motion for the electromagnetic field, and the laws of interaction between charges are Lorentz invariant. This means that if we write an action or a Lagrangian for electrodynamics, it will be Lorentz invariant. In fact, the electromagnetic field Lagrangian is proportional to $F_{\mu\nu}F^{\mu\nu}$, which is a Lorentz invariant scalar.

snoopies622
Interesting.. I thought in this case it meant something like, since the principle of least action is true in one inertial frame, it's true in all inertial frames. It doesn't follow from that assumption that the actual value of the action will be the same in all frames.

Edit: I should have added - I'm assuming here that the spacetime coordinates of the two inertial frames are related to each other by a Lorentz transformation.

snoopies622
I've given this some more thought and I'm afraid I still don't see these two things

- the laws of nature are Lorentz invariant

- action is Lorentz invariant

as logically equivalent. I've also never seen the second statement given as a premise of special relativity before. Why is action Lorentz invariant while other scalars like distance, time, and energy are not?

PhilDSP
Einstein's 1905 paper didn't demonstrate that $$E = mc^2$$ did it? It rather demonstrated that $$\delta(E) = \delta(mc^2)$$

Homework Helper
Gold Member
I've given this some more thought and I'm afraid I still don't see these two things

- the laws of nature are Lorentz invariant

- action is Lorentz invariant

as logically equivalent. I've also never seen the second statement given as a premise of special relativity before. Why is action Lorentz invariant while other scalars like distance, time, and energy are not?

'Laws of nature' is a little ambiguous. Maybe it's more clear if we say 'the equations of motion are Lorentz invariant'. The equations of motion in the action formulation result from the extremization of the action.

Staff Emeritus
Gold Member
Einstein's 1905 paper didn't demonstrate that $$E = mc^2$$ did it? It rather demonstrated that $$\delta(E) = \delta(mc^2)$$

Yes, I think that's right: http://www.fourmilab.ch/etexts/einstein/E_mc2/www/ . The final conclusion near the end is: "If a body gives off the energy L in the form of radiation, its mass diminishes by L/c²."

The distinction between E and $\Delta E$ in this context brings up a lot of serious difficulties. In pre-1905 physics, total energies were arbitrary up to an additive constant. But when we measure the inertia of an object, we're measuring m, not $\Delta m$, so it seems as though we're able to determine *absolute* energies in this way. I'm not satisfied with my own understanding of how this is resolved in various theories such as QED and GR.

snoopies622
...'the equations of motion are Lorentz invariant'. The equations of motion in the action formulation result from the extremization of the action.

I agree with you there. I just don't see how the idea that action is extremized in all inertial frames of reference implies that the action quantity is itself Lorentz invariant.

Homework Helper
Gold Member
I agree with you there. I just don't see how the idea that action is extremized in all inertial frames of reference implies that the action quantity is itself Lorentz invariant.

"Action is extremized in all inertial frames of reference" is actually not what we are asserting here, although it is true. To make this clear, I will explain the following in a single inertial reference frame. (Assume 1 + 1 dimensional spacetime for simplicity.)

Consider a system moving in this frame, say a scalar field that fills the whole of space. Then a function φ(x,t) will be a solution to the equations of motion if S has a stationary point at φ. The requirement of Lorentz invariance for the equations of motion is equivalent to requiring that, if we apply an active Lorentz transformation to φ to get a new field φ'(x,t), then φ' will also solve the equations of motion. Let Γ be the space of 'motions'. Lorentz transformations will generate a vector field $v_L$ in this space, which represents infinitesimal Lorentz transformations, i.e. if φ is some motion, then φ + ε$v_L$(φ) is the result of applying a boost with speed ε to φ. If φ is a solution, then the members of the 1-parameter family φ(ε) = φ + ε$v_L$(φ) will all be solutions to the equations of motion (if the equations of motion are Lorentz invariant). An integral curve of the vector field $v_L$ (call it γ(λ)) which passes through φ when its parameter λ is 0 will be tangent to the curve φ(ε) = φ + ε$v_L$(φ) at λ = 0, and the function S'(λ) = S(γ(λ)) must be stationary at each value of λ. A function f(x) can be stationary for each value of x only if it is constant, so S'(λ) is constant and S is Lorentz invariant.

Count Iblis
Let's be clear that Ohanian is not a kook, he has written electromagnetism textbooks that are used in university. I think that in one of these textbooks he makes basically the same point that DX makes here. I think he wrote that the fact that a conserved four momentum exists at all should be derived (e.g. by assuming Lorentz invariance of the Lagrangian and then using Noether's theorem, but he doesn't elaborate, I think he only mentions that it requires a knowledge of field theory) and that simply assuming that there exists a conserved four momentum and then deriving the expression is not a rigorous argument.

Count Iblis
Einstein's 1905 paper didn't demonstrate that $$E = mc^2$$ did it? It rather demonstrated that $$\delta(E) = \delta(mc^2)$$

But it then directly follows that adding an energy of E to an object in any form will increase its mass by E/c^2, because after the radiation is absorbed, the energy can be converted into any other form (such a transformation is irrelevant to the argument). The whole point of Einstein's reasoning is to pinpoint the reason why an object has inertia. He shows that the inertia depends on energy and nothing other than energy. Therefore we don't need to postulate an independent physical quantity for inertia (i.e. mass) anymore. What we call mass is the same as the energy content of an object.

Staff Emeritus
Gold Member
I think he wrote that the fact that a conserved four momentum exists at all should be derived (e.g. by assuming Lorentz invariance of the Lagrangian and then using Noether's theorem, but he doesn't elaborate, I think he only mentions that it requires a knowledge of field theory) and that simply assuming that there exists a conserved four momentum and then deriving the expression is not a rigorous argument.

I guess this is a matter of taste, then, because I see it the other way around. Writing down a Lagrangian is most likely going to boil down to making a guess based on aesthetics, looking for the simplest expression that will make sense physically. IMO the other version is much more rigorous.

Of course we don't have anything in physics that is totally equivalent to the mathematician's concept of a proof -- at least not in the kind of context we're talking about here, where we're trying to extend an old theory to be consistent with newly imposed principles. This is probably why there's so much room for disagreement, with, e.g., Einstein and Ohanian disagreeing on whether Einstein's 1905 derivation is correct.

PhilDSP
Einstein's 1905 paper didn't demonstrate that $$E = mc^2$$ did it? It rather demonstrated that $$\delta(E) = \delta(mc^2)$$

Sorry about the strange looking equation by the way. Thanks to bcrowell's post I see now that the tex Delta argument must be capitalized.

$$\Delta(E) = \Delta(mc^2)$$

snoopies622
Well, thanks to everyone for helping me out here. I'm still trying to understand dx's posts but he and I have started discussing them via visitor messages so this thread won't become burdened with my ignorance of the calculus of variations. I'm also still thinking about bcrowell's post #16 and I like the reasoning behind it.