Difficulty with Einstein's derivation of TAO

disknoir · Dec 4, 2009

Hi,

I'm trying to work my way through Einstein's 1905 paper On the electrodynamics of moving bodies.

I'm having problems with § 3. I follow until he says:

"If we place x'=x-vt, it is clear that a point at rest in the system k must have a system of values x', y, z, independent of time."

I understand this as follows:

Because the point is at rest in system k, its position is independent of time as measured by an observer also at rest in system k. However, the position of this point would not be independent of time when measured by an observer at rest in system K.

Why use x' instead of [tex]\xi[/tex]? Is he using a third reference frame?

I'm confused here. I've searched for explanations of this and in the few I have found the author doesn't seem to understand it either.

JesseM · Dec 4, 2009

Here was my attempt to follow Einstein's 1905 derivation and explain conceptually what was going on with some of the steps:

First Einstein assumes you have two coordinate systems defined by
measurements on a system of rigid measuring rods and clocks, K and k,
with all the spatial axes parallel and with k moving at velocity v along
K's x-axis. K's coordinates are (x,y,z,t) while k's coordinates are (xi,
eta, zeta, tau). Then he defines a new coordinate x'=x-vt, and says "it
is clear that a point at rest in the system k must have a system of
values x', y, z, independent of time". Although he doesn't state it this
way, what he has effectively done is to introduce a *third* coordinate
system Kg, with y,z,t coordinates identical to K, but with x' coordinate
given by x-vt. To make this a little clearer, I'm going to modify his
notation and say that coordinate system Kg uses coordinates x',y',z',t',
with these coordinates related to K's coordinates x,y,z,t by a Galilei
transform:

x'=x-vt
y'=y
z'=z
t'=t

An important thing to notice is that, unlike k and K, Kg doesn't
necessarily correspond to the measurements of any observer using a
system of measuring-rods and clocks; it isn't really an inertial
reference frame at all. So, the postulate that all observers must
measure the speed of light to be c in their own rest frame doesn't apply
to Kg. In fact, since we know light must travel at c in both directions
in K, and Kg is related to K by a Galilei transform, it must be true
that light travels at (c-v) in the +x' direction of Kg, and (c+v) in the
-x' direction of Kg.
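
A quick numerical check of these two speeds, as a sketch rather than anything taken from the paper. The helper names `galilei` and `speed_in_kg` are made up for this example, and the values c = 1 and v = 3/5 are arbitrary. It sends a light ray out from K's origin in each direction and measures how fast its x' coordinate changes.

```python
from fractions import Fraction

def galilei(x, t, v):
    """Map K coordinates (x, t) to Kg coordinates (x', t'): x' = x - v*t, t' = t."""
    return x - v * t, t

def speed_in_kg(direction, c, v, t=Fraction(1)):
    """Coordinate speed in Kg of a light ray that leaves K's origin at t = 0.

    direction is +1 or -1 along K's x-axis.
    """
    x_prime, t_prime = galilei(direction * c * t, t, v)
    return abs(x_prime) / t_prime

c = Fraction(1)       # units with c = 1 (arbitrary choice for the example)
v = Fraction(3, 5)    # illustrative value only

forward = speed_in_kg(+1, c, v)    # ray moving in the +x' direction
backward = speed_in_kg(-1, c, v)   # ray moving in the -x' direction
```

With these values `forward` equals c - v and `backward` equals c + v, which is the asymmetry the rest of the derivation relies on.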

So now a light beam is sent from the origin of k at tau0, reflected by a
mirror at rest in k at tau1, and returned to the origin at tau2. As
Einstein said, any point which is at rest in k must also be at rest in
this new coordinate system which I am calling Kg, so neither the point
of origin nor the mirror are moving in Kg. So if the origin of Kg
coincides with the origin of k, and if we say the mirror is at position
x'=xm' in Kg, then since light travels at (c-v) in the +x' direction of
Kg, the light will take time xm'/(c-v) to reach the mirror in Kg, and
since light travels at (c+v) in the -x' direction of Kg, it will take an
additional time of xm'/(c+v) to return to the origin. Thus, if the light
is emitted at t'=t0' in Kg's frame, it is reflected at t'=t0' +
xm'/(c-v), and returns to the origin at t'=t0' + xm'/(c-v) + xm'/(c+v)

So, if k's coordinate tau is expressed as a function of Kg's coordinates
like tau(x',y',z',t') then we have:

tau0 = tau(0, 0, 0, t0')
tau1 = tau(xm', 0, 0, t0' + xm'/(c-v))
tau2 = tau(0, 0, 0, t0' + xm'/(c-v) + xm'/(c+v))

Now, since k represents the actual measurements of a non-accelerating
observer using his measuring-rods and clocks, we know that light must
travel at c in both directions in this coordinate system, and since the
origin and the mirror are at rest in k, the light must take the same
amount of time to reach the mirror as it takes to be reflected back to
the origin in k. So, this gives 1/2(tau0 + tau2) = tau1, which
substituting in the expressions above gives

1/2[tau(0, 0, 0, t0') + tau(0, 0, 0, t0' + xm'/(c-v) + xm'/(c+v))] =
tau(xm', 0, 0, t0' + xm'/(c-v))

Then he goes from this to the equation 1/2(1/(c-v) + 1/(c+v))*(dtau/dt')
= dtau/dx' + (1/(c-v))*(dtau/dt'), which also confused me for a little
while because I didn't know what calculus rule he was using to go from
the last equation to this one. But then I realized that if you just
ignore the y' and z' coordinates and look at tau(x',t'), then since he
says "if x' is chosen infinitesimally small", you can just assume tau is
a slanted plane in the 3D space with x',t' as the horizontal axes and
tau as the vertical axis. The general equation for a slanted plane in
these coordinates which goes through some point xp', tp', and taup would be:

tau(x',t') = Sx'*(x' - xp') + St'*(t' - tp') + taup

Where Sx' is the slope of the plane along the x' axis and St' is the
slope of the plane along the t' axis. If we say this plane must go
through the three points tau0, tau1 and tau2 earlier, then we can use
tau0's coordinates for xp', tp' and taup, giving:

tau(x',t') = Sx'*x' + St'*(t' - t0') + tau0

So, plugging in tau1 = tau(xm', t0' + xm'/(c-v)) gives

tau1 = Sx'*xm' + St'*xm'/(c-v) + tau0

And plugging in tau2 = tau(0, t0' + xm'/(c-v) + xm'/(c+v)) gives:

tau2 = St'*(xm'/(c-v) + xm'/(c+v)) + tau0

So plugging these into 1/2(tau0 + tau2) = tau1 gives:

1/2(tau0 + St'*(xm'/(c-v) + xm'/(c+v)) + tau0) = Sx'*xm' +
St'*xm'/(c-v) + tau0

With a little algebra, this reduces to:

(1/2)*St'*(1/(c-v) + 1/(c+v)) = Sx' + St'*(1/(c-v))

And since tau(x',t') is just a plane, of course St' = dtau/dt' and Sx' =
dtau/dx', so this gives the equation 1/2(1/(c-v) + 1/(c+v))*(dtau/dt') =
dtau/dx' + (1/(c-v))*(dtau/dt') which Einstein got.
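
The plane argument can be checked numerically. The sketch below is only an illustration: the function names and all sample values are invented for the example. It builds a plane through the emission event (x' = 0, t' = t0', tau = tau0) with slopes Sx' and St', evaluates it at the three light events, and compares the leftover in 1/2(tau0 + tau2) = tau1 with the slope equation.

```python
from fractions import Fraction

def tau_plane(x, t, sx, st, t0, tau0):
    """Plane through (x'=0, t'=t0, tau=tau0) with slope sx along x' and st along t'."""
    return sx * x + st * (t - t0) + tau0

def midpoint_residual(sx, st, c, v, xm, t0, tau0):
    """Return (tau0 + tau2)/2 - tau1 for the emit / reflect / return events."""
    t1 = t0 + xm / (c - v)    # reflection at the mirror, x' = xm
    t2 = t1 + xm / (c + v)    # return to the origin, x' = 0
    tau_0 = tau_plane(0, t0, sx, st, t0, tau0)
    tau_1 = tau_plane(xm, t1, sx, st, t0, tau0)
    tau_2 = tau_plane(0, t2, sx, st, t0, tau0)
    return (tau_0 + tau_2) / 2 - tau_1

def slope_condition(sx, st, c, v):
    """Left side minus right side of the slope equation."""
    return st * (1 / (c - v) + 1 / (c + v)) / 2 - sx - st / (c - v)

c, v = Fraction(1), Fraction(3, 5)
xm, t0, tau0 = Fraction(2), Fraction(7), Fraction(11)

# Arbitrary slopes: the residual is exactly xm times the slope condition.
sx, st = Fraction(5), Fraction(3)
generic_ok = (midpoint_residual(sx, st, c, v, xm, t0, tau0)
              == xm * slope_condition(sx, st, c, v))

# Slopes tied together by Sx' = -St' * v/(c^2 - v^2): the residual vanishes.
sx_light = -st * v / (c**2 - v**2)
light_residual = midpoint_residual(sx_light, st, c, v, xm, t0, tau0)
```

Exact fractions are used so that the comparison is not blurred by floating-point rounding. The offsets t0' and tau0 drop out, which is why only the slopes appear in the final equation.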

He then reduces this to dtau/dx' + dtau/dt'*(v/(c^2 - v^2)) = 0, which
is just algebra. He also says that light moves along the y'-axis and
z'-axis of Kg at velocity squareroot(c^2 - v^2), which isn't too hard to
see--a light beam moving vertically along the zeta-axis of k will also
be moving vertically along the z'-axis of Kg since these coordinate
systems aren't moving wrt one another, but in K the light beam must be
moving diagonally since k is moving at v in K, so if you look at a
triangle with ct as the hypotenuse and vt as the horizontal side, the
vertical side must have length t*squareroot(c^2 - v^2), and since z'=z
and t'=t the light beam must also travel that distance in time t' in
Kg's coordinate system. The same kind of argument shows the velocity is
also squareroot(c^2 - v^2) in the y'-direction.
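
The triangle argument is easy to check with numbers. This is a minimal sketch with made-up values (c = 5, v = 3, t = 2, chosen so the triangle is a 3-4-5 one) and a hypothetical helper `transverse_speed`.

```python
import math

def transverse_speed(c, v):
    """Speed along z' in Kg of a light ray that stays on k's zeta-axis."""
    return math.sqrt(c**2 - v**2)

c, v, t = 5.0, 3.0, 2.0    # illustrative values only

dx = v * t                          # horizontal side: how far k's axis has moved in K
dz = transverse_speed(c, v) * t     # vertical side: distance covered along z = z'
path_in_K = math.hypot(dx, dz)      # hypotenuse: path length of the ray in K
```

Here `path_in_K` comes out equal to c*t, so the ray does travel at c in K while advancing at only squareroot(c^2 - v^2) along z'.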

Since tau(x',y',z',t') is a linear function (i.e. tau(x',y',z',t') = Ax' +
By' + Cz' + Dt'), you can conclude that if dtau/dx' +
dtau/dt'*(v/(c^2 - v^2)) = 0 for a light ray moving along the x'-axis,
tau(x',t') must have the form tau = a(t' - vx'/(c^2 - v^2)) where a is
some function of v (so that dtau/dt' = a and dtau/dx' = -av/(c^2 - v^2),
which means dtau/dx' = (-v/(c^2 - v^2))*dtau/dt').

Next he says that in the k coordinate system the light ray's position as
a function of time would just be xi(tau)=c*tau, so plugging that
expression for tau in gives xi=ac(t' - vx'/(c^2 - v^2)). But in system
Kg, this light ray is moving at velocity (c-v), so its t' coordinate as
a function of x' is t'(x') = x'/(c-v). Plugging this in gives xi =
ac(x'/(c-v) - vx'/(c^2 - v^2)) = a*(c^2/(c^2 - v^2))*x'.

Similarly, if light is going in the eta-direction then eta(tau)=c*tau,
so plugging in tau = a(t' - vx'/(c^2 - v^2)) gives eta=ac(t' - vx'/(c^2
- v^2)). In Kg this ray is moving at squareroot(c^2 - v^2) in the
y'-direction, so t'(y')=y'/squareroot(c^2 - v^2), and plugging this in
gives eta = ac(y'/squareroot(c^2 - v^2) - vx'/(c^2 - v^2)). Since x'=0
for this ray, this reduces to eta=(ac/squareroot(c^2 - v^2))* y'. The
relation between zeta and z' is exactly the same, so we have:

tau = a(t' - vx'/(c^2 - v^2))
xi = a*(c^2/(c^2 - v^2))*x'
eta = (ac/squareroot(c^2 - v^2))* y'
zeta=(ac/squareroot(c^2 - v^2))* z'

Since the relation between Kg coordinates (x',y',z',t') and K
coordinates (x,y,z,t) is just x'=x-vt, y'=y, z'=z, t'=t, we can plug in
and simplify to get:

tau = Phi(v) * Beta * (t - vx/c^2)
xi = Phi(v) * Beta * (x - vt)
eta = Phi(v) * y
zeta = Phi(v) * z

where Beta = c/squareroot(c^2 - v^2) = 1/squareroot(1 - v^2/c^2),
and Phi(v) = ac/squareroot(c^2 - v^2) (since a was an undetermined
function of v in the first place he just writes Phi(v)).
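
The "plug in and simplify" step can be confirmed with exact arithmetic. The sketch below assumes c = 1 and v = 3/5, so that squareroot(c^2 - v^2) = 4/5 is rational, and uses an arbitrary number as a stand-in for the undetermined a. The function names are invented for the example.

```python
from fractions import Fraction

c, v = Fraction(1), Fraction(3, 5)
root = Fraction(4, 5)            # squareroot(c^2 - v^2) for this choice of c and v
assert root**2 == c**2 - v**2
a = Fraction(7, 3)               # arbitrary stand-in for the undetermined a(v)

def k_from_kg(xp, yp, zp, tp):
    """(tau, xi, eta, zeta) from Kg coordinates, using the four relations in a."""
    tau = a * (tp - v * xp / (c**2 - v**2))
    xi = a * c**2 / (c**2 - v**2) * xp
    eta = a * c / root * yp
    zeta = a * c / root * zp
    return tau, xi, eta, zeta

def k_from_K(x, y, z, t):
    """(tau, xi, eta, zeta) from K coordinates, using the Phi(v) * Beta form."""
    beta = c / root
    phi = a * c / root
    return (phi * beta * (t - v * x / c**2),
            phi * beta * (x - v * t),
            phi * y,
            phi * z)

x, y, z, t = Fraction(2), Fraction(3), Fraction(5), Fraction(7)   # arbitrary event in K
via_kg = k_from_kg(x - v * t, y, z, t)   # Galilei step first: x' = x - vt, rest unchanged
direct = k_from_K(x, y, z, t)
```

The two tuples `via_kg` and `direct` agree exactly, whatever value is chosen for a.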

To find Phi(v), he then imagines a coordinate system K' which is moving
at -v relative to k (and unlike Kg, this coordinate system is supposed
to correspond to the measurements made on a system of measuring-rods and
clocks, so it's a valid inertial reference frame). He uses
(x',y',z',t') for the K' coordinate system, but since I've already used
that for Kg, let's call the K'-coordinates (x",y",z",t"). Then the
transform from k-coordinates to K'-coordinates would have to be:

t" = Phi(-v) * Beta(-v) * (tau + v*xi/c^2) = Phi(v)*Phi(-v)*t
x" = Phi(-v) * Beta(-v) * (xi + v*tau) = Phi(v)*Phi(-v)*x
y" = Phi(-v) * eta = Phi(v)*Phi(-v)*y
z" = Phi(-v) * zeta = Phi(v)*Phi(-v)*z
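
The claim that the two transforms compose to a pure rescaling by Phi(v)*Phi(-v) can be checked numerically. In this sketch the values given to Phi(v) and Phi(-v) are arbitrary placeholders, since the derivation has not fixed them yet, and `transform` is a hypothetical helper name.

```python
import math

C = 1.0   # units with c = 1 (arbitrary choice)

def beta(v):
    """Beta = 1/squareroot(1 - v^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - v**2 / C**2)

def transform(event, v, phi):
    """Apply the Phi-scaled transform for relative velocity v to (t, x, y, z)."""
    t, x, y, z = event
    b = beta(v)
    return (phi * b * (t - v * x / C**2), phi * b * (x - v * t), phi * y, phi * z)

v = 0.6
phi_v, phi_minus_v = 2.0, 7.0 / 3.0    # arbitrary placeholders for Phi(v) and Phi(-v)

event_K = (7.0, 2.0, 3.0, 5.0)                    # (t, x, y, z) in K
event_k = transform(event_K, v, phi_v)            # K -> k
event_K2 = transform(event_k, -v, phi_minus_v)    # k -> K'

scaled = tuple(phi_v * phi_minus_v * q for q in event_K)
all_close = all(math.isclose(p, q) for p, q in zip(event_K2, scaled))
```

Every coordinate of `event_K2` matches the corresponding K coordinate multiplied by the product of the two placeholder values.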

If K' is moving at -v in the k-system, and k is moving at +v in the
K-system, then K and K' should really be the same system, so
Phi(v)*Phi(-v) should be 1. Then he argues that if you have a rod of
length l lying along the eta-axis of k and at rest in that system, then
if you transform the coordinates of its ends into the K-system, you find
that its length in the K-system is l/Phi(v), and by symmetry he argues
that the length of a vertical rod moving horizontally can only depend on
the magnitude of the velocity and not the direction, so l/Phi(v) =
l/Phi(-v), which means Phi(v) = Phi(-v). Combining this with
Phi(v)*Phi(-v) = 1, which he obtained earlier, he concludes that
Phi(v)=1, which gives him the Lorentz transform.
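
As a final sanity check on the result, here is a small sketch with Phi(v) = 1. The function name `lorentz` and the sample values are chosen only for illustration. It confirms that light rays moving at c in K also move at c in k, and that transforming with v and then with -v returns the original event.

```python
import math

C = 1.0   # units with c = 1 (arbitrary choice)

def lorentz(t, x, v):
    """The transform above with Phi(v) = 1, restricted to the t and x coordinates."""
    b = 1.0 / math.sqrt(1.0 - v**2 / C**2)
    return b * (t - v * x / C**2), b * (x - v * t)

v = 0.6   # illustrative value only
t = 2.0

tau_fwd, xi_fwd = lorentz(t, C * t, v)       # ray moving in +x: x = ct in K
tau_back, xi_back = lorentz(t, -C * t, v)    # ray moving in -x: x = -ct in K

# Round trip: transform with v, then with -v, starting from the event (t, x) = (2.0, 1.5).
t_round, x_round = lorentz(*lorentz(t, 1.5, v), -v)
```

In both directions xi = ±c*tau, so the (c-v) and (c+v) asymmetry of Kg does not survive into k.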

Also, in the paragraph above that starts "Then he goes from this to the equation 1/2(1/(c-v) + 1/(c+v))*(dtau/dt') = dtau/dx' + (1/(c-v))*(dtau/dt')", I used an argument about what a smooth function looks like when you zoom in on an infinitesimal region. In that step you can also just use the chain rule for partial derivatives, as explained in post #6 of this thread.

Entropia · Dec 11, 2009

Hi there,

I can understand your confusion with Einstein's derivation of TAO. It is a complex concept and can be difficult to grasp at first. Let me try to provide some clarification.

In § 3, Einstein considers a reference frame k that moves at a constant velocity v relative to a "stationary" reference frame K. A point at rest in k has a position that is independent of time as measured by an observer at rest in k. When measured by an observer at rest in K, however, its x coordinate changes over time as x = x' + vt, so the combination x' = x - vt stays constant for that point.

Einstein uses x' rather than ξ because x' is not the coordinate that k's own rods and clocks would assign. K uses the coordinates (x, y, z, t) and k uses (ξ, η, ζ, τ). The set (x', y, z) is an auxiliary one built from K's coordinates, convenient because points at rest in k have fixed values of it. The derivation then works out how ξ, η, ζ and τ depend on x', y, z and t.

I hope this helps clarify things for you. It may also be helpful to read through some other explanations or seek out additional resources to better understand Einstein's derivation. Keep at it, and don't be discouraged if it takes some time to fully grasp the concept. It is a challenging but important concept in understanding the theory of relativity. Best of luck!
