Mathematically, the first two assumptions are expressed through the concept of divergence. If we imagine the electric field with lines of force, as in a high-school physics textbook, divergence basically tells us how the lines are "spreading out". For the lines to spread out, there must be something, intuitively speaking, to "fill the gaps": these things would be particles of charge. But there are no such things in empty space, so we can say that the divergence of the electric field in empty space is identically zero:
div E = 0.
The electric field is a vector field: the force it produces has a strength as well as a direction. The divergence of a vector field in a given coordinate system is computed through partial derivatives of the vector components:
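$$
\operatorname{div}\mathbf{v} = \frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y} + \frac{\partial v_z}{\partial z}
$$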
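To make the "spreading out" picture concrete, here is a minimal numerical sketch that estimates the divergence with central differences. The helper `divergence` and the two sample fields `radial` and `uniform` are illustrative names chosen for this example; they are not part of the derivation.

```python
def divergence(field, point, h=1e-5):
    """Central-difference estimate of div v at `point`.

    `field` maps (x, y, z) to a tuple (vx, vy, vz).
    """
    total = 0.0
    for i in range(3):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        # i-th term of the sum: d(v_i)/d(x_i)
        total += (field(*forward)[i] - field(*backward)[i]) / (2 * h)
    return total


def radial(x, y, z):
    # Lines of force spreading out evenly from the origin: v = (x, y, z)
    return (x, y, z)


def uniform(x, y, z):
    # Parallel lines of force that never spread out
    return (1.0, 0.0, 0.0)


div_radial = divergence(radial, (0.3, -1.2, 2.0))
div_uniform = divergence(uniform, (0.3, -1.2, 2.0))
print(div_radial, div_uniform)
```

The radial field, whose lines spread out everywhere, has a divergence close to 3, while the uniform field has a divergence of zero.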
So far so good. What we said about the electric field also applies, of course, to the magnetic field:
div B = 0.
What about the vorticity? The vorticity of a vector field is also computed through partial derivatives:
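$$
\operatorname{curl}\mathbf{v} =
\begin{bmatrix}
\dfrac{\partial v_z}{\partial y} - \dfrac{\partial v_y}{\partial z} \\[2ex]
\dfrac{\partial v_x}{\partial z} - \dfrac{\partial v_z}{\partial x} \\[2ex]
\dfrac{\partial v_y}{\partial x} - \dfrac{\partial v_x}{\partial y}
\end{bmatrix}
$$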
Unlike the divergence of a vector field, which is a number field (called a scalar field), the vorticity of a vector field is another vector field. Intuitively what it means is that a vortex not only has strength, but it also has an axis pointing in a specific direction.
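As a rough illustration of a vortex with both a strength and an axis, the following sketch estimates the three components of the curl with central differences. The helpers `partial` and `curl` and the sample field `whirlpool` are hypothetical names introduced only for this example.

```python
def partial(field, component, axis, point, h=1e-5):
    """Central-difference estimate of d(v_component)/d(x_axis) at `point`."""
    forward = list(point)
    backward = list(point)
    forward[axis] += h
    backward[axis] -= h
    return (field(*forward)[component] - field(*backward)[component]) / (2 * h)


def curl(field, point):
    """Return the three components of curl v at `point`."""
    X, Y, Z = 0, 1, 2
    return (
        partial(field, Z, Y, point) - partial(field, Y, Z, point),
        partial(field, X, Z, point) - partial(field, Z, X, point),
        partial(field, Y, X, point) - partial(field, X, Y, point),
    )


def whirlpool(x, y, z):
    # Rigid rotation about the z-axis: v = (-y, x, 0)
    return (-y, x, 0.0)


curl_whirlpool = curl(whirlpool, (0.5, 0.25, -1.0))
print(curl_whirlpool)
```

For this field the result is close to (0, 0, 2): the vortex has strength 2 and its axis points along z.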
In this mathematical formalism, the second pair of Maxwell's equations in empty space can be expressed as:
curl B = ∂E/∂t, and
curl E = –∂B/∂t.
To further simplify calculations, we'll assume that the field depends only on one spatial coordinate, say, x. Feynman offers the example of a large (infinite?) charged sheet in the y-z plane that moves in a direction parallel to its surface as a source of this field. The same computation can be performed in the general case, but it is a lot more complicated (and a lot less instructive).
In this case, the first pair of Maxwell's equations tells us that Ex and Bx must be constant functions.
The second pair of Maxwell's equations reduces to the following simple set:
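$$
\begin{aligned}
-\frac{\partial B_z}{\partial x} &= \frac{\partial E_y}{\partial t}, \\
\frac{\partial B_y}{\partial x} &= \frac{\partial E_z}{\partial t}, \\
-\frac{\partial E_z}{\partial x} &= -\frac{\partial B_y}{\partial t}, \\
\frac{\partial E_y}{\partial x} &= -\frac{\partial B_z}{\partial t}.
\end{aligned}
$$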
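For readers who want to see where these come from, here is a sketch of the reduction for the first two equations. Take the y and z components of curl B = ∂E/∂t and drop every derivative with respect to y or z, since we assumed the field depends only on x:

```latex
\begin{aligned}
(\operatorname{curl}\mathbf{B})_y
  &= \frac{\partial B_x}{\partial z} - \frac{\partial B_z}{\partial x}
   = -\frac{\partial B_z}{\partial x}
   = \frac{\partial E_y}{\partial t}, \\
(\operatorname{curl}\mathbf{B})_z
  &= \frac{\partial B_y}{\partial x} - \frac{\partial B_x}{\partial y}
   = \frac{\partial B_y}{\partial x}
   = \frac{\partial E_z}{\partial t}.
\end{aligned}
```

The same steps applied to curl E = –∂B/∂t give the remaining two equations. The x components of both curls vanish identically, which tells us that Ex and Bx do not change in time either.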
Using the first and the fourth equation, for instance, we can find a solution for Bz (or Ey). Differentiating the first with respect to x and the fourth with respect to t, we get:
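$$
-\frac{\partial^2 B_z}{\partial x^2} = \frac{\partial^2 E_y}{\partial x\,\partial t},
$$
and
$$
-\frac{\partial^2 B_z}{\partial t^2} = \frac{\partial^2 E_y}{\partial x\,\partial t},
$$
or
$$
\frac{\partial^2 B_z}{\partial t^2} - \frac{\partial^2 B_z}{\partial x^2} = 0.
$$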
This can be rewritten as:
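$$
\left(\frac{\partial}{\partial t} - \frac{\partial}{\partial x}\right)
\left(\frac{\partial}{\partial t} + \frac{\partial}{\partial x}\right) B_z = 0.
$$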
Solutions to this equation can be found in the form:
Bz = f1(t – x) + f2(t + x),
where f1 and f2 are arbitrary functions. The same solution exists for By, Ey, and Ez.
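A quick numerical sanity check of this claim can be sketched as follows. The two profiles `f1` and `f2` are arbitrary choices made for illustration (a Gaussian pulse and a sine); any other smooth functions should behave the same way. The second derivatives are estimated with central differences, so the residual is only expected to be small, not exactly zero.

```python
import math


def f1(s):
    # Arbitrary smooth profile travelling towards +x (a Gaussian pulse)
    return math.exp(-s * s)


def f2(s):
    # Arbitrary smooth profile travelling towards -x
    return math.sin(s)


def Bz(t, x):
    return f1(t - x) + f2(t + x)


def second_derivative(g, s, h=1e-3):
    """Central-difference estimate of g''(s)."""
    return (g(s + h) - 2.0 * g(s) + g(s - h)) / (h * h)


def wave_residual(t, x):
    """d2Bz/dt2 - d2Bz/dx2 at (t, x); should vanish for a solution."""
    d2_dt2 = second_derivative(lambda tau: Bz(tau, x), t)
    d2_dx2 = second_derivative(lambda xi: Bz(t, xi), x)
    return d2_dt2 - d2_dx2


residuals = [wave_residual(t, x) for t in (0.0, 0.7, 1.5) for x in (-1.0, 0.2, 2.0)]
max_residual = max(abs(r) for r in residuals)
print(max_residual)
```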
If we set f2 = 0, then
Bz = f1(t – x),
which is a legitimate solution to Maxwell's equations. What this means is that if the field has a certain value at t = 0, x = 0, then it'll have the same value at t = t0, x = t0. Similarly, if we set f1 = 0, a field that has a certain value at t = 0, x = 0 will have the same value at t = t0, x = –t0. Thus we can say that the electromagnetic field represented by this solution is moving at unit velocity along the x-axis in either of two directions.
So what's this unit velocity business? Though it was perhaps not evident, in the derivation so far we made no attempt to use units of measure that are commonly used in engineering. This is quite legitimate, since different units of measure would only introduce constant multipliers that leave the structure of the equations unchanged. Had we used SI units throughout, we'd have found that the final result looks only slightly different:
Bz = f1(ct – x).
Our choice of units (or no units, as the case might be) simply meant that we chose to have the constant c = 1. Using another set of units, e.g., SI units, we might find that c is equal to something else, such as 299,792.458 km/s.
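As a small aside on where the SI number comes from: in SI units the constant multipliers that appear in Maxwell's equations are the vacuum permeability and permittivity, and c = 1/√(μ0ε0). The sketch below assumes the CODATA 2018 recommended values for the two constants; these numbers are not taken from the derivation above.

```python
import math

# CODATA 2018 recommended values in SI units (assumed here, not derived above)
MU_0 = 1.25663706212e-6       # vacuum magnetic permeability, N/A^2
EPSILON_0 = 8.8541878128e-12  # vacuum electric permittivity, F/m

c = 1.0 / math.sqrt(MU_0 * EPSILON_0)  # metres per second
print(c / 1000.0)                      # kilometres per second
```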
What is important to realize is that regardless of what units we choose, the observed speed will be the same for all observers. The same regardless of where they are. Regardless of when they make their measurements. Regardless of how fast they themselves are moving, and in which direction they are facing. Whether you move towards a light source or away from it, the speed appears the same.
This of course makes no sense in ordinary Euclidean spacetime: when you are running ahead of a moving train, it'll appear slower (i.e., take longer to hit you) than when you're running towards it.
Special relativity is simply the most economical way to solve this dilemma. The idea is to find the simplest geometry in which all our initial assumptions can be simultaneously true.
Why geometry? If you think about it, when you switch from a stationary coordinate system to a moving one (i.e., from a coordinate system fixed to the clock of a railway station to one that is fixed to the main axis of your steam engine) it's really just a simple coordinate transformation: t' = t, x' = x – vt. And herein lies the problem: after this coordinate transformation, in the new coordinate system a ray of light no longer satisfies the conditions that we derived previously. If, in the old coordinate system, an electromagnetic field had the same value at t = 0, x = 0 and t = t0, x = t0, in the new coordinate system, it'll have the same values at t' = 0, x' = 0 and t' = t0, x' = t0 – vt0, and this contradicts what we just learned about Maxwell's equations as x' won't be equal to t'.
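The contradiction is easy to reproduce with numbers. This is a minimal sketch; the values of `v` and `t0` are arbitrary illustrative choices in units where c = 1, and `galilean` is a name made up for this example.

```python
def galilean(t, x, v):
    """Switch to a frame moving at velocity v: t' = t, x' = x - v*t."""
    return t, x - v * t


v = 0.25   # train velocity in units where c = 1 (illustrative value)
t0 = 2.0   # time elapsed on the station clock (illustrative value)

# A ray of light through the origin reaches x = t0 at t = t0
t_prime, x_prime = galilean(t0, t0, v)
apparent_speed = x_prime / t_prime
print(apparent_speed)
```

In the moving system the ray appears to travel at 1 – v rather than at unit velocity, which is exactly what Maxwell's equations do not allow.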
The simple geometry of special relativity, Minkowski spacetime, is built around the assumption that the quantity dt² – dx² – dy² – dz² remains unchanged under a "boost", i.e., when you change from one moving coordinate system to another. In our simple scenario with only one spatial coordinate, this reduces to dt² – dx² remaining unchanged when you switch from a stationary to a moving system. For rays of light moving in either direction, dt² – dx² remains 0 regardless of whether you measure it from a moving or a stationary system, which is precisely what we want in order to remain consistent with Maxwell's equations.
This assumption leads to a new form of coordinate transformation, the Lorentz transformation. To see why, compare the values for the station and the train in the diagram above. For the station, dt = t0, dx = x0 = vt0 (this, after all, is how we define the train's velocity v) and therefore dt² – dx² is t0² – v²t0². For the train, dx' = 0 and thus dt'² – dx'² is t0'². We want the values for the station and the train to be equal:
$$
\begin{aligned}
{t'_0}^2 &= t_0^2 - v^2 t_0^2, \\
{t'_0}^2 &= t_0^2\,(1 - v^2), \\
t'_0 &= t_0\sqrt{1 - v^2}.
\end{aligned}
$$
And this, of course, is the time dilation formula at the core of the fabled Lorentz transformation.
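To check the result numerically, the sketch below uses the standard textbook form of the full Lorentz transformation in one spatial dimension, t' = γ(t – vx), x' = γ(x – vt) with γ = 1/√(1 – v²). The full formula is not derived in the text above, so treat it as an assumption of this example; the values of `v` and `t0` are arbitrary illustrative choices.

```python
import math


def lorentz(t, x, v):
    """Boost to a frame moving at velocity v (|v| < 1, units where c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)


def interval(dt, dx):
    """The quantity dt^2 - dx^2 that a boost should leave unchanged."""
    return dt * dt - dx * dx


v = 0.6    # train velocity (illustrative value)
t0 = 5.0   # time elapsed on the station clock (illustrative value)

# The train's own position after t0, seen from the station: x = v * t0
t_train, x_train = lorentz(t0, v * t0, v)

# A ray of light reaching x = t0 at t = t0
t_light, x_light = lorentz(t0, t0, v)

print(t_train, x_train)                                   # train time, train position in its own system
print(interval(t0, v * t0), interval(t_train, x_train))  # same value in both systems
print(x_light / t_light)                                  # speed of the ray as seen from the train
```

The train stays at x' = 0 in its own system, its elapsed time agrees with t0√(1 – v²), and the ray of light still moves at unit velocity.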
Any other approach would either have to use a more complicated geometry (the late 19th century concept of "ether" can be viewed as an attempt to do just this) or it would require giving up at least some of our initial assumptions. And what's wrong with that, you ask? Well, those assumptions are supported by an enormous number of physical observations, not the least of which is the observation that this computer in front of me is functioning as expected, even though it is moving about at a not altogether inconsiderable velocity as the Earth spins, moves around the Sun and, along with the Sun, moves about in the Universe...