PeterDonis said:
You could ask a different question, namely, why does light, the actual physical phenomenon, travel on null worldlines? Answering that requires going beyond the theory of relativity; it's a question about quantum field theory and why the quantum field describing light is massless (which is the QFT way of saying "travels on null worldlines"). In relativity, the fact that light travels on null worldlines is taken as a given, a property of light that doesn't have any further explanation within that theory.
That's again a tricky "why question". First of all, I don't know what you take as the basis for an "explanation", because any "explanation" must start from something you consider a fundamental law of nature, and we can only figure these out by observations and careful quantitative experiments.
In my opinion, the status of the question, why electromagnetic fields are described by massless vector fields is not clear at all, and one must indeed take it as a fundamental law fitting all known observations so far with astonishing precision. So one could stop here, but it's anyway interesting to follow the question a bit.
I'll try to answer it on the level of special relativity (i.e., leaving general relativity and thus gravity out of this discussion, because then we really would leave safe ground ;-)). The most fundamental theory we have about the world is indeed the special-relativistic space-time model, which describes space and time as a four-dimensional continuum called Minkowski space, together with quantum field theory based on it. Here the most comprehensive model we have is the Standard Model of elementary particles.
So the question can be split into two questions. The first is "why" it is Minkowski space that describes the observed properties of what we call space and time so well. Here the answer is again that, of all space-time models, it describes very many phenomena best (it's known that it must be modified again when taking gravity into account, leading to general relativity). You may argue in a bit more depth by invoking symmetry principles. One can start with the assumption that the principle of inertia holds, i.e., that there is a class of reference frames, the so-called inertial frames, in which a body upon which no forces act always moves with constant velocity with respect to an observer who is at rest in one of these frames. Further, one assumes that any inertial observer, when measuring lengths of objects at rest relative to him, finds that the corresponding geometry is Euclidean, implying that his space is homogeneous and isotropic. Time is also assumed to be homogeneous, i.e., the laws of nature do not depend on where and when an inertial observer observes them. An analysis of the space-time symmetries compatible with these assumptions shows that only two space-time models are left, namely the Galilei-Newton and the Einstein-Minkowski space-time. The main difference is that in Einstein-Minkowski space-time there is a fundamental "limiting speed" ##c##, i.e., no object can move faster than ##c## with respect to any inertial observer, while in Galilei-Newton space-time no such fundamental speed parameter exists.
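To make the outcome of that symmetry analysis concrete: in one spatial dimension, the assumptions above are usually quoted as leading to a one-parameter family of transformations between inertial frames. The following is a sketch of that standard result, not a derivation; the symbol ##\kappa## is my label for the free parameter and does not appear in the argument above.

```latex
% Sketch: boost along x with relative velocity v.
% \kappa is an assumed label for the single constant left undetermined
% by the symmetry assumptions (inertia, homogeneity, isotropy).
\begin{align}
x' &= \gamma_\kappa \, (x - v t), \\
t' &= \gamma_\kappa \, (t - \kappa v x), \\
\gamma_\kappa &= \frac{1}{\sqrt{1 - \kappa v^2}} .
\end{align}
% \kappa = 0:      Galilei transformation (x' = x - v t, t' = t), no limiting speed.
% \kappa = 1/c^2:  Lorentz transformation with limiting speed c.
```

Which value of ##\kappa## nature realizes is then an empirical question, which fits the point that the limiting speed enters independently of any concrete model like electrodynamics.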
You can criticize this pretty complicated approach, however, because the assumptions going into it are pretty strong, but in my opinion it gives an idea, why there may exist space-time models with a fundamental limiting-speed parameter, independent of a concrete physical model like classical electrodynamics, which was the historical starting point for the theorists in the 19th century to think about these issues, with Einstein the one who has given the most convincing argument in terms of a space-time model (Einstein 1905).
Now around 1925 it was discovered that the classical description of matter is inadequate too, and quantum theory was found as a better description. The first attempts to formulate quantum theory within relativistic physics were not very successful, and that's why the non-relativistic theory was developed first (Heisenberg+Born+Jordan 1925, Schrödinger 1926, Dirac 1926). Then of course, after having learned to deal with non-relativistic quantum theory, the relativistic theory was worked out as well. Soon it became clear that it is very hard to find a consistent theory which describes only a single interacting particle. This is understandable nowadays, because we deal with the creation and destruction of particles in the highest-energy accelerators quite routinely.
Now it was also known from non-relativistic quantum theory that for many-body systems, or systems with a non-fixed number of particles, there is an equivalent description of quantum theory known as quantum field theory. It can be derived heuristically by taking the Schrödinger equation, formulating it with Hamilton's principle (analogous to canonical mechanics of point particles) and "quantizing" it, i.e., making the fields operator valued, with the field operators describing creation and annihilation processes of particles. This was the perfect starting point for a relativistic quantum field theory, and one can again use the powerful tool of group theory and the space-time symmetries of the Einstein-Minkowski space-time, with the Poincare group as symmetry group (Wigner 1939). Together with some additional assumptions (locality, microcausality, existence of a state of lowest energy) you are led to the local relativistic quantum field theories, which are very successful (although not yet free of all mathematical obstacles for interacting particles).
In this analysis it turned out that there are two rough classes of fields, belonging to two possibilities to realize Poincare symmetry in the sense of quantum theory: the massive and the massless fields. Quantizing the non-interacting massive fields leads to massive particles with any spin ##s = 0, 1/2, 1, 3/2, \ldots## The corresponding field equations are the Klein-Gordon equation, the Dirac equation, etc. These have a well-defined non-relativistic limit, leading to the non-relativistic quantum theory of the corresponding particles with spin.
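The link between "massless" and "moves at the limiting speed" can be made explicit with the energy-momentum relation. A minimal sketch, assuming the metric signature (+,-,-,-) and a free field; ##v_g## denotes the group velocity of a wave packet, a symbol not used in the text above.

```latex
% Assumptions: signature (+,-,-,-), p^\mu = (E/c, \vec{p}), free (non-interacting) field.
\begin{align}
p_\mu p^\mu &= \frac{E^2}{c^2} - \vec{p}^{\,2} = m^2 c^2
  \quad\Longrightarrow\quad E = c \sqrt{\vec{p}^{\,2} + m^2 c^2}, \\
\left( \frac{1}{c^2} \partial_t^2 - \Delta + \frac{m^2 c^2}{\hbar^2} \right) \phi &= 0
  \qquad \text{(Klein-Gordon equation)}, \\
v_g &= \frac{\partial E}{\partial |\vec{p}|} = \frac{c^2 |\vec{p}|}{E}
  \;\xrightarrow{\; m \to 0 \;}\; c .
\end{align}
```

For ##m>0## one has ##v_g<c##, and for small momenta ##E \approx m c^2 + \vec{p}^{\,2}/(2m)##, which is the non-relativistic limit mentioned above. For ##m=0## no such expansion exists.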
There is, however, also the class of field theories with massless fields. Their quantization is a bit more tricky, and they have no non-relativistic limit. This is also a deep property of the underlying space-time symmetries: while in Minkowski space massless fields (and even classical particles to some extent) make sense, massless particles make no sense in non-relativistic quantum theory, which is due to the different structure of the underlying space-time symmetry groups (the Poincare group in the case of Einstein-Minkowski and the Galilei group in the case of Galilei-Newton space-time).
The Standard Model is built on this general QFT structure, but it took the discovery of more symmetries concerning the whole zoo of particles found since the 1950s. One of the most important discoveries is that of the so-called local gauge theories. The above-mentioned analysis of the Poincare group reveals that massless particles with spin ##s \geq 1## cannot be described simply by fields, but only by equivalence classes of fields. This is known already from classical electrodynamics: when using the four-potential, several four-potentials which differ only by a four-gradient field describe the same physical situation. Taking this symmetry into account and quantizing the theory (which is a puzzling business and an interesting story of its own) leads, among other things, to the fact that a massless vector particle does not have three spin states like a massive vector particle, but only two (represented by, e.g., helicity eigenstates, with the helicity being the projection of the total angular momentum onto the momentum direction of the particle and taking only the two values ##\pm 1##).
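The statement that four-potentials differing by a four-gradient describe the same situation can be checked in one line. A short sketch; ##\chi## is an arbitrary scalar field introduced here purely for illustration.

```latex
% \chi: arbitrary smooth scalar field (illustrative symbol, not from the post).
\begin{align}
A_\mu &\to A'_\mu = A_\mu + \partial_\mu \chi, \\
F'_{\mu\nu} &= \partial_\mu A'_\nu - \partial_\nu A'_\mu
  = F_{\mu\nu} + (\partial_\mu \partial_\nu - \partial_\nu \partial_\mu) \chi
  = F_{\mu\nu} .
\end{align}
```

Since the observable fields ##\vec{E}## and ##\vec{B}## are the components of ##F_{\mu\nu}##, they are unchanged by the transformation.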
The other way around, starting from gauge symmetry, it is most natural to assume that the corresponding vector field is massless. Of course, you also want to introduce charged matter particles to make a model for electromagnetically interacting charged particles and the electromagnetic field. Gauge invariance implies that electric charge must necessarily be conserved and that it is simplest to couple the vector field to a conserved current. For the matter fields, gauge invariance means that the equations of motion are invariant under multiplication of these fields by a space-time dependent phase factor, with the phase (modulo multiplicative constants which represent the coupling strength between the particles and the vector field) being the same function as in the gauge transformation of the vector field. Since multiplication by a phase factor corresponds to symmetry under the Abelian group U(1), this is called an Abelian gauge theory, and it was pretty soon clear that such a theory describes electromagnetism very well, leading to quantum electrodynamics.
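As a sketch of how the phase transformation of the matter field and the shift of the vector field fit together, take a matter field ##\psi## with charge ##q##. The sign conventions and natural units (##\hbar = c = 1##, Heaviside-Lorentz) are my choice here; other texts differ by signs. The symbol ##j^\nu## simply stands for whatever current the vector field couples to.

```latex
% Assumed conventions: \hbar = c = 1, Heaviside-Lorentz units,
% covariant derivative D_\mu = \partial_\mu - i q A_\mu, gauge function \chi(x).
\begin{align}
\psi &\to \psi' = e^{i q \chi(x)} \psi, \qquad
A_\mu \to A'_\mu = A_\mu + \partial_\mu \chi, \\
D'_\mu \psi' &= (\partial_\mu - i q A'_\mu) \, e^{i q \chi} \psi
  = e^{i q \chi} (\partial_\mu - i q A_\mu) \psi = e^{i q \chi} D_\mu \psi, \\
\partial_\mu F^{\mu\nu} &= j^\nu
  \quad\Longrightarrow\quad
  \partial_\nu j^\nu = \partial_\nu \partial_\mu F^{\mu\nu} = 0 .
\end{align}
```

The last line uses only the antisymmetry of ##F^{\mu\nu}##: the field equations are consistent only if the current is conserved, which is the sense in which gauge invariance and charge conservation go together.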
However, as it turns out, you can just as well formulate the theory with massive vector bosons, still keeping the theory gauge symmetric (Stueckelberg model). So gauge invariance alone is not a true "answer" to the question of "why" photons are massless quanta of a vector field.
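For reference, the Stueckelberg construction is usually written with an additional scalar field that absorbs the gauge transformation. A sketch in natural units; the symbol ##\phi## for the Stueckelberg scalar and the normalization are conventional choices, not taken from the text above.

```latex
% Assumed: \hbar = c = 1, signature (+,-,-,-); \phi is the Stueckelberg scalar,
% \chi the gauge function, m the vector-boson mass.
\begin{align}
\mathcal{L} &= -\frac{1}{4} F_{\mu\nu} F^{\mu\nu}
  + \frac{m^2}{2} \left( A_\mu - \frac{1}{m} \partial_\mu \phi \right)
                  \left( A^\mu - \frac{1}{m} \partial^\mu \phi \right), \\
A_\mu &\to A_\mu + \partial_\mu \chi, \qquad \phi \to \phi + m \chi .
\end{align}
% The combination A_\mu - \partial_\mu \phi / m is invariant, so the mass term is too.
```

The mass ##m## is a free parameter of this Lagrangian, so the symmetry by itself does not force ##m = 0##.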
Now the Standard Model rests on an even more general type of gauge symmetry. Already in the 1930s Heisenberg discovered that one can also describe observed symmetries among particles with the help of group theory. In his case he treated the proton and the neutron, which have (almost) the same mass, as one and the same particle, just carrying different values of a quantum number (called isospin). Thus he took proton and neutron as an isospin-1/2 doublet, i.e., the two eigenstates of the isospin-z component (with isospin +1/2 for the proton and isospin -1/2 for the neutron). As long as you consider only the strong interaction in scattering processes, the isospin is (approximately) conserved.
Then Yang and Mills had the brilliant idea to ask what happens if you "gauge" such non-Abelian symmetries (in this case under the isospin group SU(2)). "Gauging" means you assume that a global symmetry becomes local. For a global symmetry you can rotate only with a constant SU(2) transformation in isospin space, but not locally, because the field theory contains derivatives, and the field derivatives do not obey a simple transformation law when the SU(2) transformation is made space-time dependent. It turns out that you can make a global symmetry of this kind a local symmetry by introducing appropriate vector fields, the gauge bosons of this symmetry. Although the original Yang-Mills model was not successful in describing the strong interactions, the gauge models turned out to be the key to building the Standard Model.
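The problem with the derivatives, and how the vector fields cure it, can be sketched as follows. The conventions are assumptions on my part: ##\psi## is an isospin doublet, ##T^a = \sigma^a/2## are the SU(2) generators, ##g## is the coupling constant, and ##\alpha^a(x)## are the space-time dependent transformation parameters.

```latex
% Assumed conventions: \hbar = c = 1, T^a = \sigma^a / 2, A_\mu = A_\mu^a T^a,
% covariant derivative D_\mu = \partial_\mu - i g A_\mu.
\begin{align}
\psi &\to \psi' = U(x) \psi, \qquad U(x) = \exp\!\left[ i g \, \alpha^a(x) T^a \right], \\
\partial_\mu \psi' &= U \partial_\mu \psi + (\partial_\mu U) \psi
  \qquad \text{(the extra term spoils the simple transformation law)}, \\
A_\mu &\to A'_\mu = U A_\mu U^{-1} - \frac{i}{g} (\partial_\mu U) U^{-1}, \\
D'_\mu \psi' &= (\partial_\mu - i g A'_\mu) U \psi = U D_\mu \psi .
\end{align}
```

In the Abelian case ##U = e^{i g \alpha}## the transformation of ##A_\mu## reduces to ##A_\mu \to A_\mu + \partial_\mu \alpha##, i.e., the familiar shift by a four-gradient.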
Here it turned out that, contrary to the Abelian case, it is very difficult to give the gauge bosons a mass without destroying local gauge invariance. Only the famous Higgs mechanism could provide such a thing. So you can still make the non-Abelian gauge bosons massive, i.e., there is no real veto against massive vector bosons based on non-Abelian gauge symmetry.
So the short conclusion of this long-winded attempt to "explain" the masslessness of the photon is: We don't have a better answer than the fact that all empirical observations are, to a very high accuracy, consistent with the assumption of a massless photon, which is described in the Standard Model by a U(1) gauge theory with the gauge boson assumed to be massless.