# Why demand gauge invariance?

Summary: Local gauge invariance is invariance under a phase change that may vary from point to point in spacetime. Replies in this thread motivate it in several ways: the phase of the wavefunction is unobservable, so there is no reason to require it to be constant; comparing quantities at different spacetime points requires a connection, which is what is meant by "the spirit of relativity"; and the vector potential carries more components than a massless spin-1 particle has polarization states. One reply argues that gauge theories may eventually be seen as a convenient fiction.

#### Bobhawke

I apologise if this question has been asked before, but I couldn't find it, so:

Is there some deeper reason for demanding gauge invariance other than that it allows us to include interactions between the gauge field and the fermions?

I have seen people claim that it is "in keeping with the spirit of relativity" but I wasn't entirely sure what was meant by that.

thanks

You are asking about LOCAL gauge invariance.

The phase of a field is arbitrary; it is not an observable. Each observer may choose his/her own phase, and the Lagrangian is invariant under such a phase transformation:
$$\phi \rightarrow e^{i\Theta}\phi$$

From this GLOBAL phase transformation you can deduce, with Noether's theorem, that the electric charge is conserved.
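As a concrete illustration of the Noether step, take a free complex scalar field. This is an assumed example; the post does not specify a Lagrangian:
$$\mathcal{L} = \partial_\mu\phi^*\,\partial^\mu\phi - m^2\phi^*\phi$$
It is unchanged by $\phi \rightarrow e^{i\Theta}\phi$ with constant $\Theta$. For infinitesimal $\Theta$ one has $\delta\phi = i\Theta\phi$, and Noether's theorem gives, up to overall sign and normalization,
$$j^\mu = i\left(\phi\,\partial^\mu\phi^* - \phi^*\,\partial^\mu\phi\right), \qquad \partial_\mu j^\mu = i\left(\phi\,\Box\phi^* - \phi^*\,\Box\phi\right) = 0$$
where the last equality uses the equation of motion $(\Box + m^2)\phi = 0$ and its complex conjugate. The charge $Q = \int d^3x\, j^0$ is therefore independent of time.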

Now require that this phase may depend on space-time: an observer choosing phase $\Theta_1$ at space-time point $X_1$ should not be able to fix that particular phase outside his/her light cone.

That is true also, but historically one first only used the causality condition to impose local gauge invariance - at least as far as I know.

Let me give you a slightly different take on Glenn's position (and I think there is a recent set of CTEQ lectures that comes at it this way.)

In QM, we only observe $\psi \psi^*$, never $\psi$ itself. I can make the transformation $\psi \rightarrow \psi e^{i\Theta}$ with no observable effects, because $\psi^*$ transforms as $\psi^* \rightarrow \psi^* e^{-i\Theta}$.

Now, if this phase $\Theta$ is unobservable, there should be no reason to insist that it's constant in space and time. So instead of $\Theta$, I really should have written down $\Theta(x,y,z,t)$. This is what is meant by Local Gauge Invariance, and as you can see, it's really a phase invariance. Because the equations of motion have derivatives in them, keeping $\Theta$ unobservable is a powerful constraint on the kinds of theories one can write down.
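To make the constraint concrete, here is a sketch for a single charged field, assuming the sign convention $D_\mu = \partial_\mu - iqA_\mu$ (conventions differ between textbooks, and $q$ is an assumed coupling constant not named in the post). Under $\psi \rightarrow e^{i\Theta(x)}\psi$ the ordinary derivative picks up an extra term:
$$\partial_\mu\left(e^{i\Theta}\psi\right) = e^{i\Theta}\left(\partial_\mu\psi + i(\partial_\mu\Theta)\psi\right)$$
so $\psi^*\partial_\mu\psi$ is not invariant. If a field $A_\mu$ is introduced that transforms as $A_\mu \rightarrow A_\mu + \frac{1}{q}\partial_\mu\Theta$, then
$$D_\mu\psi \rightarrow \left(\partial_\mu - iqA_\mu - i\partial_\mu\Theta\right)e^{i\Theta}\psi = e^{i\Theta}D_\mu\psi$$
and bilinears such as $\psi^* D_\mu\psi$ no longer depend on $\Theta$.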

Note that this is not, strictly speaking, necessary. I could have instead imposed by fiat that $\Theta$ is constant throughout time and space. That would be a different theory, and it would certainly open up a can of worms as to why something unobservable would be constant, but there's nothing intrinsically wrong about it.

The textbook (and historical) argument is essentially this: general relativity taught us that it is meaningless to compare directions at different spacetime points until we specify a connection, or a particular metric. Allowing the metric to play a dynamical role, we find a relativistic theory of gravity. Historically, the first attempt at a gauge theory was to require a connection for comparing sizes, thus the name "gauge". Alas, this was not correct, as it would predict changes in spectral lines of the elements in stars, etc. However, with the advent of QM, it was realized that the complex phase of the wavefunction is an excellent choice, and indeed makes Maxwellian EM simply drop out of the formalism. Yang and Mills then extended this idea to other symmetries. So if you wanted a neat soundbite, it might be something like "because of relativity, there is no reason to assume that anything can be compared without specifying a connection (i.e. a gauge field)". This is what people mean when they say it's in keeping with the spirit of relativity.

However, in the interest of causing some good-natured ruckus, I'd like to offer another point of view. I think that eventually, we will come to see gauge theories as simply a convenient fiction. The reason is this: physics is about connecting observables, and any spilt ink on things which are by definition unobservable is simply not physics. However, this must be tempered by the realisation that reality is complicated beyond our ability to deal with it; therefore, it is often helpful to introduce odd or unnatural constructions to mould the problem into one that is tractable.

One such general class of issues is that of constructing a decent Hilbert space for our problem. Something nice and clean and inductively defined, like Fock space, has very neat properties and allows us to get quite far in analysis. Non-interacting fields of bosons or even fermions can be handled this way, in a variety of algebraic and geometrical means. Gauge fields however are naturally defined as what's called moduli space; essentially, it means that you take a simple field like a vector field, and "mod out" or "quotient out" certain subspaces, by saying that some of them are actually the same thing. Mathematically, you take your vector field, and say that any configuration joined by a gauge transform is actually the same physical state. The resulting Hilbert space is unfortunately quite messy, and no one in their right mind would go about studying this if it wasn't for the fact that nature seems to be quite fond of it.

Notice that in this more general perspective of trying to construct the Hilbert space of some theory, gauge fields are really only one possibility; others which have been under serious investigation are anyonic systems and lattice systems with infinite on-site repulsion (you can only put one fermion on each site, so there are 3 instead of 4 states per site). So one question might be "why gauge fields?"

My personal belief is this: gauge fields are constrained to have massless particles, which make them suitable for modelling the low energy effective theory. Particles with mass tend to pick up more mass as one renormalises downwards, eventually pushing them out of the low energy sector, or causes an inconsistency ("naturalness problem"). Indeed, both anyonic systems and the repulsive lattice systems I've mentioned above have descriptions in terms of gauge fields coupled to matter (often fractionalised and strongly coupled). I think as we explore more of the theory space, we will come to understand gauge theories as one peculiar little corner of relative analytic simplicity, and so very useful as an ingredient of physical theories.

The phase can in some cases be an observable, e.g. a Berry phase.

This shows that there really is something to the geometry of gauge fields after all; they aren't simply pure mathematical constructs with limited physical significance.

In addition to all the good answers listed above, there is the "redundancy of information" problem:

We know from elementary quantum mechanics that a MASSLESS, spin-1 boson has TWO polarization states. However, the vector potential has FOUR polarization states (it has a vector index). Therefore there are two (2) too many degrees of freedom in the field to describe the physical particle. Gauge invariance is precisely the statement that you can always turn two of these (unphysical) polarizations off, which is exactly what you need.
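A sketch of this counting for a free plane wave, using Coulomb gauge as an assumed choice (the post does not pick a gauge). The potential $A^\mu$ has 4 components. No time derivative of $A^0$ appears in the Maxwell Lagrangian, so $A^0$ is fixed by Gauss's law rather than propagating, and with no charges present it can be set to zero. The remaining gauge freedom is used to impose $\nabla\cdot\mathbf{A} = 0$. For $\mathbf{A} \propto \boldsymbol{\epsilon}\, e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)}$ this reads
$$\mathbf{k}\cdot\boldsymbol{\epsilon} = 0$$
so with $\mathbf{k}$ along $z$ only $\epsilon_x$ and $\epsilon_y$ survive: $4 - 1 - 1 = 2$ physical polarizations.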

Nevertheless, SR always singles out a "preferred" gauge: the one where the vector potentials transform like vectors. Other gauges get quite nasty in this respect, with instantaneous Coulomb forces and so on.

There is no "physical" longitudinal polarization state for massless particles because the effects of the longitudinal polarization $A^z$ and the temporal polarization $A^0$ always cancel each other: $-\partial_z A^0-\partial_0 A^z=0$, i.e. the longitudinal electric field $E_z$ vanishes.

Regards, Hans.

First, just a point of terminology: I thought that when someone talks about a gauge invariance, that means a local symmetry, so saying "local gauge invariance" would be redundant. Is this correct?

Malawi: But is there any deep reason why an observer should not be able to choose the same phase everywhere?

Genneth: I am sorry but I am not sure I fully understood everything you said in the first paragraph. It seems that you are saying that because of relativity we cannot assume that we can compare quantities at different spacetime points without specifying a connection. But if we simply choose the same gauge everywhere, we know there is no need for a connection. So then we come back to the original question: why can't we just choose the same gauge everywhere (other than the fact that it gives us interactions and other nice stuff)? Again I am sorry if I have just entirely misunderstood your post, please be patient with me.

Atyy: I went through those notes you linked. I have quite a lot to say about this, it seems to be coming from a different perspective.

Basically 't Hooft says that if the gauge field were a total derivative of some function, then the magnitude of the field itself could be arbitrarily large, but we would get no contribution to the energy because $$F_{\mu\nu}=0$$. He says that it is unacceptable to have an arbitrarily strong field around that makes no contribution to the energy.

He goes on to say that there is only one way to cure this problem, and that is to make sure the replacement $$A_{\mu} \rightarrow A_{\mu} + \partial_{\mu}\Lambda$$ does not affect the physics.
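Both statements follow in one line from the standard definition $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ (written here for the abelian case, which is an assumption about which case the notes have in mind). For a pure total derivative $A_\mu = \partial_\mu\Lambda$,
$$F_{\mu\nu} = \partial_\mu\partial_\nu\Lambda - \partial_\nu\partial_\mu\Lambda = 0$$
because partial derivatives commute, however large $\Lambda$ is. By the same cancellation, the replacement $A_\mu \rightarrow A_\mu + \partial_\mu\Lambda$ gives
$$F_{\mu\nu} \rightarrow F_{\mu\nu} + \left(\partial_\mu\partial_\nu - \partial_\nu\partial_\mu\right)\Lambda = F_{\mu\nu}$$
so anything built only from $F_{\mu\nu}$, such as the Maxwell energy density, is unaffected.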

Let's suppose for a moment that we didn't enforce this condition. Then we could write a mass term for the gauge fields, but as a side effect we would get new fields $$\Lambda$$ which don't correspond to anything in nature, and we would also have large fields flying around that make no energy contribution, so that's no good.

But if we enforce this, we cannot write a mass term for the gauge field, which is needed for the W and Z, because it is not invariant. We therefore have to introduce the mass term via the Higgs field. However, the gauge fields acquire their mass through the covariant derivative in the Higgs sector, i.e. we must replace the normal derivative by the covariant derivative in at least the Higgs sector to get a mass term for the gauge fields.
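A sketch of both halves of this argument in a toy abelian model (an assumed simplification; the actual W and Z masses come from the non-abelian electroweak version). A bare mass term changes under $A_\mu \rightarrow A_\mu + \partial_\mu\Lambda$:
$$\tfrac{1}{2}m^2 A_\mu A^\mu \rightarrow \tfrac{1}{2}m^2 A_\mu A^\mu + m^2 A^\mu\partial_\mu\Lambda + \tfrac{1}{2}m^2\,\partial_\mu\Lambda\,\partial^\mu\Lambda$$
so it is not invariant. With a charged scalar $\phi$ and the assumed convention $D_\mu = \partial_\mu - iqA_\mu$, the kinetic term $|D_\mu\phi|^2$ is invariant, and when $\phi$ takes a constant value $v/\sqrt{2}$ it becomes
$$\left|D_\mu\phi\right|^2 = \tfrac{1}{2}q^2v^2 A_\mu A^\mu$$
which has the form of a mass term with $m_A = qv$.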

We can also replace the normal derivative in the fermion sector with covariant derivatives, and this of course leads to interactions.
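For a single Dirac fermion of charge $q$, and again assuming the convention $D_\mu = \partial_\mu - iqA_\mu$ (signs vary between textbooks), the replacement produces the interaction term explicitly:
$$\bar\psi\, i\gamma^\mu D_\mu \psi = \bar\psi\, i\gamma^\mu\partial_\mu\psi + q\,\bar\psi\gamma^\mu\psi\, A_\mu$$
The first term is the free kinetic term; the second couples the current $\bar\psi\gamma^\mu\psi$ to the gauge field.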

But of course replacement of normal derivatives by covariant derivatives wasn't required by the condition 't Hooft specified initially, which was that the physics should be invariant under $$A_{\mu} \rightarrow A_{\mu} + \partial_{\mu}\Lambda$$. It is conceivable that the bosonic sector could have this local symmetry in order to avoid the large fields that make no energy contribution, but the Higgs and fermion sectors don't have this symmetry; rather they have the global version. Of course there would be plenty of problems in such a theory, like no gauge field mass, and no interactions, but I think we are again right back at the start of my question, but maybe slightly modified. That is:

I can accept we should have invariance under $$A_{\mu} \rightarrow A_{\mu} + \partial_{\mu}\Lambda$$ to avoid the large but zero-energy fields 't Hooft talks about, but there is still no deep reason to extend this symmetry to the rest of the standard model by replacing derivatives with covariant derivatives other than that it produces some nice effects. But I already knew about the nice effects - my question was really getting at the deeper reason for local symmetry, which I still don't think 't Hooft addresses.

I must admit that I have never heard of GLOBAL gauge invariance, but LOCAL gauge invariance I've heard of frequently (e.g. in the book Gauge Field Theories by Frampton).

We should require causality, that one cannot interfere outside the light cone.

Another, obvious, reason is: WHY should my gauge be the same as the gauge chosen by an observer on a planet in the Andromeda galaxy? The space-time dependence of the phase is quite intuitive I think -> Physics should not depend on whether I choose phase_1 and you choose phase_2.
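The statement that physics should not depend on each observer's choice of phase can be checked numerically in a one-dimensional toy setting. The sketch below is only an illustration under stated assumptions: the sample functions `psi`, `gauge_field` and `theta`, the coupling `Q`, and the sign convention D = d/dx - i Q A are all hypothetical choices, and derivatives are approximated by central finite differences.

```python
import cmath
import math

Q = 1.0    # hypothetical coupling constant (assumption)
H = 1e-5   # step size for central finite differences


def psi(x):
    """Sample complex field in one dimension (arbitrary choice)."""
    return math.exp(-x * x) * cmath.exp(2j * x)


def gauge_field(x):
    """Sample gauge field A(x) (arbitrary choice)."""
    return math.sin(x)


def theta(x):
    """Position-dependent phase chosen by a second observer."""
    return 0.7 * x ** 3 + math.cos(x)


def derivative(f, x):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + H) - f(x - H)) / (2 * H)


def covariant_derivative(field, a, x):
    """D field = (d/dx - i Q A) field, in the assumed sign convention."""
    return derivative(field, x) - 1j * Q * a(x) * field(x)


def psi_transformed(x):
    """Field after the local phase change psi -> exp(i theta(x)) psi."""
    return cmath.exp(1j * theta(x)) * psi(x)


def gauge_field_transformed(x):
    """Compensating shift A -> A + theta'(x) / Q."""
    return gauge_field(x) + derivative(theta, x) / Q


def compare(x):
    """Return (density gap, plain-derivative gap, covariant-derivative gap)."""
    phase = cmath.exp(1j * theta(x))
    density_gap = abs(abs(psi_transformed(x)) ** 2 - abs(psi(x)) ** 2)
    plain_gap = abs(derivative(psi_transformed, x) - phase * derivative(psi, x))
    covariant_gap = abs(
        covariant_derivative(psi_transformed, gauge_field_transformed, x)
        - phase * covariant_derivative(psi, gauge_field, x)
    )
    return density_gap, plain_gap, covariant_gap


if __name__ == "__main__":
    print(compare(0.3))
```

The first and third gaps should be zero up to rounding and discretization error, while the second is not small: the density and the covariant derivative agree between the two observers, whereas the ordinary derivative does not transform by a simple phase.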

Bobhawke: the connection is the vector potential, so it is precisely the vector potential that allows us to compare values at different spacetime coordinates. In GR, the role of the vector potential is played by the Christoffel symbols; in EM, by the A-fields; in non-abelian gauge theories, by the corresponding endomorphism-valued 1-forms (usually called A too). The gauge freedom comes from the fact that there is not, a priori, a good way to put a coordinate system on the space of connections. Picking a specific gauge is then analogous to picking a particular basis for a vector space.
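For the EM case this comparison can be written down explicitly. As a sketch, assuming the convention $D_\mu = \partial_\mu - iqA_\mu$ and the local transformation $\psi \rightarrow e^{i\Theta(x)}\psi$, $A_\mu \rightarrow A_\mu + \frac{1}{q}\partial_\mu\Theta$ (signs and normalizations differ between texts), the field at $x$ is carried to $y$ along a chosen path by the phase factor
$$U(y,x) = \exp\left(iq\int_x^y A_\mu\, dx^\mu\right), \qquad U(y,x) \rightarrow e^{i\Theta(y)}\, U(y,x)\, e^{-i\Theta(x)}$$
so the combination $\psi^*(y)\,U(y,x)\,\psi(x)$ is independent of the phase convention at either point, while $\psi^*(y)\psi(x)$ on its own is not.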

Malawi, I don't think choosing a global phase violates causality. As you said the phase is unobservable, and any choice gives the same physics. Choosing a phase doesn't cause anything at all, so how can it violate causality?

I think maybe I am looking for a deeper explanation than actually exists. I guess it comes down to there being no reason why we should have the same phase at every point. I suppose I am very used to working with global group transformations, so when a local one pops up I immediately want to know why, whereas if I had been working with local group transformations for a long time I'd probably have no problem with it. And then of course there is the fact that local symmetry has some very nice consequences.

Anyways thanks for taking the time to reply everyone!

I was struggling with the same question for many years. Finally, I decided that all "physical" reasons proposed to justify the local gauge invariance are not convincing. In my opinion, this is just a formal mathematical trick whose only real value is in providing relativistically invariant and renormalizable interaction operators in QFT. Somewhat miraculously, these interactions do agree with what's going on in nature. So, I agree when you say:

Bobhawke said:
And then of course there is the fact that local symmetry has some very nice consequences.

However, if you decide to take the local gauge invariance seriously, you invite a lot of confusion.

Bobhawke said:
Malawi, I don't think choosing a global phase violates causality. As you said the phase is unobservable, and any choice gives the same physics. Choosing a phase doesn't cause anything at all, so how can it violate causality?

I think maybe I am looking for a deeper explanation than actually exists. I guess it comes down to there being no reason why we should have the same phase at every point. I suppose I am very used to working with global group transformations, so when a local one pops up I immediately want to know why, whereas if I had been working with local group transformations for a long time I'd probably have no problem with it. And then of course there is the fact that local symmetry has some very nice consequences.

Anyways thanks for taking the time to reply everyone!

But observers can communicate.

Anyway I think you have found the essence now "I guess it comes down to there being no reason why we should have the same phase at every point."

* Maybe I am misinterpreting the original questioner; if so, just ignore this post.

But maybe he is raising the general question of the nature of symmetry arguments? In that general sense, the question of why gauge invariance is no different from why Lorentz invariance, or why diffeomorphism invariance.

It was briefly mentioned in the last post of this thread.

"Are there evidences of a discrete space-time"

malawi_glenn said:
The space-time dependence of the phase is quite intuitive I think -> Physics should not depend on whether I choose phase_1 and you choose phase_2.

An interesting note is that in this picture, "the physics" is somehow defined as the connection or relation between the views, or choices, of different observers.

But do we not indeed need another observer to realize this relation? I.e. the views of the different observers still need to be communicated to a third observer before the relation or non-relation can be established. In particular, this suggests that the "choices" of phases really aren't arbitrary either. The choice a particular observer makes may be a result of physical processes.

I think this really blurs the distinction malawi_glenn tries to make in his example. Don't get me wrong, I understand what he says, but I think you can question this one step further.

The way I see symmetry arguments is as the way in which nature in fact makes progress and drives evolution. In that sense, symmetries need not be "perfect" to be useful. Even imperfect symmetries exist. One could probably even argue that perfect symmetries are trivial. It's the process of emergence and breaking of symmetries that is the interesting part.

Not sure if I missed the point or not. I don't disagree with the standard motivations, I just meant to suggest a possible way to interpret the question further, since I got a feeling the OP was not pleased with the answers. And I agree that there is more here to understand. But this applies to symmetries in general.

I think the key to appreciating the fuss is to note that, whatever relations between observers we picture, IF we are not to take off into bird's-eye views and realist illusions, we should constrain the PROCESS of establishing a relation, or sets of relations (and ultimately symmetries), from interaction histories.

/Fredrik

Fra said:
An interesting note is that in this picture, "the physics" is somehow defined as the connection or relation between the views, or choices, of different observers.

But do we not indeed need another observer to realize this relation? I.e. the views of the different observers still need to be communicated to a third observer before the relation or non-relation can be established. In particular, this suggests that the "choices" of phases really aren't arbitrary either. The choice a particular observer makes may be a result of physical processes.

This is, as far as I know, how one for instance motivates the equivalence principle in general relativity - an observer being accelerated at 9.8 m/s^2 should be able to perform the same experiments with the same physical outcome as an observer fixed on Earth.

One often encounters the same observer arguments when discussing translational, mirror, rotational and time invariance of physics.

malawi_glenn said:
This is, as far as I know, how one for instance motivates the equivalence principle in general relativity - an observer being accelerated at 9.8 m/s^2 should be able to perform the same experiments with the same physical outcome as an observer fixed on Earth.

One often encounters the same observer arguments when discussing translational, mirror, rotational and time invariance of physics.

Yes, the argument is essentially universal. And that is what I also meant: that part of the OP's question is general. It applies to observer-observer symmetries in general.

But the typical arguments simplify, or ignore, the physical process whereby the observers' results are communicated to one place and compared, and the fact that this is also a physical process. This ultimately suggests that the established relation is subject to a similar argument as the single observations from different observers.

I think this reasoning is powerful, but it is not 100% certain, or deductive. The impression is often given that the symmetries, when used as constraints, are universal and eternal. That's exactly what does not make sense.

If you have that in mind, it's easy to object to the standard arguments. But the objection doesn't suggest that the arguments are wrong, only that they are imperfect, and that the nature of physical law might not be eternal.

/Fredrik

Fra said:
Yes, the argument is essentially universal. And that is what I also meant: that part of the OP's question is general. It applies to observer-observer symmetries in general.

But the typical arguments simplify, or ignore, the physical process whereby the observers' results are communicated to one place and compared, and the fact that this is also a physical process. This ultimately suggests that the established relation is subject to a similar argument as the single observations from different observers.

I think this reasoning is powerful, but it is not 100% certain, or deductive. The impression is often given that the symmetries, when used as constraints, are universal and eternal. That's exactly what does not make sense.

If you have that in mind, it's easy to object to the standard arguments. But the objection doesn't suggest that the arguments are wrong, only that they are imperfect, and that the nature of physical law might not be eternal.

/Fredrik

We can never make pure deductive claims of nature anyway. We don't prove that the speed of light is constant, it is a premiss - a postulate. Same with these symmetries, we impose them, since we think that these symmetries SHOULD be 'valid' and physical.

There is a funny thought experiment :) Consider that you orbit Earth and make observations of scientists on Earth. You see them interact with each other, as well as with the environment. Occasionally they fire stuff out into space.

Your job is to try to understand the "laws of physics" that govern the interacting scientists.

One then realizes something interesting: the scientists' behaviour is not only determined by the laws of physics as YOU know them, it is also controlled by the scientists' understanding of the laws of physics, and this understanding evolves.

It is like what they say: you can infer what someone knows from the kind of questions they ask. Similarly with the scientists.

Similarly, I think, with nature in general. Even scientists fiddling around are natural phenomena, only they are complex systems. The notion of absolute eternal law and symmetry is a realist abstraction, because it always takes an inside observer to infer it. And this inference is a physical process.

From thinking about that, I think some of the criticism of symmetry reasoning as deductive reasoning is better understood.

On one plane, I don't think anyone can seriously object to some of the points here, yet the insights that should come from this are not (yet) visible in the way we think physical nature works. I think the standard response is that a lot of people see no reason to think that the laws of nature, as in the laws of physics, and the laws of inference as executed by scientists have any relation whatsoever.

/Fredrik

malawi_glenn said:
We can never make pure deductive claims of nature anyway. We don't prove that the speed of light is constant, it is a premiss - a postulate. Same with these symmetries, we impose them, since we think that these symmetries SHOULD be 'valid' and physical.

Yes I agree. But do you also then agree that physical laws are not eternal? The "inference of law" is subject to roughly the same criticism as the relationalism of spacetime. LAW can only be compared, and judged, by "interactions" between two scientists? :)

Somehow, that's undeniably what we have. YET, this insight is not seen in our theories. I find this odd.

But should we not, further question, the physics behind "why we think that these symmetries are valid". I think we should. And I think the same question applies to subatomic level as well.

At least "half" of this is already present in QM. The fact that systems interact, as if they act upon the information they have about the other parts. (think the scientist on Earth analogy). It's just that the full relationalism is not yet implemented in quantum theory.

I see this open problem closely related to the question of the nature of symmetry.

/Fredrik

Law is description; what we impose as "laws" are just our descriptions of observed phenomena.

So you are suggesting that one should DERIVE those symmetry operations? What kind of premisses should one then adopt? And why should those premisses be of a more fundamental kind than the symmetry operations themselves?

malawi_glenn said:
So you are suggesting that one should DERIVE those symmetry operations? What kind of premisses should one then adopt? And why should those premisses be of a more fundamental kind than the symmetry operations themselves?

I think this is an open question, and I don't have a complete answer.

My main point here is first of all to insist that we ask some different questions. I don't have the answers yet.

But briefly, the route I envision is to explore the analogy that others have expressed, for example Ariel Caticha, that the laws of physics are closely related to the laws of inference, and that the symmetries that are manifest in nature (i.e. the symmetries that WE have inferred) are a result of self-organisation, similar to biological evolution, but going down to the lowest level of nature.

The first step could be to abstract an evolving subsystem of the universe, and its actions with an unknown environment. Then picture what happens if these are interacting. This provides a mutual selection for mutual consistency. And the result is a kind of local objectivity.

Smolin pictures evolving law, through black holes. I think there is another way, but the general idea is similar.

So, are there any meta-laws that govern the evolution of law? Smolin has not answered this clearly (judging from a talk he gave, which is available as mp3). I think there are no objective meta-laws.

Then the idea after that is to start with the simplest possible non-trivial observer, and see what OBSERVABLE symmetries emerge as the observer's complexity scales up.

There should be predictions from this. Whether they will end up right or wrong I don't know. Even if I have great faith in the idea, the ultimate justification is whether this solves problems more efficiently than other approaches.

At this point, I'm just "interacting with others".

Why this is better than fundamental symmetries? As I see it it's simply because in a certain sense this could be interpreted as another symmetry - symmetry of symmetries. This is the sense in which a symmetry-liker could see the point.

Except I don't really see the symmetry of symmetries as something definite, but conceptually it's what it is almost like.

/Fredrik

Also, the greatest utility of this idea as I see it is not that you deduce everything from abstract thinking. The theory's interaction with the environment is essential. So the "justification" and selection is the feedback from experiment. The idea I suggest lies more at the level of "describing the life of a theory", and then you associate a theory with one observer, a material observer even. Thus the emergence of matter as a self-organising process should be another abstraction.

We still need an ordinary scientific process of course, but I see this as the next logical step in revolutionizing theory building. And I think it will help in the quantum gravity quest.

/Fredrik

Fra said:
Why this is better than fundamental symmetries? As I see it it's simply because in a certain sense this could be interpreted as another symmetry - symmetry of symmetries. This is the sense in which a symmetry-liker could see the point.

I think the exploit is that, when you constrain this picture to a low-complexity observer, I picture a combinatorial approach. And there is a BOUND on what symmetries can be distinguished by the inside observer. So we somehow get around the problem of choosing initial conditions, because the observable state space of initial conditions shrinks to triviality as the observers lose complexity.

If you object to this, then my defence is that it's just the best idea I have at the moment. Which somehow settles my direction, regardless of whether I'll later change my mind.

/Fredrik

Bobhawke said:
Is there some deeper reason for demanding gauge invariance other than that it allows us to include interactions between the gauge field and the fermions?

I have seen people claim that it is "in keeping with the spirit of relativity" but I wasnt entirely sure what was meant by that.

thanks

The answer is simple: in order to get the known Maxwell-Lorentz equations expressed via the field strengths, which are gauge invariant.

The root of such "demands" or "postulates" is in our reversed way of "deriving" the equations. Many think that there are some things more "fundamental" than the equations themselves.

Normally the physical equations (Newton, Maxwell, diffusion, etc.) were formulated as generalizations of experimental data. Then it was realized that they could be "derived" from some mathematical "demands". I put the word "derived" in quotation marks on purpose: there is always some "extra" dust apart from the equations obtained in this way, which should be handled additionally. The simplest example is the principle of least action for the Newton equations. The simplest Newton theory is a second-order equation furnished with the initial position r(t1) and velocity r_dot(t1) data. This is a physical problem formulation.

When you "derive" the equations from the least action principle, you arrive at the same equations furnished with different conditions fixing the constants - initial and final positions r(t1) and r(t2). Although mathematically possible, the latter formulation is not physical: nobody knows the future position r(t2). So this kind of deriving brings something extra which was not foreseen. Usually one "closes one's eyes" to it in the following way: in fact, physicists replace the "future" data r(t2) with the initial velocity r_dot(t1) to return to a physical problem formulation. That means abandoning the "least action" principle.

Such kinds of "abandonings" happen all the time in "mathematical" derivations of equations.
To arrive at the Maxwell-Lorentz equations from "first principles", you have to demand gauge invariance of the Lagrangian if it is expressed via the four-vector potential. This is a restriction on the Lagrangian L(A) because the vector potential is not as "fundamental" as the field strengths. A is somewhat convenient but not unambiguous.

Even with this requirement (gauge invariance, Lorentz covariance) you get some other problems here - divergences due to automatically introducing the self-action. It is so because the interaction term jA is not correctly "guessed" in this "theoretical" or "mathematical" approach. Renormalization is nothing but discarding some perturbative contributions. This means again abandoning the original sense of the fundamental constants or the original equations "derived" in the "theoretical" approach.

It looks OK if the "theory" happens to be renormalizable and if it works (like QED), but there are non-renormalizable "theories" derived in the very same "theoretical" way. Also there are renormalizable theories that do not describe the physical phenomena. So the requirement of renormalizability is not fundamental either.

A good physical theory does not need renormalizations. I have tried to explain how we got into this trap in my paper "Reformulation instead of Renormalizations", available at http://arxiv.org/abs/0811.4416. There I also outlined the formulation that is, in my opinion, correct and free from the self-action "demand". In this formulation QED is finite and much more physical.