General Relativity as a Gauge Theory


The fundamental interactions of the Standard Model are described by Yang-Mills theory. This is a gauge theory, which means the following: (1) Choose a group of global, i.e. spacetime-independent, symmetries and write down the Lie algebra which generates it, (2) ‘Gauge’ this algebra, i.e. make the symmetries spacetime-dependent, and (3) Introduce gauge fields which compensate for the local transformations. The local symmetries are realized on the gauge fields by the adjoint representation, and these gauge fields mediate the interactions. The other fields which are affected by these interactions then sit in other representations, like the fundamental one. In the Standard Model the symmetry groups used to describe the electromagnetic, weak and strong interactions are U(1), SU(2) and SU(3). But how about that other (fundamental?) interaction called gravity?

Gravity is described by Einstein’s theory of General Relativity (GR). There one also deals with local symmetries, namely general coordinate transformations (GCT’s) in spacetime and local Lorentz transformations (LLT’s) in the tangent space. It turns out that this theory can also be described as a gauge theory, albeit one different from Yang-Mills. So in what follows we’ll look at this claim, the differences with Yang-Mills theory and the central role of ‘removing the local translations’.

The global symmetries one has when gravity is ‘switched off’ are the Poincaré symmetries of spacetime. This flat spacetime has Minkowski metric [itex]\eta_{AB}[/itex], and the symmetries are generated by spacetime-translations [itex]P_A[/itex] and spacetime-rotations (i.e. Lorentz transformations) [itex]M_{AB}[/itex]. On the coordinates [itex]\{x^{A}\}[/itex] of spacetime (which can be identified with the tangent space) they act infinitesimally like
\begin{align}
\delta_P x^A & = \zeta^A \,, \nonumber\\
\delta_M x^A & = \lambda^A{}_B x^B \,. \label{infpoincoord}
\end{align}
These transformations keep the spacetime interval invariant. Here the indices [itex]A,B,\ldots[/itex] run from 0 to [itex]D-1[/itex] and are tangent-space indices. To keep the notation tractable, I won’t worry too much about whether these flat indices should be up or down. The Poincaré transformations \eqref{infpoincoord} generate the Poincaré algebra, as you can check:
\begin{align}
[P_{A},P_{B}] & = 0 \,, \nonumber\\
[M_{BC},P_{A}] & = -2\eta_{A[B}P_{C]}\,, \nonumber\\
[M_{CD},M_{EF}] & = 4 \eta_{[C [E}M_{F]D]} \,. \label{Poincarealgebra}
\end{align}
The [itex] [AB][/itex] means antisymmetrization with weight one such that e.g. for an antisymmetric tensor [itex]F_{AB}[/itex] one has [itex]F_{AB} = F_{[AB]}[/itex]. From this one obtains the non-zero structure constants
\begin{equation}
f_{\{BC\}A}^D = -2 \eta_{A[B}\delta_{C]}^D, \ \ \ f_{\{CD\}\{EF\}}^{\{AB\}} = 4\eta_{[C[E}\delta_{F]}^{[A} \delta_{D]}^{B]}\,.
\end{equation}
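As a quick sanity check on the notation, the antisymmetrized index brackets in the algebra \eqref{Poincarealgebra} can be written out in full. This is only an unpacking of the weight-one convention stated above, not an extra assumption:

```latex
\begin{align}
-2\,\eta_{A[B}P_{C]} & = -\eta_{AB}P_{C} + \eta_{AC}P_{B} \,, \nonumber\\
4\,\eta_{[C[E}M_{F]D]} & = \eta_{CE}M_{FD} - \eta_{CF}M_{ED} - \eta_{DE}M_{FC} + \eta_{DF}M_{EC} \,. \nonumber
\end{align}
```

The first right-hand side is manifestly antisymmetric in [itex][BC][/itex], and the second in both [itex][CD][/itex] and [itex][EF][/itex], as it should be.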
So now we gauge this algebra. In this gauging we first treat the transformations as acting on an abstract, internal space which only later on (!) will be identified as the tangent space. The ‘translational’ gauge field is (suggestively) called [itex]e_{\mu}{}^A[/itex] while the ‘rotational’ gauge field is [itex]\omega_{\mu}{}^{AB}[/itex]. Here the curved indices [itex]\mu\nu\ldots[/itex] indicate spacetime indices. These fields sit in the adjoint representation of the algebra \eqref{Poincarealgebra}. The non-zero local Poincaré symmetries are then realized as follows:
\begin{align}
\delta_P e_{\mu}{}^A & = \partial_{\mu}\zeta^A - \omega_{\mu}{}^{AB} \zeta^B , \ \ \delta_M e_{\mu}{}^A = \lambda^{AB}e_{\mu}{}^B \,, \nonumber\\
\delta_M \omega_{\mu}{}^{AB} & = \partial_{\mu}\lambda^{AB} + 2 \lambda^{C[A}\omega_{\mu}{}^{B]C}\,. \label{The gauge transformations}
\end{align}
You can check for yourself that these transformations realize the algebra \eqref{Poincarealgebra} on the two gauge fields by calculating the commutators. Besides the gauge transformations the two gauge fields are spacetime vectors, and they transform as such under GCT’s [itex]\delta x^{\mu} = \xi^{\mu}(x)[/itex]:
\begin{equation}
\delta_{\xi} e_{\mu}{}^A = \xi^{\lambda} \partial_{\lambda} e_{\mu}{}^A + (\partial_{\mu}\xi^{\lambda})\, e_{\lambda}{}^A \,,\label{gcte}
\end{equation}
and similarly for [itex]\omega_{\mu}{}^{AB}[/itex]. The field strengths of the two gauge fields read
\begin{align}
R_{\mu\nu}{}^A(P) & = 2 \Bigl( \partial_{[\mu}e_{\nu]}{}^A - \omega_{[\mu}{}^{AB} e_{\nu]}{}^B \Bigr) \,,\nonumber\\
R_{\mu\nu}{}^{AB}(M) & = 2 \Bigl( \partial_{[\mu}\omega_{\nu]}{}^{AB} - \omega_{[\mu}{}^{CA}\omega_{\nu]}{}^B{}_C \Bigr) \,. \label{PoinCurvatures}
\end{align}
These field strengths transform covariantly under the gauge transformations, i.e. without derivatives on the parameters [itex]\zeta^A[/itex] and [itex]\lambda^A{}_B[/itex].
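To make ‘covariantly’ concrete, here is a sketch of the check for the translational field strength under a local Lorentz transformation. It uses only the transformations \eqref{The gauge transformations} and the antisymmetry of [itex]\lambda^{AB}[/itex] and [itex]\omega_{\mu}{}^{AB}[/itex]; flat index positions are again treated loosely:

```latex
\begin{align}
\delta_M \bigl( \partial_{\mu}e_{\nu}{}^A - \omega_{\mu}{}^{AB}e_{\nu}{}^B \bigr)
 & = \partial_{\mu}\lambda^{AB}\, e_{\nu}{}^B + \lambda^{AB}\partial_{\mu}e_{\nu}{}^B \nonumber\\
 & \quad - \bigl( \partial_{\mu}\lambda^{AB} + \lambda^{CA}\omega_{\mu}{}^{BC} - \lambda^{CB}\omega_{\mu}{}^{AC} \bigr) e_{\nu}{}^B
   - \omega_{\mu}{}^{AB}\lambda^{BC}e_{\nu}{}^C \nonumber\\
 & = \lambda^{AB} \bigl( \partial_{\mu}e_{\nu}{}^B - \omega_{\mu}{}^{BC}e_{\nu}{}^C \bigr) \,. \nonumber
\end{align}
```

The two [itex]\partial_{\mu}\lambda[/itex] terms cancel, and so do the last two terms after relabeling dummy indices. Antisymmetrizing in [itex][\mu\nu][/itex] gives [itex]\delta_M R_{\mu\nu}{}^A(P) = \lambda^{AB} R_{\mu\nu}{}^B(P)[/itex]. A similar computation, using that [itex]\omega_{\mu}{}^{AB}[/itex] is inert under local translations in the Poincaré case, gives [itex]\delta_P R_{\mu\nu}{}^A(P) = -R_{\mu\nu}{}^{AB}(M)\zeta^B[/itex]. In both cases no derivative of the parameter survives.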

Until now this all seems a bit formal and you may wonder what the relation is between our gauge theory and Einstein’s GR. In GR one has GCT’s and LLT’s but no local translations. So if we could somehow get rid of these local translations, we would be in business. This can be achieved by the following very important constraint:
\begin{equation}
R_{\mu\nu}{}^A(P) = 0 \,. \label{RPconstraint}
\end{equation}
We just put the field strength of local translations to zero! To see why this does the trick, you have to look at the following, also very important, relation
\begin{align}
\delta_{\xi}(\xi^{\lambda})e_{\mu}{}^A & + \xi^\lambda R_{\mu\lambda}{}^A(P) - \delta_P (\zeta^C = \xi^{\lambda}e_{\lambda}{}^{C})e_{\mu}{}^A \nonumber\\
& - \delta_M (\lambda^{C}{}_D = \xi^{\lambda} \omega_{\lambda}{}^{C}{}_D)e_{\mu}{}^A = 0 \,. \label{VIE}
\end{align}
A similar relation holds for [itex]\omega_{\mu}{}^{AB}[/itex]. You see that if we take field dependent parameters in our gauge transformations of a particular gauge field, there is a relation between all the gauge transformations, GCT’s and the field strength! Similar relations hold in every gauge theory. For instance, in ordinary U(1) Yang-Mills theory with gauge field [itex]A_{\mu}[/itex] and gauge transformation [itex]\delta_{\Lambda} A_{\mu} = \partial_{\mu}\Lambda[/itex] one has
\begin{equation}
\delta_{\xi}(\xi^{\lambda})A_{\mu} + \xi^{\lambda}F_{\mu\lambda} - \delta_{\Lambda}(\Lambda = \xi^{\lambda}A_{\lambda})A_{\mu} = 0 \,, \label{VIEMax}
\end{equation}
as you can check by yourself. So if we would put the field strength in eqn.\eqref{VIEMax} to zero, GCT’s and field-dependent gauge transformations become the same. These field-dependent transformations do not span a U(1) algebra anymore, i.e. they do not commute! Instead, they form what is called a soft algebra, which is an algebra having field-dependent structure ‘constants’. In the U(1)-case this is not so interesting; having just one Abelian field strength, we would make our gauge field pure gauge by the zero field strength constraint. So let’s return to our Poincaré theory. Using the constraint \eqref{RPconstraint} in the relation \eqref{VIE}, we get
\begin{equation}
\delta_P (\xi^{\lambda}e_{\lambda}{}^{C})e_{\mu}{}^A = \delta_{\xi}(\xi^{\lambda})e_{\mu}{}^A - \delta_M (\xi^{\lambda} \omega_{\lambda}{}^{CD})e_{\mu}{}^A \,. \label{VIE2}
\end{equation}
This means that if we can reformulate every gauge parameter [itex]\zeta^C[/itex] as [itex]\xi^{\lambda}e_{\lambda}{}^{C}[/itex], we can effectively rewrite every local translation in terms of the GCT’s and LLT’s! To put it differently, we want a one-to-one relation between [itex]\zeta^C[/itex] and [itex]\xi^{\lambda}[/itex] so that we can dump the local translations. This can be achieved by interpreting [itex]e_{\lambda}{}^{C}[/itex] as an invertible matrix. Hence, we introduce ‘inverse’ fields for it:
\begin{equation}
e_{\mu}{}^A e^{\mu}{}_B = \delta^A_B, \ \ \ e_{\mu}{}^A e^{\nu}{}_A = \delta^{\nu}_{\mu} \,.
\end{equation}
As such we can relate
\begin{equation}
\zeta^A \equiv \xi^{\lambda}e_{\lambda}{}^{A}, \ \ \ e^{\mu}{}_A \zeta^A \equiv \xi^{\mu} \,.
\end{equation}
With the relation [itex]\zeta^C \equiv \xi^{\lambda}e_{\lambda}{}^{C}[/itex] in eqn. \eqref{VIE2} we effectively have removed the local translations from our algebra; they can all be written as linear combinations of GCT’s and LLT’s. The fact that this deforms our original Poincaré algebra we started from (the local translations don’t commute anymore) doesn’t bother us. On the contrary: it is expected, since GCT’s generally don’t commute!
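For readers who want to see the cancellation behind \eqref{VIE} explicitly, here is a sketch of the check. It uses the field strength \eqref{PoinCurvatures}, the gauge transformations \eqref{The gauge transformations} with field-dependent parameters, and the GCT of a covector in its standard Lie-derivative form, which is written out in the first line:

```latex
\begin{align}
\delta_{\xi} e_{\mu}{}^A & = \xi^{\lambda}\partial_{\lambda}e_{\mu}{}^A + (\partial_{\mu}\xi^{\lambda})\, e_{\lambda}{}^A \,, \nonumber\\
\xi^{\lambda}R_{\mu\lambda}{}^A(P) & = \xi^{\lambda}\partial_{\mu}e_{\lambda}{}^A - \xi^{\lambda}\partial_{\lambda}e_{\mu}{}^A
   - \omega_{\mu}{}^{AB}\xi^{\lambda}e_{\lambda}{}^B + \xi^{\lambda}\omega_{\lambda}{}^{AB}e_{\mu}{}^B \,, \nonumber\\
\delta_P(\xi^{\lambda}e_{\lambda}{}^C)\, e_{\mu}{}^A & = (\partial_{\mu}\xi^{\lambda})\, e_{\lambda}{}^A + \xi^{\lambda}\partial_{\mu}e_{\lambda}{}^A
   - \omega_{\mu}{}^{AB}\xi^{\lambda}e_{\lambda}{}^B \,, \nonumber\\
\delta_M(\xi^{\lambda}\omega_{\lambda}{}^{CD})\, e_{\mu}{}^A & = \xi^{\lambda}\omega_{\lambda}{}^{AB}e_{\mu}{}^B \,. \nonumber
\end{align}
```

Adding the first two lines and subtracting the last two, every term cancels pairwise. The U(1) relation \eqref{VIEMax} works the same way, assuming the usual [itex]F_{\mu\lambda} = \partial_{\mu}A_{\lambda} - \partial_{\lambda}A_{\mu}[/itex]: the [itex]\xi^{\lambda}\partial_{\lambda}A_{\mu}[/itex] terms cancel between the GCT and the field strength, and the remaining two terms add up to [itex]\partial_{\mu}(\xi^{\lambda}A_{\lambda})[/itex], which is exactly the field-dependent gauge transformation.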

The fun is not over yet; the constraint \eqref{RPconstraint} does something else for us. Namely, now that we have an inverse for the gauge field at our disposal, we notice that the field [itex]\omega_{\mu}{}^{AB}[/itex] appears algebraically in the constraint, multiplied by an invertible matrix, and that its number of independent components matches the number of independent constraints in \eqref{RPconstraint}. This means we can solve for it completely, namely
\begin{equation}
\omega_{\mu}{}^{AB} = 2e^{\lambda [A} \partial_{[\lambda}e_{\mu]}{}^{B]} + e_{\mu}{}^C e^{\lambda\,A} e^{\rho\,B} \partial_{[\lambda}e_{\rho]}{}^C\,. \label{omega}
\end{equation}
So there you have it. Imposing the constraint \eqref{RPconstraint} enables us to (1) ‘remove’ the local translations from our algebra and (2) make the field [itex]\omega_{\mu}{}^{AB}[/itex] dependent, leaving us with [itex]e_{\mu}{}^A[/itex] as the only independent field. A couple of remarks are in order here.

First, we are tempted to identify [itex]e_{\mu}{}^A[/itex] with the vielbein and [itex]\omega_{\mu}{}^{AB}[/itex] with the spin connection of GR. This means our indices A,B,… label the tangent space or ‘inertial’ indices. It’s okay to succumb to this temptation. The constraint \eqref{RPconstraint} then simply says that torsion vanishes, which we know is true in GR.

Second, we see that we did two things which look quite alien to Yang-Mills theories: we’ve put one field strength to zero and we introduced an ‘inverse gauge field’ to remove the local translations.

Third, you might worry about the transformation of the spin connection, now that it is a dependent field. Worry not: the variation of the constraint \eqref{RPconstraint} must also be zero, which implies that the spin connection still transforms under LLT’s as \eqref{The gauge transformations} dictates. If you like tedious calculations, this can also be checked by applying the LLT [itex]\delta_M e_{\mu}{}^A = \lambda^A{}_B e_{\mu}{}^B[/itex] directly to the solution \eqref{omega}. Ditto for the GCT \eqref{gcte}.

Fourth: how are solving for the spin connection and removing the local translations from the vielbein related? They both follow from our constraint \eqref{RPconstraint} but seem to be two completely unrelated things! Their relation, however, is obscured by the fact that we applied the gauging procedure to the Poincaré algebra. This algebra can be obtained from the (anti-)de Sitter algebra ((A)dS) by a so-called Inönü-Wigner contraction, i.e. a degenerate transformation on the Lie algebra which makes some structure constants vanish. In this (A)dS algebra, the commutator [P,P] = 0 of the Poincaré algebra gets replaced by
\begin{equation}
[P_A, P_B] = \Lambda M_{AB} \,, \label{Sit}
\end{equation}
where [itex]\Lambda[/itex] is the cosmological constant. As a result, the transformations \eqref{The gauge transformations} now change: the a priori independent spin connection now also transforms under local translations. At first sight this looks like a disaster: by writing down the analog of \eqref{VIE2} for the spin connection, you could think that in order to remove the local translations from the spin connection we now also have to put the rotational field strength to zero. However, this is not necessary. The constraint \eqref{RPconstraint} already removes the local translations from the spin connection by making this field completely dependent on the vielbein. This fact is obscured in the Poincaré algebra, because there the independent spin connection didn’t transform under local translations in the first place. The Inönü-Wigner contraction consists of sending [itex]\Lambda \rightarrow 0[/itex] in eqn. \eqref{Sit}, giving us the Poincaré algebra again.
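Schematically, the effect of the deformed commutator \eqref{Sit} on the gauging can be sketched as follows. The proportionality sign and the constant [itex]c[/itex] are placeholders of mine: the exact numerical factors and signs depend on the normalization of the generators and are not fixed here.

```latex
\begin{align}
\delta_P \omega_{\mu}{}^{AB} & \propto \Lambda\, e_{\mu}{}^{[A}\zeta^{B]} \,, \nonumber\\
R_{\mu\nu}{}^{AB}(M) & = 2 \Bigl( \partial_{[\mu}\omega_{\nu]}{}^{AB} - \omega_{[\mu}{}^{CA}\omega_{\nu]}{}^B{}_C \Bigr)
   + c\,\Lambda\, e_{[\mu}{}^A e_{\nu]}{}^B \,. \nonumber
\end{align}
```

The translational field strength [itex]R_{\mu\nu}{}^A(P)[/itex] and the transformations of [itex]e_{\mu}{}^A[/itex] keep their Poincaré form, because the commutator [P,P] produces an [itex]M[/itex] and no [itex]P[/itex]. For [itex]\Lambda \rightarrow 0[/itex] both extra terms vanish and the Poincaré expressions are recovered.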

Fifth, we still have to derive the dynamics. The Einstein-Hilbert action is
\begin{equation}
S \sim \int e^{\mu}{}_A e^{\nu}{}_B R_{\mu\nu}{}^{AB}(M) |e| d^D x \,,
\end{equation}
with [itex]|e|[/itex] the (absolute value of the) determinant of the vielbein. It has the nice property that varying with respect to the spin connection gives us the constraint \eqref{RPconstraint}. This approach, in which the vielbein and the spin connection are both taken to be independent a priori, is known as the Palatini formalism. That makes us happy, because this is exactly what we do in our gauging procedure. However, the Einstein-Hilbert action is linear in the field strength and not quadratic, as in Yang-Mills theory. If one starts from the gauge theory of (A)dS, one can indeed write down an action which is quadratic in the full field strength [itex]R_{\mu\nu}{}^A P_A + R_{\mu\nu}{}^{AB} M_{AB}[/itex] and invariant under parity, GCT’s and LLT’s. Expanding this full field strength gives us a topological term and the Einstein-Hilbert action including the cosmological constant. So we see that the Einstein-Hilbert action can also be obtained from a Lagrangian quadratic in the full field strength. This is explained in ref. [7].
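To see how a quadratic action can produce a linear one, here is a rough sketch of the expansion, assuming D=4 and an action of the type written down by MacDowell and Mansouri [3]. The hatted symbol, the split into a [itex]\Lambda[/itex]-independent part [itex]R_{\mu\nu}{}^{AB}[/itex] plus a vielbein term, and the constant [itex]c[/itex] are my own shorthand; overall normalizations and signs are convention-dependent and not fixed here. With the constraint \eqref{RPconstraint} imposed, only the rotational part of the full field strength contributes.

```latex
\begin{align}
\hat{R}_{\mu\nu}{}^{AB} & \equiv R_{\mu\nu}{}^{AB} + c\,\Lambda\, e_{[\mu}{}^A e_{\nu]}{}^B \,, \nonumber\\
S & \sim \int d^4 x \, \epsilon^{\mu\nu\rho\sigma}\epsilon_{ABCD}\, \hat{R}_{\mu\nu}{}^{AB}\hat{R}_{\rho\sigma}{}^{CD} \nonumber\\
  & = \int d^4 x \, \epsilon^{\mu\nu\rho\sigma}\epsilon_{ABCD} \Bigl(
      R_{\mu\nu}{}^{AB}R_{\rho\sigma}{}^{CD}
      + 2c\,\Lambda\, e_{\mu}{}^A e_{\nu}{}^B R_{\rho\sigma}{}^{CD}
      + c^2\Lambda^2\, e_{\mu}{}^A e_{\nu}{}^B e_{\rho}{}^C e_{\sigma}{}^D \Bigr) \,. \nonumber
\end{align}
```

The first term is the topological (Gauss-Bonnet type) term, the second is proportional to [itex]|e|\, e^{\mu}{}_A e^{\nu}{}_B R_{\mu\nu}{}^{AB}[/itex], i.e. the Einstein-Hilbert Lagrangian, and the third is proportional to [itex]|e|[/itex], i.e. a cosmological term.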

Sixth and last: why do we care? Well, to start with, it gives us a different view on GR. Different views are useful for being ‘a good physicist’, as Feynman once explained in his The Character of Physical Law. But the great strength of this gauging procedure lies in the fact that it can be applied to all kinds of other Lie algebras, giving all kinds of different theories of gravity! The first example is minimal supersymmetry in D=4, described by the super-Poincaré algebra [3,4,5]. This algebra contains a spinorial supercharge [itex]Q[/itex], with (anti)commutators reading schematically [itex]\{Q,Q\} \sim P[/itex] and [itex][M,Q] \sim Q[/itex]. The supercharge gauge field is the gravitino, and we see that [itex]\{Q,Q\} \sim P[/itex] introduces an extra term in the translational field strength. Putting it to zero again to remove the local translations now introduces torsion. A similar thing can be done for the super-AdS algebra, whereas super-dS is not allowed by the Jacobi identities. Other algebras to which the procedure can be applied are (among others) the (super)conformal, (super)Bargmann, (super)Newton-Hooke (the non-relativistic version of (A)dS), (super)Schrödinger and Carroll algebras. The (super)conformal algebra gives us a method to couple matter to Poincaré supergravity, the notorious superconformal tensor calculus [6]. The (super)Bargmann, (super)Newton-Hooke and (super)Schrödinger algebras give us (super)Newton-Cartan gravity [8,9], i.e. non-relativistic but generally covariant theories of gravity. The Carroll algebra gives us generally covariant, ultrarelativistic gravity [10]. I haven’t checked it, but I’m pretty sure the gauging procedure as outlined here can also be applied to other deformations of the Poincaré algebra, such as ‘Doubly Special Relativity’ [11].

My outline here stresses the role of the curvature constraint \eqref{RPconstraint}. It’s often called conventional, because one uses it to solve for the spin connection. However, as I understand it, this is a consequence of something deeper: it allows us to remove the local translations from the algebra while maintaining its corresponding gauge field.

For further reading (including both pedagogical and original expositions), see

[1] R. Utiyama, "Invariant theoretical interpretation of interaction", Phys. Rev. 101, 1597 (1955)
[2] T.W.B. Kibble, "Lorentz invariance and the gravitational field", J. Math. Phys. 2 (1961) 212-221
[3] S.W. MacDowell and F. Mansouri, "Unified Geometric Theory of Gravity and Supergravity", Phys. Rev. Lett. 38 (1977) 739
[4] A.H. Chamseddine and P.C. West, "Supergravity as a Gauge Theory of Supersymmetry", Nucl. Phys. B129 (1977) 39-44
[5] T. Ortín, "Gravity and Strings", Cambridge Monographs on Mathematical Physics, Cambridge University Press (2015-03-26), ISBN: 9780521768139 (Print)
[6] D.Z. Freedman and A. Van Proeyen, "Supergravity", Cambridge, UK: Cambridge Univ. Press (2012-05-20), ISBN: 9781139368063 (eBook), 9780521194013 (Print)
[7] P.G.O. Freund, "Introduction to Supersymmetry", Cambridge Monographs on Mathematical Physics, Cambridge, UK: Cambridge Univ. Press (2012-05-11), ISBN: 9781139241939 (eBook), 9780521356756 (Print)
[8] R. Andringa, "Newton-Cartan gravity revisited"
[9] E.A. Bergshoeff, J. Rosseel and T. Zojer, "Newton-Cartan supergravity with torsion and Schrödinger supergravity", JHEP 1511 (2015) 180 (2015-11-25)
[10] J. Hartong, "Gauging the Carroll Algebra and Ultra-Relativistic Gravity", JHEP 1508 (2015) 069
[11] G. Amelino-Camelia, "Relativity in space-times with short-distance structure governed by an observer-independent (Planckian) length scale", Int. J. Mod. Phys. D11 (2002) 35-60



24 replies
  1. Orodruin
    Orodruin says:

    Nice read.

    Just to nitpick: The gauge group corresponding to electromagnetism is U(1), not SU(1) (SU(1) is the trivial group and therefore not very exciting to build a Yang-Mills theory on). Also, the U(1) in the SM before spontaneous symmetry breaking is not the electromagnetic U(1) (the generator is hypercharge, not charge) and the SU(2) of the SM is not just related to weak interactions. Upon SSB, the U(1)xSU(2) symmetry of the SM is broken into the electromagnetic U(1), whose generator is a linear combination of the hypercharge U(1) and one of the SU(2) generators.

  2. vanhees71
    vanhees71 says:

    Great article. It would be great if you could put the complete references at the end, e.g., for the first one: R. Utiyama, Invariant theoretical interpretation of interaction, Phys. Rev. 101, 1597 (1955)

  3. RockyMarciano
    RockyMarciano says:

    Interesting insight.

    Maybe you could clarify more about the difference between GR gauge and Yang-Mills gauge. You have delved into how to keep GCTs while removing local translations by using a curvature constraint. A first question: this curvature constraint comes about through the differential Bianchi identities (which reflect the dependencies that Noether's second theorem for local gauge theories talks about, in her seminal 1918 article) in the absence of a local curvature source, so it can be viewed as analogous, for instance, to the Ward identities in QED that are obtained automatically by applying Noether's second theorem in the context of QFT. So wouldn't you say this is a common point between Yang-Mills and GR gauges rather than a difference?

    My other worry, which I'm not completely sure you have addressed in the insight, is that this constraint, if I have understood correctly what you refer to by it (the constraint leading to the vacuum field equations), is not a general feature of GR though, it only applies to certain isolated central objects situations. But curvature is indeed a general feature of GR, as are the differential Bianchi identities that make clear the impossibility of using flat coordinates globally in a curved spacetime. So is GR only a gauge theory for those special cases of isolated objects in vacuum? But then how are GCTs together with the LLTs kept in the general case when the curvature constraint is not present? The dependencies and the Bianchi identities are still there, so how are the local translations removed in the general case?

  4. haushofer
    haushofer says:

    I'm not sure what you mean, so I hope this answer satisfies. The Bianchi identities for both curvatures imply, with the constraint R(P)=0 and the vielbein postulate, the usual Bianchi identities for the Riemann tensor. These identities continue to hold when we couple the theory to matter, so I don't see any problem here.

    I think I don't really understand your statement " the curvature constraint only applies to certain isolated central objects situations"

  5. RockyMarciano
    RockyMarciano says:

    haushofer said: "I think I don't really understand your statement 'the curvature constraint only applies to certain isolated central objects situations'"

    Maybe I'm misinterpreting what you mean by R(P)=0? I thought you meant by it a constraint that only applies in vacuum i.e. where Rab=0.

  6. RockyMarciano
    RockyMarciano says:

    I've managed to get a copy of the second referenced paper(Kibble) in order to get some more understanding and maybe get up to speed on the subject of the insights article and what I found at the end of its section 6 is that after the gauging procedure, that includes removing the local translations and writing them as linear combinations of GCTs and LLTs as mentioned in the insight, what is obtained is a gauge theory of gravitation that is not exactly GR but the variant known as Einstein-Cartan-Sciama-Kibble theory, that doesn't have symmetric connection or Ricci tensors because it has torsion that allows to couple with matter with spin. Only then you can set the torsion to zero  and then you get not the full EFE but the vacuum field equations of GR as I thought.

  7. haushofer
    haushofer says:

    Ah, I see. Yes, I haven't stressed it too much, but the idea is that the gauging procedure gives the vacuum equations. As far as I can see those are exactly equivalent to the vacuum equations of GR. After that you can couple matter to your theory.

    It's been a while since I've read that Kibble paper, but I don't see how one can keep torsion while removing the local translations: the R(P) curvature is the torsion. So R(P)=0 puts the torsion to zero, and the Bianchi identities then give you the corresponding symmetries on the Riemann tensor.

    I wrote this Insight after having a discussion with Urs here,

    Maybe you find that interesting too :)
