# A Short Proof of Birkhoff’s Theorem

Birkhoff’s theorem is a very useful result in General Relativity, and pretty much any textbook has a proof of it. The one I first read was in Misner, Thorne, & Wheeler (MTW), many years ago, but it was only much later that I realized that MTW’s statement of the proof does something that, strictly speaking, is not mathematically correct. (In fact, this something is not limited to their proof of Birkhoff’s theorem; they do it throughout the book.) MTW write the general line element for a spherically symmetric spacetime as follows:

[tex]ds^2 = - e^{2 \Phi} dt^2 + e^{2 \Lambda} dr^2 + r^2 d\Omega^2[/tex]

where [itex]\Phi[/itex] and [itex]\Lambda[/itex] are, in general, functions of both [itex]r[/itex] and [itex]t[/itex], and [itex]d\Omega^2[/itex] is the standard metric on a unit 2-sphere. MTW note carefully that, to be fully general, the signs of the [itex]dt^2[/itex] and [itex]dr^2[/itex] terms cannot be restricted; but the way the metric is written, with exponentials in those coefficients, the signs *are* restricted, because all of the functions involved are real-valued.

What I’m going to do in this post is re-do the proof that MTW give of Birkhoff’s theorem, but without this deficiency. Doing this is simple; we start by re-writing the above line element in a way that clearly sets no limitation on the signs of the [itex]dt^2[/itex] and [itex]dr^2[/itex] terms:

[tex]ds^2 = j(r, t') dt'^2 + k(r, t') dr^2 + r^2 d\Omega^2[/tex]

(I have used [itex]t'[/itex] as the first coordinate for reasons which will become apparent below.) Similar to the above, the first two coefficients, [itex]j[/itex] and [itex]k[/itex], can, in general, be functions of both [itex]r[/itex] and [itex]t'[/itex]. MTW go into some detail in showing how any spherically symmetric spacetime can be described by a line element of this form, and I’m not going to redo any of that, but just take it as established.

The thing to do now is to compute the Einstein tensor of the above metric and apply the vacuum Einstein Field Equation; it will turn out that we only need to look at the components that involve [itex]t'[/itex] and [itex]r[/itex]:

[tex]G^{t'}{}_{t'} = - \frac{r \partial k / \partial r + k^2 - k}{k^2 r^2} = 0[/tex]

[tex]G^{t'}{}_r = \frac{\partial k / \partial t'}{j k r} = 0[/tex]

[tex]G^r{}_{t'} = \frac{\partial k / \partial t'}{k^2 r} = 0[/tex]

[tex]G^r{}_r = \frac{r \partial j / \partial r - j k + j}{j k r^2} = 0[/tex]

The second and third equations show that [itex]k[/itex] is a function of [itex]r[/itex] only. Given that, the first equation can be solved for [itex]k[/itex]; the easiest way to do it, based on the principle that when working the solution of a problem, it helps to already know the answer, is to try the ansatz [itex]k = r / ( r - 2m )[/itex], where [itex]m[/itex] is a constant, and find that it solves the equation. So we have now shown that our metric can be written:

[tex]ds^2 = j(r, t') dt'^2 + \frac{dr^2}{1 - 2m / r} + r^2 d\Omega^2[/tex]

where now we only have one undetermined function left, [itex]j[/itex]. The easiest way to solve for it is to look at the [itex]G^r{}_r[/itex] equation. (We could also look at the [itex]G^\theta{}_\theta[/itex] or [itex]G^\phi{}_\phi[/itex] equations, which are both identical, but they are also considerably more complicated, and contain no additional information.) That equation now looks like this:

[tex]G^r{}_r = \frac{\partial j / \partial r ( r^2 - 2 m r) - 2 j m}{j r^3} = 0[/tex]

Once again, it helps to know the answer; the ansatz [itex]j = - ( 1 - 2m / r) f(t')[/itex] solves the above equation, where [itex]f(t')[/itex] is an arbitrary function of [itex]t'[/itex] (though it should be noted that it must be positive, i.e., [itex]f(t') > 0[/itex] must hold for all [itex]t'[/itex]). So now we have shown that our metric can be written in this form:

[tex]ds^2 = - \left( 1 - \frac{2m}{r} \right) f(t') dt'^2 + \frac{dr^2}{1 - 2m / r} + r^2 d\Omega^2[/tex]
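Since both functions were found by guessing, it is worth checking that the guesses lose no generality. What follows is a brief sketch rather than part of MTW's argument; the substitution [itex]u = 1/k[/itex] and the names [itex]C[/itex] and [itex]F(t')[/itex] are introduced here purely for illustration. Setting the numerator of the [itex]G^{t'}{}_{t'}[/itex] equation to zero and writing [itex]u = 1/k[/itex] gives

[tex]r \frac{\partial k}{\partial r} = k - k^2 \quad \Longrightarrow \quad \frac{du}{dr} = \frac{1 - u}{r} \quad \Longrightarrow \quad u = 1 - \frac{C}{r}[/tex]

so [itex]k = 1 / (1 - C / r)[/itex], and naming the integration constant [itex]C = 2m[/itex] reproduces the ansatz for [itex]k[/itex]. With that [itex]k[/itex], setting the numerator of the [itex]G^r{}_r[/itex] equation to zero gives

[tex]\frac{1}{j} \frac{\partial j}{\partial r} = \frac{2m}{r (r - 2m)} = \frac{1}{r - 2m} - \frac{1}{r} \quad \Longrightarrow \quad j = F(t') \left( 1 - \frac{2m}{r} \right)[/tex]

where [itex]F(t')[/itex] is an arbitrary function of integration. The ansatz for [itex]j[/itex] used here is the case [itex]F = -f[/itex].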

(Btw, you may have noticed that the leading minus sign in the ansatz for [itex]j[/itex] above could have been eliminated, and it would still solve the [itex]G^r{}_r[/itex] equation. However, since the line element as a whole has to be Lorentzian, the signs of [itex]g_{tt}[/itex] and [itex]g_{rr}[/itex] must be opposite, and that requires the minus sign in front of the expression [itex]( 1 - 2m / r )[/itex] in [itex]g_{tt}[/itex]. Note that there is no similar freedom in choosing the sign in front of the expression [itex]( 1 - 2m / r )[/itex] in [itex]g_{rr}[/itex]; only the positive sign, as given here, gives a valid solution of the [itex]G^{t'}{}_{t'}[/itex] equation shown above.)
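The claims about which signs work can be spot-checked numerically. The sketch below is only an illustration, not a proof: it estimates the radial derivatives with central differences and evaluates the numerators of the two field equations at a few radii on both sides of [itex]r = 2m[/itex]. The helper names, the test value [itex]m = 1[/itex], and the frozen value of [itex]f(t')[/itex] are all assumptions made for the example.

```python
def ddr(func, r, h=1e-6):
    """Central-difference estimate of d(func)/dr at r."""
    return (func(r + h) - func(r - h)) / (2 * h)

def gtt_numerator(k, r):
    """Numerator of the G^{t'}_{t'} equation: r k' + k^2 - k."""
    return r * ddr(k, r) + k(r) ** 2 - k(r)

def grr_numerator(j, r, m):
    """Numerator of the reduced G^r_r equation: j' (r^2 - 2 m r) - 2 j m."""
    return ddr(j, r) * (r ** 2 - 2 * m * r) - 2 * j(r) * m

m = 1.0          # mass parameter (arbitrary test value)
f_value = 2.5    # f(t') frozen at one instant; any positive number works

k_plus = lambda r: r / (r - 2 * m)            # ansatz from the text
k_minus = lambda r: -r / (r - 2 * m)          # flipped sign in g_rr
j_minus = lambda r: -(1 - 2 * m / r) * f_value  # ansatz from the text
j_plus = lambda r: (1 - 2 * m / r) * f_value    # leading minus sign dropped

radii = (0.5, 1.5, 3.0, 10.0)                 # both sides of r = 2m
for r in radii:
    print(r,
          gtt_numerator(k_plus, r),           # close to zero
          gtt_numerator(k_minus, r),          # clearly nonzero
          grr_numerator(j_minus, r, m),       # close to zero
          grr_numerator(j_plus, r, m))        # close to zero as well
```

The residual for the flipped-sign [itex]k[/itex] works out analytically to [itex]2 r^2 / (r - 2m)^2[/itex], which never vanishes, while both signs of [itex]j[/itex] give residuals at the level of rounding error because the [itex]G^r{}_r[/itex] equation is linear and homogeneous in [itex]j[/itex].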

The final step is to deal with that arbitrary function of [itex]t'[/itex] in the first coefficient. None of the components of the EFE constrain it at all; but what that actually means is that we can re-scale the time coordinate however we want to; in particular, we can adopt a new time coordinate [itex]t[/itex] given by

[tex]dt = \sqrt{f(t')} dt'[/tex]

(The fact that we take a square root here is why the function [itex]f(t')[/itex] must be positive; more precisely, that plus the fact that we must have [itex]dt[/itex] nonzero whenever [itex]dt'[/itex] is nonzero.) With this change of coordinates, we now have the line element in the standard Schwarzschild form, which completes the proof of Birkhoff’s theorem:

[tex]ds^2 = - \left( 1 - \frac{2m}{r} \right) dt^2 + \frac{dr^2}{1 - 2m / r} + r^2 d\Omega^2[/tex]
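To make the rescaling of the time coordinate concrete, here is a small sketch with a made-up choice of the arbitrary function, [itex]f(t') = e^{2 t'}[/itex], which is positive everywhere. For that choice the integral can be done by hand, giving [itex]t = e^{t'} - 1[/itex] if we fix [itex]t = 0[/itex] at [itex]t' = 0[/itex]. The code integrates [itex]dt = \sqrt{f(t')} dt'[/itex] with a plain trapezoid rule and compares against that closed form; the function name `new_time` and the choice of [itex]f[/itex] are assumptions for illustration only.

```python
import math

def new_time(f, t_prime, steps=10_000):
    """Integrate dt = sqrt(f(t')) dt' from 0 to t_prime (trapezoid rule)."""
    h = t_prime / steps
    total = 0.5 * (math.sqrt(f(0.0)) + math.sqrt(f(t_prime)))
    for i in range(1, steps):
        total += math.sqrt(f(i * h))
    return total * h

# Hypothetical f(t'); the field equations leave it arbitrary apart from f > 0.
f = lambda tp: math.exp(2.0 * tp)

for tp in (0.5, 1.0, 2.0):
    numeric = new_time(f, tp)
    exact = math.exp(tp) - 1.0      # closed form for this particular f
    print(tp, numeric, exact)
```

Because [itex]\sqrt{f(t')}[/itex] is strictly positive, the new coordinate [itex]t[/itex] increases monotonically with [itex]t'[/itex], so the relabeling is invertible.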

So we have shown that we can write the line element in a form that is entirely independent of the [itex]t[/itex] coordinate. In other words, we have shown that any spherically symmetric, vacuum spacetime must have an extra Killing vector field, [itex]\partial / \partial t[/itex], over and above the three KVFs that it has by virtue of spherical symmetry. But it is important to stress that we have *not* shown that the coordinate [itex]t[/itex] is a “time” coordinate; there is nothing in the above that requires [itex]t[/itex] to be timelike. The above derivation is valid for any value of [itex]r[/itex] except [itex]r = 0[/itex] (which won’t work because [itex]r[/itex] appears in the denominator of the EFE components) and [itex]r = 2m[/itex] (because the solution for [itex]g_{rr}[/itex] becomes singular there). In particular, it is valid for [itex]r < 2m[/itex], and when [itex]r < 2m[/itex], the signs of [itex]g_{tt}[/itex] and [itex]g_{rr}[/itex] switch, so the [itex]t[/itex] coordinate is not timelike there, even though [itex]\partial / \partial t[/itex] is still a Killing vector field (because the metric is still independent of [itex]t[/itex]).
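The sign switch is easy to see by plugging numbers into the final line element. The sketch below assumes [itex]m > 0[/itex] and uses hypothetical helper names; it simply evaluates [itex]g_{tt}[/itex] and [itex]g_{rr}[/itex] and reports which of the two coordinates has the negative coefficient at a given [itex]r[/itex].

```python
def g_tt(r, m):
    """Coefficient of dt^2 in the Schwarzschild line element."""
    return -(1.0 - 2.0 * m / r)

def g_rr(r, m):
    """Coefficient of dr^2 in the Schwarzschild line element."""
    return 1.0 / (1.0 - 2.0 * m / r)

def timelike_coordinate(r, m):
    """Return 't' or 'r', whichever coordinate has a negative coefficient."""
    if r <= 0 or r == 2 * m:
        raise ValueError("the chart does not cover r = 0 or r = 2m")
    return "t" if g_tt(r, m) < 0 else "r"

m = 1.0  # test value, assumed positive
for r in (3.0, 1.0):
    print(r, g_tt(r, m), g_rr(r, m), timelike_coordinate(r, m))
```

At [itex]r = 3m[/itex] the negative coefficient belongs to [itex]dt^2[/itex]; at [itex]r = m[/itex] it belongs to [itex]dr^2[/itex]. In both cases nothing depends on [itex]t[/itex].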

Nice work Peter!

Nice job. For comparison, I have a proof in my GR book [URL]http://www.lightandmatter.com/genrel/[/URL] , section 7.4, but it’s really very similar. Some other statements and proofs of the theorem that I’ve seen:

Birkhoff’s original proof, in Birkhoff, Relativity and Modern Physics, 1923. A horrible, long monstrosity with an out-of-date attitude toward the significance of coordinates.

Hawking and Ellis: “Any C^2 solution of Einstein’s empty space equations which is spherically symmetric in an open set V, is locally equivalent to part of the maximally extended Schwarzschild solution in V.” The part about “maximally extended” is a good point — I always tend to think about just part of the Schwarzschild spacetime (2 of the 4 regions) and forget that it can be extended.

[URL]http://arxiv.org/abs/gr-qc/0408067[/URL] — “Schwarzschild and Birkhoff a la Weyl,” Deser and Franklin. Birkhoff’s thm is equivalent to proving that the m in the Schwarzschild metric is constant.

As you point out, the existence of the ##\partial_t## Killing vector doesn’t mean that the spacetime is static. However, it *is* asymptotically static, which is kind of the only nontrivial thing being proved. If we knew in advance that it was asymptotically static, then Birkhoff’s theorem would amount to no more than the usual derivation of the Schwarzschild metric. Essentially we’re seeing that there’s no such thing as gravitational monopole radiation.

For a really rigorous proof, I think one needs to deal with the possibility that the metric coefficients blow up or go to zero, and show that these would be only coordinate singularities, but I don’t do that either, just mention it in a footnote.

[QUOTE="bcrowell, post: 5202410, member: 211768"]Nice job.[/QUOTE]

Thanks!

[QUOTE="bcrowell, post: 5202410, member: 211768"]For a really rigorous proof, I think one needs to deal with the possibility that the metric coefficients blow up or go to zero[/QUOTE]

Yes, this is true, and IIRC MTW do deal with this in their proof. I think what they do still goes the same way once the wart in their proof is removed as I (and you) remove it. :wink:

Very nice indeed, but you can save MTW’s ansatz (I never understood why it is used in so many textbooks instead of your way, which is more general) by allowing the exponents to become complex. Since the exponentials must be real in the pseudometric, this amounts to adding ##2 n \pi \mathrm{i}## (which changes nothing) or ##(2n+1) \pi \mathrm{i}## with ##n \in \mathbb{Z}##. Then the Minkowskian signature of the metric constrains these possibilities to the solutions you gave. Of course, it’s much simpler to just use your ansatz and stay with real quantities throughout the derivation.
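A quick numerical illustration of that remark, reading "exponent" as the full exponent ##2\Phi## of the metric coefficient (that reading, and the helper name below, are assumptions): an even multiple of ##\pi \mathrm{i}## leaves ##e^{2\Phi}## unchanged, and an odd multiple flips its sign, up to floating-point rounding in the imaginary part.

```python
import cmath
import math

def coefficient(Phi, n, odd):
    """exp(2*Phi + shift), with shift = (2n+1)*pi*i if odd, else 2n*pi*i."""
    shift = (2 * n + 1 if odd else 2 * n) * math.pi * 1j
    return cmath.exp(2 * Phi + shift)

Phi = 0.3  # arbitrary real test value
print(math.exp(2 * Phi))          # the real coefficient e^{2 Phi}
print(coefficient(Phi, 2, False)) # even shift: same value, tiny imaginary part
print(coefficient(Phi, 0, True))  # odd shift: opposite sign, tiny imaginary part
```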

Here’s a question: How does this compare with Schwarzschild’s original derivation? Is the only difference that Schwarzschild started out assuming a time-independent solution, while this derivation proves that time-independence follows from the assumption of spherical symmetry?

[QUOTE="stevendaryl, post: 5208165, member: 372855"]Is the only difference that Schwarzschild started out assuming a time-independent solution, while this derivation proves that time-independence follows from the assumption of spherical symmetry?[/QUOTE]

No, although that’s one difference. The other, more important difference is that Schwarzschild’s original derivation used different coordinates; his radial coordinate, which I’ll call ##\rho##, was defined in such a way that ##\rho = 0## corresponded to the horizon, not the singularity at what we now call ##r = 0##. This led to several decades of confusion because it was not fully appreciated that (a) a given coordinate chart might not cover all of a given spacetime, and (b) coordinates in themselves have no physical meaning; the physics of any solution is contained in the invariants.

Here’s a previous PF discussion on Schwarzschild’s original solution:

[URL]https://www.physicsforums.com/threads/schwarzschilds-metric-1916.708045/page-5[/URL]

As you’ll see from some of the links and posts in this thread, there are still people who make the same error that Schwarzschild originally made, thinking that if they defined a “radial coordinate” that had value zero at the horizon, that somehow meant that, physically, there couldn’t be any other region of spacetime beneath the horizon.

[QUOTE="PeterDonis, post: 5208206, member: 197831"]No, although that’s one difference. The other, more important difference is that Schwarzschild’s original derivation used different coordinates; his radial coordinate, which I’ll call ##\rho##, was defined in such a way that ##\rho = 0## corresponded to the horizon, not the singularity at what we now call ##r = 0##. This led to several decades of confusion because it was not fully appreciated that (a) a given coordinate chart might not cover all of a given spacetime, and (b) coordinates in themselves have no physical meaning; the physics of any solution is contained in the invariants.[/QUOTE]

Thanks. With the hindsight of 100 years of GR, it’s hard to get back into the frame of mind that glosses over the distinction between coordinate-dependent features of a solution and physically meaningful features.