Where does the Einstein-Hilbert action come from?

birdhen · Nov 21, 2010

Hi,

I have been looking all over on the internet and I can't find out how the action is derived. I know it is used to derive the field equations, but why does it look the way it does? Why use the determinant of the metric, where does the 1/(16 pi G) come from?

Any help would be great,
thanks!

Phrak · Nov 21, 2010

Whoa! Why is this called the Einstein-Hilbert action? Einstein had nothing to do with it. Am I wrong? In my understanding, Einstein had everything to do with raising the question of the equivalence principle and working it to the very end, yet Hilbert snatched the brass ring away from Einstein in first publishing the stress-energy equation attributed to Einstein.

Einstein, favoring the simplest form of the stress-energy equation among many candidates, obtained the same result as Hilbert, perhaps motivated in part by Hilbert's conclusion, I would suspect. Hilbert, on the other hand, was very reluctant to take credit, and insisted that Einstein deserved it.

The action is the invention of Hilbert.

I'd like to hear, as well, from anyone, on how Hilbert came up with it. I'm glad you brought up Hilbert. Hilbert's action formulation, in my opinion, needs more attention than it gets.

arkajad · Nov 21, 2010

birdhen said:

Hi,
Why use the determinant of the metric, where does the 1/(16 pi G) come from?

Any help would be great,
thanks!

Square root of the determinant is the simplest possible volume form. R is the simplest possible scalar. 1/16 pi G or whatever is needed if you want to get Newton's law for the simplest solution.

Hilbert himself would be busy exclusively with pure mathematics without Einstein.

Daverz · Nov 21, 2010

Einstein did all the heavy lifting, so it makes sense to call it the "Einstein-Hilbert action". You might want to take a look at Pais's book on Einstein.

atyy · Nov 21, 2010

The Hilbert action comes from postulating that gravity comes from making the metric dynamical, and that the dynamical equations come from an action, which is a scalar.

There are more complex terms consistent with this idea, and the Hilbert action is only the simplest. This seems to be enough so far. The other terms probably arise as tiny quantum corrections at low energy, and make the theory perturbatively nonrenormalizable at high energies. (Look up the "Asymptotic Safety" conjecture.)

The 1/(16.pi.G) is just to match units.
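
As a rough illustration of what such additional terms look like (a sketch only; the coefficient names [itex]\alpha[/itex] and [itex]\beta[/itex] are placeholders introduced here, not something fixed by this thread), one can add curvature-squared scalars to the Hilbert term:

[tex]S[g_{ab}] = \int d^{4}x \sqrt{-g}\left( \frac{1}{16\pi G} R + \alpha R^{2} + \beta R_{ab}R^{ab} + \ ... \right)[/tex]

Each term is a scalar built from the metric, but the curvature-squared terms carry four derivatives of the metric rather than two, so they are suppressed relative to R at low energies.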

haushofer · Nov 21, 2010

You could check Sean Carroll's notes on this :) I would say that any proper book on GR explains the form of the Hilbert action.

Phrak · Nov 22, 2010

Daverz said:

Einstein did all the heavy lifting, so it makes sense to call it the "Einstein-Hilbert action". You might want to take a look at Pais's book on Einstein.

Yes, Einstein did all the heavy lifting on general relativity, but to my best knowledge, the action is Hilbert's. Does Pais say otherwise?

Daverz · Nov 22, 2010

Phrak said:

Yes, Einstein did all the heavy lifting on general relativity, but to my best knowledge, the action is Hilbert's. Does Pais say otherwise?

No. Hilbert was the first to write down the action correctly. My point was that writing down the action depended on Einstein's previous work, so the hyphenated name is justified IMO.

Phrak · Nov 22, 2010

Daverz said:

No. Hilbert was the first to write down the action correctly. My point was that writing down the action depended on Einstein's previous work, so the hyphenated name is justified IMO.

Each to his own. Here's the Voigt-Lorentz transformation for electromagnetic radiation before Lorentz came up with it.
http://en.wikipedia.org/wiki/Woldemar_Voigt

samalkhaiat · Nov 30, 2010

birdhen said:

Hi,

I have been looking all over on the internet and I can't find out how the action is derived. I know it is used to derive the field equations, but why does it look the way it does?

This is a very good question! Indeed, by deriving the action we can learn a lot about the axiomatic structure of the theory. I do not know why textbooks (including the good ones) shy away from deriving the E-H action. I think it can be done, and since it is snowing, I might have just enough time tonight to do it for you. I will post the result if I manage to do it.

regards

sam

arkajad · Nov 30, 2010

Perhaps it is worthwhile to notice that a particular form of the action density will depend on what are the independent variables of your theory. Is it just a metric tensor? Or is it a metric and a connection as independent variables? Or, perhaps, only a connection? There are many versions and options. Einstein-Hilbert is, of course, the one that is best known for historical reasons.

samalkhaiat · Dec 5, 2010

samalkhaiat said:

This is a very good question! Indeed, by deriving the action we can learn a lot about the axiomatic structure of the theory. I do not know why textbooks (including the good ones) shy away from deriving the E-H action. I think it can be done, and since it is snowing, I might have just enough time tonight to do it for you. I will post the result if I manage to do it.

regards

sam

First: some introduction;

Let us assume that all dynamical properties of our space-time are comprised in the metric tensor, [itex]g_{ab}(x)[/itex], which, at the same time, characterizes the behaviour of measuring apparatus. Thus, in a field-like theory, the metric tensor together with its derivatives to finite order can be taken as “dynamical” type variables provided an appropriate SCALAR “Lagrangian” can be constructed out of them:

[tex]\mathcal{L}(x) = \mathcal{L}(g_{ab},g_{ab,c},g_{ab,cd}, \ ...)[/tex]

We always have in mind the group of general coordinate transformations subject to appropriate differentiability conditions (the manifold mapping group of diffeomorphisms). Since tensors form representations of this group, they are natural objects to serve as the building blocks of a generally covariant physical theory. To ensure the general covariance of the resulting theory, we need to form an invariant “action” integral, so that the statement [itex]\delta S[g_{ab}]= 0[/itex] is generally covariant, and so also is the dynamics derived from this statement. However, the integral [itex]\int_{D} \ d^{4}x \mathcal{L}(x)[/itex], over an invariantly fixed domain D, is not invariant when [itex]\mathcal{L}(x)[/itex] is merely a scalar, because

[tex]\int_{D} d^{4}x \mathcal{L}(x) = \int_{\bar{D}} d^{4}\bar{x}\ J(\frac{x}{\bar{x}}) \bar{\mathcal{L}}(\bar{x}) \neq \int_{\bar{D}} d^{4}\bar{x} \bar{\mathcal{L}}(\bar{x})[/tex]

when the Jacobian [itex]J(x/ \bar{x}) \neq 1[/itex]. So we need some quantity [itex]a(x)[/itex] such that

[tex]\int_{D}d^{4}x a(x) \mathcal{L}(x) = \int_{\bar{D}}d^{4}\bar{x} \bar{a}(\bar{x}) \bar{\mathcal{L}}(\bar{x})[/tex]

holds, i.e., [itex]a(x)[/itex] needs to be a scalar density transforming as

[tex]\bar{a}(\bar{x}) = J(\frac{x}{\bar{x}}) a(x)[/tex]

Now, by taking the determinants of both sides of the transformation law of the metric tensor, we find

[tex]\bar{g}(\bar{x}) = J^{2}(\frac{x}{\bar{x}})g(x)[/tex]

Since Lorentz signature implies [itex]g = \det (g_{ab}) < 0[/itex], we have (taking [itex]J > 0[/itex])

[tex]\sqrt{- \bar{g}} = J(\frac{x}{\bar{x}}) \sqrt{-g},[/tex]

and our invariant action becomes ([itex]a(x) = \sqrt{-g}[/itex]),

[tex]S[g_{ab}] = \int_{D} d^{4}x \sqrt{- g} \mathcal{L}(x)[/tex]
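
As a quick check of this result (a sketch, assuming [itex]J(x/\bar{x}) = \det(\partial x^{a}/\partial \bar{x}^{b}) > 0[/itex] so that no absolute value is needed), combine the transformation of [itex]\sqrt{-g}[/itex] with that of the scalar, [itex]\bar{\mathcal{L}}(\bar{x}) = \mathcal{L}(x)[/itex]:

[tex]\int_{\bar{D}} d^{4}\bar{x} \sqrt{-\bar{g}} \ \bar{\mathcal{L}}(\bar{x}) = \int_{\bar{D}} d^{4}\bar{x} \ J(\frac{x}{\bar{x}}) \sqrt{-g} \ \mathcal{L}(x) = \int_{D} d^{4}x \sqrt{-g} \ \mathcal{L}(x)[/tex]

where the last step uses [itex]d^{4}x = J(x/\bar{x}) \ d^{4}\bar{x}[/itex].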

By varying the action with respect to the metric tensor, we get the following E-L equation of motion;

[tex]\frac{\partial \hat{\mathcal{L}}}{\partial g_{ab}} - \partial_{c}\left( \frac{\partial \hat{\mathcal{L}}}{\partial g_{ab,c}}\right) + \partial_{c}\partial_{d}\left( \frac{\partial \hat{\mathcal{L}}}{\partial g_{ab,cd}}\right) - \ ... = 0 \ \ (1)[/tex]

where [itex]\hat{\mathcal{L}} = \sqrt{-g}\mathcal{L}[/itex].

Deriving the form of the Lagrangian:

This will be based on the following principles;

1) The principle of equivalence:
“At every point in an arbitrary curved space-time, we can choose a locally inertial frame in which the laws of physics take the same form as in a global inertial frame of flat space-time”; at any point p, one can choose a coordinate system such that [itex]g_{ab}(p) = \eta_{ab}[/itex] and [itex]g_{ab,c}(p) = 0[/itex]. The principle states that in the neighbourhood of this point, the physics is Lorentzian.

2) Prejudice:
“The metric tensor obeys a second order partial differential equation”; since almost all of the differential equations of physics are second order, we may regard the above prejudice as a principle and see what it implies.

3) The principle of general covariance:
“The form of physical laws is invariant under the group of general coordinate transformations”. The covariance (form invariance) means that physical laws must be tensorial. To apply the principle, we need a mathematical representation of it. The obvious choice is;

[tex]\eta_{ab} \rightarrow g_{ab}(x)[/tex]
[tex]\partial_{a} \rightarrow \nabla_{a}[/tex]

So, the principle of general covariance represents a technical way to express the transition from SR to GR.
Notice that we have already used the principle of general covariance when we assumed that [itex]\mathcal{L}(x)[/itex] is some unspecified scalar.

Ok, we are ready to do the job.

First notice that [itex]g_{ab}(x)[/itex] would obey a second order differential equation if [itex]\mathcal{L}(x)[/itex] were a function of [itex]g_{ab}[/itex] and [itex]\partial_{c}g_{ab}[/itex] only. But the principle of equivalence makes it impossible to have a non-trivial scalar function [itex]\mathcal{L}(g_{ab},g_{ab,c})[/itex]; any such function can be made equal to the constant [itex]\mathcal{L}(\eta_{ab},0)[/itex] because it is always possible to set [itex]g_{ab}=\eta_{ab}[/itex] and [itex]g_{ab,c}=0[/itex] at any point by a coordinate transformation. The only way out of this is to let [itex]\mathcal{L}[/itex] depend on [itex]g_{ab}[/itex] and its first and second derivatives but demand that [itex]\frac{\partial \mathcal{L}}{\partial g_{ab,cd}}[/itex] be a function of [itex]g_{ab}[/itex] only. According to eq(1), [itex]g_{ab}[/itex] will then satisfy a second order differential equation. So, let us put [itex]\mathcal{L}[/itex] in the form;

[tex] \mathcal{L}(g_{ab},g_{ab,c},g_{ab,cd}) = g_{ab,cd}(x)A^{abcd}(g_{ab}) + B(g_{ab},g_{ab,c})[/tex]

Let us evaluate this Lagrangian in a locally inertial system; At the point [itex]x^{a}=0[/itex], we choose coordinates such that [itex]g_{ab}(0)=\eta_{ab}[/itex] and [itex]g_{ab,c}(0) = 0[/itex]. Hence

[tex]\mathcal{L}(\eta , 0 , \partial^{2} g) = g_{ab,cd}A^{abcd}(\eta) + b \ \ (2)[/tex]

where [itex]b= B(\eta , 0)[/itex].

In a new coordinate system related to the x-system by a Lorentz transformation [itex]x^{a}=\Lambda^{a}{}_{b}\bar{x}^{b}[/itex], we still have [itex]\bar{g}_{ab}=\eta_{ab}[/itex] and [itex]\bar{g}_{ab,c}=0[/itex], but

[tex]\bar{g}_{ab,cd} = \Lambda^{m}{}_{a}\Lambda^{n}{}_{b}\Lambda^{p}{}_{c}\Lambda^{q}{}_{d}g_{mn,pq} \ \ (3)[/tex]

Since [itex]\mathcal{L}[/itex] is a Lorentz scalar, we must have

[tex]\bar{g}_{ab,cd}A^{abcd} = g_{ab,cd}A^{abcd} \ \ (4)[/tex]

Eq(3) and Eq(4) imply that [itex]A^{abcd}(\eta)[/itex] is an invariant Lorentz tensor. In 4-dimensional space-time, the most general rank-4 invariant tensor is

[tex]A^{abcd} = a\eta^{ab}\eta^{cd} + a_{1}\eta^{ac}\eta^{bd} + a_{2}\eta^{ad}\eta^{bc} + a_{3}\epsilon^{abcd}[/tex]

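The [itex]\epsilon^{abcd}[/itex] term drops out because [itex]g_{ab,cd}[/itex] is symmetric in (ab), and the symmetry in (cd) makes the [itex]a_{1}[/itex] and [itex]a_{2}[/itex] contractions equal, so with [itex]c = a_{1} + a_{2}[/itex] we can write Eq(2) in the form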

[tex]\mathcal{L} = g_{ab,cd} \left( a \eta^{ab}\eta^{cd} + c \eta^{ac} \eta^{bd}\right) + b[/tex]

where a and c are constants.

Next, we go to yet another (locally inertial) coordinate system related to the x-system by

[tex] x^{a} = \bar{x}^{a} + (1/6) \eta^{ae}C_{ebcd}\bar{x}^{b}\bar{x}^{c}\bar{x}^{d}[/tex]

where the constant [itex]C_{ebcd}[/itex] is symmetric in b, c and d. With some boring calculation, one finds, at [itex]\bar{x}^{a}=0[/itex], that

[tex]\bar{g}_{ab,cd} = g_{ab,cd} + C_{abcd} + C_{bacd}[/tex]

Now, it is easy to see that the invariance of [itex]\mathcal{L}[/itex] under this transformation,

[tex]\mathcal{L}(\eta, 0, \bar{g}_{ab,cd}) = \mathcal{L}(\eta, 0, g_{ab,cd}),[/tex]

implies [itex]a = -c[/itex].
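
One way to see this (a sketch, using only the symmetry of [itex]C_{abcd}[/itex] in its last three indices): the change in the Lagrangian is

[tex]\bar{\mathcal{L}} - \mathcal{L} = \left( C_{abcd} + C_{bacd}\right)\left( a \eta^{ab}\eta^{cd} + c \eta^{ac}\eta^{bd}\right)[/tex]

Each of the four contractions pairs the first index of C with one of the last three and pairs the remaining two with each other, so by the symmetry in b, c, d each one reduces to the same number [itex]K = \eta^{ab}\eta^{cd}C_{abcd}[/itex]:

[tex]\bar{\mathcal{L}} - \mathcal{L} = 2(a + c)K[/tex]

Since K can be made non-zero by a suitable choice of C, invariance requires [itex]a + c = 0[/itex].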

Thus, our Lagrangian becomes

[tex] \mathcal{L} = c g_{ab,cd}\left(\eta^{ac}\eta^{bd} - \eta^{ab}\eta^{cd}\right) + b[/tex]

Now comes the most difficult part: we play with the indices and rewrite the first term as

[tex] (c/2)\eta^{bc}\eta^{ae}\partial_{a}\left(g_{be,c} + g_{ce,b} - g_{bc,e}\right) - (c/2)\eta^{ae}\eta^{bc}\partial_{c}\left(g_{be,a} + g_{ae,b} - g_{ba,e}\right)[/tex]

and, after introducing the connection coefficients, we conclude that in a locally inertial frame our Lagrangian has the form

[tex]\mathcal{L}= c \eta^{bc}\left(\partial_{a}\Gamma^{a}_{bc} - \partial_{c}\Gamma^{a}_{ba}\right) + b[/tex]
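
For anyone who wants to check the index play, here is a sketch. It assumes the usual Christoffel symbol [itex]\Gamma^{a}_{bc} = \frac{1}{2}g^{ae}(g_{be,c} + g_{ce,b} - g_{bc,e})[/itex] and uses [itex]g_{ab} = \eta_{ab}[/itex], [itex]g_{ab,c} = 0[/itex] at the point, so that derivatives of [itex]g^{ae}[/itex] drop out:

[tex]\eta^{bc}\partial_{a}\Gamma^{a}_{bc} = \eta^{bc}\eta^{ae}g_{be,ca} - \frac{1}{2}\eta^{bc}\eta^{ae}g_{bc,ea}[/tex]

[tex]\eta^{bc}\partial_{c}\Gamma^{a}_{ba} = \frac{1}{2}\eta^{bc}\eta^{ae}g_{ae,bc}[/tex]

Subtracting and relabelling dummy indices,

[tex]\eta^{bc}\left(\partial_{a}\Gamma^{a}_{bc} - \partial_{c}\Gamma^{a}_{ba}\right) = \eta^{bc}\eta^{ae}\left(g_{be,ca} - g_{bc,ae}\right) = g_{ab,cd}\left(\eta^{ac}\eta^{bd} - \eta^{ab}\eta^{cd}\right)[/tex]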

Finally, we are in good position to use the principle of general covariance and write an expression for [itex]\mathcal{L}(x)[/itex] which holds true in any (completely arbitrary) reference frame

[tex]\mathcal{L}(x) = c R + b \ \ (5)[/tex]

where R is the (curvature) scalar

[tex] R = g^{ab}\left( \partial_{c}\Gamma^{c}_{ab} - \partial_{b}\Gamma^{c}_{ac} + \Gamma^{c}_{cd}\Gamma^{d}_{ab} - \Gamma^{c}_{bd}\Gamma^{d}_{ac}\right) [/tex]

The values of c and b must be determined by experiment. For [itex]b = 0[/itex], the value of c can be determined by comparing the Newtonian limit of the theory with Newton’s gravity;

[tex]c = \frac{1}{16 \pi G}[/tex]

Thus, for [itex]b = \frac{\Lambda}{16\pi G} \neq 0[/itex], we arrive at the E-H action

[tex]S[g_{ab}] = \frac{1}{16\pi G} \int d^{4}x \sqrt{-g} ( R + \Lambda)[/tex]
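
To make the matching with Newton concrete, here is a sketch for [itex]\Lambda = 0[/itex]. The matter action [itex]S_{m}[/itex], the definition [itex]T^{ab} = \frac{2}{\sqrt{-g}}\frac{\delta S_{m}}{\delta g_{ab}}[/itex] and the signature (-,+,+,+) are assumptions made for this illustration; none of them is fixed above, and other conventions move signs around. Varying [itex]S + S_{m}[/itex] with respect to the metric gives

[tex]R_{ab} - \frac{1}{2}g_{ab}R = 8\pi G \ T_{ab}[/tex]

For slowly moving dust of density [itex]\rho[/itex] in a weak static field with [itex]g_{00} = -(1 + 2\Phi)[/itex], the 00 component reduces to

[tex]\nabla^{2}\Phi = 4\pi G \rho[/tex]

which is Newton's law of gravity in Poisson form, and this is what fixes [itex]c = 1/(16\pi G)[/itex].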

So, in spite of all the fuss, general covariance seems to be a powerful and important principle which determines acceptable forms of physical laws.

dextercioby · Dec 6, 2010

Hi, Sam. Great post! The best contribution I've seen on PF this year. I followed your approach closely but couldn't replicate your calculations hidden under <With some boring calculation, one finds, ..., that ...>. I mean exactly how you're getting the 2 C's linking the 2 metrics in that formula.

Could you spell them out or post a strong hint as to how to start them?

Thanks

arkajad · Dec 6, 2010

samalkhaiat said:

[tex]\bar{a}(\bar{x}) = J(\frac{x}{\bar{x}}) a(x)[/tex]

Now, by taking the determinants of both sides of the transformation law of the metric tensor, we find

[tex]\bar{g}(\bar{x}) = J^{2}(\frac{x}{\bar{x}})g(x)[/tex]

Well, can't it be, say, the determinant of the Ricci tensor? What then?

dextercioby · Dec 6, 2010

Wouldn't the theory be more than second order in derivatives, if the Lagrangian density would be the product between the sqrt of the Ricci tensor's determinant and the Ricci scalar ? My guess is yes. It's the same reason we would reject quadratic, cubic,... terms in the Ricci scalar.

arkajad · Dec 6, 2010

bigubau said:

Wouldn't the theory be more than second order in derivatives, if the Lagrangian density would be the product between the sqrt of the Ricci tensor's determinant and the Ricci scalar ? My guess is yes. It's the same reason we would reject quadratic, cubic,... terms in the Ricci scalar.

Perhaps, if you have det of the Ricci tensor, you will not need the Ricci scalar anymore?

dextercioby · Dec 6, 2010

Are you suggesting that the Lagrangian density would simply be the sqrt of the Ricci tensor's determinant ? If so, then the theory wouldn't be local anymore, it would have an infinite number of derivatives. And the determinant couldn't be without the sqrt acting on it, because it wouldn't be scalar, but a scalar density of weight 2.

arkajad · Dec 6, 2010

What is wrong with square root? You are using square root anyway. Why would the theory have an infinite number of derivatives? In fact, to take square root of the det(Ricci), you do not need metric at all. You just need a connection. One can start with a symmetric one.
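
To spell out the counting behind this (a sketch; writing [itex]R_{ab}(\Gamma)[/itex] for the Ricci tensor built from a connection alone is notation introduced here): since [itex]R_{ab}[/itex] transforms like the metric, its determinant picks up the same factor,

[tex]\det(\bar{R}_{ab}) = J^{2}(\frac{x}{\bar{x}}) \det(R_{ab})[/tex]

so [itex]\sqrt{|\det R_{ab}|}[/itex] is a scalar density of weight 1, and

[tex]S[\Gamma] = \int d^{4}x \sqrt{|\det R_{ab}(\Gamma)|}[/tex]

is invariant without any metric. This is the kind of purely affine action considered by Eddington and Schrödinger.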

dextercioby · Dec 6, 2010

True, in the most general case, the Ricci tensor (and the determinant of its matrix in a coordinate basis) can be defined independently of a metric on the manifold. So what's the Lagrangian density you propose? And what degree in the derivatives are its Euler-Lagrange equations?

arkajad · Dec 7, 2010

bigubau said:

True, in the most general case, the Ricci tensor (and the determinant of its matrix in a coordinate basis) can be defined independently of a metric on the manifold. So what's the Lagrangian density you propose? And what degree in the derivatives are its Euler-Lagrange equations?

From Schrödinger's "Space-Time Structure", Cambridge University Press 1950, p. 113, Chapter on "Pure affine theory"

See also:

http://arxiv.org/abs/gr-qc/0406088
Authors: Jerzy Kijowski, Roman Werpachowski

Abstract: Affine variational principle for General Relativity, proposed in 1978 by one of us (J.K.), is a good remedy for the non-universal properties of the standard, metric formulation, arising when the matter Lagrangian depends upon the metric derivatives. Affine version of the theory cures the standard drawback of the metric version, where the leading (second order) term of the field equations depends upon matter fields and its causal structure violates the light cone structure of the metric. Choosing the affine connection (and not the metric one) as the gravitational configuration, simplifies considerably the canonical structure of the theory and is more suitable for purposes of its quantization along the lines of Ashtekar and Lewandowski (see this http URL). We show how the affine formulation provides a simple method to handle boundary integrals in general relativity theory.

samalkhaiat · Dec 7, 2010

bigubau said:

Hi, Sam. Great post! The best contribution I've seen on PF this year. I followed your approach closely but couldn't replicate your calculations hidden under <With some boring calculation, one finds, ..., that ...>. I mean exactly how you're getting the 2 C's linking the 2 metrics in that formula.

Could you spell them out or post a strong hint as to how to start them?

Thanks

I have not done the calculation. It looked obvious to me. But don’t worry, below I will show you how I knew the result without doing the actual calculation. But first, let us try to derive that particular equation.

Let us write the transformation of metric

[tex]\bar{g}_{ab}(\bar{x}) = \frac{\partial x^{c}}{\partial \bar{x}^{a}} \frac{\partial x^{d}}{\partial \bar{x}^{b}} g_{cd}(x)[/tex]

do the differentiations of coordinate transformation, you get

[tex]\bar{g}_{ab}(\bar{x}) = g_{ab}(x) + \frac{1}{2} g_{ae} \eta^{en} C_{nbpq} \bar{x}^{p} \bar{x}^{q} + \frac{1}{2} g_{eb} \eta^{en} C_{napq} \bar{x}^{p} \bar{x}^{q} + O^{4}(\bar{x})[/tex]

Differentiate with respect to [itex]\bar{x}^{c}[/itex] you find

[tex]\bar{g}_{ab,c} = \bar{\partial}_{c} g_{ab}(x) + g_{ae}\eta^{en} C_{nbpc} \bar{x}^{p} + g_{eb} \eta^{en} C_{napc} \bar{x}^{p} + F[/tex]

where F represents a collection of some junk that will vanish when we evaluate the final expression at [itex]\bar{x} = 0[/itex];

[tex]F = O^{2}(\bar{x}) \bar{\partial}g + \ ...[/tex]

Now take the second derivative and collect more junk

[tex]\bar{g}_{ab,cd} = g_{ab,cd} + g_{ae}\eta^{en}C_{nbcd} + g_{eb} \eta^{en} C_{nacd} + G(\bar{x}) \ (1)[/tex]

where G represents something of order

[tex]G = \bar{\partial} F + O(\bar{x}) \bar{\partial}g[/tex]

At [itex]\bar{x} = 0[/itex] we can put [itex]g_{ab} = \eta_{ab}[/itex] in eq(1), and since the junk G(0) vanishes, we find our result

[tex]\bar{g}_{ab,cd} = g_{ab,cd} + \delta^{n}_{a}C_{nbcd} + \delta^{n}_{b}C_{nacd}[/tex]

****

Ok, let me now tell you how I was able to figure out the result without doing any calculation.
I knew if I started with the linear transformation

[tex]x^{a} = \bar{x}^{a} + \eta^{an}C_{nm}\bar{x}^{m}[/tex]

I would find

[tex]\bar{g}_{ab} = g_{ab} + C_{ab} + C_{ba}[/tex]
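
Here is a sketch of that linear case, keeping only terms of first order in [itex]C_{ab}[/itex] (dropping the second-order term is an assumption about C being small, which the line above leaves implicit):

[tex]\frac{\partial x^{c}}{\partial \bar{x}^{a}} = \delta^{c}_{a} + \eta^{cn}C_{na}[/tex]

[tex]\bar{g}_{ab} = g_{ab} + \eta^{cn}C_{na}g_{cb} + \eta^{dn}C_{nb}g_{ad} + O(C^{2})[/tex]

Setting [itex]g_{ab} = \eta_{ab}[/itex] turns the two middle terms into [itex]C_{ba} + C_{ab}[/itex].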

I also knew that if I considered the quadratic transformation

[tex]x^{a} = \bar{x}^{a} + \frac{1}{2} \eta^{an} C_{npq} \bar{x}^{p} \bar{x}^{q}[/tex]

I would get for the 1st derivative (at [itex]\bar{x}=0[/itex])

[tex]\bar{g}_{ab,c} = g_{ab,c} + C_{abc} + C_{bac}[/tex]

So, if you can see the pattern that I see, you will be able to figure out how the 2nd derivative transforms at [itex]\bar{x}=0[/itex] when you consider the transformation

[tex]x^{a} = \bar{x}^{a} + \frac{1}{6}\eta^{an}C_{nmpq}\bar{x}^{m}\bar{x}^{p}\bar{x}^{q}[/tex]

regards

birdhen · Dec 17, 2010

Wow, I have just come back to look at this post, thank you for all of your help!

DiracLandau · Dec 18, 2010

Good post by samalkhaiat.
I think it would be wrong to use the word 'derive' for any action or any Lagrangian, as the Lagrangian formalism is just a formalism for arriving at the fundamental equations and not a deduction. Rather, one can talk of an intelligent and intuitive way of guessing a proper Lagrangian, which gives the required equations.
For a good, intuitive way of arriving at the right Lagrangian, take a look at the classic course in theoretical physics by Landau & Lifshitz, volume 2, The Classical Theory of Fields, in the chapter
"Gravitational Field equations" and the section titled "The action function for the gravitational field".

Also, for the axiomatic formulation of the theory of gravitation, see one more classic by Hawking and Ellis, "The Large Scale Structure of Space-Time", where the postulates are given as
1. Local causality
2. Local conservation of energy-momentum
3. The Einstein field equations
and obviously the axiomatic formulation will also be given in MTW.

arkajad · Dec 18, 2010

The following came to my mind. As it is discussed in another thread geodesic equations can be 'derived' from conservation laws. No particular Lagrangian is needed. Conservation laws, on the other hand, are derived from invariance under diffeomorphisms. In fact geodesic equations can be derived directly from invariance under diffeomorphisms.

Can we do the same for field equations? Just some invariance of some higher level, applied to fields rather than to test particle trajectories? One possible objection could be that we can have different Lagrangians for GR, for instance f(R) type, and they lead to different field equations. So, how can field equations be derived from one invariance principle?

But this objection fails if you look carefully at the derivation of geodesic equations from diffeomorphism invariance. Here too the resulting equations are not unique if you allow for higher order locality. For instance, you can take into account spin, or internal tension of "test particles". It is only when you completely neglect the internal structure of test particles that you arrive at simple geodesic equations for structureless spinless particles.

Just a thought.

P.S. It is not excluded that someone already did it using higher order jet bundles or something like that.
