Where does the Einstein-Hilbert action come from?

  • Thread starter birdhen
35
0
Hi,

I have been looking all over on the internet and I can't find out how the action is derived. I know it is used to derive the field equations, but why does it look the way it does? Why use the determinant of the metric, where does the 1/(16 pi G) come from?

Any help would be great,
thanks!
 
4,222
1
Whoa! Why is this called the Einstein-Hilbert action? Einstein had nothing to do with it. Am I wrong? In my understanding, Einstein had everything to do with raising the question of the equivalence principle and working it to the very end, yet Hilbert snatched the brass ring away from Einstein in first publishing the stress-energy equation attributed to Einstein.

Einstein, favoring the simplest form of the stress-energy equation among many candidates, obtained the same result as Hilbert, perhaps motivated in part by Hilbert's conclusion, I would suspect. Hilbert, on the other hand, was very dismissive about taking credit, and insisted that Einstein deserved it.

The action is the invention of Hilbert.

I'd like to hear, as well, from anyone, on how Hilbert came up with it. I'm glad you brought up Hilbert. Hilbert's action formulation, in my opinion, needs more attention than it gets.
 
1,444
4
Hi,
Why use the determinant of the metric, where does the 1/(16 pi G) come from?

Any help would be great,
thanks!
Square root of the determinant is the simplest possible volume form. R is the simplest possible scalar. 1/16 pi G or whatever is needed if you want to get Newton's law for the simplest solution.

Hilbert himself would be busy exclusively with pure mathematics without Einstein.
 
987
54
Einstein did all the heavy lifting, so it makes sense to call it the "Einstein-Hilbert action". You might want to take a look at Pais's book on Einstein.
 

atyy

Science Advisor
13,495
1,609
The Hilbert action comes from postulating that gravity comes from making the metric dynamical, and that the dynamical equations come from an action, which is a scalar.

There are more complex terms consistent with this idea, and the Hilbert action is only the simplest. This seems to be enough so far; the other terms probably arise as tiny quantum corrections at low energy, and make the theory perturbatively nonrenormalizable at high energies. (Look up the "Asymptotic Safety" conjecture.)

The 1/(16 pi G) is just to match units.
 

haushofer

Science Advisor
Insights Author
2,218
558
You could check Sean Carroll's notes on this :) I would say that any proper book on GR explains the form of the Hilbert action.
 
4,222
1
Einstein did all the heavy lifting, so it makes sense to call it the "Einstein-Hilbert action". You might want to take a look at Pais's book on Einstein.
Yes, Einstein did all the heavy lifting on general relativity, but to my best knowledge, the action is Hilbert's. Does Pais say otherwise?
 
987
54
Yes, Einstein did all the heavy lifting on general relativity, but to my best knowledge, the action is Hilbert's. Does Pais say otherwise?
No. Hilbert was the first to write down the action correctly. My point was that writing down the action depended on Einstein's previous work, so the hyphenated name is justified IMO.
 
4,222
1
No. Hilbert was the first to write down the action correctly. My point was that writing down the action depended on Einstein's previous work, so the hyphenated name is justified IMO.
Each to his own. Here's the Voigt-Lorentz transformation for electromagnetic radiation before Lorentz came up with it.
http://en.wikipedia.org/wiki/Woldemar_Voigt
 

samalkhaiat

Science Advisor
Insights Author
1,615
804
Hi,

I have been looking all over on the internet and I can't find out how the action is derived. I know it is used to derive the field equations, but why does it look the way it does?
This is a very good question! Indeed, by deriving the action, we can learn a lot about the axiomatic structure of the theory. I do not know why textbooks (including the good ones) shy away from deriving the E-H action. I think it can be done, and since it is snowing, I might have just enough time tonight to do it for you. I will post the result if I manage to do it.

regards

sam
 
1,444
4
Perhaps it is worthwhile to notice that a particular form of the action density will depend on what are the independent variables of your theory. Is it just a metric tensor? Or is it a metric and a connection as independent variables? Or, perhaps, only a connection? There are many versions and options. Einstein-Hilbert is, of course, the one that is best known for historical reasons.
 

samalkhaiat

Science Advisor
Insights Author
1,615
804
First: some introduction;

Let us assume that all dynamical properties of our space-time are comprised in the metric tensor, [itex]g_{ab}(x)[/itex], which, at the same time, characterizes the behaviour of measuring apparatus. Thus, in a field-like theory, the metric tensor together with its derivatives to finite order can be taken as “dynamical” type variables provided an appropriate SCALAR “Lagrangian” can be constructed out of them:

[tex]\mathcal{L}(x) = \mathcal{L}(g_{ab},g_{ab,c},g_{ab,cd}, \ ...)[/tex]

We always have in mind the group of general coordinate transformations subject to appropriate differentiability conditions (the manifold mapping group of diffeomorphisms). Since tensors form representations of this group, they are natural objects to serve as the building blocks of a generally covariant physical theory. To ensure the general covariance of the resulting theory, we need to form an invariant “action” integral, so that the statement [itex]\delta S[g_{ab}]= 0[/itex] is generally covariant, and so also is the dynamics derived from this statement. However, the integral [itex] \int_{D} \ d^{4}x \mathcal{L}(x)[/itex], over an invariantly fixed domain D, is not invariant when [itex]\mathcal{L}(x)[/itex] is a scalar quantity, because

[tex]\int_{D} d^{4}x \mathcal{L}(x) = \int_{\bar{D}} d^{4}\bar{x}\ J(\frac{x}{\bar{x}}) \bar{\mathcal{L}}(\bar{x}) \neq \int_{\bar{D}} d^{4}\bar{x} \bar{\mathcal{L}}(\bar{x})[/tex]

when the Jacobian [itex]J(x/ \bar{x}) \neq 1[/itex]. So we need some quantity [itex]a(x)[/itex] such that

[tex]\int_{D}d^{4}x a(x) \mathcal{L}(x) = \int_{\bar{D}}d^{4}\bar{x} \bar{a}(\bar{x}) \bar{\mathcal{L}}(\bar{x})[/tex]

holds, i.e., [itex]a(x)[/itex] needs to be a scalar density transforming as

[tex]\bar{a}(\bar{x}) = J(\frac{x}{\bar{x}}) a(x)[/tex]

Now, by taking the determinants of both sides of the transformation law of the metric tensor, we find

[tex]\bar{g}(\bar{x}) = J^{2}(\frac{x}{\bar{x}})g(x)[/tex]
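
As a brief sketch of that step (the matrix name M is introduced here only for illustration): writing [itex]M^{c}{}_{a} = \partial x^{c}/\partial \bar{x}^{a}[/itex], the transformation law of the metric is a matrix product, and the determinant of a product factorizes:

[tex]\bar{g}_{ab} = M^{c}{}_{a}\, g_{cd}\, M^{d}{}_{b} \ \ \Rightarrow \ \ \det(\bar{g}_{ab}) = (\det M)^{2} \det(g_{cd}), \qquad \det M = J(\frac{x}{\bar{x}})[/tex]

Here J is assumed positive (orientation-preserving transformations), which is what allows the square root to be taken without an absolute value.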

Since Lorentz signature implies [itex]g = \det (g_{ab}) < 0[/itex], we have

[tex]\sqrt{- \bar{g}} = J(\frac{x}{\bar{x}}) \sqrt{-g},[/tex]

and our invariant action becomes ([itex] a(x) = \sqrt{-g}[/itex]),

[tex]S[g_{ab}] = \int_{D} d^{4}x \sqrt{- g} \mathcal{L}(x)[/tex]

By varying the action with respect to the metric tensor, we get the following E-L equation of motion;

[tex]\frac{\partial \hat{\mathcal{L}}}{\partial g_{ab}} - \partial_{c}\left( \frac{\partial \hat{\mathcal{L}}}{\partial g_{ab,c}}\right) + \partial_{c}\partial_{d}\left( \frac{\partial \hat{\mathcal{L}}}{\partial g_{ab,cd}}\right) - \ ... = 0 \ \ (1)[/tex]

where [itex]\hat{\mathcal{L}} = \sqrt{-g}\mathcal{L}[/itex].

Deriving the form of the Lagrangian:

This will be based on the following principles;

1) The principle of equivalence:
“At every point in an arbitrary curved space-time, we can choose a locally inertial frame in which the laws of physics take the same form as in a global inertial frame of flat space-time”; at any point p, one can choose a coordinate system such that [itex]g_{ab}(p) = \eta_{ab}[/itex] and [itex]g_{ab,c}(p) = 0[/itex]. The principle states that in the neighbourhood of this point, the physics is Lorentzian.

2) Prejudice:
“The metric tensor obeys a second order partial differential equation”; since almost all of the differential equations of physics are second order, we may regard the above prejudice as a principle and see what it implies.

3) The principle of general covariance:
“The form of physical laws is invariant under the group of general coordinate transformations”. The covariance (form invariance) means that physical laws must be tensorial. To apply the principle, we need a mathematical representation of it. The obvious choice is;

[tex]\eta_{ab} \rightarrow g_{ab}(x)[/tex]
[tex]\partial_{a} \rightarrow \nabla_{a}[/tex]

So, the principle of general covariance represents a technical way to express the transition from SR to GR.
Notice that we have already used the principle of general covariance when we assumed that [itex]\mathcal{L}(x)[/itex] is some unspecified scalar.

Ok, we are ready to do the job.

First notice that [itex]g_{ab}(x)[/itex] would obey a second order differential equation if [itex]\mathcal{L}(x)[/itex] were a function of [itex]g_{ab}[/itex] and [itex]\partial_{c}g_{ab}[/itex] only. But the principle of equivalence makes it impossible to have a non-trivial scalar function [itex]\mathcal{L}(g_{ab},g_{ab,c})[/itex]; any such function can be made equal to the constant [itex]\mathcal{L}(\eta_{ab},0)[/itex] because it is always possible to set [itex]g_{ab}=\eta_{ab}[/itex] and [itex]g_{ab,c}=0[/itex] at any point by a coordinate transformation. The only way out of this is to let [itex]\mathcal{L}[/itex] depend on [itex]g_{ab}[/itex] and its first and second derivatives, but demand that [itex] \frac{\partial \mathcal{L}}{\partial g_{ab,cd}}[/itex] be a function of [itex]g_{ab}[/itex] only. According to eq(1), [itex]g_{ab}[/itex] will then satisfy a second order differential equation. So, let us put [itex]\mathcal{L}[/itex] in the form

[tex]
\mathcal{L}(g_{ab},g_{ab,c},g_{ab,cd}) = g_{ab,cd}(x)A^{abcd}(g_{ab}) + B(g_{ab},g_{ab,c})
[/tex]
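
To see why this form keeps the field equation second order, here is a short sketch that uses only eq(1) and the decomposition above. The factor [itex]\sqrt{-g}[/itex] appears because eq(1) is written for [itex]\hat{\mathcal{L}}[/itex], and index symmetrization is glossed over:

[tex]\frac{\partial \hat{\mathcal{L}}}{\partial g_{ab,cd}} = \sqrt{-g}\, A^{abcd}(g) \ \ \Rightarrow \ \ \partial_{c}\partial_{d}\left(\sqrt{-g}\, A^{abcd}\right) = \frac{\partial (\sqrt{-g}A^{abcd})}{\partial g_{mn}}\, g_{mn,cd} + \frac{\partial^{2} (\sqrt{-g}A^{abcd})}{\partial g_{mn}\,\partial g_{pq}}\, g_{mn,c}\, g_{pq,d}[/tex]

No derivative of the metric higher than the second appears in this term. The same holds for the other two terms of eq(1), because B depends on at most first derivatives.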

Let us evaluate this Lagrangian in a locally inertial system; At the point [itex]x^{a}=0[/itex], we choose coordinates such that [itex]g_{ab}(0)=\eta_{ab}[/itex] and [itex]g_{ab,c}(0) = 0[/itex]. Hence

[tex]\mathcal{L}(\eta , 0 , \partial^{2} g) = g_{ab,cd}A^{abcd}(\eta) + b \ \ (2)[/tex]

where [itex]b= B(\eta , 0)[/itex].

In a new coordinate system related to the x-system by a Lorentz transformation [itex]x^{a}=\Lambda^{a}{}_{b}\bar{x}^{b}[/itex], we still have [itex]\bar{g}_{ab}=\eta_{ab}[/itex] and [itex]\bar{g}_{ab,c}=0[/itex], but

[tex]\bar{g}_{ab,cd} = \Lambda^{m}{}_{a}\Lambda^{n}{}_{b}\Lambda^{p}{}_{c}\Lambda^{q}{}_{d}g_{mn,pq} \ \ (3)[/tex]

Since [itex]\mathcal{L}[/itex] is a Lorentz scalar, we must have

[tex]\bar{g}_{ab,cd}A^{abcd} = g_{ab,cd}A^{abcd} \ \ (4)[/tex]

Eq(3) and Eq(4) imply that [itex]A^{abcd}(\eta)[/itex] is an invariant Lorentz tensor. In 4-dimensional space-time, the most general rank-4 invariant tensor is

[tex]A^{abcd} = a\eta^{ab}\eta^{cd} + a_{1}\eta^{ac}\eta^{bd} + a_{2}\eta^{ad}\eta^{bc} + a_{3}\epsilon^{abcd}[/tex]

Using the symmetry of [itex]g_{ab,cd}[/itex] in (ab) and (cd), we can write Eq(2) in the form

[tex]\mathcal{L} = g_{ab,cd} \left( a \eta^{ab}\eta^{cd} + c \eta^{ac} \eta^{bd}\right) + b[/tex]

where a and c are constants.
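
The reduction from four constants to two can be spelled out as follows (a sketch, with c identified here as the sum [itex]a_{1} + a_{2}[/itex]). Since [itex]g_{ab,cd}[/itex] is symmetric in (ab), its contraction with the totally antisymmetric [itex]\epsilon^{abcd}[/itex] vanishes, and the symmetry in (cd) makes the [itex]a_{1}[/itex] and [itex]a_{2}[/itex] contractions equal:

[tex]g_{ab,cd}\,\epsilon^{abcd} = 0, \qquad g_{ab,cd}\,\eta^{ad}\eta^{bc} = g_{ab,dc}\,\eta^{ad}\eta^{bc} = g_{ab,cd}\,\eta^{ac}\eta^{bd} \ \ \Rightarrow \ \ c = a_{1} + a_{2}[/tex]
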
Next, we go to yet another (locally inertial) coordinate system related to the x-system by
[tex]
x^{a} = \bar{x}^{a} + (1/6) \eta^{ae}C_{ebcd}\bar{x}^{b}\bar{x}^{c}\bar{x}^{d}
[/tex]

where the constant [itex]C_{ebcd}[/itex] is symmetric in b, c and d. With some boring calculation, one finds, at [itex]\bar{x}^{a}=0[/itex], that

[tex]\bar{g}_{ab,cd} = g_{ab,cd} + C_{abcd} + C_{bacd}[/tex]
Now, it is easy to see that the invariance of [itex]\mathcal{L}[/itex] under this transformation;

[tex]\mathcal{L}(\eta, 0, \bar{g}_{ab,cd}) = \mathcal{L}(\eta, 0, g_{ab,cd}),[/tex]

implies [itex]a = -c[/itex].
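
A sketch of why this follows, using only the symmetry of [itex]C_{ebcd}[/itex] in its last three indices (the symbol X is shorthand introduced here):

[tex]\mathcal{L}(\eta,0,\bar{g}_{ab,cd}) - \mathcal{L}(\eta,0,g_{ab,cd}) = \left(C_{abcd} + C_{bacd}\right)\left(a\,\eta^{ab}\eta^{cd} + c\,\eta^{ac}\eta^{bd}\right) = 2(a + c)X, \qquad X \equiv \eta^{ab}\eta^{cd}C_{abcd}[/tex]

Each of the four contractions reduces to X after relabelling indices and using the symmetry of C. Since the constants [itex]C_{ebcd}[/itex] are arbitrary, X need not vanish, so invariance requires [itex]a + c = 0[/itex].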

Thus, our Lagrangian becomes

[tex]
\mathcal{L} = c g_{ab,cd}\left(\eta^{ac}\eta^{bd} - \eta^{ab}\eta^{cd}\right) + b[/tex]

Now comes the most difficult part: we play with the indices and rewrite the first term as

[tex]
(c/2)\eta^{bc}\eta^{ae}\partial_{a}\left(g_{be,c} + g_{ce,b} - g_{bc,e}\right) - (c/2)\eta^{ae}\eta^{bc}\partial_{c}\left(g_{be,a} + g_{ae,b} - g_{ba,e}\right)
[/tex]

and, after introducing the connection coefficients, we conclude that in a locally inertial frame our Lagrangian has the form

[tex]\mathcal{L}= c \eta^{bc}\left(\partial_{a}\Gamma^{a}_{bc} - \partial_{c}\Gamma^{a}_{ba}\right) + b[/tex]
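
For reference, the connection coefficients meant here are presumably the usual Christoffel symbols. At a point p where [itex]g_{ab} = \eta_{ab}[/itex] and [itex]g_{ab,c} = 0[/itex], derivatives acting on the inverse metric multiply vanishing first derivatives and drop out, so (a sketch):

[tex]\Gamma^{a}_{bc} = \frac{1}{2}g^{ae}\left(g_{be,c} + g_{ce,b} - g_{bc,e}\right), \qquad \partial_{d}\Gamma^{a}_{bc}\Big|_{p} = \frac{1}{2}\eta^{ae}\left(g_{be,cd} + g_{ce,bd} - g_{bc,ed}\right)[/tex]

Setting d = a in the first bracket of the Lagrangian, and using the analogous expression for [itex]\Gamma^{a}_{ba}[/itex] in the second, reproduces the two terms of the index rearrangement above.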

Finally, we are in good position to use the principle of general covariance and write an expression for [itex]\mathcal{L}(x)[/itex] which holds true in any (completely arbitrary) reference frame

[tex]\mathcal{L}(x) = c R + b \ \ (5)[/tex]

where R is the (curvature) scalar

[tex]
R = g^{ab}\left( \partial_{c}\Gamma^{c}_{ab} - \partial_{b}\Gamma^{c}_{ac} + \Gamma^{c}_{cd}\Gamma^{d}_{ab} - \Gamma^{c}_{bd}\Gamma^{d}_{ac}\right)
[/tex]

The values of c and b must be determined by experiment. For [itex] b = 0[/itex], the value of c can be determined by comparing the Newtonian limit of the theory with Newton’s gravity;

[tex]c = \frac{1}{16 \pi G}[/tex]
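
One way to see where this value comes from, sketched under assumptions not stated above: add a matter action [itex]S_{m}[/itex] and define the stress-energy tensor by [itex]T_{ab} = -\frac{2}{\sqrt{-g}}\frac{\delta S_{m}}{\delta g^{ab}}[/itex] (overall signs depend on the signature and curvature conventions). Varying [itex]c\int d^{4}x \sqrt{-g}\,R + S_{m}[/itex] with respect to [itex]g^{ab}[/itex] then gives

[tex]c\left(R_{ab} - \frac{1}{2}g_{ab}R\right) = \frac{1}{2}T_{ab}[/tex]

The weak-field, slow-motion limit reproduces Newton's [itex]\nabla^{2}\Phi = 4\pi G \rho[/itex] when the coupling is [itex]R_{ab} - \frac{1}{2}g_{ab}R = 8\pi G\, T_{ab}[/itex], so [itex]1/(2c) = 8\pi G[/itex], which is the value quoted.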

Thus, for [itex]b = \frac{\Lambda}{16\pi G} \neq 0[/itex], we arrive at the E-H action

[tex]S[g_{ab}] = \frac{1}{16\pi G} \int d^{4}x \sqrt{-g} ( R + \Lambda)[/tex]
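
As a hedged sketch of what this action implies in vacuum: using [itex]\delta \sqrt{-g} = -\frac{1}{2}\sqrt{-g}\, g_{ab}\, \delta g^{ab}[/itex] and dropping the boundary term that comes from [itex]\delta R_{ab}[/itex], the variation gives

[tex]R_{ab} - \frac{1}{2}g_{ab}R - \frac{1}{2}\Lambda g_{ab} = 0[/tex]

Many textbooks instead write the Lagrangian as [itex]R - 2\Lambda_{c}[/itex], which leads to [itex]R_{ab} - \frac{1}{2}g_{ab}R + \Lambda_{c} g_{ab} = 0[/itex]. The [itex]\Lambda[/itex] used in this post corresponds to [itex]\Lambda = -2\Lambda_{c}[/itex] in that normalization (the label [itex]\Lambda_{c}[/itex] is introduced here only for the comparison).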

So, in spite of all the fuss, general covariance seems to be a powerful and important principle which determines acceptable forms of physical laws.
 

dextercioby

Science Advisor
Homework Helper
Insights Author
12,944
525
Hi, Sam. Great post! The best contribution I've seen on PF this year. I followed your approach closely but couldn't replicate your calculations hidden under <With some boring calculation, one finds, ..., that ...>. I mean exactly how you're getting the 2 C's linking the 2 metrics in that formula.

Could you spell them out or post a strong hint as how to start them ?

Thanks
 
1,444
4
[tex]\bar{a}(\bar{x}) = J(\frac{x}{\bar{x}}) a(x)[/tex]

Now, by taking the determinants of both sides of the transformation law of the metric tensor, we find

[tex]\bar{g}(\bar{x}) = J^{2}(\frac{x}{\bar{x}})g(x)[/tex]
Well, can't it be, say, the determinant of the Ricci tensor? What then?
 

dextercioby

Science Advisor
Homework Helper
Insights Author
12,944
525
Wouldn't the theory be more than second order in derivatives, if the Lagrangian density would be the product between the sqrt of the Ricci tensor's determinant and the Ricci scalar ? My guess is yes. It's the same reason we would reject quadratic, cubic,... terms in the Ricci scalar.
 
1,444
4
Wouldn't the theory be more than second order in derivatives, if the Lagrangian density would be the product between the sqrt of the Ricci tensor's determinant and the Ricci scalar ? My guess is yes. It's the same reason we would reject quadratic, cubic,... terms in the Ricci scalar.
Perhaps, if you have det of the Ricci tensor, you will not need the Ricci scalar anymore?
 

dextercioby

Science Advisor
Homework Helper
Insights Author
12,944
525
Are you suggesting that the Lagrangian density would simply be the sqrt of the Ricci tensor's determinant ? If so, then the theory wouldn't be local anymore, it would have an infinite number of derivatives. And the determinant couldn't be without the sqrt acting on it, because it wouldn't be scalar, but a scalar density of weight 2.
 
1,444
4
What is wrong with square root? You are using square root anyway. Why would the theory have an infinite number of derivatives? In fact, to take square root of the det(Ricci), you do not need metric at all. You just need a connection. One can start with a symmetric one.
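
The transformation argument used earlier in the thread for the metric determinant applies to any rank-2 covariant tensor. A sketch, assuming J > 0 and taking the symmetric part [itex]R_{(ab)}[/itex] of the Ricci tensor of a connection [itex]\Gamma[/itex], in the form usually attributed to Eddington:

[tex]\bar{R}_{ab} = \frac{\partial x^{c}}{\partial \bar{x}^{a}} \frac{\partial x^{d}}{\partial \bar{x}^{b}} R_{cd} \ \ \Rightarrow \ \ \sqrt{|\det \bar{R}_{ab}|} = J(\frac{x}{\bar{x}}) \sqrt{|\det R_{cd}|}, \qquad S[\Gamma] = \int d^{4}x \sqrt{|\det R_{(ab)}(\Gamma)|}[/tex]

So the square root of the determinant is already a scalar density of the right weight, and no separate metric volume factor is needed.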
 

dextercioby

Science Advisor
Homework Helper
Insights Author
12,944
525
True, in the most general case, the Ricci tensor (and the determinant of its matrix in a coordinate basis) can be defined independently of a metric on the manifold. So what's the Lagrangian density you propose? And what degree in the derivatives are its Euler-Lagrange equations?
 
1,444
4
True, in the most general case, the Ricci tensor (and the determinant of its matrix in a coordinate basis) can be defined independently of a metric on the manifold. So what's the Lagrangian density you propose? And what degree in the derivatives are its Euler-Lagrange equations?
From Schrödinger's "Space-Time Structure", Cambridge University Press 1950, p. 113, chapter on "Pure affine theory":

[Attached image: pure_affine.jpg]


See also:

http://arxiv.org/abs/gr-qc/0406088
Authors: Jerzy Kijowski, Roman Werpachowski

Abstract: Affine variational principle for General Relativity, proposed in 1978 by one of us (J.K.), is a good remedy for the non-universal properties of the standard, metric formulation, arising when the matter Lagrangian depends upon the metric derivatives. Affine version of the theory cures the standard drawback of the metric version, where the leading (second order) term of the field equations depends upon matter fields and its causal structure violates the light cone structure of the metric. Choosing the affine connection (and not the metric one) as the gravitational configuration, simplifies considerably the canonical structure of the theory and is more suitable for purposes of its quantization along the lines of Ashtekar and Lewandowski (see this http URL). We show how the affine formulation provides a simple method to handle boundary integrals in general relativity theory.
 

samalkhaiat

Science Advisor
Insights Author
1,615
804
Hi, Sam. Great post! The best contribution I've seen on PF this year. I followed your approach closely but couldn't replicate your calculations hidden under <With some boring calculation, one finds, ..., that ...>. I mean exactly how you're getting the 2 C's linking the 2 metrics in that formula.

Could you spell them out or post a strong hint as how to start them ?

Thanks

I have not done the calculation. It looked obvious to me. But don’t worry, below I will show you how I knew the result without doing the actual calculation. But first, let us try to derive that particular equation.

Let us write the transformation of metric

[tex]\bar{g}_{ab}(\bar{x}) = \frac{\partial x^{c}}{\partial \bar{x}^{a}} \frac{\partial x^{d}}{\partial \bar{x}^{b}} g_{cd}(x)[/tex]

Carrying out the differentiations of the coordinate transformation, you get

[tex]\bar{g}_{ab}(\bar{x}) = g_{ab}(x) + \frac{1}{2} g_{ae} \eta^{en} C_{nbpq} \bar{x}^{p} \bar{x}^{q} + \frac{1}{2} g_{eb} \eta^{en} C_{napq} \bar{x}^{p} \bar{x}^{q} + O(\bar{x}^{4})[/tex]
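
The Jacobian entering this expansion follows directly from the cubic transformation, using the symmetry of C in its last three indices (a sketch):

[tex]\frac{\partial x^{c}}{\partial \bar{x}^{a}} = \delta^{c}_{a} + \frac{1}{2}\eta^{cn}C_{napq}\bar{x}^{p}\bar{x}^{q}[/tex]

Each Jacobian factor differs from the identity only at order [itex]\bar{x}^{2}[/itex], so the cross term between the two factors is of order [itex]\bar{x}^{4}[/itex].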

Differentiate with respect to [itex]\bar{x}^{c}[/itex] you find

[tex]\bar{g}_{ab,c} = \bar{\partial}_{c} g_{ab}(x) + g_{ae}\eta^{en} C_{nbpc} \bar{x}^{p} + g_{eb} \eta^{en} C_{napc} \bar{x}^{p} + F[/tex]

where F represents a collection of some junk that will vanish when we evaluate the final expression at [itex]\bar{x} = 0[/itex];

[tex]F = O^{2}(\bar{x}) \bar{\partial}g + \ ...[/tex]

Now take the second derivative and collect more junk

[tex]\bar{g}_{ab,cd} = g_{ab,cd} + g_{ae}\eta^{en}C_{nbcd} + g_{eb} \eta^{en} C_{nacd} + G(\bar{x}) \ (1)[/tex]

where G represents terms of order

[tex]G = \bar{\partial} F + O(\bar{x}) \bar{\partial}g[/tex]

At [itex]\bar{x} = 0[/itex] we can put [itex]g_{ab} = \eta_{ab}[/itex] in eq(1), and since the junk G(0) vanishes, we find our result

[tex]\bar{g}_{ab,cd} = g_{ab,cd} + \delta^{n}_{a}C_{nbcd} + \delta^{n}_{b}C_{nacd}[/tex]

****

Ok, let me now tell you how I was able to figure out the result without doing any calculation.
I knew if I started with the linear transformation

[tex]x^{a} = \bar{x}^{a} + \eta^{an}C_{nm}\bar{x}^{m}[/tex]

I would find

[tex]\bar{g}_{ab} = g_{ab} + C_{ab} + C_{ba}[/tex]

I also knew that if I considered the quadratic transformation

[tex]x^{a} = \bar{x}^{a} + \frac{1}{2} \eta^{an} C_{npq} \bar{x}^{p} \bar{x}^{q}[/tex]

I would get for the 1st derivative (at [itex]\bar{x}=0[/itex])

[tex]\bar{g}_{ab,c} = g_{ab,c} + C_{abc} + C_{bac}[/tex]

So, if you can see the pattern that I see , you would be able to figure out how the 2nd derivative transforms at [itex]\bar{x}=0[/itex] when you consider the transformation

[tex]x^{a} = \bar{x}^{a} + \frac{1}{6}\eta^{an}C_{nmpq}\bar{x}^{m}\bar{x}^{p}\bar{x}^{q}[/tex]

regards
 
35
0
Wow, I have just come back to look at this post, thank you for all of your help!
 
Good post by samalkhaiat.
I think it would be wrong to use the word 'derive' for any action or any Lagrangian, as the Lagrangian formalism is just a formalism for arriving at the fundamental equations and not a deduction. Rather, one can talk of an intelligent and intuitive way of guessing a proper Lagrangian, which gives the required equations.
For a good, intuitive way of arriving at the right Lagrangian, take a look at the classic course in theoretical physics by Landau & Lifshitz, volume 2, "The Classical Theory of Fields", in the chapter "Gravitational Field Equations" and the section titled "The action function for the gravitational field".

Also, for the axiomatic formulation of the theory of gravitation, see one more classic, Hawking and Ellis, "The Large Scale Structure of Space-Time", where the postulates are given as
1. Local causality
2. Local conservation of energy-momentum
3. The Einstein field equations
and obviously the axiomatic formulation will also be given in MTW.
 
1,444
4
The following came to my mind. As it is discussed in another thread geodesic equations can be 'derived' from conservation laws. No particular Lagrangian is needed. Conservation laws, on the other hand, are derived from invariance under diffeomorphisms. In fact geodesic equations can be derived directly from invariance under diffeomorphisms.

Can we do the same for field equations? Just some invariance of some higher level, applied to fields rather than to test particle trajectories? One possible objection could be that we can have different Lagrangians for GR, for instance f(R) type, and they lead to different field equations. So, how can field equations be derived from one invariance principle?

But this objection fails if you look carefully at the derivation of geodesic equations from diffeomorphism invariance. Here too the resulting equations are not unique if you allow for higher order locality. For instance, you can take into account spin, or internal tension of "test particles". It is only when you completely neglect the internal structure of test particles that you arrive at simple geodesic equations for structureless spinless particles.

Just a thought.

P.S. It is not excluded that someone already did it using higher order jet bundles or something like that.
 
