# Stress-Energy Tensor from Lagrangian: Technical Question

I've been reading about how to generate the stress-energy-momentum tensor $T^{\mu \nu}$ from the action

$$S = \int d^{4}x \sqrt{|g|} \mathcal{L}$$
$$T^{\mu \nu} = \frac{2}{\sqrt{|g|}} \frac{\partial}{\partial g_{\mu \nu}} \left( \sqrt{|g|} \mathcal{L} \right)$$

My impression is that it should not matter whether we're differentiating with respect to upper indices $g^{\mu \nu}$ or lower $g_{\mu \nu}$ but in actual fact it seems to:

Compare

$$\frac{\partial}{\partial g_{\mu \nu}} \left( \sqrt{|g|} \mathcal{L} \right) = \frac{1}{2} \sqrt{|g|} g^{\mu \nu} \mathcal{L} + \sqrt{|g|} \frac{\partial \mathcal{L}}{\partial g_{\mu \nu}}$$

with

$$\frac{\partial}{\partial g^{\mu \nu}} \left( \sqrt{|g|} \mathcal{L} \right) = -\frac{1}{2} \sqrt{|g|} g_{\mu \nu} \mathcal{L} + \sqrt{|g|} \frac{\partial \mathcal{L}}{\partial g^{\mu \nu}}$$

The difference in sign comes from the fact that

$$\delta \sqrt{|g|} = \frac{1}{2} \sqrt{|g|} g^{\mu \nu} \delta g_{\mu \nu} = -\frac{1}{2} \sqrt{|g|} \delta g^{\mu \nu} g_{\mu \nu}$$

But how can the stress-energy-momentum tensor be dependent on whether we're differentiating with respect to lower or upper indices? I am most likely making an error somewhere.

Also, what about the overall sign? I see Weinberg and Carroll's GR book/notes defining the tensor with a -2 instead of my +2 -- but when I use it on the EM free-lagrangian $-\frac{1}{4} F^{\mu \nu} F_{\mu \nu}$ it gives me negative energy.

Is there an unambiguous way to determine both the overall sign and whether to take derivatives with respect to metric tensor elements with upper or lower indices?
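
For readers checking the second equality in the variation of $\sqrt{|g|}$ quoted in this post, a short sketch: it follows from varying the constant trace $g^{\mu\nu} g_{\mu\nu}$, with no assumption beyond the metric being invertible.

$$0 = \delta \left( g^{\mu \nu} g_{\mu \nu} \right) = g_{\mu \nu} \, \delta g^{\mu \nu} + g^{\mu \nu} \, \delta g_{\mu \nu} \quad \Rightarrow \quad g^{\mu \nu} \, \delta g_{\mu \nu} = -g_{\mu \nu} \, \delta g^{\mu \nu}$$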

Physics Monkey
Homework Helper
lonelyphysicist,

Let me jump right to the heart of the matter. The simple reason why it matters whether you differentiate with respect to raised or lowered components of the metric is that the metric is the thing doing the raising and lowering. Let me illustrate this.

Suppose I wanted to differentiate a Lagrangian with respect to a 1 form field $$a_\mu$$. It is really easy to show that if I differentiate with respect to the associated vector field $$a^\mu$$ instead, I get a simple relation between the two derivatives:
$$\frac{\partial \mathcal{L}}{\partial a^\mu} = g_{\mu \nu} \frac{\partial \mathcal{L}}{\partial a_\nu}$$

just as you would expect. This all arises because the raised components are just linear combinations of the lowered components.
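
The step behind this relation is short. As a sketch, assuming the metric is held fixed while the field components are varied, $a_\nu = g_{\nu \mu} a^\mu$ gives

$$\frac{\partial a_\nu}{\partial a^\mu} = g_{\nu \mu} \quad \Rightarrow \quad \frac{\partial \mathcal{L}}{\partial a^\mu} = \frac{\partial a_\nu}{\partial a^\mu} \frac{\partial \mathcal{L}}{\partial a_\nu} = g_{\mu \nu} \frac{\partial \mathcal{L}}{\partial a_\nu}$$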

The situation is different when I am considering the metric. You can still try to say that the raised metric components are linear combinations of the lowered metric components, but the coefficients are themselves the raised components of the metric!

$$g^{\alpha \beta} = g^{\alpha \mu} g^{\beta \nu} g_{\mu \nu}$$

See what I mean? With the vector field, the coefficients connecting the raised and lowered components didn't involve the vector field. The problem is, as I said, that the metric is doing the raising and lowering. The relationship between the raised components of the metric and the lowered components of the metric is a nonlinear inverse relationship rather than a simple linear relationship. So it does make a difference, although it turns out to be a small one. You should try to figure out what this small difference is (i.e. calculate the relationship between derivatives with respect to lower components and derivatives with respect to upper components), though I'll be happy to help if you get stuck. Is this clear at all?

Your problem with the overall sign in your definition of the stress-energy tensor is just a matter of sign convention, I think. For instance, Misner, Thorne, and Wheeler use your definition in their book (positive sign), and they point out at the beginning that they have a different sign convention from Weinberg. Overall signs are sometimes hard to compare in GR because everyone uses different conventions. If you make sure the kinetic energy comes in positive in the Lagrangian using your metric signature, then you should choose whichever sign makes the kinetic energy come out positive in the stress-energy tensor. The key is consistency.

I haven't checked all your calculations, but it seems like they are ok. As long as you are consistent, all your equations will come out right. For example, if you take the derivative of the geometric part of the action with respect to raised components you had better take the derivative of the matter part of the action also with respect to raised components. Also, the convention is to always have the energy come out positive in the stress-energy tensor, so you adjust your definition accordingly (basically you use either plus or minus).

Hope this helps!

It seems what you're saying is that

$$\frac{\partial}{\partial g_{\mu \nu}} = -g^{\alpha \mu} g^{\beta \nu} \frac{\partial}{\partial g^{\alpha \beta}}$$

The way I got this was to consider

$$\frac{\partial}{\partial g_{\mu \nu}} \left( g^{\sigma \lambda} g_{\lambda \tau} \right) = \frac{\partial}{\partial g_{\mu \nu}} \left( g^{\sigma \lambda} \right) g_{\lambda \tau} + g^{\sigma \lambda} \frac{\partial}{\partial g_{\mu \nu}} \left( g_{\lambda \tau} \right) = \frac{\partial}{\partial g_{\mu \nu}} \left( g^{\sigma \lambda} \right) g_{\lambda \tau} + g^{\sigma \lambda} \delta^{\mu}_{\phantom{\lambda} \lambda} \delta^{\nu}_{\phantom{\lambda} \tau}$$
$$\frac{\partial}{\partial g_{\mu \nu}} \left( g^{\sigma \lambda} g_{\lambda \tau} \right) = \frac{\partial}{\partial g_{\mu \nu}} \left( \delta^{\sigma}_{\phantom{\lambda} \tau} \right) = 0$$

which means

$$\frac{\partial}{\partial g_{\mu \nu}} \left( g^{\sigma \lambda} \right) g_{\lambda \tau} = -g^{\sigma \lambda} \delta^{\mu}_{\phantom{\lambda} \lambda} \delta^{\nu}_{\phantom{\lambda} \tau}$$
$$\frac{\partial g^{\sigma \lambda}}{\partial g_{\mu \nu}} = -g^{\sigma \mu} g^{\lambda \nu}$$

and by the chain rule

$$\frac{\partial}{\partial g_{\mu \nu}} = \frac{\partial g^{\alpha \beta}}{\partial g_{\mu \nu}} \frac{\partial}{\partial g^{\alpha \beta}} = -g^{\alpha \mu} g^{\beta \nu} \frac{\partial}{\partial g^{\alpha \beta}}$$

If I am right - I'm not sure if I am, of course - then I have a new question: what is

$$\frac{\partial}{\partial g_{\mu \nu}} \left( g_{\alpha \beta} V^{\alpha} V^{\beta} \right)$$

Is it

$$\frac{\partial}{\partial g_{\mu \nu}} \left( g_{\alpha \beta} V^{\alpha} V^{\beta} \right) = \delta^{\mu}_{\phantom{\mu}\alpha} \delta^{\nu}_{\phantom{\mu}\beta} V^{\alpha} V^{\beta} = V^{\mu} V^{\nu}$$

or is it

$$\frac{\partial}{\partial g_{\mu \nu}} \left( g^{\alpha \beta} V_{\alpha} V_{\beta} \right) = - g^{\mu \alpha} g^{\nu \beta} V_{\alpha} V_{\beta} = -V^{\mu} V^{\nu}$$
?

Physics Monkey
Homework Helper
lonelyphysicist,

You got it! The only difference is an extra minus sign. Basically the minus comes from the fact that what you're doing is calculating
$$\frac{\partial x^{-1}}{\partial x} = - x^{-2}$$
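
The matrix version of this statement can be checked numerically. The following is a minimal sketch, not part of the original posts: it assumes NumPy, uses an arbitrary test metric near diag(-1, 1, 1, 1), and treats all 16 entries of $g_{\mu \nu}$ as independent variables (the unsymmetrized convention used in the derivation above). It compares a finite-difference derivative of the inverse metric with $-g^{\sigma \mu} g^{\lambda \nu}$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical test metric: a small random symmetric perturbation of diag(-1, 1, 1, 1).
h = 0.05 * rng.standard_normal((4, 4))
g = np.diag([-1.0, 1.0, 1.0, 1.0]) + 0.5 * (h + h.T)
g_inv = np.linalg.inv(g)

def d_inverse_numeric(g, mu, nu, eps=1e-6):
    """Central finite difference of the inverse with respect to the single entry g[mu, nu]."""
    dg = np.zeros_like(g)
    dg[mu, nu] = eps
    return (np.linalg.inv(g + dg) - np.linalg.inv(g - dg)) / (2 * eps)

# Analytic claim: d g^{sigma lambda} / d g_{mu nu} = - g^{sigma mu} g^{nu lambda}
analytic = -np.einsum("sm,nl->slmn", g_inv, g_inv)

numeric = np.empty((4, 4, 4, 4))
for mu in range(4):
    for nu in range(4):
        numeric[:, :, mu, nu] = d_inverse_numeric(g, mu, nu)

max_error = np.max(np.abs(numeric - analytic))
print(max_error)
```

The printed value is the largest deviation between the two; it should be limited by finite-difference and rounding error only.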

Ok so now how do we resolve the apparent paradox? Suppose I start with the first statement,
$$\frac{\partial}{\partial g_{\mu \nu}} \left( g_{\alpha \beta} V^{\alpha} V^{\beta} \right) = \delta^{\mu}_{\phantom{\mu}\alpha} \delta^{\nu}_{\phantom{\mu}\beta} V^{\alpha} V^{\beta} = V^{\mu} V^{\nu}$$

If this statement is true then it means the raised components of V are independent of the metric. Ok, so now how do I seem to get a different answer for the same derivative below? The catch is that you cheated to calculate the derivative. If the raised components of V are independent of metric then the lowered components of V are dependent on the metric since they were obtained by contraction with the metric. In other words,
$$\frac{\partial V_\alpha}{\partial g_{\mu \nu}} \neq 0$$
which is contrary to what you have assumed in calculating the derivative. You can check for yourself that you get the term you have plus the extra term $$2 V^\mu V^\nu$$, which makes the two expressions agree.
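
Spelling out that check, as a sketch that assumes $V^\alpha$ is the metric-independent object, so that $V_\alpha = g_{\alpha \rho} V^\rho$ and $\partial V_\alpha / \partial g_{\mu \nu} = \delta^{\mu}_{\phantom{\mu}\alpha} V^\nu$ (index symmetrization ignored, as in the rest of the thread):

$$\frac{\partial}{\partial g_{\mu \nu}} \left( g^{\alpha \beta} V_{\alpha} V_{\beta} \right) = \frac{\partial g^{\alpha \beta}}{\partial g_{\mu \nu}} V_{\alpha} V_{\beta} + 2 g^{\alpha \beta} \frac{\partial V_{\alpha}}{\partial g_{\mu \nu}} V_{\beta} = -V^{\mu} V^{\nu} + 2 V^{\mu} V^{\nu} = V^{\mu} V^{\nu}$$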

This obviously begs the question, "How do I know which of the raised or lowered components are the ones independent of the metric?" The answer has to come from context. Suppose I define some vector as tangent to a curve. Well this object naturally comes with a raised index and isn't defined using the metric, so the raised components must be metric independent. However, when you use the metric to find the equivalent 1 form, the 1 form you obtain obviously depends on the metric you use. Thus the lowered components do depend on the metric. This subtlety means that you have to be very careful about how you define various tensors. For instance, is the electromagnetic potential most naturally a 1 form or a vector? Elementary treatments often start with it as a vector, but in gauge theory it appears as a connection 1 form. Consistency must be your guide.

Does this picture make sense?

Physics Monkey said:
This obviously begs the question, "How do I know which of the raised or lowered components are the ones independent of the metric?" The answer has to come from context.
....
For instance, is the electromagnetic potential most naturally a 1 form or a vector? Elementary treatments often start with it as a vector, but in gauge theory it appears as a connection 1 form. Consistency must be your guide.

It is rather interesting you brought up the electromagnetic potential, because I started thinking about all this due to my trying to get the correct stress-energy tensor out of Maxwell's "free" lagrangian $\mathcal{L} = -\frac{1}{4} F_{\mu \nu} F^{\mu \nu}$. I eventually got the correct tensor by reasoning that somehow it is $F_{\mu \nu}$ that was the basic object and so I ought to take the derivative with respect to $g^{\alpha \beta}$ and not the lower index, which had been giving me a wrong relative sign between the two terms in the expression.
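
As a sketch of that calculation (not taken from the posts themselves), assume signature $(-,+,+,+)$, treat $F_{\mu \nu}$ as metric independent, and write $\mathcal{L} = -\frac{1}{4} g^{\mu \alpha} g^{\nu \beta} F_{\mu \nu} F_{\alpha \beta}$. The derivative with respect to the inverse metric is then

$$\frac{\partial \mathcal{L}}{\partial g^{\alpha \beta}} = -\frac{1}{2} F_{\alpha \lambda} F_{\beta}{}^{\lambda}$$

Using $\partial / \partial g_{\mu \nu} = -g^{\alpha \mu} g^{\beta \nu} \, \partial / \partial g^{\alpha \beta}$ from earlier in the thread, the $+2$ definition with lowered metric components gives

$$T^{\mu \nu} = g^{\mu \nu} \mathcal{L} + 2 \frac{\partial \mathcal{L}}{\partial g_{\mu \nu}} = F^{\mu \lambda} F^{\nu}{}_{\lambda} - \frac{1}{4} g^{\mu \nu} F_{\alpha \beta} F^{\alpha \beta}$$

with $T^{00} = \frac{1}{2} (E^2 + B^2) \geq 0$. The same tensor, with indices lowered, follows from the $-2$ definition taken with raised metric components, $T_{\mu \nu} = -\frac{2}{\sqrt{|g|}} \frac{\partial}{\partial g^{\mu \nu}} \left( \sqrt{|g|} \mathcal{L} \right)$, so on this reading the two sign conventions differ only in which metric components are varied.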

Now, could you explain - I apologize for going off topic here - why exactly we ought to think of the vector potential as components of a 1-form, and not as a vector? This is also somewhat related to my confusion of how to match up the vector notation (in cartesian coordinates) for the E field as

$$\vec{E} = -\frac{\partial \vec{A}}{\partial t} - \nabla A^{0}$$

with that in the gauge-invariant tensor form

$$E_{i} = \partial_{i} A_{0} - \partial_{0} A_{i}$$

Why is there no relative negative sign between the two terms in the vector form, whereas there is in the $F_{i0}$? Is it even legitimate to put a lower index on $E_{i}$, as I've done? How would I write the indices in the vector form? Do the components of $\vec{A}$ have upper or lower indices? Should it be $A^{0}$ or $A_{0}$ in the second term?

Chronos
Gold Member
An astounding explanation, Physics Monkey. As advertised, it cut right to the heart of the matter. And it was a pretty darn good question to start with, lonelyphysicist. I'm going to the counter for some popcorn. You guys want anything?

lonelyphysicist said:
.....
$$\frac{\partial}{\partial g_{\mu \nu}} \left( g_{\alpha \beta} V^{\alpha} V^{\beta} \right)$$

Is it

$$\frac{\partial}{\partial g_{\mu \nu}} \left( g_{\alpha \beta} V^{\alpha} V^{\beta} \right) = \delta^{\mu}_{\phantom{\mu}\alpha} \delta^{\nu}_{\phantom{\mu}\beta} V^{\alpha} V^{\beta} = V^{\mu} V^{\nu}$$

or is it

$$\frac{\partial}{\partial g_{\mu \nu}} \left( g^{\alpha \beta} V_{\alpha} V_{\beta} \right) = - g^{\mu \alpha} g^{\nu \beta} V_{\alpha} V_{\beta} = -V^{\mu} V^{\nu}$$
?
You need to recheck what you're doing. Take this expression above. On the left side you have a second rank covariant operator acting on a scalar and you're getting a second rank contravariant object. Doesn't that bother you??? It gives me the willies! And I hate those darn willies! :yuck:

Pete

DrGreg
Gold Member
pmb_phy said:
You need to recheck what you're doing. Take this expression above. On the left side you have a second rank covariant operator acting on a scalar and you're getting a second rank contravariant object. Doesn't that bother you??? It gives me the willies! And I hate those darn willies! :yuck:

Pete
My knowledge of GR is extremely limited, and it is 25 years since I did any. However, even with my elementary knowledge, I was under the impression that the derivative with respect to a covariant tensor was a contravariant operator. Or am I confused?

Physics Monkey
Homework Helper
DrGreg is right about the index positions. The bottom of the bottom is the top of the top. Like a fraction of fractions. That's how I remember it.

Physics Monkey
Homework Helper
I would like a coke please, Chronos.

Physics Monkey
Homework Helper
lonelyphysicist,

Fantastic. Glad to see this is making sense to you. Now, why is the EM potential most naturally a 1 form? The short answer is, "because the partial derivative $$\partial_\mu$$ naturally comes with a lowered index." What does this mean? Well, if you use the old notation a gauge transformation can be written like this,
$$\vec{A}' = \vec{A} + \vec{\nabla} \Lambda$$
$$\phi' = \phi - \frac{\partial \Lambda}{\partial t}$$

Note the odd sign difference between the two. What is really going on here is, $$A'^\mu = A^\mu + \partial^\mu \Lambda$$. Now, you know that the partial derivative is most naturally defined with a lowered index, and in order to get the upper index we have to use the metric. That minus sign in front of the time derivative is simply the minus from the usual special relativity metric.
$$\partial^0 = \eta^{0 \mu} \partial_\mu = - \partial_0 = -\frac{\partial }{\partial t}$$

The weird minus sign appears because they are using the metric dependent upper components of the potential. Now the difference here is trivial, but I am led to the conclusion that, generally speaking, if I use the raised components of the potential then my gauge transformations are metric dependent! So the raised components of the potential are metric dependent. It is the lowered components of the potential that don't depend on the metric for their definition. They are introduced in the gauge covariant derivative (a different covariant derivative from the space-time one) $$D_\mu = \partial_\mu - i e A_\mu$$. These lowered components change under a gauge transformation according to $$A'_\mu = A_\mu + \partial_\mu \Lambda$$ with no reference to the metric, and we define the gauge field $$A_\mu$$ without the metric. I must introduce the potential as a 1 form because the partial derivatives come naturally with a lowered index.

The lowered component $$A_0$$ is correct in your last equation. The key is that $$A_0 = - A^0 = - \phi$$.
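
As a sketch connecting this to the two expressions for the electric field, assume the metric (-1,1,1,1) used above, so that $A_0 = -\phi$, $A_i = A^i$ and $\partial_0 = \partial / \partial t$:

$$E_{i} = \partial_{i} A_{0} - \partial_{0} A_{i} = -\partial_{i} \phi - \frac{\partial A^{i}}{\partial t}$$

This is the $i$-th component of $\vec{E} = -\nabla \phi - \partial \vec{A} / \partial t$. The relative minus sign in the tensor form is absorbed by $A_0 = -\phi$.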

More, or do I need to clarify something?

I still don't fully understand how to go from

$$\vec{E} = -\frac{\partial \vec{A}}{\partial t} - \nabla A^{0}$$

to

$$E_{i} = \partial_{i} A_{0} - \partial_{0} A_{i}$$

In the first line, should it be a $A^{0}$ or $A_{0}$? And when I convert to component notation should I write

$$(\vec{E})^{i} = \left( -\frac{\partial \vec{A}}{\partial t} - \nabla A^{0} \right)^{i} = -\partial_{0} A^{i} - \nabla_{i} A^{0} \textrm{ ?}$$

Of course that looks wrong, but that's how I'd usually write it if I treat the $\vec{E}$ and $\vec{A}$ fields as 3-vectors with upper indices, and I'll give lower indices to the spacetime derivatives since they transform accordingly.

I'm also slightly concerned about sign convention. Your $\partial^{0} = -\frac{\partial}{\partial t}$ is a convention choice, right? So are the negative signs in my first line above for the E field also convention dependent? What about the second line in terms of the gauge-invariant $F_{i0}$?

Chronos said:
I'm going to the counter for some popcorn. You guys want anything?
This popcorn you consumed - I appreciate your offer, though I'd politely decline - what's more pressing is, could you advise me, what exactly is its Lagrangian, or "world function", as D. Hilbert calls it? And would you recommend taking the derivative with respect to upper or lower index metric components in order to get the correct form of its stress-energy-momentum tensor?

lonelyphysicist said:
My impression is that it should not matter whether we're differentiating with respect to upper indices $$g^{\mu \nu}$$ or lower $$g_{\mu \nu}$$ but...
In my opinion, people (except one guy) have already replied excellently. But let me add a very simple comment.

It appears that you think that $$g^{\mu \nu}$$ and $$g_{\mu \nu}$$ represent the same "physics" and that it therefore does not matter which one you use in the definition of the tensor within the same representation of GR. However, note that even in the linear regime there is a sign difference between the two, so one cannot mix them in the same equations.

$$g^{\mu \nu} = \eta^{\mu \nu} - h^{\mu \nu}$$

but

$$g_{\mu \nu} = \eta_{\mu \nu} + h_{\mu \nu}$$
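
The sign flip can be seen with a first-order sketch. Assume indices on $h$ are raised with $\eta$, and introduce an unknown correction $a^{\mu \nu}$ (a symbol used here only for illustration) such that $g^{\mu \nu} = \eta^{\mu \nu} + a^{\mu \nu}$. Requiring $g^{\mu \nu} g_{\nu \lambda} = \delta^{\mu}_{\phantom{\mu}\lambda}$ gives

$$\left( \eta^{\mu \nu} + a^{\mu \nu} \right) \left( \eta_{\nu \lambda} + h_{\nu \lambda} \right) = \delta^{\mu}_{\phantom{\mu}\lambda} + a^{\mu \nu} \eta_{\nu \lambda} + \eta^{\mu \nu} h_{\nu \lambda} + \mathcal{O}(h^2) \quad \Rightarrow \quad a^{\mu \nu} = -h^{\mu \nu}$$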

I think you need to know which variables the action is written in terms of, and then differentiate with respect to just those variables. For example, for the Palatini action written as a function of $$g^{\mu \nu}$$, you would take the tensor defined by differentiating with respect to $$g^{\mu \nu}$$ instead of the usual $$g_{\mu \nu}$$.

Regarding the sign convention, there is no single standard. Usually general relativists prefer the +2 whereas particle physicists (working with SR of course) prefer -2. I traditionally used the -2, but it appears that there are advantages to using +2 in general relativity problems.

EL
Physics Monkey for Science Advisor!

Physics Monkey
Homework Helper
lonelyphysicist,

From electrodynamics we know that $$E^x = - \frac{\partial \phi}{\partial x} - \frac{\partial A^x}{\partial t}$$ and we always define $$A^0 = \phi$$.

Now where does the sign convention come in? If I call the components of the usual vector potential $$A^i$$ then I can write the above equation two ways:

If I use the metric (-1,1,1,1) then I can write $$E^i = \partial^0 A^i - \partial^i A^0 = F^{0 i}$$ where I have absorbed the minus sign into the time derivative to raise the index.

If I use the metric (1,-1,-1,-1) then I can write $$E^i = \partial^i A^0 - \partial^0 A^i = F^{i 0}$$ where now I have absorbed the minus sign into the space derivative to raise the index.

Well which is it? It turns out that how I interpret $$F^{0 i}$$ depends on the metric I use. But there is no contradiction or ambiguity here. To see this, let's ask how the electric field is defined. It's defined in terms of the Lorentz force, right? Well, the Lorentz force is given by $$f^\mu = q F^{\mu}{}_{\nu} u^\nu = q F^{\mu \alpha} \eta_{\alpha \nu} u^\nu$$, so let's see what we get.

If I use the metric (-1,1,1,1) then I find $$f^i = q F^{i 0} \eta_{0 0} u^0 + ...$$ where ... is the magnetic part. For small velocities I thus get $$f^i = q (-E^i) (-1) (1) + ... = q E^i + ...$$ exactly as I should.

On the other hand, if I use the metric (1,-1,-1,-1) then I find as before $$f^i = q F^{i 0} \eta_{0 0} u^0 + ...$$ but now this reduces to $$f^i = q (E^i) (1) (1) + ... = q E^i + ...$$ for small velocities just as before.

So you see I get the force right both times, but which of the components of $$F^{\mu \nu}$$ (or equivalently $$F_{\mu \nu}$$) I identify as the electric field depends on the metric I choose.
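
The two cases above can be run side by side. This is a minimal sketch under simplifying assumptions that are not in the original post: NumPy is used, the magnetic field is zero, the charge is at rest with $u^\mu = (1,0,0,0)$, and the helper name `lorentz_force_spatial` is made up for illustration.

```python
import numpy as np

def lorentz_force_spatial(E, signature, q=1.0):
    """Spatial part of f^mu = q F^{mu alpha} eta_{alpha nu} u^nu for a charge at rest.

    Assumes B = 0 and u^mu = (1, 0, 0, 0), so only the electric components matter.
    """
    eta = np.diag(signature).astype(float)
    F = np.zeros((4, 4))
    for i in range(3):
        if signature[0] < 0:
            # metric (-1, 1, 1, 1): E^i = F^{0 i}
            F[0, i + 1] = E[i]
            F[i + 1, 0] = -E[i]
        else:
            # metric (1, -1, -1, -1): E^i = F^{i 0}
            F[i + 1, 0] = E[i]
            F[0, i + 1] = -E[i]
    u = np.array([1.0, 0.0, 0.0, 0.0])
    f = q * F @ eta @ u
    return f[1:]

E = np.array([0.3, -1.2, 2.0])  # arbitrary sample field
f_mostly_plus = lorentz_force_spatial(E, (-1, 1, 1, 1))
f_mostly_minus = lorentz_force_spatial(E, (1, -1, -1, -1))
print(f_mostly_plus, f_mostly_minus)
```

Both calls should return the same force $q E^i$, even though the electric field sits in $F^{0 i}$ for one signature and in $F^{i 0}$ for the other.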

Does this help clarify things?

Physics Monkey - thank you for taking the time to reply. I think you've cleared up my confusion.

Juan R. said:
... but it appears that there are advantages in the use of +2 in general relativity problems.
Juan: could you explain briefly what's the advantage of using +2 in GR problems?

Can you please explain why, when you differentiate the square root of the determinant of the metric, you get the root itself as a factor rather than the inverse root?
Isn't the 1/2 factor a result of the chain rule?

thank you.

There are 2 methods of deriving this result.

One is to use the fact that for any invertible matrix A

$$\textrm{det} A = \textrm{det} \exp[\log[A]] = \exp[{\rm Tr}[\log[A]]]$$

Hence if we consider $g \to g+\delta g = g(1+g^{-1}\delta g)$, we have

$$\sqrt{g} \to \sqrt{g} \sqrt{\det(1+g^{-1}\delta g)} = \sqrt{g} \exp[(1/2){\rm Tr}[\log[1+g^{-1}\delta g]]] = \sqrt{g} \left(1+\frac{1}{2}{\rm Tr}[g^{-1}\delta g] + \mathcal{O}[\delta g^2] \right)$$

The second method is to use Cramer's rule from linear algebra. First we observe that

$$g^{\mu\nu} g_{\nu\lambda} = \delta^\mu_{\phantom{\mu}\lambda} \\ \Rightarrow \frac{\delta g^{\mu\nu}}{\delta g_{\tau\sigma}} = - g^{\mu\tau}g^{\nu\sigma}$$

Then differentiating Cramer's rule on both sides gives

$$\frac{\delta g^{\mu\nu}}{\delta g_{\nu\mu}} = \frac{\delta}{\delta g_{\nu\mu}} \frac{(-1)^{\mu+\nu} {\rm det}g_{\hat{\mu\nu}}}{{\rm det} g_{\mu\nu}} \\ = - \frac{g^{\nu\mu}}{{\rm det}g} \frac{\delta}{\delta g_{\nu\mu}} {\rm det} g$$

where ${\rm det}g_{\hat{\mu\nu}}$ is the determinant of the metric with the $\mu$ column and $\nu$ row removed, and no Einstein summation is implied. But

$$\frac{\delta g^{\nu\mu}}{\delta g_{\mu\nu}} = - g^{\mu\nu}g^{\nu\mu}$$

Hence

$$\frac{\delta}{\delta g_{\nu\mu}} {\rm det} g = \frac{\delta}{\delta g_{\mu\nu}} {\rm det} g = {\rm det}[ g ] g^{\mu\nu}$$
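
Both results can be checked numerically. The following is a minimal sketch rather than part of the original derivation: it assumes NumPy and an arbitrary test metric near diag(-1, 1, 1, 1), and it varies the matrix entries independently for the determinant check.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical test metric near diag(-1, 1, 1, 1); any invertible symmetric matrix would do.
h = 0.05 * rng.standard_normal((4, 4))
g = np.diag([-1.0, 1.0, 1.0, 1.0]) + 0.5 * (h + h.T)
g_inv = np.linalg.inv(g)

def sqrt_abs_det(m):
    """Square root of |det m|."""
    return np.sqrt(abs(np.linalg.det(m)))

# Claim 1: d(det g)/d g_{mu nu} = det(g) * g^{mu nu}, entries varied independently.
eps = 1e-6
numeric = np.empty((4, 4))
for mu in range(4):
    for nu in range(4):
        dg = np.zeros((4, 4))
        dg[mu, nu] = eps
        numeric[mu, nu] = (np.linalg.det(g + dg) - np.linalg.det(g - dg)) / (2 * eps)
det_error = np.max(np.abs(numeric - np.linalg.det(g) * g_inv))

# Claim 2: delta sqrt|g| = (1/2) sqrt|g| g^{mu nu} delta g_{mu nu} for a small symmetric delta g.
k = rng.standard_normal((4, 4))
delta_g = 1e-6 * 0.5 * (k + k.T)
lhs = sqrt_abs_det(g + delta_g) - sqrt_abs_det(g)
rhs = 0.5 * sqrt_abs_det(g) * np.sum(g_inv * delta_g)
sqrt_error = abs(lhs - rhs)
print(det_error, sqrt_error)
```

On the chain-rule question: the chain rule does produce an inverse root, $\delta \sqrt{|g|} = \delta |g| / (2 \sqrt{|g|})$, but $\delta |g| = |g| \, g^{\mu\nu} \delta g_{\mu\nu}$ carries a full factor of the determinant, which leaves $\frac{1}{2} \sqrt{|g|} \, g^{\mu\nu} \delta g_{\mu\nu}$.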

thank you very much

Although this is certainly the way in which most books treat the problem, it's not strictly correct. Since you are using functional derivatives, if one takes a functional derivative of a functional $F[g;x)$ with respect to the metric, you will in general have Dirac bidensity distributions in your answer. For example,

$$\frac{\delta}{\delta g_{ij}(y)}F[g;x) =\textrm{(derivative)}\times\delta^{(m)}(x,y)$$

where $\delta^{(m)}(x,y)$ is an $m$-dimensional Dirac bidensity distribution. Thus, every single one of the functional derivatives which people have been talking about in this thread makes sense if and only if it appears under an integral sign.

For example, the need to use these Dirac distributions is the reason why the proper definition of the stress energy tensor in terms of the action is actually a functional derivative of the action, not of the Lagrangian density, i.e., instead of

$$T^{ij}(x) \propto \frac{\delta}{\delta g_{ij}(x)} \mathcal{L}$$

we need to use

$$T^{ij}(x) \propto \frac{\delta}{\delta g_{ij}(x)} S = \frac{\delta}{\delta g_{ij}(x)} \int_\mathcal{M} \, d^my \mathcal{L}[\Phi_{(A)},\partial\Phi_{(A)};y)$$

where $\Phi_{(A)}$ are tensorial objects which represent the degrees of freedom of your theory.
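
As a sketch of how this connects to the partial-derivative notation used earlier in the thread, assume the Lagrangian density depends on the metric only algebraically (no derivatives of $g_{ij}$) and write it with the factor $\sqrt{|g|}$ made explicit. Then

$$\delta S = \int_\mathcal{M} d^my \, \frac{\partial \left( \sqrt{|g|} \mathcal{L} \right)}{\partial g_{ij}} \bigg|_{y} \delta g_{ij}(y) \quad \Rightarrow \quad \frac{\delta S}{\delta g_{ij}(x)} = \frac{\partial \left( \sqrt{|g|} \mathcal{L} \right)}{\partial g_{ij}} \bigg|_{x}$$

so under that assumption the functional derivative of the action reduces to the pointwise partial derivative, with the Dirac distribution $\delta^{(m)}(x,y)$ already integrated out.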

Since you are using functional derivatives, if one takes a functional derivative of a functional $F[g;x)$ with respect to the metric, you will in general have Dirac bidensity distributions in your answer.
I was using $\delta/\delta g_{\mu\nu}$ to mean $\partial/\partial g_{\mu\nu}$ -- I think this is just a notational issue, not a mathematical/physical one. The delta function is not necessary because the question of what the first-order variation of det g is does not need to involve an action.

This is precisely the sort of claim that I was pointing out is incorrect. Look, take a general metric $\boldsymbol{g}$ whose components can be written $g_{ij}$ in some basis, and choose local coordinates $x^i$ on your manifold. The question now is "What are the $g_{ij}$?" The answer to this is simple: they are $x^i$-dependent functions. As they are position-dependent functions we should, strictly speaking, write $g_{ij}(x)$ instead of $g_{ij}$. As these are functions, one must use functional derivatives. The only situation in which the notation

$$\frac{\partial g_{ij}}{\partial g_{kl}}$$

is acceptable is if you state explicitly beforehand that it is to be taken to mean "the (functional) derivative of $g_{ij}$ at a point $x^i$ with respect to $g_{kl}$ at the same point $x^i$". Then (and only then) does this type of notation make sense. In any other situation, it simply doesn't make sense (and nothing you can do can make it rigorous). Even on a very basic level, it makes sense to think of the functional derivative

$$\frac{\delta g_{ij}(x)}{\delta g_{kl}(y)}$$

as giving you the rate of change of $g_{ij}$ evaluated at $x\in\mathcal{M}$ with respect to a change in $g_{kl}$ evaluated at $y\in\mathcal{M}$. This is actually a reasonably fruitful way to think about it since, with a little bit of work, you can use this approach to investigate causality in any theory with a light cone structure. More specifically, you cannot properly examine the Hamiltonian formulation of a field theory without considering such functional derivatives.

If you're not convinced by this, think back to all you know about, say, the explicit formulation of electromagnetism as a gauge theory derived from a singular Lagrangian. You cannot present that theory rigorously unless you either (a) use functional derivatives, along with the Dirac distributions that this implies, or (b) use the "normal" notation with the explicit qualification about the position-dependence of all the quantities of which you are taking derivatives.

This is precisely the sort of claim that I was pointing out is incorrect.
I understand what you are driving at, but saying the above derivation is incorrect is incorrect.

Our goal here is quite simple. We just want to find out what is the first order variation of det g. The answer does not depend on whether we wish to view our metric as a tensor field or just a matrix of numbers. In the latter, we can view the entries of our 4x4 (or dxd, if you live in d dimensions) metric tensor as independent variables -- the only constraint I used was that this matrix was a real symmetric one. At no point was the knowledge that this metric is a function of spacetime necessary.