# Schutz 4.5, p. 100: A differential equation

Bernard Schutz, in A First Course in General Relativity, section 4.5, p. 101 (in this edition), writes that

$$\mathrm{d}\rho-(\rho+p)\frac{\mathrm{d}n}{n}$$

"depends only on rho and n." Is he saying

$$\mathrm{d}\rho-(\rho+p)\frac{\mathrm{d}n}{n} = f(\rho,n)$$

where f : {scalar fields on spacetime} --> {1-form fields on spacetime}, and rho, n and p are such scalar fields? In the discussion that follows, are the functions A and B scalar fields on spacetime, dB being the gradient or exterior derivative of B?

pervect
Staff Emeritus
The only thing I can think is that he's saying pressure (p) depends only on rho and n....

Thanks pervect. And, of course, d(rho) and dn depend only on rho and n, so I guess saying that p depends only on rho and n is synonymous with saying that the whole expression on the left hand side is equal to the value f(rho,n) of some 1-form-field-valued function of scalar fields, as I guessed.

But am I interpreting the other symbols correctly? I ask because Schutz sometimes uses d to mean exterior differentiation, in which case he often writes a tilde above it, but other times he says the expression "d followed by some letter" denotes something he calls an infinitesimal. I don't know what distinction he has in mind, in these cases where the input is a scalar field; the gradient (in the sense: covariant derivative) and exterior derivative of a scalar field are identical, aren't they?

What I mean is: Schutz usually puts a tilde over the d when it means exterior differentiation. Here he doesn't. Is there a reason for that? I don't know what else it could denote. If it doesn't mean exterior differentiation in this equation, what does it mean?

Elsewhere in this section, he uses d-without-a-tilde, followed by a letter, to denote what he calls an 'infinitesimal' (his scare quotes). If this is the case here, does anyone know what he means by infinitesimal? My only thought as to what infinitesimal might signify here is the gradient of the scalar field in question, but the gradient of a scalar field is equivalent to the exterior derivative, so here d-without-tilde is equivalent to d-with-tilde?

Or am I right in thinking of these letters as denoting scalar fields? Are they being treated as some other kind of function on this occasion? Schutz describes A and B simply as "functions" without specifying domain and range.

Or, since the context is conservation of energy (This is the section titled First law of thermodynamics.), and absorption or loss of energy by a fluid element over an interval of time, perhaps rho, p, n, T and S here denote real-valued functions of a single real variable. Perhaps their outputs represent energy density, pressure, number density, temperature and specific entropy at a given event, and their common input represents time, maybe time wrt the CMRF of the fluid. Then dn, say, would denote the differential (in the sense of the derivative conceived of as a linear function of - in this case - real numbers) of n.

I'd be very interested to know how other readers of this book have interpreted the equation.

If anyone reading this is familiar with Bernard Schutz: A First Course in General Relativity, or is familiar with the material and can guess what the letters in the equation I quoted in post #1 (and the related equations on this page of the book) represent, I long to know what it's saying, mathematically.

What are the domains and codomains of the functions rho, p, n, T and S?

pervect
Staff Emeritus
They are just density, pressure, number density, temperature, and entropy - as defined in the local rest frame of the fluid. Schutz mentions this on p 100, I think.

The analysis isn't a tensor analysis. So you do have to pay attention to where the quantities are defined. I hope "the rest frame of the fluid" is clear enough?

There are some covariant treatments of thermodynamics in which 1 / temperature is a four-vector, but that isn't the approach being taken here.

I gather there is still some controversy in the literature about the "best" way to do thermodynamics covariantly. My personal favorite is http://arxiv.org/abs/physics/0505004

Okay, so you're saying these functions, rho, p, n, T, S are scalar fields on spacetime, as defined in the table on p. 100, in which case d(rho), dn, dS are 1-form fields, specifically the exterior derivatives (which Schutz calls gradients) of these scalar fields? (And I can rule out the other possible interpretation I had, which was that rho, p, n, T, S denote real-valued functions of time - namely the proper time of a fluid element - which give energy density, pressure, etc. at the point in spacetime where the fluid element is located? In this other interpretation, I take it, d(rho), dn, dS would be exterior derivatives, gradients, differentials / total derivatives or whatever - all these concepts coinciding, in this case - over R, rather than over spacetime.)

I can see that one reason it's not a tensor equation is that tensors are a kind of vector, and the exterior derivatives of smooth scalar fields make a module over the ring of smooth scalar fields, rather than a vector space (although the values of these fields at a point in spacetime are tensors, over the real numbers). But I think you mean "isn't a tensor analysis" in the sense that the definitions of rho, p, n, etc. refer to a particular set of charts (=coordinate system) or frames (=basis fields for the tangent spaces), namely those in which the proper velocity (=4-velocity) of each fluid element has all spacelike components zero. (I think "rest frame of the fluid" means this set of basis fields, although sometimes I wonder if, by frame, Schutz means the basis for one particular tangent space, or its associated Lorentz chart for Minkowski space.) And yet, couldn't the same be said of any timelike tangent vector field or 1-form field, since they define just such a set of frames? But tangent vectors and 1-forms are the archetypal tensors.

pervect is correct that p, pressure, is a function of $\rho$, energy density, and n, number density. In the text it says:

Schutz said:
It can be shown in general that a fluid's state can be given by two parameters: for instance, $\rho$ and T or $\rho$ and n. Everything else is a function of, say, $\rho$ and n.
That last sentence might be more clear if he had said "Everything else is a function of the two parameters that you pick, say $\rho$ and n.". In any case, since the pressure p is part of the fluid's state, it is a function of $\rho$ and n.

The 'd' in the unnumbered equation between (4.24) and (4.25) is an infinitesimal as he states in the paragraph I quoted above. The distinction between a small change $\Delta n$ and an infinitesimal dn is best explained, in my opinion, by displaying the Taylor series for a function and comparing it to the linearization of that function using the first two terms only. That is:

$$f(x + \Delta x) = f(x) + \Delta x\, f'(x) + \frac{(\Delta x)^2}{2!} f''(x) + \cdots$$

but

$$f(x + \mathrm{d}x) = f(x) + f'(x)\,\mathrm{d}x$$
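The difference between the two expansions above can be seen numerically. The following is a minimal sketch, assuming f = exp and x = 1 purely for illustration (neither choice comes from Schutz); the helper name `linearization_error` is hypothetical. It measures the gap between the exact value and the two-term linearization for shrinking increments.

```python
import math

def linearization_error(f, fprime, x, dx):
    """Gap between the exact f(x + dx) and the linearized f(x) + f'(x)*dx."""
    return abs(f(x + dx) - (f(x) + fprime(x) * dx))

# Illustrative choice (an assumption): f = exp, whose derivative is also exp.
x = 1.0
for dx in (1e-1, 1e-2, 1e-3):
    err = linearization_error(math.exp, math.exp, x, dx)
    print(f"dx = {dx:g}  error = {err:.3e}")
```

Each tenfold reduction of the increment cuts the error by roughly a factor of a hundred, which is the sense in which the dropped terms are second order in the increment.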

From what you've written, Jimmy, it seems I was still way off the mark in my previous post. Thank you for your reply. Suppose we take rho and n as parameters. To disambiguate symbols, I'll use a subscript f to indicate a function (such as pressure "as a function of..."), and a subscript v to indicate a variable (such as pressure when the word refers to a numerical value). Is this what you mean?

$$p_f:\mathbb{R}^2 \rightarrow \mathbb{R} \enspace | \enspace p_f(\rho_v,n_v) = p_v$$

$$T_f:\mathbb{R}^2 \rightarrow \mathbb{R} \enspace | \enspace T_f(\rho_v,n_v) = T_v$$

$$S_f:\mathbb{R}^2 \rightarrow \mathbb{R} \enspace | \enspace S_f(\rho_v,n_v) = S_v$$

$$\rho_f : \mathbb{R} \rightarrow \mathbb{R} \enspace | \enspace \rho_f(n_v) = \rho_v$$

$$n_f : \mathbb{R} \rightarrow \mathbb{R} \enspace | \enspace n_f(\rho_v) = n_v$$

So this is an equation between real numbers, or real-valued functions, rather than (as I initially thought) between 1-forms on spacetime. And the definitions of the functions would change if we were to choose a different pair of parameters.

And I think you're saying a Schutz infinitesimal is the entity Wikipedia calls the (total) derivative as a linear map, or the differential, or (unambiguously) the Frechet derivative. If we denote the derivative (as defined in elementary calculus), evaluated at x, of a function f by D(f,x), and if we denote (matrix) multiplication x*y as m(x,y), then this kind of d means

$$\mathrm{d}f \bigg|_x = m(D(f,x),\cdot)$$

and this function df|x has the same domain and range as f.

Please correct me if I'm still misunderstanding.

Suppose we take rho and n as parameters. ... Is this what you mean?

$$\rho_f : \mathbb{R} \rightarrow \mathbb{R} \enspace | \enspace \rho_f(n_v) = \rho_v$$

$$n_f : \mathbb{R} \rightarrow \mathbb{R} \enspace | \enspace n_f(\rho_v) = n_v$$
The first three equations you wrote were correct. However, these two are not. Although it is true that you can find two parameters that describe the state, it is not true that any two parameters will do. It is necessary that they are independent of each other. In other words, in order for $\rho$ and n to be a suitable choice, it is necessary that neither of the two equations above hold.

So this is an equation between real numbers, or real-valued functions, rather than (as I initially thought) between 1-forms on spacetime. And the definitions of the functions would change if we were to choose a different pair of parameters.
Precisely so. The only two entries in table 4.1 on page 100 that are not scalars are the 4-velocity and the flux vector. The text specifically says that you could have chosen the pair $\rho$ and T (temperature) instead of $\rho$ and n. If you had done so then as you say, the form of the functions would change.

And I think you're saying a Schutz infinitesimal is the entity Wikipedia calls the (total) derivative as a linear map, or the differential, or (unambiguously) the Frechet derivative. If we denote the derivative (as defined in elementary calculus), evaluated at x, of a function f by D(f,x), and if we denote (matrix) multiplication x*y as m(x,y), then this kind of d means

$$\mathrm{d}f \bigg|_x = m(D(f,x),\cdot)$$

and this function df|x has the same domain and range as f.

Please correct me if I'm still misunderstanding.
I haven't looked into this question so I won't answer it. However, if you look at the last two equations in my previous post, I would say that the easiest way to understand what the author is doing is to say that he is restricting $\Delta \rho$ and $\Delta n$ to be so small that he can use the linearized function instead of the full Taylor series without introducing too much error.

The first three equations you wrote were correct.

Thank you! It's such a relief to feel like I'm making progress on this question.

However, these two are not.

What are the correct definitions of these two functions, $\rho_f$ and $n_f$? (In the case where the parameters are $\rho_v$ and $n_v$?) It would be helpful if you could give them in the same format and using the same symbols as I used in the definitions I got right.

There aren't any, $\rho$ and n are free variables. For instance, if the author had chosen $\rho$ and T instead of $\rho$ and n, then n could be displayed as a function of $\rho$ and T, but then there would be no function for T. To give you a very simple case, consider the formulas for area A and perimeter P of a rectangle in terms of the free variables width w and length l.

A = w * l;
P = 2 (w + l)

What are the formulas for w and l? There aren't any. They are free variables. Now consider the formulas for area A and width w in terms of the perimeter P and length l.

A = l * (P - 2l)/2
w = (P - 2l)/2

Now there is a formula for w, but none for P. There is never a formula for a free variable. There is nothing deeper than this in the book. The author is saying that of all the state variables, you only need to pick two of them to be the free variables and all the rest of the state variables will be functions of those two. However, the two that you pick must be independent of each other. For instance, I could not pick w and 2w to be my free variables and hope to get formulas for A, P and l. That is to say, it is essential that the two variables that I pick are not functions of each other.
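The rectangle example above can be sketched in code. This is only an illustration under stated assumptions: the function names are hypothetical, and the finite-difference Jacobian is one convenient way (not something from the thread) to test whether a candidate pair of variables is independent.

```python
def area_perimeter(w, l):
    """A and P as functions of the free variables (w, l)."""
    return w * l, 2 * (w + l)

def area_width(P, l):
    """A and w as functions of the free variables (P, l)."""
    w = (P - 2 * l) / 2
    return l * w, w

def jacobian_det(u, v, x, y, h=1e-6):
    """Central-difference determinant of d(u, v)/d(x, y).

    A zero determinant signals that u and v are not independent.
    """
    ux = (u(x + h, y) - u(x - h, y)) / (2 * h)
    uy = (u(x, y + h) - u(x, y - h)) / (2 * h)
    vx = (v(x + h, y) - v(x - h, y)) / (2 * h)
    vy = (v(x, y + h) - v(x, y - h)) / (2 * h)
    return ux * vy - uy * vx

w, l = 3.0, 5.0
A, P = area_perimeter(w, l)
A2, w2 = area_width(P, l)  # same rectangle, described with different free variables
print(A, P, A2, w2)

# (P, l) is a usable pair of free variables: the determinant is nonzero.
good = jacobian_det(lambda w, l: 2 * (w + l), lambda w, l: l, w, l)
# (w, 2w) is not: the second is a function of the first, so the determinant vanishes.
bad = jacobian_det(lambda w, l: w, lambda w, l: 2 * w, w, l)
print(good, bad)
```

The same rectangle is recovered from either pair of free variables, while the pair (w, 2w) fails the independence check, mirroring the requirement that the two chosen state variables must not be functions of each other.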

Okay, thanks for clarifying that point. The only thing that bothers me about the idea that rho and n stand for free variables here, is what that implies about d. Does d denote a different function on each side of the equation then? On one side, a function of differentiable functions from $\mathbb{R}^2$ to $\mathbb{R}$, and on the other side a function from $\mathbb{R}$ to... what? I guess it's just the $\mathrm{d}f = f_x\,\mathrm{d}x + f_y\,\mathrm{d}y$ of elementary calculus, however we choose to interpret such symbols.

I wonder if we could formalise the idea as follows. Suppose Schutz is referring to scalar fields, p, T, S on the state space of the fluid at an arbitrary, representative point in spacetime, or (equivalently?) the state space of a fluid element. Perhaps this space is a differentiable manifold. Then

$$nT\mathrm{d}S=\mathrm{d}\rho-\frac{(\rho+p)}{n}\mathrm{d}n$$

makes a statement about 1-form fields on the state space, rather than on spacetime:

$$\mathrm{d}S \bigg|_q=\frac{1}{n(q)T(q)}\left ( \mathrm{d}\rho\bigg|_q-\frac{(\rho(q)+p(q))}{n(q)}\mathrm{d}n \bigg|_q \right )$$

where rho and n are coordinate functions for a particular, global chart on the state space, and q is a state, a point in the state space, with coordinate presentation (rho(q),n(q)). If we let

$$\overline{T}$$

be the coordinate presentation of the temperature function in the chart (rho,n), and similarly $\overline{p}$ and $\overline{S}$, then we can write

$$T(q)=\overline{T}(\rho(q),n(q))$$

substituting these into the equation for dS to get an explicit formula. Is this a reasonable approach, do you think? Is this what Schutz means, or, at least, a consistent way of formalising his statement about infinitesimals and smallness? (His use of scare quotes around 'infinitesimal', along with statements in other calculus books I've read, suggests that he sees some more precise idea as lying behind the term, and that it's being introduced here only as a first step to understanding, a way of appealing to the reader's intuition somehow.)
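One way to make the proposal above concrete, as a sketch that assumes $\overline{S}$ is differentiable in the chart $(\rho, n)$ (an assumption added here, not a statement from the book), is to expand dS in the coordinate basis:

$$\mathrm{d}S = \frac{\partial \overline{S}}{\partial \rho}\,\mathrm{d}\rho + \frac{\partial \overline{S}}{\partial n}\,\mathrm{d}n$$

Comparing coefficients with the equation for dS above would then give

$$\frac{\partial \overline{S}}{\partial \rho} = \frac{1}{n\,\overline{T}}, \qquad \frac{\partial \overline{S}}{\partial n} = -\frac{\rho+\overline{p}}{n^2\,\overline{T}}$$

where every quantity is understood as a function of the coordinates $(\rho, n)$.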

The goal of pages 100 and 101 is to arrive at a definition of T and S. He states this goal on page 99 at the top of the last paragraph "Definition of macroscopic quantities". Note that in table 4.1 on page 100, he indicates that he will clarify the definitions of T and S below.

As for the functions p, q, T, S, etc. all are ordinary real valued functions of two real variables. The calculus involved is the calculus of such functions as you might find in any text used for a freshman survey course, for instance Thomas.

The only reason he replaces $\Delta \rho$ and $\Delta n$ with d$\rho$ and dn on the right hand side of equation (4.24) is so that he can invoke a theorem from first order differential equations. He is temporarily interested in the unnumbered equation between (4.24) and (4.25) and temporarily uninterested in the left hand side of (4.24), so that he can invoke that theorem. He quickly switches from infinitesimals back to deltas in equation (4.26). T and S are the names he gives to the functions guaranteed to exist by the theorem he invoked (labeled A and B in the text). Once he has eqn (4.26) in hand, he is no longer working with infinitesimals. I think that a slow and careful reading of the text will reveal the nature of his plan.
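As a hedged illustration of what that theorem delivers, consider a monatomic ideal gas. This example is an assumption made here for concreteness and is not taken from the text; m denotes the rest mass per particle and k Boltzmann's constant. With

$$\rho = mn + \tfrac{3}{2}nkT, \qquad p = nkT$$

the expression in question becomes

$$\mathrm{d}\rho-(\rho+p)\frac{\mathrm{d}n}{n} = \tfrac{3}{2}nk\,\mathrm{d}T - kT\,\mathrm{d}n = nT\,\mathrm{d}\!\left[k\ln\frac{T^{3/2}}{n}\right]$$

so in this example A = nT and B = k ln(T^{3/2}/n) up to an additive constant, which fits the naming of the two functions as T and S. The pair (n, T) is used as the free variables here; T = (ρ − mn)/((3/2)nk) converts the result back to the pair (ρ, n).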

One caveat to what I have written here is that he states that he will use eqn (4.25) later in the book. For this reason, it is important for you to understand this equation as an ordinary first order differential equation involving real valued functions of two real variables.

Here's my provisional attempt to reconcile this with the ideas of my previous post. Once we've picked two "parameters", such as energy density and number density, we can identify the state space of a fluid element with $\mathbb{R}^2$. Then rho and n could be seen as the natural coordinate functions on $\mathbb{R}^2$, or as their values at the point associated with the fluid element. I think this is how you're advising me to look at it.

On the other hand, if we want to reserve the right to switch parameters, maybe it would be more revealing to treat the state space as having no natural chart, since no pair of coordinate functions ("parameters") is more natural a choice than any other pair. Also, this could save us the ambiguity of having to reassign new definitions to letters. Does this make sense?

Then rho and n could be seen as the natural coordinate functions on $\mathbb{R}^2$, or as their values at the point associated with the fluid element.
I'm afraid it is beyond my comprehension how you could call rho and n functions after I have said they are not so many times. What is going on here?

I'm sorry if my reply gave the impression of not taking notice of what you wrote. Thanks for your patience. I'm not trying to dispute what you said; I'm only exploring issues that are still confusing me in the hope that this will reveal where my confusion lies and thereby help you to target your answers. Your replies have been very helpful in narrowing down the field of possible interpretations, but there is still much here that I don't comprehend.

In my previous post, I was fumbling around in search of some way to understand how the d operator can take as its input a real-valued function of two real variables on the right-hand side of equation (4.25), namely S, and yet, on the left, take a real number as its input. A situation where this seems to arise is in equations of the form $\mathrm{d}f = f_x\,\mathrm{d}x + f_y\,\mathrm{d}y$, where subscripts denote partial derivatives. Sometimes dx and dy are said to be very small or infinitesimal quantities, whatever that means. The explanation I've read for this notation, and for the term infinitesimal, is that x and y are to be interpreted as coordinate functions on a manifold, and dx and dy as their exterior derivatives, a notion that coincides, in this case, with what Schutz calls a gradient. I was under the impression that this is the standard, modern interpretation of the word infinitesimal, but perhaps I'm mistaken. Anyway, this is why I suggested a reading in which rho and n denote functions. Do you read d(rho) and dn as increments (like Delta rho and Delta n, but with a connotation of smallness, in some sense), and dS as a linear approximation of Delta S? (That is, do you simply accept that d means something different on different sides of the equation, or depending on the definition of the letter that follows it?)

Arnold defines an ordinary differential equation as an equation of the form

$$\frac{\partial}{\partial t} g(t,x) \bigg|_{(0,x)} = v(x)$$

where $g:\mathbb{R}\times M \to M$ is the evolution function of a state space, $M = \mathbb{R}^n$, and $v:M \to M$ is called the phase velocity. Is Schutz's equation (4.25) of this form? If so, how would it be expressed in this way? Which part of (4.25) is the first partial of the evolution function (the partial with respect to time), evaluated at (0,x): is it dS? And which part of it is the phase velocity: everything else apart from dS? But then what do d(rho) and dn mean in that context? Are they basis vectors?

I haven't looked too deeply into this. However, I'm pretty sure that an equation with partial derivatives is a partial differential equation, not an ordinary differential equation. In any case, this is not going in the right direction.

You seem to have trouble with the distinction between the independent variable and the dependent variable in an equation of the form y = f(x). In this equation, y has a defining function, namely f. However, x does not have a defining function, it is the independent variable. Independent means something close to "has no defining function".

I mentioned earlier what the infinitesimal is. Here is a more detailed description. It starts with a Taylor series. That is, suppose that

$$f(x + \Delta x) = f(x) + \Delta x\, f'(x) + \frac{(\Delta x)^2}{2!} f''(x) + \cdots$$

This is an exact equation. Now consider taking just the first two terms on the right hand side.

$$f(x + \Delta x) = f(x) + \Delta x\, f'(x)$$

This is what I call the linearized equation because the right hand side, unlike the Taylor series, is linear in $\Delta x$. It is not an exact equation, but if f is well-behaved and $\Delta x$ is sufficiently small, so is the error caused by using it. Let's rename $\Delta x$ to dx and call dx 'infinitesimal' to remind ourselves that the following equations are not exact, but are nearly valid if dx is sufficiently small.

$$f(x + \mathrm{d}x) = f(x) + \mathrm{d}x\, f'(x)$$

Also, define

$$\mathrm{d}f(x) = f(x + \mathrm{d}x) - f(x)$$

As you say, dx and df are defined differently. dx is just some sufficiently small number, df depends on the choice of dx in a particular way (as well as depending on x and f).

Now we can rearrange the linearized Taylor series as:

$$\mathrm{d}f(x) = \mathrm{d}x\, f'(x)$$

One more step, divide both sides by dx.

$$\frac{\mathrm{d}f(x)}{\mathrm{d}x} = f'(x)$$

This equation is an approximation, but it becomes an equality if you take the limit as dx goes to zero. Schutz is doing nothing more or less than this except that his functions are functions of two variables instead of one and he doesn't take the limit.
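The two-variable version of this argument can be checked numerically. The sketch below rests on assumptions that are not in the thread: it uses a monatomic ideal gas with ρ = mn + (3/2)nkT, p = nkT and entropy per particle S = k ln(T^{3/2}/n), in illustrative units with k = m = 1, and the function names are hypothetical. It compares dρ − (ρ + p) dn/n with nT dS for finite increments and reports how far apart they are.

```python
import math

K_B = 1.0  # Boltzmann constant in illustrative units (assumption)
M = 1.0    # rest mass per particle in illustrative units (assumption)

def rho(n, T):
    """Energy density of a monatomic ideal gas: rest mass plus thermal energy."""
    return n * M + 1.5 * n * K_B * T

def p(n, T):
    """Ideal-gas pressure."""
    return n * K_B * T

def S(n, T):
    """Entropy per particle, up to an additive constant."""
    return K_B * math.log(T ** 1.5 / n)

def first_law_residual(n, T, dn, dT):
    """|d(rho) - (rho + p) dn/n - n T dS| for finite increments (dn, dT)."""
    d_rho = rho(n + dn, T + dT) - rho(n, T)
    d_S = S(n + dn, T + dT) - S(n, T)
    lhs = d_rho - (rho(n, T) + p(n, T)) * dn / n
    return abs(lhs - n * T * d_S)

n0, T0 = 2.0, 0.5
for eps in (1e-1, 1e-2, 1e-3):
    print(eps, first_law_residual(n0, T0, eps * n0, eps * T0))
```

For finite increments the two sides do not agree exactly, but the mismatch shrinks roughly a hundredfold for each tenfold reduction of the increments, which is the behaviour the linearized ("infinitesimal") equation is meant to capture.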

I haven't looked too deeply into this. However, I'm pretty sure that an equation with partial derivatives is a partial differential equation, not an ordinary differential equation. In any case, this is not going in the right direction.

Arnold's book that I referred to is The Theory of Ordinary Differential Equations. These he defines as equations of the form

$$\frac{\mathrm{d} }{\mathrm{d} t} \bigg|_{t=0} g^t x = v(x).$$

I just wrote $g^t x$ as g(t,x), so as to avoid any potential clash with Schutz's use of superscripts for indices. Can the distinction between ordinary and partial really be so trivial that either type can be changed into the other by a mere change of notation?

You seem to have trouble with the distinction between the independent variable and the dependent variable in an equation of the form y = f(x). In this equation, y has a defining function, namely f. However, x does not have a defining function, it is the independent variable. Independent means something close to "has no defining function".

That's okay. I'm familiar with the distinction. It's only when the letter d is in play that I get confused. I think the reason for this is that different authors, addressing readers of different levels of sophistication, may use the same form of equation but conceptualise the symbols in a somewhat different way.

I mentioned earlier what the infinitesimal is. Here is a more detailed description. It starts with a Taylor series. ...

This is the treatment I've seen in elementary calculus books such as Berkey & Blanchard. I just thought exterior derivatives were supposed to bring together and formalise all those various uses of d, so that the notation could be interpreted in a consistent way: linear approximation of a function, small increment of an argument, and the d-things that bookend integral notation. Even if it would be overkill to use such formalism here, I'm curious as to how it would work. I'm trying to connect these various subjects I've been learning about, so as to better understand them. In the context of differential geometry, it often happens that such an equation is written treating dx as the exterior derivative of a coordinate function, x, such as the identity function on R. Have you not encountered this? (Of course, it also often happens that the distinction between the output of such a function is blurred with the function itself...)