In summary: The thread asks how the differentials of elementary calculus relate to differential forms, which are completely antisymmetric (0,p) tensors, and whether it is only a coincidence of notation that the gradient of a scalar, e.g. $\mathrm{d}\phi$, is written like a differential such as $\mathrm{d}x$.
Rasalhague
In chapter 1 of Sean Carroll's Lecture Notes on General Relativity, p. 12, he writes:

In spacetime the simplest example of a dual vector is the gradient of a scalar function, the set of partial derivatives [of the function] with respect to the spacetime coordinates, which we denote by "d":

$$\mathrm{d}\phi = \frac{\partial \phi}{\partial x^{\mu}} \hat{\theta}^{(\mu)}$$

http://preposterousuniverse.com/grnotes/

Is it just a coincidence of notations that $\mathrm{d}\phi$ looks like a differential (an infinitesimal quantity)? I take it it's important to distinguish between these two concepts (differential and gradient) even though they might be written using the same symbol?

So what exactly is the relationship of differential forms to the concept of a differential (infinitesimal) in elementary calculus? Are these differentials a special case of differential forms, or something analogous, or something else entirely? I gather a differential dx isn't considered a number, as such, except in non-standard analysis where the reals are extended to include infinitesimals and infinite numbers.

In Chapter 1, p. 21, Carroll defines a differential p-form as a (0,p) tensor that is completely antisymmetric: "Thus scalars are automatically 0-forms, and dual vectors automatically 1-forms". On p. 18: "a tensor is antisymmetric in any of its indices if it changes sign when those indices are exchanged" and completely antisymmetric if it is antisymmetric in all of its indices. I'm not quite sure what it means to "exchange" 0 indices or 1 index; how does this exchanging of a single index (with itself?) distinguish elements of a vector space from elements of its dual space? Or does "automatically" just mean that they're defined to be differential forms, either arbitrarily or for some reason other than complete antisymmetry?

There are several different meanings that can be attached to differentials.
consider y=f(x)
non-standard analysis
let dx be "small" in the sense that it is an infinitesimal in the non-standard analysis sense
dx^2 is also defined and is "smaller"
f'(x)~[f(x+dx)-f(x)]/dx
standard analysis
dx need not be small
dy=f'(x) dx
here dy:R^2->R
we have dy~f(x+dx)-f(x) when dx~0
but dy is not "small" or near f(x+dx)-f(x) in general
dual algebra
we expand R into a ring R[x]/(x^2)
that is numbers of the form a+b*dx where dx is infinitesimal in the sense dx^2=0
f(x+dx)=f(x)+f'(x) dx
dy=f(x+dx)-f(x)=f'(x)dx
differential forms
very similar to standard analysis except we resolve the invariance and problem by the condition
dy^2=dy^dy=0

in the definition you cite one does not consider an "exchange" of 0 indices or 1 index
all exchanges are of two indices
to generalize the exchange into a permutation one can either decompose a permutation into exchanges (the sign changes when an odd number of exchanges and does not change when there are an even number of exchanges) or introduce the signature of the permutation.
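
The "dual algebra" rule in the post above, numbers of the form a+b*dx with dx^2=0, can be tried out directly. The following is only a minimal sketch: the class name `Dual`, the helper `_lift` and the example function f(x)=x^3+2x are made up for illustration and do not come from the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A number a + b*dx in R[x]/(x^2), i.e. with dx*dx = 0 (illustrative class)."""
    a: float  # real part
    b: float  # coefficient of dx

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.a + other.a, self.b + other.b)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.a - other.a, self.b - other.b)

    def __mul__(self, other):
        other = _lift(other)
        # (a + b dx)(c + e dx) = ac + (ae + bc) dx, because dx^2 = 0
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    __rmul__ = __mul__


def _lift(value):
    """Treat a plain number as a dual number with zero dx part."""
    return value if isinstance(value, Dual) else Dual(float(value), 0.0)


def f(x):
    # Hypothetical example function: f(x) = x^3 + 2x, so f'(x) = 3x^2 + 2
    return x * x * x + 2 * x


dx = Dual(0.0, 1.0)
x0 = 2.0
dy = f(x0 + dx) - f(x0)  # f(x+dx) - f(x), which should equal f'(x) dx
print(dy)                # Dual(a=0.0, b=14.0), and f'(2) = 14
print(dx * dx)           # Dual(a=0.0, b=0.0)
```

The dx part of the result is exactly f'(x0), with no limit taken, because every term containing dx^2 is dropped by the multiplication rule.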

lurflurf said:
There are several different meanings that can be attached to differentials.
consider y=f(x)
non-standard analysis
let dx be "small" in the sense that it is an infinitesimal in the non-standard analysis sense
dx^2 is also defined and is "smaller"
f'(x)~[f(x+dx)-f(x)]/dx

So in nonstandard analysis dx is an infinitely small number, an infinitesimal, belonging to the set of hyperreal numbers?

lurflurf said:
standard analysis
dx need not be small
dy=f'(x) dx
here dy:R^2->R
we have dy~f(x+dx)-f(x) when dx~0
but dy is not "small" or near f(x+dx)-f(x) in general

I thought it was the convention in standard analysis to represent any change in x as $\Delta x$ and only vanishingly small changes in x and y as dx and dy, defined in terms of a limit, but not thought of as numbers. But obviously I have a lot to learn!

lurflurf said:
dual algebra
we expand R into a ring R[x]/(x^2)
that is numbers of the form a+b*dx where dx is infinitesimal in the sense dx^2=0
f(x+dx)=f(x)+f'(x) dx
dy=f(x+dx)-f(x)=f'(x)dx

This is the first I've heard of dual algebra. Can you recommend any websites or books that introduce the concept at a basic level?

lurflurf said:
differential forms
very similar to standard analysis except we resolve the invariance and problem by the condition
dy^2=dy^dy=0

I don't know what you mean by "the invariance" or "problem" or "resolve the invariance and problem". I don't know what it means to say dy^2 in this context. I saw somewhere the notation x^2 for a vector x being used to mean: g(x,x). By analogy with this, my first guess would be that for a 1-form dy, dy^2 might mean g^-1(dy,dy). But I don't think this is always zero, is it? And Sean Carroll says that scalars are "automatically" 0-forms. But for an arbitrary scalar s, s^2 isn't in general equal either to s^s or to 0.

lurflurf said:
in the definition you cite one does not consider an "exchange" of 0 indices or 1 index
all exchanges are of two indices
to generalize the exchange into a permutation one can either decompose a permutation into exchanges (the sign changes when an odd number of exchanges and does not change when there are an even number of exchanges) or introduce the signature of the permutation.

On p. 21 it defines a (differential) p-form as an antisymmetric (0,p) tensor, and seems to be saying that this definition leads to the conclusion that scalars are 0-forms and dual vectors 1-forms. But on p. 18 it defines antisymmetry: "a tensor is antisymmetric in any of its indices if it changes sign when those indices are exchanged" and complete antisymmetry as the quality of being antisymmetric "in all of its indices". So, as far as I can see, it does indeed define p-forms in terms of antisymmetry, and antisymmetry in terms of exchanges, which is what led me to the question of what it might mean to exchange no indices or one index!

Does it mean something like this:

1. Exchange each pair of indices (if there are any pairs). Are there any cases where a pair of indices were exchanged but the sign was not reversed?

2. Count the number of type (1,0) tensors which comprise the tensor. Is this number anything other than zero?

If no and no, then the tensor is a p-form.

The fact that a scalar can be seen as a 0-form comes, from my understanding, from the notion of the exterior derivative d. The exterior derivative d is a differential operator sending p forms to p+1 forms. If you apply this to a scalar, you get a 1-form (just plug it in the definition). This makes people say that a scalar apparently can be seen as a 0-form, because the d sends a 0-form to a 1-form in this case. But this can be a little confusing, because the scalar doesn't have any indices!

You shouldn't try to justify this notion with the wedge product, because the wedge product is only defined for forms with indices; without indices the wedge product becomes an ordinary multiplication, and of course the product of a scalar with itself is only zero if the scalar itself is zero.

A p-form can be defined as a covariant tensor (lower indices) which is completely antisymmetric in the indices. So for a 3-form A with components $A_{\mu\nu\rho}$ we have

$$A_{\mu\nu\rho} = -A_{\nu\mu\rho} = A_{\nu\rho\mu} = -A_{\rho\nu\mu} = A_{\rho\mu\nu} = -A_{\mu\rho\nu}$$

A (I think) nice way to state this is to simply use an antisymmetric basis; every tensor with lower indices expanded in this basis is then automatically a form. This expansion is done via the wedge product.

I think Nakahara's treatment on forms, chapter 5 (I believe) would be a nice read for you :)

By the way, a quick way of judging if a permutation of the indices gives you a minus sign or not is the following: write down the original indices, (here $\mu\nu\rho$), write underneath them the permutation you're interested in and connect equal indices. An odd number of crossings gives you a minus sign.
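
The crossing-counting rule can be checked mechanically. In the sketch below, a permutation is written as a rearrangement of the positions 0, 1, 2 (a representation chosen here for illustration), and each pair that appears out of order corresponds to one crossing in the two-row diagram. The function name `permutation_sign` is hypothetical.

```python
from itertools import permutations


def permutation_sign(perm):
    """Return +1 or -1 for a permutation given as a rearranged tuple of 0..n-1.

    Each inversion (a pair that appears out of order) is one crossing in the
    two-row diagram, so an odd number of inversions gives a minus sign.
    """
    crossings = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if crossings % 2 else 1


labels = ("mu", "nu", "rho")
for perm in permutations(range(3)):
    name = " ".join(labels[k] for k in perm)
    print(f"{name}: {permutation_sign(perm):+d}")
```

The six signs printed agree with the chain of equalities for $A_{\mu\nu\rho}$ given earlier in this post.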

A nice intuitive introduction is given by Zee's Nutshell book on QFT. The wedge product should be intuitively clear from linear algebra; the area spanned by two vectors is in 3 dimensions given by a vector which flips sign if you reverse the order of the two vectors in the product. The same goes for a volume, which is represented by the determinant.

Well, until I solve my confusion, I might as well revel in it with this quote from Roger Penrose's The Road to Reality:

"Confusion easily arises between the 'classical' idea that a thing like $\mathrm{d}x^r$ should stand for an infinitesimal displacement (vector), whereas we here seem to be viewing it as a covector. In fact, the notation is consistent, but it needs a clear head to see this! The quantity $\mathrm{d}x^r$ seems to have a vectorial character because of its upper index r, and this would indeed be the case if r is treated as an abstract index, in accordance with § 12.8. On the other hand, if r is taken as a numerical index, say r = 2, then we do get a covector, namely $\mathrm{d}x^2$, the gradient of the scalar quantity $y = x^2$ ('x-two, not x squared'). But this depends on the interpretation of 'd' as standing for the gradient rather than as denoting an infinitesimal, as it would have done in the classical tradition. In fact, if we treat both the r as abstract and the d as gradient, then $\mathrm{d}x^r$ simply stands for the (abstract) Kronecker delta!"

Maybe it would all fall into place if I knew what that meant. At least the words are familiar, even if I can't fit them all together yet...

When the word "infinitesimal" is used in a physics book, the word never actually refers to an infinitesimal. The authors use it as a secret code that means that the next mathematical expression you see only includes a finite number of terms from a Taylor series in some variable. It makes you wonder why they don't just say that.

If f is a real-valued function of one real variable, the differential df is defined as a real-valued function of two variables, df(x,h)=f'(x)h. Note that h, which is sometimes written as "dx", doesn't need to be small. h needs to be small when we try to estimate f(x+h)-f(x) by df(x,h), because these two only agree to first order in h. So a lot of physicists would say that f(x+h)-f(x)=df(x,h) when h is infinitesimal, but that's just a dumber way of saying what I just said.
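
A quick numerical check of "these two only agree to first order in h" might look like the following. It is a sketch with an arbitrarily chosen example (f = sin at x = 1); the helper name `df` simply mirrors the notation df(x,h) used above.

```python
import math


def df(fprime, x, h):
    """Differential of f as a function of two variables: df(x, h) = f'(x) * h."""
    return fprime(x) * h


f, fprime = math.sin, math.cos  # example function chosen for illustration
x = 1.0

for h in (1.0, 0.1, 0.01, 0.001):
    actual = f(x + h) - f(x)
    linear = df(fprime, x, h)
    error = actual - linear
    # error / h shrinks with h, which is what "agree to first order" means
    print(f"h={h:<6} df={linear:+.6f} change={actual:+.6f} error/h={error / h:+.6f}")
```

Note that df(x,h) is perfectly well defined for h=1.0; it is only its usefulness as an estimate of f(x+h)-f(x) that depends on h being small.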

It's not hard to generalize this. The differential of a function $\phi:\mathbb R^n\rightarrow\mathbb R^n$ is a function $d\phi:\mathbb R^{2n}\rightarrow\mathbb R^n$ defined by $d\phi(x,h)=J_\phi(x)h$, where $J_\phi(x)$ is the Jacobian matrix of $\phi$ at x.

In differential geometry, we define the "d" of a real-valued function f by df(v)=v(f). Suppose that $\phi:\mathbb R^n\rightarrow\mathbb R^n$. Let I be the identity map on $\mathbb R^n$ (note that it satisfies the definition of a coordinate system), and let v be a tangent vector at x. We have

$$d(\phi^i)v=v(\phi^i)=v^j\partial_j|_x\phi^i=v^j(\phi^i\circ I^{-1})_{,j}(I(x))=v^j\phi^i{}_{,j}(x)=(J_\phi(x)v)^i=d\phi(x,v)^i$$

where the last "d" is the other kind of "d", the kind I mentioned first. $df|_x$ is a cotangent vector at x with components $(df)_i(x)=df(\partial_i|_x)=\partial_i|_x(f)=f_{,i}(x)$ but the gradient is the tangent vector that corresponds to it via the isomorphism defined by the metric, i.e. you need to "raise an index" to get the gradient of f from $df$.

Thanks for that. I have a bunch of, no doubt, very naive follow-up questions.

Fredrik said:
It's not hard to generalize this. The differential of a function $\phi:\mathbb R^n\rightarrow\mathbb R^n$ is a function $d\phi:\mathbb R^{2n}\rightarrow\mathbb R^n$ defined by $d\phi(x,h)=J_\phi(x)h$, where $J_\phi(x)$ is the Jacobian matrix of $\phi$ at x.

I take it h was a real number in the special case of the differential of a function of a single variable, and the juxtaposition $f' \, h$ there denoted multiplication. Is that the case here too, with h a real number that scales the value of the matrix product of the Jacobian matrix and the vector x? Could we write this more fully: $d\phi : \mathbb{R}^n\times\mathbb{R}\rightarrow\mathbb{R}^n : d\phi\left(\vec{x},h\right) = J_\phi \, \vec{x} \, h$?

Fredrik said:
In differential geometry, we define the "d" of a real-valued function f by df(v)=v(f).

In Geometrical Methods..., Schutz defines v(f) as equivalent to f(v), when v is a tangent vector and f a cotangent vector, so does this mean that when f is a cotangent vector, df = f, and when we let the differential operator d act on a tangent vector v, dv = v, since df(v) = v(f) = f(v), and dv(f) = f(v) = v(f)? Given that this does nothing at all, I assume there's something I've misunderstood.

Fredrik said:
Suppose that $\phi:\mathbb R^n\rightarrow\mathbb R^n$.

Perhaps I have a too narrow idea of what real-valued means. I thought it meant that the codomain was $\mathbb{R}$.

Fredrik said:
Let I be the identity map on $\mathbb R^n$ (note that it satisfies the definition of a coordinate system), and let v be a tangent vector at x. We have

$$d(\phi^i)v=v(\phi^i)=v^j\partial_j|_x\phi^i=v^j(\phi^i\circ I^{-1})_{,j}(I(x))=v^j\phi^i{}_{,j}(x)=(J_\phi(x)v)^i=d\phi(x,v)^i$$

where the last "d" is the other kind of "d", the kind I mentioned first.

I'm still pondering that...

Fredrik said:
$df|_x$ is a cotangent vector at x with components $(df)_i(x)=df(\partial_i|_x)=\partial_i|_x(f)=f_{,i}(x)$ but the gradient is the tangent vector that corresponds to it via the isomorphism defined by the metric, i.e. you need to "raise an index" to get the gradient of f from $df$.

I get the feeling some people give the name gradient to the cotangent vector itself, but I could be mistaken. In The Road to Reality, Penrose seems to be calling directional derivatives gradients, and df the "full gradient" (e.g. fig. 10.8).

Rasalhague said:
I take it h was a real number in the special case of the differential of a function of a single variable, and the juxtaposition $f' \, h$ there denoted multiplication. Is that the case here too,
Yes and no. f'(x)h is the product of two real numbers. $J_\phi(x)h$ is the product of an n×n matrix and an n×1 matrix.

Rasalhague said:
In Geometrical Methods..., Schutz defines v(f) as equivalent to f(v), when v is a tangent vector and f a cotangent vector,
I agree with that definition when f is a cotangent vector, but my f is a function, not a cotangent vector. (Recall that tangent vectors are derivative operators on the ring of smooth functions from the manifold into the real numbers). My df is a 1-form, i.e. a cotangent vector.

Rasalhague said:
so does this mean that when f is a cotangent vector, df = f,
The d operation can be generalized to a function that takes n-forms to (n+1)-forms. The d I defined can be thought of as a special case of that, if we define "0-forms" to be functions. A cotangent vector is a 1-form, so if ω is a cotangent vector, dω is a 2-form, i.e. an alternating tensor of the type that acts on two tangent vectors to produce a number. (I have previously called that a (0,2) tensor, but I have noticed that some call it a (2,0) tensor, so I don't know what I should call it).

Rasalhague said:
and when we let the differential operator d act on a tangent vector v,
That's undefined as far as I know.

Rasalhague said:
Perhaps I have a too narrow idea of what real-valued means. I thought it meant that the codomain was $\mathbb{R}$.
That's what it means when I use that word. $\phi$ isn't real-valued. That's why I also used the word "generalize". Perhaps I should have done it in two steps: First define $df:\mathbb R^{2n}\rightarrow\mathbb R$ for functions $f:\mathbb R^n\rightarrow\mathbb R$ by $df(x,h)=f_{,j}(x)h^j$, where ",j" is the jth partial derivative and $h^j$ is the jth component of h. Then define $df:\mathbb R^{2n}\rightarrow\mathbb R^m$ for functions $f:\mathbb R^n\rightarrow\mathbb R^m$ by applying the previous definition to the component functions: $df^i(x,h)=f^i{}_{,j}(x)h^j$. Note that the notation $df(x,h)=f'(x)h$ works for all cases if we define f'(x) as the matrix of partial derivatives (i.e. the Jacobian matrix).

Rasalhague said:
I get the feeling some people give the name gradient to the cotangent vector itself, but I could be mistaken.
I don't know. I'm just using the definition I found here.

The differential of a function was originally thought of as a small displacement of a measurement. Small was not rigorously defined but it meant small enough so that the displacement was essentially dependent only on the local neighborhood.

A small enough displacement is dominated by a linear function and it was this linear part that became canonized as the differential of the function.

In a multi-variable world this differential is not a single number but a matrix of directional derivatives. dF(vector) is the directional derivative along a curve whose tangent at the given point is the vector. dF(vector) is the best linear approximation to the displacement of the function along the curve for small time increments. My calculus teacher called it a BLT (Best Linear Transformation).

The differential is not the same as the gradient. A gradient is a vector. A differential is a linear operator. However, with an inner product one can find a vector v so that dF(x) = <v,x>. In other words, the linear operator dF is the same as the linear operator <v,>. This vector, v, is the gradient with respect to the given inner product. However, with a different inner product you would get a different gradient vector.

In a way calculus is taught wrong at first because the directional derivative of a function in the direction x is said to be gradF.x
The Euclidean inner product is used to find the gradient but you aren't told that and it is totally unnecessary. All that is needed is the differential.
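
To make the dependence on the inner product concrete, here is a small two-dimensional sketch. The components of dF, the second inner product and the direction x are made-up numbers, and the helper names `solve_2x2` and `inner` are hypothetical; an inner product is represented by a symmetric positive-definite matrix g, so that <u,v> = u^T g v.

```python
def solve_2x2(g, w):
    """Solve g @ v = w for v, where g is a 2x2 matrix given as nested tuples."""
    (a, b), (c, d) = g
    det = a * d - b * c
    return ((d * w[0] - b * w[1]) / det, (-c * w[0] + a * w[1]) / det)


def inner(g, u, v):
    """Inner product <u, v> defined by the symmetric positive-definite matrix g."""
    return sum(u[i] * g[i][j] * v[j] for i in range(2) for j in range(2))


# Components of dF at some point: the partial derivatives of F there (made-up values).
dF = (3.0, -1.0)

euclidean = ((1.0, 0.0), (0.0, 1.0))
other = ((2.0, 1.0), (1.0, 3.0))  # a different, hypothetical inner product

x = (0.5, 2.0)                    # direction in which to differentiate
directional = dF[0] * x[0] + dF[1] * x[1]  # dF(x), no inner product needed

for name, g in (("euclidean", euclidean), ("other", other)):
    grad = solve_2x2(g, dF)       # the vector v with <v, x> = dF(x) for all x
    print(name, grad, inner(g, grad, x), directional)
```

The two gradient vectors differ, (3, -1) versus (2, -1), while <v,x> reproduces the same directional derivative dF(x) in both cases.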

The study of calculus without inner products is the field of differential topology. With inner products, it is differential geometry.

The curly delta is used to define a small displacement as well, but usually the displacement occurs in an infinite-dimensional space such as the space of piecewise smooth curves connecting two points on a surface. Instead of a function one has the integral of a function over each curve and one wants to know the infinitesimal displacement of this integral as one shifts to nearby curves. But the intuitive idea is exactly the same.

Many thanks for the answers. I will get round to replying more fully eventually. But for now, is the following anywhere near correct? Just looking, to begin with, at the case of elementary, single-variable calculus, $\mathrm{d}x$ could mean one of three things:

(1) A function called the differential of $x$, where $x:\mathbb{R} \rightarrow \mathbb{R}$. This $\mathrm{d}x$ is defined as $\mathrm{d}x:\mathbb{R}^2 \rightarrow \mathbb{R}, \enspace (t_0,\Delta t) \mapsto \mathrm{d}x(t_0,\Delta t) = \frac{\mathrm{d} x}{\mathrm{d} t} (t_0) \enspace \Delta t$, where $\Delta t$ means a finite increment in the independent variable. And for this kind of $\mathrm{d}x$,

$$\mathrm{d}x(t_0,\Delta t) = \Delta x(t_0, \Delta t) - \frac{\mathrm{d}^2 x}{\mathrm{d} t^2} (c) \enspace \frac{(\Delta t)^2}{2},$$

where $t_0 < c < t_0 + \Delta t$. The differential is a linear approximation (first-order approximation) of an increment in the function due to a given increment in the independent variable. The second-order term in the equation above gives the error in this approximation. The error approaches zero as $\Delta t$ approaches zero. Since the increment is finite, the error is finite. To be a useful approximation, the increment must be small, but small has no rigorous, all-purpose definition here (i.e. it isn't a euphemism for infinitesimal), and how small "small" is will depend on the application.

(2) An alternative way of denoting a finite increment, $\Delta x$, in the independent variable, $x$, of a function $f:\mathbb{R} \rightarrow \mathbb{R}, \enspace x \mapsto f(x)$, which is used, by convention, when this increment appears on the other side of an equals sign to an expression of the form $\mathrm{d}f$, where $\mathrm{d}f$ has meaning (1), the differential of $f$. Thus:

$$\mathrm{d}_{\text{sense 1}}f(x_0,\Delta x) = \frac{\mathrm{d} f}{\mathrm{d} x}(x_0) \enspace \mathrm{d}_{\text{sense 2}}x = \frac{\mathrm{d} f}{\mathrm{d} x}(x_0) \enspace \Delta x.$$

(3) In nonstandard analysis, literally an infinitesimal--and in standard analysis an equivalent concept to an infinitesimal, defined in terms of a limit--in the following contexts: (i) Leibniz notation for a derivative, $\frac{\mathrm{d} x}{\mathrm{d} t}$, where both $\mathrm{d}x$ and $\mathrm{d}t$ are infinitesimals (or limits), (ii) in the differential form, $\mathrm{d}x = \frac{\mathrm{d}x}{\mathrm{d}t} \mathrm{d}t$, of a differential equation, where again both $\mathrm{d}x$ and $\mathrm{d}t$ are infinitesimals or limits (or is this an example of the linear approximation and increment meanings?), (iii) the symbol for an integration variable, such as $\mathrm{d}t$ in $\int x(t) \enspace \mathrm{d}t$, (iv) in the substitution formula, $\mathrm{d}u = \frac{\mathrm{d} u}{\mathrm{d} x} \mathrm{d}x$, for a change of variable of integration, where both $\mathrm{d}u$ and $\mathrm{d}x$ are infinitesimals (or, equivalently, some kind of limit), perhaps a special case of the second category, and (v) often in physics, an infinitesimal increment generally, in which context it may also be called a differential displacement.

Now, given the identical notations for (1) and (3), it's tempting to think that there's a definition of differential that encompasses both ideas: in nonstandard terms, perhaps, a hyperreal-valued function of one real ($t_0$) and one hyperreal ($\Delta t$) variable. I'm not sure how to express this idea clearly in standard terms (the language of limits) though. Or is it best to keep all of these concepts distinct, in spite of the notation and names?

Fredrik, when you wrote, in #9,

df(x,h)=f'(x)h. Note that h, which is sometimes written as "dx", doesn't need to be small. h needs to be small when we try to estimate f(x+h)-f(x) by df(x,h), because these two only agree to first order in h. So a lot of physicists would say that f(x+h)-f(x)=df(x,h) when h is infinitesimal, but that's just a dumber way of saying what I just said.

the "dumber way" sounds like a short-hand version of the nonstandard analysis formulation, in so far as I understand it (possibly not very far...), that when h is infinitesimal, it disappears when we take the "standard part", which I suppose corresponds to the limit-based idea that the error in the first-order approximation, provided by the differential, vanishes as the increment h approaches zero.

Rasalhague said:
$\mathrm{d}x$ could mean one of three things: (1) A function called the differential of $x$, where $x:\mathbb{R} \rightarrow \mathbb{R}$. This $\mathrm{d}x$ is defined as $\mathrm{d}x:\mathbb{R}^2 \rightarrow \mathbb{R}, \enspace (t_0,\Delta t) \mapsto \mathrm{d}x(t_0,\Delta t) = \frac{\mathrm{d} x}{\mathrm{d} t} (t_0) \enspace \Delta t$, where $\Delta t$ means a finite increment in the independent variable.
That's the definition I use.

Rasalhague said:
And for this kind of $\mathrm{d}x$,
$$\mathrm{d}x(t_0,\Delta t) = \Delta x(t_0, \Delta t) - \frac{\mathrm{d}^2 x}{\mathrm{d} t^2} (c) \enspace \frac{(\Delta t)^2}{2},$$
where $t_0 < c < t_0 + \Delta t$.
If your $\Delta x(t_0,\Delta t)$ is the actual change of x when you change t by $\Delta t$, then you need all the higher order terms as well on the right-hand side.

Rasalhague said:
The differential is a linear approximation (first-order approximation) of an increment in the function due to a given increment in the independent variable. The second-order term in the equation above gives the error in this approximation.
You definitely need the higher order terms as well. If you write

$$f(x)=f(0)+f'(0)x+E(x)$$

you can see that

$$\left|\frac{E(x)}{x}\right|\rightarrow 0$$

when $x\rightarrow 0$, but you can't just set the error term equal to the second order term in the Taylor expansion. There's a pretty cool trick you can use to get all the other terms...

$$E(x)=f(x)-f(0)-f'(0)x=\int_0^x f'(t)dt-\int_0^xf'(0)dt=\int_0^x(f'(t)-f'(0))dt$$

$$=\int_0^x (f''(0)t+E_1(t))dt=f''(0)\frac{x^2}{2}+\int_0^x E_1(t)dt$$

where $E_1(t)$ is a new error term (which you can integrate to get the error you get when you keep terms up to second order). And now you can use the same method to get an expression for $E_1(t)$, which will contain another error term $E_2(t')$. And you don't have to stop there. The equation E(x)=f(x)-f(0)-f'(0)x generates the entire Taylor series recursively.

Rasalhague said:
(2) An alternative way of denoting a finite increment
Yes, it's used that way too.

Rasalhague said:
(3) In nonstandard analysis, literally an infinitesimal
I know almost nothing about non-standard analysis, so I'm not going to comment.

Rasalhague said:
the "dumber way" sounds like a short-hand version of the nonstandard analysis formulation,
Maybe it does, but it also sounds like a shorthand for the Taylor series version, and you should keep in mind that the people who write these things probably don't know anything about non-standard analysis. They have probably heard the term and know that it includes a rigorous definition of infinitesimals, but they have no idea what that definition is. So I think you should just interpret that word as a warning that the next equation you see is an approximation valid to some order in the independent variable(s), and not as something that has anything to do with actual infinitesimals.

Fredrik said:
you can't just set the error term equal to the second order term in the Taylor expansion.

I didn't mean just the second term,

$$\frac{\mathrm{d}^2 x}{\mathrm{d} t^2} (t_0) \enspace \frac{(\Delta t)^2}{2},$$

of the Taylor expansion, but the actual error term, involving the second derivative evaluated not at $t_0$, but at some other point, $c$, on the open interval $(t_0, t_0+\Delta t)$. It's all too possible I've misunderstood, but I got this from Berkey/Blanchard: Calculus (3rd edition), Ch. 4, Theorem 12, "Taylor's Theorem (First Derivative Version)":

Suppose the function f is continuous on the interval [a,b] and twice differentiable on (a,b). Then there exists a number c $\in$ (a,b) such that

$$f(b)=f(a)+f'(a)(b-a)+\frac{f''(c)}{2}(b-a)^2.$$

I just subtracted f(a) from both sides of that to find an expression for (if not a method of calculating) $\Delta f$.
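
The difference between the second-order Taylor term, with f'' evaluated at a, and the exact remainder in the theorem, with f'' evaluated at some c in (a,b), can be seen in a concrete case. The sketch below uses f = exp on [0, 1], an example chosen here only because the equation for c can then be solved in closed form; it is not taken from Berkey/Blanchard.

```python
import math

# Example chosen for illustration: f = exp on [a, b] = [0, 1], so f' = f'' = exp.
a, b = 0.0, 1.0
f = fprime = fsecond = math.exp

# What is left over after the linear approximation: f(b) - f(a) - f'(a)(b - a)
remainder = f(b) - f(a) - fprime(a) * (b - a)

# Solve f''(c) / 2 * (b - a)^2 = remainder for c; possible in closed form for exp.
c = math.log(2.0 * remainder / (b - a) ** 2)

print(f"remainder = {remainder:.6f}")
print(f"c = {c:.6f}, inside (a, b): {a < c < b}")
print(f"f''(c)/2 * (b-a)^2 = {fsecond(c) / 2 * (b - a) ** 2:.6f}")
print(f"f''(a)/2 * (b-a)^2 = {fsecond(a) / 2 * (b - a) ** 2:.6f}  (second-order Taylor term)")
```

With f'' evaluated at c the single term accounts for the whole error, whereas the term with f'' evaluated at a is only the first of the higher-order terms.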

Fredrik said:
Yes, it's used that way too.

That seems like having two tools that are used together for different, but complementary, purposes, such as a hammer and tongs, and calling them each by the same name, e.g. hammer, and then referring to various other tools in the workshop as hammers too, without always saying which kind of hammer they are, and leaving open the possibility that they might be some other kind of tool entirely, the traditional answer to enquiries being "it's just a convenient name"!

In the Leibniz notation for a derivative, in the standard view that doesn't include infinitesimals, I guess the d in both numerator and denominator is a relic of a time when they were regarded as infinitesimals, and Berkey & Blanchard's frequent use of the word "notation" is their way of saying that this system doesn't treat them as infinitesimals.

How about the d in the integral symbol, the notation of the substitution formula du = du/dx dx, and the "differential form" of a differential equation? Should these be thought of, in standard analysis, as relics from a time when they were treated as infinitesimals, or are they examples of the linear approximation or the finite increment meanings?

*

Just for the fun of it, here's a medley of coy quotes from Berkey/Blanchard. They seem to give the name differential both to the linear approximation function (sense 1 in #13), and to the increment of the independent variable (sense 2), and use the notation for several concepts besides. One thing we can be sure of: it's a "notation"!

"Until we study antidifferentiation in Ch. 5, we shall regard the differentials dx and dy as merely a notational device to help us remember the linear approximation $\Delta y \approx f'(x) \enspace \Delta x$ [...] Frequently the symbol dx is used to denote small changes in x [...] and the symbol dy is used to represent the approximation to the resulting increment $\Delta y$ given by the right side of the approximation $\Delta y \approx f'(x) \enspace \Delta x$ [...] Historically they have been used to argue that the derivative can be thought of as a ratio of infinitesimals" (pp. 165-6).

This from Chapter 5, where enlightenment was promised:

"While the symbol dx suggests the differential discussed in Ch. 3, it should be regarded for now as simply part of the notation signifying the indefinite integral for f" (p. 278).

Here, apparently an admission that a third entity is denoted by the same symbol:

"In the method of substitution, it is important to note that eq. 5, $\int f(g(x)) \enspace g'(x) dx = \int f(u) \enspace du = F(u) + C$, results from the notation $du = g'(x) dx$ and not from its interpretation as a linear approximation" (p. 290).

(Which reminds me of someone's mention in a recent thread here of marking a piece of homework that offered a "proof by notation".) And yet, they justify this "way to simplify the procedure of identifying the integrand" by saying that it's "based on the notation for the differential du of the function u=g(x). Recall the definition of a differential (Section 3.7)" (p. 289). But Section 3.7, quoted above, defined the differential corresponding formally to du here as a linear approximation, the very thing they warn us not to interpret it as!

"We will frequently encounter the differential equation $\frac{\mathrm{d}y}{\mathrm{d}x}=f(x)$ in the differential form $dy=f(x) \enspace dx$. In fact, the differential formulation of [this] differential equation is simply another use of the differential notation introduced in Section 3.7. Recall that if y=F(x), then we defined dy=F'(x) dx. When we write dy=f(x) dx, we are asserting that dy/dx = F'(x) = f(x)" (p. 297).

In this usage, do the d's have the same two meanings as in the previous example, "the method of substitution", p. 290? The fact that they call this "another use" of the same notation suggests that perhaps they don't intend it to have the same meanings (linear approximation and finite increment) as in Section 3.7, the only actual definition, but they don't explicitly say whether and to what extent it should be regarded as the same concept as any of the other uses of the same notation.

"We write the symbol dx following the integrand f(x) to indicate that x is the independent variable for f. (We shall see later [at some unspecified point!] that the symbol dx has a special meaning associated with the differential dx [we're not told which of the many things called a differential so far, or in what way associated], as suggested by the Riemann sum. But for now simply regard dx as part of the notation identifying the definite integral)" (p. 325).

What the relationship to the Riemann sum always "suggested" to me was that dx in the integral notation was (neither a linear approximation, nor a finite increment) but an infinitesimal, but if there's no such thing in standard analysis, it must be something else. Later, far from elaborating on this "special meaning", they just call it a "dummy variable" which they say is "simply used to fill out the standard notation for the definite integral" (p. 341), and when written as the numerator of a fraction in the integrand "simply a convenient notation" (p. 354).

Fredrik said:
I know almost nothing about non-standard analysis, so I'm not going to comment.

I've just dipped into this: http://www.lightandmatter.com/calc/

Aha, I just came across a justification, in the Wikipedia article Differential of a function, for writing the finite increment in the independent variable as dx, namely that $dx(x_0, \Delta x) = x'(x_0) \enspace \Delta x = \Delta x$.

## 1. What is gradient notation and how is it used?

Gradient notation is a mathematical notation used to represent the gradient of a function. It is represented by the symbol ∇ and is useful in calculating the rate of change of a function in a particular direction.

## 2. Can you explain the difference between gradient and differential notation?

While gradient notation represents the gradient of a function, differential notation (df, dx) represents its differential, the linear map that gives the first-order change of the function for a given change in its independent variables. The gradient is the vector that points in the direction of steepest increase of the function, with magnitude equal to the rate of increase in that direction.

## 3. What does the symbol ∇ mean in gradient notation?

The symbol ∇, also known as "del" or "nabla", is used in gradient notation to represent the gradient operator. It is not a variable, but rather a symbol that indicates that the following expression represents a vector gradient.

## 4. How is gradient notation used in calculus?

In calculus, gradient notation is used to calculate the directional derivative of a function, which measures the rate of change of a function in a specific direction. It is also used in optimization problems to find the maximum or minimum value of a function.

## 5. Are there any other notations used to represent the gradient?

Yes, other notations appear, such as grad(f) in place of ∇f. Df is also used, although it usually denotes the total derivative (the differential of f), which can be identified with the gradient only once an inner product has been chosen.
