What is the interpretation of dx in calculus?

"Don't panic!"
Apologies if this isn't quite the right forum to post this in, but I was unsure between this and the calculus forum.

Something that has always bothered me since first learning calculus is how to interpret dx, essentially, what does it "mean"? I understand that it doesn't make sense to consider it as an infinitesimal change in x (in a rigorous sense) as the idea of an infinitesimal cannot be formulated rigorously (at least in standard analysis), but can any sense of this notion be retained?

I read in Spivak's book that we cannot consider quantities such as df in the classical sense, but that to overcome this we can replace the notion of infinitesimals by promoting the quantities df to functions of infinitesimal changes along particular directions, i.e. functions of tangent vectors. In this sense the function ##df: T_{p}M\rightarrow\mathbb{R}## contains all information about the infinitesimal changes in the function f as it moves in particular directions. Thus we can consider the functions ##dx^{i}: T_{p}M\rightarrow\mathbb{R}## as containing all information about the infinitesimal change in the coordinate functions in particular directions. I have paraphrased what is written in the book and tried to reformulate it in a way that I can understand; would what I put be correct?

Also, is there a way to formulate the idea of a differential in elementary calculus (without resorting to non-standard analysis)? Is it correct to say that one can consider the rate of change of a function at a point, ##f'(x_{0})##, which is the gradient of the tangent line to the function ##f## at this point? From this we can construct a new function ##df## which depends on the point ##x_{0}## and a change ##\Delta x##, such that ##df(x_{0},\Delta x)=f'(x_{0})\Delta x##. Thus the (finite) change ##\Delta f## in the function near a point ##x_{0}## can be expressed as ##\Delta f=f'(x_{0})\Delta x +\varepsilon =df+\varepsilon##, where ##\varepsilon## is some error term. We note that ##dx(x_{0},\Delta x)=\Delta x##, and so ##\Delta f=f'(x_{0})dx +\varepsilon =df+\varepsilon \Rightarrow df=f'(x_{0})dx##, where ##df=f'(x_{0})dx## represents a (finite) change along the tangent line to the function ##f## at the point ##x_{0}##. I'm unsure how to proceed from here though?!
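To make the elementary picture above concrete, here is a small numerical sketch in Python (the function ##f(x)=x^{2}## is just my own choice for illustration): the error ##\varepsilon=\Delta f - df## shrinks faster than ##\Delta x## itself, which is what makes ##df## the "best" linear approximation.

```python
# A sketch of Δf = f'(x0)Δx + ε with the assumed example f(x) = x².
def f(x):
    return x * x

def df(x0, dx):
    # the differential: a function of the point x0 and the increment dx,
    # linear in dx, with df(x0, dx) = f'(x0)·dx
    return 2.0 * x0 * dx          # f'(x0) = 2x0 for this f

x0 = 3.0
for dx in (0.1, 0.01, 0.001):
    delta_f = f(x0 + dx) - f(x0)  # the actual finite change
    eps = delta_f - df(x0, dx)    # the error term ε
    print(dx, eps / dx)           # ε/Δx -> 0, so ε vanishes faster than Δx
```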
 
I found this discussion on dx that may help although you are most likely beyond this point:

http://mathforum.org/library/drmath/view/60949.html

As I read your post, it got me thinking about it too. For me, from a differential geometry perspective, dx represents a vector in the x direction, with dy and dz representing vectors along the y and z directions, so that ##df = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy + \frac{\partial f}{\partial z}dz## is built from the same components as the gradient of the function f(x,y,z).

http://en.wikipedia.org/wiki/Gradient

http://en.wikipedia.org/wiki/Differential_form
 
jedishrfu said:
dx represents a vector in the x direction, with dy and dz representing vectors along the y and z directions, so that ##df = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy + \frac{\partial f}{\partial z}dz## is built from the same components as the gradient of the function f(x,y,z).

Thanks for the comments. I understand this bit, but have gotten myself into a bit of a "maths spiral" trying to somehow relate it to the classical (non-rigorous) notion of an infinitesimal quantity?!
 
There's no escaping the infinitesimal quantity notion, as it comes up in limit proofs.

Is your confusion related to the notion of plotting the function f on x, y vs having a one dimensional geometry where the f provides a value for each x?

As an example, f could represent the temperature at any point along a one dimensional line.
 
jedishrfu said:
Is your confusion related to the notion of plotting the function f on x, y vs having a one dimensional geometry where the f provides a value for each x?

Yeah, I think it really arises from the fact that I've seen the differential written as ##df=\lim_{\Delta x\rightarrow 0}\Delta f##, but to me this would imply that ##df=\lim_{\Delta x\rightarrow 0}f'(x)\Delta x =\lim_{\Delta x\rightarrow 0}f'(x)\cdot\lim_{\Delta x\rightarrow 0}\Delta x##, which doesn't seem to make any sense, as clearly ##\lim_{\Delta x\rightarrow 0}\Delta x =0##?!
 
Yeah, I see your confusion. I don't think you can split the limit of a product like that when the factors aren't individually well behaved, though I couldn't find an exact reference:

http://math.oregonstate.edu/home/pr...estStudyGuides/SandS/lHopital/limit_laws.html

As an example:

lim y = lim (y)·(x/x) = lim (y/x) · lim (x), which leads to the nonsensical conclusion that every limit is zero whenever lim x approaches zero.
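A quick numerical illustration of why the splitting fails (Python; ##f=\sin## is an arbitrary choice of mine): both ##\Delta f## and ##\Delta x## go to zero, but their ratio converges to ##f'(x)## — the ratio is where the information lives.

```python
import math

# Both Δf and Δx tend to 0, but Δf/Δx converges to f'(x).
# Splitting lim Δf = (lim Δf/Δx)·(lim Δx) = f'(x)·0 = 0 throws away
# exactly the quantity (the ratio) that carries the real information.
f = math.sin
x = 1.0
for dx in (1e-1, 1e-3, 1e-5):
    delta_f = f(x + dx) - f(x)
    print(dx, delta_f, delta_f / dx)   # ratio -> cos(1) ≈ 0.5403
```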

Perhaps reviewing the use of L'Hospital's rule will help resolve your confusion:

http://math.oregonstate.edu/home/pr...uestStudyGuides/SandS/lHopital/statement.html

http://en.wikipedia.org/wiki/L'Hôpital's_rule

@Mark44 would have a much better answer though.
 
"Don't panic!" said:
Also, is there a way to formulate the idea of a differential in elementary calculus (without resorting to non-standard analysis)? Is it correct to say that one can consider the rate of change in a function at a point, f'(x_{0}) which is the gradient of the tangent line y to the function f at this point. From this we can construct a new function df which is dependent on the point x and this change in its value \Delta x, such that df(x_{0},\Delta x)=f'(x_{0})\Delta x Thus, the (finite) change in the function near a point x_{0}, \Delta f can be expressed as the following \Delta f=f'(x_{0})\Delta x +\varepsilon =df+\varepsilon where \varepsilon is some error function. We note that dx(x_{0},\Delta x)=\Delta x and so \Delta f=f'(x_{0})dx +\varepsilon =df+\varepsilon \Rightarrow df=f'(x_{0})dx where df=f'(x_{0})dx represents a (finite) change along the tangent line to the function f at the point x_{0}. I'm unsure how to proceed from here though?!

The idea of dx in single variable calculus is the same as in several dimensions. If you have a function f(x), then

df = f'(x)dx

is a 1-form. If you have a small change in x, represented by

ζ = Δx (∂/∂x)

Then the change in f is

Δf = (f'(x)dx)⋅ζ

= f'(x)Δx

The interpretation of this is that Δf is the change in f when ζ is very small, or more precisely, it is the first order change in f. You cannot write

Δf = df + ε

because Δf is a number and df is a 1-form.
 
"Don't panic!" said:
Yeah, I think it really arises from the fact that I've seen the differential written as ##df=\lim_{\Delta x\rightarrow 0}\Delta f##, but to me this would imply that ##df=\lim_{\Delta x\rightarrow 0}f'(x)\Delta x =\lim_{\Delta x\rightarrow 0}f'(x)\cdot\lim_{\Delta x\rightarrow 0}\Delta x##, which doesn't seem to make any sense, as clearly ##\lim_{\Delta x\rightarrow 0}\Delta x =0##?!

The point of the classical notion of an "infinitesimal quantity" is simply to focus on linear changes and linear approximations. f'(x) tells you the linear approximation of f at the point x. If you change x by a very small amount, then the change in f(x) comes mainly from its linear behavior, i.e. the contributions from second and higher derivatives become unimportant as ##\Delta x## goes to 0.

In modern calculus, instead of developing calculus on the idea of a derivative (which is a representation of the linear behavior of f, since it is the slope of the tangent line to f), we base calculus on the linear behavior itself, which is df, defined as a linear operator. If you have a curve ##\xi(\lambda)## in M, then

df( d/dλ )

tells you the linear part of change in f as you move along the curve.
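To see what ##df(d/d\lambda)## computes in coordinates, here is a numerical sketch (Python; the function and curve are my own choices): pairing ##df## with the curve's velocity reproduces the ordinary derivative of ##f\circ\xi##.

```python
import math

# f on R², and a curve ξ(λ); df paired with the curve's velocity
# equals the ordinary derivative of f∘ξ (both are example choices).
def f(x, y):
    return x * x + y

def grad_f(x, y):
    return (2.0 * x, 1.0)

def curve(lam):                      # ξ(λ) = (cos λ, sin λ)
    return (math.cos(lam), math.sin(lam))

def velocity(lam):                   # ξ'(λ), the tangent vector d/dλ
    return (-math.sin(lam), math.cos(lam))

lam = 0.7
gx, gy = grad_f(*curve(lam))
vx, vy = velocity(lam)
df_of_v = gx * vx + gy * vy          # df(d/dλ) = Σ_i (∂f/∂x^i) v^i

# finite-difference derivative of f along the curve, for comparison
h = 1e-6
fd = (f(*curve(lam + h)) - f(*curve(lam - h))) / (2.0 * h)
print(df_of_v, fd)                   # the two agree
```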
 
So would it be correct to say that df encodes all the information about the incremental change in f at a particular point due to an infinitesimal change along a particular coordinate direction (itself encoded in the partial derivative \frac{\partial} {\partial x^{i}})?!
 
  • #10
Yes. It encodes how f is changing in all directions, not just a particular direction. More precisely, it encodes the linear changes in f in all directions, which is another way of saying that it encodes how f changes with infinitesimal changes in x.
 
  • #11
I am going to give this modern point of view of differentials because it is the one that one must use when doing calculus on manifolds.

In modern Mathematics differentials are linear maps between tangent spaces. They codify the idea of the Jacobian matrix of a function.

From this point of view, dx can be thought of as the identity linear transformation on the tangent space to R. That is dx(∂/∂x) = ∂/∂x. In this context, it is the differential of the identity map, x-> x of R into itself.

Aside: Usually, since every tangent vector to R is of the form A(x)∂/∂x at some point x, one often writes dx(A(x)∂/∂x) = A(x), and because of this one can define dx as having values in R. So dx, or the differential of any function, may be defined as a 1-form that maps the tangent space into R. But in this way of looking at it, dx is being treated as the identity linear map with values in the tangent bundle to R.

In general if f maps an n-manifold into an m-manifold, df is a linear map of the n dimensional tangent spaces into the m dimensional tangent spaces.

To each smooth function between two smooth manifolds

f: M ->N

the differential, df, is a smooth vector bundle morphism of the tangent bundles. This means that df: TM -> TN is such that df: TM##_{p}## -> TN##_{f(p)}## is linear on each fiber, TM##_{p}##, and df is a smooth function between the two smooth manifolds TM and TN.

If f and g are two smooth maps

f: M -> N and g: N -> W then

d(g∘f) = dg∘df

This is the general form of the Chain Rule.

Note also that if

f = id##_{M}##: M -> M, then

df is the identity linear transformation on each tangent space to M. The case of dx is the special case of the identity map on R.

So for illustration, take the case of a function from R to R.
df is a linear map from the tangent bundle of R into itself that is linear on fibers.

This means that df(A(x)∂/∂x) = A(x)df(∂/∂x) is a tangent vector at the point f(x).

Exercise: Figure out what vector df(∂/∂x) is.

Observation: Suppose φ,ψ: U -> R##^n## are two coordinate charts on an open set U in a manifold M, and let v be a tangent vector at a point in U. Then

w = dφ(v) and u = dψ(v) are two different tangent vectors at different points in R##^n##.

Therefore, the inverse differentials dφ##^{-1}## and dψ##^{-1}## map w and u back to v.
So one can think of v as the vectors w and u identified as the same vector. In fact the identification mapping is

dφ∘dψ##^{-1}##, since dφ∘dψ##^{-1}##(u) = w (by the Chain Rule).
This is how one defines tangent spaces on manifolds in terms of coordinate charts.
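In coordinates, the Chain Rule ##d(g\circ f)=dg\circ df## is just multiplication of Jacobian matrices. A small numerical check in Python (the maps f and g are arbitrary choices of mine):

```python
import numpy as np

# f: R² -> R³ and g: R³ -> R² (both maps invented for illustration).
def f(p):
    x, y = p
    return np.array([x * y, x + y, x - y])

def Jf(p):                 # df_p as the 3×2 Jacobian matrix of f
    x, y = p
    return np.array([[y, x], [1.0, 1.0], [1.0, -1.0]])

def g(q):
    a, b, c = q
    return np.array([a * b, b * c])

def Jg(q):                 # dg_q as the 2×3 Jacobian matrix of g
    a, b, c = q
    return np.array([[b, a, 0.0], [0.0, c, b]])

p = np.array([1.0, 2.0])
# Chain Rule: d(g∘f)_p = dg_{f(p)} ∘ df_p, i.e. a matrix product
J_composite = Jg(f(p)) @ Jf(p)

# finite-difference Jacobian of g∘f, as an independent check
eps = 1e-6
J_fd = np.column_stack([
    (g(f(p + eps * e)) - g(f(p - eps * e))) / (2.0 * eps)
    for e in np.eye(2)
])
print(np.allclose(J_composite, J_fd, atol=1e-5))   # True
```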
 
  • #12
Thanks for your explanations. I'm really stuck trying to understand what exactly ##df## and ##dx## actually "represent" in a modern context. I was trying to parse Spivak's discussion (that I paraphrased in my first post), but I'm struggling if I'm honest.

Is the point that we wish ##df## to describe the infinitesimal change in ##f## due to an infinitesimal change in the point at which it is evaluated? To do so it must be a function of this infinitesimal change, i.e. a function of the tangent vectors (in the tangent space at that point), as they themselves describe all the possible directions (and rates of change) along which one can pass through that point. In this sense, if ##f=x^{i}## (where ##x^{i}## is a given coordinate function), then ##dx^{i}## describes the infinitesimal change in the coordinate function ##x^{i}## as a function of all possible directions through that point. Obviously, as a coordinate function only changes along itself, and at unit "speed" with respect to itself, we require that ##dx^{i}\left(\frac{\partial}{\partial x^{i}}\right)=1##, or more generally, ##dx^{i}\left(\frac{\partial}{\partial x^{j}}\right)=\delta^{i}_{\; j}##. In this sense the functional ##df##, evaluated along a particular coordinate direction, should describe the rate of change of ##f## at that point, i.e.
$$df\left(\frac{\partial}{\partial x^{i}}\right)=\frac{\partial f}{\partial x^{i}}$$
Would this be correct at all?
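In coordinates this is easy to check numerically. A small Python sketch (the function is my own choice), representing ##df## at a point by its components ##(\partial f/\partial x^{i})## and letting it act on tangent vectors:

```python
# Represent df at a point p by its components (∂f/∂x, ∂f/∂y), computed
# numerically; the example function f is an arbitrary choice.
def f(x, y):
    return x * x * y

def partial(func, i, p, h=1e-6):
    # central-difference approximation of ∂func/∂x^i at p
    q_plus, q_minus = list(p), list(p)
    q_plus[i] += h
    q_minus[i] -= h
    return (func(*q_plus) - func(*q_minus)) / (2.0 * h)

def df(p):
    return [partial(f, 0, p), partial(f, 1, p)]

def apply_form(omega, v):
    # a 1-form acts on a tangent vector v by ω(v) = Σ_i ω_i v^i
    return sum(c * vi for c, vi in zip(omega, v))

p = (2.0, 3.0)
print(apply_form(df(p), (1.0, 0.0)))   # df(∂/∂x) = ∂f/∂x = 2xy ≈ 12
print(apply_form(df(p), (0.0, 1.0)))   # df(∂/∂y) = ∂f/∂y = x² ≈ 4
```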
 
  • #13
"Don't panic!" said:
Is the point that we wish ##df## to describe the infinitesimal change in ##f## due to an infinitesimal change in the point at which it is evaluated? To do so it must be a function of this infinitesimal change, i.e. a function of the tangent vectors at that point. In this sense, if ##f=x^{i}##, then ##dx^{i}## describes the infinitesimal change in the coordinate function ##x^{i}## as a function of all possible directions through that point. Obviously, as a coordinate function only changes along itself, and at unit "speed" with respect to itself, we require that ##dx^{i}\left(\frac{\partial}{\partial x^{i}}\right)=1##, or more generally, ##dx^{i}\left(\frac{\partial}{\partial x^{j}}\right)=\delta^{i}_{\; j}##. Would this be correct at all?

That is correct. dx is the dual form to ∂/∂x in a coordinate system.

Whenever one has a vector space, one has a dual space. In the case of coordinate tangent vectors, one has dual coordinate tangent vectors. The concept of a dual basis is general and applies to any basis of any vector space.

I strongly recommend that you try to understand this stuff from the point of view that I described above.
Take a look at Milnor's "Topology from the Differentiable Viewpoint". He uses this method from the beginning.

As far as infinitesimals go, I think you are getting a little off track. You should think of differentials as linear maps between tangent spaces or, in the case of functions - like coordinate functions - as 1-forms, i.e. as dual vectors to the tangent space.

Whenever one has a vector space, one has the dual vector space of linear maps of the vector space into the base field. The differential of a function may be thought of as a dual vector because, given any tangent vector v at a point, df(v) = v.f depends linearly on v. That is, given any linear combination of tangent vectors av + bw,

df(av + bw) = a(v.f) + b(w.f).

This of course is true for coordinate functions.
 
  • #14
So is the definition df(p)(v_{p})=v_{p}[f](p) where v_{p}\in T_{p}M given this way because we wish df, evaluated at a particular point p\in M, to describe the infinitesimal change in f as it passes through p along the direction described by v_{p}, and as such this must equal the directional derivative of the function along v_{p}?
 
  • #15
"Don't panic!" said:
So is the definition df(p)(v_{p})=v_{p}[f](p) where v_{p}\in T_{p}M given this way because we wish df, evaluated at a particular point p\in M, to describe the infinitesimal change in f as it passes through p along the direction described by v_{p}, and as such this must equal the directional derivative of the function along v_{p}?
yes
 
  • #16
Ah, ok. Thank you very much for your help.
 
  • #17
Not a mathematician, but I think ##dx## only has any meaning if it appears in an expression containing another differential such as ##dy##, kind of like how an open parenthesis only has meaning when there is a closing parenthesis in the same expression. It tells how two or more variables scale with each other, so one differential by itself means nothing. They need to be arranged in an equation such that they can be rearranged into a division, like ##dy/dx## or ##dx/dy##, and the derivatives have their usual epsilon-limit definition. But the differential notation is nice because it shows the symmetry between the two variables.

A multivariable differential like
##dz = x dx + y dy##
means the same thing as
##\partial z/\partial x = x## (holding y fixed)
##\partial z/\partial y = y## (holding x fixed)
##\partial x/\partial z = 1/x## (holding y fixed)
##\partial x/\partial y = -y/x## (holding z fixed)
##\partial y/\partial z = 1/y## (holding x fixed)
##\partial y/\partial x = -x/y## (holding z fixed)
with the implication that z is a function of x and y, and vice versa
but written more elegantly and symmetrically.
sorry if I'm stating the obvious
 
  • #18
df is a function of 2 variables, x and h. For each x, df is the homogeneous linear approximation to the function of h defined by f(x+h)-f(x). Thus if you imagine the graph of y = f(x) and its family of tangent lines, these tangent lines are the graphs of the functions df + f(a), one for each value a of x. I.e. for each a, the tangent line at (a, f(a)) is the graph of df(x-a) + f(a). Taking f(x) = x, you get the definition of dx. I hope this is right. I'm watching the Harvard UNC game.

I.e. df is really a section of the cotangent bundle of the x-axis, whose value at x is the linear function of h with value f'(x)·h. But then I would have to define the cotangent bundle.
 
  • #19
If I've understood it correctly, in order to quantify how a coordinate function ##x^{i}## is changing at a particular point ##p\in M##, one needs to know how it can change as one passes through that point. The tangent vectors in the vector space at that point describe all the possible directions in which a curve can pass through it, and all the possible "speeds" (rates of change) at which it can do so. These vectors quantify the instantaneous changes along all possible paths through that point, and so we can consider a function ##dx^{i}##, defined on the tangent space at that point, which quantifies the instantaneous (or infinitesimal) change in the coordinate function along a chosen direction. Obviously, ##x^{i}## will only change along the coordinate curve that it defines, and thus its differential change along any given vector ##v_{p}\in T_{p}M## should be equal to the component of that vector along that coordinate curve, i.e. ##dx^{i}(v_{p})=v_{p}^{i}##, as ##v_{p}^{i}## quantifies how much change occurs along the direction of the basis vector ##\frac{\partial}{\partial x^{i}}## at that point. Hence, for consistency, the change in ##x^{i}## along the basis vector ##\frac{\partial}{\partial x^{i}}## should be 1, as it is a measure of how ##x^{i}## changes with respect to itself, i.e. ##dx^{i}\left(\frac{\partial}{\partial x^{i}}\right)=1##.

Moving on to general (differentiable) functions ##f##: all the possible paths through a given point ##p\in M##, and how "fast" one can move along them, are encoded in the tangent vectors ##v_{p}\in T_{p}M##, and thus the infinitesimal change in the function, ##df##, at that point should depend on which vector one moves along. Thus, for ##df## to quantify the instantaneous change in ##f##, when evaluated on a particular vector ##v_{p}\in T_{p}M## it should equal the directional derivative of ##f## along ##v_{p}##, i.e. ##df(v_{p})=v_{p}(f)##.
Apologies for the long-windedness of this post, just trying to express in words what I think I've understood from this discussion.
 
  • #20
It is not true that ##x^i## changes only along ##\partial/\partial x^i##.

Draw a picture and it will be clear. For example, in 2 dimensions with coordinates x and y, x changes in all directions except ##\partial/\partial y##.
 
  • #21
dx said:
For example, in 2 dimensions, with coordinates x and y, x changes in all directions except ∂/∂y

Where else can x change, though, other than along x or -x (if it changes in any other direction there will be some amount of change in y)?!
 
  • #22
You are confusing the point x with the coordinate x.

Let's call the point p, to avoid the confusion.

p can move in the direction ∂/∂x + ∂/∂y, and x will change.
 
  • #23
The only direction in which p can move without changing x is ∂/∂y
 
  • #24
dx said:
You are confusing the point x with the coordinate x.

Apologies, I'm struggling to get out of that habit :-/

dx said:
p can move in the direction ∂/∂x + ∂/∂y, and x will change.

Sorry, I probably didn't articulate my point very well before, but I was basically trying to say that if we evaluate ##dx## along e.g. ##v_{p}=\frac{\partial}{\partial x}+\frac{\partial}{\partial y}##, then ##dx(v_{p})=dx\left(\frac{\partial}{\partial x}+\frac{\partial}{\partial y}\right)=dx\left(\frac{\partial}{\partial x}\right)+dx\left(\frac{\partial}{\partial y}\right) = 1+0=1##, i.e. the change in the coordinate function ##x## along the direction ##v_{p}## is quantified by the component of ##v_{p}## in the direction of the basis vector ##\frac{\partial}{\partial x}##, right?
 
  • #25
Khashishi said:
Not a mathematician, but I think ##dx## only has any meaning if it appears in an expression containing another differential such as ##dy##, kind of like how open parenthesis only has meaning when there is a close parenthesis in the same expression. It tells how two or more variables scale with each other, so one differential by itself means nothing. They need to be arranged in an equation such that they can be rearranged into a division, like ##dy/dx## or ##dx/dy##, and the derivatives have their usual epsilon limit definition. But the differential notation is nice because it shows the symmetry between the two variables.

This is not correct. dx is a special case of the differential of a function, which is a special case of a 1-form.
On the other hand if you think of dx as an infinitesimal increment then you are correct that one sees it in ratios that represent derivatives.

In this thread, differentials are being thought of either as 1 forms or as linear maps on tangent bundles. This is a little different.
"Don't panic!" said:
Apologies, I'm struggling to get out of that habit :-/


Sorry, I probably didn't articulate my point very well before, but I was basically trying to say that if we evaluate ##dx## along e.g. ##v_{p}=\frac{\partial}{\partial x}+\frac{\partial}{\partial y}##, then ##dx(v_{p})=dx\left(\frac{\partial}{\partial x}\right)+dx\left(\frac{\partial}{\partial y}\right) = 1+0=1##, right?
yes.

In general,

dx( a∂/∂x + b∂/∂y) = a + 0 = a
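As a toy illustration (Python; purely a sketch of my own), one can model ##dx## and ##dy## as the component-picking functions dual to ##\partial/\partial x## and ##\partial/\partial y##:

```python
# Toy model: dx and dy as the dual basis to (∂/∂x, ∂/∂y).
# A tangent vector is stored as its component pair (a, b).
def dx(v):
    return v[0]      # dx(a ∂/∂x + b ∂/∂y) = a

def dy(v):
    return v[1]      # dy(a ∂/∂x + b ∂/∂y) = b

v = (3.0, 5.0)       # v = 3 ∂/∂x + 5 ∂/∂y
print(dx(v), dy(v))  # 3.0 5.0
# duality relations: dx(∂/∂x) = 1, dx(∂/∂y) = 0
print(dx((1.0, 0.0)), dx((0.0, 1.0)))   # 1.0 0.0
```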
 
  • #26
Is the summary I gave a couple of posts above correct at all?
 
  • #27
It seems essentially correct, although your language is a little strange. You talk about "f passing through a point" and "df passing through a point".

f and df don't pass through points. The curves pass through points. If a curve passes through a point p, then the tangent vector to that curve at p, when acted on by df will give you the directional derivative.
 
  • #28
dx said:
f and df don't pass through points. The curves pass through points.

I was just trying to get an intuitive understanding of the notion really and understand how df relates to an infinitesimal change in f?! Would you be able to provide me with a better description/ intuitive "picture"?
 
  • #29
Sure. Just think of single variable calculus, for simplicity. You have a function of one variable f(x).

At a point c, if you move by a small amount e, then f(c + e) = f(c) + f'(c)e + (terms of order e² and higher)

We are interested in the second term, which tells you the linear behavior of f at c. Let's call this linear behavior 'df'. Clearly, for every e, df should tell you how much f changes when you change x by an amount e at that point, and moreover, this change is a linear function of e. This linear function, or linear operator is what we call df.

So df is a linear operator on the tangent space at p, which tells you the 'linear change' in f when you move close to p, i.e. move in the tangent space.
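A quick numerical sketch of this "first order change" idea (Python; ##f=\exp## is just my choice): after subtracting the linear part, the remaining error scales like ##e^{2}##, which is why it becomes negligible first.

```python
import math

# f(c+e) − f(c) = f'(c)e + error, where error = O(e²):
# halving e roughly quarters the error, so error/e² is nearly constant.
f = math.exp              # example function of my choosing; f' = f
c = 0.5

for e in (0.1, 0.05, 0.025):
    linear = math.exp(c) * e              # the 'df' part: f'(c)·e
    error = (f(c + e) - f(c)) - linear
    print(e, error / e**2)                # ≈ f''(c)/2, roughly constant
```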
 
  • #30
So essentially, instead of talking about 'infinitesimals', we talk about linear approximations. The tangent vector is the linear approximation to a curve at some point. The 1-form df is the linear approximation to f at a particular point.
 
  • #31
So are the functions dx^{i} essentially the coordinate functions of the tangent space?
I guess all this approximation business has been confusing me. I've heard before that df is the best linear approximation of f at a particular point, but the "approximation" part is what's troubling me. In elementary calculus we say that df is an infinitesimal change in f near a point (which I know is not strictly correct) and that this equals ##\frac{df}{dx}dx## (in one dimension). But is this essentially saying that dx is the coordinate function of the tangent line to f, so that when dx is evaluated on a particular tangent vector at that point, df quantifies how much f is changing in that direction (and at what rate it is doing so)?
 
  • #32
"So are the functions dx^{i} essentially the coordinate functions of the tangent space?" Yes, exactly.
 
  • #33
So is the reason why we can qualitatively think of ##dx^{i}## as an infinitesimal change in the coordinate function ##x^{i}## at a particular point that ##dx^{i}## is only defined at that point, but encodes the information about how ##x^{i}## changes along all possible directions (encoded in the tangent vectors) as one "moves" through that point?

(P.S. If you have any good references on this particular discussion, they'd be much appreciated.)
 
  • #34
"Don't panic!" said:
So are the functions dx^{i} essentially the coordinate functions of the tangent space?
Right. As you know, there's a theorem that says that if ##v\in T_pM## and ##x## is a coordinate system with p in its domain, we can write
$$v=v(x^i)\bigg(\frac{\partial}{\partial x^i}\bigg)_p.$$ This means that the ith component of v in the coordinate system x is ##v(x^i)##. And the definition ##(df)_p(v)=v(f)## implies that
$$(dx^i)_p(v)=v(x^i).$$ So for each p in the domain of x, ##(dx^i)_p## takes each tangent vector at p to its ith component. In particular, we have
$$(dx^i)_p\bigg(\frac{\partial}{\partial x^j}\bigg)_p =\bigg(\frac{\partial}{\partial x^j}\bigg)_p x^i =(x^i\circ x^{-1})_{,j}(x(p))=I^i_{,j}(p)=\delta^i{}_j,$$ where ##I## is the identity map on ##\mathbb R^n##. This means that ##\big((dx^1)_p,\dots,(dx^n)_p\big)## is the ordered basis dual to ##\big(\big(\frac{\partial}{\partial x^1}\big)_p,\dots,\big(\frac{\partial}{\partial x^n}\big)_p\big)##.

"Don't panic!" said:
I guess all this approximation business has been confusing me. I've heard before that df is the best linear approximation of f at a particular point, but the "approximation" part is what's troubling me. In elementary calculus we say that df is an infinitesimal change in f near a point (which I know is not strictly correct) and this equals \frac{df}{dx}dx (in one dimension), but is this essentially saying that dx is the coordinate function of the tangent line to f, and so when dx is evaluated on a particular tangent vector at that point df quantifies how much f f is changing in that direction (and at what rate it is doing so)?
The term "infinitesimal" doesn't belong in a calculus book. In calculus, given a function ##f:\mathbb R^n\to\mathbb R##, you can define a function ##df:\mathbb R^{2n}\to\mathbb R## by
$$df(x,h)=f_{,i}(x)h^i$$ for all ##x,h\in\mathbb R^n##. This makes ##df(x,h)## the first-order approximation of the difference ##f(x+h)-f(x)##. In differential geometry, we have the definition ##(df)_p(v)=v(f)## for all ##v\in T_pM##. To see the connection between these two, let's use the differential geometry definition with the identity map as our coordinate system, on a function ##f:\mathbb R^n\to\mathbb R##. We get
$$(df)_p(v)=v(f)=v^i \bigg(\frac{\partial}{\partial I^i}\bigg)_p f =v^i (f\circ I^{-1})_{,i}(I(p)) =v^i f_{,i}(p)=df(p,v),$$ where the df on the right is the one from calculus, and the v on the right is the n-tuple of coordinates (in the coordinate system ##I##) of the v on the left.
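The duality ##(dx^i)_p\big(\frac{\partial}{\partial x^j}\big)_p=\delta^i{}_j## can also be checked numerically in a nontrivial chart. A Python sketch using polar coordinates on ##\mathbb R^2## (my own example): the pairing matrix comes out as the identity.

```python
import numpy as np

# Polar coordinates (r, θ) as a chart on R² minus the origin.
# Coordinate basis vectors ∂/∂r, ∂/∂θ in Cartesian components:
def basis(r, th):
    d_dr = np.array([np.cos(th), np.sin(th)])
    d_dth = np.array([-r * np.sin(th), r * np.cos(th)])
    return np.array([d_dr, d_dth])

# dr, dθ are the differentials of the coordinate FUNCTIONS
# r = sqrt(x²+y²), θ = atan2(y, x); components = Cartesian gradients:
def dual(r, th):
    x, y = r * np.cos(th), r * np.sin(th)
    dr = np.array([x, y]) / r
    dth = np.array([-y, x]) / (r * r)
    return np.array([dr, dth])

r, th = 2.0, 0.7
pairing = dual(r, th) @ basis(r, th).T   # entry (i, j) = dx^i(∂/∂x^j)
print(np.round(pairing, 12))             # identity matrix, i.e. δ^i_j
```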
 
  • #35
Thanks for your descriptive explanations Fredrik.
My confusion initially arose from reading Spivak's book on differential geometry in which he says:

"Classical differential geometers (and classical analysts) did not hesitate to talk about ‘infinitely small’ changes dx^{i} of the coordinates x^{i}, just as Leibniz had. No one wanted to admit that this was nonsense, because true results were obtained when these infinitely small quantities were divided into each other (provided one did it in the right way).
Eventually it was realized that the closest one can come to describing an infinitely small change is to describe a direction in which this change is supposed to occur, i.e., a tangent vector. Since df is supposed to be an infinitesimal change of f under an infinitesimal change of the point, df must be a function of this change, which means that df should be a function on tangent vectors. The dx^{i} themselves then metamorphosed into functions, and it became clear that they must be distinguished from the tangent vectors \frac {\partial} {\partial x^{i}}."

Now, I understand that the notion of an infinitesimally small number is nonsense, since if a number is infinitesimally close to zero then it should be equal to zero according to the definition of the limit, but I'm struggling to understand what ##dx^{i}## means intuitively in this new (more rigorous) approach, and also what the intuition is behind the definition ##df(v)=v(f)##?
 
  • #36
"Don't panic!" said:
Now, I understand that the notion of an infinitesimally small number is nonsense, since if a number is infinitesimally close to zero then it should be equal to zero according to the definition of the limit, but I'm struggling to understand what ##dx^{i}## means intuitively in this new (more rigorous) approach, and also what the intuition is behind the definition ##df(v)=v(f)##?

##dx^{i}## is the differential of the coordinate function ##x^{i}##. It is no different from the differential of any function.

The differential of a function,df, is a linear function defined on vectors. The value of df on a vector v is df(v). One can also think of vectors as operators on functions. The value of v on a function is called v(f) and is evaluated as df(v). Operators on functions can be defined abstractly. One can show that if an operator satisfies certain conditions it is in fact a tangent vector.
 
  • #37
lavinia said:
dx^{i} is the differential of the coordinate function x^{i}. It is no different than the differential of any function.

I'm confused about how one should interpret a differential now though (especially after reading that paragraph in Spivak's book). I'm confused as to whether one can still interpret it as an infinitesimal change in a function, or whether df now denotes a new function which encodes how f changes locally. As ##df_{p}## is defined at a given point, we can interpret it as describing an infinitesimal change in f along all possible directions through that point. As such infinitesimal changes are induced by tangent vectors at that point, df must be a function of tangent vectors?!

Should df(v) = v(f) be interpreted as a differential change in f along v at a point, which is equal to the directional derivative of f along v (although how is it possible to equate a differential with a derivative)?
 
  • #38
"Don't panic!" said:
I'm confused about how one should interpret a differential now though (especially after reading that paragraph in Spivak's book). I'm confused as to whether one can still interpret it as an infinitesimal change in a function, or whether df now denotes a new function which encodes how f changes locally. As ##df_{p}## is defined at a given point, we can interpret it as describing an infinitesimal change in f along all possible directions through that point. As such infinitesimal changes are induced by tangent vectors at that point, df must be a function of tangent vectors?!

As has been explained, today in mathematics, df is considered to be a function on tangent vectors.

Also, as has been explained, small increments in f along a curve with velocity vector v satisfy the equation

Δf = Δt·df(v) + an error term of order Δt², which goes to zero faster than Δt itself.

If Δt is chosen small enough that the error term is negligible (e.g. too small to be measured), then Δf is often written as df and is called an infinitesimal increment, and Δt is written as dt. In this case one could write df = df(v)dt. Notice that the notation is being used in two ways; I think you are confusing the two. The notation df and dt is common in Physics books.
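This first-order behavior is easy to check numerically. A minimal sketch, using f(t) = sin t along a curve parametrized by t as a made-up example (so df(v) at t0 is cos t0):

```python
import math

t0 = 1.0
dfv = math.cos(t0)  # df(v) at t0: the derivative of sin along the curve

# Δf - df(v)Δt should shrink like Δt^2 as Δt -> 0
for dt in (0.1, 0.01, 0.001):
    error = math.sin(t0 + dt) - math.sin(t0) - dfv * dt
    print(dt, error / dt**2)  # ratio stays roughly constant (about -sin(t0)/2)
```

The ratio error/Δt² settles near the second-order Taylor coefficient, which is what "the error is second order in Δt" means concretely.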

I don't know the history, but I think that df was called an infinitesimal change in f because it represented an increment that could only be perfectly measured in the limit of no increment at all. I guess that Leibniz and others thought of this as an infinitesimal increment. Maybe they thought of df(v) as something real. Today we reformulate this idea by saying that v is a tangent vector and that df is a cotangent vector. Just as the infinitesimal increment of Leibniz could only be inferred from measurement and never directly observed, so tangent and cotangent vectors cannot be directly observed, since they exist in a separate space, the tangent space of the manifold.

I suppose one could use language that says that df is the infinitesimal tendency of f to change, because it exists on its own (as a cotangent vector in modern terms) but is not realized as a finite increment until f is evaluated along a curve.
 
Last edited:
  • #39
lavinia said:
If Δt is chosen small enough that the error term is negligible (e.g. too small to be measured), then Δf is often written as df and is called an infinitesimal increment, and Δt is written as dt. In this case one writes df = df(v)dt. Notice that the notation is being used in two ways. I think you are confusing the two

But isn't this appealing to the notion of an infinitesimal quantity again (by letting \Delta t become really small)? Is the point that df represents the first-order change in a function near a point, and that, since near that point this describes the change in the function (to first order), we can consider the linear change in f as the infinitesimal change in f at that point?!

Is Spivak basically saying that we can no longer consider infinitesimal changes in functions directly, but can do so indirectly by considering them as functions of infinitesimal changes (i.e. tangent vectors) along all possible directions at a particular point? In doing so, the differentials of the coordinate functions dx^{i} themselves become functions of tangent vectors, describing how the coordinate maps x^{i} change along all possible directions at that point?!
 
  • #40
Is Spivak basically saying that we can no longer consider infinitesimal changes in functions directly, but can do so indirectly by considering them as functions of infinitesimal changes (i.e. tangent vectors) along all possible directions at a particular point? In doing so, the differentials of the coordinate functions dx^{i} themselves become functions of tangent vectors, describing how the coordinate maps x^{i} change along all possible directions at that point?!

yes.

Along any curve one can ask what the coordinates are at any time. Thus one can ask what the change in coordinates is over a small time interval. There is no difference between this and asking what the change in any measurement is over a small time interval - e.g. the change in gravitational potential or the change in altitude - whatever.
 
  • #41
"Don't panic!" said:
But isn't this appealing to the notion of an infinitesimal quantity again (by letting \Delta t become really small)? Is the point that df represents the first-order change in a function near a point, and that, since near that point this describes the change in the function (to first order), we can consider the linear change in f as the infinitesimal change in f at that point?!

In Physics books I have seen this. In this way of looking at things, df and dx are small increments, so small that the error term above the first-order approximation can be ignored. I am not sure that Leibniz thought of it this way. I have not read his works, but I do know that he was a philosopher and may have thought of df as an infinitesimal increment in a metaphysical way. No idea. I have read that non-standard analysis models his idea of infinitesimals, but I know nothing about it.

We often talk of an object as having a speed. We say "The car was going 50 miles per hour when it crashed into the truck." This contains the idea that at any point in time the car actually has something called a speed. But what is this speed really? How can something have a speed in an instant of time? Don't we need to watch it move over a small time interval and calculate its average speed? In mathematics, the answer is that the speed is a tangent vector, not directly observable because it doesn't exist in the manifold but in a separate space, the tangent space.
 
Last edited:
  • #42
Thanks for your help (and patience). I think I'm starting to understand it a bit better.
 
  • #43
"Don't panic!" said:
Thanks for your help (and patience). I think I'm starting to understand it a bit better.

No Problem. It is good to struggle with these difficult ideas.
 
  • #44
So, is the reason df is called the differential of f that it describes the linear change in f near a point, obtained by differentiating f at that point along a given vector, and thus measuring the local change in f along that vector at that point? We can use this to approximate f near this point as f(x)\simeq f(p)+d_{p}f(v). As we get "nearer" to the point p this approximation becomes more and more exact?!

Would it be correct to say that dx^{i}_{p}(v) represents a finite quantity, namely the component of the tangent vector v along the direction \frac{\partial}{\partial x^{i}} (i.e. the amount of change that occurs in the coordinate map x^{i} along v at the point p)?!

Is the reason such quantities can be considered differential changes in the corresponding function that they describe how it is changing at a specific point and not over some region (or interval)? Restricting to one dimension for simplicity, is the function df the change in the function describing the tangent line to f at a given point; thus the differential is a function describing the change along the tangent line to f at that point (as we can no longer consider infinitesimals, the closest we can come to this notion is to describe the change along a particular direction at a given point, i.e. along the tangent line at that point)? Thus, locally to this point we can obtain a good linear approximation of f by using this change along the tangent line to f.
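The idea that dx^{i}_{p}(v) just picks out a component of v can be made concrete in a small sketch, representing a tangent vector at p by its components in the coordinate basis (all numbers here are made up for illustration):

```python
def dx_i(i, v):
    """dx^i_p(v) = v^i: the differential of the i-th coordinate function
    returns the i-th component of the tangent vector."""
    return v[i]

def df_p(grad_f_at_p, v):
    """df_p(v) = sum_i (df/dx^i)(p) * v^i: the directional derivative of f along v."""
    return sum(g * vi for g, vi in zip(grad_f_at_p, v))

v = (3.0, -2.0)    # components of a tangent vector at p
grad = (2.0, 5.0)  # (df/dx^1, df/dx^2) at p, made up for illustration

print(dx_i(0, v))     # 3.0 -- the amount v moves the coordinate x^1
print(df_p(grad, v))  # 2*3 + 5*(-2) = -4.0
```

Both dx^{i}_{p} and df_{p} are ordinary linear functions of v; the "infinitesimal" language survives only in the interpretation, not in the arithmetic.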
 
Last edited:
  • #45
In one dimension df(∂/∂x) = df/dx. It is the ordinary derivative. It is the rate of change of f with respect to x.

When you evaluate df on a vector it is a rate of change - just like in one variable calculus.

An increment in f, as a function along a curve, is close to Δt df(v) for small Δt, and Δf/Δt is close to df(v), so df(v) is just the limit of Δf/Δt as Δt goes to zero.
That is the whole story.
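That limit can be illustrated with a difference quotient along a concrete (made-up) curve, say c(t) = (cos t, sin t) with f(x, y) = xy, so f(c(t)) = cos t sin t and df(v) at t0 equals cos(2 t0):

```python
import math

t0 = 0.3

def f_along_curve(t):
    # f(x, y) = x*y evaluated along the curve c(t) = (cos t, sin t)
    return math.cos(t) * math.sin(t)

exact = math.cos(2 * t0)  # d/dt [cos t sin t] = cos 2t, i.e. df(v) at t0

for dt in (0.1, 0.01, 0.001):
    quotient = (f_along_curve(t0 + dt) - f_along_curve(t0)) / dt
    print(dt, quotient)  # the difference quotients approach df(v)
```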

But all of this has already been explained.
 
Last edited:
  • #46
lavinia said:
But all of this has already been explained.

Yes, sorry, I was trying to summarize what I'd gleaned from the discussion and to see if I'd understood it correctly. I think the block I'm struggling to get around in my brain is that (in the past) I've always associated df with an actual change in the function, but in differential geometry it seems that the differential and the derivative (rate of change of f) are merged into the same concept?!

"Don't panic!" said:
Restricting to one dimension for simplicity, is the function df the change in the function describing the tangent line to the function f at a given point; thus the differential is a function describing the change along the tangent line to f at a given point (as we can no longer consider infinitesimals, the closest we can come to this notion is to describe the change along a particular direction at a given point, i.e. along the tangent line at that point)? Thus, locally to this point we can obtain a good linear approximation of f by using this change along the tangent line to f.

In this paragraph I was trying to justify in my mind (at least intuitively) why we still refer to df as the differential of f (which I've, incorrectly, always associated with an infinitesimal quantity) despite it being a finite quantity, and how it can still capture the notion of a change in f?!

Referring to Spivak's comments on the subject of infinitesimal changes in functions, is the point that we can no longer consider infinitesimals, however we can utilize the notion of rates of change in given directions (tangent vectors) to capture the notion of an infinitesimal change in a function at a point. We do this by noting that such a change should depend on which direction we consider and rate of change in that direction, i.e. it should depend on the tangent vectors at that point. Thus, df maps tangent vectors at a point to real numbers which describe how f is changing in a given direction at that point. Would this be a correct intuition at all?
 
  • #47
Yet another take: when f is differentiable at ##x_0##, df gives the change of f along the linear approximation to f (tangent line, plane, etc.), which is as good as you want (in a rigorous ## \delta - \epsilon## sense when f is differentiable). So, e.g., for ##f(x)=x^2##, ##2x_0 dx## gives you a local approximation to the actual change of ##x^2## at ##x= x_0## (change along the tangent line) between two values of x close to each other. For any choice of ##\epsilon## you can find an interval ##(x_0 -\delta, x_0+\delta)## where ##|(x^2-x_0^2)-2x_0(x-x_0)| < \epsilon## for ##|x-x_0| < \delta;\ \delta >0##.
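For f(x) = x² the gap between the actual change and the change along the tangent line at x_0 works out to exactly (x − x_0)², so choosing δ = √ε suffices. A quick numeric check (sample values made up):

```python
x0 = 1.5

for x in (1.6, 1.51, 1.501):
    actual_change = x**2 - x0**2           # f(x) - f(x0)
    tangent_change = 2 * x0 * (x - x0)     # f'(x0)*(x - x0): change along the tangent line
    print(x, actual_change - tangent_change)  # equals (x - x0)**2
```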
 
  • #48
So is the point that dx is a finite change along the tangent line to a function f (at a given point x_{0}), and thus f(x_{0})+f'(x_{0})dx is the equation of the tangent line to f at x=x_{0}? We can then use an \epsilon -\delta limit approach to make this approximation approach the exact change in f at x=x_{0}, such that \lim_{x\rightarrow x_{0}}\left[(f(x)-f(x_{0}))-f'(x_{0})(x-x_{0})\right]=0 (where dx=x-x_{0}). In other words, if we construct a linearised function at a point that approximates f locally, then df is a change in this linear function. As such a linearised function is obtained through computing the derivative of f at a given point, we thus call it the differential of f?!

Also, is any of what I put in my previous post correct?

(Sorry to bug everyone with this; I feel like I'm going round in circles at the moment :-( )
 
Last edited:
  • #49
Yes, df is finite, since you are dealing with continuous functions (differentiable implies continuous), and it is in a sense the best linear approximation to the change of f locally; the linearization is done through the derivative/gradient. The same applies in higher dimensions: e.g., you can approximate the change of the function locally, in the same sense, by studying the change of f at (x,y) along the tangent plane at (x,y).
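The tangent-plane version can be sketched the same way, using the made-up example f(x, y) = x² + y² at (x_0, y_0) = (1, 2), where the gradient is (2x_0, 2y_0):

```python
x0, y0 = 1.0, 2.0

def f(x, y):
    return x**2 + y**2

def tangent_plane(x, y):
    # f(x0, y0) + (df/dx)(x0, y0)*(x - x0) + (df/dy)(x0, y0)*(y - y0)
    return f(x0, y0) + 2 * x0 * (x - x0) + 2 * y0 * (y - y0)

for h in (0.1, 0.01):
    err = f(x0 + h, y0 + h) - tangent_plane(x0 + h, y0 + h)
    print(h, err)  # equals 2*h**2 here: quadratic in the step size
```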
 
Last edited: