# What is a differential?

Hello,

I am an engineering student who has recently taken it upon myself to try to pick up some math skills in my own time. When reviewing the concept of a differential, I was a little bit lost. To be honest, I don't understand exactly what it is, or what it could be "good for", i.e. how to apply it.

The book gives the definition dy/dx = f'(x). Well, what exactly are dy and dx? Most problems seem to involve calculating percentage errors, but what in general does a differential solve?

Also, the book I'm reading keeps bringing up dx in integration, mentioning "don't you remember this from differentials?". Well, yes, I remember seeing it, but now I don't understand why it is appearing in integration. If anybody can explain this to me I'd appreciate it. I feel like I only know the mechanical steps of differentiation/integration and never really understood what half this stuff was actually doing.

In short, dy/dx represents the slope of the function y(x), and it can itself be a function of x. That is, the slope of y(x) varies with respect to x.

Fredrik
Staff Emeritus
Gold Member
There's no need to define dx and dy. You can (and probably should) think of dy/dx as a single symbol that means "the derivative of the function that takes x to y". The alternative is to let dx be any number and define dy as (dy/dx)dx, where dy/dx is defined as above.

If $f:\mathbb R\rightarrow\mathbb R$, then we define $df:\mathbb R^2\rightarrow\mathbb R$ by

$$df(x,h)=f'(x)h$$

If we write dx instead of h, dy instead of df(x,h), and use that f'(x)=dy/dx, the equality takes the form

$$dy=\frac{dy}{dx}dx$$

Note that a Taylor expansion of f around x gives us

$$f(x+h)=f(x)+f'(x)h+\mathcal O(h^2)$$

which implies

$$f(x+h)-f(x)=df(x,h)+\mathcal O(h^2)$$

The same thing expressed with the notation dx=h:

$$f(x+dx)-f(x)=df(x,dx)+\mathcal O(dx^2)$$

So dx is just a real number, and dy=df(x,dx) is the first-order estimate of how much f(x) changes as a result of changing x by dx. Note that df(x,dx)=f'(x)dx regardless of how big or small dx is. dx does however have to be small for the approximation df(x,dx)≈f(x+dx)-f(x) to be good.
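A quick numerical check of this (my own made-up example, f(x) = x², not from the post): the first-order estimate df(x,dx) = f'(x)dx should miss the actual change by a term of order dx².

```python
def f(x):
    return x**2

def df(x, dx):
    # first-order estimate of f(x + dx) - f(x), using f'(x) = 2x
    return 2 * x * dx

x = 3.0
for dx in (0.1, 0.01, 0.001):
    actual = f(x + dx) - f(x)
    error = abs(actual - df(x, dx))
    print(dx, error)  # for this quadratic f the error is exactly dx**2
```

The error shrinks like dx², matching the $\mathcal O(dx^2)$ term in the Taylor expansion above.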

Also the book I'm reading keeps bringing up dx in integration by mentioning (don't you remember this from differentials). Well yes I remember seeing it, but now I don't understand why it is appearing in integration. If anybody can explain this to me I'd appreciate it.

Suppose that you want to find the area under the graph of the function f, from a to b. Then you can approximate that area like this: Choose numbers $x_0,x_1,\dots,x_n$ such that $a=x_0<x_1<\cdots<x_n=b$. Define $\Delta x_k=x_k-x_{k-1}$ for k=1,2,...,n. Let $y_k$ be any number in the interval $[x_{k-1},x_k]$ for k=1,2,...,n.

The area is approximately

$$\sum_{k=1}^n f(y_k)\Delta x_k$$
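A minimal numeric sketch of this approximation, with a made-up example f(x) = x² on [0, 1] (mine, not from the post), whose exact area is 1/3:

```python
def riemann_sum(f, a, b, n):
    dx = (b - a) / n                       # uniform Delta x_k
    # y_k chosen as the left endpoint of each subinterval [x_{k-1}, x_k]
    return sum(f(a + k * dx) * dx for k in range(n))

approx = riemann_sum(lambda x: x**2, 0.0, 1.0, 10_000)
print(approx)  # close to 1/3; the approximation improves as n grows
```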

The exact value of that area is written as

$$\int_a^b f(x)\,dx$$

just to remind us that it's defined using a limit of better and better approximations of the sort mentioned above. The ∫ reminds us of the Σ, and the dx reminds us of the $\Delta x_k$.

Note that this isn't just a method to calculate the area of a non-rectangular region of the plane. The Riemann integral defines the concept of "area" of a non-rectangular region.

Well what exactly is dy, and dx?

The dy/dx "differential" notation is an archaic notation that persists only because it is easier to hand-wave calculus to freshmen than it is to teach them analysis.

Historically, calculus came about for practical reasons, to solve physics problems. It was largely based on intuitions about continuous functions and "infinitely small" intervals. It was not until the 19th century that calculus was given a rigorous foundation in terms of limits. By then, the standard notations were so well entrenched in academia that they could not be replaced.

The derivative is an operation applied to functions. The limit definition (with Lagrange's prime notation) is more precise:

$$f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$$

Roughly speaking, the "dx" corresponds to the "h" in the above formula and the "dy" to the term "f(x+h) - f(x)". If you remove the limit and replace h with a very small number, you achieve an approximation of the derivative. In doing so, "dx" is a very small number, as is "dy" (as long as f is a continuous function). Dividing a very small number by another very small number results in a regular-sized result.

Since physicists only need to find approximations and don't abhor intuitive arguments in their proofs, this notation is perfect. But mathematically, it has a few shortcomings.

First, dy and dx can't really exist separately. They both must exist "inside" the limit. This means you can't technically "multiply both sides by dx". It's invalid, because that "h" escapes its scope. (Just like how you can't use an "i" outside of the Σ sign that iterates over it.)

Second, dy is still a function of x, but the parameter is elided. Physicists often conflate functions with their values at particular points, but for strict mathematical interpretation f is distinct and completely different from f(x). The former is a function and the latter is the evaluation of that function at a point x (which must be defined in the context of the expression).

Things get even worse in multivariate differential equations. A function f(v) might be written in terms of its components (which implicitly defines a basis), f(x, y, z). In physics, it's common to change bases, and in such cases, Leibniz notation comes in handy. Also, many problems come up where you differentiate with respect to time independently of space.

In general, physics treats functions much more fluidly (or ad hoc, if you're a mathematician), and you can always introduce new variable dependencies after the fact if your initial analysis of the problem is deemed insufficient.

But I'm rambling now.

There is no good interpretation for dy or dx as stand-alone objects. They must always appear together (or alone under an integral sign).

The true foundation for calculus is analysis. Differentiation becomes an operation on functions. You can even think of differentiation as a function from real-valued functions to real-valued functions.

The dy and dx are often convenient because all methods you'll learn in class are written in terms of them. You could invent a more accurate notation, but it's usually better to keep to something your professor will understand. For your sanity's sake, just think of "dx" as a "small distance" and "dy" as "the small amount f increases from x to x+dx".

mathwonk
Homework Helper
2020 Award
Unfortunately most of these comments are incorrect. They stem from a historical situation in which it is true that differentials were used unrigorously and replaced by a rigorous theory of limits in the treatment of derivatives. In this approach, which I also was taught in college, dy and dx do not mean anything individually, only the notion of derivative makes sense and is denoted confusingly by the "quotient" notation dy/dx. Here the two symbols dy and dx are never to be separated.

However, modern differential geometry and differential calculus have succeeded in giving a precise, although somewhat complicated, meaning to these separate symbols, in which it is entirely correct to say that the derivative is an actual quotient of the two differentials dy and dx.

If you picture the family of tangent lines to the graph of y = f(x), they represent the graphs of a family of linear functions. For each point x=a on the x axis, the linear function represented by the tangent line at x=a is the function of h defined by $df_a(h) = f'(a)h$, i.e. multiplication by f'(a).

For the function y=x, this assigns the identity function taking h to 1.h at every point. The quotient of two linear functions, b.h/c.h, is the constant function b/c, whose value is the quotient of the multipliers in the two linear functions.

In this sense the quotient dy/dx = df/dx is a function whose value at x=a is the constant function of h whose value is the quotient of the multipliers f'(a)/1. I.e. the value of dy/dx at x=a is indeed f'(a).
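A small numeric sketch of that quotient (my own example values, not from the post): the differential of f at a is the linear map h → f'(a)·h, dx is the linear map h → 1·h, and their quotient is the constant f'(a) for any nonzero h.

```python
def df_a(h, fprime_a):
    return fprime_a * h   # the linear function represented by the tangent line

def dx_a(h):
    return 1.0 * h        # differential of the identity function y = x

fprime_a = 6.0            # e.g. f(x) = x**2 at a = 3, so f'(a) = 6
for h in (2.0, 0.5, 1e-6):
    print(df_a(h, fprime_a) / dx_a(h))  # always 6.0, independent of h
```

The point is that nothing here is "infinitely small": h can be any nonzero number and the quotient is the same.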

In a more general setting, we introduce a "vector field" as a family of tangent vectors to our surface or other space. Vectors are variables on which linear functions act. Then a covector field is a family of linear functions, and can be paired at each point with the vectors in our vector field to give a number at each point, i.e. a function.

A differential is just a covector field, i.e. something that pairs with a vector field to give a function. The purpose of all this is to be able to integrate differentials over curves. I.e. to each curve in a surface, we associate the family of velocity vectors at points of the curve. Then any differential acts on these vectors to give numbers, hence a function, which can be integrated. The good part is that since the differential acts linearly on the vectors, multiplying the vectors by a constant also multiplies the value of the differential by that same constant.

This way the integral is independent of the parametrization of the curve. I.e. if you run over the curve faster, thus shortening the interval of integration, the differential compensates for this by having a larger value to integrate, so the integral comes out the same.

All this complicated stuff is just to say that a differential like f(x)dx, is a gadget that can be integrated over a curve, and it has the good property that f(x)dx and f(x).(dx/dt) dt, will have the same integral. I.e. the integral of the differential f(x)dx over the curve x(t) from x(t0) to x(t1), will equal the integral of the differential f(x(t))(dx/dt).dt over the interval from t0 to t1.
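A numeric sketch of that claim, with a made-up integrand and parametrization (my example, not mathwonk's): the integral of f(x) dx over [0,1] equals the integral of f(x(t))·(dx/dt) dt over the parameter interval.

```python
def integrate(g, a, b, n=100_000):
    # simple midpoint-rule integral, good enough for a numeric check
    dt = (b - a) / n
    return sum(g(a + (k + 0.5) * dt) for k in range(n)) * dt

f = lambda x: x**2        # integrand (hypothetical)
xt = lambda t: t**3       # a parametrization of [0, 1] by t in [0, 1]
dxdt = lambda t: 3 * t**2

direct = integrate(f, 0.0, 1.0)                               # ∫ f(x) dx
pulled_back = integrate(lambda t: f(xt(t)) * dxdt(t), 0.0, 1.0)
print(direct, pulled_back)  # both close to 1/3
```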

For a good elementary explanation, read the first part of the book on differential equations by Morris Tenenbaum and ???

Hurkyl
Staff Emeritus
Gold Member
I'm under the impression that widespread thinking in terms of infinitesimals is a modern phenomenon, relatively speaking.

For example, the ancient Greeks computed the area of a circle using the method of exhaustion -- you can disprove $A < \pi r^2$ by inscribing polygons, and you can disprove $A > \pi r^2$ by circumscribing polygons. Everything I've read indicates that's really how the ancient Greeks thought of things, rather than imagining that the circle was a polygon with infinitely many infinitesimal sides.
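The bracketing can be checked numerically; a small sketch (the n-gon area formulas are mine, not from the post):

```python
import math

def inscribed_area(n, r=1.0):
    # area of a regular n-gon inscribed in a circle of radius r
    return 0.5 * n * r**2 * math.sin(2 * math.pi / n)

def circumscribed_area(n, r=1.0):
    # area of a regular n-gon circumscribed about the same circle
    return n * r**2 * math.tan(math.pi / n)

for n in (6, 24, 96):  # 96 sides is as far as Archimedes went
    print(n, inscribed_area(n), math.pi, circumscribed_area(n))
```

For every n the circle's area lies strictly between the two polygon areas, and the gap shrinks as n grows, which is exactly the two-sided disproof structure described above.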

Differential forms are rather simple and convenient things -- I think a great deal of imprecision in calculus comes from the fact that students are struggling to come up with the idea on their own, while the usual curriculum uses the idea but steadfastly refuses to say anything about them until differential geometry.

Differentiating is the act of finding the slope of the tangent line at the exact point of the function you are differentiating. Take a simple x^2, for example: the reason the derivative is 2x is that the slope of the tangent line at x is 2x; it varies linearly with x. Applied, you are often measuring instantaneous rates of change. For example, if you want to know at what rate a car is accelerating at a particular point in time, you use basic calculus methods such as differentiation. Remember, though, it gets more complicated: when differentiating and integrating unfamiliar functions, you will see how important limits actually are (know your limits :)

Why is everyone complicating matters? It is all well and good to state that differentials as Newton and Leibniz used them were not rigorous, and it is also fine to state that things like nonstandard analysis and differential forms have made rigorous a lot of old ideas about infinitesimals.

But neither statement will actually resolve the OP's issue. For better or worse, the old non-rigorous infinitesimal methods are still being used, especially in science and engineering. An engineer doesn't justify his use of infinitesimals by asserting that they can be represented by hyperreal numbers. He just uses them, and lets the mathematician work out all the rigor and theory.

In my opinion, even if you're going to take more real analysis courses later on, it's still useful to understand the intuition behind the notation used in Calculus. And the only way you can do that is by studying mathematically "bogus" infinitesimals and how to manipulate them. </rant>

A good place to start is Calculus Made Easy by Silvanus Thompson. He really explains the meaning of differentials and their role in Calculus. You'll finally get to understand *why* basic rules like the product rule really work, and notation like integral f(x) dx will finally make sense. Best of all, it's a really short book, and can probably be read in a single sitting.

I was taught that a differential of f at a is the unique linear map that is tangent to f at a, where tangency is defined via the notion of 'little oh' functions.

The wikipedia page on differentials has a good introductory explanation; see http://en.wikipedia.org/wiki/Differential_%28infinitesimal%29 under the section "differentials as linear maps."

For a more thorough treatment of this notion of a differential, there is a good discussion in Loomis and Sternberg's Advanced Calculus, chapter 3, sections 5 and 6, available here:
http://www.math.harvard.edu/~shlomo/

Here is another text that breaks it down into a slightly more digestible form than that found in Loomis and Sternberg (PDF warning):
http://www.mth.pdx.edu/~erdman/PTAC/problemtext_pdf.pdf
See chapter 8 for the single variable case, chapter 25 for the multiple variable case.

mathwonk
Homework Helper
2020 Award
Let's keep it simple: dy does make sense, and dy = (dy/dx) dx.

jambaugh
Gold Member
When I teach Calc I, I like to hit on differentials as soon as possible to give the students time to absorb the concept and the notation. I find them a superior tool for doing both related rates and implicit differentiation, and of course they are essential to change of variables in integration. Here is how I explain it:

The coordinates x and y define positions of any point in the plane. Considering any point (x,y) we can define local coordinates dx and dy parallel to x and y with origin the point (x,y). (In differential geometry terms the coordinate plane being flat is its own tangent space.)

In short differentials are variables, specifically local coordinates parallel to x and y and relative to an origin point (x,y)... however we apply a few conventions...

A function('s graph) y=f(x) or even a general relation g(x,y) = 0 defines (for suitably smooth f or g) a smooth curve in the plane by acting as a constraint on x and y.

Given the constraint y=f(x) or g(x,y)=0, and for a given value of x and y satisfying the constraint, there is an extended condition on dy and dx: that they lie along the tangent line to the curve at (x,y). They thus satisfy the constraint:
$$\frac{dy}{dx} = f'(x)$$ or $$dx\cdot\frac{\partial}{\partial x} g(x,y) + dy\cdot\frac{\partial}{\partial y} g(x,y) = 0$$
which is the algebraic form of the statement that (x+dx,y+dy) lies on the line tangent to the curve at (x,y). [See footnote below]
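A quick numeric illustration of that constraint, using a hypothetical example of my own (not from the post): for the unit circle g(x,y) = x² + y² − 1 = 0, the relation dx·g_x + dy·g_y = 0 gives dy/dx = −g_x/g_y = −x/y, matching ordinary implicit differentiation.

```python
import math

def slope_from_differentials(x, y):
    gx, gy = 2 * x, 2 * y      # partial derivatives of g(x, y) = x**2 + y**2 - 1
    return -gx / gy            # dy/dx solved from  dx*gx + dy*gy = 0

x = 0.6
y = math.sqrt(1 - x**2)        # a point on the upper half of the circle
print(slope_from_differentials(x, y))  # approximately -x/y = -0.75
```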

This puts the differentials of primary variables on a concrete footing as "new variables" defined relative to a point on a curve. They need not be infinitesimal in scale. One can then define the differential operator, d as a mapping from expressions in x and y to expressions in x,y,dx, and dy. Specifically it maps any expression considered as a constraint function (like g above) to the corresponding constraint for the differentials (that they lie on the tangent curve).

And finally one generalizes to higher dimensions and multiple constraints and with some work one can show all the definitions and conventions are consistent. (Principally by applying the linearity property of d.)

Now this gets said much more eloquently and in more generality in differential geometry, but using heavier, more esoteric tools such as tangent bundles and such.

Footnote: When I teach differentials in Calc. 1, I of course do not invoke partial derivatives explicitly. But by the time they master differentials using straightforward chain rule they find they have learned partial differentiation automatically. E.g. the implicit form of the constraint becomes the definition of the partial derivatives.

A final comment. I find that too much emphasis is placed on functional forms, and not enough is done in parallel with the variables to show the distinction between working with variables and working with functions. (Both need coverage in intro Calc.) Most especially, there is the bad practice (we physicists promote) of using the same name for the distinct objects, variable (coordinate) and function, e.g. y = y(x).

[EDIT]: A final final comment... one can go further and define, for the implicit form, a family of curves g(x,y)=c and thereby give meaning to the differentials and their subsidiary constraint equation even when (x,y) does not lie on the curve g(x,y) = 0. Their constraint is that they lie on the tangent line to the curve g(x,y) = c upon which an arbitrary point (x,y) resides. That's not foolproof, but it is a sufficient answer to the issue for first-semester Calc.

However, modern differential geometry and differential calculus have succeeded in giving a precise, although somewhat complicated, meaning to these separate symbols, in which it is entirely correct to say that the derivative is an actual quotient of the two differentials dy and dx.

I hope no one minds if I butt in with some questions. This is something I've been puzzling about for a long time. Anyone who's just learning this stuff should not pay too much attention to this post, as I'm sure it will contain misunderstandings on my part and unnecessary complications as I try to connect various ideas that I've read and struggled with.

If you picture the family of tangent lines to the graph of y = f(x), they represent the graphs of a family of linear functions. For each point x=a on the x axis, the linear function represented by the tangent line at x=a is the function of h defined by $df_a(h) = f'(a)h$, i.e. multiplication by f'(a).

For the tangent line, I get: f(a) + f'(a).(h-a). Translating the origin to (a, f(a)), this becomes df(a,h) = f'(a).h. This is going to be a bit of a mouthful, but... should these coordinate systems, copies of $\mathbb R^2$, located at each point (a, f(a)), be thought of as one set of coordinate spaces representing the components, in coordinate bases, of the cotangent spaces (i.e. vector spaces of continuous linear functions of tangent vectors) associated with each point of Euclidean 1-space? Or something like that...

For the function y=x, this assigns the identity function taking h to 1.h at every point.

This reminds me of something I read about Hamilton making a distinction between numbers in their role as functions acting on other numbers, and numbers in their passive role as the arguments of others, so 1(h) = 1.h. In one dimension, is the idea that one copy of the real numbers plays the role of a chart for Euclidean 1-space, others play the role of components of tangent vectors in the coordinate tangent bases associated with each point, and others the role of components of cotangent vectors in the coordinate cotangent bases associated with each point?

The differential seemed to begin as a function of two variables, a and h; then a was held fixed and it starts being treated as a function only of h. Is df(a,h) a cotangent vector field, and df(h), at some fixed a, the actual cotangent vector which is the value of this field at the point a? (AFTERTHOUGHT: Oh, on rereading, I see you already said this in slightly different language.)

The quotient of two linear functions, b.h/c.h, is the constant function b/c, whose value is the quotient of the multipliers in the two linear functions.

b and c are cotangent vectors located at the same point, right?

$$\frac{\mathrm{d}f(h)}{\mathrm{d}g(h)} \bigg|_a = \frac{f'(a) \, \mathrm{d}x(h)}{g'(a) \, \mathrm{d}x(h)} \equiv \frac{f'(a)}{g'(a)} \frac{1}{1}=\frac{f'(a)}{g'(a)}$$

Or maybe b.h/c.h means dividing the values at each point of these cotangent vector fields to give a scalar field. How do we know that the variable h on the bottom has the same value as the variable h on the top: is the principle that we hold it fixed at some arbitrary non-zero value for the purpose of this definition?

In this sense the quotient dy/dx = df/dx is a function whose value at x=a is the constant function of h whose value is the quotient of the multipliers f'(a)/1. I.e. the value of dy/dx at x=a is indeed f'(a).

In other words, a quotient of cotangent vectors is defined by dividing their components in some basis, and treating dx/dx as 1? (So this is a function Q : T* x T* --> R?)

In a more general setting, we introduce a "vector field" as a family of tangent vectors to our surface or other space. Vectors are variables on which linear functions act. Then a covector field is a family of linear functions, and can be paired at each point with the vectors in our vector field to give a number at each point, i.e. a function.

Should the reciprocal 1/c be regarded as a tangent vector, as suggested by the notation, and by the fact that the linear function b could be said to act on the variable c to give a scalar.

$$\mathrm{d}x\left ( \frac{\partial }{\partial x} \right ) = \frac{\mathrm{d} x}{\mathrm{d} x} = 1$$

In this view, is differentiation in single-variable calculus a special case of the scalar product of a cotangent vector and a tangent vector?

A differential is just a covector field, i.e. something that pairs with a vector field to give a function.

Q : T* x T --> R, rather than Q : T* x T* --> R? (This function being a scalar field, the derivative, f'(x)?)

For a good elementary explanation, read the first part of the book on differential equations by morris tenebaum and ???

Found it: Morris Tenenbaum & Harry Pollard. Chapter 2, Lesson 6, "Meaning of the differential of a function; separable differential equations", pp. 47-51. They begin by defining

$$\mathrm{d}f(x,\Delta x):=f'(x) \Delta x$$

just as Fredrik did. Then there's some distinction I don't understand where they write $y=x$ and $y=\hat{x}$ "in order to distinguish between the function defined by $y=x$ and the variable $x$, so that $y = \hat{x}$ will define the function that assigns to each value of the independent variable $x$ the same unique value to the dependent variable $y$." Any idea what they're getting at there? I think possibly they're saying that any letter with a hat on, be it x or t or whatever, will denote the identity function I : R --> R : I(x) = x. What is the relationship between this and a "coordinate function"?

On p. 50, they call the writing of expressions like dx/dy a "custom" rather than offering any general definition of a quotient of functions, at least not at this stage.

Comment 6.33, p. 50:

If y = f(x) and x = g(t), then y = f[g(t)] defines y as a function of t. The independent variable is therefore t; the dependent variables are x and y. In general, if y is a dependent variable, the increment $\Delta y \neq \mathrm{d}y$. It follows therefore that $\Delta x \neq \mathrm{d}x$, since here x is also a dependent variable; thus there is no justification for replacing the increment $\Delta x$ by $\mathrm{d}x$ in "$\mathrm{d}y = f'(x) \, \Delta x$". However, if both $\mathrm{d}y$ and $\mathrm{d}x$ are differentials as defined in 6.13, then, as we proved in Theorem 6.3, $\mathrm{d}y = f'(x) \, \mathrm{d} x$ even when x is itself dependent on a third variable t.

$$(6.13) \enspace\enspace \mathrm{d}y(t,\Delta t)=f'(x) \, \mathrm{d}x(t,\Delta t)$$

The proof in Theorem 6.3 is just "by the chain rule".
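That invariance can be checked numerically; a small sketch with made-up functions f(x) = x² and x = g(t) = sin t (my example, not Tenenbaum & Pollard's). Here dx(t, Δt) = g'(t)Δt is itself a differential, not an increment, and dy = f'(x) dx holds exactly:

```python
import math

g = math.sin
gprime = math.cos
fprime = lambda x: 2 * x
composed_prime = lambda t: 2 * math.sin(t) * math.cos(t)  # (f∘g)'(t) by the chain rule

t, delta_t = 0.7, 0.01
dx = gprime(t) * delta_t          # differential of x at (t, Δt)
dy = composed_prime(t) * delta_t  # differential of y at (t, Δt)
print(dy, fprime(g(t)) * dx)      # equal (up to rounding): dy = f'(x) dx
```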

The dy/dx "differential" notation is an archaic notation that persists only because it is easier to hand-wave calculus to freshmen than it is to teach them analysis.

This may be true in the general case, but I vividly remember not "getting" calculus until I dropped the course and started doing epsilon delta proofs on my own. A lot of students will put up with hand waving. But not all.

I'm under the impression that widespread thinking in terms of infinitesimals is a modern phenomenon, relatively speaking.

For example, the ancient Greeks computed the area of a circle using the method of exhaustion -- you can disprove $A < \pi r^2$ by inscribing polygons, and you can disprove $A > \pi r^2$ by circumscribing polygons. Everything I've read indicates that's really how the ancient Greeks thought of things, rather than imagining that the circle was a polygon with infinitely many infinitesimal sides.

I'm interested to know where you read this, Hurkyl, since it would seem to say that they did not believe in an arbitrary (infinite) number of intersections, i.e. that they believed in intuitionistic logic: http://en.wikipedia.org/wiki/Intuitionistic_logic

lavinia
Gold Member
I think of dy as the small change in y for a sufficiently small change in x. It is a small displacement. The key is that dy can be arbitrarily accurately predicted as a multiple of dx for sufficiently small dx. My impression is that historically dy was thought of as a displacement.

Rigorously this really doesn't work but practically it does. In real world situations we want to know what actual tangible changes are and linear approximations are always preferred if they are very accurate.

One can think of dy as a linear function of dx for very small displacements in x. The failure of this linear function becomes arbitrarily small as dx decreases. In the old days before the rigorous theory of limits, differentials were not rigorously defined, but it seems to me that they were still used correctly, and people basically knew what they were dealing with.

Hurkyl
Staff Emeritus
Gold Member
I'm interested to know where you read this, Hurkyl, since it would seem to say that they did not believe in an arbitrary (infinite) number of intersections, i.e. that they believed in intuitionistic logic: http://en.wikipedia.org/wiki/Intuitionistic_logic
It's Eudoxus's method of exhaustion.

I'm pretty sure they used classical logic in classical times. Where did you get the idea they rejected the law of the excluded middle?

(p.s. "infinite intersections"? )

Tenenbaum & Pollard, p. 50:

Let z=f(x,y) define z as a function of x and y. The differential of z, written dz or df is defined by:

$$dz(x,y,\Delta x, \Delta y) = \frac{\partial z}{\partial x}\Delta x+\frac{\partial z}{\partial y} \Delta y$$

They tell us that if $z = \hat{x}$ (see my previous post for their definition of hatted variables) then $d\hat{x} = \Delta x$, and if $z = \hat{y}$, then $d\hat{y} = \Delta y$. Could we go for the hat-trick and say $dz = \Delta z$ in either of those cases, being then a linear approximation of a linear function? Or if not, why not?
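A minimal numeric sketch of the quoted definition, with my own example z = x·y² (not from the book): dz is the linear part of the actual change in z, and they differ by second-order terms.

```python
def f(x, y):
    return x * y**2

def dz(x, y, dx, dy):
    zx = y**2          # ∂z/∂x
    zy = 2 * x * y     # ∂z/∂y
    return zx * dx + zy * dy

x, y, dx, dy = 2.0, 3.0, 0.01, -0.02
actual = f(x + dx, y + dy) - f(x, y)
print(actual, dz(x, y, dx, dy))  # close; the gap is second order in (dx, dy)
```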

They go on to rewrite this equation with hatted dx and dy. On p. 51, they decide to replace $d\hat{x}$ and $d\hat{y}$ with $dx$ and $dy$, which they say is a notation that "became custom" "in the course of time". They warn:

Keep in mind that $dx$ and $dy$ mean $d\hat{x}$ and $d\hat{y}$ and are therefore differentials not increments.

But didn't they just define them as increments in the equation above? If none of the d-things are increments, doesn't their definition of a differential become totally circular: a differential is... a multiple of a differential.

Char. Limit
Gold Member
Just as a little joke here...

No, W-hat is not a differential. W-hat is a unit vector.

It's Eudoxus's method of exhaustion.

I'm pretty sure they used classical logic in classical times. Where did you get the idea they rejected the law of the excluded middle?

(p.s. "infinite intersections"? )

I feel as though I've read something saying that Aristotle did not actually "believe" that $A = \pi r^2$, only that it was an approximation. I don't want to hijack this thread, so just take it as my mistake.

Update: Okay, wait, I found it: http://www.sciencenews.org/view/generic/id/8974/title/Math_Trek__A_Prayer_for_Archimedes

Critically, Archimedes never claimed that by adding triangles forever, you could make the straight-line construction exactly equal to the section of the parabola. That would require an actual infinity of triangles. Instead, he just said that you can make the approximation as good as you like, so he was sticking with potential infinity.

Modern historians and mathematicians have always believed whenever Archimedes dealt with infinities, he kept strictly to the potential kind. But Netz, who transcribed the newly found text, says that the recent discoveries show that Archimedes indeed used the notion of actual infinity. Netz and the project's lead researcher, William Noel of the Walters Art Museum in Baltimore, have co-authored a new book, The Archimedes Codex, which describes this discovery and the other facets of the project. It is scheduled for release on Nov. 1 of this year.

Seems I'm mistaken on the message of the article :(

PS: Oh yeah, infinite intersections would give you a single point in the reals, where finite intersections cannot.
