Confused about definition of derivative

Discussion Overview

The discussion revolves around the definition of the derivative as presented in Spivak's "Calculus on Manifolds." Participants explore the nuances of differentiability, particularly in the context of linear transformations and the notation used in the definition.

Discussion Character

  • Technical explanation
  • Conceptual clarification
  • Debate/contested

Main Points Raised

  • Some participants clarify that in the proof provided by Spivak, the second instance of f(h) represents f applied to h, not a different function.
  • Others elaborate on the definition of the derivative, emphasizing that it is a unique linear transformation Df(a) satisfying a specific limit condition.
  • A participant questions why the definition includes λ(h) when λ represents Df(a), suggesting a potential contradiction.
  • Another participant asserts that there is no contradiction, as Df(a)(h) equals f(h) equals λ(h), indicating functional equivalence under certain conditions.
  • Some participants discuss the implications of linearity in the context of the derivative, noting that Df(a) is a linear function of h, not of a.
  • One participant provides examples to illustrate that Df(a) behaves differently for non-linear functions, highlighting the distinction between linear and non-linear behavior.
  • Another participant expresses confusion regarding the interpretation of Df(a)(h) as a matrix product, indicating a struggle with the conceptual understanding of the derivative.

Areas of Agreement / Disagreement

Participants express varying levels of understanding regarding the definition of the derivative and its implications. While some points are clarified, there remains uncertainty and debate about the general applicability of the definitions and the nature of linear transformations in this context.

Contextual Notes

Participants note that the uniqueness of the derivative is not established until later in the text, which affects the interpretation of λ in the definition. Additionally, the discussion reveals the complexity of differentiability when applied to non-linear functions.

krcmd1
Trying to work my way through Spivak Calculus on Manifolds.

On page 16, he states

"A function [tex]f \colon \mathbb{R}^n \to \mathbb{R}^m[/tex] is differentiable at [tex]a \in \mathbb{R}^n[/tex] if there is a linear transformation [tex]\lambda \colon \mathbb{R}^n \to \mathbb{R}^m[/tex] such that


lim h->0 of |f(a+h) - f(a) - [tex]\lambda[/tex](h)|/|h| = 0.

"Note that h is a point of [tex]\mathbb{R}^n[/tex] and f(a+h) - f(a) - [tex]\lambda[/tex](h) a point of [tex]\mathbb{R}^m[/tex], so the norm signs are essential. The linear transformation [tex]\lambda[/tex] is denoted Df(a) and called the derivative of f at a."

On page 20, he states: If [tex]f \colon \mathbb{R}^n \to \mathbb{R}^m[/tex] is a linear transformation, then Df(a) = f.

His proof states, on p 21:

lim as |h| -> 0 of |f(a + h) - f(a) - f(h)| / |h| =


lim as |h| -> 0 of |f(a) + f(h) - f(a) - f(h)| / |h| = 0

What confuses me is that, in the second limit, the second f(h) seems to represent f(a)(h), while the first f(h) is f of h.

By the way, how do you use tex to write |h| -> 0 below lim?

Thank you, in advance.
 
The second f(h) is f applied to h, as he is proving that Df(a) = f, not that Df(a) = f(a).
 
To elaborate, the derivative of [tex]f \colon \mathbb{R}^n \to \mathbb{R}^m[/tex] at [tex]a \in \mathbb{R}^n[/tex] is defined to be the unique linear transformation [tex]Df(a) \colon \mathbb{R}^n \to \mathbb{R}^m[/tex] satisfying
[tex]\lim_{h \to 0} \frac{|f(a+h) - f(a) - Df(a)(h)|}{|h|} = 0.[/tex]
Assuming f is linear, we're trying to prove that f is the derivative of f at a, so we put f in for Df(a):
[tex]\lim_{h \to 0} \frac{|f(a+h) - f(a) - f(h)|}{|h|} = \lim_{h \to 0} \frac{|f(a) + f(h) - f(a) - f(h)|}{|h|} = 0.[/tex]

(Click on the LaTeX images to see the code; that should answer your question about LaTeX. Note: it looks rather ugly if you try to combine regular text with LaTeX in a single formula; you should use one or the other for the entire formula.)
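For the limit with |h| -> 0 written underneath, a subscript on \lim is enough in displayed math. A minimal sketch of the source for the first limit in the proof, assuming the forum's [tex] tags render in display style (in inline style the subscript is set to the side unless \limits is added after \lim):

```latex
\lim_{|h| \to 0} \frac{|f(a+h) - f(a) - f(h)|}{|h|} = 0
```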
 
Thank you both.

If I may, why then in his definition of derivative does the expression include [tex]\lambda[/tex](h) with [tex]\lambda[/tex] representing Df(a)?
 
krcmd1 said:
Thank you both.

If I may, why then in his definition of derivative does the expression include [tex]\lambda[/tex](h) with [tex]\lambda[/tex] representing Df(a)?

There's no contradiction there. Df(a)(h) = f(h) = [itex]\lambda[/itex](h). Functionally, Df(a) = f = [itex]\lambda[/itex].
 
Thank you.

Is that so in just the proof, where f is linear, or also in the definition of differentiable?
 
He uses lambda in the definition since he hasn't proved uniqueness yet. After he proves uniqueness, he can call lambda = Df(a) THE derivative of f at a.
 
Sorry to be so dense or so literal.

Slider 142 wrote:

"There's no contradiction there. Df(a)(h) = f(h) = [tex]\lambda[/tex](h). Functionally, Df(a) = f = [tex]\lambda[/tex]"

It seems to me that this is true only if Df(a) is linear in (a) (as in the proof), not in the general case of the definition, right? I mean, if f(x) = x**2, Df(a) = 2a, so Df(a)(h) = 2a(h), not h**2.
 
That is indeed true. Df(a)(h) is always a linear function, not of a, but of h. The derivative is a way of looking at a non-linear function by looking at its locally linear behavior, since we know everything there is to know about linear functions (linear algebra). This is part of the definition of [itex]\lambda[/itex]. Finding a unique non-linear function that behaved this way would be impossible, and much less useful.
For example, if f(x) = x^3, then Df(a) = [3a^2]. I put that in square brackets because it is not a number; it is a function, or 1x1 matrix. A better way of writing it is to include its argument, h: Df(a)(h) = [3a^2]h. Df(a) in this context is a function of h, not a. Df(a) is different for each a, and it is only a convenience for these simple functions that we can express its general form as a polynomial in a for each a. If you want to think of it as a function of a, note that its argument is a point a in the domain of f, but its output is not a number, but a linear function from the domain of f to a subset of the codomain of f. The proper terminology for these types of functions and the proper way to deal with their domains and codomains will be revealed as you go along in Spivak and differential geometry.
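To see that [3a^2] really satisfies Spivak's limit condition for f(x) = x^3, the numerator can be expanded directly. This is only a worked check of the example above, using the same symbols a and h:

[tex]f(a+h) - f(a) - 3a^2 h = (a+h)^3 - a^3 - 3a^2 h = 3a h^2 + h^3,[/tex]
[tex]\lim_{h \to 0} \frac{|3a h^2 + h^3|}{|h|} = \lim_{h \to 0} |h| \, |3a + h| = 0.[/tex]

The terms that are linear in h cancel exactly, and what remains goes to zero faster than |h|.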
Similarly, if f(x, y) = x + y, Df(a) is the linear map represented by the matrix [1, 1]. h in this case would be a vector [h1, h2]. The derivative in this case is a linear map from R^2 into R. In general, the derivative Df(a) at each point 'a' is a linear map from R^n into R^m whenever f is a differentiable function at 'a' from R^n into R^m. From our study of linear algebra, we know all such maps can be represented by matrices of real numbers. The components of the matrix of the derivative at 'a' will of course depend on 'a'.
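The two examples above can also be checked numerically. The following is a rough sketch, not anything from Spivak: it evaluates the quotient from the definition for shrinking h. The helper names (norm, quotient, cube, d_cube_at_2, not_derivative, add, d_add) and the sample point a = 2 are assumptions made for illustration.

```python
import math


def norm(v):
    """Euclidean norm of a vector given as a tuple of floats."""
    return math.sqrt(sum(x * x for x in v))


def quotient(f, a, lam, h):
    """Return |f(a+h) - f(a) - lam(h)| / |h| from the definition."""
    a_plus_h = tuple(ai + hi for ai, hi in zip(a, h))
    residual = tuple(u - v - w for u, v, w in zip(f(a_plus_h), f(a), lam(h)))
    return norm(residual) / norm(h)


def cube(p):
    """f(x) = x^3, written as a map from 1-tuples to 1-tuples."""
    return (p[0] ** 3,)


def d_cube_at_2(h):
    """Candidate Df(2)(h) = 3 * 2^2 * h = 12h (linear in h)."""
    return (12.0 * h[0],)


def not_derivative(h):
    """A non-linear candidate, h -> h^3, included only for comparison."""
    return (h[0] ** 3,)


def add(p):
    """f(x, y) = x + y, a linear map from R^2 to R."""
    return (p[0] + p[1],)


def d_add(h):
    """Candidate Df(a)(h) = [1, 1] h = h1 + h2, the same for every a."""
    return (h[0] + h[1],)


if __name__ == "__main__":
    for t in (1e-1, 1e-2, 1e-3):
        print(
            t,
            quotient(cube, (2.0,), d_cube_at_2, (t,)),     # shrinks with t
            quotient(cube, (2.0,), not_derivative, (t,)),  # stays near 12
            quotient(add, (1.0, 2.0), d_add, (t, -2.0 * t)),  # zero up to rounding
        )
```

With the linear candidate 12h the quotient shrinks along with h, while the h^3 candidate leaves a quotient that settles near 12, which is the point made above about Df(a)(h) not being h^3. For the linear map x + y the quotient is zero up to rounding, matching Df(a) = f.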
 
Ahhhhh. Thank you.

So in the definition of differentiability, "Df(a)(h)" is "Df(a) of h", which is a matrix product of Df(a), a matrix, times a column matrix or point in the domain, and not the product of a number and a point. I'm experiencing some mental vertigo at the moment as my mind stretches.
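A small sketch of that reading, using the two matrices mentioned earlier in the thread ([3a^2] for x^3 and [1, 1] for x + y). The function names apply_matrix, matrix_of_d_cube and matrix_of_d_add are made up for this illustration, and matrices are stored as plain lists of rows:

```python
def apply_matrix(matrix, h):
    """Multiply an m-by-n matrix (list of rows) by a length-n vector h."""
    return [sum(m_ij * h_j for m_ij, h_j in zip(row, h)) for row in matrix]


def matrix_of_d_cube(a):
    """1x1 matrix [3a^2] representing Df(a) for f(x) = x^3."""
    return [[3.0 * a * a]]


def matrix_of_d_add(a):
    """1x2 matrix [1, 1] representing Df(a) for f(x, y) = x + y (same for every a)."""
    return [[1.0, 1.0]]


# Df(2)(0.5) for x^3: the 1x1 matrix [12] times the vector [0.5].
print(apply_matrix(matrix_of_d_cube(2.0), [0.5]))                # [6.0]
# Df(a)(h) for x + y with h = (2, 3): the row [1, 1] times the column h.
print(apply_matrix(matrix_of_d_add((1.0, 2.0)), [2.0, 3.0]))     # [5.0]
```

The point a only selects which matrix is used; the vector h is what the matrix is applied to.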


Thanks again.
 
