# Confused about definition of derivative

1. Nov 30, 2008

### krcmd1

Trying to work my way through Spivak Calculus on Manifolds.

On page 16, he states

"A function $$f \colon \mathbb{R}^n \to \mathbb{R}^m$$ is differentiable at $$a \in \mathbb{R}^n$$ if there is a linear transformation $$\lambda \colon \mathbb{R}^n \to \mathbb{R}^m$$ such that

lim h->0 of |f(a+h) - f(a) - $$\lambda$$(h)|/|h| = 0.

"Note that h is a point of $$\mathbb{R}^n$$ and f(a+h) - f(a) - $$\lambda$$(h) a point of $$\mathbb{R}^m$$, so the norm signs are essential. The linear transformation $$\lambda$$ is denoted Df(a) and called the derivative of f at a."

On page 20, he states that if $$f \colon \mathbb{R}^n \to \mathbb{R}^m$$ is a linear transformation, then Df(a) = f.

His proof states, on p 21:

lim as |h| -> 0 of |f(a + h) - f(a) - f(h)| / |h| =

lim as |h| -> 0 of |f(a) + f(h) - f(a) - f(h)| / |h| = 0

What confuses me is that, in the second limit, the second f(h) seems to represent f(a)(h), while the first f(h) is f of h.

By the way, how do you use TeX to write |h| -> 0 below lim?

2. Nov 30, 2008

### slider142

The second f(h) is f applied to h, as he is proving that Df(a) = f, not that Df(a) = f(a).

3. Nov 30, 2008

To elaborate, the derivative of $$f \colon \mathbb{R}^n \to \mathbb{R}^m$$ at $$a \in \mathbb{R}^n$$ is defined to be the unique linear transformation $$Df(a) \colon \mathbb{R}^n \to \mathbb{R}^m$$ satisfying
$$\lim_{h \to 0} \frac{|f(a+h) - f(a) - Df(a)(h)|}{|h|} = 0.$$
Assuming f is linear, we're trying to prove that f is the derivative of f at a, so we put f in for Df(a):
$$\lim_{h \to 0} \frac{|f(a+h) - f(a) - f(h)|}{|h|} = \lim_{h \to 0} \frac{|f(a) + f(h) - f(a) - f(h)|}{|h|} = 0.$$
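
The key step, f(a + h) = f(a) + f(h), can be checked numerically. The following is only a sketch: the matrix `M`, the point `a`, and the helper name `numerator` are made up for illustration, and exact fractions are used so that rounding does not blur the result.

```python
from fractions import Fraction

# Hypothetical linear map f: R^3 -> R^2, given by an arbitrary 2x3 matrix.
M = [[2, -1, 0],
     [1, 3, 5]]

def f(x):
    """Apply the linear map: matrix M times the vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def numerator(a, h):
    """The vector f(a + h) - f(a) - f(h) that appears in the limit."""
    a_plus_h = [ai + hi for ai, hi in zip(a, h)]
    return [u - v - w for u, v, w in zip(f(a_plus_h), f(a), f(h))]

a = [Fraction(1), Fraction(-2), Fraction(7, 3)]
for k in range(1, 6):
    h = [Fraction(1, 10**k), Fraction(-3, 10**k), Fraction(2, 10**k)]
    print(k, numerator(a, h))  # the zero vector for every h
```

Because the numerator is the zero vector for every h, not just for small h, the quotient is identically 0 and the limit is trivially 0.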

(Click on the LaTeX images to see the code; that should answer your question about LaTeX. Note: it looks rather ugly if you try to combine regular text with LaTeX in a single formula; you should use one or the other for the entire formula.)

4. Dec 1, 2008

### krcmd1

Thank you both.

If I may, why then in his definition of derivative does the expression include $$\lambda$$(h) with $$\lambda$$ representing Df(a)?

5. Dec 1, 2008

### slider142

There's no contradiction there. Df(a)(h) = f(h) = $$\lambda$$(h). Functionally, Df(a) = f = $$\lambda$$.

6. Dec 2, 2008

### krcmd1

Thank you.

Is that so in just the proof, where f is linear, or also in the definition of differentiable?

7. Dec 2, 2008

### Vid

He uses $$\lambda$$ in the definition since he hasn't yet proved uniqueness. After he proves uniqueness, he can call $$\lambda$$ = Df(a) THE derivative of f at a.

8. Dec 3, 2008

### krcmd1

Sorry to be so dense or so literal.

slider142 wrote:

"There's no contradiction there. Df(a)(h) = f(h) = $$\lambda$$(h). Functionally, Df(a) = f = $$\lambda$$"

It seems to me that this is true only if Df(a) is linear in a (as in the proof), not in the general case of the definition, right? I mean, if f(x) = x**2, then Df(a) = 2a, so Df(a)(h) = 2a(h), not h**2.

9. Dec 3, 2008

### slider142

That is indeed true. Df(a)(h) is always a linear function, not of a, but of h. The derivative is a way of looking at a non-linear function by looking at its locally linear behavior, since we know everything there is to know about linear functions (linear algebra). This is part of the definition of $$\lambda$$. Finding a unique non-linear function that behaved this way would be impossible, and much less useful.
For example, if f(x) = x^3, then Df(a) = [3a^2]. I put that in square brackets because it is not a number; it is a function, or 1x1 matrix. A better way of writing it is to include its argument, h: Df(a)(h) = [3a^2]h. Df(a) in this context is a function of h, not a. Df(a) is different for each a, and it is only a convenience for these simple functions that we can express its general form as a polynomial in a for each a. If you want to think of it as a function of a, note that its argument is a point a in the domain of f, but its output is not a number, but a linear function from the domain of f to a subset of the codomain of f. The proper terminology for these types of functions and the proper way to deal with their domains and codomains will be revealed as you go along in Spivak and differential geometry.
Similarly, if f(x, y) = x + y, Df(a) is the linear map represented by the matrix [1, 1]. h in this case would be a vector [h1, h2]. The derivative in this case is a linear map from R^2 into R. In general, the derivative Df(a) at each point 'a' is a linear map from R^n into R^m whenever f is a differentiable function at 'a' from R^n into R^m. From our study of linear algebra, we know all such maps can be represented by matrices of real numbers. The components of the matrix of the derivative at 'a' will of course depend on 'a'.
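
The x^3 example can be tried numerically. This is only a sketch: the point a = 2 and the helper names `quotient`, `derivative_at_a`, and `f_itself` are chosen for illustration. It compares the quotient from the definition for the linear candidate h -> 3a^2 h against the non-linear candidate h -> f(h).

```python
from fractions import Fraction

def f(x):
    return x**3

def quotient(candidate, a, h):
    """|f(a + h) - f(a) - candidate(h)| / |h| for scalar a and h."""
    return abs(f(a + h) - f(a) - candidate(h)) / abs(h)

a = Fraction(2)  # illustrative choice of the point a

def derivative_at_a(h):
    # Df(a)(h) = [3a^2] h, which is linear in h
    return 3 * a**2 * h

def f_itself(h):
    # Putting f in place of Df(a) only works when f is linear; x^3 is not
    return f(h)

for k in range(1, 6):
    h = Fraction(1, 10**k)
    print(float(h),
          float(quotient(derivative_at_a, a, h)),  # equals 6h + h^2, shrinks to 0
          float(quotient(f_itself, a, h)))         # equals 12 + 6h, approaches 12
```

Working it out by hand for a = 2 and h > 0, the first quotient is 3ah + h^2 = 6h + h^2, which goes to 0, while the second is 3a^2 + 3ah = 12 + 6h, which does not. So only the linear map [3a^2] satisfies the definition.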

Last edited: Dec 3, 2008

10. Dec 4, 2008

### krcmd1

Ahhhhh. Thank you.

So in the definition of differentiability, "Df(a)(h)" is "Df(a) of h", which is a matrix product of Df(a), a matrix, times a column matrix or point in the domain, and not the product of a number and a point. I'm experiencing some mental vertigo at the moment as my mind stretches.

Thanks again.
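
The "matrix times column vector" reading can be illustrated with a small sketch. The map f(x, y) = (x^2 y, x + y), its hand-computed matrix of partial derivatives, and the point a = (1, 2) are all assumptions chosen for illustration; they are not taken from Spivak.

```python
import math

def f(p):
    """Hypothetical non-linear map f: R^2 -> R^2."""
    x, y = p
    return [x * x * y, x + y]

def Df(a):
    """Matrix of Df(a) for the f above, computed by hand (partial derivatives)."""
    x, y = a
    return [[2 * x * y, x * x],
            [1.0, 1.0]]

def apply(matrix, h):
    """Df(a)(h): the matrix times the column vector h."""
    return [sum(m * hj for m, hj in zip(row, h)) for row in matrix]

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def quotient(a, h):
    """|f(a + h) - f(a) - Df(a)(h)| / |h| with Euclidean norms."""
    a_plus_h = [ai + hi for ai, hi in zip(a, h)]
    linear_part = apply(Df(a), h)
    diff = [u - v - w for u, v, w in zip(f(a_plus_h), f(a), linear_part)]
    return norm(diff) / norm(h)

a = [1.0, 2.0]
for k in range(1, 6):
    t = 10.0 ** -k
    print(t, quotient(a, [t, t]))  # shrinks roughly in proportion to t
```

Here Df(a) is a 2x2 matrix whose entries depend on a, and Df(a)(h) is that matrix applied to h. The printed quotients shrink as h shrinks, which is what the definition asks for.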