# Definition of derivative

matt grime said:
in R^n f is differentiable at x if there is a linear map Df(x) satisfying

f(x+h)=f(x)+Df(x)h + ho(|h|)

this can be extended to any place where there is a notion of linear map or a norm such as |·| to the reals, or some other ordered space.
Is there an easy way to extend this particular definition to when f : R^n -> R^m?

Hurkyl
Staff Emeritus
Gold Member
A function whose range is R^m is simply m functions whose range is R.

Yes, that makes sense. What I'm getting at is the term proportional to h at the end. If h is a vector in R^n and f(h) is a vector in R^m, how do you make that particular definition work?

Hurkyl
Bleh, I wasn't paying any attention at all. I'm embarrassed to have written my previous post!

That definition already works for f : R^n → R^m.

Though, I guess it's much less complicated to write R(h) instead of h o(|h|), and require that |R(h)| / |h| → 0. (R for remainder)
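As a sanity check of that remainder formulation, here is a small Python sketch (the map f : R^2 → R^2 and the base point are arbitrary choices of mine, just for illustration):

```python
import math

# Hypothetical example map f : R^2 -> R^2 (my choice, not from the thread).
def f(x, y):
    return (x * x + y, math.sin(x) * y)

# Its derivative Df(x) applied to h = (hx, hy), computed by hand:
# row 1: (2x, 1), row 2: (cos(x) y, sin(x)).
def Df_apply(x, y, hx, hy):
    return (2 * x * hx + hy, math.cos(x) * y * hx + math.sin(x) * hy)

# Remainder R(h) = f(x+h) - f(x) - Df(x)h; the test is |R(h)| / |h| -> 0.
def remainder_ratio(x, y, hx, hy):
    fx, fy = f(x, y)
    gx, gy = f(x + hx, y + hy)
    lx, ly = Df_apply(x, y, hx, hy)
    return math.hypot(gx - fx - lx, gy - fy - ly) / math.hypot(hx, hy)

for t in (1e-1, 1e-2, 1e-3):
    print(remainder_ratio(1.0, 2.0, t, t))
```

The printed ratios shrink roughly linearly in |h|, as expected for a twice-differentiable f.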

I could live with R(h) as the remainder if it is a vector in R^m. But it seems that h o(|h|) is a vector in R^n, and you can't add that to two vectors in R^m, can you?

Hurkyl
You're assuming the thing in the o(|h|) is a number: it could be, for example, an operator R^n → R^m whose operator norm is asymptotically less than |h|.

You're right, I was assuming that. I know it doesn't really matter what the order is, but it might be more suggestive to write it o(|h|)h or something like that, to make it clearer that o(|h|) is a linear map. Thanks for the help.

Hurkyl
It can't be a linear map. (Unless it's the zero operator)

Ah, okay. Can you explain?

edit: I guess it's because, if it were linear, you could always add more vectors to the sum so that

o(|h|)(Sum of vectors) > |h| as h->0?

Hurkyl
Well, if o(|h|) was hiding a linear map, then the remainder would look like:

R(h) = A h

for some matrix A...
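Concretely: if the remainder were R(h) = Ah for a nonzero matrix A, then by homogeneity |R(h)| / |h| would be constant along every ray, so it could not tend to 0. A quick Python check (the matrix is an arbitrary choice of mine):

```python
import math

# An arbitrary nonzero 2x2 matrix A (my choice, for illustration).
A = ((3.0, 1.0),
     (0.0, 2.0))

# |A h| / |h| for h = (hx, hy).
def ratio(hx, hy):
    ax = A[0][0] * hx + A[0][1] * hy
    ay = A[1][0] * hx + A[1][1] * hy
    return math.hypot(ax, ay) / math.hypot(hx, hy)

# Along the ray h = t*(1, 1), the ratio is the same for every t > 0,
# so it cannot tend to 0 unless A = 0.
print(ratio(1.0, 1.0), ratio(1e-6, 1e-6))
```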

Right. I guess what I had in mind was a linear map parametrized by |h|, if that makes any sense at all. But that wouldn't be a linear map since it operates differently on different choices of h. Thanks.

HallsofIvy
Homework Helper
Of course, before you can define "derivative" of a function from R^n to R^m, you have to define "differentiable" (that's different from calculus I, where a function is "differentiable" as long as the derivative exists!).

If f(x) is a function from R^n to R^m, then f is differentiable at x = a if and only if there exist a linear function, L, from R^n to R^m, and a function ε(x), from R^n to R^m, such that

f(x)= f(a)+ L(x-a)+ ε(x) and $$\lim_{|x-a|\rightarrow 0}\frac{|\epsilon(x)|}{|x-a|}= 0.$$

If that is true, then it is easy to show that the linear function, L, is unique (ε is not). We define the "derivative of f at a" to be that linear function, L.

Notice that, by this definition, in the case f : R^1 -> R^1, the derivative of f at a is a linear function from R^1 to R^1, not a number! However, any such linear function must be of the form L(x)= ax, i.e., multiplication by a number. That number is, of course, the "Calculus I" derivative of f.

Similarly, the derivative of a "vector valued function of a real variable", R^1 -> R^m, is a linear function from R^1 to R^m. Any such function can be written L(x)= x<a1, ..., am>, or x times a vector. That vector is the vector of derivatives in the usual "calculus III" sense.
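A small Python sketch of this R^1 -> R^m case (m = 2; the curve is my own example): the componentwise "calculus III" derivative really does make the remainder small to first order.

```python
import math

# Example curve t -> (cos t, t^2) in R^2 (my own choice).
def curve(t):
    return (math.cos(t), t * t)

# Componentwise "calculus III" derivative; the linear map is L(h) = h * velocity(t).
def velocity(t):
    return (-math.sin(t), 2 * t)

# |curve(t+h) - curve(t) - h*velocity(t)| / |h| should tend to 0 with h.
def remainder_ratio(t, h):
    cx, cy = curve(t)
    dx, dy = curve(t + h)
    vx, vy = velocity(t)
    return math.hypot(dx - cx - h * vx, dy - cy - h * vy) / abs(h)

print(remainder_ratio(1.0, 1e-2))
print(remainder_ratio(1.0, 1e-4))
```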

The derivative of a "real valued function of several real variables", R^n -> R^1, is a linear function from R^n to R^1. Such a function can be written as a dot product: <a1, ..., an> dotted with the x-vector. That vector is precisely the "gradient vector" of f. (And recall that, in Calculus III, a function may have partial derivatives at a point but not be "differentiable" there.)
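That parenthetical remark can be checked numerically. Here is the classic counterexample f(x, y) = xy/(x^2 + y^2), with f(0, 0) = 0 (a standard textbook function, not from the thread): both partials exist at the origin, but no linear map satisfies the definition there.

```python
import math

# Standard counterexample: f(x, y) = xy / (x^2 + y^2) away from the origin,
# with f(0, 0) = 0.
def f(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

# The partial derivative at the origin in the x direction: f vanishes on
# the axes, so this difference quotient is exactly 0 (likewise for y).
def partial_x(t):
    return (f(t, 0.0) - f(0.0, 0.0)) / t

# The only candidate derivative is then L = 0, so the remainder ratio is
# |f(h)| / |h|; along h = (t, t) it equals (1/2) / (t * sqrt(2)), which
# grows without bound instead of tending to 0.
def candidate_ratio(t):
    return abs(f(t, t)) / math.hypot(t, t)

print(partial_x(1e-6))                        # 0.0
print(candidate_ratio(1e-3), candidate_ratio(1e-6))
```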

This is, by the way, where the "second derivative test" for max or min (or saddle point) of a function of two variables comes from: You look at $$\frac{\partial^2F}{\partial x^2}\frac{\partial^2F}{\partial y^2}- \left(\frac{\partial^2F}{\partial x\,\partial y}\right)^2.$$ If that is negative at a (where the partial derivatives are 0), then there is a saddle point at a. If that is positive, then you have either a max or min depending on the sign of the second partials (which must be the same).

The point is that, if F : R^2 -> R, then its derivative, at each point, can be represented as a 2-vector (the gradient vector). That means that the derivative function, which assigns to each point the derivative vector, is a function from R^2 to R^2, and its derivative is a linear transformation from R^2 to R^2, which can be represented by a 2 by 2 matrix at each point (the "Hessian" matrix). The calculation $$\frac{\partial^2F}{\partial x^2}\frac{\partial^2F}{\partial y^2}- \left(\frac{\partial^2F}{\partial x\,\partial y}\right)^2$$ is simply the determinant of that matrix. Since the mixed second derivatives are equal, that matrix is symmetric and can, by a coordinate change, be written as a diagonal matrix having the eigenvalues on the diagonal. In that coordinate system, the equation for F is just ax^2+ by^2= C (no xy term), so if a and b are both positive we have a minimum, if both negative a maximum, and if one positive, the other negative, a saddle point. Of course, the determinant (which does not change with a change of coordinate system) is just the product ab.
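A short Python sketch of the test (the example functions and the finite-difference step are my own choices):

```python
# Central-difference second partials of F at (x, y); h is an arbitrary
# small step chosen for illustration.
def second_partials(F, x, y, h=1e-4):
    fxx = (F(x + h, y) - 2 * F(x, y) + F(x - h, y)) / (h * h)
    fyy = (F(x, y + h) - 2 * F(x, y) + F(x, y - h)) / (h * h)
    fxy = (F(x + h, y + h) - F(x + h, y - h)
           - F(x - h, y + h) + F(x - h, y - h)) / (4 * h * h)
    return fxx, fyy, fxy

# Determinant of the Hessian: Fxx*Fyy - Fxy^2.
def hessian_det(F, x, y):
    fxx, fyy, fxy = second_partials(F, x, y)
    return fxx * fyy - fxy * fxy

def saddle(x, y):   # x^2 - y^2: det = -4 < 0, saddle point at (0, 0)
    return x * x - y * y

def bowl(x, y):     # x^2 + y^2: det = 4 > 0 and Fxx > 0, minimum at (0, 0)
    return x * x + y * y

print(hessian_det(saddle, 0.0, 0.0))
print(hessian_det(bowl, 0.0, 0.0))
```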

mathwonk
Homework Helper
to define a derivative you first define what it means for a map to have derivative equal to zero. a map o(h) has derivative equal to zero at h =0 if and only if |o(h)|/|h| approaches zero as h does.

this makes sense for vector valued maps if |h| is a norm.

then the original definition makes sense (corrected) if we say that f is differentiable at x provided there exists a linear map L(h) such that the difference

f(x+h) - L(h) - f(x) has derivative equal to zero at h = 0.

then Df(x) = L.

in the original definition there is possibly an error, since the term h·o(|h|) should have been merely o(|h|) in this sense.

Hurkyl
So what precisely does little-oh notation mean when applied to a vector-valued function? I've only ever really used it in the context of algorithmic analysis, and time complexity isn't vector-valued.

mathwonk
exactly what i said: |o(h)|/|h| approaches zero (a number), as |h| does, or equivalently as h does (a vector).

Hurkyl
Well, |o(h)| / |h| can't approach a number because, technically, o(h) is a set. :tongue2:

I guess this is what I assumed, so it's good to hear confirmation:

$$f \in o(g) \Leftrightarrow \lim \frac{|f|}{|g|} = 0$$

mathwonk
??????

o(h) is a vector valued function of h, which is also a vector.

oh i see, my leg is being pulled... this notation was used this way by Hardy and also by Loomis, among others.

Hurkyl
!

In computer science, the notation o(f) is the class of all functions that are asymptotically less than f. Specifically:

$$o(f) := \left\{ g \, | \, \lim_{x \rightarrow \infty} \frac{g(x)}{f(x)} = 0 \right\}$$

(P.S. how do I get a nice, tall $|$?)

I understand that here we are looking at x → 0 instead, but still...

If I were to say something like:

f(x) = x ln x + o(x)

what I really mean is that there exists some function g in o(x) such that

f(x) = x ln x + g(x)
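For instance (my own choice of witness): g(x) = sqrt(x) is in o(x) as x → ∞, so x ln x + sqrt(x) is one concrete function of the form x ln x + o(x).

```python
import math

# Witness: g(x) = sqrt(x) satisfies g(x)/x -> 0 as x -> infinity,
# so g is in o(x).
def g(x):
    return math.sqrt(x)

# One concrete function of the form x ln x + o(x):
def f(x):
    return x * math.log(x) + g(x)

for x in (1e2, 1e4, 1e6):
    print(g(x) / x)   # 0.1, 0.01, 0.001
```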

rachmaninoff
Hurkyl said:
(P.S. how do I get a nice, tall |?)
Use: \left| and \right| .................. $$\left| \frac{\mbox{Big}}{\mbox{Expression}} \right|$$

Hurkyl
It won't complain if it's already in the middle of a left/right pair? Or that it doesn't have a matched right to go with it?

rachmaninoff
Nope. $$\left\{ \mbox{another} \left| \frac{\mbox{big}}{\mbox{expression}} \right\}$$

Thanks for the help, everyone. I think the definition that was given by Halls and mathwonk was probably what matt meant to write. Thanks for clearing that up.