Definition of derivative

1. Aug 13, 2005

PBRMEASAP

Is there an easy way to extend this particular definition to when f : R^n -> R^m?

Last edited: Aug 13, 2005
2. Aug 13, 2005

Hurkyl

Staff Emeritus
A function whose range is R^m is simply m functions whose range is R.

3. Aug 13, 2005

PBRMEASAP

Yes, that makes sense. What I'm getting at is the term proportional to h at the end. If h is a vector in R^n and f(h) is a vector in R^m, how do you make that particular definition work?

4. Aug 13, 2005

Hurkyl

Staff Emeritus
Bleh, I wasn't paying any attention at all. I'm embarrassed to have written my previous post!

That definition already works for f : R^n → R^m.

Though, I guess it's much less complicated to write R(h) instead of h o(|h|), and require that |R(h)| / |h| → 0. (R for remainder)
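As a quick numerical sanity check of this remainder condition (the map f, the point a, and the Jacobian below are my own illustrative choices, not anything from the thread), here is a sketch:

```python
import math

# Illustrative map f : R^2 -> R^2, f(x, y) = (x^2, x*y), at the point a = (1, 2).
def f(x, y):
    return (x * x, x * y)

# Candidate linear map L: the Jacobian of f at a = (1, 2), i.e. [[2, 0], [2, 1]],
# since d(x^2) = 2x dx and d(xy) = y dx + x dy.
def L(h1, h2):
    return (2 * h1, 2 * h1 + h2)

a = (1.0, 2.0)
fa = f(*a)

def remainder_ratio(h1, h2):
    """|R(h)| / |h|, where R(h) = f(a + h) - f(a) - L(h)."""
    fx = f(a[0] + h1, a[1] + h2)
    r = (fx[0] - fa[0] - L(h1, h2)[0],
         fx[1] - fa[1] - L(h1, h2)[1])
    return math.hypot(*r) / math.hypot(h1, h2)

# The ratio goes to 0 with |h| (here approximately 0.1, 0.001, 1e-05):
for t in (1e-1, 1e-3, 1e-5):
    print(remainder_ratio(t, t))
```

For this quadratic example the remainder is exactly (h1^2, h1*h2), so the ratio shrinks linearly in |h|, which is what |R(h)|/|h| → 0 demands.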

Last edited: Aug 13, 2005
5. Aug 13, 2005

PBRMEASAP

I could live with R(h) as the remainder if it is a vector in R^m. But it seems that h o(|h|) is a vector in R^n, and you can't add that to two vectors in R^m, can you?

6. Aug 13, 2005

Hurkyl

Staff Emeritus
You're assuming the thing in the o(|h|) is a number: it could be, for example, an operator R^n → R^m whose operator norm is asymptotically less than |h|.

7. Aug 13, 2005

PBRMEASAP

You're right, I was assuming that. I know it doesn't really matter what the order is, but it might be more suggestive to write it o(|h|)h or something like that, to make it clearer that o(|h|) is a linear map. Thanks for the help.

8. Aug 13, 2005

Hurkyl

Staff Emeritus
It can't be a linear map. (Unless it's the zero operator)

9. Aug 13, 2005

PBRMEASAP

Ah, okay. Can you explain?

edit: I guess it's because, if it were linear, you could always add more vectors to the sum so that

o(|h|)(Sum of vectors) > |h| as h->0?

Last edited: Aug 13, 2005
10. Aug 13, 2005

Hurkyl

Staff Emeritus
Well, if o(|h|) was hiding a linear map, then the remainder would look like:

R(h) = A h

for some matrix A...

11. Aug 13, 2005

PBRMEASAP

Right. I guess what I had in mind was a linear map parametrized by |h|, if that makes any sense at all. But that wouldn't be a linear map since it operates differently on different choices of h. Thanks.

12. Aug 14, 2005

HallsofIvy

Staff Emeritus
Of course, before you can define the "derivative" of a function from R^n to R^m, you have to define "differentiable" (that's different from Calculus I, where a function is "differentiable" as long as the derivative exists!).

If f(x) is a function from R^n to R^m, then f is differentiable at x = a if and only if there exist a linear function, L, from R^n to R^m, and a function ε(x), from R^n to R^m, such that

f(x) = f(a) + L(x - a) + ε(x) and $$\lim_{|x-a|\rightarrow 0}\frac{\epsilon(x)}{|x-a|} = 0.$$

If that is true, then it is easy to show that the linear function, L, is unique (ε is not). We define the "derivative of f at a" to be that linear function, L.

Notice that, by this definition, in the case f : R^1 → R^1, the derivative of f at a is a linear function from R^1 → R^1, not a number! However, any such linear function must be of the form L(x) = ax, that is, multiplication by a number. That number is, of course, the "Calculus I" derivative of f.

Similarly, the derivative of a "vector valued function of a real variable", R^1 → R^m, is a linear function from R^1 to R^m. Any such function can be written L(x) = x⟨a_1, ..., a_m⟩, or x times a vector. That vector is the vector of derivatives in the usual "Calculus III" sense.

The derivative of a "real valued function of several real variables", R^n → R^1, is a linear function from R^n to R^1. Such a function can be written as a dot product: ⟨a_1, ..., a_n⟩ · x. That vector is precisely the "gradient vector" of f. (And recall that, in Calculus III, a function may have partial derivatives at a point but not be "differentiable" there.)
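A small numerical sketch of this gradient case (the function F and the point a below are hypothetical examples chosen for illustration, not from the thread):

```python
import math

# Illustrative F : R^2 -> R, F(x, y) = x^2 * y, whose gradient is (2xy, x^2).
F = lambda x, y: x * x * y
grad = lambda x, y: (2 * x * y, x * x)

a = (1.0, 3.0)
g = grad(*a)  # the gradient at a, namely (6.0, 1.0)

def ratio(h1, h2):
    """|F(a + h) - F(a) - <grad(a), h>| / |h|, which should tend to 0."""
    lin = g[0] * h1 + g[1] * h2  # the dot product <grad(a), h>
    return abs(F(a[0] + h1, a[1] + h2) - F(*a) - lin) / math.hypot(h1, h2)

# The ratio shrinks with |h|, confirming that the dot product with the
# gradient is the (unique) linear map L in the definition above:
for t in (1e-2, 1e-4):
    print(ratio(t, t))
```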

This is, by the way, where the "second derivative test" for a max or min (or saddle point) of a function of two variables comes from: you look at $$\frac{\partial^2F}{\partial x^2}\frac{\partial^2F}{\partial y^2} - \left(\frac{\partial^2F}{\partial x\,\partial y}\right)^2.$$ If that is negative at a (where the partial derivatives are 0), then there is a saddle point at a. If it is positive, then you have either a max or a min, depending on the sign of the second partials (which must be the same).

The point is that, if F : R^2 → R, then its derivative at each point can be represented as a 2-vector (the gradient vector). That means the derivative function, which assigns to each point its gradient vector, is a function from R^2 to R^2, and its derivative is a linear transformation from R^2 to R^2, which can be represented at each point by a 2 by 2 matrix (the "Hessian" matrix). The calculation $$\frac{\partial^2F}{\partial x^2}\frac{\partial^2F}{\partial y^2} - \left(\frac{\partial^2F}{\partial x\,\partial y}\right)^2$$ is simply the determinant of that matrix. Since the mixed second derivatives are equal, that matrix is symmetric and can, by a coordinate change, be written as a diagonal matrix having the eigenvalues on the diagonal. In that coordinate system the equation for F is just ax^2 + by^2 = C (no xy term), so if a and b are both positive we have a minimum, if both negative a maximum, and if one is positive and the other negative, a saddle point. Of course, the determinant (which does not change under a change of coordinate system) is just the product ab.
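To see the determinant test in action numerically (the two functions and the finite-difference step below are my own illustrative choices):

```python
def hessian_det(F, x, y, e=1e-4):
    """Finite-difference estimate of Fxx*Fyy - Fxy^2 at (x, y)."""
    Fxx = (F(x + e, y) - 2 * F(x, y) + F(x - e, y)) / e**2
    Fyy = (F(x, y + e) - 2 * F(x, y) + F(x, y - e)) / e**2
    Fxy = (F(x + e, y + e) - F(x + e, y - e)
           - F(x - e, y + e) + F(x - e, y - e)) / (4 * e**2)
    return Fxx * Fyy - Fxy**2

saddle = lambda x, y: x * x - y * y   # det = (2)(-2) - 0 = -4 < 0: saddle point
bowl   = lambda x, y: x * x + y * y   # det = (2)(2) - 0 = 4 > 0: max or min

print(hessian_det(saddle, 0.0, 0.0))  # negative
print(hessian_det(bowl, 0.0, 0.0))    # positive
```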

Last edited: Aug 14, 2005
13. Aug 14, 2005

mathwonk

to define a derivative you first define what it means for a map to have derivative equal to zero. a map o(h) has derivative equal to zero at h = 0 if and only if |o(h)|/|h| approaches zero as h does.

this makes sense for vector valued maps if |h| is a norm.

then the original definition makes sense (corrected) if we say that f is differentiable at x provided there exists a linear map L(h) such that the difference

f(x+h) - L(h) - f(x) has derivative equal to zero at h = 0.

then Df(x) = L.

in the original definition there is possibly an error, since the term h.o(h) should have been merely o(h) in this sense.

14. Aug 14, 2005

Hurkyl

Staff Emeritus
So what precisely does little-oh notation mean when applied to a vector-valued function? I've only ever really used it in the context of algorithmic analysis, and time complexity isn't vector valued.

15. Aug 14, 2005

mathwonk

exactly what i said: |o(h)|/|h| approaches zero (a number), as |h| does, or equivalently as h does (a vector).

16. Aug 14, 2005

Hurkyl

Staff Emeritus
Well, |o(h)| / |h| can't approach a number because, technically, o(h) is a set. :tongue2:

I guess this is what I assumed, so it's good to hear confirmation:

$$f \in o(g) \Leftrightarrow \lim \frac{|f|}{|g|} = 0$$

18. Aug 14, 2005

mathwonk

??????

o(h) is a vector valued function of h, which is also a vector.

oh i see, my leg is being pulled... this notation was used this way by hardy and also by loomis, among others.

19. Aug 14, 2005

Hurkyl

Staff Emeritus
!

In computer science, the notation o(f) is the class of all functions that are asymptotically less than f. Specifically:

$$o(f) := \left\{ g \, | \, \lim_{x \rightarrow \infty} \frac{g(x)}{f(x)} = 0 \right\}$$

(P.S. how do I get a nice, tall $|$?)

I understand that here we are looking at x → 0 instead, but still...

If I were to say something like:

f(x) = x ln x + o(x)

what I really mean is that there exists some function g in o(x) such that

f(x) = x ln x + g(x)
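A tiny numerical illustration of this reading (the concrete choice g(x) = sqrt(x) below is my own example of a function in o(x)):

```python
import math

# If f(x) = x*ln(x) + sqrt(x), then f(x) = x ln x + o(x) as x -> infinity,
# because g(x) = sqrt(x) satisfies g(x)/x = 1/sqrt(x) -> 0.
f = lambda x: x * math.log(x) + math.sqrt(x)
g = lambda x: f(x) - x * math.log(x)  # recover the o(x) part

for x in (1e2, 1e6, 1e10):
    print(g(x) / x)  # approximately 0.1, 0.001, 1e-05
```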

Last edited: Aug 14, 2005
20. Aug 14, 2005

rachmaninoff

Use \left| and \right| : $$\left| \frac{\mbox{Big}}{\mbox{Expression}} \right|$$