Doubt about theorem in Calculus on Manifolds

psie · Jun 11, 2024

2-8 Theorem. If ##f:\mathbb R^n\to\mathbb R^m##, then ##Df(a)## exists if all ##D_jf^i(x)## exist in an open set containing ##a## and if each function ##D_jf^i## is continuous at ##a##. (Such a function is called continuously differentiable at ##a##.)

Here ##Df(a)## is the derivative of ##f## at ##a##, i.e. a linear transformation ##\mathbb R^n\to\mathbb R^m##. My question is simply: if the assumptions in the theorem hold, is the map ##a\mapsto Df(a)## also continuous? Spivak seems to prove only the existence, not the continuity. If it is true that ##a\mapsto Df(a)## is also continuous, I'd be grateful for some guidance on how to prove this.

EDIT: Here's an attempt at a proof. If we think of ##Df(a)## as a matrix instead of a linear transformation, which we denote ##f'(a)##, then the map ##a\mapsto f'(a)## is a map into ##\mathbb R^{m\cdot n}##. A map ##g## into a finite product space is continuous iff each ##\pi^{i}\circ g## is continuous. We are given that the partials exist and are continuous at ##a##, but those are simply the components ##\pi^{ij}\circ f'##, and so ##f'## is continuous at ##a##. Thoughts? Comments?
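
To spell the attempt out, here is a sketch under one extra assumption: that ##Df(x)## exists for all ##x## near ##a##, so that ##x\mapsto f'(x)## is defined there. The choice of the 1-norm on ##\mathbb R^{m\cdot n}## is also an assumption made here; all norms on a finite-dimensional space are equivalent. The matrix is ##f'(x)=\left(D_jf^i(x)\right)## and
\begin{align*}
\|f'(x)-f'(a)\|_1=\sum_{i=1}^m\sum_{j=1}^n\left|D_jf^i(x)-D_jf^i(a)\right| .
\end{align*}
The right-hand side is a finite sum of non-negative terms, so it tends to ##0## as ##x\to a## iff every entry does. Note that the theorem as stated only gives existence of ##Df## at the point ##a## itself; the extra assumption holds, for instance, when the partials are continuous on a whole open set containing ##a##, because then the theorem applies at every point of that set.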

fresh_42 · Jun 11, 2024

You can reduce the situation to ##m=1## i.e. one component of ##f.## That makes it a lot easier.
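
A sketch of why the reduction works, writing ##f=(f^1,\ldots,f^m)## as in Spivak and using the Euclidean norm on ##\mathbb R^m## (the norm is an assumption made here): if each ##f^i## is differentiable at ##a##, put ##L:=(Df^1(a),\ldots,Df^m(a))##. Then
\begin{align*}
\left\|f(a+h)-f(a)-L(h)\right\|\le\sum_{i=1}^m\left|f^i(a+h)-f^i(a)-Df^i(a)(h)\right| ,
\end{align*}
and after dividing by ##\|h\|## each term on the right tends to ##0## as ##h\to 0##. Hence ##f## is differentiable at ##a## with ##Df(a)=L##, and it is enough to treat one component at a time.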

Yes, if the conditions in the theorem hold then ##a\longmapsto D_a(f)## is continuous, too. I don't have a proof in mind so I would have to search the internet first.

Here is a list and counterexamples of what follows from what:
https://www.physicsforums.com/insights/the-pantheon-of-derivatives-i/
You want to show the first implication: (all) continuous partial differentiability implies continuous total differentiability. The counterexample shows that at some point in the proof you need that the partial derivatives are continuous. Otherwise, you could simply write down the gradient.

psie · Jun 11, 2024

fresh_42 said:

Yes, if the conditions in the theorem hold then ##a\longmapsto D_a(f)## is continuous, too. I don't have a proof in mind so I would have to search the internet first.

Ok. Did you see my attempt at proving it?

I think it simply follows from a theorem in topology, with the understanding that ##\mathbb R^{m\times n}## is a product of ##mn## copies of ##\mathbb R##, if I'm not mistaken.

fresh_42 · Jun 11, 2024

Found it. Unfortunately, every author has his own notation. In my case, I even have two translations to perform. The proof is given for ##n=2.##

If ##a\longmapsto D_a(f)## exists and is continuous then ##f_x:= \dfrac{\partial f}{\partial x} := D_a(f)\cdot (1,0)## and ##f_y:= \dfrac{\partial f}{\partial y} := D_a(f)\cdot (0,1)## do the job including continuity.

Yes, I saw your attempt but I think you are a little bit confused. We do not need manifolds here (and no metric ##g## or whatever it is), and we can assume ##m=1## so we don't need projections either.

Now to the other direction, if the partial derivatives ##f_x## and ##f_y## exist and are continuous. We need to find a linear map ##L## such that ##f(a+v)-f(a)- L(v) < \varepsilon .##

First, we choose a rectangular neighborhood of ##a## such that ##|f_x(p)-f_x(a)|<\varepsilon ## and ##|f_y(p)-f_y(a)|<\varepsilon ## for all of its points ##p##. This is possible because of the continuity of the partial derivatives ##f_x## and ##f_y##. The trick is now to consider (with ##L(\lambda,\mu):=\lambda \cdot f_x(a)+ \mu\cdot f_y(a)## and ##v=(\lambda,\mu)##)
\begin{align*}
f(a+v)- f(a)- L(v)&=\underbrace{\left(f(a+v)-f(a+(\lambda,0))-\mu \cdot f_y(a)\right)}_{=A}\\
&\phantom{=}+\underbrace{\left(f(a+(\lambda,0))-f(a)-\lambda \cdot f_x(a)\right)}_{=B}
\end{align*}
The second term ##B## is ##o(\|v\|)## because of partial differentiability along ##x## at ##a##.
The first term ##A## uses partial differentiability along ##y## at ##a+(\lambda,0)## and ##|f_y(a+(\lambda,0))-f_y(a)|<\varepsilon .##

Please check the details and whether I made any typos. The proof uses the definition of differentiation with the Weierstraß formula as in the link I gave. It is a proof for ##n=2.## You can do it recursively for any other ##n##; that is only more work to write out.
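
For reference, here is one way the estimate can be completed with ##\varepsilon## and ##\delta##. It is a sketch that uses the mean value theorem in the ##y##-direction (as Spivak's proof of 2-8 does); the symbol ##\theta## is introduced here. The precise goal is ##|f(a+v)-f(a)-L(v)|\le\varepsilon\cdot(|\lambda|+|\mu|)## for all sufficiently small ##v##, which is what "##o(\|v\|)##" means.

Let ##\varepsilon>0##. Choose ##\delta>0## such that ##|f_y(p)-f_y(a)|<\varepsilon## whenever both coordinates of ##p-a## are smaller than ##\delta## in absolute value, and such that ##|f(a+(\lambda,0))-f(a)-\lambda f_x(a)|\le\varepsilon|\lambda|## whenever ##|\lambda|<\delta##. The second condition is just the definition of ##f_x(a)##. For ##|\lambda|,|\mu|<\delta## the mean value theorem applied to ##t\mapsto f(a+(\lambda,t))## gives a ##\theta## between ##0## and ##\mu## with ##f(a+v)-f(a+(\lambda,0))=\mu\cdot f_y(a+(\lambda,\theta))##, hence
\begin{align*}
|A|&=|\mu|\cdot\left|f_y(a+(\lambda,\theta))-f_y(a)\right|\le\varepsilon|\mu| ,\\
|B|&\le\varepsilon|\lambda| ,\\
\left|f(a+v)-f(a)-L(v)\right|&\le|A|+|B|\le\varepsilon\left(|\lambda|+|\mu|\right)\le 2\varepsilon\|v\| .
\end{align*}
In this version only the continuity of ##f_y## at ##a## is used, together with the existence of ##f_x(a)## and of ##f_y## near ##a##.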

mathwonk · Jun 11, 2024

existence is the hard part, continuity is trivial. i.e. if the derivative exists, then its matrix is the matrix of partials, so the map to the derivative is the map to the matrix of partials, which is continuous if each partial is continuous. I.e. if f is C^1, then the map to the matrix of partials is continuous. existence proves that map is equal to the map to the derivative. ("existence" means that the matrix of partials does give a good linear approximation.)

psie · Jun 12, 2024

Thanks, @fresh_42. I have a couple of questions.

fresh_42 said:

Now to the other direction, if the partial derivatives ##f_x## and ##f_y## exist and are continuous. We need to find a linear map ##L## such that ##f(a+v)-f(a)- L(v) < \varepsilon .##

Could you explain how ##f(a+v)-f(a)- L(v) < \varepsilon ## shows the continuity of the derivative? I was expecting we would need to show ##\lim_{h\to 0}D_{a+h}(f)-D_{a}(f)=0##.

fresh_42 said:

The second term ##B## is ##o(\|v\|)## because of partial differentiability along ##x## at ##a##.
The first term ##A## uses partial differentiability along ##y## at ##a+(\lambda,0)## and ##|f_y(a+(\lambda,0))-f_y(a)|<\varepsilon .##

I'd be really grateful if you could elaborate these two observations a bit more. I don't understand why we require ##|f_x(p)-f_x(a)|<\varepsilon##. Basically I don't understand why we require ##|f_y(p)-f_y(a)|<\varepsilon## either. Moreover, why is ##B## little oh of ##\|v\|## (shouldn't it be ##o(|\lambda|)##?)? If I understand you correctly, ##v=(\lambda,\mu)## is the point which we want to tend to ##0##. Little oh's tend to confuse me, and I'd be grateful if you could explain, with ##\epsilon## and ##\delta##'s how we arrive at $$f(a+v)- f(a)- L(v)<\varepsilon. $$

psie · Jun 12, 2024

fresh_42 said:

Found it. Unfortunately, every author has his own notation. In my case, I even have two translations to perform. The proof is given for ##n=2.##

If ##a\longmapsto D_a(f)## exists and is continuous then ##f_x:= \dfrac{\partial f}{\partial x} := D_a(f)\cdot (1,0)## and ##f_y:= \dfrac{\partial f}{\partial y} := D_a(f)\cdot (0,1)## do the job including continuity.

Yes, I saw your attempt but I think you are a little bit confused. We do not need manifolds here (and no metric ##g## or whatever it is), and we can assume ##m=1## so we don't need projections either.

Now to the other direction, if the partial derivatives ##f_x## and ##f_y## exist and are continuous. We need to find a linear map ##L## such that ##f(a+v)-f(a)- L(v) < \varepsilon .##

First, we choose a rectangular neighborhood of ##a## such that ##|f_x(p)-f_x(a)|<\varepsilon ## and ##|f_y(p)-f_y(a)|<\varepsilon ## for all of its points ##p##. This is possible because of the continuity of the partial derivatives ##f_x## and ##f_y##. The trick is now to consider (with ##L(\lambda,\mu):=\lambda \cdot f_x(a)+ \mu\cdot f_y(a)## and ##v=(\lambda,\mu)##)
\begin{align*}
f(a+v)- f(a)- L(v)&=\underbrace{\left(f(a+v)-f(a+(\lambda,0))-\mu \cdot f_y(a)\right)}_{=A}\\
&\phantom{=}+\underbrace{\left(f(a+(\lambda,0))-f(a)-\lambda \cdot f_x(a)\right)}_{=B}
\end{align*}
The second term ##B## is ##o(\|v\|)## because of partial differentiability along ##x## at ##a##.
The first term ##A## uses partial differentiability along ##y## at ##a+(\lambda,0)## and ##|f_y(a+(\lambda,0))-f_y(a)|<\varepsilon .##

Please check the details and whether I made any typos. The proof uses the definition of differentiation with the Weierstraß formula as in the link I gave. It is a proof for ##n=2.## You can do it recursively for any other ##n##; that is only more work to write out.

I just don't understand how ##f(a+v)-f(a)- L(v) < \varepsilon## shows that ##a\mapsto D_a(f)## is continuous. Do you think you could explain that bit?

fresh_42 · Jun 12, 2024

psie said:

I just don't understand how ##f(a+v)-f(a)- L(v) < \varepsilon## shows that ##a\mapsto D_a(f)## is continuous. Do you think you could explain that bit?

mathwonk said:

existence is the hard part, continuity is trivial. i.e. if the derivative exists, then its matrix is the matrix of partials, so the map to the derivative is the map to the matrix of partials, which is continuous if each partial is continuous.

You can of course do it manually and get
\begin{align*}
\left|D_a(f)-D_{a+\delta}(f)\right|&= \left|\nabla_a(f)-\nabla_{a+\delta}(f)\right|\\
&=\left|\left(\left.\dfrac{\partial }{\partial x_1}\right|_a f-\left.\dfrac{\partial }{\partial x_1}\right|_{a+\delta}f,\ldots,\left.\dfrac{\partial }{\partial x_n}\right|_a f-\left.\dfrac{\partial }{\partial x_n}\right|_{a+\delta}f\right)\right|\\
&\le n\cdot \dfrac{\varepsilon }{n}=\varepsilon
\end{align*}
That's the idea. You may need to adapt it to your definition of continuity, play around with ##\varepsilon :=\max\{\varepsilon_1,\ldots,\varepsilon_n\}## and set the quantifiers in the ##\varepsilon -\delta## definition correctly and such technical details, but at its core it is what @mathwonk said.
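
One possible way to set the quantifiers, as a sketch for ##m=1##: it assumes the partials are continuous at ##a## and that ##D_x(f)## exists for ##x## near ##a##. The names ##D_jf## (the ##j##-th partial) and ##\delta_j## are introduced here. Given ##\varepsilon>0##, choose for each ##j## a ##\delta_j>0## with ##|D_jf(x)-D_jf(a)|<\varepsilon/n## whenever ##\|x-a\|<\delta_j##, and put ##\delta:=\min\{\delta_1,\ldots,\delta_n\}##. Then for ##\|x-a\|<\delta##
\begin{align*}
\left\|D_x(f)-D_a(f)\right\|_1=\sum_{j=1}^n\left|D_jf(x)-D_jf(a)\right|<n\cdot\dfrac{\varepsilon}{n}=\varepsilon .
\end{align*}
So in this formulation it is the minimum of the ##\delta_j## that is needed, while ##\varepsilon## is fixed in advance.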

psie · Jun 12, 2024

fresh_42 said:

\begin{align*}
\left|D_a(f)-D_{a+\delta}(f)\right|&= \left|\nabla_a(f)-\nabla_{a+\delta}(f)\right|\\
&=\left|\left(\left.\dfrac{\partial }{\partial x_1}\right|_a f-\left.\dfrac{\partial }{\partial x_1}\right|_{a+\delta}f,\ldots,\left.\dfrac{\partial }{\partial x_n}\right|_a f-\left.\dfrac{\partial }{\partial x_n}\right|_{a+\delta}f\right)\right|\\
&\le n\cdot \dfrac{\varepsilon }{n}=\varepsilon
\end{align*}

Ok, this helped, thanks. I guess you are using the 1-norm in your norm notation, right? The taxicab norm, in other words?

I think there are two different maps here.

The first one is ##Df:\mathbb R^n\to \mathcal{L}(\mathbb R^n,\mathbb R^m)## given by ##a\mapsto Df(a)##, where ##\mathcal{L}(\mathbb R^n,\mathbb R^m)## is the space of linear transformations with domain and codomain as indicated.
Then there's the map ##Df(a):\mathbb R^n\to\mathbb R^m##, i.e. the linear transformation.

When we want to check the continuity, we are talking about the first map, right?
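
A small example, chosen here purely for illustration, that keeps the two maps apart: take ##f(x,y)=(x^2y,\,x+y)##. Then
\begin{align*}
f'(x,y)=\begin{pmatrix}2xy & x^2\\ 1 & 1\end{pmatrix},\qquad Df(1,2)(h,k)=(4h+k,\;h+k) .
\end{align*}
The first map ##(x,y)\mapsto f'(x,y)## is not linear in ##(x,y)##, but it is continuous because each entry is. The second map, for the fixed point ##a=(1,2)##, is linear in ##(h,k)##.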

fresh_42 · Jun 12, 2024

psie said:

Ok, this helped, thanks. I guess you are using the 1-norm in your norm notation, right? The taxicab norm, in other words?

I think there are two different maps here.

The first one is ##Df:\mathbb R^n\to \mathcal{L}(\mathbb R^n,\mathbb R^m)## given by ##a\mapsto Df(a)##, where ##\mathcal{L}(\mathbb R^n,\mathbb R^m)## is the space of linear transformations with domain and codomain as indicated.

Then there's the map ##Df(a):\mathbb R^n\to\mathbb R^m##, i.e. the linear transformation.

When we want to check the continuity, we are talking about the first map, right?

I set ##m=1## but yes.

You do not need to consider ##m>1## since everything is happening on each component of ##f=(f_1,\ldots,f_m).## It is a rather formal collection of ##f_j\,.## Sure, we need an arbitrary dimension for the manifolds, but the considerations within an atlas, i.e. on ##\mathbb{R}^m## are component by component ##f_j\,.##

The continuity of linear transformations of finite-dimensional vector spaces is more or less trivially true (if ##v \to 0## then ##\varphi (v)\to 0 ## since ##\|\varphi \|<\infty ##).
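
Spelled out as a sketch, with the standard basis vectors ##e_j## and the 1-norm on ##\mathbb R^n## as notation introduced here: for linear ##\varphi:\mathbb R^n\to\mathbb R^m## and ##v=(v_1,\ldots,v_n)##,
\begin{align*}
\|\varphi(v)\|=\Big\|\sum_{j=1}^n v_j\,\varphi(e_j)\Big\|\le\sum_{j=1}^n|v_j|\,\|\varphi(e_j)\|\le\Big(\max_{j}\|\varphi(e_j)\|\Big)\cdot\|v\|_1 .
\end{align*}
So ##\varphi(v)\to 0## as ##v\to 0##, and since ##\varphi(v)-\varphi(w)=\varphi(v-w)## this gives continuity at every point.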

We need to consider the location ##a## as our variable in ##D_a(f).## That's the crucial point whenever you deal with derivatives, to know what the variables are(!), and even more, if it is on manifolds where you additionally have to be aware of whether you're on a manifold or on an atlas of it. Things on Earth are messy, but things on a chart in an atlas of your city are ordinary flat, real calculus. That's why the charts are so helpful. We reduce the complicated situation to a locally easier one. And since differentiability and continuity are local events, we are allowed to do this. Just make the neighborhoods as small as necessary.

I once collected different perspectives on differentiability in a list of ten points and "slope" wasn't even among them (at the beginning):
https://www.physicsforums.com/insights/journey-manifold-su2mathbbc-part/
I wrote that list partly for fun, and it may also be a bit off-topic in that article.
