The Chain Rule for Multivariable Vector-Valued Functions


Discussion Overview

The discussion revolves around understanding Theorem 12.7 from Tom M Apostol's "Mathematical Analysis," specifically focusing on the Chain Rule for multivariable vector-valued functions. Participants seek clarification on the theorem's equations, definitions, and implications within the context of multivariable differential calculus.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification
  • Homework-related

Main Points Raised

  • One participant seeks to understand how Equation (16) follows from a previous equation, questioning the definitions and roles of the functions involved.
  • Another participant clarifies that ##E(y)## is defined in (17) and distinguishes it from ##E_a(y)## and ##E_b(v)##, emphasizing that they are different functions.
  • There is a discussion about the limit condition for the remainder function ##r(v)## and its implications for the continuity of ##E()## at zero.
  • Participants explore the formal definitions of differentiation and how they relate to Apostol's notation, raising questions about equivalences between different formulations.
  • Clarifications are sought regarding the conditions under which ##r(0) = 0## holds true, with references to the definitions provided in Apostol's work.

Areas of Agreement / Disagreement

Participants express varying levels of understanding and agreement on the definitions and implications of the equations presented. Some participants clarify points for others, but no consensus is reached on all aspects of the theorem or its applications.

Contextual Notes

Participants note that the definitions of the remainder functions and their limits are crucial to understanding the theorem, and there are unresolved questions about the implications of these definitions in different contexts.

Who May Find This Useful

Readers interested in multivariable calculus, particularly those studying differential calculus and the Chain Rule in vector-valued functions, may find this discussion beneficial.

Math Amateur
I am reading Tom M Apostol's book "Mathematical Analysis" (Second Edition) ...

I am focused on Chapter 12: Multivariable Differential Calculus ... and in particular on Section 12.9: The Chain Rule ... ...

I need help in order to fully understand Theorem 12.7, Section 12.9 ...

Theorem 12.7 (including its proof) reads as follows:
[Images: Apostol, Theorem 12.7 (Chain Rule), Parts 1 and 2]


In the proof of Theorem 12.7 we read the following:

" ... ... Using (14) in (15) we find

##f(b+v) - f(b) = f'(b) [ g'(a) (y) ] + f'(b) [ \| y \| E_a(y) ] + \|v \| E_b(v)##

##= f'(b) [ g'(a) (y) ] + \| y \| E(y)## ... ... ... (16)

where ##E(0) = 0## and

##E(y) = f'(b) [ E_a(y) ] + \frac{ \| v \| }{ \| y \| } E_b (v) \ \ \ \ \text{ if } y\neq 0## ... ... ... (17)

... ... ... "

*** EDIT ***

It now occurs to me that in fact Apostol is defining E(y) in equations (16) and (17)

I should have seen this earlier ...

Peter

================================================================

My questions are as follows:

Question 1

Can someone show how Equation (16) follows ... that is ...

... how exactly does ##f(b+v) - f(b) = f'(b) [ g'(a) (y) ] + \| y \| E(y)##

follow from

##f(b+v) - f(b) = f'(b) [ g'(a) (y) ] + f'(b) [ \| y \| E_a(y) ] + \|v \| E_b(v)##?

Question 2

What is ##E## ... I know what ##E_a## and ##E_b## are ... but what is ##E##?

Similarly ... what is ##E(y)## in (16) and in (17) ... shouldn't it be ##E_a(y)## ... ?

Further ... why (formally and rigorously) does ##E(0) = 0## ... ?

Question 3

Can someone please demonstrate how/why

##E(y) = f'(b) [ E_a(y) ] + \frac{ \| v \| }{ \| y \| } E_b (v)## ... ?

Help will be appreciated ...

Peter

=========================================================================================

It may help Physics Forum readers of the above post to have access to Apostol's section on the Total Derivative ... so I am providing the same ... as follows:
[Images: Apostol, Section 12.4 (The Total Derivative), Parts 1 and 2]

Hope that helps ...

Peter
 

Math Amateur said:
*** EDIT ***

It now occurs to me that in fact Apostol is defining E(y) in equations (16) and (17)

I should have seen this earlier ...

Peter
Which of the following questions does this answer already?
Question 1

Can someone show how Equation (16) follows ... that is ...

... how exactly does ##f(b+v) - f(b) = f'(b) [ g'(a) (y) ] + \| y \| E(y)##

follow from

##f(b+v) - f(b) = f'(b) [ g'(a) (y) ] + f'(b) [ \| y \| E_a(y) ] + \|v \| E_b(v)##?
I guess this is the question you answered yourself: ##E(y)## is defined in (17) such that this follows.
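To spell the step out, here is a short derivation sketch. It only uses the linearity of ##f'(b)## and assumes ##y \neq 0##; the bracket in the second line is exactly the expression that (17) names ##E(y)##.

$$
\begin{aligned}
f'(b)\big[\,\|y\|\,E_a(y)\,\big] + \|v\|\,E_b(v)
&= \|y\|\; f'(b)\big[E_a(y)\big] + \|y\|\,\frac{\|v\|}{\|y\|}\,E_b(v) \\
&= \|y\|\left( f'(b)\big[E_a(y)\big] + \frac{\|v\|}{\|y\|}\,E_b(v) \right) \\
&= \|y\|\,E(y).
\end{aligned}
$$

Adding ##f'(b)[g'(a)(y)]## to both sides turns the first line of (16) into the second.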
Question 2

What is ##E## ... I know what ##E_a## and ##E_b## are ... but what is ##E##?
The general formula for differentiation in direction ##v## goes: ##f\,'(a) \cdot v = f(a+v)-f(a)+r(v)## with a remainder function ##r()## that has to obey ##\lim_{v \to 0} \dfrac{r(v)}{||v||} = 0##. ##E()## plays the role of this remainder (divided by the norm) in our formulas. Now we have to show that the limit condition is obeyed.
Similarly ... what is ##E(y)## in (16) and in (17) ... shouldn't it be ##E_a(y)## ... ?
No. (17) is what defines ##E(y)##, and it depends on both evaluation points. But since ##b=g(a)##, in the end it depends on ##a## alone. But we do not write ##E_a(y)## because we have to distinguish our new remainder function ##E()## from the earlier one ##E_a(y)##. They are two different functions, see (17).
Further ... why (formally and rigorously) does ##E(0) = 0##
We consider the general formula ##f\,'(a) \cdot v = f(a+v)-f(a)+r(v)## again. If ##v=0##, then ##r(v)=0##. Applied to the derivatives of ##g## at ##a## and ##f## at ##b## in (14) and (15), this means the remainder terms ##\|y\| E_a(y)## and ##\|v\| E_b(v)## vanish at zero, and we may take ##E_a(0)=0## and ##E_b(0)=0##. So formally (17) gives ##E(0)= \text{ something I }\cdot 0 + \text{ something II }\cdot 0##. If the coefficient ##\text{ something II } = \|v\| / \|y\|## stays bounded as ##y \to 0##, then ##E(y) \to 0##, and we get a continuous function ##E()## if we set ##E(0)=0##. Remember that we defined ##E()## such that it fits our needs: ##E(0)=0## is the continuous extension of ##E()## to ##y=0##.
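The boundedness of the coefficient ##\|v\| / \|y\|## can be sketched as follows. This assumes, as the form of (16) suggests, that (14) reads ##v = g(a+y) - g(a) = g'(a)(y) + \|y\| E_a(y)##. The constant ##M## is introduced here only for illustration: it is any number with ##\|g'(a)(y)\| \le M \|y\|## for all ##y##, which exists because ##g'(a)## is linear.

$$
\|v\| \;\le\; \|g'(a)(y)\| + \|y\|\,\|E_a(y)\| \;\le\; \|y\|\,\big(M + \|E_a(y)\|\big)
\quad\Longrightarrow\quad
\frac{\|v\|}{\|y\|} \;\le\; M + \|E_a(y)\| \qquad (y \neq 0).
$$

Since ##E_a(y) \to 0## as ##y \to 0##, the quotient stays bounded near ##y = 0##. The same estimate shows ##v \to 0## as ##y \to 0##, so ##E_b(v) \to 0##, and ##f'(b)[E_a(y)] \to 0## because a linear map between finite-dimensional spaces is continuous. Under these assumptions both terms of (17) tend to zero, hence ##E(y) \to 0##.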
Question 3

Can someone please demonstrate how/why

##E(y) = f'(b) [ E_a(y) ] + \frac{ \| v \| }{ \| y \| } E_b (v)##
We set it so. ##E(y) := \ldots##
 
Thanks fresh_42 ...

Appreciate your help ...

Peter
 
Hi fresh_42 ...

Thanks again for the help ...

Just a clarification ...

You write ...

" ... ... The general formula for differentiation in direction ##v## goes: ##f\,'(a) \cdot v = f(a+v)-f(a)+r(v)## ... ..."

Changing ##a## to ##c## to conform with Apostol's notation we have ##f\,'(c) \cdot v = f(c+v)-f(c)+r(v)## ... ... ... (1)

Apostol's definition of differentiation (Equation (5) Section 12.4, page 346) reads as follows:

##f(c+v) = f(c) + f\, '(c)(v) + \| v \| E_c(v)## ... ... ... (2)

Now (2) ##\Longrightarrow f\, '(c)(v) = f(c+v) - f(c) - \| v \| E_c(v)## ... ... ... (3)

So comparing (1) and (3) we find they will be equivalent if

##f\,'(c) \cdot v = f\, '(c)(v)## ... ... ... (4)

and

##r(v) = - \| v \| E_c(v)## ... ... ... (5)

Can you please explain why (4) and (5) hold true ... ?

... and just another minor issue ...


You define r in the following sentence ...

" ... ... The general formula for differentiation in direction ##v## goes: ##f\,'(a) \cdot v = f(a+v)-f(a)+r(v)## with a remainder function ##r()## that has to obey ##\lim_{v \to 0} \dfrac{r(v)}{||v||} = 0##... "

and then later you assert "Now if ##v=0##, we have ##r(v)=0##." ... ...

Can you explain why ##r(0) = 0## ...
Thanks again for your help ...

Peter
 
Math Amateur said:
Can you please explain why (4) and (5) hold true ... ?
(4) is just a different notation. ##f\,'(c)## for functions ##f \, : \,\mathbb{R}^n \longrightarrow \mathbb{R}^m## is the Jacobi matrix evaluated at the point ##x=c##. And ##v## is the direction in which we consider the slope of the function, so ##f\,'(c)(v)= \text{ matrix times vector } = f\,'(c)\cdot v\,.##
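As a small illustration of "matrix times vector", consider a hypothetical example that is not taken from Apostol: ##f(x_1, x_2) = (x_1 x_2,\; x_1 + x_2^2)## on ##\mathbb{R}^2##, evaluated at a point ##c = (c_1, c_2)## in the direction ##v = (v_1, v_2)##.

$$
f\,'(c) = \begin{pmatrix} c_2 & c_1 \\ 1 & 2c_2 \end{pmatrix},
\qquad
f\,'(c)(v) = f\,'(c)\cdot v = \begin{pmatrix} c_2 & c_1 \\ 1 & 2c_2 \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \end{pmatrix}
= \begin{pmatrix} c_2 v_1 + c_1 v_2 \\ v_1 + 2 c_2 v_2 \end{pmatrix}.
$$

Both notations in (4) denote this same vector; one reads ##f\,'(c)## as a linear function applied to ##v##, the other as its matrix multiplied by ##v##.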

(5) is also a notational difference. I abbreviated the remainder function by ##r()## and Apostol by ##E()## - for error function I guess. It has to tend to zero faster than ##||v||##, i.e. ##\lim_{v\to 0}\dfrac{r(v)}{||v||}=0\,.## I only incorporated the ##||v||## term into the function ##r()## whereas Apostol operates with ##E_c(v)=-\dfrac{r(v)}{||v||}## for ##v \neq 0##. The minus sign only reflects on which side of the equation the remainder is written and does not affect the limit, i.e. we have ##\lim_{v\to 0}\dfrac{||r(v)||}{||v||}=\lim_{v \to 0}||E_c(v)||=0## as condition. It's just how you write the "error".
Can you explain why ##r(0) = 0## ...
If we take ##f\,'(a) \cdot v = f(a+v)-f(a)+r(v)## and calculate with ##v=0## we get ##0=f\,'(a) \cdot 0 = f(a+0)-f(a)+r(0)=f(a)-f(a)+r(0)=r(0)##
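For readers who like to see numbers, here is a minimal numerical sketch of the two remainder properties, ##r(0)=0## and ##r(v)/||v|| \to 0##. The map `f`, the point `a`, and the helper names below are hypothetical choices made only for this illustration (they do not come from Apostol), and the Jacobi matrix is written out by hand.

```python
import math

def f(x, y):
    """Hypothetical example map f: R^2 -> R^2 (not from Apostol)."""
    return (x * y, math.sin(x) + y * y)

def jacobian_f(x, y):
    """Jacobi matrix of f at (x, y), differentiated by hand."""
    return ((y, x),
            (math.cos(x), 2.0 * y))

def mat_vec(m, v):
    """Matrix times vector, i.e. f'(a) . v"""
    return tuple(sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m)

def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(c * c for c in v))

def remainder(a, v):
    """r(v) in the convention f'(a) . v = f(a+v) - f(a) + r(v)."""
    linear_part = mat_vec(jacobian_f(*a), v)
    increment = tuple(p - q for p, q in zip(f(a[0] + v[0], a[1] + v[1]), f(*a)))
    return tuple(l - d for l, d in zip(linear_part, increment))

a = (0.7, -1.3)  # arbitrary evaluation point

# At v = 0 the remainder vanishes, as in the computation above.
r_at_zero = remainder(a, (0.0, 0.0))
print("r(0) =", r_at_zero)

# Shrink v along a fixed direction and watch ||r(v)|| / ||v||.
ratios = []
for k in range(1, 6):
    h = 10.0 ** (-k)
    v = (h, -2.0 * h)
    ratio = norm(remainder(a, v)) / norm(v)
    ratios.append(ratio)
    print(f"||v|| = {norm(v):.1e}   ||r(v)|| / ||v|| = {ratio:.3e}")
```

The printed quotient should shrink roughly in proportion to ##||v||##, which is what one expects from a smooth function. A finite table like this is of course only a plausibility check, not a proof of the limit.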
 
Thanks fresh_42 ...

Most helpful ...

Peter
 
