The Chain Rule for Multivariable Vector-Valued Functions

In summary, Peter asks for help understanding the proof of Theorem 12.7 (the Chain Rule) in Apostol's "Mathematical Analysis", a proof that relies on the Total Derivative. He quotes the relevant equations and asks where \(\displaystyle E(y) = f'(b) [ E_a(y) ] + \frac{ \| v \| }{ \| y \| } E_b (v)\) comes from; the reply explains that this equation is the definition of \(E(y)\).
  • #1
Math Amateur
I am reading Tom M. Apostol's book "Mathematical Analysis" (Second Edition) ...

I am focused on Chapter 12: Multivariable Differential Calculus ... and in particular on Section 12.9: The Chain Rule ...

I need help in order to fully understand Theorem 12.7, Section 12.9 ...

Theorem 12.7 (including its proof) reads as follows:
View attachment 8523
View attachment 8524
In the proof of Theorem 12.7 we read the following:

" ... ... Using (14) in (15) we find

\(\displaystyle f(b+v) - f(b) = f'(b) [ g'(a) (y) ] + f'(b) [ \| y \| E_a(y) ] + \|v \| E_b(v) \)

\(\displaystyle = f'(b) [ g'(a) (y) ] + \| y \| E(y)\) ... ... ... (16)

where \(\displaystyle E(0) = 0\) and

\(\displaystyle E(y) = f'(b) [ E_a(y) ] + \frac{ \| v \| }{ \| y \| } E_b (v) \ \ \ \ \text{ if } y\neq 0\) ... ... ... (17)

... ... ... "

My questions are as follows:

Question 1

Can someone show how Equation (16) follows ... that is ...

... how exactly does \(\displaystyle f(b+v) - f(b) = f'(b) [ g'(a) (y) ] + \| y \| E(y)\)

follow from

\(\displaystyle f(b+v) - f(b) = f'(b) [ g'(a) (y) ] + f'(b) [ \| y \| E_a(y) ] + \|v \| E_b(v)\)?

Question 2

What is \(\displaystyle E(0)\) ... I know what \(\displaystyle E_a\) and \(\displaystyle E_b\) are ... but what is \(\displaystyle E\)?

Similarly ... what is \(\displaystyle E(y)\) in (16) and in (17) ... shouldn't it be \(\displaystyle E_a(y)\) ... ?

Further ... why (formally and rigorously) does \(\displaystyle E(0) = 0\)?

Question 3

Can someone please demonstrate how/why

\(\displaystyle E(y) = f'(b) [ E_a(y) ] + \frac{ \| v \| }{ \| y \| } E_b (v)\)?

Help will be appreciated ...

Peter

=========================================================================================

It may help MHB readers of the above post to have access to Apostol's section on the Total Derivative ... so I am providing the same ... as follows:
View attachment 8525
View attachment 8526
Hope that helps ...

Peter
 

Attachments

  • Apostol - 1 - Theorem 12.7 - Chain Rule - PART 1 ... .png
  • Apostol - 2 - Theorem 12.7 - Chain Rule - PART 2 ... ... .png
  • Apostol - 1 - Section 12.4 - PART 1 ... .png
  • Apostol - 2 - Section 12.4 - PART 2 ... .png
  • #2
The answer to all three of those questions is that the equations in (17) are meant to be the definition of \(E(y)\). In other words, if you define

\[E(y) = \begin{cases}0&\text{if }y=0,\\ f'(b) [ E_a(y) ] + \frac{ \| v \| }{ \| y \| } E_b (v)&\text{if }y\ne0,\end{cases}\]

then the second line of (16) follows immediately from the first line.
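
To spell out why the second line of (16) follows from the first, here is a short worked step. It is only a sketch: it assumes, as in the attached proof, that \(v = g(a+y) - g(a)\) (so that \(v = 0\) when \(y = 0\)) and that \(f'(b)\) is a linear map. Neither assumption is stated in the quoted passage itself. For \(y \neq 0\):

\[\begin{aligned} f'(b) [ \| y \| E_a(y) ] + \| v \| E_b(v) &= \| y \| \, f'(b) [ E_a(y) ] + \| y \| \, \frac{ \| v \| }{ \| y \| } E_b(v) \\ &= \| y \| \left( f'(b) [ E_a(y) ] + \frac{ \| v \| }{ \| y \| } E_b(v) \right) \\ &= \| y \| E(y). \end{aligned}\]

For \(y = 0\), both expressions vanish: the left side because \(f'(b)[0] = 0\) and \(v = 0\), the right side because \(\| y \| = 0\). The identity therefore holds whatever value is assigned to \(E(0)\), and \(E(0) = 0\) is simply the value fixed in (17).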
 
  • #3
Thanks Opalg ...

Indeed ... you are right, of course!

Should have seen that ...

Peter
 

What is the chain rule for multivariable vector-valued functions?

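The chain rule for multivariable vector-valued functions describes how to differentiate a composition. If \(g\) is differentiable at \(a\) and \(f\) is differentiable at \(b = g(a)\), then the composite \(h = f \circ g\) is differentiable at \(a\), and its total derivative is the composition of the two linear maps: \(h'(a) = f'(b) \circ g'(a)\). In coordinates, the Jacobian matrix of \(h\) at \(a\) is the product of the Jacobian matrix of \(f\) at \(b\) and the Jacobian matrix of \(g\) at \(a\).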

Why is the chain rule important in multivariable calculus?

The chain rule is important in multivariable calculus because it reduces the derivative of a complicated composite function to the derivatives of its simpler pieces. It also makes precise how a small change in the input variables propagates through the intermediate variables to the output.

How is the chain rule used in real-world applications?

The chain rule is used in fields such as physics, engineering, economics, and computer graphics. It is used to model and analyze systems in which several variables depend on one another, such as the motion of objects, the behavior of economic markets, and the rendering of 3D graphics.

What is the formula for the chain rule in multivariable calculus?

A common special case is the following: if z = f(x, y) is a differentiable function (scalar- or vector-valued) and x = g(t) and y = h(t) are differentiable functions of a single variable t, then the derivative of z with respect to t is given by dz/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt), where the partial derivatives are evaluated at (g(t), h(t)).
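
The formula can be checked numerically. The sketch below uses functions f, g and h invented purely for illustration (they do not come from the thread) and compares the chain-rule value of dz/dt with a central finite-difference estimate.

```python
import math

# Hypothetical example functions, chosen only for illustration.
def f(x, y):
    return x * x * y + math.sin(y)

def g(t):  # x = g(t)
    return math.cos(t)

def h(t):  # y = h(t)
    return t ** 3

def chain_rule_derivative(t):
    """dz/dt = (df/dx)(dx/dt) + (df/dy)(dy/dt), partials taken at (g(t), h(t))."""
    x, y = g(t), h(t)
    df_dx = 2 * x * y            # partial of f with respect to x
    df_dy = x * x + math.cos(y)  # partial of f with respect to y
    dx_dt = -math.sin(t)
    dy_dt = 3 * t * t
    return df_dx * dx_dt + df_dy * dy_dt

def finite_difference_derivative(t, eps=1e-6):
    """Central-difference estimate of d/dt f(g(t), h(t))."""
    z = lambda s: f(g(s), h(s))
    return (z(t + eps) - z(t - eps)) / (2 * eps)

t0 = 0.7
analytic = chain_rule_derivative(t0)
numeric = finite_difference_derivative(t0)
print(analytic, numeric)
```

The two printed values should agree to several decimal places; the small remaining gap comes from the finite-difference approximation, not from the chain rule.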

Can the chain rule be extended to more than two variables?

Yes, the chain rule can be extended to functions with more than two variables. If z = f(x_1, ..., x_n) and each x_i is a differentiable function of t, then dz/dt = (∂f/∂x_1)(dx_1/dt) + ... + (∂f/∂x_n)(dx_n/dt): the sum, over the input variables, of the partial derivative of f with respect to that variable multiplied by the rate of change of that variable. When the inner functions themselves depend on several variables, the same principle is expressed as a product of Jacobian matrices.
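
The matrix form can be illustrated in the same way. The sketch below is an assumption-laden example, not taken from the thread: g maps R^2 to R^3 and f maps R^3 to R^2 (both made up), the helper names `jacobian` and `matmul` are hypothetical, and Jacobians are approximated by central differences. It checks that the Jacobian of f ∘ g at a is close to the product of the Jacobian of f at b = g(a) and the Jacobian of g at a.

```python
import math

# Hypothetical example maps, chosen only for illustration.
def g(p):  # g : R^2 -> R^3
    x, y = p
    return [x * y, x + y, math.sin(x)]

def f(q):  # f : R^3 -> R^2
    u, v, w = q
    return [u * v, v * w + u]

def jacobian(func, point, eps=1e-6):
    """Central-difference Jacobian: rows index outputs, columns index inputs."""
    m, n = len(func(point)), len(point)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        plus, minus = list(point), list(point)
        plus[j] += eps
        minus[j] -= eps
        fp, fm = func(plus), func(minus)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return J

def matmul(A, B):
    """Plain-Python matrix product of A (m x k) and B (k x n)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

a = [0.5, -1.2]
b = g(a)
lhs = jacobian(lambda p: f(g(p)), a)          # Jacobian of the composite at a
rhs = matmul(jacobian(f, b), jacobian(g, a))  # Df(b) times Dg(a)
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_gap)
```

The order of the factors matters: the 2 x 3 Jacobian of f multiplies the 3 x 2 Jacobian of g, giving the 2 x 2 Jacobian of the composite.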
