- #1
sponsoredwalk
I'm looking at the proof of the multivariable chain rule & just a little bit curious about something.
In the single variable chain rule proof, the way I know it, you start from the definition of the derivative:
[itex] f'(x) \ = \ \lim_{ \Delta x \to 0} \frac{ \Delta y}{ \Delta x} [/itex]
and manipulate it as follows:
[itex] \lim_{ \Delta x \to 0} \frac{ \Delta y}{ \Delta x} \ - \ f'(x) \ = \ 0 [/itex]
[itex] \frac{ \Delta y}{ \Delta x} \ - \ f'(x) \ = \ \epsilon (x) [/itex], where ε(x) → 0 as Δx → 0
[itex] \Delta y \ = \ f'(x) \Delta x \ + \ \epsilon (x) \Delta x [/itex]
and you work off that equation to prove the single variable version.
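As a quick numerical sanity check of that last equation, here is a minimal sketch. The function f(x) = x³ and the point x = 2 are arbitrary choices for illustration, not part of the proof. It computes ε = Δy/Δx − f'(x) for shrinking Δx, and the values should head towards 0.

```python
# Hypothetical example (not from the proof): f(x) = x**3 at x = 2.
def f(x):
    return x ** 3

def fprime(x):
    return 3 * x ** 2

def epsilon(x, dx):
    """Error term eps = dy/dx - f'(x), so that dy = f'(x)*dx + eps*dx."""
    dy = f(x + dx) - f(x)
    return dy / dx - fprime(x)

x0 = 2.0
# Shrink dx through 1e-1, 1e-2, ..., 1e-5 and record eps each time.
eps_values = [epsilon(x0, 10.0 ** -k) for k in range(1, 6)]

for k, eps in enumerate(eps_values, start=1):
    print(f"dx = 1e-{k}: eps = {eps:.6e}")
```

For this particular f the error term works out by hand to ε = 3xΔx + Δx², which makes it plain that ε depends on Δx as well as x and vanishes as Δx → 0.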
The multivariable version uses a function:
[itex] \Delta z \ = \ f_x(x,y) \Delta x \ + \ f_y(x,y) \Delta y \ + \ \epsilon_1 (x) \Delta x \ + \ \epsilon_2 (x) \Delta y [/itex]
which I can see is analogous to the single variable version, but I'm having
trouble deriving it, to be honest. But assuming that I'm okay with this
formula, I wonder about the proof.
The special case is just to divide by Δt & take the limit:
[itex] \frac{dz}{dt} \ = \ \lim_{ \Delta t \to 0} \frac{ \Delta z}{ \Delta t} \ = \ \lim_{ \Delta t \to 0} \ [ \ f_x(x,y) \frac{ \Delta x}{ \Delta t} \ + \ f_y(x,y) \frac{ \Delta y}{ \Delta t} \ + \ \epsilon_1 (x) \frac{ \Delta x}{ \Delta t} \ + \ \epsilon_2 (x) \frac{ \Delta y}{ \Delta t} \ ] \ = \ \ f_x(x,y) \frac{ dx}{ dt} \ + \ f_y(x,y) \frac{ d y}{ dt} [/itex]
and if f(x,y) has both x & y as functions of two variables
z = f(x,y) = f [ x(s,t),y(s,t) ]
then you follow the exact same idea if you're taking the partial w.r.t.
s or t.
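A minimal numerical sketch of the two-variable case, assuming the made-up functions f(x,y) = x²y, x(s,t) = s·cos t and y(s,t) = s·sin t (none of these come from the proof). It compares a finite-difference estimate of ∂z/∂t with f_x ∂x/∂t + f_y ∂y/∂t. The two numbers should agree closely if the rule is being applied correctly.

```python
import math

# Hypothetical example functions, chosen only for illustration.
def f(x, y):
    return x * x * y

def f_x(x, y):
    return 2 * x * y

def f_y(x, y):
    return x * x

def x_of(s, t):
    return s * math.cos(t)

def y_of(s, t):
    return s * math.sin(t)

def z_of(s, t):
    return f(x_of(s, t), y_of(s, t))

def central_diff_t(g, s, t, h=1e-6):
    """Central-difference estimate of the partial derivative of g w.r.t. t."""
    return (g(s, t + h) - g(s, t - h)) / (2 * h)

s0, t0 = 1.5, 0.7
x0, y0 = x_of(s0, t0), y_of(s0, t0)

# Left side: differentiate the composite z(s, t) directly (s held fixed).
lhs = central_diff_t(z_of, s0, t0)

# Right side: f_x * dx/dt + f_y * dy/dt, using the exact partials of x and y.
dx_dt = -s0 * math.sin(t0)
dy_dt = s0 * math.cos(t0)
rhs = f_x(x0, y0) * dx_dt + f_y(x0, y0) * dy_dt

print(lhs, rhs)
```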
The general chain rule would just be a natural extension of this right? i.e.
z = f(x₁,x₂,...,xᵢ) = f [ x₁(t₁,t₂,...,tᵢ),x₂(t₁,t₂,...,tᵢ),...,xᵢ(t₁,t₂,...,tᵢ) ]
and the partial w.r.t. tᵥ is the exact same idea: [itex] \frac{\partial z}{\partial t_v} \ = \ f_{x_1}[x_1(t_1,t_2,...,t_v,...,t_i),x_2(t_1,t_2,...,t_v,...,t_i),...] \ \frac{\partial x_1}{\partial t_v} \ + \ f_{x_2}[x_1(t_1,t_2,...,t_v,...,t_i),x_2(t_1,t_2,...,t_v,...,t_i),...] \ \frac{\partial x_2}{\partial t_v} \ + \ ... \ + \ f_{x_i}[x_1(t_1,t_2,...,t_v,...,t_i),x_2(t_1,t_2,...,t_v,...,t_i),...] \ \frac{\partial x_i}{\partial t_v}[/itex]
obviously the notation can be shortened but that's it right?
---------------------------------------------------------------------------
Assuming that proof to be correct, I'm wondering about the function [itex] \Delta z \ = \ f_x(x,y) \Delta x \ + \ f_y(x,y) \Delta y \ + \ \epsilon_1 (x) \Delta x \ + \ \epsilon_2 (x) \Delta y [/itex]
I mean rather than just saying it's analogous in different dimensions
shouldn't there be a way to derive it from the very similar arguments
involving tangent planes?
Start with the vector equation N • (X - X₀) = 0 to derive the plane.
N•(X - X₀) = 0
(A,B,C)•[(x - x₀),(y - y₀),(z - z₀)] = 0
A(x - x₀) + B(y - y₀) + C(z - z₀) = 0
z - z₀ = (-A/C)(x - x₀) + (-B/C)(y - y₀)
f(x,y) = f(x₀,y₀) + (-A/C)(x - x₀) + (-B/C)(y - y₀)
f(x,y) = f(x₀,y₀) + (∂f/∂x)(x - x₀) + (∂f/∂y)(y - y₀)
Now, I understand that this is the description of the tangent plane that
passes through the point (x₀, y₀, f(x₀,y₀)) & can be used to approximate a function for
all (x,y) close to (x₀,y₀):
f(x,y) ≈ f(x₀,y₀) + (∂f/∂x)(x - x₀) + (∂f/∂y)(y - y₀)
I say this to make sure I have the correct understanding: when I derived
f(x,y) = f(x₀,y₀) + (∂f/∂x)(x - x₀) + (∂f/∂y)(y - y₀)
above I was deriving a linear tangent plane equation, but for any (differentiable) function
we can use this equation, with the partials evaluated at (x₀,y₀), to find the tangent plane
through the point (x₀, y₀, f(x₀,y₀)), and we can also linearly approximate the
function for all x,y close to (x₀,y₀), just like the single variable tangent line.
It is the extra terms of Taylor's formula that turn
f(x,y) ≈ f(x₀,y₀) + ... into f(x,y) = f(x₀,y₀) + ...
That's been confusing me & I'd really appreciate confirmation that I've
got the logic right now.
But how do we turn f(x,y) - f(x₀,y₀) = (∂f/∂x)(x - x₀) + (∂f/∂y)(y - y₀) into
[itex] \Delta z \ = \ f_x(x,y) \Delta x \ + \ f_y(x,y) \Delta y \ + \ \epsilon_1 (x) \Delta x \ + \ \epsilon_2 (x) \Delta y [/itex]
Or said maybe a bit more clearly, turning:
f(x,y) - f(x₀,y₀) = (∂f/∂x)(x - x₀) + (∂f/∂y)(y - y₀)
f(x₀ + Δx,y₀ + Δy) - f(x₀,y₀) = (∂f/∂x)(x - x₀) + (∂f/∂y)(y - y₀)
Δz= (∂f/∂x)(x - x₀) + (∂f/∂y)(y - y₀)
into:
Δz= (∂f/∂x)(x - x₀) + (∂f/∂y)(y - y₀) + ε₁(x)Δx + ε₂(y)Δy
Δz= (∂f/∂x)Δx + (∂f/∂y)Δy + ε₁(x)Δx + ε₂(y)Δy
in a more linear fashion than just saying it should work?
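For what it's worth, one standard route (stated here as an assumption, since it is not part of the argument above) is to split Δz into an x-step followed by a y-step:

Δz = [f(x₀ + Δx, y₀ + Δy) - f(x₀, y₀ + Δy)] + [f(x₀, y₀ + Δy) - f(x₀, y₀)]

and then define ε₁ as the first bracket divided by Δx minus f_x(x₀,y₀), and ε₂ as the second bracket divided by Δy minus f_y(x₀,y₀). With those definitions the equation for Δz holds exactly. If f_x is continuous near (x₀,y₀), the mean value theorem gives ε₁ → 0, and ε₂ → 0 by the definition of f_y. Here is a minimal numerical sketch of this, using the made-up function f(x,y) = sin(x)·e^y:

```python
import math

# Hypothetical example (chosen for illustration): f(x, y) = sin(x) * exp(y).
def f(x, y):
    return math.sin(x) * math.exp(y)

def f_x(x, y):
    return math.cos(x) * math.exp(y)

def f_y(x, y):
    return math.sin(x) * math.exp(y)

def epsilons(x0, y0, dx, dy):
    """Return (eps1, eps2) from splitting dz into an x-step and a y-step."""
    x_step = f(x0 + dx, y0 + dy) - f(x0, y0 + dy)  # change along x only
    y_step = f(x0, y0 + dy) - f(x0, y0)            # change along y only
    eps1 = x_step / dx - f_x(x0, y0)
    eps2 = y_step / dy - f_y(x0, y0)
    return eps1, eps2

x0, y0 = 0.5, 0.3
sizes = []
for k in range(1, 6):
    h = 10.0 ** -k
    dx, dy = h, -h / 2  # arbitrary direction of approach to (x0, y0)
    eps1, eps2 = epsilons(x0, y0, dx, dy)
    # By construction, dz equals f_x*dx + f_y*dy + eps1*dx + eps2*dy.
    dz = f(x0 + dx, y0 + dy) - f(x0, y0)
    rebuilt = f_x(x0, y0) * dx + f_y(x0, y0) * dy + eps1 * dx + eps2 * dy
    sizes.append(max(abs(eps1), abs(eps2)))
    print(f"h = 1e-{k}: eps1 = {eps1:.3e}, eps2 = {eps2:.3e}, "
          f"dz - rebuilt = {dz - rebuilt:.1e}")
```

Note that ε₁ and ε₂ defined this way depend on both Δx and Δy, not just on x or y, and the tangent plane equation alone does not give them. They measure how far the true Δz is from the plane's prediction.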