Chain Rule Variation: Does g Need to be Invertible?

Discussion Overview

The discussion centers around the application of the chain rule in the context of composite functions, specifically questioning whether the function g must be invertible with respect to its variables for certain derivative relationships to hold. Participants explore the implications of treating g as a function versus a variable and the validity of expressions involving partial derivatives in multivariable contexts.

Discussion Character

  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants question the validity of the expression ∂f/∂g = (∂f/∂x)(∂x/∂g) + (∂f/∂y)(∂y/∂g) without g being invertible.
  • Others argue that treating g as a variable rather than a function leads to confusion, particularly regarding the meaning of ∂x/∂g.
  • A participant proposes a scenario where z = g(x,y) and introduces a parameter t to discuss the derivatives in terms of t, suggesting that df/dt is well-defined under certain conditions.
  • Some participants express that ∂f/∂g is not defined in the context of multivariable functions, while others challenge this by providing examples where they believe it can be interpreted meaningfully.
  • There is a discussion about the notation used in the chain rule, with some participants finding it misleading when applied to functions of multiple variables.
  • One participant acknowledges that the expression for the one-variable case is valid only if g can be inverted to express x in terms of g.

Areas of Agreement / Disagreement

Participants do not reach a consensus on whether ∂f/∂g is a well-defined concept in the context of multivariable functions. There are competing views on the validity of expressions involving g and the interpretation of the chain rule.

Contextual Notes

Limitations include the ambiguity in the definitions of the variables and functions involved, as well as the potential misinterpretation of notation in the context of multivariable calculus.

BucketOfFish
I have a composite function f(g(x,y)).

When is it true that ∂f/∂g = (∂f/∂x)(∂x/∂g) + (∂f/∂y)(∂y/∂g)?

Does g have to be invertible with respect to x and y for this to be true?
 
BucketOfFish said:
I have a composite function f(g(x,y)).

When is it true that ∂f/∂g = (∂f/∂x)(∂x/∂g) + (∂f/∂y)(∂y/∂g)?
What you have on the right side doesn't make sense to me. You are treating g as if it were a variable rather than a function.

∂x/∂g is the partial of x with respect to g.
BucketOfFish said:
Does g have to be invertible with respect to x and y for this to be true?

Let's get some variables in here by assuming that z = g(x, y), and that x = h(t), y = k(t).

So f is a function of t alone, so it makes sense to talk about df/dt.

df/dt = f'(z) * (∂g/∂x * dx/dt + ∂g/∂y * dy/dt)

df/dt exists and is defined provided that all of the other derivatives exist and are defined. IOW, provided that f is differentiable at z = g(x, y), that ∂g/∂x and ∂g/∂y exist and are defined, and that h and k are differentiable functions of t.

Note that dx/dt = h'(t) and dy/dt = k'(t).
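
As a quick sanity check, here is a minimal numerical sketch of the chain rule df/dt = f'(g(x, y)) * (∂g/∂x * dx/dt + ∂g/∂y * dy/dt). The particular functions f, g, h and k below are assumptions chosen only for illustration; none of them come from the thread.

```python
import math

# Hypothetical choices, for illustration only (not from the thread).
def f(z): return math.sin(z)          # outer single-variable function
def g(x, y): return x * x * y         # inner two-variable function
def h(t): return math.cos(t)          # x = h(t)
def k(t): return t * t                # y = k(t)

def composite(t):
    """f(g(h(t), k(t))) as a function of t alone."""
    return f(g(h(t), k(t)))

def chain_rule(t):
    """df/dt assembled from the individual derivatives."""
    x, y = h(t), k(t)
    f_prime = math.cos(g(x, y))       # f'(z) evaluated at z = g(x, y)
    dg_dx = 2 * x * y                 # partial of g with respect to x
    dg_dy = x * x                     # partial of g with respect to y
    dx_dt = -math.sin(t)              # h'(t)
    dy_dt = 2 * t                     # k'(t)
    return f_prime * (dg_dx * dx_dt + dg_dy * dy_dt)

def central_difference(func, t, eps=1e-6):
    """Symmetric finite-difference approximation of func'(t)."""
    return (func(t + eps) - func(t - eps)) / (2 * eps)

t0 = 0.7
numeric = central_difference(composite, t0)
analytic = chain_rule(t0)
print(numeric, analytic)  # the two values should agree closely
```

The finite-difference value and the chain-rule value should agree to many decimal places, which is all this sketch is meant to show.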
 
Hey Mark, thanks for the reply.

It seems that what you are describing with the f(t) example is simply a normal chain rule. I already know that for f(x(t),y(t)) it is true that ∂f/∂t = (∂f/∂x)(∂x/∂t) + (∂f/∂y)(∂y/∂t).

However, I am more interested in the "other way around". In the one-variable case, I mean to ask whether it is okay to say that for f(g(x)), ∂f/∂g = (∂f/∂x)(∂x/∂g).

I'm afraid I don't understand your objection about differentiating with respect to a function. I know that taking ∂f/∂g is okay. Are you saying that ∂x/∂g is undefined? I believe I have seen similar usages in many physics problems, where ∂x/∂g is simply taken to be 1/(∂g/∂x). In this case, the change in x given a change in g, holding all other dependences of both x and g fixed, seems to be well defined. For example, if g=2x, then it's true that ∂g/∂x=2. Similarly, x=g/2 and ∂x/∂g= 1/2.

EDIT: I just realized that the equation I wrote for the one-variable case is valid if and only if g(x) can be inverted to obtain an equation for x(g). But I'm still confused about the validity of the original problem I posed, where g(x,y) is a function of two variables. My question remains the same. Does the multivariable "inverse chain rule" hold?
 
∂f/∂g is not defined...unless you just mean f'.

You have probably seen notations like
$$\frac{d}{dx}f(g(x),h(x)) =\frac{\partial f}{\partial g}\frac{\partial g}{\partial x} +\frac{\partial f}{\partial h}\frac{\partial h}{\partial x},$$ but here ##\partial f/\partial g## just means ##D_1 f##, i.e. the first partial derivative of f. It's written with a g in the denominator because we are evaluating the function that's the result of the partial derivative operation at (g(x),h(x)), and that makes it possible to think of g(x) as the "first variable".

I think it's a very misleading notation, since the operation of "taking the first partial derivative of f" is something that doesn't involve g in any way.
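
To make the point about ##D_1 f## concrete, here is a small sketch under assumed example functions (the f, g and h below are hypothetical, not from the thread). The partial derivatives of f are computed without any reference to g or h, and g(x) and h(x) only enter when those derivatives are evaluated.

```python
import math

# Hypothetical two-variable f and inner functions g, h (illustrative only).
def f(u, v): return u * u * v
def g(x): return math.sin(x)
def h(x): return 3 * x

# Partial derivatives of f in its first and second slots.
# Note that neither definition mentions g or h.
def D1_f(u, v): return 2 * u * v
def D2_f(u, v): return u * u

def composite(x):
    return f(g(x), h(x))

def chain_rule(x):
    # D1_f and D2_f are evaluated at the point (g(x), h(x)).
    u, v = g(x), h(x)
    g_prime = math.cos(x)
    h_prime = 3
    return D1_f(u, v) * g_prime + D2_f(u, v) * h_prime

x0 = 0.4
eps = 1e-6
numeric = (composite(x0 + eps) - composite(x0 - eps)) / (2 * eps)
analytic = chain_rule(x0)
print(numeric, analytic)  # should agree closely
```

Writing the slots as u and v rather than g and h is a deliberate choice here: it keeps "differentiate f in its first argument" separate from "evaluate the result at g(x)".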
 
Can you explain that, Fredrik? Why is ∂f/∂g not defined?

If f=g^2, where g=3x, is it not the case that ∂f/∂g=2g=6x?

EDIT: I'm confused, because in the normal chain rule, where ∂f/∂x=(∂f/∂g)(∂g/∂x), we see the term ∂f/∂g. Is it well-defined in this usage?
 
I explained a bit more in an edit that I was still typing when you posted that.

Suppose g is defined by ##g(x)=3x## for all x, and f is defined by ##f(x)=x^2## for all x. Then the chain rule tells us that ##(f\circ g)'(x)=f'(g(x))g'(x)=2g(x)\cdot 3 =18x## for all x.

This specific problem can also be done without using the chain rule.

##(f\circ g)(x)=g(x)^2=(3x)^2=9x^2\Rightarrow (f\circ g)'(x)=18x.##

The chain rule ##(f\circ g)'(x)=f'(g(x))g'(x)## is often written as
$$\frac{df}{dx}=\frac{df}{dg}\frac{dg}{dx}.$$ This is a notation that I find very misleading, for the reasons mentioned in my edit of my previous post. The notation df/dg makes it look like g is somehow involved in the process of taking the derivative of f.
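
For anyone who wants to check the ##18x## result numerically, here is a minimal sketch using the f and g defined above. The helper names are mine and are only for illustration.

```python
def g(x): return 3 * x        # g(x) = 3x
def f(u): return u * u        # f(u) = u^2, with no reference to g

def f_prime(u): return 2 * u  # derivative of f in its own argument
def g_prime(x): return 3      # derivative of g

def composite_derivative(x):
    """(f o g)'(x) = f'(g(x)) * g'(x)."""
    return f_prime(g(x)) * g_prime(x)

def numeric_derivative(x, eps=1e-6):
    """Central-difference estimate of the derivative of f(g(x))."""
    return (f(g(x + eps)) - f(g(x - eps))) / (2 * eps)

for x in (0.5, 1.0, 2.0):
    # Chain-rule value, the closed form 18x, and the numerical estimate.
    print(x, composite_derivative(x), 18 * x, numeric_derivative(x))
```

Note that `f_prime` is defined without mentioning `g` at all; g only shows up at the point where `f_prime` is evaluated.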
 
BucketOfFish said:
Can you explain that, Fredrik? Why is ∂f/∂g not defined?

If f=g^2, where g=3x, is it not the case that ∂f/∂g=2g=6x?
Let's lose the ∂ notation, since the functions are single-variable here.

df/dg = 2g
dg/dx = 3

df/dx = df/dg * dg/dx = 2g * 3 = 6x * 3 = 18x
BucketOfFish said:
EDIT: I'm confused, because in the normal chain rule, where ∂f/∂x=(∂f/∂g)(∂g/∂x), we see the term ∂f/∂g. Is it well-defined in this usage?
 
The notation f(g(x,y)) strongly suggests that ##f:\mathbb R\to\mathbb R##. So if I see the notation df/dg, or worse, ∂f/∂g, I can only interpret it as f'.

But I couldn't even do that when I read your post, because you wrote things like ∂x/∂g on the right-hand side. This seems to rule out that we're talking about f', since a computation of f' doesn't involve any functions other than f.
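
One way to see the problem, assuming ∂x/∂g is read as 1/(∂g/∂x) with y held fixed (the reading suggested earlier in the thread), is to try a simple example. The choice ##g(x,y)=x+y## here is purely illustrative. Then
$$\frac{\partial}{\partial x}f(g(x,y))=f'(g)\cdot 1,\qquad \frac{\partial}{\partial y}f(g(x,y))=f'(g)\cdot 1,\qquad \frac{\partial x}{\partial g}=\frac{\partial y}{\partial g}=1,$$
so the right-hand side of the proposed formula becomes
$$\frac{\partial f}{\partial x}\frac{\partial x}{\partial g}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial g}=f'(g)+f'(g)=2f'(g),$$
which differs from ##f'(g)## wherever ##f'(g)\neq 0##. Under that reading, the formula overcounts by a factor of two in this example.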
 
Ok, thanks a lot guys, your explanations really helped. I see now that since ∂f/∂g is only used as a tool in evaluating the chain rule for a variable, it has no significance on its own. Thus, it really makes no sense to try and find an expression for it! Once again, thanks!
 
