The Chain Rule and Function Composition

In summary: If you have a curve representing a relation between two variables, and you can draw a vertical line that intersects the curve more than once, this means that one value of the independent variable is associated with more than one value of the dependent variable. In other words, if you have more than one point with the same x-coordinate, then you don't have a function. But if you have more than one point with the same y-coordinate, then you can still have a function. That's because the same point on the y-axis can correspond to more than one point on the curve.
  • #1
PFuser1232
This is a problem that has been bugging me for ages. I just can't wrap my head around this weird result. I know I went wrong somewhere (as a matter of fact, that was the answer I was hoping for), but most sources (including, but not limited to, Wikipedia) suggest otherwise.

I will cut to the chase.

A function f is defined by $$f(x)=x^2$$

The derivative, f', is then $$f'(x)=2x$$

So far so good.

Here is where all hell broke loose for me:

The composition of f and f' yields $$f[f'(x)]=(2x)^2=4x^2$$

I will now switch to Leibniz notation for convenience.
u=f(x), and v=f'(x)

According to the chain rule

$$\frac{du}{dx}=\frac{du}{dv}\frac{dv}{dx}$$

$$f'(x)=f'[f'(x)]f''(x)$$

Well, we know what f'(x) is, it is 2x as shown above.
du/dv is (2)(2x)=4x
And dv/dx is simply 2.
Putting all of this in the above equation, for nonzero x: $$2x=8x$$ $$2=8$$

My speculation as to why this bizarre result popped out was the fact that my substitution of f'(x) for x in f(x) to get f[f'(x)] was wrong. But after going through several websites, I came to realize that there is nothing mathematically wrong with this substitution.

Can someone please point out to me where I went wrong with this?
I would be more than grateful.
 
  • #2
MohammedRady97 said:
I will now switch to Leibniz notation for convenience.
u=f(x), and v=f'(x)

According to the chain rule

$$\frac{du}{dx}=\frac{du}{dv}\frac{dv}{dx}$$

$$f'(x)=f'[f'(x)]f''(x)$$
That's not what the chain rule says. In your notation, you would have to write u=f(f'(x)) instead of u=f(x). Now you have ##\frac{du}{dx}=\frac{du}{dv}\frac{dv}{dx}##. I prefer the following notation: The chain rule says that ##(g\circ h)'(x)=g'(h(x))h'(x)##. If you apply this with g=f and h=f', you get ##(f\circ f')'(x)=f'(f'(x))f''(x)##.
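As a quick numerical sanity check of the corrected statement, here is a minimal Python sketch. It assumes only the f(x) = x^2 from post #1; the helper `central_difference` is a hypothetical finite-difference estimator introduced for illustration, not something from the thread. It compares the numerically estimated derivative of f(f'(x)) with f'(f'(x)) f''(x) = 8x, and prints f'(x) = 2x alongside to show that it is a different quantity.

```python
# Numerical check of the chain rule for f(x) = x**2 (the function from post #1).

def f(x):
    return x ** 2

def f_prime(x):
    return 2 * x

def f_double_prime(x):
    return 2.0

def composite(x):
    # u = f(f'(x)) = (2x)**2 = 4x**2
    return f(f_prime(x))

def chain_rule_rhs(x):
    # f'(f'(x)) * f''(x) = (4x) * 2 = 8x
    return f_prime(f_prime(x)) * f_double_prime(x)

def central_difference(func, x, step=1e-5):
    # Hypothetical helper: symmetric finite-difference estimate of func'(x).
    return (func(x + step) - func(x - step)) / (2 * step)

for x in (-1.5, 0.5, 3.0):
    estimated = central_difference(composite, x)
    # Columns: x, estimated d/dx f(f'(x)), chain-rule value 8x, and f'(x) = 2x.
    print(x, estimated, chain_rule_rhs(x), f_prime(x))
```

The first two derivative columns should agree up to rounding, while the last column (2x) does not: the left-hand side of the chain rule is the derivative of the composition, not of f itself.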
 
  • #3
Makes sense now, thank you for pointing this out! :)
But one question remains regarding function composition. When it comes to physical systems (the simplest of which is a case in kinematics), I have trouble making sense of what happens while changing the argument of the function I'm looking at.
For instance, the velocity of a body, v m/s, measured at time t s is given by: $$v(t)=3+5t$$
I am given to understand that to find velocity as a function of another variable, say, acceleration (a m/s^2), we do this simple substitution: $$v(a)=3+5a$$
But isn't this a wrong result? I want your thoughts on this please.
 
  • #4
If v(t)=3+5t for all t, then v(a)=3+5a for all a. For example, if you input 2 into v, the output will be 13 regardless of which of the symbols ##a## or ##t## you used to represent the number 2. So the first formula (interpreted as a "for all" statement) certainly implies the latter.

But this isn't a way to find "velocity as a function of acceleration" in general. It only accomplishes that goal when the relationship between acceleration and time is t=a (which doesn't really make sense unless we somehow use the same units for acceleration and time, but I'll ignore that issue for now). If you have a formula that gives you a way to determine the value of t from the value of a, something like t=f(a), then you can certainly write 3+5t=3+5f(a). This is, however, not the same as plugging a into v. The right-hand side is equal to v(t) (because that's what the left-hand side is equal to), and therefore to v(f(a)), since f(a)=t. It's not equal to v(a) unless f(a)=a.
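The difference between plugging a into v and composing v with a relation t=f(a) can be made concrete with a small Python sketch. The relation `t_from_a` below (t = a/2) is purely hypothetical and chosen only for illustration; it is not derived from v(t)=3+5t and, like the discussion above, it ignores units.

```python
def v(t):
    # v(t) = 3 + 5t from post #3. Whatever number is passed in is treated as a time.
    return 3 + 5 * t

def t_from_a(a):
    # Hypothetical relation t = f(a), assumed here only for illustration.
    return a / 2

def v_as_function_of_a(a):
    # The composition (v o f)(a) = 3 + 5 f(a).
    return v(t_from_a(a))

a = 4.0
print(v(a))                   # 23.0: the velocity at time 4, not at "acceleration 4"
print(v_as_function_of_a(a))  # 13.0: the velocity at the time t = f(4) = 2
```

The two results coincide only when f(a)=a, which is the t=a case described above.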
 
  • #5
I am glad you brought up the part about t=a being the constraint we assume if we are to find velocity as a function of acceleration in this manner.
Can't we use this same counterargument for any composite function ##f(/circ)g##? The mere act of substituting g(x) for x implies that ##g(x)=x##, doesn't it?
 
  • #6
Sorry, I meant ##f \circ g##
 
  • #7
MohammedRady97 said:
Can't we use this same counterargument for any composite function ##f(/circ)g##? The mere act of substituting g(x) for x implies that ##g(x)=x##, doesn't it?
We can substitute one variable (or a more complicated expression) for another if we know that they're equal (i.e. if they represent the same number). For example, if we know that the variables x and y satisfy the constraint g(y)=x, we can substitute g(y) for x in any formula that involves x. However, if a function maps several numbers to the same number, then you may be able to substitute a variable for one with a different value without causing any problems. For example, in the formula ##x^2=1##, you can substitute y for x if x=y OR if x=-y.
 
  • #8
But we are talking about ##g(x)=x##, and not ##g(y)=x##
Also, in the above example, is it technically wrong to say that ##v(a)## is velocity as a function of acceleration? Must it be ##(v \circ f)(a)##? I am getting confused here.
If inputting g(x) for x is allowed to compute ##(f \circ g)(x)##, why is plugging in a for t not allowed?
 
  • #9
Regarding the g(x)=x thing, I don't understand what you want to know. Doesn't the ##x^2=1## example show that sometimes you can substitute a number for a different number?

You can certainly say that v(a) is a function of a. You don't have to know anything about any other variable, or what physical quantity v or a represents, to say that v(a) is a function of a. The notation represents a number. What number it represents depends on the value of a, i.e. what number a represents. That's all you need to know to say that v(a) is a function of a.

There's a subtle difference between the phrases "is a function" and "is a function of" that I should probably explain. When x and y satisfy a constraint like x+y=1, which enables us to determine y from x, we say that y is a function of x. However, y is not a function. It's a real number. The symbol "y" is a variable that represents a real number. We can define a function f by f(x)=1-x for all x. Now the constraint can be written y=f(x). Now f is a function and f(x) is a number in its range...which depends on the value of x, so we say that f(x) is a function of x, but that doesn't mean that f(x) is a function. The function is f. The symbol "f" is a variable that represents a function. The expression "f(x)" represents a number.
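The distinction maps loosely onto programming, which some readers may find helpful. In the following Python sketch (an analogy, not a formal definition), `f` is a function object, while `y` is just a number that happens to have been computed from `x`.

```python
def f(x):
    # The function f from the paragraph above: f(x) = 1 - x.
    return 1 - x

x = 0.25
y = f(x)  # y is a number determined by x: "y is a function of x"

print(callable(f))  # True: f is a function
print(callable(y))  # False: y = f(x) is a number, not a function
print(x + y)        # 1.0: the constraint x + y = 1 holds
```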

It will be easier to discuss your concerns if you come up with a realistic example, something from your physics book.
 
  • #10
Does that mean that when I say ##v=v(t)##, then that second v is a function which maps t to the first v? So basically I can make the argument of v whatever I want, but it would ONLY equal the variable v, velocity, if ##v=v(t)##? In other words, v equals v(t), but it does not equal v(a).
Am I getting this right?
 
  • #11
I think this confusion arises because in Physics it is a common practice to choose the same letter for both the function and the variable.
 
  • #12
MohammedRady97 said:
Does that mean that when I say ##v=v(t)##, then that second v is a function which maps t to the first v? So basically I can make the argument of v whatever I want, but it would ONLY equal the variable v, velocity, if ##v=v(t)##? In other words, v equals v(t), but it does not equal v(a).
Am I getting this right?
The teacher I had for the introduction to classical mechanics used notations like "v=v(t)" to say that velocity is a function of time. I think it's a pretty common notation in physics, but mathematicians don't use it. I once used it on a math exam, and the teacher looked like he wanted to slap me when I tried to explain it to him. I stopped using it after that. If you use it, you have to keep in mind that it's an abbreviation for the following much longer statement:
v and t are variables that represent the values of two physical quantities. There's a function f such that v=f(t). We will denote this function by v.
If you write v(a), where v is that function, then v(a) is the velocity at time a. This is equal to the velocity at acceleration a, if t=a. (This doesn't really make sense because of the units). But if the relationship between t and a is more complicated, then v(a) may not be the velocity of the object when its acceleration is a.

You really need to find a realistic example, where we don't have to ignore that the units are all wrong. Preferably it should be one in which the relationship between the variables isn't just an equality.
 
  • #13
Actually, when I was talking about v, a and t, I was referring to dimensionless quantities such that velocity is ##v\ \mathrm{m\,s^{-1}}##, acceleration is ##a\ \mathrm{m\,s^{-2}}## and time is ##t\ \mathrm{s}##.
 
  • #14
If we consider the case of the ideal "falling body", the acceleration is constant (with respect to time), so making velocity a function of acceleration is hopeless. Both position and velocity change, so making velocity a function of position is possible. Try that example.
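One way to work through the suggested example is sketched below in Python. It rests on assumptions not stated in the thread: the body is released from rest, position is measured as the distance fallen s, there is no air resistance, and g = 9.81 m/s^2. Under those assumptions v = g t and s = g t^2 / 2, so t can be recovered from s, and velocity as a function of position is the composition of `velocity_at_time` with `time_at_distance` (all function names here are hypothetical).

```python
import math

G = 9.81  # assumed gravitational acceleration in m/s^2

def velocity_at_time(t):
    # v = g t for a body released from rest
    return G * t

def distance_at_time(t):
    # s = g t^2 / 2, the distance fallen after time t
    return 0.5 * G * t ** 2

def time_at_distance(s):
    # Inverse of distance_at_time for t >= 0: t = sqrt(2 s / g)
    return math.sqrt(2 * s / G)

def velocity_at_distance(s):
    # Composition of velocity_at_time with time_at_distance;
    # algebraically this simplifies to sqrt(2 g s).
    return velocity_at_time(time_at_distance(s))

for t in (0.5, 1.0, 2.0):
    s = distance_at_time(t)
    # Columns: t, s, velocity computed from t, velocity computed from s.
    print(t, s, velocity_at_time(t), velocity_at_distance(s))
```

Here the relationship between the two variables is not a plain equality, so substituting s directly into `velocity_at_time` would give a different (and physically meaningless) number from `velocity_at_distance(s)`.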
 

1. What is the chain rule in calculus?

The chain rule is a formula in calculus that allows you to determine the derivative of a composite function. In other words, it tells you how to find the rate of change of one function composed with another function.

2. How do you use the chain rule?

To use the chain rule, first identify the inner and outer functions in a composite function. If the composite is f(g(x)), with outer function f and inner function g, its derivative is f'(g(x)) * g'(x). This means you take the derivative of the outer function, evaluate it at the inner function, and multiply it by the derivative of the inner function.

3. Why is the chain rule important?

The chain rule is important because it allows us to find the derivative of complex functions that are composed of multiple simpler functions. It is a fundamental concept in calculus that is used to solve a variety of problems in mathematics, physics, and engineering.

4. Can the chain rule be applied to functions with more than two components?

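Yes, the chain rule can be applied to functions with more than two components. For a composition of any number of functions, take the derivative of each function, evaluate it at the output of the functions nested inside it, and multiply the results together. For three functions, the derivative of f(g(h(x))) is f'(g(h(x))) * g'(h(x)) * h'(x).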

5. How does the chain rule relate to function composition?

The chain rule and function composition are closely related because the chain rule is used to find the derivative of a composite function, which is formed by combining two or more functions. In other words, the chain rule says that the rate of change of a composite function is the rate of change of the outer function, evaluated at the inner function's value, multiplied by the rate of change of the inner function.
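The extension to more than two functions mentioned in answer 4 can be checked numerically. The Python sketch below uses three example functions chosen purely for illustration (they do not appear in the thread) and a hypothetical finite-difference helper, `central_difference`, to compare against the product of derivatives.

```python
import math

# Example functions, assumed for illustration only.
def h(x):
    return 3 * x + 1

def h_prime(x):
    return 3.0

def g(u):
    return u ** 2

def g_prime(u):
    return 2 * u

def f(w):
    return math.sin(w)

def f_prime(w):
    return math.cos(w)

def composite(x):
    # f(g(h(x))) = sin((3x + 1)**2)
    return f(g(h(x)))

def composite_prime(x):
    # Chain rule for three functions: each derivative is evaluated
    # at the output of the functions nested inside it.
    inner = h(x)
    middle = g(inner)
    return f_prime(middle) * g_prime(inner) * h_prime(x)

def central_difference(func, x, step=1e-6):
    # Hypothetical helper: symmetric finite-difference estimate of func'(x).
    return (func(x + step) - func(x - step)) / (2 * step)

x = 0.2
# The two printed values should agree up to a small numerical error.
print(composite_prime(x), central_difference(composite, x))
```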
