# The Chain Rule and Function Composition

1. Sep 19, 2014

This problem has been bugging me for ages. I just can't wrap my head around this weird result. I know I went wrong somewhere (as a matter of fact, that was the answer I was hoping for), but most sources, including but not limited to Wikipedia, suggest otherwise.

I will cut to the chase.

A function f is defined by $$f(x)=x^2$$

The derivative, f', is then $$f'(x)=2x$$

So far so good.

Here is where all hell broke loose for me:

The composition of f and f' yields $$f[f'(x)]=(2x)^2=4x^2$$

I will now switch to Leibniz notation for convenience.
u=f(x), and v=f'(x)

According to the chain rule

$$\frac{du}{dx}=\frac{du}{dv}\frac{dv}{dx}$$

$$f'(x)=f'[f'(x)]f''(x)$$

Well, we know what f'(x) is: it is 2x, as shown above.
du/dv is (2)(2x)=4x
And dv/dx is simply 2.
Putting all of this in the above equation, for nonzero x: $$2x=8x$$ $$2=8$$

My speculation as to why this bizarre result popped out was the fact that my substitution of f'(x) for x in f(x) to get f[f'(x)] was wrong. But after going through several websites, I came to realize that there is nothing mathematically wrong with this substitution.

Can someone please point out to me where I went wrong with this?
I would be more than grateful.

Last edited: Sep 19, 2014
2. Sep 20, 2014

### Fredrik

Staff Emeritus
That's not what the chain rule says. In your notation, you would have to write u=f(f'(x)) instead of u=f(x). Now you have $\frac{du}{dx}=\frac{du}{dv}\frac{dv}{dx}$. I prefer the following notation: The chain rule says that $(g\circ h)'(x)=g'(h(x))h'(x)$. If you apply this with g=f and h=f', you get $(f\circ f')'(x)=f'(f'(x))f''(x)$.
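As a quick check of this formula with the original $f(x)=x^2$ (so $f'(x)=2x$ and $f''(x)=2$), both sides can be computed directly:

$$(f\circ f')(x)=(2x)^2=4x^2 \quad\Rightarrow\quad (f\circ f')'(x)=8x$$

$$f'(f'(x))\,f''(x)=2(2x)\cdot 2=8x$$

The two agree. The apparent contradiction $2x=8x$ appears only when $f'(x)$ is put on the left-hand side, where the derivative of the composition $f\circ f'$ belongs.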

3. Sep 20, 2014

Makes sense now, thank you for pointing this out! :)
But one question remains regarding function composition. When it comes to physical systems (the simplest of which is a case in kinematics), I have trouble making sense of what happens while changing the argument of the function I'm looking at.
For instance, the velocity of a body, v m/s, measured at time t s is given by: $$v(t)=3+5t$$
I am given to understand that to find velocity as a function of another variable, say, acceleration (a m/s^2), we do this simple substitution: $$v(a)=3+5a$$
But isn't this a wrong result? I want your thoughts on this please.

4. Sep 20, 2014

### Fredrik

Staff Emeritus
If v(t)=3+5t for all t, then v(a)=3+5a for all a. For example, if you input 2 into v, the output will be 13 regardless of which of the symbols $a$ or $t$ you used to represent the number 2. So the first formula (interpreted as a "for all" statement) certainly implies the second.

But this isn't a way to find "velocity as a function of acceleration" in general. It only accomplishes that goal when the relationship between acceleration and time is t=a (which doesn't really make sense unless we somehow use the same units for acceleration and time, but I'll ignore that issue for now). If you have a formula that gives you a way to determine the value of t from the value of a, something like t=f(a), then you can certainly write 3+5t=3+5f(a). This is, however, not the same as plugging a into v. The right-hand side is equal to v(t) (because that's what the left-hand side is equal to), i.e. to v(f(a)). It's not equal to v(a) unless f(a)=a.
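To make the difference concrete, here is a minimal Python sketch. The function `v` is the formula from the thread; the relation `time_from_acceleration` (t = a/2) is purely hypothetical, invented for illustration and, like the thread's example, it ignores units.

```python
def v(t):
    """Velocity at time t, from the thread's example v(t) = 3 + 5t."""
    return 3 + 5 * t


def time_from_acceleration(a):
    """Hypothetical relation t = f(a) = a / 2, assumed only for illustration."""
    return a / 2


def velocity_at_acceleration(a):
    """The composition v(f(a)): the velocity when the acceleration is a."""
    return v(time_from_acceleration(a))


a = 2
plugged_in = v(a)                       # just evaluates v at the number 2
composed = velocity_at_acceleration(a)  # evaluates v at t = f(2) = 1

print(plugged_in)  # 13
print(composed)    # 8.0
```

Under this assumed relation, plugging a into v and composing v with f give different numbers. They would coincide only if f(a)=a.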

5. Sep 20, 2014

I am glad you brought up the part about t=a being the constraint we assume if we are to find velocity as a function of acceleration in this manner.
Can't we use this same counterargument for any composite function $f(/circ)g$? The mere act of substituting g(x) for x implies that $g(x)=x$, doesn't it?

6. Sep 20, 2014

Sorry, I meant $f \circ g$

7. Sep 20, 2014

### Fredrik

Staff Emeritus
We can substitute one variable (or a more complicated expression) for another if we know that they're equal (i.e. if they represent the same number). For example, if we know that the variables x and y satisfy the constraint g(y)=x, we can substitute g(y) for x in any formula that involves x. However, if a function maps several numbers to the same number, then you may be able to substitute a variable for one with a different value without causing any problems. For example, in the formula $x^2=1$, you can substitute y for x if x=y OR if x=-y.

8. Sep 20, 2014

But we are talking about $g(x)=x$, and not $g(y)=x$
Also, in the above example, is it technically wrong to say that $v(a)$ is velocity as a function of acceleration? Must it be $(v \circ f)(a)$? I am getting confused here.
If inputting g(x) for x is allowed to compute $(f \circ g)(x)$, why is plugging in a for t not allowed?

9. Sep 20, 2014

### Fredrik

Staff Emeritus
Regarding the g(x)=x thing, I don't understand what you want to know. Doesn't the $x^2=1$ example show that sometimes you can substitute a number for a different number?

You can certainly say that v(a) is a function of a. You don't have to know anything about any other variable, or what physical quantity v or a represents, to say that v(a) is a function of a. The notation represents a number. What number it represents depends on the value of a, i.e. what number a represents. That's all you need to know to say that v(a) is a function of a.

There's a subtle difference between the phrases "is a function" and "is a function of" that I should probably explain. When x and y satisfy a constraint like x+y=1, which enables us to determine y from x, we say that y is a function of x. However, y is not a function. It's a real number. The symbol "y" is a variable that represents a real number. We can define a function f by f(x)=1-x for all x. Now the constraint can be written y=f(x). Now f is a function and f(x) is a number in its range...which depends on the value of x, so we say that f(x) is a function of x, but that doesn't mean that f(x) is a function. The function is f. The symbol "f" is a variable that represents a function. The expression "f(x)" represents a number.
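The same distinction shows up in most programming languages. As a rough sketch in Python, using the f(x)=1-x example from this post (the particular value 0.25 is an arbitrary choice):

```python
def f(x):
    """The function f defined by f(x) = 1 - x."""
    return 1 - x


x = 0.25
y = f(x)  # y is a number, determined by x through the constraint x + y = 1

f_is_function = callable(f)  # f itself is a function object
y_is_function = callable(y)  # y, i.e. f(x), is just a number

print(f_is_function)
print(y_is_function)
print(x + y)
```

Here `f` can be called with any argument, while `y` is only the number that `f` returned for this particular `x`.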

It will be easier to discuss your concerns if you come up with a realistic example, something from your physics book.

10. Sep 20, 2014

Does that mean that when I say $v=v(t)$, then that second v is a function which maps t to the first v? So basically I can make the argument of v whatever I want, but it would ONLY equal the variable v, velocity, if $v=v(t)$? In other words, v equals v(t), but it does not equal v(a).
Am I getting this right?

11. Sep 20, 2014

I think this confusion arises because in Physics it is a common practice to choose the same letter for both the function and the variable.

12. Sep 20, 2014

### Fredrik

Staff Emeritus
The teacher I had for the introduction to classical mechanics used notations like "v=v(t)" to say that velocity is a function of time. I think it's a pretty common notation in physics, but mathematicians don't use it. I once used it on a math exam, and the teacher looked like he wanted to slap me when I tried to explain it to him. I stopped using it after that. If you use it, you have to keep in mind that it's an abbreviation for the following much longer statement:
v and t are variables that represent the values of two physical quantities. There's a function f such that v=f(t). We will denote this function by v.
If you write v(a), where v is that function, then v(a) is the velocity at time a. This is equal to the velocity at acceleration a if t=a (which doesn't really make sense because of the units). But if the relationship between t and a is more complicated, then v(a) may not be the velocity of the object when its acceleration is a.

You really need to find a realistic example, where we don't have to ignore that the units are all wrong. Preferably it should be one in which the relationship between the variables isn't just an equality.

13. Sep 20, 2014

Actually, when I was talking about v, a and t, I was referring to dimensionless quantities, such that the velocity is $v\ \mathrm{m\,s^{-1}}$, the acceleration is $a\ \mathrm{m\,s^{-2}}$ and the time is $t\ \mathrm{s}$.