# Question about the implicit function theorem

I won't post the whole rigorous statement of the theorem, but basically it states that if ##F(x,y) = 0## on a neighborhood of the form ##[x-\delta ,x+\delta ]\times [y- \epsilon ,y+\epsilon ]## and if ##\frac{\partial F(x,y)}{\partial y} \neq 0##, then there exists a function ##y=\phi (x)## such that ##F(x,\phi (x)) = 0## for all ##x \in [x-\delta ,x+\delta ]##. My question is about ##\phi '(x)##, which can be obtained by differentiating ##F(x,\phi (x))## w.r.t. x as follows:
$$\frac{d}{dx}F(x,\phi (x)) = F_x(x,\phi (x)) + F_y(x,\phi (x))\phi '(x) = 0 \quad\Rightarrow\quad \phi '(x) = -\frac{F_x(x,\phi (x))}{F_y(x,\phi (x))}$$
Why can we set ##\frac{d}{dx}F(x,\phi (x)) = 0##? By definition, by composing F with ##\phi##, we are on a curve where F is 0 for all x (and so it makes sense that all its derivatives w.r.t. x are 0). But ##\frac{\partial F(x,y)}{\partial x}## isn't necessarily equal to 0 in that same neighborhood. What is the difference between the two? Isn't ##F(x,\phi (x)) \in F(x,y)##? I'm quite confused.

## Answers and Replies

Samy_A
Science Advisor
Homework Helper
You can set ##\frac{d}{dx}F(x,\phi (x)) = 0## because, as you said, ##F(x,\phi (x))=0## (in an interval).

Maybe use a trivial example to see what is going on here:
##F(x,y)=x^2+2y##
Compute ##\phi(x)##, and verify that ##\phi '(x) = -\frac{F_x(x,\phi (x))}{F_y(x,\phi (x))}## by computing all the terms in this equation.
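
Working that example through may make the distinction concrete (a sketch for this particular ##F## only, where ##y## can be isolated by hand). Solving ##x^2 + 2y = 0## for ##y## gives ##\phi(x) = -\frac{x^2}{2}##, so ##\phi '(x) = -x##. The partial derivatives are ##F_x = 2x## and ##F_y = 2##, hence
$$-\frac{F_x(x,\phi (x))}{F_y(x,\phi (x))} = -\frac{2x}{2} = -x = \phi '(x)$$
Note that ##F_x = 2x## is not zero for ##x \neq 0##, while the total derivative along the curve is
$$\frac{d}{dx}F(x,\phi (x)) = F_x + F_y\,\phi '(x) = 2x + 2(-x) = 0$$
The partial derivative moves in ##x## with ##y## held fixed, which leaves the curve; the total derivative moves along the curve, where ##F## stays 0.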

I see now. Thanks Samy! :)

Gianmarco, maybe your original F(x,y) is defined on a product neighborhood of the form [x−δ,x+δ]×[y−ϵ,y+ϵ], but it is not a correct statement (or even an informal paraphrase) of the implicit function theorem to assume that F(x,y) = 0 on that product neighborhood.

Hello zinq. For a function on the xy-plane to be implicit, doesn't it mean that it has to be in the form ##F(x,y) = 0##? Thank you for the link btw.

Gianmarco, yes — in order to have an implicit function, you need to have an equation of the form F(x, y) = 0.

What I am pointing out is that you do not want to assume that F(x, y) is equal to 0 for all (x, y) in an entire product neighborhood, like

[x−δ, x+δ] × [y−ϵ, y+ϵ],​

as you did.

You have a function F defined on some (generally assumed to be) open set of the plane R².

Let's call that open set — the domain of F — by the name U. Then F is a function

F: U → R

Suppose we make a graph of F(x, y) in 3D:

z = F(x, y)​

which means, of course, the set

G = {(x, y, z) | (x, y) ∈ U and z = F(x, y)}

which we have called G. Now you are interested in the points (x, y) of U where

F(x, y) = 0.​

One way to picture this is to think of the plane P defined by

z = 0​

and where that plane intersects the graph

z = F(x, y).​

Typically, the intersection

G ∩ P = {(x, y, z) | (x, y) ∈ U and z = F(x, y) = 0}​

of these two surfaces in 3-space (P being just a plane) will be a curve. Not necessarily, but typically.

It would not be a curve if the value of F(x, y) is 0 on an entire product neighborhood such as

[x−δ, x+δ] × [y−ϵ, y+ϵ].​

In that unlikely case, that intersection

G ∩ P = {(x, y, z) | (x, y) ∈ U and z = F(x, y) = 0}​

will contain a set that contains the entire product neighborhood

[x−δ, x+δ] × [y−ϵ, y+ϵ]​

at the level z = 0.

In order to make sense of this, it would be best to draw pictures of what is going on.
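
If drawing by hand is inconvenient, a rough text picture can stand in for one. The sketch below is only an illustration: it assumes the sample function F(x, y) = x² + 2y from earlier in the thread, and the helper name `zero_set_picture`, the grid size, and the window [−1, 1] × [−1, 1] are arbitrary choices. It marks with `*` each grid cell where F changes sign going up in y, i.e. where the graph crosses the plane z = 0.

```python
def F(x, y):
    # Sample function from earlier in the thread (an assumption for illustration).
    return x * x + 2.0 * y


def zero_set_picture(n=21, lo=-1.0, hi=1.0):
    """Return rows of text; '*' marks a cell where F goes from <= 0 to > 0 in y."""
    step = (hi - lo) / (n - 1)
    rows = []
    for j in range(n - 1, 0, -1):  # top row first
        y_top = lo + j * step
        y_bot = lo + (j - 1) * step
        row = ""
        for i in range(n):
            x = lo + i * step
            crosses = F(x, y_bot) <= 0.0 < F(x, y_top)
            row += "*" if crosses else "."
        rows.append(row)
    return rows


if __name__ == "__main__":
    for line in zero_set_picture():
        print(line)
```

For this F, each column of the picture contains exactly one `*`: the zero set is a thin curve (here the parabola y = −x²/2), not a filled rectangle.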

Actually, your explanation was very clear. My mistake was not thinking of ##F(x,y)## as a surface in 3D intersected with the plane z = 0. I have another question though: when could this theorem be useful? I see that you could study the level curves ##F(x,y) - z = 0## of any such surface, with z fixed, provided that at least one of the partial derivatives is not zero. But given ##F(x_0,y_0) = 0## and ##\frac{\partial F}{\partial y}(x_0,y_0) \neq 0## at some point ##(x_0, y_0)##, the theorem only guarantees that, near that point, the level curve IS the graph of a function. It doesn't say that you can algebraically express y in terms of x. For instance, ##F(x,y)=y^2+x-e^y## satisfies ##F(1,0) = 0## and ##\frac{\partial F}{\partial y}(1,0) \neq 0##. So how can this be useful? Is it because it makes it possible to study the critical points w.r.t. a certain axis?

When you have an equation like

F(x, y) = 0

and the derivative condition is satisfied, you know that (at least locally) there exists a function

y = φ(x)

that satisfies

F(x, φ(x)) = 0.

Actually finding an algebraic expression for φ(x) is then not as important in practical applications as being able to solve for φ(x) numerically, to a sufficient degree of accuracy for whatever the purpose is.

Of course, it's always nice if you can get an explicit expression for φ(x), but it is usually not necessary.
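
As a minimal sketch of what "solve numerically" could look like, here is Newton's method in y applied to the example F(x, y) = y² + x − e^y from the question above. Newton's method is just one reasonable choice, not something the theorem prescribes; the function names, the starting guess y = 0 (taken from the known point (1, 0)), and the tolerance are all assumptions made for illustration.

```python
import math


def F(x, y):
    # Example from the question: F(1, 0) = 0.
    return y * y + x - math.exp(y)


def F_y(x, y):
    # Partial derivative of F with respect to y.
    return 2.0 * y - math.exp(y)


def phi(x, y_guess=0.0, tol=1e-12, max_iter=50):
    """Approximate the y near y_guess with F(x, y) = 0, via Newton's method in y."""
    y = y_guess
    for _ in range(max_iter):
        step = F(x, y) / F_y(x, y)
        y -= step
        if abs(step) < tol:
            return y
    raise RuntimeError("Newton iteration did not converge")


if __name__ == "__main__":
    for x in (0.9, 1.0, 1.1):
        y = phi(x)
        print(f"x = {x:.1f}  phi(x) = {y:.6f}  F(x, phi(x)) = {F(x, y):.1e}")
```

The condition F_y ≠ 0 from the theorem is exactly what keeps the Newton step `F / F_y` well defined near the starting point.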

I see what you mean! You could numerically approximate ##\phi (x)## with a Taylor expansion. That's why we care about finding the value of ##\phi '(x)##. Thanks a lot zinq, that was enlightening :D
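
To make that last idea concrete, here is a small sketch of the first-order Taylor approximation φ(x) ≈ y₀ + φ'(x₀)(x − x₀) for the example F(x, y) = y² + x − e^y at (x₀, y₀) = (1, 0), using φ' = −F_x / F_y. The names `phi_prime`, `taylor1`, and `phi_newton` are made up for this illustration, and the fixed-iteration Newton solve is only there as a reference value to compare against.

```python
import math


def F(x, y):
    return y * y + x - math.exp(y)


def F_x(x, y):
    # Partial derivative of F with respect to x (constant for this F).
    return 1.0


def F_y(x, y):
    # Partial derivative of F with respect to y.
    return 2.0 * y - math.exp(y)


def phi_prime(x, y):
    # Derivative of the implicit function at a point (x, y) on the curve F = 0.
    return -F_x(x, y) / F_y(x, y)


def taylor1(x, x0=1.0, y0=0.0):
    # First-order Taylor approximation of phi around the known point (x0, y0).
    return y0 + phi_prime(x0, y0) * (x - x0)


def phi_newton(x, y=0.0, iterations=50):
    # Reference value: Newton's method in y with a fixed number of iterations.
    for _ in range(iterations):
        y -= F(x, y) / F_y(x, y)
    return y


if __name__ == "__main__":
    for x in (1.0, 1.01, 1.05):
        print(x, taylor1(x), phi_newton(x))
```

Here φ'(1) = −1 / (−1) = 1, so the linear approximation is simply φ(x) ≈ x − 1 near x = 1; it drifts away from the Newton value as x moves further from 1.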