well the usual implicit function theorem says that locally the set where f(x,y) = 0 in the (x,y) plane looks like a graph of a function y(x). But actually it gives you that also all nearby curves where f(x,y) = t, for t near zero, also each look like a graph of a function.
So the point is that under the given condition on the derivative of f not being zero, the curves f(x,y) = t, for all t near zero, form a smooth family of roughly parallel curves.
But if they are smooth and parallel they should be able to be straightened out into a family of parallel lines.
Now the simplest function f for which that is true is projection on an axis. So the first abstract version of your theorem, the one at MIT, says if the derivative of the function is non zero enough (and because you are in infinite dimensions you also have to assume the domain of the linearization of the function, the derivative, can be split into two factors), then your function is equivalent to projection on one of the two “axes” or factor spaces.
does this help? i.e. you are trying to show that the set where f(x,y) = 0 is locally a graph y(x), i.e. that locally it projects isomorphically onto the x axis. but this is true for the projection map taking (x,y)-->y, since then the level set where the function equals zero, is the set of points (x,0), which clearly projects
via (x,0)-->x, isomorphically onto the x axis. so showing your function f is equivalent to projection does it.another way of saying it is to say that locally f can be used to replace y as a coordinate, i.e. that locally every point is determined by the 2 numbers (x,f(x,y)). that says we can coordinatize the plane near our point by x and f(x,y) instead of x and y. I.e. we can assume f(x,y) = y, i.e. f is projection onto the y axis.
is any of this making sense? if it makes you feel bettter i spent an awful lot of years thinking about why all these are the same. and after contemplating the banach version for a long time i was still puzzled when a mathjematician said all it means is locally you can solve for some of the variables in terms of the others,. i went "huh?" i.e. in one of my formulations above, f(x,y) = 0 is locally a graph y(x), i.e. y is locally determined by x, on the set where f(x,y) = 0