## inverse function theorem for surface mappings

Hi,

I have a limited background in differential geometry. I have a problem involving a surface mapping (from R2 to R3) which does not have a square Jacobian. I understand that for a mapping of preserved dimensionality I can compute a matrix inverse which will allow me to map tangent vector coordinates from image to domain or vice-versa.

i.e. S = (x(u,v), y(u,v), z(u,v))

I have come across some sites online that seem to cavalierly stick a column [0,0,1] and then invert, and/or utilize a matrix pseudoinverse to define an inverse for this mapping.

My question is what the heck are they doing and why? Do these pseudoinverse methods have any properties that make them useful? Perhaps my understanding is limited because I don't understand what the inverted tangent vectors mean in the domain space?

Sorry for my lack of background and thanks for any help :)
 If the Jacobian has maximal rank, you can certainly define an inverse if you restrict to the range.
 Sorry, I don't understand what you mean by restricting its range. How would I do that? Are you saying I need to restrict consideration to a location where I know there is a one-to-one mapping? And am I assured of that by knowing that all of the corresponding 2x2 matrices are non-singular?


 Quote by 7thSon Sorry, I don't understand what you mean by restricting its range. How would I do that? Are you saying I need to restrict consideration to a location where I know there is a one-to-one mapping? And am I assured of that by knowing that all of the corresponding 2x2 matrices are non-singular?
You're parametrizing a surface, so by definition, the Jacobian has maximal rank (2). Choose a basis of the tangent space of the image so that the first two basis elements are in the span of the image of the Jacobian. If we look just at this subspace, the Jacobian is surjective, hence invertible.
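
A small numerical sketch may help make this concrete. It assumes a hypothetical example surface $$s(u,v) = (u, v, u^2 + v^2)$$ (not from the thread) and builds the left pseudoinverse $$(J^T J)^{-1} J^T$$ of its 3x2 Jacobian by hand. The helper names are illustrative only.

```python
# Sketch: left pseudoinverse of a 3x2 surface Jacobian, pure Python.
# The surface s(u, v) = (u, v, u**2 + v**2) is an assumed example.

def jacobian(u, v):
    """3x2 Jacobian of s(u, v) = (u, v, u**2 + v**2); columns are s_u and s_v."""
    return [[1.0, 0.0],
            [0.0, 1.0],
            [2.0 * u, 2.0 * v]]

def matmul(A, B):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def left_pseudoinverse(J):
    """(J^T J)^-1 J^T, valid when J has full column rank (rank 2 here)."""
    Jt = transpose(J)
    return matmul(inverse_2x2(matmul(Jt, J)), Jt)

J = jacobian(1.0, 2.0)
J_plus = left_pseudoinverse(J)            # a 2x3 matrix

# Push a domain vector forward into the tangent plane, then pull it back.
domain_vec = [[3.0], [-1.0]]
tangent_vec = matmul(J, domain_vec)       # 3x1, a combination of s_u and s_v
recovered = matmul(J_plus, tangent_vec)   # 2x1, equals domain_vec up to rounding

# The surface normal s_u x s_v = (-2, -4, 1) at this point is sent to zero.
normal = [[-2.0], [-4.0], [1.0]]
normal_image = matmul(J_plus, normal)

print(recovered, normal_image)
```

On tangent vectors the pseudoinverse simply returns the (u, v) components, and it discards anything normal to the surface. Appending a third column that is not in the tangent plane and inverting the resulting 3x3 matrix also yields a left inverse of the Jacobian in its first two rows; those rows coincide with the pseudoinverse when the appended column is normal to the surface.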
 Thanks, I think I see what you are saying, and I think I can see why I am confused. Could you enlighten me? Using the surface mapping $$s(u,v) := (x(u,v), y(u,v), z(u,v))$$ I get a Jacobian that looks like $$\left( \begin{array}{cc} x_u & x_v \\ y_u & y_v \\ z_u & z_v \end{array} \right)$$ Now, the vectors $$(x_u, y_u, z_u)$$ and $$(x_v, y_v, z_v)$$ are the columns of the Jacobian for this mapping, but they are themselves vectors in the image of this mapping, and form a basis for the tangent space of the image (is that correct?). Then there is nothing to keep me from defining a 2 x 3 pseudoinverse matrix. If you restrict yourself to operating on linear combinations of $$(x_u, y_u, z_u)$$ and $$(x_v, y_v, z_v)$$, vectors in the image are mapped back to vectors in the domain of the mapping. My question, then: is there any special significance to the inverse mapping of the vectors that defined my Jacobian? Hope that made sense.

 Quote by 7thSon Using the surface mapping $$s(u,v) := (x(u,v), y(u,v), z(u,v))$$ I get a Jacobian that looks like $$\left( \begin{array}{cc} x_u & x_v \\ y_u & y_v \\ z_u & z_v \end{array} \right)$$ Now, the vectors $$(x_u, y_u, z_u)$$ and $$(x_v, y_v, z_v)$$ are the columns of the Jacobian for this mapping, but they are themselves vectors in the image of this mapping, and form a basis for the tangent space of the image (is that correct?).
Yes. The vectors that make up the columns of the Jacobian make up a basis of the tangent space. However, there is more that we can say about this, and the appropriate concepts and notation make it clearer.

The mapping $$s(u,v) := (x(u,v), y(u,v), z(u,v))$$ defines a two dimensional surface in the three dimensional x, y, z space. The functions u and v are coordinates on this surface, and you can characterize the coordinates by drawing constant surfaces of u and v. There are a couple of concepts that are clearer if we do not use a 2D mapping, so I will add a coordinate w to the list.

The mapping $$s(u,v,w) := (x(u,v,w), y(u,v,w), z(u,v,w))$$ defines a new coordinate mapping of the x, y, z space. The functions u, v, and w are the new coordinates, and you can characterize the coordinates by drawing constant surfaces of u, v, and w. For example, the coordinate surfaces of u are drawn by holding u constant and drawing $$s(u=c,v,w)$$. We then increment u by $$\Delta u$$ and draw $$s(u=c + \Delta u, v, w)$$. We continue this process, incrementing u by constant amounts and drawing $$s(u=c+n \Delta u, v, w)$$. We then repeat the process by holding v constant rather than u, and again for w. This is the typical way we characterize a coordinate system such as cylindrical or spherical coordinates.

When you define a set of coordinates, you also define a dual set of coordinate lines. These lines are parameterized lines, and the parameter for a coordinate line is the associated coordinate function. The coordinate lines are visible when the coordinate system is drawn. The coordinate line for u falls along intersecting constant surfaces of v and w, and as noted the parameterization of the coordinate line for u is defined by coordinate u. Similarly, the coordinate line for v falls along the intersection of the constant surfaces of u and w, and the coordinate line parameter is the coordinate function v. We intuitively recognize the coordinate lines when we look at a coordinate system that has been drawn, because they fall at the intersection of the coordinate surfaces and the parameterization is visible through the associated coordinate function. The coordinate lines are easy to calculate from $$s(u,v,w) := (x(u,v,w), y(u,v,w), z(u,v,w))$$. For the u coordinate lines, you simply hold v and w constant and plot $$s(u,v=c_1,w=c_2)$$. There is a separate coordinate line for each value of v and w. For the v coordinate lines, you hold u and w constant and plot $$s(u=c_1, v, w=c_2)$$. The same pattern is followed for w.

However, it is often easier to work with the differentials of the coordinate functions and the coordinate lines. The differentials of the coordinate functions are sometimes called differential forms, and sometimes covectors. Sometimes these are represented as row vectors, but since the chain rule is the rule for transformation in differential geometry, it is often more convenient to represent these in differential form notation $$du, dv, dw$$. Just as u, v, and w form a new coordinate system, $$du, dv, dw$$ form an alternative basis to $$dx, dy, dz$$ for the covector space. This basis characterizes the rate of change of functions (it can also characterize things that are not functions, but that will not be discussed here). The conversion from the old basis to the new basis is given by:

$$dx = \frac{\partial x}{\partial u} du + \frac{\partial x}{\partial v} dv + \frac{\partial x}{\partial w} dw = x_u du + x_v dv + x_w dw$$

$$dy = \frac{\partial y}{\partial u} du + \frac{\partial y}{\partial v} dv + \frac{\partial y}{\partial w} dw = y_u du + y_v dv + y_w dw$$

$$dz = \frac{\partial z}{\partial u} du + \frac{\partial z}{\partial v} dv + \frac{\partial z}{\partial w} dw = z_u du + z_v dv + z_w dw$$

Note that the notation is convenient since the chain rule (or rule for transformations) is naturally expressed in this form. The vector basis for u, v, and w can be represented by column vectors. However, in differential geometry it is more convenient to use the notation $$\frac{\partial}{\partial u}, \frac{\partial}{\partial v}, \frac{\partial}{\partial w}$$. These can be calculated from $$s(u,v,w) := (x(u,v,w), y(u,v,w), z(u,v,w))$$ by the chain rule formula:

$$\frac{\partial}{\partial u} = \frac{\partial x}{\partial u} \frac{\partial}{\partial x} + \frac{\partial y}{\partial u} \frac{\partial}{\partial y} + \frac{\partial z}{\partial u} \frac{\partial}{\partial z} = x_u \frac{\partial}{\partial x} + y_u \frac{\partial}{\partial y} + z_u \frac{\partial}{\partial z}$$

$$\frac{\partial}{\partial v} = \frac{\partial x}{\partial v} \frac{\partial}{\partial x} + \frac{\partial y}{\partial v} \frac{\partial}{\partial y} + \frac{\partial z}{\partial v} \frac{\partial}{\partial z} = x_v \frac{\partial}{\partial x} + y_v \frac{\partial}{\partial y} + z_v \frac{\partial}{\partial z}$$

$$\frac{\partial}{\partial w} = \frac{\partial x}{\partial w} \frac{\partial}{\partial x} + \frac{\partial y}{\partial w} \frac{\partial}{\partial y} + \frac{\partial z}{\partial w} \frac{\partial}{\partial z} = x_w \frac{\partial}{\partial x} + y_w \frac{\partial}{\partial y} + z_w \frac{\partial}{\partial z}$$

The Jacobian for this system is given by:

$$\left( \begin{array}{ccc} x_u & x_v & x_w \\ y_u & y_v & y_w \\ z_u & z_v & z_w \end{array} \right)$$

Note that the columns of the Jacobian are not just new basis vectors; they are the basis vectors $$\frac{\partial}{\partial u}, \frac{\partial}{\partial v}, \frac{\partial}{\partial w}$$ associated with the new coordinate system u, v, w. The rows of the Jacobian are dx, dy, and dz transformed to the new basis du, dv, dw. Multiplying a row vector (field) / covector (field) / one form by the Jacobian converts it from the dx, dy, dz basis to the du, dv, dw basis - it is just an expression of the chain rule. Multiplying the Jacobian by a vector (field) converts it from the $$\frac{\partial}{\partial u}, \frac{\partial}{\partial v}, \frac{\partial}{\partial w}$$ basis to the $$\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}$$ basis - again just an expression of the chain rule. The Jacobian is a useful tool when writing vectors and covectors in matrix notation. When writing things in the more modern notation that is natural for the chain rule, the transformation rule is typically more obvious and the Jacobian does not need to be represented explicitly.

Note that the coordinate vectors $$\frac{\partial}{\partial u}, \frac{\partial}{\partial v}, \frac{\partial}{\partial w}$$ are not unit vectors, which are usually written in an 'e' form like $$e_1 , e_2 , e_3$$. Unit vectors have length 1 with respect to a metric defined on the space. Coordinate vectors need no metric to be defined and are parameterized by the associated coordinate function.
 llarsen, thank you for your amazing post. After reviewing it I have a couple of questions, and I entreat you to be brief because you've already been much more thorough than I could have hoped.

1. I assume that (d/dx, d/dy, d/dz) are just the same old vector basis (e1,e2,e3) in Euclidean R3, right? This is confusing to me because (dx,dy,dz) are the differentials along coordinate curves (as you describe), so which ones transform covariantly and which contravariantly? The (d/dx, d/dy, d/dz) is an object built along coordinate lines and so those basis vectors transform covariantly, correct?

2. I think this next question gets to my lack of ability to always distinguish between vectors and components of vectors. Let's say I wanted to transform a vector from basis (d/dx, d/dy, d/dz) to (d/du, d/dv, d/dw). If I multiply your 3x3 Jacobian by a 3x1 column vector representing the components of that vector in R3, the matrix multiplication gives me a transformation that looks like a one-form (i.e. dx/du, dx/dv, dx/dw) instead of one with (dx/du, dy/du, dz/du), yet I know that the Jacobian operating on a column vector is what maps a tangent vector in a coordinate transformation. Is this due to the fact that I am operating on components of the vector, which transform like (are?) covectors? And hence, if I were to map a basis vector that transforms covariantly, I would premultiply it as a row vector to get the appropriate transformation?

Thanks for any help! Feel free to point me to useful references online, but I've failed to find books etc. that have helped me get to the concepts I'm having trouble with.

 Quote by 7thSon I assume that (d/dx, d/dy, d/dz) are just the same old vector basis (e1,e2,e3) in Euclidean R3, right?
In my experience, the $$e_i$$ notation is used to indicate unit vectors. To define a unit vector you need to define a metric and use the metric to scale the basis vectors at each point in space such that the basis vectors are 1 unit long. It just so happens that in Euclidean space, the basis vectors (d/dx, d/dy, d/dz) are one unit long at each point in space. When using polar coordinates in a 2D space, d/dr is the same as $$e_r$$, but $\frac{\partial}{\partial \theta}$ is not the same as $$e_\theta$$, since the length of the $\frac{\partial}{\partial \theta}$ vector is defined by the rate of change of $$\theta$$ (or distance between constant surfaces of $$\theta$$) along the direction of constant r at the point where the basis vector is being calculated. Coordinate vectors are often more convenient than unit vectors to work with in differential geometry, and do not require the definition of a metric.
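
As a quick check of the polar example, here is a minimal sketch that measures the coordinate vectors with the ordinary Euclidean metric. It assumes the usual parametrization $$(x, y) = (r\cos\theta, r\sin\theta)$$; the function names are made up for illustration.

```python
import math

def polar_coordinate_vectors(r, theta):
    """Columns of the Jacobian of (x, y) = (r cos(theta), r sin(theta))."""
    d_dr = (math.cos(theta), math.sin(theta))                  # d/dr in x, y components
    d_dtheta = (-r * math.sin(theta), r * math.cos(theta))     # d/dtheta in x, y components
    return d_dr, d_dtheta

def euclidean_length(vec):
    """Length with respect to the standard Euclidean metric on the plane."""
    return math.hypot(*vec)

lengths = {}
for r in (0.5, 1.0, 3.0):
    d_dr, d_dtheta = polar_coordinate_vectors(r, math.pi / 6)
    lengths[r] = (euclidean_length(d_dr), euclidean_length(d_dtheta))
    print(r, lengths[r])
```

In this sketch d/dr always has length 1, while d/dθ has length r, so it matches the unit vector $$e_\theta$$ only on the circle r = 1.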

 Quote by 7thSon This is confusing to me because (dx,dy,dz) are the differentials along coordinate curves (as you describe), so which ones transform covariantly and which contravariantly?
Change of coordinates is common in differential geometry, and the terms 'co' and 'contra' are defined relative to how an object transforms in relation to coordinate functions, or more correctly, in relation to the differentials of coordinate functions. Since (exact) one forms, or covectors, or dual vectors, arise from differentials of functions, they transform in the same way as coordinate bases and are called "co"variant. Vectors transform using the inverse Jacobian and are "contra"variant. See the Wikipedia article here for more details.

 Quote by 7thSon The (d/dx, d/dy, d/dz) is an object built along coordinate lines and so those basis vectors transform covariantly, correct?
The differentials of functions transform covariantly. Vector terms like (d/dx, d/dy, d/dz) come from the differential of a curve in space (not a function) and transform contravariantly.

 Quote by 7thSon Let's say I wanted to transform a vector from basis (d/dx, d/dy, d/dz) to (d/du, d/dv, d/dw). If I multiply your 3x3 Jacobian by a 3x1 column vector representing the components of that vector in R3, the matrix multiplication gives me a transformation that looks like a one-form (ie. dx/du, dx/dv, dx/dw) instead of one with (dx/du, dy/du, dz/du), yet I know it is the Jacobian operating on a column vector is the one that will map a tangent vector in a coordinate transformation.
First it is worth noting that when you transform from (d/dx, d/dy, d/dz) to (d/du, d/dv, d/dw) by multiplying a vector (dx/dt, dy/dt, dz/dt) transposed by the jacobian, you don't get (dx/du, dy/du, dz/du). What you get is the following column vector:

$$\left( \begin{array}{c} \frac{dx}{dt} x_u + \frac{dy}{dt} x_v + \frac{dz}{dt} x_w \\ \frac{dx}{dt} y_u + \frac{dy}{dt} y_v + \frac{dz}{dt} y_w \\ \frac{dx}{dt} z_u + \frac{dy}{dt} z_v + \frac{dz}{dt} z_w \end{array} \right)$$

Vectors and covectors are geometric objects that come in a package. You don't operate on each component one at a time. I should note that I prefer writing vector fields using the notation $V = f_1 \frac{\partial}{\partial x} + f_2 \frac{\partial}{\partial y} + f_3 \frac{\partial}{\partial z}$ rather than using matrix notation, because the chain rule transformation becomes much more transparent.

 Quote by 7thSon Is this due to the fact that I am operating on components of the vector, which transform like (are?) covectors? And hence, if I were to map a basis vector that transforms covariantly, I would premultiply it as a row vector to get the appropriate transformation?
Components of vectors are scalars and transform like scalars (you do a substitution of variables). You use the Jacobian on the dx, dy, dz terms, or the inverse Jacobian on the d/dx, d/dy, d/dz terms. When you are expressing covectors using a1 dx + a2 dy + a3 dz and vectors using b1 d/dx + b2 d/dy + b3 d/dz, the transformation is just the chain rule, which is the same as using the Jacobian or inverse Jacobian on a vector or covector represented in matrix form.
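
To illustrate the covector half of this, here is a small sketch under assumed choices: polar coordinates in the plane and the function $$f = x^2 + y^2$$, neither of which comes from the thread. The row vector of df in the dx, dy basis is multiplied on the right by the Jacobian to get its components in the dr, dθ basis.

```python
import math

def polar_jacobian(r, theta):
    """Jacobian of (x, y) = (r cos(theta), r sin(theta)); columns are d/dr and d/dtheta."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def row_times_matrix(row, M):
    """Row vector times matrix, i.e. the covector rule f_new = f * J."""
    return [sum(row[i] * M[i][j] for i in range(len(row))) for j in range(len(M[0]))]

r, theta = 2.0, 0.7
x, y = r * math.cos(theta), r * math.sin(theta)

# df = 2x dx + 2y dy for the assumed function f = x**2 + y**2.
df_xy = [2.0 * x, 2.0 * y]

# Components of the same covector in the (dr, dtheta) basis.
df_polar = row_times_matrix(df_xy, polar_jacobian(r, theta))
print(df_polar)
```

Since $$f = r^2$$ in polar coordinates, the chain rule gives $$df = 2r\,dr$$, so the result should be approximately (2r, 0) up to floating point rounding.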

 Quote by 7thSon Feel free to point me to useful references online but I've failed to find books/etc that have helped me get to the concepts I'm having trouble with.
There is a pdf by some EE professors at BYU which gives one of the best graphical presentations of differential forms I have found. The pdf is here. I think there were some other materials on the website here. It gives a good graphical intuition into differential forms, but doesn't cover vector fields, or the relationship between vector fields and differential forms. However, it is a good start.

There are a few books from William Burke that I found useful. His books are great for visualizing concepts from differential geometry, but not so good if you are interested in rigor. A website for him is found here. He has a lot of great insights and is worth looking at.

"The Geometry of Physics" by Theodore Frankel is about the level of rigor I like. It is reasonably approachable, but has fewer graphical depictions of concepts than I would like. Covers vectors and differential forms pretty well.
 Is 7thSon an Orson Scott Card reference?

 Quote by llarsen Is 7thSon an Orson Scott Card reference?
It is in fact :). Although I didn't know it at the time I adopted the screen name... I bought the Iron Maiden album and then found out it was a literary reference.

Thanks a million for your help again, I will look through your response later today
 I made a mistake in my post above. I claimed that when you transform from (d/dx, d/dy, d/dz) to (d/du, d/dv, d/dw), you multiply the vector (dx/dt, dy/dt, dz/dt) transposed by the Jacobian to get: $$\left( \begin{array}{c} \frac{dx}{dt} x_u + \frac{dy}{dt} x_v + \frac{dz}{dt} x_w \\ \frac{dx}{dt} y_u + \frac{dy}{dt} y_v + \frac{dz}{dt} y_w \\ \frac{dx}{dt} z_u + \frac{dy}{dt} z_v + \frac{dz}{dt} z_w \end{array} \right)$$ This isn't correct, and you can tell because it doesn't agree with the chain rule. For example, examine the first term in the vector: $$\frac{dx}{dt} x_u + \frac{dy}{dt} x_v + \frac{dz}{dt} x_w = \frac{dx}{dt} \frac{\partial x}{\partial u} + \frac{dy}{dt}\frac{\partial x}{\partial v} + \frac{dz}{dt} \frac{\partial x}{\partial w}$$ The chain rule suggests that for the first component we should really have: $$\frac{dx}{dt} u_x + \frac{dy}{dt} u_y + \frac{dz}{dt} u_z = \frac{dx}{dt} \frac{\partial u}{\partial x} + \frac{dy}{dt}\frac{\partial u}{\partial y} + \frac{dz}{dt} \frac{\partial u}{\partial z}$$ What this is telling us is that we really need the inverse Jacobian to transform from (d/dx, d/dy, d/dz) to (d/du, d/dv, d/dw): $$\left( \begin{array}{ccc} u_x & u_y & u_z \\ v_x & v_y & v_z \\ w_x & w_y & w_z \end{array} \right)$$ The Jacobian is used to transform from (d/du, d/dv, d/dw) to (d/dx, d/dy, d/dz).
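
The chain rule check described here can also be done numerically. The sketch below is only an illustration under assumed choices: a 2D polar change of coordinates instead of (u, v, w), and a made-up curve $$(x(t), y(t)) = (1 + t, 2 + t^2)$$. It applies the inverse Jacobian to the velocity components and compares the result with a finite difference of r(t) and θ(t).

```python
import math

def curve(t):
    """Assumed example curve in the x, y plane."""
    return 1.0 + t, 2.0 + t * t

def curve_velocity(t):
    """Exact (dx/dt, dy/dt) of the example curve."""
    return 1.0, 2.0 * t

def to_polar(x, y):
    return math.hypot(x, y), math.atan2(y, x)

def inverse_jacobian(x, y):
    """Rows are (r_x, r_y) and (theta_x, theta_y) for r = sqrt(x^2 + y^2), theta = atan2(y, x)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

t0 = 0.5
x0, y0 = curve(t0)
vx, vy = curve_velocity(t0)

# Vector components in the (d/dr, d/dtheta) basis via the inverse Jacobian.
K = inverse_jacobian(x0, y0)
dr_dt = K[0][0] * vx + K[0][1] * vy
dtheta_dt = K[1][0] * vx + K[1][1] * vy

# Independent check: central finite difference of r(t) and theta(t) along the curve.
h = 1e-6
r_plus, th_plus = to_polar(*curve(t0 + h))
r_minus, th_minus = to_polar(*curve(t0 - h))
dr_dt_fd = (r_plus - r_minus) / (2.0 * h)
dtheta_dt_fd = (th_plus - th_minus) / (2.0 * h)

print(dr_dt, dr_dt_fd)
print(dtheta_dt, dtheta_dt_fd)
```

The two pairs of numbers should agree closely, which is the numerical counterpart of the statement that vector components pick up the inverse Jacobian.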
 Just a followup that seems simple/dumb by comparison: do we need any additional conditions to get a complex version of the inverse function theorem? I am thinking of the corollary that the inverse image f^-1(a) is a submanifold. If we started out with a complex manifold M and the same conditions on the Jacobian of f, does it follow that the inverse image f^-1(w) (w in C, the complex numbers) is a complex submanifold? Thanks.
 The cases I deal with only required real coordinates. Since I don't have to deal with complex change of coordinates, I seldom take time to think about the modifications needed for complex systems, so I won't try to address this question for you. Sometimes the theorems need little modification to deal with the complex case, but there are sometimes subtleties you have to be careful of, and complex differential geometry is not my forte.
 One of the things I was trying to convey in my discussions above is that the chain rule tends to lie at the heart of differential geometry. One of the problems with matrix and tensor notation is that it obscures the chain rule. I realized that my discussion above may be difficult to follow because I jump between notations for describing vector fields and tensors. If you are not familiar with the different notations and how they are connected, my comments may be confusing. Understanding the different notations and their connection can help in understanding the concepts in differential geometry.

For example, a vector field is a basic object in differential geometry. A vector field indicates direction and magnitude at each point where the vector field is defined. One way to represent a vector field is as a set of first order ordinary differential equations (ODEs) such as the following: $$\begin{array}{c} \frac{dx_1}{dt} = f_1(X) \\ \frac{dx_2}{dt} = f_2(X) \\ \vdots \\ \frac{dx_n}{dt} = f_n(X) \end{array}$$ where $$X$$ represents dependence on coordinates $$(x_1 , x_2 , ..., x_n)$$. A vector field can also be represented as a column vector like so: $$\left( \begin{array}{c} f_1(X) \\ f_2(X) \\ \vdots \\ f_n(X) \end{array} \right)$$ Note that from the ODE definition, you can replace the column vector representation above with: $$\left( \begin{array}{c} \frac{dx_1}{dt} \\ \frac{dx_2}{dt} \\ \vdots \\ \frac{dx_n}{dt} \end{array} \right)$$ It is sometimes useful to write $\frac{dx_i}{dt}$ in place of $$f_i$$ because the chain rule becomes more explicit and it is easier to tell when you get transformations correct. But it can also be mysterious if you are not aware of the connection to vector fields and ODEs. You will notice that I wrote the column vector in this form in a previous post, assuming this concept was understood.

Another way to write a vector field is the tensor notation $$f_i$$, where expansion over i is understood. This can also be written as $\frac{dx_i}{dt}$, where the connection to the ODE form is understood. Yet another way is the vector field notation: $$V = f_1(X) \frac{\partial}{\partial x_1} + f_2(X) \frac{\partial}{\partial x_2} + ... + f_n(X) \frac{\partial}{\partial x_n} = \frac{dx_1}{dt} \frac{\partial}{\partial x_1} + \frac{dx_2}{dt} \frac{\partial}{\partial x_2} + ... + \frac{dx_n}{dt} \frac{\partial}{\partial x_n}$$ Note that the form after the second equal sign is particularly representative of the chain rule. The terms $\frac{\partial}{\partial x_i}$ represent coordinate vectors as described in a post above. However, when applying the chain rule, they act as standard partial derivatives. So if we want to transform from coordinate system X to coordinate system Y, we use the chain rule formula (summing over repeated indices): $$V = \frac{dx_i}{dt} \frac{\partial y_j}{\partial x_i} \frac {\partial}{\partial y_j}$$ The Jacobian is built into this transformation, but since the notation is natural for expressing the chain rule, it is not necessary to write out the Jacobian. It is also harder to get the transformation wrong in this format. Anyway, transformations typically reduce to using the chain rule, so understanding how to represent the operation in terms of the chain rule is quite useful.
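
A short worked example may help here. It is only an illustration under assumed choices: the rotation field $$V = -y \frac{\partial}{\partial x} + x \frac{\partial}{\partial y}$$ (that is, $$\frac{dx}{dt} = -y$$, $$\frac{dy}{dt} = x$$) and the polar coordinates $$r = \sqrt{x^2 + y^2}$$, $$\theta = \arctan(y/x)$$ as the new system Y. Applying the chain rule formula component by component:

$$\frac{dr}{dt} = \frac{\partial r}{\partial x}\frac{dx}{dt} + \frac{\partial r}{\partial y}\frac{dy}{dt} = \frac{x}{r}(-y) + \frac{y}{r}(x) = 0$$

$$\frac{d\theta}{dt} = \frac{\partial \theta}{\partial x}\frac{dx}{dt} + \frac{\partial \theta}{\partial y}\frac{dy}{dt} = \frac{-y}{r^2}(-y) + \frac{x}{r^2}(x) = \frac{x^2 + y^2}{r^2} = 1$$

$$V = \frac{dr}{dt}\frac{\partial}{\partial r} + \frac{d\theta}{dt}\frac{\partial}{\partial \theta} = \frac{\partial}{\partial \theta}$$

The partial derivatives $$\frac{\partial r}{\partial x}, \frac{\partial r}{\partial y}, \frac{\partial \theta}{\partial x}, \frac{\partial \theta}{\partial y}$$ that appear are exactly the entries of the inverse Jacobian of the map from (r, θ) to (x, y), even though that matrix was never written out.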
 Excellent, llarsen; would you suggest any particular way of going about understanding the chain rule?
 llarsen, I think I'm almost there. There is one more thing that started to confuse me at the Wikipedia page after I thought I understood everything. I believe the following is correct. For the change of coordinates (u,v,w) --> (x,y,z), with the Jacobian defined with terms like dx/du:

1. Vector components in the (u,v,w) basis can be found by using the inverse Jacobian to operate on the components of that vector in the (x,y,z) basis.
2. One can find the components of covectors in the (u,v,w) basis by multiplying the covector's components in the (x,y,z) basis by the Jacobian, i.e. f_new = f * J.
3. The basis vectors (d/du, d/dv, d/dw) are just the columns of the "forward" Jacobian, each entry with the appropriate coordinate vector appended (d/dx, d/dy, or d/dz).
4. If there were ever a reason to find (d/dx, d/dy, d/dz) in terms of (d/du, d/dv, d/dw), it would come from the columns of the inverse Jacobian.

Lastly, the Wikipedia page defines a matrix "A" which is the linear combination converting the basis vectors in the domain to basis vectors in the mapping's image. The roles of this "A" matrix and the Jacobian seem to be opposite and I suspect they are in fact inverses. Is this true? It's kind of hard for me to decipher because I am used to looking at a coordinate chart mapping (u,v,w) --> (x,y,z), and normally we have the coordinates of something in (x,y,z) and want to find the corresponding components of the objects in the domain of the mapping. Since two things are going on, I just ended up confusing myself.

 Quote by Bacle Excellent, llarsen; would you suggest any particular way of going about understanding the chain rule?
Bacle, do you think you have more trouble computing the chain rule or understanding what it means? Computing it for weird functional dependencies takes some practice as well.

Also, are you comfortable with thermodynamics and the total differential? I think sometimes a concrete example of a physical system using the chain rule can clarify things.