# Inverse function theorem for surface mappings

1. Jun 8, 2010

### 7thSon

Hi,

I have a limited background in differential geometry. I have a problem involving a surface mapping (from R2 to R3) which does not have a square Jacobian. I understand that for a mapping of preserved dimensionality I can compute a matrix inverse which will allow me to map tangent vector coordinates from image to domain or vice-versa.

i.e. S = (x(u,v), y(u,v), z(u,v))

I have come across some sites online that seem to cavalierly stick a column [0,0,1] and then invert, and/or utilize a matrix pseudoinverse to define an inverse for this mapping.

My question is what the heck are they doing and why? Do these pseudoinverse methods have any properties that make them useful? Perhaps my understanding is limited because I don't understand what the inverted tangent vectors mean in the domain space?

Sorry for my lack of background and thanks for any help :)

2. Jun 11, 2010

### zhentil

If the Jacobian has maximal rank, you can certainly define an inverse if you restrict to the range.

3. Jun 12, 2010

### 7thSon

Sorry, I don't understand what you mean by restricting its range. How would I do that? Are you saying I need to restrict consideration to a location where I know there is a one-to-one mapping?

And am I assured that by knowing that all of the corresponding 2x2 matrices are non-singular?

4. Jun 12, 2010

### zhentil

You're parametrizing a surface, so by definition, the Jacobian has maximal rank (2). Choose a basis of the tangent space of the image so that the first two basis elements are in the span of the image of the Jacobian. If we look just at this subspace, the Jacobian is surjective, hence invertible.
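One way to make "restrict to the image" concrete is to complete the two Jacobian columns to a basis of R3 with the normal vector and invert the resulting 3x3 matrix. The sketch below is only an illustration: the surface s(u, v) = (u, v, u^2 + v^2) and the helper names are assumptions, not something from the thread.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(p*q for p, q in zip(a, b))

def jacobian_columns(u, v):
    """Columns s_u, s_v of the Jacobian of the assumed surface (u, v, u^2 + v^2)."""
    return (1.0, 0.0, 2.0*u), (0.0, 1.0, 2.0*v)

def domain_components(vec, u, v):
    """Return (a, b) such that vec = a*s_u + b*s_v, for vec in the tangent plane.

    The 3x3 matrix [s_u s_v n] with n = s_u x s_v is invertible whenever the
    Jacobian has rank 2. The first two rows of its inverse are
    (s_v x n)/det and (n x s_u)/det.
    """
    s_u, s_v = jacobian_columns(u, v)
    n = cross(s_u, s_v)
    det = dot(s_u, cross(s_v, n))   # equals |n|^2, nonzero for rank 2
    row_a = [c / det for c in cross(s_v, n)]
    row_b = [c / det for c in cross(n, s_u)]
    return dot(row_a, vec), dot(row_b, vec)

# Build a tangent vector with known domain components (2, 3) and recover them.
s_u, s_v = jacobian_columns(0.5, -1.0)
vec = tuple(2.0*p + 3.0*q for p, q in zip(s_u, s_v))
print(domain_components(vec, 0.5, -1.0))
```

On tangent vectors this gives the same answer as the pseudoinverse, because both sets of rows are orthogonal to the normal and dual to s_u, s_v.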

5. Jun 13, 2010

### 7thSon

Thanks, I think I see what you are saying, and I think I can see why I am confused. Could you enlighten me?

Using the surface mapping
$$s(u,v) := (x(u,v), y(u,v), z(u,v))$$

I get a Jacobian that looks like
$$\left( \begin{array}{cc} x_u & x_v\\ y_u & y_v \\ z_u & z_v \end{array} \right)$$

Now, the vectors $$(x_u, y_u, z_u)$$ and $$(x_v, y_v, z_v)$$ are the columns of the Jacobian for this mapping, but they are themselves vectors in the image of this mapping, and form a basis for the tangent space of the image (is that correct?).

Then there is nothing to keep me from defining a 2 x 3 pseudoinverse matrix. If you restrict yourself to operating on linear combinations of $$(x_u, y_u, z_u)$$ and $$(x_v, y_v, z_v)$$, vectors in the image are mapped back to vectors in the domain of the mapping. My question, then: is there any special significance to the inverse mapping of the vectors that defined my Jacobian?
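For readers who want to try the 2 x 3 pseudoinverse numerically, here is a minimal sketch. It assumes the surface s(u, v) = (u, v, u^2 + v^2) purely as an example and uses the left pseudoinverse (J^T J)^{-1} J^T, which exists when J has rank 2; the function names are hypothetical.

```python
def jacobian(u, v):
    """3x2 Jacobian (as three rows) of the assumed surface (u, v, u^2 + v^2)."""
    return [[1.0, 0.0],
            [0.0, 1.0],
            [2.0*u, 2.0*v]]

def pseudoinverse(J):
    """Left pseudoinverse (J^T J)^{-1} J^T of a rank-2, 3x2 matrix."""
    # Entries of the symmetric 2x2 Gram matrix G = J^T J.
    g11 = sum(r[0]*r[0] for r in J)
    g12 = sum(r[0]*r[1] for r in J)
    g22 = sum(r[1]*r[1] for r in J)
    det = g11*g22 - g12*g12          # nonzero exactly when J has rank 2
    # G^{-1} = (1/det) [[g22, -g12], [-g12, g11]], then multiply by J^T.
    row0 = [( g22*r[0] - g12*r[1]) / det for r in J]
    row1 = [(-g12*r[0] + g11*r[1]) / det for r in J]
    return [row0, row1]

def apply(M, vec):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m*x for m, x in zip(row, vec)) for row in M]

J = jacobian(0.5, -1.0)
P = pseudoinverse(J)
s_u = [row[0] for row in J]   # first column of J
s_v = [row[1] for row in J]   # second column of J
print(apply(P, s_u), apply(P, s_v))
```

In this sketch the two Jacobian columns are sent to (1, 0) and (0, 1), the coordinate basis vectors of the (u, v) domain, and any vector normal to the surface is sent to zero.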

6. Jul 26, 2010

### llarsen

Yes. The vectors that make up the columns of the Jacobian make up a basis of the tangent space. However, there is more that we can say about this, and the appropriate concepts and notation make it clearer.

The mapping $$s(u,v) := (x(u,v), y(u,v), z(u,v))$$ defines a two dimensional surface in the three dimensional x,y,z space. The functions u and v are coordinates on this surface, and you can characterize the coordinates by drawing constant surfaces of u and v. There are a couple of concepts that are clearer if we do not use a 2D mapping, so I will add a coordinate w to the list.

The mapping $$s(u,v,w) := (x(u,v,w), y(u,v,w), z(u,v,w))$$ defines a new coordinate mapping of the x,y,z space. The functions u, v, and w are the new coordinates, and you can characterize the coordinates by drawing constant surfaces of u, v, and w. For example, the coordinate surfaces of u are drawn by holding u constant and drawing $$s(u=c,v,w)$$. We then increment u by $$\Delta u$$ and draw $$s(u=c + \Delta u, v, w)$$. We continue this process incrementing u by constant amounts and drawing $$s(u=c+n * \Delta u, v, w)$$. We then repeat the process by holding v constant rather than u. We repeat the process again for w as well. This is the typical way we characterize a coordinate system such as cylindrical or spherical coordinates.

When you define a set of coordinates, you also define a dual set of coordinate lines. These lines are parameterized lines, and the parameter for a coordinate line is the associated coordinate function. The coordinate lines are visible when the coordinate system is drawn. The coordinate line for u falls along intersecting constant surfaces of v and w, and as noted the parameterization of the coordinate line for u is defined by coordinate u. Similarly the coordinate line for v falls along the intersection of the constant surfaces of u and w and the coordinate line parameter is the coordinate function v. We intuitively recognize the coordinate lines when we look at some coordinate system that has been drawn, because these fall at the intersection of the coordinate surfaces and the parameterization is visible by the associated coordinate function. The coordinate lines are easy to calculate from $$s(u,v,w) := (x(u,v,w), y(u,v,w), z(u,v,w))$$. For the u coordinate lines, you simply hold v and w constant and plot $$s(u,v=c1,w=c2)$$. There is a separate coordinate line for each value of v and w. For the v coordinate lines, you hold u and w constant and plot $$s(u=c1, v, w=c2)$$. The same pattern is followed for w.

However, it is often easier to work with the differentials of the coordinate functions and the coordinate lines. The differentials of the coordinate functions are sometimes called differential forms, and sometimes covectors. Sometimes these are represented as row vectors, but since the chain rule is the rule for transformation in differential geometry, it is often more convenient to represent these in differential form notation $$du, dv, dw$$. Just as u, v, and w form a new coordinate basis, $$du, dv, dw$$ form an alternative basis to $$dx, dy, dz$$ for the covector space. This basis is a vector basis that characterizes the rate of change of functions (it can also characterize things that are not functions, but that will not be discussed here). The conversion from the old basis to the new basis is given by:

$$dx = \frac{\partial x}{\partial u} du + \frac{\partial x}{\partial v} dv + \frac{\partial x}{\partial w} dw = x_u du + x_v dv + x_w dw$$

$$dy = \frac{\partial y}{\partial u} du + \frac{\partial y}{\partial v} dv + \frac{\partial y}{\partial w} dw = y_u du + y_v dv + y_w dw$$

$$dz = \frac{\partial z}{\partial u} du + \frac{\partial z}{\partial v} dv + \frac{\partial z}{\partial w} dw = z_u du + z_v dv + z_w dw$$

Note that the notation is convenient since the chain rule (or rule for transformations) is naturally expressed in this form. The vector basis for u, v, and w can be represented by column vectors. However in differential geometry it is more convenient to use the notation $$\frac{\partial}{\partial u}, \frac{\partial}{\partial v}, \frac{\partial}{\partial w}$$. These can be calculated from $$s(u,v,w) := (x(u,v,w), y(u,v,w), z(u,v,w))$$, by the chain rule formula:

$$\frac{\partial}{\partial u} = \frac{\partial x}{\partial u} \frac{\partial}{\partial x} + \frac{\partial y}{\partial u} \frac{\partial}{\partial y} + \frac{\partial z}{\partial u} \frac{\partial}{\partial z} = x_u \frac{\partial}{\partial x} + y_u \frac{\partial}{\partial y} + z_u \frac{\partial}{\partial z}$$

$$\frac{\partial}{\partial v} = \frac{\partial x}{\partial v} \frac{\partial}{\partial x} + \frac{\partial y}{\partial v} \frac{\partial}{\partial y} + \frac{\partial z}{\partial v} \frac{\partial}{\partial z} = x_v \frac{\partial}{\partial x} + y_v \frac{\partial}{\partial y} + z_v \frac{\partial}{\partial z}$$

$$\frac{\partial}{\partial w} = \frac{\partial x}{\partial w} \frac{\partial}{\partial x} + \frac{\partial y}{\partial w} \frac{\partial}{\partial y} + \frac{\partial z}{\partial w} \frac{\partial}{\partial z} = x_w \frac{\partial}{\partial x} + y_w \frac{\partial}{\partial y} + z_w \frac{\partial}{\partial z}$$

The Jacobian for this system is given by:

$$\left( \begin{array}{ccc} x_u & x_v & x_w\\ y_u & y_v & y_w \\ z_u & z_v & z_w \end{array} \right)$$

Note that the columns of the Jacobian are not just new basis vectors, but they are the basis vectors $$\frac{\partial}{\partial u}, \frac{\partial}{\partial v}, \frac{\partial}{\partial w}$$, associated with the new coordinate system u, v, w. The rows of the Jacobian are dx, dy, and dz transformed to the new basis du, dv, dw. Multiplying a row vector (field) / covector (field) / one form by the Jacobian converts it from the dx, dy, dz basis to the du, dv, dw basis - it is just an expression of the chain rule. Multiplying the Jacobian by a vector (field) converts it from the $$\frac{\partial}{\partial u}, \frac{\partial}{\partial v}, \frac{\partial}{\partial w}$$ basis to the $$\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}$$ basis - again just an expression of the chain rule. The Jacobian is a useful tool when writing vectors and covectors in matrix notation. When writing things in the more modern notation that is natural for the chain rule, the transformation rule is typically more obvious and the Jacobian does not need to be represented explicitly.

Note that the coordinate vectors $$\frac{\partial}{\partial u}, \frac{\partial}{\partial v}, \frac{\partial}{\partial w}$$ are not unit vectors, which are usually represented by using an 'e' form like $$e_1 , e_2 , e_3$$. Unit vectors have length 1 with respect to a metric defined on the space. Coordinate vectors need no metric to be defined and are parameterized by the associated coordinate function.
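The claim that the Jacobian columns are the tangents of the coordinate lines can be checked numerically. The sketch below assumes cylindrical coordinates, with (u, v, w) standing for (r, theta, z), as the example map; the function names are made up for illustration. It differentiates s along each coordinate line by central differences and compares the result with the analytic column.

```python
import math

def s(u, v, w):
    """Assumed example map: cylindrical (r, theta, z) -> Cartesian (x, y, z)."""
    return (u*math.cos(v), u*math.sin(v), w)

def jacobian_columns(u, v, w):
    """Analytic columns d/du, d/dv, d/dw of the Jacobian of s."""
    d_du = (math.cos(v), math.sin(v), 0.0)
    d_dv = (-u*math.sin(v), u*math.cos(v), 0.0)
    d_dw = (0.0, 0.0, 1.0)
    return d_du, d_dv, d_dw

def coordinate_line_tangent(u, v, w, index, h=1e-6):
    """Central-difference derivative of s along one coordinate line.

    index selects which coordinate varies (0: u, 1: v, 2: w);
    the other two are held constant.
    """
    hi = [u, v, w]
    lo = [u, v, w]
    hi[index] += h
    lo[index] -= h
    return tuple((a - b) / (2*h) for a, b in zip(s(*hi), s(*lo)))

def norm(vec):
    """Euclidean length; this is where a metric enters."""
    return math.sqrt(sum(c*c for c in vec))

point = (2.0, 0.7, 1.0)
for i, col in enumerate(jacobian_columns(*point)):
    print(i, col, coordinate_line_tangent(*point, i), norm(col))
```

At u = 2 the v column has Euclidean length 2 rather than 1, which illustrates that coordinate vectors are generally not unit vectors.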

Last edited: Jul 26, 2010
7. Jul 28, 2010

### 7thSon

llarsen, thank you for your amazing post. After reviewing I have a couple of questions, and I entreat you to be brief because you've already been much more thorough than I could have hoped.

1. I assume that (d/dx, d/dy, d/dz) are just the same old vector basis (e1,e2,e3) in Euclidean R3, right? This is confusing to me because (dx,dy,dz) are the differentials along coordinate curves (as you describe), so which ones transform covariantly and which contravariantly? The (d/dx, d/dy, d/dz) is an object built along coordinate lines and so those basis vectors transform covariantly, correct?

2. I think this next question gets to my lack of ability to always distinguish between vectors and components of vectors.

Let's say I wanted to transform a vector from basis (d/dx, d/dy, d/dz) to (d/du, d/dv, d/dw). If I multiply your 3x3 Jacobian by a 3x1 column vector representing the components of that vector in R3, the matrix multiplication gives me a transformation that looks like a one-form (i.e. dx/du, dx/dv, dx/dw) instead of one with (dx/du, dy/du, dz/du), yet I know that the Jacobian operating on a column vector is what maps a tangent vector in a coordinate transformation.

Is this due to the fact that I am operating on components of the vector, which transform like (are?) covectors? And hence, if I were to map a basis vector that transforms covariantly, I would premultiply it as a row vector to get the appropriate transformation?

Thanks for any help! Feel free to point me to useful references online but I've failed to find books/etc that have helped me get to the concepts I'm having trouble with.

8. Jul 28, 2010

### llarsen

In my experience, the $$e_i$$ notation is used to indicate unit vectors. To define a unit vector you need to define a metric and use the metric to scale the basis vectors at each point in space such that the basis vectors are 1 unit long. It just so happens that in Euclidean space, the basis vectors (d/dx, d/dy, d/dz) are one unit long at each point in space. When using polar coordinates in a 2D space, d/dr is the same as $$e_r$$, but $\frac{\partial}{\partial \theta}$ is not the same as $$e_\theta$$ since the length of the $\frac{\partial}{\partial \theta}$ vector is defined by the rate of change of $$\theta$$ (or distance between constant surfaces of $$\theta$$) along the direction of constant r at the point where the basis vector is being calculated. Coordinate vectors are often more convenient than unit vectors to work with in differential geometry, and do not require the definition of a metric.
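As a concrete check of the polar-coordinate remark (a worked sketch, assuming the usual parametrization $$x = r\cos\theta,\ y = r\sin\theta$$ and the Euclidean metric, neither of which is spelled out above), the coordinate vectors and their lengths are:

$$\frac{\partial}{\partial r} = \cos\theta \frac{\partial}{\partial x} + \sin\theta \frac{\partial}{\partial y}, \qquad \left| \frac{\partial}{\partial r} \right| = \sqrt{\cos^2\theta + \sin^2\theta} = 1$$

$$\frac{\partial}{\partial \theta} = -r\sin\theta \frac{\partial}{\partial x} + r\cos\theta \frac{\partial}{\partial y}, \qquad \left| \frac{\partial}{\partial \theta} \right| = \sqrt{r^2\sin^2\theta + r^2\cos^2\theta} = r$$

So under these assumptions $$e_r = \frac{\partial}{\partial r}$$, while $$e_\theta = \frac{1}{r} \frac{\partial}{\partial \theta}$$: the two agree only on the circle r = 1.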

Change of coordinates is common in differential geometry, and the terms 'Co' and 'Contra' are defined relative to how an object transforms in relation to coordinate functions, or more correctly, in relation to the differentials of coordinate functions. Since (exact) one forms, or covectors, or dual vectors, arise from differentials of functions, they transform in the same way as coordinate bases and are called "co"variant. Vectors transform using the inverse Jacobian and are "contra"variant. See the Wikipedia article http://en.wikipedia.org/wiki/Covariance_and_contravariance_of_vectors for more details.

The differentials of functions transform covariantly. Vector terms like (d/dx, d/dy, d/dz) come from the differential of a curve in space (not a function) and transform contravariantly.

First it is worth noting that when you transform from (d/dx, d/dy, d/dz) to (d/du, d/dv, d/dw) by multiplying a vector (dx/dt, dy/dt, dz/dt) transposed by the jacobian, you don't get (dx/du, dy/du, dz/du). What you get is the following column vector:

$$\left( \begin{array}{c} \frac{dx}{dt} x_u + \frac{dy}{dt} x_v + \frac{dz}{dt} x_w\\ \frac{dx}{dt} y_u + \frac{dy}{dt} y_v + \frac{dz}{dt} y_w\\ \frac{dx}{dt} z_u + \frac{dy}{dt} z_v + \frac{dz}{dt} z_w\end{array} \right)$$

Vectors and covectors are geometric objects that come in a package. You don't operate on each component one at a time. I should note that I prefer writing vector fields using the notation $V = f_1 \frac{\partial}{\partial x} + f_2 \frac{\partial}{\partial y} + f_3 \frac{\partial}{\partial z}$ rather than using matrix notation, because the chain rule transformation becomes much more transparent.

Components of vectors are scalars and transform like scalars (you do a substitution of variables). You use the jacobian on the dx, dy, dz terms, or the inverse jacobian on the d/dx, d/dy, d/dz terms. When you are expressing covectors using a1 dx + a2 dy + a3 dz and vectors using b1 d/dx + b2 d/dy + b3 d/dz, the transformation is just the chain rule, which is the same as using the jacobian or inverse jacobian on a vector or covector represented in matrix form.

There is a pdf by some EE professors at BYU which gives one of the best graphical presentations of differential forms I have found. The pdf is http://eceformsweb.groups.et.byu.net/ftext.pdf. It gives a good graphical intuition into differential forms, but doesn't cover vector fields, or the relationship between vector fields and differential forms. However, it is a good start.

There are a few books from William Burke that I found useful. His books are great for visualizing concepts from differential geometry, but not so good if you are interested in rigor. A website for him is found at http://www.ucolick.org/~burke/home.html. He has a lot of great insights and is worth looking at.

"The Geometry of Physics" by Theodore Frankel is about the level of rigor I like. It is reasonably approachable, but has fewer graphical depictions of concepts than I would like. Covers vectors and differential forms pretty well.

Last edited by a moderator: May 4, 2017
9. Jul 28, 2010

### llarsen

Is 7thSon an Orson Scott Card reference?

10. Jul 28, 2010

### 7thSon

It is in fact :). Although I didn't know it at the time I adopted the screen name... I bought the Iron Maiden album and then found out it was a literary reference.

Thanks a million for your help again, I will look through your response later today

11. Jul 29, 2010

### llarsen

I made a mistake in my post above. I claimed that when you transform from (d/dx, d/dy, d/dz) to (d/du, d/dv, d/dw) by multiplying the vector (dx/dt, dy/dt, dz/dt) transposed by the jacobian, you get:

$$\left( \begin{array}{c} \frac{dx}{dt} x_u + \frac{dy}{dt} x_v + \frac{dz}{dt} x_w\\ \frac{dx}{dt} y_u + \frac{dy}{dt} y_v + \frac{dz}{dt} y_w\\ \frac{dx}{dt} z_u + \frac{dy}{dt} z_v + \frac{dz}{dt} z_w\end{array} \right)$$

This isn't correct, and you can tell this since it doesn't agree with the chain rule. For example, examine the first term in the vector:

$$\frac{dx}{dt} x_u + \frac{dy}{dt} x_v + \frac{dz}{dt} x_w = \frac{dx}{dt} \frac{\partial x}{\partial u} + \frac{dy}{dt}\frac{\partial x}{\partial v} + \frac{dz}{dt} \frac{\partial x}{\partial w}$$

The chain rule suggests that for the first component we should really have:

$$\frac{dx}{dt} u_x + \frac{dy}{dt} u_y + \frac{dz}{dt} u_z = \frac{dx}{dt} \frac{\partial u}{\partial x} + \frac{dy}{dt}\frac{\partial u}{\partial y} + \frac{dz}{dt} \frac{\partial u}{\partial z}$$

What this is telling us is that we really need the inverse jacobian to transform from (d/dx, d/dy, d/dz) to (d/du, d/dv, d/dw):

$$\left( \begin{array}{ccc} u_x & u_y & u_z\\ v_x & v_y & v_z \\ w_x & w_y & w_z \end{array} \right)$$

The jacobian is used to transform from (d/du, d/dv, d/dw) to (d/dx, d/dy, d/dz).
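The corrected rule can be sanity-checked on a small case. The following sketch is a 2D analogue using polar coordinates in place of (u, v, w); the curve x = t, y = t^2 and all function names are assumptions chosen for illustration. It applies the inverse Jacobian (entries such as r_x and theta_y) to the Cartesian velocity and compares the result with dr/dt and dtheta/dt estimated directly by finite differences.

```python
import math

def inverse_jacobian_polar(x, y):
    """Rows (r_x, r_y) and (theta_x, theta_y) for r = sqrt(x^2 + y^2), theta = atan2(y, x)."""
    r2 = x*x + y*y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def to_polar_components(x, y, dxdt, dydt):
    """Apply the inverse Jacobian to Cartesian components (dx/dt, dy/dt)."""
    Jinv = inverse_jacobian_polar(x, y)
    return (Jinv[0][0]*dxdt + Jinv[0][1]*dydt,
            Jinv[1][0]*dxdt + Jinv[1][1]*dydt)

def polar_of_curve(t):
    """(r(t), theta(t)) along the assumed curve x = t, y = t^2."""
    x, y = t, t*t
    return math.hypot(x, y), math.atan2(y, x)

t, h = 1.5, 1e-6
x, y = t, t*t
drdt, dthetadt = to_polar_components(x, y, 1.0, 2.0*t)

# Independent estimate: differentiate r(t) and theta(t) numerically.
r_hi, th_hi = polar_of_curve(t + h)
r_lo, th_lo = polar_of_curve(t - h)
print(drdt, (r_hi - r_lo) / (2*h))
print(dthetadt, (th_hi - th_lo) / (2*h))
```

The two printed pairs should agree to within the finite-difference error, matching the chain-rule form dx/dt u_x + dy/dt u_y for the first component.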

12. Jul 29, 2010

### Bacle

Just a followup that seems simple/dumb by comparison:

Do we need any additional conditions to get a complex version of the
inverse function theorem? I am thinking of the corollary that the inverse
image f^-1(a) is a submanifold. If we started out with a complex manifold
M and the same conditions on the Jacobian of f, does it follow that the
inverse image f^-1(w) (w in C, the complex numbers) is a complex submanifold?

Thanks.

13. Jul 30, 2010

### llarsen

The cases I deal with only required real coordinates. Since I don't have to deal with complex change of coordinates, I seldom take time to think about the modifications needed for complex systems, so I won't try to address this question for you. Sometimes the theorems need little modification to deal with the complex case, but there are sometimes subtleties you have to be careful of, and complex differential geometry is not my forte.

14. Jul 30, 2010

### llarsen

One of the things I was trying to convey in my discussions above is that the chain rule tends to lie at the heart of differential geometry. One of the problems with matrix and tensor notation is that it obscures the chain rule. I realized that my discussion above may be difficult to follow because I jump between notations for describing vector fields and tensors. If you are not familiar with the different notations and how they are connected my comments may be confusing. Understanding the different notations and their connection can help in understanding the concepts in differential geometry.

For example, a vector field is a basic object in differential geometry. A vector field indicates direction and magnitude at each point where the vector field is defined. One way to represent a vector field is as a set of first order ordinary differential equations (ODEs) such as the following:

$$\begin{array}{ccc} \frac{dx_1}{dt} = f_1(X) \\ \frac{dx_2}{dt} = f_2(X) \\ ... \\ \frac{dx_n}{dt} = f_n(X) \end{array}$$

where $$X$$ represents dependence on coordinates $$(x_1 , x_2 , ..., x_n)$$. A vector field can also be represented as a column vector like so:

$$\left( \begin{array}{c} f_1(X) \\ f_2(X) \\ ... \\ f_n(X) \end{array} \right)$$

Note that from the ODE definition, you can replace the column vector representation above with:

$$\left( \begin{array}{c} \frac{dx_1}{dt} \\ \frac{dx_2}{dt} \\ ... \\ \frac{dx_n}{dt} \end{array} \right)$$

It is useful to write $\frac{dx_i}{dt}$ in place of $$f_i$$ sometimes because the chain rule becomes more explicit and it is easier to tell when you get transformations correct. But it can also be mysterious if you are not aware of the connection to vector fields and ODEs. You will notice that I wrote the column vector in this form in a previous post assuming this concept was understood. Another form to write a vector field in is in the tensor notation $$f_i$$ where expansion of i is understood. This can also be written as $\frac{dx_i}{dt}$ where the connection to the ODE form is understood. Another form to write a vector field in is the vector field notation:

$$V = f_1(X) \frac{\partial}{\partial x_1} + f_2(X) \frac{\partial}{\partial x_2} + ... + f_n(X) \frac{\partial}{\partial x_n} = \frac{dx_1}{dt} \frac{\partial}{\partial x_1} + \frac{dx_2}{dt} \frac{\partial}{\partial x_2} + ... + \frac{dx_n}{dt} \frac{\partial}{\partial x_n}$$

Note that the form after the second equal sign is particularly representative of the chain rule. The terms $\frac{\partial}{\partial x_i}$ represent coordinate vectors as described in a post above. However, when applying the chain rule, they act as standard partial derivatives. So if we want to transform from coordinate system X to coordinate system Y, we use the chain rule formula:

$$V = \frac{dx_i}{dt} \frac{\partial y_j}{\partial x_i} \frac {\partial}{\partial y_j}$$

The jacobian is built into this transformation, but since the notation is natural for expressing the chain rule, it is not necessary to write out the jacobian. It is also harder to get the transformation wrong in this format. Anyway, transformations typically reduce down to using the chain rule, so understanding how to represent the operation in terms of the chain rule is quite useful.
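A short worked example of this formula, with the field and coordinates chosen here purely for illustration: take $$X = (x, y)$$, $$Y = (r, \theta)$$ with $$x = r\cos\theta,\ y = r\sin\theta$$, and the rotation field $$V = -y \frac{\partial}{\partial x} + x \frac{\partial}{\partial y}$$, so that $$\frac{dx}{dt} = -y$$ and $$\frac{dy}{dt} = x$$. The partial derivatives needed are

$$\frac{\partial r}{\partial x} = \frac{x}{r}, \quad \frac{\partial r}{\partial y} = \frac{y}{r}, \quad \frac{\partial \theta}{\partial x} = -\frac{y}{r^2}, \quad \frac{\partial \theta}{\partial y} = \frac{x}{r^2}$$

Substituting into the chain rule formula gives

$$V = \left( -y \frac{x}{r} + x \frac{y}{r} \right) \frac{\partial}{\partial r} + \left( -y \left( -\frac{y}{r^2} \right) + x \frac{x}{r^2} \right) \frac{\partial}{\partial \theta} = 0 \cdot \frac{\partial}{\partial r} + \frac{x^2 + y^2}{r^2} \frac{\partial}{\partial \theta} = \frac{\partial}{\partial \theta}$$

so in this example the field is simply the coordinate vector field of $$\theta$$, and no matrix had to be written down.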

Last edited: Jul 30, 2010
15. Jul 30, 2010

### Bacle

Excellent, llarsen; would you suggest any particular way of going about understanding
the chain rule?

16. Jul 31, 2010

### 7thSon

llarsen, I think I'm almost there. There is one more thing that started to confuse me at the wikipedia page after I thought I understood everything. I believe the following is correct.

For the change of coordinates (u,v,w) --> (x,y,z), with Jacobian defined with terms like dx/du:

1. Vector components in (u,v,w) basis can be found by using the inverse Jacobian to operate on the components of that vector in (x,y,z) basis

2. One can find components of covectors in (u,v,w) basis by multiplying the covector's components in (x,y,z) basis by Jacobian i.e. f_new = f * J

3. The basis vectors (d/du, d/dv, d/dw) are just the columns in the "forward" Jacobian, each with the appropriate coordinate vector appended (d/dx, d/dy, or d/dz)

4. If there was ever a reason to find (d/dx, d/dy, or d/dz) in terms of (d/du, d/dv, d/dw), it would come from the columns of the inverse Jacobian.

Lastly, the wikipedia page defines a matrix "A" which is the linear combination converting the basis vectors in the domain to basis vectors in the mapping's image. The rules of this "A" matrix and the Jacobian seem to be opposite and I suspect they are in fact inverses, is this true?

It's kind of hard for me to decipher because I am used to looking at a coordinate chart mapping (u,v,w) ---> (x,y,z), and normally we have the coordinates of something in (x,y,z) and want to find the corresponding components of the objects in the domain of the mapping. Since two things are going on, I just ended up confusing myself.
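Statement 2 above (f_new = f * J, with J built from terms like dx/du) can be tried on a small case. This sketch assumes polar coordinates in 2D and the function f = x^2 + y^2, both chosen only for illustration; in polar coordinates f = r^2, so the components of df should come out as (2r, 0).

```python
import math

def jacobian_polar(r, theta):
    """Forward Jacobian d(x, y)/d(r, theta) for x = r cos(theta), y = r sin(theta)."""
    return [[math.cos(theta), -r*math.sin(theta)],
            [math.sin(theta),  r*math.cos(theta)]]

def row_times_matrix(row, M):
    """Row vector times matrix, i.e. the covector rule f_new = f * J."""
    return [sum(row[i]*M[i][j] for i in range(len(row)))
            for j in range(len(M[0]))]

def df_cartesian(x, y):
    """Components (f_x, f_y) of df for the assumed function f = x^2 + y^2."""
    return [2.0*x, 2.0*y]

r, theta = 2.0, 0.7
x, y = r*math.cos(theta), r*math.sin(theta)
df_polar = row_times_matrix(df_cartesian(x, y), jacobian_polar(r, theta))
print(df_polar)   # components (f_r, f_theta), expected to be close to (2r, 0)
```

Note that the forward Jacobian multiplies the covector from the right and moves it from the (x, y) basis to the (r, theta) basis, the opposite direction from how it acts on vector components.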

17. Jul 31, 2010

### 7thSon

Bacle, do you think you have more trouble computing the chain rule or understanding what it means? Computing it for weird functional dependencies takes some practice as well.

Also, are you comfortable with thermodynamics and the total differential? I think sometimes a concrete example of a physical system using the chain rule can clarify things.

18. Aug 2, 2010

### llarsen

Almost

The statements you make below are mostly right if the Jacobian is defined with terms like du/dx instead of dx/du.

Let's use the coordinate system $$(u_1,u_2,u_3) \rightarrow (x_1,x_2,x_3)$$ since this will allow me to write the Jacobian in the compact index form $\frac{\partial u_i}{\partial x_j}$ or equivalently $$u_{i,x_j}$$.

If the Jacobian is given by $$u_{i,x_j}$$, then this is correct. You use the inverse Jacobian $$x_{i,u_j}$$ to convert vector fields from the basis $\frac{\partial}{\partial x_i}$ to the basis $\frac{\partial}{\partial u_j}$.

Again, if the Jacobian is given by $$u_{i,x_j}$$ then this is correct. You use the Jacobian to transform from the $$dx_j$$ basis to the $$du_i$$ basis. Note that you would usually write this as df (df_new = df * J) rather than f because df is the typical notation used for the covector associated with f.

If the Jacobian is given by $$u_{i,x_j}$$ then this is correct.

If the inverse Jacobian is given by $$u_{i,x_j}$$ then this is correct. You can also multiply a vector field represented in terms of $\frac{\partial}{\partial x_i}$ by the inverse Jacobian to represent the vector in terms of the $\frac{\partial}{\partial u_j}$ basis.

The matrix A can be a map that converts basis vectors in the domain to basis vectors in the image without being the inverse Jacobian. If it is the inverse Jacobian, then multiplying this by the Jacobian should yield the identity matrix, otherwise it is not. For example, you can have a matrix A that maps a space back onto itself using the same coordinates. If the map is not the identity matrix then you will be modifying any vector field multiplied by A (scaling and rotating vectors at each point). The Jacobian and inverse Jacobian (i.e. the chain rule) give you back the same vector field as the original, just represented in terms of a new set of coordinates. Matrices other than the Jacobian modify the vector field in some way, and usually have some interpretation defined by the context.

19. Aug 2, 2010

### 7thSon

Ok, it's good to know I am on the right track but need to invert my thinking. My problem is that in most cases the mapping appears like

(u1,u2,u3) ---- > (x1,x2, x3), yet the x's are explicit functions of the u's, not the other way around.

Suppose we considered spherical coordinates (r, theta, phi) ----> (x,y,z), or the mapping of a surface like s(u1, u2) := ( x(u1,u2), y(u1, u2), z(u1,u2) ).
We have x = r*cos(theta)*sin(phi), etc.

It is natural for us to take dx/dr and so that's what we end up doing (to avoid doing weird implicit differentiation, reparameterizing, etc.)

In each case, our Jacobian looks like dx/dr, not dr/dx. Yet, it is my understanding that the "forward" Jacobian written this way (dx/dr) maps components of vectors in the (r,theta,phi) basis to components of that vector in the (x,y,z) basis, i.e. from the first "parameter space" listed in a mapping to the image of the mapping.

Likewise, if I started with the components of a vector in the (x,y,z) basis, I could find the components of that vector in the (r,theta,phi) basis using the inverse jacobian of dx/dr, which is why I stated #1:

"1. Vectors components in (u,v,w) basis can be found by using the inverse Jacobian to operate on the components of that vector in (x,y,z) basis" ... i.e. the natural way to express the derivatives is dx/dr, and I must take the inverse of this Jacobian (dr/dx) to find the components.

Is it possible that my idea of the directionality of mapping is wrong, or is it something else?

20. Aug 2, 2010

### llarsen

I think my statements above are partially incorrect, which may have led to a little confusion (sorry). I was only paying attention to the form of the Jacobian that you needed to get your transformations correct. I chose $\frac{\partial u_i}{\partial x_j}$ as the Jacobian without really considering the mapping you specified. This is really the forward Jacobian for $$(x_1,x_2,x_3) \rightarrow (u_1,u_2,u_3)$$. So you had the definition of the forward Jacobian correct (i.e. $\frac{\partial x_i}{\partial u_j}$), but had inverted whether you actually needed the Jacobian or the inverse Jacobian to perform each of the transformations you mentioned.

Last edited: Aug 2, 2010
21. Aug 2, 2010

### 7thSon

Ok but I still don't understand why my example is inverted then (sorry to be difficult, I'm just so close I want to make sure I got it).

If I am mapping a vector's coordinates in spherical basis (r,theta,phi) to its coordinates in (x,y,z), my map is like

(r,theta,phi) ---> (x,y,z)

my coordinate transformations look like
x = r*cos(theta)*sin(phi)

so my "forward Jacobian" is dx/dr, and the forward Jacobian defined this way maps coordinates of vectors in SPHERICAL to coordinates of vectors in CARTESIAN.

Inversely, my "inverse Jacobian" dr/dx maps coordinates of vectors in CARTESIAN (from the image space of my mapping) to SPHERICAL coordinates (from the domain of my mapping).

Is that correct or am I still crossed up?

22. Aug 2, 2010

### llarsen

I don't use the Jacobian notation often because I tend to make mistakes more easily. And this happens to be one of those cases. I wrote an example out and you are correct. I was the one that was using the wrong transformation. Anyway, it seems like you are understanding how the process works.
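The direction of each map agreed on above can be confirmed with a short numerical sketch. It uses the spherical convention quoted in the thread (x = r cos(theta) sin(phi)), and fills in y = r sin(theta) sin(phi), z = r cos(phi) as the assumed remaining components; the helper names are hypothetical. The forward Jacobian takes spherical components to Cartesian ones, and solving with the same matrix (equivalent to applying the inverse Jacobian) takes them back.

```python
import math

def jacobian_spherical(r, theta, phi):
    """Forward Jacobian d(x, y, z)/d(r, theta, phi) for
    x = r cos(theta) sin(phi), y = r sin(theta) sin(phi), z = r cos(phi)."""
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    return [[ct*sp, -r*st*sp, r*ct*cp],
            [st*sp,  r*ct*sp, r*st*cp],
            [cp,     0.0,     -r*sp]]

def matvec(M, v):
    return [sum(m*x for m, x in zip(row, v)) for row in M]

def det3(A):
    return (A[0][0]*(A[1][1]*A[2][2] - A[1][2]*A[2][1])
          - A[0][1]*(A[1][0]*A[2][2] - A[1][2]*A[2][0])
          + A[0][2]*(A[1][0]*A[2][1] - A[1][1]*A[2][0]))

def solve3(M, b):
    """Solve M c = b by Cramer's rule (adequate for a 3x3 illustration)."""
    d = det3(M)
    out = []
    for k in range(3):
        Mk = [row[:] for row in M]
        for i in range(3):
            Mk[i][k] = b[i]      # replace column k with the right-hand side
        out.append(det3(Mk) / d)
    return out

J = jacobian_spherical(2.0, 0.4, 1.1)
spherical = [0.3, -1.2, 0.5]          # components in the (r, theta, phi) basis
cartesian = matvec(J, spherical)      # forward Jacobian: spherical -> Cartesian
recovered = solve3(J, cartesian)      # inverse Jacobian: Cartesian -> spherical
print(cartesian, recovered)
```

As a quick plausibility check, the spherical components (1, 0, 0) map to the unit vector pointing radially outward, as expected for d/dr.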

23. Aug 2, 2010

### Bacle

7th Son Wrote:

" Bacle, do you think you have more trouble computing the chain rule or understanding what it means? Computing it for weird functional dependencies takes some practice as well.

Also, are you comfortable with thermodynamics and the total differential? I think sometimes a concrete example of a physical system using the chain rule can clarify things. "

Thanks. Sorry If I am interrupting your exchanges with llarsen. I can do a separate
post, if you prefer. Let me know.

I think I understand the total differential. As a mathematician (wish I knew
some physics beyond basic undergrad level, but I don't.), I understand the differential of a map between R^m and R^n , as the linear map that approximates (in a precise delta-epsilon sense) the change in the value of a function.
I am not too clear on its meaning in a map between manifolds, tho.

As an example, 2xdx is the differential for f(x)=x^2 , in that the linear map
f(x)=2x approximates the change x_1^2-x_0^2 , and so on. So, say, near
x=3, the change in values of x^2 is approximately 6dx, where dx=x_1-x_0 as
above.

I think I understand the Jacobian to give a linear approximation of a map between
manifolds (both embedded in R^n) , in that the rows/columns give a basis for the
tangent space, and this tangent space at a point x in M, is/gives us, a linear
approximation of the value of f:M-->N, at f(x), where M,N are manifolds, tho
I am somewhat fuzzy on the details.

It would be great if you could suggest the example from physics; it may take me
a bit to understand the physics part, but it would still be helpful if you could
refer me to the example and/or comment on my description ( in a new thread, if you prefer).

24. Aug 3, 2010

### llarsen

Bacle,

I don't know exactly what you do and don't understand about the chain rule. But let me try to explain how it fits into differential geometry. This may seem pretty basic, but sometimes the basic geometric concepts are not explained clearly in mathematical texts, and they can be pretty useful in understanding what you are working with.

Suppose we have a function f written in terms of coordinates $$x_1 , x_2 , .., x_n$$ and we want to represent this in terms of a new coordinate system $$y_1, y_2, ..., y_n$$. If we know the X coordinates as functions of Y - $$x_1 (Y), x_2 (Y), ..., x_n (Y)$$, then for functions (or scalar quantities) defined in our space, the transformation of f to the new coordinate system is simply a matter of substitution $$f(x_1 (Y), x_2 (Y), ... x_n (Y))$$ - everywhere where we see an x we replace it with x as a function of y. Conversely, if we had a function defined in terms of y, and wanted to represent this in terms of x we would need the inverse transformation $$y_1 (X), y_2 (X), ..., y_n (X)$$, and again the transformation would be a simple matter of substitution. The chain rule does not play a part in scalar transformations.

However, mathematical models are often written in terms of differential quantities. Whenever we defined a coordinate system such as $$x_1 , x_2 , ... , x_n$$ for a continuous space, we also define a covector basis which is found by taking the differential of the coordinate functions (i.e. $$(dx_1 , dx_2 , ..., dx_n)$$). The differentials are characterized by the local constant surfaces of the coordinates, and indicate the local rate of change of the coordinate functions. Taking the differential of any function gives a related covector characterized by the constant surfaces of the function. The covector tells the direction and rate of change of the function locally.

When you define a set of coordinate functions, you also define a set of associated parameterized coordinate lines. For a coordinate $$x_i$$, the coordinate lines are defined along the intersection of the constant surfaces of the complementary coordinates $$x_j , j \neq i$$. The coordinate line is parameterized by the associated coordinate $$x_i$$. By taking the derivative of this line, we define a vector basis $$\frac{\partial}{\partial x_i}$$ that is dual to the covector basis $$dx_i$$. For example, in polar coordinates, the coordinate line for $$\theta$$ is defined by constant surfaces of the coordinate r (or by circles, in other words). The line is parameterized by $$\theta$$.
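To make the polar example concrete, here is a short worked check, assuming the usual relations $$x = r \cos\theta , y = r \sin\theta$$. The $$\theta$$ coordinate line at fixed r is the curve $$\theta \mapsto (r \cos\theta , r \sin\theta)$$, and differentiating it with respect to $$\theta$$ gives the basis vector

$$\frac{\partial}{\partial \theta} = -r \sin\theta \frac{\partial}{\partial x} + r \cos\theta \frac{\partial}{\partial y}$$

The covectors obtained from the coordinate functions are

$$dr = \cos\theta \, dx + \sin\theta \, dy , \qquad d\theta = -\frac{\sin\theta}{r} dx + \frac{\cos\theta}{r} dy$$

Pairing them with the vector above shows the duality:

$$d\theta \left( \frac{\partial}{\partial \theta} \right) = \sin^2\theta + \cos^2\theta = 1 , \qquad dr \left( \frac{\partial}{\partial \theta} \right) = -r \sin\theta \cos\theta + r \sin\theta \cos\theta = 0$$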

While the chain rule is not used to transform scalar quantities, it is used to transform differential quantities that are made up of vectors and covectors. Vectors and covectors form the basis for tensors, with vectors and covectors being order 1 tensors, and matrices being order 2 tensors. When index notation is used, the order of the tensor is indicated by the number of indices used. When representing tensors in index form, the chain rule operators are the Jacobian and inverse Jacobian. The previous posts in this thread discussed how the Jacobian is used to map vectors and covectors.

Note that the columns of the Jacobian are the vector basis of the new coordinate system written in terms of the old coordinate system. The Jacobian can be used to map vector fields from the new coordinate system back to the old one. Locally the Jacobian is just performing a dot product of a vector represented in the new coordinate system onto the basis of the old coordinate system. This is exactly what is typically done in linear algebra. However, we are working with a differential basis rather than macro vectors, and the basis typically changes from point to point. But the Jacobian is just a projection onto the old basis. For covectors, it is a projection from the old basis into the new basis, but the same idea applies.
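A minimal sketch of this, again assuming polar coordinates (r, theta) as the new system and Cartesian (x, y) as the old one. The helper names are made up for illustration. The Jacobian has entries dx_i/dy_j, so its columns are the polar basis vectors written in Cartesian components.

```python
import math


def jacobian_polar(r, theta):
    """Jacobian d(x, y)/d(r, theta); the columns are d/dr and d/dtheta in Cartesian components."""
    return [
        [math.cos(theta), -r * math.sin(theta)],
        [math.sin(theta), r * math.cos(theta)],
    ]


def to_cartesian(J, v_polar):
    """Map vector components from the new (polar) basis to the old (Cartesian) basis."""
    return [sum(J[i][j] * v_polar[j] for j in range(2)) for i in range(2)]


# At the point r = 2, theta = pi/2 (i.e. x = 0, y = 2), take the basis vector d/dtheta,
# which has polar components (0, 1).
J = jacobian_polar(2.0, math.pi / 2)
v_cart = to_cartesian(J, [0.0, 1.0])
print(v_cart)
```

The result is the second column of the Jacobian: a vector of length r = 2 pointing in the negative x direction, tangent to the circle at that point. Its length depends on r, which illustrates that the basis changes from point to point.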

Higher order tensors are created by using a tensor product, or exterior product. When you tensor multiply a covector $$df = a_1 dx_1 + a_2 dx_2 + ...$$ and another covector $$dg = b_1 dx_1 + b_2 dx_2 + ...$$, you get a second order tensor:

$$df \otimes dg = (a_1 dx_1 + a_2 dx_2 + ...) \otimes (b_1 dx_1 + b_2 dx_2 + ...) = a_1 b_1 dx_1 \otimes dx_1 + a_1 b_2 dx_1 \otimes dx_2 + a_2 b_1 dx_2 \otimes dx_1 + a_2 b_2 dx_2 \otimes dx_2 + ...$$

The tensor product is not commutative, and the result can be written in index and matrix notation as:

$$df_i dg_j = A_{ij} = \left( \begin{array}{ccc} a_1 b_1 & a_1 b_2 & ... \\ a_2 b_1 & a_2 b_2 & ... \\ ... & ... & ... \end{array} \right)$$
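A minimal sketch of the component matrix, with made-up component values for the two covectors. It also shows why the order of the factors matters.

```python
def outer(a, b):
    """Components A[i][j] = a[i] * b[j] of the tensor product of two covectors."""
    return [[ai * bj for bj in b] for ai in a]


def transpose(M):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*M)]


a = [1, 2]  # components of df (illustrative values)
b = [3, 4]  # components of dg (illustrative values)

A = outer(a, b)  # df (x) dg
B = outer(b, a)  # dg (x) df

print(A)
print(B)
print(A == B, B == transpose(A))
```

Swapping the factors transposes the component matrix, so the two products agree only when that matrix happens to be symmetric.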

One disadvantage of the index and matrix notation is that the basis vectors are assumed rather than explicit, which makes it less obvious how to transform the tensor to a new coordinate system. If we use the tensor product form, then we just apply the chain rule to each of the differential terms $$dx_i$$ to represent the system in terms of $$dy_i$$:

$$df \otimes dg = a_1 b_1 \frac{\partial x_1}{\partial y_j} dy_j \otimes \frac{\partial x_1}{\partial y_k} dy_k + a_1 b_2 \frac{\partial x_1}{\partial y_j} dy_j \otimes \frac{\partial x_2}{\partial y_k} dy_k + a_2 b_1 \frac{\partial x_2}{\partial y_j} dy_j \otimes \frac{\partial x_1}{\partial y_k} dy_k + a_2 b_2 \frac{\partial x_2}{\partial y_j} dy_j \otimes \frac{\partial x_2}{\partial y_k} dy_k + ...$$

Here the repeated indices j and k are each summed over independently, one for each factor of the tensor product.

Since index and matrix form doesn't include the basis covectors (or vectors) explicitly, you have to know that the Jacobian is multiplied by each covector index, and the inverse Jacobian is multiplied by each vector index, to convert to the new coordinate system.
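The rule for covector indices can be checked numerically. The sketch below assumes polar coordinates as the new system, made-up covector components, and the convention J[i][j] = dx_i/dy_j, so a covector with old components a_i has new components a_i J[i][j]. It compares two routes: transforming each factor and then taking the tensor product, versus applying one Jacobian per index to the matrix of components.

```python
import math


def outer(a, b):
    """Components a[i] * b[j] of the tensor product of two covectors."""
    return [[ai * bj for bj in b] for ai in a]


def transpose(M):
    return [list(row) for row in zip(*M)]


def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]


def matvec(M, v):
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]


# Jacobian dx_i/dy_j for x = r cos(theta), y = r sin(theta) at an arbitrary point.
r, theta = 2.0, 0.3
J = [[math.cos(theta), -r * math.sin(theta)],
     [math.sin(theta), r * math.cos(theta)]]
Jt = transpose(J)

a = [1.0, 2.0]   # components of df in the dx basis (illustrative)
b = [3.0, -1.0]  # components of dg in the dx basis (illustrative)
A = outer(a, b)

# Route 1: transform each covector, then take the tensor product.
A_from_factors = outer(matvec(Jt, a), matvec(Jt, b))

# Route 2: apply one Jacobian per covector index of the matrix, J^T A J.
A_from_matrix = matmul(matmul(Jt, A), J)

max_diff = max(abs(A_from_factors[i][j] - A_from_matrix[i][j])
               for i in range(2) for j in range(2))
print(max_diff)
```

The two routes agree up to floating point rounding, which is the matrix form of applying the chain rule to each $$dx_i$$ term separately.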

If you know how to convert tensors to a form where the basis vectors are explicit, it is easier to confirm that coordinate transformations are being applied correctly.

Last edited: Aug 3, 2010