Understanding Jacobian Matrix Transformation in Special Relativity

In summary: when converting between coordinate systems (e.g. Cartesian and spherical), the Jacobian matrix transforms the differentials from one system to the other. The transformation matrix differs for contravariant and covariant vectors: covariant components transform with the inverse Jacobian, because covariant and contravariant components of vectors transform contragrediently to each other.
  • #1
Harry Case
TL;DR Summary
Coordinate transformations in relativity are carried out using the Jacobian matrix.
While learning about special relativity I learned that we use a transformation matrix to change coordinates, and that this matrix differs for contravariant and covariant vectors. Why does that happen? Why is one kind of matrix (the Jacobian) used for basis vectors and another kind (the inverse Jacobian) for the gradient, divergence, etc.? Why does the matrix change, why do different kinds of quantities require a different matrix, and why does the inverse Jacobian do the job?

My understanding of how Jacobian achieves this:

Here Grant says that by moving a small distance ##\partial x## along the ##x## axis in the input space, we produce a change along both output directions ##f_1## and ##f_2##, where ##f_1 = x+\sin(y)## and ##f_2 = y+\sin(x)##; that is, the ##\hat{x}## axis is carried to ##x+\sin(y)## and the ##\hat{y}## axis to ##y+\sin(x)##. A small step ##\partial x## along the ##x## axis of the input space therefore affects both axes of the output space, and the same happens for a step in the ##y## direction. $$(x,y) \rightarrow (x+\sin(y),\,y+\sin(x))$$
[![Moving a small distance DX in x direction ][1]][1][![Moving a small distance ##\partial x## in x direction in input space (input direction) causes a change in both the axes in output space][2]][2]

$$
J = \begin{bmatrix}
\frac{\partial f_1}{\partial x} & \frac{\partial f_1}{\partial y} \\
\frac{\partial f_2}{\partial x} & \frac{\partial f_2}{\partial y} \\
\end{bmatrix}
$$

  [1]: https://i.stack.imgur.com/KtNLh.jpg
  [2]: https://i.stack.imgur.com/ilUw3.jpg

Is my interpretation right?

And why does the Jacobian Matrix change for covariant and contravariant vectors?
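That interpretation can be checked numerically (my own sketch, not from the thread, assuming NumPy is available): the columns of the analytic Jacobian of ##(x,y)\mapsto(x+\sin y,\,y+\sin x)## should match the finite-difference response of the map to small steps in ##x## and ##y##.

```python
import numpy as np

def f(x, y):
    # the example map from the video: (x, y) -> (x + sin y, y + sin x)
    return np.array([x + np.sin(y), y + np.sin(x)])

def jacobian(x, y):
    # analytic Jacobian: rows are (df1/dx, df1/dy) and (df2/dx, df2/dy)
    return np.array([[1.0, np.cos(y)],
                     [np.cos(x), 1.0]])

# finite-difference check at an arbitrary point
x0, y0, h = 0.7, 1.2, 1e-6
J = jacobian(x0, y0)
col_x = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)  # response to a small step dx
col_y = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)  # response to a small step dy
assert np.allclose(J[:, 0], col_x, atol=1e-8)
assert np.allclose(J[:, 1], col_y, atol=1e-8)
```

The first column of ##J## is exactly "the effect on both output axes of a small step along ##x##", which is the picture described above.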
 
  • #2
I can remember the transformation rules mnemotechnically most easily by remembering that if ##q^k## are arbitrary coordinates contravariant components transform as the differentials ##\mathrm{d} q^k##, i.e.,
$$\mathrm{d} q^{\prime k}=\frac{\partial q^{\prime k}}{\partial q^j} \mathrm{d} q^j = \partial_j q^{\prime k} \mathrm{d} q^j,$$
while covariant components transform like partial derivatives applied to a scalar field,
$$\partial_k'=\frac{\partial}{\partial q^{\prime k}}=\frac{\partial q^j}{\partial q^{\prime k}} \partial_j.$$
 
  • #3
vanhees71 said:
I can remember the transformation rules mnemotechnically most easily by remembering that if ##q^k## are arbitrary coordinates contravariant components transform as the differentials ##\mathrm{d} q^k##, i.e.,
$$\mathrm{d} q^{\prime k}=\frac{\partial q^{\prime k}}{\partial q^j} \mathrm{d} q^j = \partial_j q^{\prime k} \mathrm{d} q^j,$$
while covariant components transform like partial derivatives applied to a scalar field,
$$\partial_k'=\frac{\partial}{\partial q^{\prime k}}=\frac{\partial q^j}{\partial q^{\prime k}} \partial_j.$$
But why do we use the inverse Jacobian for covariant transformations?
 
  • #4
Define
$${T^{k}}_j=\frac{\partial q^{\prime k}}{\partial q^j}.$$
Then the first equation reads
$$\mathrm{d}q^{\prime k}={T^k}_j \mathrm{d} q^j.$$
That transformation behavior is for contravariant components of vectors.

Further for the covariant components we have the transformation matrix
$${U^j}_{l}=\frac{\partial q^j}{\partial q^{\prime l}} \; \Rightarrow \; \partial_l'=\partial_j {U^j}_{l}.$$
Now
$${T^{k}}_j {U^j}_l = \frac{\partial q^{\prime k}}{\partial q^j} \frac{\partial q^j}{\partial q^{\prime l}}=\frac{\partial q^{\prime k}}{\partial q^{\prime l}}=\delta_{l}^{k},$$
i.e., ##\hat{U}=\hat{T}^{-1}##. That means the covariant and contravariant components of vectors transform contragrediently to each other as claimed.
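A concrete numerical illustration of ##\hat{U}=\hat{T}^{-1}## (my own sketch, using polar coordinates ##x = r\cos\varphi,\ y = r\sin\varphi## as the example transformation and NumPy for the matrix algebra):

```python
import numpy as np

# primed coordinates q' = (x, y), unprimed q = (r, phi): x = r cos(phi), y = r sin(phi)
r0, phi0 = 2.0, 0.6
x0, y0 = r0 * np.cos(phi0), r0 * np.sin(phi0)

# T^k_j = d q'^k / d q^j  (Jacobian of (x, y) with respect to (r, phi))
T = np.array([[np.cos(phi0), -r0 * np.sin(phi0)],
              [np.sin(phi0),  r0 * np.cos(phi0)]])

# U^j_l = d q^j / d q'^l  (Jacobian of (r, phi) with respect to (x, y)),
# from r = sqrt(x^2 + y^2) and phi = atan2(y, x)
rr = np.hypot(x0, y0)
U = np.array([[x0 / rr,      y0 / rr],
              [-y0 / rr**2,  x0 / rr**2]])

# the chain rule forces T U = identity, i.e. U = T^{-1}
assert np.allclose(T @ U, np.eye(2), atol=1e-12)
```

The key point is that ##T## and ##U## are the Jacobians of mutually inverse coordinate maps, so the chain rule makes their product the identity.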
 
  • #5
vanhees71 said:
Define
$${T^{k}}_j=\frac{\partial q^{\prime k}}{\partial q^j}.$$
Then the first equation reads
$$\mathrm{d}q^{\prime k}={T^k}_j \mathrm{d} q^j.$$
That transformation behavior is for contravariant components of vectors.

Further for the covariant components we have the transformation matrix
$${U^j}_{l}=\frac{\partial q^j}{\partial q^{\prime l}} \; \Rightarrow \; \partial_l'=\partial_j {U^j}_{l}.$$
Now
$${T^{k}}_j {U^j}_l = \frac{\partial q^{\prime k}}{\partial q^j} \frac{\partial q^j}{\partial q^{\prime l}}=\frac{\partial q^{\prime k}}{\partial q^{\prime l}}=\delta_{l}^{k},$$
i.e., ##\hat{U}=\hat{T}^{-1}##. That means the covariant and contravariant components of vectors transform contragrediently to each other as claimed.
Thanks for replying, but you have to understand that I am not a math student. I stumbled upon this while learning multivariable calculus, specifically while converting the gradient from Cartesian to spherical coordinates for electrostatics. Here the differential terms are covariant, I suppose? In this context, why use the inverse Jacobian?
 
  • #6
If ##\mathbf{S} = \left( \dfrac{\partial x^i}{\partial \tilde{x}^j} \right)## is the Jacobian matrix of the transformation ##\boldsymbol{x}(\boldsymbol{\tilde{x}})##, and if ##\mathbf{U}= \left( \dfrac{\partial \tilde{x}^i}{\partial x^j} \right)## is the Jacobian matrix of the transformation ##\boldsymbol{\tilde{x}}(\boldsymbol{x})##, then the inverse function theorem states that ##\mathbf{S} = \mathbf{U}^{-1}##. As already explained, the transformations of upstairs and downstairs vector components look like:\begin{align*}
\mathrm{upstairs:} \ \ \ &a^i = \dfrac{\partial x^i}{\partial \tilde{x}^j} \tilde{a}^j \ \ \longleftrightarrow \ \ \boldsymbol{a} = \mathbf{S} \boldsymbol{\tilde{a}} \\
\mathrm{downstairs:} \ \ \ &b_i = \tilde{b}_j \dfrac{\partial \tilde{x}^j}{\partial x^i} \ \ \longleftrightarrow \ \ \boldsymbol{b}^T = \boldsymbol{\tilde{b}}^T \mathbf{U} \ \ \ \mathrm{or} \ \ \ \boldsymbol{b} = \mathbf{U}^T \boldsymbol{\tilde{b}}
\end{align*}In other words, the column vector of upstairs components transform by ##\mathbf{S}## whilst the column vector of downstairs components is transformed by ##(\mathbf{S}^{-1})^T##.
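A quick sanity check of this contragredient rule (my own sketch, with an arbitrary invertible matrix standing in for the Jacobian ##\mathbf{S}##): if upstairs components transform with ##\mathbf{S}## and downstairs components with ##(\mathbf{S}^{-1})^T##, then the contraction ##a^i b_i## is the same number in every frame, which is the whole point of the two transformation laws.

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.normal(size=(3, 3))      # stands in for the Jacobian dx/dx~ (assumed invertible)
a_t = rng.normal(size=3)         # upstairs components in the tilde frame
b_t = rng.normal(size=3)         # downstairs components in the tilde frame

a = S @ a_t                      # upstairs components transform with S
b = np.linalg.inv(S).T @ b_t     # downstairs components transform with (S^{-1})^T

# the contraction a^i b_i is frame-independent
assert np.isclose(a @ b, a_t @ b_t)
```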
 
Last edited:
  • #7
The simple answer is that covariant vectors (which are also called forms) and contravariant vectors are very different geometric objects. Once you see this, it is easy to see why they transform differently. Misner, Thorne, and Wheeler (MTW) go into great detail on this so that you can understand why this is. It is a huge book, but it is available online below. Chapter 2 should answer your question.

http://xdel.ru/downloads/lgbooks/Misner C.W., Thorne K.S., Wheeler J.A. Gravitation (Freeman, 1973)(K)(T)(1304s)_PGr_.pdf

Edit: Correcting definition of covariant and contravariant vectors.
 
Last edited:
  • #8
I should add that in special relativity, Lorentz transformations ##x^i = {\Lambda^{i}}_{j} \tilde{x}^{j}## (i.e. those which satisfy ##\boldsymbol{\eta} = \boldsymbol{\Lambda}^T \boldsymbol{\eta} \boldsymbol{\Lambda}##) are usually denoted by ##\boldsymbol{\Lambda}##, so you would usually see these written as ##\boldsymbol{a} = \boldsymbol{\Lambda} \boldsymbol{\tilde{a}}## and ##\boldsymbol{b} =(\boldsymbol{\Lambda}^{-1})^T\boldsymbol{\tilde{b}}##.

To convert this last one into index notation, recall that ##(\boldsymbol{\Lambda}^{-1})^T = \boldsymbol{\eta} \boldsymbol{\Lambda} \boldsymbol{\eta}^{-1}##, i.e. \begin{align*}
{{(\Lambda^{-1})^T}_{i}}^l = \eta_{ij} {\Lambda^{j}}_k \eta^{kl} \equiv {{\Lambda}_{i}}^l
\end{align*}so then ##b_i = {{\Lambda}_{i}}^l \tilde{b}_l##.
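These identities can be verified for a concrete boost (my own sketch, assuming the ##(+,-,-,-)## signature and a boost along ##x## with rapidity ##\chi##):

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])   # Minkowski metric, signature (+,-,-,-) assumed
chi = 0.8                                 # rapidity of a boost along x (arbitrary choice)
ch, sh = np.cosh(chi), np.sinh(chi)
Lam = np.array([[ch, -sh, 0, 0],
                [-sh, ch, 0, 0],
                [0,   0,  1, 0],
                [0,   0,  0, 1]])

# defining property of a Lorentz transformation: eta = Lam^T eta Lam
assert np.allclose(Lam.T @ eta @ Lam, eta)

# the matrix acting on downstairs components: (Lam^{-1})^T = eta Lam eta^{-1}
lhs = np.linalg.inv(Lam).T
rhs = eta @ Lam @ np.linalg.inv(eta)
assert np.allclose(lhs, rhs)
```

For a pure boost ##\boldsymbol{\Lambda}## is symmetric, so here ##(\boldsymbol{\Lambda}^{-1})^T## is just the boost with the opposite rapidity, which is what the ##\boldsymbol{\eta}\boldsymbol{\Lambda}\boldsymbol{\eta}^{-1}## sandwich produces.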
 
  • #9
Harry Case said:
Thanks for replying ,but you have to understand that I am not a math student ,I stumbled upon this while learning Multivariable Calculus ,to be specific while converting Gradient from Cartesian to spherical coordinates for Electrostatics .Here the differential terms are covariant I suppose ?,In this context why use Inverse Jacobian ?
My answer was not appropriate for a math student, because AFAIK mathematicians abhor the Ricci calculus ;-). It is still not clear to me what your question is.

You have to note that for orthogonal curvilinear coordinates in 3D vector calculus it is customary to use non-holonomic (orthonormal) bases and not to distinguish upper and lower indices, because in Cartesian bases they are the same: the metric then has components ##g_{jk}=\delta_{jk}##, and of course also ##g^{jk}=\delta^{jk}##.

As an example let's take cylinder coordinates ##(r,\varphi,z)##. The relation to Cartesian coordinates is
$$\vec{r}=\begin{pmatrix} x \\ y \\ z \end{pmatrix}=\begin{pmatrix}r \cos \varphi \\ r \sin \varphi \\ z \end{pmatrix}.$$
As the basis you use the normalized orthogonal tangent vectors along the coordinate lines. To get them we first calculate the non-normalized ("holonomic") basis vectors,
$$\vec{b}_r=\partial_r \vec{r}=\begin{pmatrix} \cos \varphi \\ \sin \varphi \\ 0 \end{pmatrix}, \quad \vec{b}_{\varphi}=\partial_{\varphi} \vec{r} = \begin{pmatrix} -r \sin \varphi \\ r \cos \varphi \\ 0 \end{pmatrix}, \quad \vec{b}_z=\partial_z \vec{r}=\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$
and their lengths:
$$g_{r}=|\vec{b}_r|=1, \quad g_{\varphi}=|\vec{b}_{\varphi}|=r, \quad g_z=|\vec{b}_z|=1.$$
The orthonormal basis in cylinder coordinates is thus given by
$$\vec{e}_r=\frac{1}{g_r} \vec{b}_r=\begin{pmatrix} \cos \varphi \\ \sin \varphi \\ 0 \end{pmatrix}, \quad \vec{e}_{\varphi}=\frac{1}{g_{\varphi}} \vec{b}_{\varphi} = \begin{pmatrix} - \sin \varphi \\ \cos \varphi \\ 0 \end{pmatrix}, \quad \vec{e}_z=\frac{1}{g_z} \vec{b}_z=\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.$$
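The holonomic basis vectors and their lengths above can be reproduced symbolically (my own sketch, assuming SymPy is available):

```python
import sympy as sp

r, phi, z = sp.symbols('r phi z', positive=True)
pos = sp.Matrix([r * sp.cos(phi), r * sp.sin(phi), z])  # Cartesian position vector

# holonomic (non-normalized) basis vectors: derivatives of the position vector
b_r, b_phi, b_z = pos.diff(r), pos.diff(phi), pos.diff(z)

# their lengths should be g_r = 1, g_phi = r, g_z = 1
g_r = sp.simplify(sp.sqrt(b_r.dot(b_r)))
g_phi = sp.simplify(sp.sqrt(b_phi.dot(b_phi)))
g_z = sp.simplify(sp.sqrt(b_z.dot(b_z)))
assert g_r == 1 and sp.simplify(g_phi - r) == 0 and g_z == 1

# the basis vectors are mutually orthogonal, so normalizing gives an orthonormal basis
assert sp.simplify(b_r.dot(b_phi)) == 0
assert sp.simplify(b_r.dot(b_z)) == 0
assert sp.simplify(b_phi.dot(b_z)) == 0
```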
Now to get the expression for the gradient of a scalar field, note that in a manifestly invariant way, it is defined as
$$\mathrm{d} \Phi = \mathrm{d} \vec{r} \cdot \vec{\nabla} \Phi.$$
Now you have
\begin{equation*}
\begin{split}
\mathrm{d} \Phi &= \mathrm{d} r \partial_r \Phi + \mathrm{d} \varphi \partial_{\varphi} \Phi + \mathrm{d}z \partial_z \Phi \\
&= \mathrm{d} r \vec{b}_r \cdot \vec{\nabla} \Phi + \mathrm{d} \varphi \vec{b}_{\varphi} \cdot \vec{\nabla} \Phi + \mathrm{d} z \vec{b}_{z} \cdot \vec{\nabla} \Phi \\
&=\mathrm{d} r \, \vec{e}_r \cdot \vec{\nabla} \Phi + r \, \mathrm{d} \varphi \, \vec{e}_{\varphi} \cdot \vec{\nabla} \Phi + \mathrm{d} z \, \vec{e}_{z} \cdot \vec{\nabla} \Phi.
\end{split}
\end{equation*}
So by comparing the coefficients of the coordinate increments you get
$$(\vec{\nabla} \Phi)_r=\vec{e}_r \cdot \vec{\nabla}\Phi = \partial_r \Phi, \quad (\vec{\nabla} \Phi)_{\varphi}=\vec{e}_{\varphi} \cdot \vec{\nabla}\Phi =\frac{1}{r} \partial_{\varphi} \Phi, \quad (\vec{\nabla} \Phi)_z=\vec{e}_z \cdot \vec{\nabla}\Phi =\partial_z \Phi.$$
In an analogous way you can also derive div and curl (rot) using their invariant definitions via surface and line integrals.
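The gradient components derived above can be verified symbolically (my own sketch, using an arbitrary sample field ##\Phi = x^2 y + z##, which is my choice, not from the thread): project the Cartesian gradient onto the orthonormal cylinder basis and compare with ##\partial_r\Phi##, ##\frac{1}{r}\partial_\varphi\Phi##, ##\partial_z\Phi##.

```python
import sympy as sp

x, y, z, r, phi = sp.symbols('x y z r phi', real=True)

Phi = x**2 * y + z                     # arbitrary sample scalar field (assumption)
grad = sp.Matrix([Phi.diff(x), Phi.diff(y), Phi.diff(z)])  # Cartesian gradient

# the same field expressed in cylinder coordinates
cyl = {x: r * sp.cos(phi), y: r * sp.sin(phi)}
Phi_cyl = Phi.subs(cyl)

# orthonormal cylinder basis vectors
e_r   = sp.Matrix([sp.cos(phi), sp.sin(phi), 0])
e_phi = sp.Matrix([-sp.sin(phi), sp.cos(phi), 0])
e_z   = sp.Matrix([0, 0, 1])

# (grad Phi)_r = d_r Phi, (grad Phi)_phi = (1/r) d_phi Phi, (grad Phi)_z = d_z Phi
assert sp.simplify(e_r.dot(grad).subs(cyl) - Phi_cyl.diff(r)) == 0
assert sp.simplify(e_phi.dot(grad).subs(cyl) - Phi_cyl.diff(phi) / r) == 0
assert sp.simplify(e_z.dot(grad) - Phi_cyl.diff(z)) == 0
```

The ##1/r## in the ##\varphi## component is exactly where the inverse of the basis-vector length ##g_\varphi = r## enters, which is the covariant side of the story.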
 