# Lorentz-Invariant Scalar Products

1. Sep 26, 2014

### Sunnyocean

Hi,

Are ALL scalar products of four-vectors Lorentz-invariant (as opposed to just the scalar product of a four-vector with itself)? And, if yes, what is the proof?

2. Sep 26, 2014

### DrGreg

Yes.

If you've already proved that the scalar product of a four-vector with itself is invariant, the simplest method is to expand $g(\mathbf{U}+\mathbf{V}, \mathbf{U}+\mathbf{V})$.

3. Sep 26, 2014

### Matterwave

Yes, the proof is quite simple (Einstein summation notation used):

$$A^{\mu'} B_{\mu'} =\frac{\partial x^{\mu'}} {\partial x^{\nu}} A^\nu\frac{\partial x^\rho}{\partial x^{\mu'}} B_\rho= \frac{\partial x^\rho}{\partial x^{\mu'}} \frac{\partial x^{\mu'}}{\partial x^{\nu}}A^\nu B_\rho = \delta^\rho_\nu A^\nu B_\rho = A^\rho B_\rho$$
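The index gymnastics above can also be checked numerically. Below is a quick NumPy sketch (not part of the thread; the metric signature, the boost along $x$, and the sample components are arbitrary choices):

```python
import numpy as np

# Minkowski metric, signature (+, -, -, -) (one conventional choice)
eta = np.diag([1.0, -1.0, -1.0, -1.0])

def boost_x(beta):
    """Lorentz boost along the x-axis with speed beta (in units of c)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * beta
    return L

def dot(A, B):
    """Scalar product A^mu eta_{mu nu} B^nu."""
    return A @ eta @ B

A = np.array([2.0, 0.3, -1.2, 0.5])   # components A^mu, arbitrary
B = np.array([1.5, -0.7, 0.4, 2.1])   # components B^mu, arbitrary
L = boost_x(0.6)

# A boost satisfies L^T eta L = eta, so EVERY scalar product is preserved,
# not just the product of a vector with itself:
assert np.allclose(L.T @ eta @ L, eta)
assert np.isclose(dot(L @ A, L @ B), dot(A, B))
```

The same check passes for any product of boosts and spatial rotations, since all of them satisfy $\Lambda^T\eta\Lambda=\eta$.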

4. Sep 26, 2014

### Orodruin

Staff Emeritus
You have already gotten some very good help in terms of proving that it is true. Let me just add that the mathematics really is not any different from what happens for rotations in a Euclidean space (in fact, spatial rotations are a subset of the Lorentz transformations). This might help you get a stronger feeling for why this is so.

5. Sep 27, 2014

### Sunnyocean

Matterwave,

Thank you very much but I am just beginning to learn the Einstein notation. Could you expand the proof please?

6. Sep 27, 2014

### Orodruin

Staff Emeritus
Einstein notation essentially amounts to following the rule "sum over all repeated indices". To be more specific in the case of the proof:

By definition, a vector transforms as
$$A'{}^\mu = \sum_\nu \frac{\partial x'{}^\mu}{\partial x^\nu} A^\nu = \{{\rm Einstein\ notation}\} = \frac{\partial x'{}^\mu}{\partial x^\nu} A^\nu.$$
Similarly for the covector $B'_\mu$
$$B'_\mu = \frac{\partial x^\nu}{\partial x'{}^\mu} B_\nu.$$
Taking the inner product
$$B'_\mu A'{}^\mu = \frac{\partial x^\nu}{\partial x'{}^\mu} B_\nu \frac{\partial x'{}^\mu}{\partial x^\sigma} A^\sigma.$$
Here, there is a sum over $\mu$ since it is a repeated index. Thus,
$$\frac{\partial x^\nu}{\partial x'{}^\mu} \frac{\partial x'{}^\mu}{\partial x^\sigma} = \sum_\mu \frac{\partial x^\nu}{\partial x'{}^\mu} \frac{\partial x'{}^\mu}{\partial x^\sigma} = \frac{\partial x^\nu}{\partial x^\sigma},$$
where we have used the chain rule in the last step. Now, $\frac{\partial x^\nu}{\partial x^\sigma} = \delta^\nu_\sigma$ and performing the sum over $\sigma$ (or $\nu$) completes the proof.
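For a linear change of coordinates such as a Lorentz boost, the two Jacobians in this proof are constant matrices that are inverses of each other, so the chain-rule step above reduces to matrix inversion. A minimal NumPy sketch (not from the thread; the boost speed and sample components are arbitrary choices):

```python
import numpy as np

# Linear coordinate change x' = L x: then dx'^mu/dx^nu = L[mu, nu]
# and dx^nu/dx'^mu = Linv[nu, mu].
beta = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)
L = np.array([[ gamma,       -gamma*beta, 0.0, 0.0],
              [-gamma*beta,   gamma,      0.0, 0.0],
              [ 0.0,          0.0,        1.0, 0.0],
              [ 0.0,          0.0,        0.0, 1.0]])
Linv = np.linalg.inv(L)

# Chain rule: (dx^nu/dx'^mu)(dx'^mu/dx^sigma), summed over the repeated
# index mu, gives delta^nu_sigma:
delta = np.einsum('nm,ms->ns', Linv, L)
assert np.allclose(delta, np.eye(4))

# Hence the contraction is invariant, B'_mu A'^mu = B_nu A^nu:
A = np.array([2.0, 0.3, -1.2, 0.5])   # vector components A^nu
B = np.array([1.5, -0.7, 0.4, 2.1])   # covector components B_nu
A_new = L @ A                          # A'^mu = (dx'^mu/dx^nu) A^nu
B_new = Linv.T @ B                     # B'_mu = (dx^nu/dx'^mu) B_nu
assert np.isclose(B_new @ A_new, B @ A)
```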

8. Sep 27, 2014

### Sunnyocean

What I now have to understand in your proof is the definition. I mean, why do you say that "by definition" we have that relationship?

Maybe you are missing one "x superscript mu" on the right-hand side? I am writing this because it seems to me that without it, the right-hand side becomes just a sum of scalars (which were the coefficients of the components of the vector "A superscript nu" <-- sorry, I am having problems finding the LaTeX help page of physicsforums.com). If I am mistaken about the definition I quoted, please correct me.

10. Sep 27, 2014

### Fredrik

Staff Emeritus
This is only a proof to someone who has already worked through the (much more complicated) details of how tangent and cotangent vectors transform under a change of coordinate system.

11. Sep 27, 2014

### Fredrik

Staff Emeritus
It's what DrGreg said. These are the details: Let g be a metric tensor on a finite-dimensional vector space V. Let $\Lambda$ be a linear operator on V such that $g(\Lambda x,\Lambda x)=g(x,x)$ for all $x\in V$.

Let $u,v\in V$ be arbitrary. We have
\begin{align}
&g(u+v,u+v) = g(u,u)+2g(u,v)+g(v,v),\\
&g(\Lambda(u+v),\Lambda(u+v))= g(\Lambda u+\Lambda v,\Lambda u+\Lambda v) =g(\Lambda u,\Lambda u)+2 g(\Lambda u,\Lambda v) +g(\Lambda v,\Lambda v)\\
&=g(u,u)+2g(\Lambda u,\Lambda v)+g(v,v)
\end{align} The two left-hand sides are equal, so if we subtract the second equality from the first, we get
$$0=2g(u,v)-2g(\Lambda u,\Lambda v),$$ which implies that $g(\Lambda u,\Lambda v)=g(u,v)$.
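The key step here is a polarization identity: the scalar product can be recovered from "squared norms" alone, so preserving $g(x,x)$ for every $x$ forces $g(u,v)$ to be preserved as well. A small NumPy check of that identity (not from the thread; the metric and random seed are arbitrary choices):

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])  # Minkowski metric, one possible g

def g(u, v):
    """Bilinear form g(u, v) = u^mu eta_{mu nu} v^nu."""
    return u @ eta @ v

rng = np.random.default_rng(0)
u, v = rng.normal(size=4), rng.normal(size=4)

# Polarization: g(u, v) is determined by three "squared norms"
lhs = g(u, v)
rhs = 0.5 * (g(u + v, u + v) - g(u, u) - g(v, v))
assert np.isclose(lhs, rhs)
```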

12. Sep 27, 2014

### Sunnyocean

Thank you Fredrik, can you please point me to a web link / book regarding how tangent and cotangent vectors transform under a change of coordinate system?

13. Sep 27, 2014

### Fredrik

Staff Emeritus
That would be a textbook on differential geometry, like "Introduction to smooth manifolds" by John M. Lee. It will be very hard to dive right into it. If you're studying SR, classical electrodynamics or quantum field theory, you may want to avoid differential geometry for now. Instead, you should focus on understanding tensors in the context of multilinear algebra. Chapter 3 in "A first course in general relativity" by Schutz is a nice introduction. Link. When you have read enough to understand the concepts "dual space" and "dual basis", you should take a look at this post and the two posts that I linked to in it.

We can talk about the differential geometry stuff too if you want.

14. Sep 27, 2014

### Sunnyocean

Thank you again Fredrik,

The link you provided seems to have copyright problems, so I will not use it (no offense). It seems to me to be one of those cases of "legal stealing", so I will not download it, even though I could do so very easily. Thank you anyway for the link. I will go to a library or buy Schutz's book.

Also, I have already tried Lee's books (you or someone else recommended them to me about one month ago) but they are a bit above my level. Would you happen to know a good book which can "bridge" the gap between Schutz and Lee?

For example, Lee dives right into talking about topological spaces. He uses this term (and other terms whose precise definitions I do not know) in theorems and so on, so he assumes the reader knows what a topological space is. I don't know what a topological space is. Do you know of a good book which defines (precisely, in mathematical fashion, not in the careless fashion of most physics books) what a topological space is? As well as other material related to topological spaces and other "basics" one needs in order to be able to understand Lee's books?

Last edited: Sep 27, 2014
15. Sep 27, 2014

### Fredrik

Staff Emeritus
Lee has written a few other books. One of them is called "Introduction to topological manifolds". I haven't read it myself, but it's very likely that it contains all the topology you need and more. (A smooth manifold is a special kind of topological manifold, and a topological manifold is a special kind of topological space.)

I find it a little frustrating that the best book on differential geometry is so heavy on topology. If I had the opportunity to teach differential geometry, I would do a non-rigorous presentation first, one that ignores most of the topology stuff.

16. Sep 27, 2014

### Sunnyocean

Sounds great, thank you :)

17. Sep 27, 2014

### Fredrik

Staff Emeritus
I'll provide a bit more information on the relevant parts of differential geometry. A manifold is (roughly, and ignoring topology stuff) a set M together with a bunch of functions $x:U\to\mathbb R^n$ whose domains are subsets of M. These functions are called coordinate systems or charts. Let p be a point in M. Let $x:U\to\mathbb R^n$ be a coordinate system such that $p\in U$. For each $i\in\{1,\dots,n\}$, the map $p\mapsto (x(p))^i$ is denoted by $x^i$. For each $i\in\{1,\dots,n\}$ and each (nice enough) function $f:M\to\mathbb R$, define
$$\frac{\partial}{\partial x^i}\bigg|_p f= (f\circ x^{-1})_{,i}(x(p)),$$ where ${}_{,i}$ denotes the partial derivative with respect to the ith variable. This makes $\frac{\partial}{\partial x^i}\!\big|_p$ a linear functional that takes (nice enough) real-valued functions on M to real numbers. The finite sequence $\big(\frac{\partial}{\partial x^i}\!\big|_p\big)_{i=1}^n$ is an ordered basis for a vector space. That vector space is called the tangent space of M at p, and is denoted by $T_pM$.

This is the simplest definition of "tangent space", but it's both ugly and unintuitive. Ugly because it's not clear that $T_pM$ will be independent of x. Unintuitive because it's not clear why we should call linear combinations of these functionals "tangent vectors". The latter issue is a topic for another day. But we can address the former issue. Let $y:V\to\mathbb R^n$ be a coordinate system such that $p\in V$. For all i and all f, we have
\begin{align}
&\frac{\partial}{\partial y^i}\bigg|_p f = (f\circ y^{-1})_{,i}(y(p)) =(f\circ x^{-1}\circ x\circ y^{-1})_{,i}(y(p)) =(f\circ x^{-1})_{,j}((x\circ y^{-1})(y(p)))\, (x\circ y^{-1})^j{}_{,i}(y(p))\\
&= (x\circ y^{-1})^j{}_{,i}(y(p))\, (f\circ x^{-1})_{,j}(x(p)) =\Lambda^i{}_j \frac{\partial}{\partial x^j}\bigg|_p f,
\end{align} where we have used that $(x\circ y^{-1})(y(p))=x(p)$, and defined $\Lambda^i{}_j=(x\circ y^{-1})^j{}_{,i}(y(p))$, i.e. as the row j, column i component of the Jacobian matrix of the coordinate change function $x\circ y^{-1}$. (The calculation above is essentially just the chain rule. There's also a minor technical issue regarding the insertion of $x^{-1}\circ x$ that I'm just going to ignore. It actually takes a lot of work to show that the first equality holds.)

This result tells us two things:

1. Since each $\frac{\partial}{\partial y^i}\!\big|_p$ is a linear combination of the $\frac{\partial}{\partial x^j}\!\big|_p$ (and vice versa, since the Jacobian is invertible), the two sets span the same vector space. So $T_pM$, as we have defined it, is independent of the coordinate system used in the definition.

2. It gives us the relationship between the two ordered bases for $T_pM$ associated with two coordinate systems. So now we know how the basis vectors "transform" under a change of coordinates $x\to y$.

We can use this to determine how the components of tangent vectors transform under a change of coordinates, i.e. how they transform under the change of ordered basis induced by the change of coordinates. The post I linked to in post #13 explains how to do this. Tensors in the context of differential geometry aren't fundamentally different. Everything you've learned in multilinear algebra still applies. It's just that now the arbitrary vector space V is specifically $T_pM$ for some manifold M and some point p in M.
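The change-of-basis formula for $\partial/\partial y^i\big|_p$ can be checked numerically in a concrete case. Here is a sketch (not from the thread) using two charts on part of the plane, Cartesian $x$ and polar $y=(r,\theta)$, with a made-up test function and finite-difference derivatives:

```python
import numpy as np

def x_of_y(y):
    """The coordinate change x o y^{-1}: polar (r, theta) -> Cartesian (x1, x2)."""
    r, th = y
    return np.array([r * np.cos(th), r * np.sin(th)])

def f_in_x(x):
    """A test function expressed in the x chart, i.e. f o x^{-1} (arbitrary choice)."""
    x1, x2 = x
    return x1**2 + 3.0 * x2

def partial(fun, z, i, h=1e-6):
    """Central-difference partial derivative of fun with respect to the ith variable."""
    e = np.zeros_like(z)
    e[i] = h
    return (fun(z + e) - fun(z - e)) / (2 * h)

yp = np.array([2.0, 0.7])                       # coordinates y(p) of a point p
f_in_y = lambda y: f_in_x(x_of_y(y))            # f o y^{-1} = (f o x^{-1}) o (x o y^{-1})

# Lambda^i_j = (x o y^{-1})^j_{,i} (y(p)), as defined in the derivation
Lam = np.array([[partial(lambda y: x_of_y(y)[j], yp, i) for j in range(2)]
                for i in range(2)])

# d/dy^i f at p equals Lambda^i_j d/dx^j f at p:
for i in range(2):
    lhs = partial(f_in_y, yp, i)
    rhs = sum(Lam[i, j] * partial(f_in_x, x_of_y(yp), j) for j in range(2))
    assert np.isclose(lhs, rhs, atol=1e-5)
```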

Last edited: Sep 27, 2014
18. Sep 27, 2014

### stevendaryl

Staff Emeritus
Rather than starting from the transformation rules for $A^\mu$ and for $B_\mu$ and using those to prove that $A^\mu B_\mu$ is a scalar, the alternative is to define a covector $B$ as a linear function that takes a vector $A$ and returns a scalar. That is, the transformation rule for the components of $B$ follows from the requirement that $B_\mu A^\mu$ is a scalar, rather than the other way around.
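This viewpoint can be illustrated numerically: treat $B$ as a fixed linear machine $A \mapsto B_\mu A^\mu$, demand that its output not change, and the transformation rule for the components drops out as the inverse transpose. A NumPy sketch (not from the thread; the boost and components are arbitrary choices):

```python
import numpy as np

# Vector components transform as A' = L @ A (here, a boost along x)
beta = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)
L = np.array([[ gamma,       -gamma*beta, 0.0, 0.0],
              [-gamma*beta,   gamma,      0.0, 0.0],
              [ 0.0,          0.0,        1.0, 0.0],
              [ 0.0,          0.0,        0.0, 1.0]])

# A covector is a linear map A -> B_mu A^mu.  Requiring B'_mu A'^mu = B_mu A^mu
# for EVERY A forces B' = (L^{-1})^T B, i.e. B'_mu = (dx^nu/dx'^mu) B_nu.
B = np.array([1.0, -2.0, 0.5, 3.0])
B_new = np.linalg.inv(L).T @ B

rng = np.random.default_rng(1)
for _ in range(5):
    A = rng.normal(size=4)
    assert np.isclose(B_new @ (L @ A), B @ A)   # B'(A') = B(A) for every A
```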