# Hyperbolic relations in deriving Lorentz transformations

Preface to my question: I can assure you this is not a homework question of any kind. I simply have a pedagogical fascination with physics outside of my own studies in school. Also, I did a quick search through the forum and could not find a question similar enough to what I want to know, so I decided to post it.

Anyway, on to my question. I started watching Susskind's special relativity lectures the other day. At one point he starts deriving the Lorentz transformations using hyperbolic functions, which I know to be standard. What I do not understand is this.

\begin{align}
x' &= x\cosh\theta - t\sinh\theta \\
t' &= -x\sinh\theta + t\cosh\theta
\end{align}

In the above coordinate transformation from the x-t frame to the x'-t' frame, I do not understand where his negative signs come from. I understand coordinate transformations using cosine and sine functions just fine, using triangles to determine where the sign of each component comes from. But I have been unable to do the same thing with the hyperbolic functions. I cannot see where he gets the signs from, and I do not want to simply take it on faith that the transformations are true. I would prefer to understand where they come from. So, can anybody explain geometrically how to arrive at this transformation? Thanks in advance.

Bill_K
rethipher, The easiest way to convince yourself that the signs are correct is to show that the transformation leaves the "interval" invariant. The interval is $x^2 - c^2t^2$. (But you're using c = 1 so it becomes just $x^2 - t^2$.)

And in particular it leaves the path of a light ray invariant: if x = ct then x' = ct' also. So using the properties of the hyperbolic functions, calculate x' - t' and show that if x - t = 0 then x' - t' = 0 also.
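The check suggested above can also be sketched numerically. This is only an illustrative sketch: the helper names `boost` and `interval` and the sample values are assumptions made here, not something from the lectures, and units with c = 1 are used as in the question.

```python
import math

def boost(x, t, theta):
    """Apply the transformation quoted in the question (units with c = 1)."""
    x_new = x * math.cosh(theta) - t * math.sinh(theta)
    t_new = -x * math.sinh(theta) + t * math.cosh(theta)
    return x_new, t_new

def interval(x, t):
    """The quantity x^2 - t^2 that the transformation should leave unchanged."""
    return x * x - t * t

theta = 0.7          # arbitrary parameter, chosen only for illustration
x, t = 3.0, 1.25     # arbitrary event
xp, tp = boost(x, t, theta)
print(interval(x, t), interval(xp, tp))   # the two values should agree

# A light ray through the origin, x = t, should map to x' = t'.
lx, lt = boost(2.0, 2.0, theta)
print(lx - lt)                            # should be zero up to rounding
```

Flipping the sign of only one of the two sinh terms in `boost` makes the two printed intervals disagree, which is one way to see that the signs are not arbitrary.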

I know the signs are correct, and from the identity $\cosh^2\theta - \sinh^2\theta = 1$ that $x^2 - t^2 = x'^2 - t'^2$. I just don't understand where the transformation came from, geometrically or otherwise mathematically. It just sort of pops up in his writing.

Consider a fixed xt frame and a moving x't' frame. Should the x'-coordinate of a fixed event (fixed with respect to the xt) increase or decrease with time if the x't' frame has positive x velocity?

If it has positive x velocity with respect to the x-t frame, I would think it would have a negative x' with respect to its own frame, but I'm not entirely sure of that.

I'm saying the origin of the x't' frame moves to the right with respect to xt, so the x' coordinate must decrease over time.
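A small numerical sketch of this sign argument, assuming the transformation from the first post with c = 1; the function name `x_prime` and the sample numbers are illustrative assumptions only.

```python
import math

theta = 0.5       # positive parameter: primed frame moves toward +x (assumed convention)
x_fixed = 1.0     # a location held fixed in the x-t frame

def x_prime(x, t, theta):
    """x' coordinate from the transformation in the first post (c = 1)."""
    return x * math.cosh(theta) - t * math.sinh(theta)

# Track the x' coordinate of the fixed location at successive times t.
samples = [x_prime(x_fixed, t, theta) for t in (0.0, 1.0, 2.0, 3.0)]
print(samples)    # each entry should be smaller than the one before

# The origin of the primed frame sits at x = t * tanh(theta) in the x-t frame.
t = 2.0
origin_x = t * math.tanh(theta)
print(x_prime(origin_x, t, theta))   # should be zero up to rounding
```

The decreasing sequence is what forces the minus sign in front of the sinh term in the x' equation when the primed frame moves toward positive x.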

Chestermiller
Mentor
I assume you are not asking where the Lorentz Transformation comes from. I assume you are asking how the Lorentz Transformation can be worked into this mathematical form.

It is done by starting with the usual form of the Lorentz Transformation and making the substitution:

$$v = c \tanh\theta$$

in which case $\gamma = \cosh\theta$

Susskind, I think, is trying to make the analogy between a relativistic boost and an ordinary (non-relativistic) rotation of reference frames.
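The substitution can be checked numerically. The following is a minimal sketch in units with c = 1; the function names `standard` and `hyperbolic` and the value of `v` are assumptions chosen for illustration.

```python
import math

v = 0.6                                  # relative velocity with c = 1 (illustrative)
theta = math.atanh(v)                    # from v = tanh(theta)
gamma = 1.0 / math.sqrt(1.0 - v * v)

def standard(x, t):
    """Usual form: x' = gamma (x - v t), t' = gamma (t - v x)."""
    return gamma * (x - v * t), gamma * (t - v * x)

def hyperbolic(x, t):
    """Hyperbolic form from the first post."""
    return (x * math.cosh(theta) - t * math.sinh(theta),
            -x * math.sinh(theta) + t * math.cosh(theta))

print(gamma, math.cosh(theta))           # should agree
print(gamma * v, math.sinh(theta))       # should agree
print(standard(2.0, 5.0), hyperbolic(2.0, 5.0))
```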

vanhees71
Gold Member
I would put it the other way: The given form with hyperbolic functions is the most natural one since it directly gives the pseudoorthogonal matrices of two-dimensional Minkowski space. These matrices are defined such that they keep the Minkowski product in two-dimensional Minkowski space, i.e.,
$$x \cdot y=x^0 y^0-x^1 y^1$$
invariant. These are the matrices building the group $\mathrm{O}(1,1)$.

The most important subgroup is that of proper orthochronous Lorentz transformations $\mathrm{SO}(1,1)^{\uparrow}$, where the S in the name of the group means "special", saying that the matrix should have determinant 1, and the small upward-pointing arrow means that it must not change the direction of time. These matrices are naturally parametrized in terms of the rapidity $\eta$,
$$\Lambda=\begin{pmatrix} \cosh \eta & -\sinh \eta \\ -\sinh \eta & \cosh \eta \end{pmatrix}.$$
In components, the transformation reads
$$\begin{aligned} x'^0 &= x^0 \cosh \eta-x^1 \sinh \eta, \\ x'^1 &= -x^0 \sinh \eta + x^1 \cosh \eta. \end{aligned}$$
The origin of the primed coordinate system, $x'^1=0$, moves at constant velocity in terms of the coordinates of the unprimed system, because for it one has
$$x^1=x^0 \tanh \eta =c t \tanh \eta.$$
The velocity of the primed frame with respect to the unprimed frame is thus given by $v=c \tanh \eta=c \beta$. I've chosen the convention such that a positive rapidity means a motion of the primed frame in the positive $x^1$ direction relative to the unprimed frame.
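As a hedged numerical sketch of the defining properties, the snippet below checks that such a matrix preserves the two-dimensional Minkowski product, has determinant 1, and keeps the time direction. The helper names `boost_matrix`, `apply` and `minkowski` and the sample vectors are assumptions made for illustration.

```python
import math

def boost_matrix(eta):
    """2x2 matrix acting on (x^0, x^1), parametrized by the rapidity eta."""
    c, s = math.cosh(eta), math.sinh(eta)
    return [[c, -s], [-s, c]]

def apply(m, vec):
    """Matrix-vector product for a 2x2 matrix stored as nested lists."""
    return [m[0][0] * vec[0] + m[0][1] * vec[1],
            m[1][0] * vec[0] + m[1][1] * vec[1]]

def minkowski(a, b):
    """Minkowski product x^0 y^0 - x^1 y^1 in two dimensions."""
    return a[0] * b[0] - a[1] * b[1]

eta = 0.9                          # illustrative rapidity
L = boost_matrix(eta)
a, b = [2.0, 0.5], [-1.0, 3.0]     # arbitrary vectors (x^0, x^1)

before = minkowski(a, b)
after = minkowski(apply(L, a), apply(L, b))
det = L[0][0] * L[1][1] - L[0][1] * L[1][0]
print(before, after)               # should agree
print(det)                         # should be 1 up to rounding
print(L[0][0] >= 1.0)              # time-time entry >= 1: time direction is kept
```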

The more familiar form is easily derived from the fact that
$$\cosh \eta=\frac{1}{\sqrt{1-\tanh^2 \eta}}=\frac{1}{\sqrt{1-\beta^2}}=\gamma, \quad \sinh \eta=\frac{\tanh \eta}{\sqrt{1-\tanh^2 \eta}}=\gamma \beta.$$
The rapidity is, however, far more convenient because, first, differences of rapidities are invariant under boosts along the same axis and, second, the composition of several Lorentz transformations is given simply by adding the rapidities, which is easily proven by the addition rules for hyperbolic functions,
$$\cosh(\eta_1+\eta_2)=\cosh \eta_1 \cosh \eta_2+\sinh \eta_1 \sinh \eta_2, \quad \sinh(\eta_1 + \eta_2)=\cosh \eta_1 \sinh \eta_2 + \cosh \eta_2 \sinh \eta_1.$$
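The additivity of rapidities can be illustrated with a short sketch: composing two boosts should give the boost with the summed rapidity, which in terms of velocities is the relativistic addition rule. The helper names and sample rapidities are assumptions for illustration.

```python
import math

def boost_matrix(eta):
    """2x2 boost acting on (x^0, x^1), in the sign convention used above."""
    c, s = math.cosh(eta), math.sinh(eta)
    return [[c, -s], [-s, c]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

eta1, eta2 = 0.3, 1.1                         # illustrative rapidities
composed = matmul(boost_matrix(eta2), boost_matrix(eta1))
direct = boost_matrix(eta1 + eta2)
print(composed)
print(direct)                                 # should match entry by entry

# Same statement in terms of beta = v/c: relativistic velocity addition.
beta1, beta2 = math.tanh(eta1), math.tanh(eta2)
beta_total = (beta1 + beta2) / (1.0 + beta1 * beta2)
print(beta_total, math.tanh(eta1 + eta2))     # should agree
```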

Yes, Susskind made the analogy as a regular coordinate transformation which is something I am very familiar with from dynamics and attitude control classes. That's likely where my confusion stems from because I'm trying to make the connection to regular coordinate transformations.

As for the above post, I'm unfamiliar with the terms pseudo-orthogonal and Minkowski space, both of which I'm looking up right now to familiarize myself. I'm not very far into the lectures on special relativity, so I have not run into either of those terms yet. It will take me some time to sort through what you said, as I am unfamiliar with most of the terms you used above, including rapidity. Thank you very much for the thorough explanation though. I appreciate it very much.

Chestermiller
Mentor
OK. I am also an engineer, so I am going to try to explain this from an engineering perspective, which works for me. I'm sure that the physicists will not be particularly happy with how I explain this.

Imagine that time represents an actual spatial dimension (direction) in the universe, and that the universe is really 4 dimensional. Also imagine that, as 3 dimensional beings, we can see infinitely far into three of these dimensions that comprise our inertial rest frame of reference, but that we can't see one iota into our own 4th dimension (that is "assigned" to our rest frame of reference). Imagine that each inertial frame of reference has its own unique time direction which is different from ours, and that, in the 4D universe, the time direction of each inertial frame of reference is perpendicular to the spatial directions for that frame of reference. Finally, imagine (temporarily) that the 4D universe is Euclidean.

Suppose that there are two inertial frames of reference, S and S', that are in relative motion, and imagine that, unbeknownst to the observers at rest within each of these reference frames, the reference frames (coordinate systems) have experienced a rigid body rotation relative to one another (with some of the rotation involving the time direction); suppose that the offset caused by the rigid body rotation is a function of the relative velocity of the two reference frames, scaled by the speed of light c. Then, if ds represents a differential position vector in the (fictitious) 4D Euclidean space, its length reckoned with respect to the two coordinate systems will satisfy:

$$(ds)^2 = (c\,dt)^2 + (dx)^2 + (dy)^2 + (dz)^2 = (c\,dt')^2 + (dx')^2 + (dy')^2 + (dz')^2$$

Now, imagine another 4D universe, the actual 4D universe that we occupy. In this universe, the coordinate systems for the S and S' frames of reference are also offset from one another by an amount that is a function of the relative velocity, scaled by the speed of light c. However, in this universe, the length of a differential position vector does not conform to the Euclidean equation given above. Instead, it satisfies the Minkowski metrical equation:

$$(ds)^2 = -(c\,dt)^2 + (dx)^2 + (dy)^2 + (dz)^2 = -(c\,dt')^2 + (dx')^2 + (dy')^2 + (dz')^2$$

This is the only difference. If we express the differential position vector ds in component form, involving unit vectors, then in both 4D Euclidean and 4D Minkowski space, we have:

$$d\mathbf{s} = c\,dt\,\mathbf{i}_t + dx\,\mathbf{i}_x + dy\,\mathbf{i}_y + dz\,\mathbf{i}_z$$

In Euclidean space, the dot product of each unit vector with itself is equal to 1, and the dot product of each unit vector with any of the other unit vectors is equal to 0. In our actual Minkowski universe, the dot product of each unit vector with any of the other unit vectors is still equal to zero, and the dot product of each unit vector with itself is still equal to 1, except for the time direction, where the dot product of the unit vector with itself is equal to -1. This is the reality of the 4D relativistic universe.

The Lorentz Transformation describes the geometric offset of the S' frame of reference with respect to the S frame of reference, and is the relativistic equivalent of a rigid body rotation in Euclidean space. In relativity, this offset is referred to as a "boost," rather than a rotation, and mathematically involves hyperbolic sines and cosines, rather than circular sines and cosines.
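To make the rotation/boost comparison concrete, here is a minimal two-dimensional sketch. The variable `w` stands for ct, and the function names and sample numbers are assumptions made only for this illustration.

```python
import math

def rotate(x, w, angle):
    """Rigid rotation in a (fictitious) Euclidean x-w plane, with w = c t."""
    return (x * math.cos(angle) + w * math.sin(angle),
            -x * math.sin(angle) + w * math.cos(angle))

def boost(x, w, rapidity):
    """Boost in the Minkowski x-w plane, with w = c t."""
    return (x * math.cosh(rapidity) - w * math.sinh(rapidity),
            -x * math.sinh(rapidity) + w * math.cosh(rapidity))

x, w = 2.0, 1.0                      # arbitrary displacement components
xr, wr = rotate(x, w, 0.4)
xb, wb = boost(x, w, 0.4)

# The rotation keeps the Euclidean length w^2 + x^2 ...
print(w * w + x * x, wr * wr + xr * xr)
# ... while the boost keeps the Minkowski length -w^2 + x^2.
print(-w * w + x * x, -wb * wb + xb * xb)
# The rotation does not keep the Minkowski length in general.
print(-wr * wr + xr * xr)
```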

Chet

Yeah, once Minkowski spacetime is in place, everything about LTs and the invariant interval falls into place. The problem is getting there. To do that, note that there are three types of flat geometries, according to whether the time vector dotted with itself gives +1, 0, or -1. The first case describes Euclidean geometry, and it is clear that it doesn't admit an invariant vector (which light must follow) under change of basis.

The second case describes time as an invariant vector. This is Galilean invariance, and we know how this is incompatible with, for example, the Maxwell equations.

Minkowski space is all that remains, and it does admit an invariant vector that isn't the time vector.

This motivates the choice of Minkowski spacetime as the spacetime of SR.

Okay, it has become clear that I may be missing a bit of mathematics background here. Are there any recommendations as to which mathematics subject underlying relativity would fill this gap? I'm not sure what math class Minkowski spacetime falls under. I have heard of it, but I don't quite know what properties it has, or what mathematical properties would qualify a space as Minkowski, but I will look into it. Are there any quick suggestions as to what courses or material I can look into to fill this void? I find that I would rather take the math classes first, or read that material before trying to do the physics, because it is much easier to see where something comes from when you already have the mathematical background than to take a time out from the physics to go brush up on the math.

jcsd
Gold Member
Do you understand vector spaces? Do you understand the concept of an inner product on a vector space? Do you understand how an inner product induces a metric on an inner product space?

Well, Minkowski space is a four-dimensional real vector space; however, rather than having an inner product defined on it, it has a Minkowski inner product defined on it. The Minkowski inner product isn't a true inner product as such, because it fails to meet the condition of 'positive-definiteness' (i.e. the Minkowski inner product of a vector with itself can be less than zero), but it can be treated as an inner product and used as such. The Minkowski inner product induces a Minkowski metric on Minkowski space, though just as the Minkowski inner product isn't a true inner product, the Minkowski metric isn't a true metric; again, we treat it as such.
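A short sketch of the failure of positive-definiteness. The sign convention (-, +, +, +), the component ordering (ct, x, y, z), the function name `minkowski_inner` and the sample vectors are all assumptions chosen here for illustration; the opposite overall sign is an equally common convention.

```python
def minkowski_inner(u, v):
    """Minkowski inner product; components ordered (ct, x, y, z).

    The signature (-, +, +, +) is an assumed convention.
    """
    return -u[0] * v[0] + u[1] * v[1] + u[2] * v[2] + u[3] * v[3]

timelike = (2.0, 1.0, 0.0, 0.0)
null = (1.0, 1.0, 0.0, 0.0)       # direction followed by a light ray
spacelike = (1.0, 2.0, 0.0, 0.0)

# A true inner product would give a positive number for every nonzero vector.
print(minkowski_inner(timelike, timelike))     # negative
print(minkowski_inner(null, null))             # zero for a nonzero vector
print(minkowski_inner(spacelike, spacelike))   # positive
```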

Yes, I do understand vector spaces and inner products. I am not however familiar with how inner products induce metrics on inner product spaces.

Doing some reading, I found that the magnitude of a vector in Euclidean geometry is an example of a metric, which can be represented by the standard ${\bar x^2 = x_1^2 + x_2^2 + x_3^2}$, where the metric is represented by the ${3 \times 3}$ identity matrix.

The Minkowski metric, as I understand it, is the ${4 \times 4}$ matrix $\left( \begin{array}{cccc} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{array} \right)$

Taking the Euclidean metric as example, that would make the Minkowski metric equivalently describing how to find the magnitude of a vector in Minkowski space. Is that correct?

My next question is how exactly does all of this relate back to the matrix $\Lambda = \left( \begin{array}{cc} \cosh\eta & -\sinh\eta \\ -\sinh\eta & \cosh\eta \end{array} \right)$ from my first post?

Can I not use the analogy with physical coordinate rotations to find the equivalent coordinate transformations of spacetime using the hyperbolic functions?

For instance, suppose I have two coordinate frames ${\bf{A}}$ and ${\bf{B}}$, with axes ${\widehat{a_1}}$ along the horizontal and ${\widehat{a_2}}$ along the vertical of ${\bf{A}}$, and likewise ${\widehat{b_1}}$ and ${\widehat{b_2}}$ for the ${\bf{B}}$ frame. Then if I rotate frame ${\bf{B}}$ counter-clockwise by some angle ${\theta}$, the rotation corresponds to
$$\left( \begin{array}{c} {\widehat{b_1}}\\ {\widehat{b_2}} \end{array} \right) = \left( \begin{array}{cc} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{array} \right) \left( \begin{array}{c} {\widehat{a_1}}\\ {\widehat{a_2}} \end{array} \right)$$

which comes from the simple geometry of finding the components of each axis, or equivalently from using the Euler formula ${e^{i\theta} = \cos\theta + i\sin\theta}$ to multiply out individual components.

What I tried to do originally was take the rigid coordinate transformation analogy and simply apply it to the geometry of hyperbolas to get the correct sign for each component of the rotated coordinate frame of spacetime. Can that not be done using geometry, or must I rely on the Minkowski metric, as was alluded to above, to somehow come to the correct signs on ${\bf{\Lambda}}$?

You're really taking the matrix representing a Euclidean rotation for granted. I ask you, how would you derive that from just, say, the notion of distance in Euclidean geometry?

Once you understand how that is derived, I think you'll find the idea of deriving the Lorentz transformation matrix to be roughly analogous to that.

Chestermiller
Mentor
You are very close to understanding what is going on. I've been developing a set of notes for people with an engineering background who wish to achieve rapid progress in understanding the basics of special relativity. I've been very unhappy with how the subject is presented in standard textbooks and in the Susskind lectures. The first two chapters contain most of the material of interest, and are presented in a way that engineers can understand. If you would like a copy of the first two chapters (Microsoft Word document), email me at [email protected].

Chet

I know that the two equations relating the coordinates come from the geometry of a triangle on the unit circle. What exactly do you mean when you say that I am taking the rotation for granted?

In terms of the definition of length, I would just use the fact that ${\cos\theta = \frac{\text{adj}}{\text{hyp}}}$ and ${\sin\theta = \frac{\text{opp}}{\text{hyp}}}$ to define the appropriate lengths along each axis.

For the unit hyperbola, I found I can do a similar geometric thing for rotations on the unit hyperbola. What am I missing in terms of matrix rotations?

I'm speaking in a more fundamental sense. When you speak of the unit circle, you already have an idea that there is a set of points equidistant from the origin. You need to already have a distance formula--a metric--to do this, and where in Euclidean geometry you get a circle, in Minkowski space you get a hyperbola.

But you're at that point already (I wasn't sure that you were), so that's fine. This is the point where I think a matter of interpretation is in order. Let's say you have a plane. You can talk about components of vectors in this plane by choosing a basis.

That is, say you have a vector $A$. One extracts components by taking dot products: $A^x = A \cdot e^x$, for example. But you can always choose a different basis. You can choose a new ${e^x}'$. With a metric--a notion of distance--you can ensure that this vector is normalized, and you can compute its components based on the unit set (the unit circle in Euclidean space, the unit hyperbola in Minkowski).

I feel this is really important: when you choose a new basis, you can express those vectors in terms of the old basis, and in Minkowski space, this is what introduces the hyperbolic trig functions characteristic of the Lorentz transformations.

Let me work the example: one chooses the basis $e_t, e_x$ to describe the xt plane in a Minkowski space--in particular, $e_t \cdot e_t = -1$ and $e_x \cdot e_x = 1$.

Now, we can change the basis describing this plane. Let our new time vector ${e_t}'$ be proportional to $e_t + \beta e_x$. This should describe someone whose worldline moves in the positive x direction. When we normalize this vector, we get factors of gamma: ${e_t}' = \gamma(e_t + \beta e_x)$.

Now, we need to find a corresponding ${e_x}'$ to go with this vector. To do that, we can invoke a process like the Gram-Schmidt procedure: ${e_x}'$ should be in the direction of $e_x - (e_x \cdot {e^t}')\,{e_t}'$, that is, $e_x$ with its part along the new time vector removed. Consider this an exercise to show that ${e_x}' = \gamma(e_x + \beta e_t)$.
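For anyone who wants to check the exercise numerically before doing it by hand, here is a hedged sketch. Vectors in the xt plane are stored as `(t, x)` component pairs with $e_t \cdot e_t = -1$; that storage choice, the helper `dot`, and the value of `beta` are assumptions made for illustration.

```python
import math

def dot(u, v):
    """Minkowski product in the x-t plane; vectors stored as (t, x)."""
    return -u[0] * v[0] + u[1] * v[1]

beta = 0.6                                   # illustrative velocity, c = 1
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)

e_t = (1.0, 0.0)
e_x = (0.0, 1.0)
e_t_new = (gamma, gamma * beta)              # gamma (e_t + beta e_x)

# Gram-Schmidt step: subtract from e_x its projection onto e_t_new.
# Dividing by dot(e_t_new, e_t_new) = -1 accounts for the timelike norm.
coeff = dot(e_x, e_t_new) / dot(e_t_new, e_t_new)
raw = tuple(a - coeff * b for a, b in zip(e_x, e_t_new))

# Normalize so that the new spatial vector has Minkowski norm +1.
norm = math.sqrt(dot(raw, raw))
e_x_new = tuple(c / norm for c in raw)

expected = (gamma * beta, gamma)             # gamma (e_x + beta e_t)
print(e_x_new, expected)                     # should agree
print(dot(e_x_new, e_t_new))                 # should be zero up to rounding
```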

Now then, take a position vector $x e_x + t e_t$. Evaluate its components in the x't' frame:

$$\begin{aligned} (x e_x + t e_t) \cdot {e^x}' &= (x e_x + t e_t) \cdot (\gamma e^x - \gamma \beta e^t) = \gamma (x - \beta t), \\ (x e_x + t e_t) \cdot {e^t}' &= (x e_x + t e_t) \cdot (\gamma e^t - \gamma \beta e^x) = \gamma (t - \beta x). \end{aligned}$$

This is a critical point: while ${e_x}' = \gamma e_x + \gamma \beta e_t$, the reciprocal basis vector ${e^x}' = \gamma e^x - \gamma \beta e^t$, and the same for the time vectors. This follows directly from the metric (but if you like, this can be more explicitly shown as well).

The components are now computed, and all that remains is to make the connection between $\gamma, \beta$ and the hyperbolic trig functions.
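One way to make that last connection explicit, as a sketch in units with c = 1 and assuming the parameter $\theta$ is defined by $\beta = \tanh\theta$ (the substitution mentioned earlier in the thread):

```latex
\begin{align}
\gamma &= \frac{1}{\sqrt{1-\beta^2}} = \frac{1}{\sqrt{1-\tanh^2\theta}} = \cosh\theta, &
\gamma\beta &= \cosh\theta\,\tanh\theta = \sinh\theta, \\
x' &= \gamma(x - \beta t) = x\cosh\theta - t\sinh\theta, &
t' &= \gamma(t - \beta x) = -x\sinh\theta + t\cosh\theta .
\end{align}
```

The last line reproduces the two equations from the first post, minus signs included.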