Showing that Lorentz transformations are the only ones possible

bob900
In a book ("The Special Theory of Relativity" by David Bohm) that I'm reading, it says that if (x,y,z,t) are coordinates in frame A, and (x',y',z',t') are coordinates in frame B moving with velocity v in relation to A, and if we have (for a spherical wavefront)

c^2t^2 - x^2 - y^2 - z^2 = 0

and we require that in frame B,

c^2t'^2 - x'^2 - y'^2 - z'^2 = 0

then it can be shown that the only possible transformations (x,y,z,t) -> (x',y',z',t') which leave the above relationship invariant are the Lorentz transformations (aside from rotations and reflections).

I'm wondering how exactly can this be shown?
 
Showing it for a general Lorentz-Herglotz transformation is really difficult; you should consider only a (Lorentzian) boost along Ox, for example, i.e. set y = 0 and z = 0.

You should then take x(x',t') and t(x',t') to be linear functions. Put in some unknown coefficients and then determine them from physical assumptions.
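For concreteness (a sketch not from the original post, with coefficient names chosen here), the linear ansatz for a boost along Ox would look like
$$x = A x' + B t',\qquad t = C x' + D t',\qquad y = y',\qquad z = z',$$
with the four constants A, B, C, D then fixed by the physical assumptions, e.g. that the origin of B moves as x = vt, that swapping the frames amounts to v → -v, and that ##c^2t^2 - x^2## is preserved.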
 
Last edited:
Any transformation of the form
$$\left[\begin{matrix} a & \sqrt{a^2-1}\\ \sqrt{a^2-1} & a \end{matrix}\right] \left[\begin{matrix}dt\\ dx\end{matrix}\right] = \left[\begin{matrix}dt'\\ dx'\end{matrix}\right]$$
will preserve ##-dt'^2 + dx'^2 = -dt^2 + dx^2##. More restraints than preserving the interval are needed.
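A quick symbolic check of this claim (a sketch added here, not part of the original post; it assumes sympy and the c = 1 convention used later in the thread):

```python
# Verify that this one-parameter family of matrices preserves
# -dt^2 + dx^2 for any a (with a^2 >= 1 so the square root is real).
import sympy as sp

a, dt, dx = sp.symbols('a dt dx', real=True)
L = sp.Matrix([[a, sp.sqrt(a**2 - 1)],
               [sp.sqrt(a**2 - 1), a]])
dtp, dxp = L * sp.Matrix([dt, dx])          # transformed (dt', dx')

print(sp.simplify((-dtp**2 + dxp**2) - (-dt**2 + dx**2)))   # -> 0
```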
 
Last edited:
Mentz114 said:
Any transformation of the form
$$\left[\begin{matrix} a & \sqrt{a^2-1}\\ \sqrt{a^2-1} & a \end{matrix}\right] \left[\begin{matrix}dt\\ dx\end{matrix}\right] = \left[\begin{matrix}dt'\\ dx'\end{matrix}\right]$$
will preserve ##-dt'^2 + dx'^2 = -dt^2 + dx^2##. More restraints than preserving the interval are needed.

Why wouldn't something as simple as:

##x' = x - k##

##t' = \sqrt{t^2-2kx+k^2}##

work (where k is some constant)?

Seems like that, and any other similarly arbitrary transformation could work...
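For what it's worth, a quick sympy check (added here, not part of the thread) shows that this candidate map really does satisfy t'^2 - x'^2 = t^2 - x^2 wherever t' is real (with c = 1), even though it is nonlinear and undefined where t^2 - 2kx + k^2 < 0, which is the kind of behaviour the book's "no singular points" requirement is meant to exclude:

```python
# Check bob900's candidate (c = 1): x' = x - k, t' = sqrt(t^2 - 2kx + k^2).
import sympy as sp

t, x, k = sp.symbols('t x k', real=True)
xp = x - k
tp_sq = t**2 - 2*k*x + k**2                 # (t')^2, valid only where >= 0

print(sp.simplify((tp_sq - xp**2) - (t**2 - x**2)))   # -> 0
```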
 
Mentz114 said:
[...]. More restraints than preserving the interval are needed.

Preserving the interval will ensure linearity of the transformations and only that.
 
dextercioby said:
Preserving the interval will ensure linearity of the transformations and only that.
Yes.
Taking the Taylor expansion of the matrix and dropping terms of order ##a^2## or greater gives the generator, I think. Exponentiating this gives a = cosh(something) but no idea what 'something' is. That's probably an illicit fudge, in any case.
 
dextercioby said:
Showing it for a general Lorentz-Herglotz transformation is really difficult; you should consider only a (Lorentzian) boost along Ox, for example, i.e. set y = 0 and z = 0.

So if given just the following pieces of information:

1. c^2 t^2 - x^2 - y^2 - z^2 = 0
2. c^2 t'^2 - x'^2 - y'^2 - z'^2 = 0

is it "difficult" or actually impossible to show that the Lorentz transformation is the only possibility (aside from rotation x^2+y^2+z^2=x'^2+y'^2+z'^2 and t=t', and reflection x=-x', t=-t', etc.)?

You should then take x(x',t') and t(x',t') to be linear functions. Put in some unknown coefficients and then determine them from physical assumptions.

That I know how to do - what I'm trying to see is if the book is wrong in saying that you only need 1 and 2 above. Here's a quote from the book:

The question then naturally arises as to whether there are any other transformations that leave the speed of light invariant. The answer is that if we make the physically reasonable requirement that the transformation possesses no singular points (so that it is everywhere regular and continuous) then it can be shown that the Lorentz transformations plus rotations plus reflections are the only ones that are possible.
 
I will use units such that c=1. I will also use the definition
$$\eta=\begin{pmatrix}-1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\end{pmatrix},$$ because I'm more used to this sign convention than the other one. The Minkowski form (pseudo-inner product) on ##\mathbb R^4## is defined by ##g(x,x)=x^T\eta x## for all ##x\in\mathbb R^4##. Define ##P=\{x\in\mathbb R^4|g(x,x)=0\}##. The OP is asking us to prove the following statement:
If ##g(\Lambda(x),\Lambda(x))=g(x,x)## for all ##x\in P##, then ##\Lambda## is a Lorentz transformation.​
I don't know how to do that. I don't even know if it's possible. But I can prove a similar theorem that starts with a stronger assumption:
If ##\Lambda## is linear and ##g(\Lambda x,\Lambda x)=g(x,x)## for all ##x\in\mathbb R^4##, then ##\Lambda## is a Lorentz transformation.​
Proof: Suppose that ##\Lambda## is linear and that ##g(\Lambda x,\Lambda x)=g(x,x)## for all ##x\in\mathbb R^4##. Let ##y,z\in\mathbb R^4## be arbitrary. We have $$g(\Lambda(y-z),\Lambda(y-z))=g(y-z,y-z).$$ If we expand this using the linearity of ##\Lambda## and the bilinearity of g, and use that ##g(\Lambda x,\Lambda x)=g(x,x)## for all ##x\in\mathbb R^4##, we see that ##g(\Lambda y,\Lambda z)=g(y,z)##. Since y,z are arbitrary, this means that we have proved the following statement: For all ##x,y\in\mathbb R^4##, ##g(\Lambda x,\Lambda y)=g(x,y)##. Now let ##x,y\in\mathbb R^4## be arbitrary. We have $$x^T\eta y=g(x,y)=g(\Lambda x,\Lambda y)=x^T\Lambda^T\eta\Lambda y.$$ Let ##\{e_\mu\}_{\mu=0}^3## be the standard basis for ##\mathbb R^4##. I will use the notation ##M_{\mu\nu}## for the component on row ##\mu##, column ##\nu##, of a matrix ##M##. For all ##\mu,\nu\in\{0,1,2,3\}##, we have
$$\eta_{\mu\nu}=e_\mu{}^T\eta e_\nu=e_\mu{}^T\Lambda^T\eta\Lambda e_\nu=(\Lambda^T\eta\Lambda)_{\mu\nu}.$$ So ##\Lambda^T\eta\Lambda=\eta##, and this means that ##\Lambda## is a Lorentz transformation.

(The definition of "Lorentz transformation" goes like this: A linear ##\Lambda:\mathbb R^4\to\mathbb R^4## is said to be a Lorentz transformation if ##\Lambda^T\eta\Lambda=\eta##).
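As a concrete illustration (a sympy sketch added here, not part of the original post), the standard x-boost satisfies the defining condition ##\Lambda^T\eta\Lambda=\eta##, and therefore preserves ##g(x,x)## for every x, not just for null x:

```python
# Check Lambda^T eta Lambda = eta for a standard x-boost (c = 1).
import sympy as sp

v = sp.symbols('v', real=True)
gamma = 1 / sp.sqrt(1 - v**2)
eta = sp.diag(-1, 1, 1, 1)
Lam = sp.Matrix([[gamma, -gamma*v, 0, 0],
                 [-gamma*v, gamma, 0, 0],
                 [0, 0, 1, 0],
                 [0, 0, 0, 1]])

print(sp.simplify(Lam.T * eta * Lam - eta))   # -> zero matrix
```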
 
Hm, it looks like I can also prove the following variant:
If ##\Lambda## is surjective, and ##g(\Lambda(x),\Lambda(y))=g(x,y)## for all ##x,y\in\mathbb R^4##, then ##\Lambda## is a Lorentz transformation.​
With these assumptions, I can prove linearity by messing around with the expression
$$g(x,ay+bz)=g(\Lambda(x),\Lambda(ay+bz)).$$
 
  • #10
Mentz114 said:
Any transformation of the form
$$\left[\begin{matrix} a & \sqrt{a^2-1}\\ \sqrt{a^2-1} & a \end{matrix}\right] \left[\begin{matrix}dt\\ dx\end{matrix}\right] = \left[\begin{matrix}dt'\\ dx'\end{matrix}\right]$$
will preserve ##-dt'^2 + dx'^2 = -dt^2 + dx^2##. More restraints than preserving the interval are needed.

Your a is just ##\gamma##. I don't think there is any additional constraint needed, other than that the transformations should take the t axis to a line x=vt for some real v, and also not flip the orientation of the positive t axis. (These criteria rule out a<1).
 
Last edited:
  • #11
bob900 said:
So if given just the following pieces of information:

1. c^2 t^2 - x^2 - y^2 - z^2 = 0
2. c^2 t'^2 - x'^2 - y'^2 - z'^2 = 0

is it "difficult" or actually impossible to show that the Lorentz transformation is the only possibility (aside from rotation x^2+y^2+z^2=x'^2+y'^2+z'^2 and t=t', and reflection x=-x', t=-t', etc.)?
Lorentz transformations are not the only possibility. The most general transformations of this kind are the conformal transformations. There's an older thread over in the tutorials forum which derives them, but from the point of view of finding transformations that leave the metric invariant up to a scale factor.

Alternatively, it is possible to find the conformal transformations by direct solution of the differential equations defining the transformation. The (messy, difficult) details can be found in Appendix A of this older text:

V. Fock, N. Kemmer (translator),
The theory of space, time and gravitation.
2nd revised edition. Pergamon Press, Oxford, London, New York, Paris (1964).

You might be able to access a copy at Library Genesis. ;-)

Fock also shows that if you assume only the relativity principle (equivalence of inertial observers) then the most general transformations are of linear-fractional form -- which are not the same as conformal transformations since the latter involve a quadratic denominator in general. But if we add the light principle (which is what you used above), then the "intersection" between linear-fractional and conformal transformations is indeed the Lorentz transformations.

Regarding what the book said about requiring that the transformations be well behaved everywhere: although the more general transformations are in general singular somewhere, there have been recent attempts to use them to construct foundations that might account for the success of the Lambda-CDM model in cosmology -- the singular part of the transformation only occurs at the radius of the universe. But this is probably a subject for the BTSM forum.
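To make the first point concrete (a sketch added here, not part of the original post): the simplest conformal transformation beyond the Poincaré group, a pure dilation x → s x, maps the null cone of conditions 1 and 2 to itself but is not a Lorentz transformation, since ##\Lambda^T\eta\Lambda = s^2\eta \neq \eta##:

```python
# A dilation preserves the light cone g(x,x) = 0 but fails Lambda^T eta Lambda = eta.
import sympy as sp

s = sp.symbols('s', positive=True)
eta = sp.diag(-1, 1, 1, 1)                  # c = 1 convention
D = s * sp.eye(4)

t, x, y, z = sp.symbols('t x y z', real=True)
X = sp.Matrix([t, x, y, z])
quad = lambda w: sp.expand((w.T * eta * w)[0])

print(sp.simplify(quad(D*X) - s**2 * quad(X)))   # -> 0: null vectors stay null
print(sp.simplify(D.T * eta * D - eta))          # -> (s^2 - 1)*eta, nonzero for s != 1
```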
 
  • #12
Fredrik said:
So ##\Lambda^T\eta\Lambda=\eta##, and this means that ##\Lambda## is a Lorentz transformation.

(The definition of "Lorentz transformation" goes like this: A linear ##\Lambda:\mathbb R^4\to\mathbb R^4## is said to be a Lorentz transformation if ##\Lambda^T\eta\Lambda=\eta##).

But how is this definition of the Lorentz transformation equivalent to the "standard" definition:

$$x' = \frac{x-vt}{\sqrt{1-v^2}},\qquad t' = \frac{t-vx}{\sqrt{1-v^2}},\qquad y'=y,\qquad z'=z$$

?
 
  • #13
bcrowell said:
Your a is just ##\gamma##. I don't think there is any additional constraint needed, other than that the transformations should take the t axis to a line x=vt for some real v, and also not flip the orientation of the positive t axis. (These criteria rule out a<1).
No, a is not ##\gamma##. It can be anything you like. The values a < 1 are ruled out because we want a real result. This transformation keeps the interval invariant. It's still a long way short of the LT.

Fredrik's calculation shows it's not trivial to get the LT from a few assumptions.
 
  • #14
bob900 said:
But how is this definition of the Lorentz transformation equivalent to the "standard" definition:

$$x' = \frac{x-vt}{\sqrt{1-v^2}},\qquad t' = \frac{t-vx}{\sqrt{1-v^2}},\qquad y'=y,\qquad z'=z$$


This isn't the most general Lorentz transformation. This is just a boost in the x direction. A Lorentz transformation is a member of the Lorentz group, and the Lorentz group includes parity (=reversal of the spatial axes), time reversal, rotations and boosts (in arbitrary directions).

Let's consider the 1+1 dimensional case. If we write
$$\Lambda=\begin{pmatrix}a & b\\ c & d\end{pmatrix},$$ we can easily see that a≠0 and that c/a=-v. To see this, first note that
$$\Lambda\begin{pmatrix}1\\ 0\end{pmatrix}=\begin{pmatrix}a\\ c\end{pmatrix}.$$ In my notation, the upper component of a 2×1 matrix is the time coordinate. I will refer to it as "the 0 component", and the lower component, i.e. the spatial coordinate, as "the 1 component". I will also number the rows and columns of my 2×2 matrices from 0 to 1. For example, the 01 component of ##\Lambda## is b. If a=0, then the result above tells us that ##\Lambda## takes the time axis of the "old" coordinate system to the spatial axis of the "new" coordinate system. This corresponds to an infinite velocity difference, because the time axis is the (coordinate representation of) the world line of an object with velocity 0, and the spatial axis is the (coordinate representation of) the world line of an object with infinite velocity. This is why we can rule out a=0. This allows us to take a outside the coordinate matrix.
$$\Lambda\begin{pmatrix}1\\ 0\end{pmatrix}=\begin{pmatrix}a\\ c\end{pmatrix} =a\begin{pmatrix}1\\ c/a\end{pmatrix}.$$ Now we can interpret c/a as -v, because we know that ##\Lambda## maps the time axis to the line
$$t\mapsto t\begin{pmatrix}1 \\ -v\end{pmatrix},$$ i.e. the line with x'=-vt'.

Now consider the effect of ##\Lambda## on two coordinate matrices ##\begin{pmatrix}t_1\\ 0\end{pmatrix}## and ##\begin{pmatrix}t_2\\ 0\end{pmatrix}## with ##t_1<t_2##.
\begin{align}
\Lambda\begin{pmatrix}t_1\\ 0\end{pmatrix} =\begin{pmatrix}at_1\\ ct_1\end{pmatrix},\qquad
\Lambda\begin{pmatrix}t_2\\ 0\end{pmatrix} =\begin{pmatrix}at_2\\ ct_2\end{pmatrix}\end{align} The 0 components of the new coordinate pairs are ##at_1## and ##at_2## respectively. If a>0, then ##t_1<t_2## implies that ##at_1<at_2##, but if a<0, then ##t_1<t_2## implies that ##at_1>at_2##. So ##\Lambda## preserves the temporal order of events on the 0 axis when a>0, and reverses them when a<0. To get the specific result you want, we need to assume that a>0. We are now dealing with an orthochronous Lorentz transformation.

A similar argument shows that ##\Lambda## preserves the order of events on the spatial axis when d>0 and reverses them when d<0. So we also assume that d>0. We are now dealing with a proper Lorentz transformation. A Lorentz transformation that's both proper and orthochronous is sometimes called a restricted Lorentz transformation.

Because of the above, we will write ##\Lambda## as
$$\Lambda=\gamma\begin{pmatrix}1 & \alpha\\ -v & \beta\end{pmatrix},$$ where ##\gamma,\beta>0##.
\begin{align}
\begin{pmatrix}-1 & 0\\ 0 & 1\end{pmatrix} &=\eta =\Lambda^T\eta\Lambda =\gamma^2\begin{pmatrix}1 & -v\\ \alpha & \beta\end{pmatrix}\begin{pmatrix}-1 & 0\\ 0 & 1\end{pmatrix}\begin{pmatrix}1 & \alpha\\ -v & \beta\end{pmatrix}\\
&=\gamma^2\begin{pmatrix}1 & -v\\ \alpha & \beta\end{pmatrix}\begin{pmatrix}-1 & -\alpha\\ -v & \beta\end{pmatrix} =\gamma^2\begin{pmatrix}-1+v^2 & -\alpha-v\beta\\ -\alpha-v\beta & -\alpha^2+\beta^2\end{pmatrix}
\end{align}
The 00 component of this equality tells us that ##-1=\gamma^2(-1+v^2)##, which implies both that |v|<1 (because ##\gamma^2>0##) and that $$\gamma=\frac{1}{\sqrt{1-v^2}}.$$ The 01 and 10 components both tell us that ##\gamma^2(-\alpha-v\beta)=0##, which implies that ##\alpha=-v\beta##. The 11 component tells us that ##\gamma^2(-\alpha^2+\beta^2)=1##. So
\begin{align}1-v^2 &=\frac{1}{\gamma^2} =-\alpha^2+\beta^2 =-\beta^2v^2+\beta^2=\beta^2(1-v^2)\\
\beta &=1\\
\alpha &= -v\beta=-v.
\end{align} So our final result for ##\Lambda## is
$$\Lambda=\frac{1}{\sqrt{1-v^2}}\begin{pmatrix}1 & -v\\ -v & 1\end{pmatrix}.$$
If you prefer to write this out as a system of equations,
$$\begin{pmatrix}t'\\ x'\end{pmatrix}=\gamma\begin{pmatrix}1 & -v\\ -v & 1\end{pmatrix}\begin{pmatrix}t\\ x\end{pmatrix}=\gamma\begin{pmatrix}t-vx\\ -vt+x\end{pmatrix}$$ \begin{align}
t' &=\gamma(t-vx)\\
x' &=\gamma(x-vt).
\end{align}
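As a sanity check on the algebra above (a sympy sketch added here, not part of the original post), substituting ##\gamma=1/\sqrt{1-v^2}##, ##\alpha=-v##, ##\beta=1## back into the 00, 01 and 11 component equations makes all of them vanish:

```python
# Verify that gamma = 1/sqrt(1-v^2), alpha = -v, beta = 1 solve the
# component equations of Lambda^T eta Lambda = eta derived above.
import sympy as sp

v, alpha, beta, gamma = sp.symbols('v alpha beta gamma', real=True)

eq00 = gamma**2 * (-1 + v**2) + 1            # from the 00 component
eq01 = gamma**2 * (-alpha - v*beta)          # from the 01 (and 10) component
eq11 = gamma**2 * (-alpha**2 + beta**2) - 1  # from the 11 component

vals = {gamma: 1/sp.sqrt(1 - v**2), alpha: -v, beta: 1}
print([sp.simplify(e.subs(vals)) for e in (eq00, eq01, eq11)])   # -> [0, 0, 0]
```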
 
Last edited:
  • #15
Mentz114 said:
No, a is not ##\gamma##. It can be anything you like. The values a < 1 are ruled out because we want a real result. This transformation keeps the interval invariant. It's still a long way short of the LT.

Fredrik's calculation shows it's not trivial to get the LT from a few assumptions.

No, a is not simply anything you like. It's gamma. You can tell it's gamma because of the transformation's action on the t axis, which slants it with a slope v. Fredrik's calculation is unnecessarily complicated.
 
  • #16
bcrowell said:
No, a is not simply anything you like. It's gamma. You can tell it's gamma because of the transformation's action on the t axis, which slants it with a slope v. Fredrik's calculation is unnecessarily complicated.

Even if a were 1/v (say) the proper length would be invariant. a=γ and a=cosh(R) are two special cases. I'm addressing the question whether preserving the proper interval is sufficient to get the LT - and I'm asserting it is not.
 
  • #17
Mentz114 said:
Even if a were 1/v (say) the proper length would be invariant. a=γ and a=cosh(R) are two special cases. I'm addressing the question whether preserving the proper interval is sufficient to get the LT - and I'm asserting it is not.

You can't have a be anything but gamma, because v is defined by the action of the LT on the positive t axis.

Mentz114 said:
No, a is not ##\gamma##. The values a < 1 are ruled out because we want a real result.

Actually this only rules out ##-1 < a < 1##. What rules out all values of a<1 is the definition of v.
 
  • #18
It is always assumed that the transformation is linear (at least if the origin is mapped to the origin, otherwise affine). But what is the physical reason for this assumption?
 
Last edited:
  • #19
bcrowell said:
You can't have a be anything but gamma, because v is defined by the action of the LT on the positive t axis.
..
..
Actually this only rules out ##-1 < a < 1##. What rules out all values of a<1 is the definition of v.
Ben, I think we're talking past each other so I'll let it go now.
 
  • #20
bcrowell said:
Fredrik's calculation is unnecessarily complicated.
In what way? What part of it can be simplified?
 
Last edited:
  • #21
Erland said:
It is always assumed that the transformation is linear (at least if the origin is mapped to the origin, otherwise affine). But what is the physical reason for this assumption?
The idea is that for each inertial (=non-accelerating) observer, there's a coordinate system in which the observer's own motion is described by the time axis, and the motion of any non-accelerating object is described by a straight line. So a function that changes coordinates from one of these coordinate systems to another must take straight lines to straight lines.
 
  • #22
Fredrik said:
The idea is that for each inertial (=non-accelerating) observer, there's a coordinate system in which the observer's own motion is described by the time axis, and the motion of any non-accelerating object is described by a straight line. So a function that changes coordinates from one of these coordinate systems to another must take straight lines to straight lines.
Hmm, are you saying something like that a map between vector spaces that takes lines to lines must be linear, or affine?
Well, that's certainly not true in one dimension, where the map f(x)=x^3 maps the entire line onto itself without being linear, or affine.
But perhaps in higher dimensions...? Is there a theorem of this kind?
 
  • #23
bob900 said:
In a book ("The Special Theory of Relativity" by David Bohm) that I'm reading, it says that if (x,y,z,t) are coordinates in frame A, and (x',y',z',t') are coordinates in frame B moving with velocity v in relation to A, and if we have (for a spherical wavefront)

c^2t^2 - x^2 - y^2 - z^2 = 0

and we require that in frame B,

c^2t'^2 - x'^2 - y'^2 - z'^2 = 0

then it can be shown that the only possible transformations (x,y,z,t) -> (x',y',z',t') which leave the above relationship invariant are the Lorentz transformations (aside from rotations and reflections).

I'm wondering how exactly can this be shown?
I don't think Bohm said this! The Lorentz group is a subgroup of a bigger group called the conformal group. It is the conformal group that preserves the light-cone structure.

Sam
 
  • #24
Didn't the paper that Ben mentioned in another thread, http://arxiv.org/abs/physics/0302045, go through all this?

The assumptions that that paper made were (skimming)

* replacing v with -v must invert the transform
* isotropy
* homogeneity of space and time

with a few tricks along the way:
* adding a third frame
* noting that x=vt implies x'=0

The result was pretty much that there must be some invariant velocity that was the same for all observers. (There were some arguments about the sign of a constant before this to establish that it was positive.) The remaining step is to identify this with the speed of light.
 
  • #25
bob900 said:
So if given just the following pieces of information:

1. c^2 t^2 - x^2 - y^2 - z^2 = 0
2. c^2 t'^2 - x'^2 - y'^2 - z'^2 = 0

is it "difficult" or actually impossible to show that the Lorentz transformation is the only possibility (aside from rotation x^2+y^2+z^2=x'^2+y'^2+z'^2 and t=t', and reflection x=-x', t=-t', etc.)?
That I know how to do - what I'm trying to see is if the book is wrong in saying that you only need 1 and 2 above. Here's a quote from the book :


Now Bohm is making sense.

see post #9 in
www.physicsforums.com/showthread.php?t=420204
 
  • #26
Erland said:
It is always assumed that the transformation is linear (at least if the origin is mapped to the origin, otherwise affine). But what is the physical reason for this assumption?
The most common reason is so-called homogeneity of space and time. By this, the authors mean that position-dependent (and time-dependent) dilations (scale changes) are ruled out arbitrarily.

Personally, I prefer a different definition of spacetime homogeneity: i.e., that it should look the same wherever and whenever you are. IOW, it must be a space of constant curvature.
This includes such things as deSitter spacetime, and admits a larger class of possibilities.

But another way that various authors reach the linearity assumption is to start with the most general transformations preserving inertial motion, which are fractional-linear transformations. (These are the most general transformations which map straight lines to straight lines -- see note #1.) They then demand that the transformations must be well-defined everywhere, which forces the denominator in the FL transformations to be restricted to a constant, leaving us with affine transformations.

In the light of modern cosmology, these arbitrary restrictions are becoming questionable.

--------
Note #1: a simpler version of Fock's proof can be found in Appendix B of this paper:
http://arxiv.org/abs/gr-qc/0703078 by Guo et al.

An even simpler proof for the case of 1+1D can also be found in Appendix 1 of this paper:
http://arxiv.org/abs/physics/9909009 by Stepanov. (Take the main body of this paper with a large grain of salt, but his Appendix 1 seems to be ok, though it still needs the reader to fill in some of the steps -- speaking from personal experience. :-)
 
Last edited by a moderator:
  • #27
Erland said:
Hmm, are you saying something like that a map between vector spaces that takes lines to lines must be linear, or affine?
Well, that's certainly not true in one dimension, where the map f(x)=x^3 maps the entire line onto itself without being linear, or affine.
But perhaps in higher dimensions...? Is there a theorem of this kind?
The only book I know that suggests that there is such a theorem left the proof as an exercise. I tried to prove it a couple of years ago, but got stuck and put it aside. I just tried again, and I still don't see how to do it. It's pretty annoying. Three distinct vectors x,y,z are said to be collinear if they're on the same straight line. So x,y,z are collinear if and only if they're all different and there's a number a such that ##z=x+a(y-x)##, right? Note that the right-hand side is ##=(1-a)x+ay##. So three vectors are collinear if and only if they're all different and (any) one of them can be expressed as this special type of linear combination of the other two.

A transformation ##T:U\to V## is said to preserve collinearity if for all collinear x,y,z in U, Tx,Ty,Tz are collinear.

It's trivial to prove that linear maps preserve collinearity. Since ##T(ax+by)=aTx+bTy## for all a,b, we have ##T((1-a)x+ay)=(1-a)Tx+aTy## for all a.

I still haven't been able to prove that if T preserves collinearity, T is linear. Suppose that T preserves collinearity. Let x,y be arbitrary vectors and a,b arbitrary numbers. One idea I had was to rewrite ##T(ax+by)=T(ax+(1-a)z)##. All I have to do is to define ##z=by/(1-a)##. But this is a lot less rewarding than I hoped. All we can say now is that there's a number c such that
$$T(ax+by)=cTx+(1-c)Tz =cTx+(1-c)T\left(\frac{by}{1-a}\right).$$ The fact that we can't even carry the numbers a,b over to the right-hand side is especially troubling. I don't know, maybe I've misunderstood a definition or something.

The book I'm talking about is "Functional analysis: Spectral theory" by Sunder. It can be downloaded legally from the author's web page. Scroll down to the first horizontal line to find the download link. See exercise 1.3.1 (2) on page 9 (in the pdf, it may be on another page in the actual book). Edit: Direct link to the pdf.
 
Last edited:
  • #28
Fredrik said:
The only book I know that suggests that there is such a theorem left the proof as an exercise. I tried to prove it a couple of years ago, but got stuck and put it aside. I just tried again, and I still don't see how to do it. It's pretty annoying. [...]
I guess you didn't see my previous post #27, huh? :-)
 
  • #29
strangerep said:
I guess you didn't see my previous post #27, huh? :-)
Not until after I posted. I'm checking out those appendices now. I guess Sunder's exercise is just wrong then. No wonder I found it so hard to solve. :smile:
 
  • #30
Fredrik said:
I guess Sunder's exercise is just wrong then.
He's restricting himself to the case of vector spaces and linear transformations between them. But the more general case involves differentiable coordinate transformations on a more general manifold -- which is a different problem.

Edit: looking at his exercise, I think he means "##x,y,z## in ##V##", meaning that ##x,y,z## are vectors in ##V##. So the "straight line" also includes the origin. That makes his exercise almost trivial because "being on a straight line" means that the vectors are all simple multiples of each other (i.e., they're on the same ray), and linear transformations preserve this.

But this is somewhat tangential to the current issue since in relativity we want something more general which preserves continuous inertial motion.
 
Last edited:
  • #31
strangerep said:
He's restricting himself to the case of a vector space and linear transformations between them. But the more general case involves differentiable coordinate transformations on a more general manifold -- which is a different problem.

Edit: looking at his exercise, I think he means "##x,y,z## in ##V##", meaning that ##x,y,z## are vectors in ##V##. So the "straight line" also includes the origin.
Exercise 1.3.1 (2) is asking the reader to prove that if T (defined on ##\mathbb R^n##) takes straight lines to straight lines, then T is linear. The exercise also says something about mapping the domain onto W, but W is not defined. If he meant that W is the domain, he's also assuming that T is surjective.

I think he just meant that x,y,z are on the same line, not that they're all on the same line through the origin.
 
  • #32
strangerep said:
Note #1: a simpler version of Fock's proof can be found in Appendix B of this paper:
http://arxiv.org/abs/gr-qc/0703078 by Guo et al.

An even simpler proof for the case of 1+1D can also be found in Appendix 1 of this paper:
http://arxiv.org/abs/physics/9909009 by Stepanov. (Take the main body of this paper with a large grain of salt, but his Appendix 1 seems to be ok, though it still needs the reader to fill in some of the steps -- speaking from personal experience. :-)
I started reading this, but so far I don't understand any of it. In the first one, the first thing the authors say after the word "Proof:" makes absolutely no sense to me. I don't understand anything in the first equation. I don't even understand if he's multiplying numbers with vectors (in that case, why does the last term look like a number?) or if it's a function taking a vector as input. It never ceases to amaze me how badly written published articles can be.

In the second one, I apply the chain rule to ∂f/∂t' and there appears a factor of ∂x/∂t' that I don't see how to deal with, so I don't understand (35). I guess I need to refresh my memory about partial derivatives of multivariable inverses.
 
Last edited by a moderator:
  • #33
Fredrik said:
I started reading this [Guo?], but so far I don't understand any of it. In the first one, the first thing the authors say after the word "Proof:" makes absolutely no sense to me. I don't understand anything in the first equation. I don't even understand if he's multiplying numbers with vectors (in that case, why does the last term look like a number?) or if it's a function taking a vector as input. It never ceases to amaze me how badly written published articles can be.
Yeah, it took me several months (elapsed time) before I understood what's going on here. You should see the original version in Fock's textbook -- it's even more obscure.

The crucial idea here is that the straight line is being parameterized in terms of an arbitrary real ##\lambda##. Also think of ##x_0^i## as an arbitrary point on the line so that ##\lambda## and ##v^i## generate the whole line. Then they adopt a confusing notation that ##x## is an abbreviation for the 3-vector with components ##x^i##. Using a bold font would have been more helpful.

But persevering with their notation, ##x = x(\lambda) = x_0 + \lambda v##. Since we want the transformed ##x'^{\,i}## to be a straight line also, in general parameterized by a different ##\lambda'## and ##v'##, we can write
$$
x'^{\,i}(x) ~=~ x'^{\,i}(x_0) ~+~ \lambda'(\lambda) \, v'^{\,i}
$$
where the first term on the RHS is to be understood as what ##x_0## is mapped into. I.e., think of ##x'^{\,i}## as a mapping. It might have been more transparent if they'd written ##x'^{\,i}_0## and then explained why this can be expressed as ##x'^{\,i}(x_0)##.

Confusing? Yes, I know that only too well. I guess it becomes second nature when one is working in this way all the time. Fock also does a lot of this sort of thing.

In the second one [Stepanov], I apply the chain rule to ∂f/∂t' and there appears a factor of ∂x/∂t' that I don't see how to deal with, so I don't understand (35).
Denoting partial derivatives by suffices in the same way as Stepanov does,
$$
dx' = df = f_x dx + f_t dt = (f_x u + f_t) dt ~;~~~~~~
dt' = dg = g_x dx + g_t dt = (g_x u + g_t) dt ~;
$$
and so Stepanov's (35) is obtained by
$$
u' ~=~ dx'/dt' ~=~ \frac{f_x u + f_t}{g_x u + g_t} ~.
$$
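The same bookkeeping can be checked mechanically (a sympy sketch added here, not part of the original post, treating the differentials as plain symbols exactly as in the calculation above):

```python
# dx' = f_x dx + f_t dt, dt' = g_x dx + g_t dt, with dx = u dt,
# gives u' = dx'/dt' = (f_x u + f_t)/(g_x u + g_t), i.e. Stepanov's (35).
import sympy as sp

f_x, f_t, g_x, g_t, u, dt = sp.symbols('f_x f_t g_x g_t u dt')

dx  = u * dt
dxp = f_x * dx + f_t * dt
dtp = g_x * dx + g_t * dt

print(sp.simplify(dxp/dtp - (f_x*u + f_t)/(g_x*u + g_t)))   # -> 0
```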

[Edit: I have more detailed writeups of both proofs where I try to fill in some of these gaps, but they're in ordinary latex, not PF latex. If you get stuck, I could maybe post a pdf.]
 
Last edited:
  • #34
strangerep said:
Yeah, it took me several months (elapsed time) before I understood what's going on here. You should see the original version in Fock's textbook -- it's even more obscure.

The crucial idea here is that the straight line is being parameterized in terms of an arbitrary real ##\lambda##. Also think of ##x_0^i## as an arbitrary point on the line so that ##\lambda## and ##v^i## generate the whole line. Then they adopt a confusing notation that ##x## is an abbreviation for the 3-vector with components ##x^i##. Using a bold font would have been more helpful.

But persevering with their notation, ##x = x(\lambda) = x_0 + \lambda v##. Since we want the transformed ##x'^{\,i}## to be a straight line also, in general parameterized by a different ##\lambda'## and ##v'##, we can write
$$
x'^{\,i}(x) ~=~ x'^{\,i}(x_0) ~+~ \lambda'(\lambda) \, v'^{\,i}
$$
where the first term on the RHS is to be understood as what ##x_0## is mapped into. I.e., think of ##x'^{\,i}## as a mapping. It might have been more transparent if they'd written ##x'^{\,i}_0## and then explained why this can be expressed as ##x'^{\,i}(x_0)##.

Confusing? Yes, I know that only too well. I guess it becomes second nature when one is working in this way all the time. Fock also does a lot of this sort of thing.
Thanks for explaining. I think I understand now. This notation is so bad it's almost funny. The coordinate transformation takes the straight line ##t\mapsto x_0+tv## to a straight line ##t\mapsto \Lambda(x_0)+tu##, where ##\Lambda## denotes the coordinate transformation and u denotes a tangent vector to the new straight line. That much is clear. Now it would make sense to write ##x_0'## instead of ##\Lambda(x_0)##, but these guys denote the components of this vector by ##x'^i(x_0)##??! I guess for all y, x'(y) should be read as "the primed coordinates of the event whose unprimed coordinates are y".

It doesn't make a lot of sense to put a prime on the λ, but I guess they're doing it as a reminder that if the old straight line is the map B defined by ##B(\lambda)=x_0+\lambda v##, then the new straight line isn't necessarily ##\Lambda\circ B##. It could be ##\Lambda\circ B\circ f##, where f is a "reparametrization". I really don't like that they write v' for the vector I denoted by u, because it suggests that ##v'=\Lambda v##.

I realized something interesting when I looked at the statement of the theorem they're proving. They're saying that if ##\Lambda## takes straight lines to straight lines, there's a 4×4 matrix A, two 4×1 matrices y,z, and a number c, such that
$$\Lambda(x)=\frac{Ax+y}{z^Tx+c}.$$
If we just impose the requirement that ##\Lambda(0)=0##, we get y=0. And if z≠0, there's always an x such that the denominator is 0. So if we also require that ##\Lambda## must be defined on all of ##\mathbb R^4##, then the theorem says that ##\Lambda## must be linear. Both of these requirements are very natural if what we're trying to do is to explain e.g. what the principle of relativity suggests about theories of physics that use ##\mathbb R^4## as a model of space and time.
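A small sympy sketch (added here, not from the original post, with arbitrarily chosen numbers) illustrating the statement of the theorem in two dimensions: a fractional-linear map sends a parametrized straight line to another straight line, wherever the denominator is nonzero:

```python
# Lambda(x) = (A x + y)/(z^T x + c) maps the line x0 + lam*v to a straight line:
# all image points are collinear (the 2x2 determinant of differences vanishes).
import sympy as sp

lam = sp.symbols('lambda', real=True)
A  = sp.Matrix([[2, 1], [0, 3]])
y  = sp.Matrix([1, -1])
z  = sp.Matrix([1, 2])
c  = 5
x0 = sp.Matrix([1, 1])
v  = sp.Matrix([1, -2])

def Lam(x):
    return (A*x + y) / ((z.T * x)[0] + c)

p0, p1, p = Lam(x0), Lam(x0 + v), Lam(x0 + lam*v)
print(sp.simplify(sp.Matrix.hstack(p - p0, p1 - p0).det()))   # -> 0
```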

strangerep said:
Denoting partial derivatives by suffices in the same way as Stepanov does,
$$
dx' = df = f_x dx + f_t dt = (f_x u + f_t) dt ~;~~~~~~
dt' = dg = g_x dx + g_t dt = (g_x u + g_t) dt ~;
$$
and so Stepanov's (35) is obtained by
$$
u' ~=~ dx'/dt' ~=~ \frac{f_x u + f_t}{g_x u + g_t} ~.
$$
Cool. This doesn't look rigorous, because dx and dt are independent variables when they first appear in this calculation, and then you use dx/dt=u. But it's certainly enough to convince me that the result is correct.

strangerep said:
[Edit: I have more detailed writeups of both proofs where I try to fill in some of these gaps, but they're in ordinary latex, not PF latex. If you get stuck, I could maybe post a pdf.]
Thanks for the offer. I'm not sure I'll have the time to look at this. I have to go to bed now, and I will be very busy in the near future. Actually, I think that for now, I'll just try to figure out the best way to use the two additional assumptions I suggested above to simplify the problem.
 
  • #35
Fredrik said:
I realized something interesting when I looked at the statement of the theorem they're proving. They're saying that if ##\Lambda## takes straight lines to straight lines, there's a 4×4 matrix A, two 4×1 matrices y,z, and a number c, such that
$$\Lambda(x)=\frac{Ax+y}{z^Tx+c}.$$
If we just impose the requirement that ##\Lambda(0)=0##, we get y=0. And if z≠0, there's always an x such that the denominator is 0. So if we also require that ##\Lambda## must be defined on all of ##\mathbb R^4##, then the theorem says that ##\Lambda## must be linear.
Yes, that's what I tried to explain in earlier posts.
Both of these requirements are very natural if what we're trying to do is to explain e.g. what the principle of relativity suggests about theories of physics that use ##\mathbb R^4## as a model of space and time.
But it gets trickier if you take a more physics-first approach to the foundations: by itself the relativity principle doesn't give you (flat) ##\mathbb R^4## as a model of space and time -- you've got to make some other assumptions about omnipresent rigid rods and standard clocks which might not be so reasonable in the large.
for now, I'll just try to figure out the best way to use the two additional assumptions I suggested above to simplify the problem.
If you mean "just assume linearity", the best physicist-oriented proof I've seen is in Rindler's SR textbook.
 
  • #36
strangerep said:
If you mean "just assume linearity", the best physicist-oriented proof I've seen is in Rindler's SR textbook.
I meant that I would like to prove that if ##\Lambda## is a permutation of ##\mathbb R^4## (or ##\mathbb R^2##) that takes straight lines to straight lines, and 0 to 0, then ##\Lambda## is linear. I think I know how to do the rest after that, at least in 1+1 dimensions.
 
  • #37
It is shown by geometrical inspection that the Lorentz transformation is the only solution that accounts for the invariant speed of light. We begin with a graphical representation of three examples of observers moving at arbitrarily selected different speeds with respect to the black inertial frame of reference. The speed of light in the black inertial reference system is already known to have the value of c and is represented by the world line of a single photon (green line slanted at an angle of 45 degrees in the black frame).
[Attached image: Lorentz_2A.jpg]

Next, we inquire as to what orientation of the X1 axis for each observer we must have for the speed of light to be invariant among the inertial frames. By trial and error inspection we can only have those orientations of the X1 axis for which the photon world line bisects the angle between the X1 axis and the X4 axis as shown below.

[Attached image: Lorentz_2B.jpg]

So, based on this result we wish to derive the coordinate transformations between any two arbitrarily selected frames. Again by geometric inspection we identify a right triangle for which we can apply the Pythagorean Theorem. Notice that we have selected two of the moving observer frames, entirely arbitrarily, and then found a new black inertial frame for which two other inertial frames are moving in opposite directions with the same speed. This is a perfectly general situation, since for any pair of observers moving relative to each other, you can always find such a reference frame. Having derived the time dilation, the result for length contraction can easily be shown by similar triangle inspection.

[Attached image: Lorentz_Derivation_C.jpg]
 
Last edited:
  • #38
strangerep said:
Yeah, it took me several months (elapsed time) before I understood what's going on here. You should see the original version in Fock's textbook -- it's even more obscure.

The crucial idea here is that the straight line is being parameterized in terms of an arbitrary real ##\lambda##. Also think of ##x_0^i## as an arbitrary point on the line so that ##\lambda## and ##v^i## generate the whole line. Then they adopt a confusing notation that ##x## is an abbreviation for the 3-vector with components ##x^i##. Using a bold font would have been more helpful.

But persevering with their notation, ##x = x(\lambda) = x_0 + \lambda v##. Since we want the transformed ##x'^{\,i}## to be a straight line also, in general parameterized by a different ##\lambda'## and ##v'##, we can write
$$
x'^{\,i}(x) ~=~ x'^{\,i}(x_0) ~+~ \lambda'(\lambda) \, v'^{\,i}
$$
where the first term on the RHS is to be understood as what ##x_0## is mapped into. I.e., think of ##x'^{\,i}## as a mapping. It might have been more transparent if they'd written ##x'^{\,i}_0## and then explained why this can be expressed as ##x'^{\,i}(x_0)##.

Confusing? Yes, I know that only too well. I guess it becomes second nature when one is working in this way all the time. Fock also does a lot of this sort of thing.
Ok, but a little bit further down in the proof, the author seems to use this, which is based upon a particular representation of a particular line, to draw conclusions about other lines at other positions; this is where he introduces a function f(x,v), and I don't understand this at all.

And still, the conclusion of the theorem seems wrong to me. It is nowhere stated that we must have n>1, and for n=1, the function f(x)=x^3+x seems to contradict the theorem, since it is a differentiable bijection from R (a line) onto itself, with a differentiable inverse, but f does not have the required form.
 
  • #39
Erland said:
Ok, but a little bit further down in the proof, the author [Guo et al] seems to use this, which is based upon a particular representation of a particular line, to draw conclusions about other lines at other positions, it is where he introduces a function f(x,v), and I don't understand this at all.
From their equation
$$
v^j v^k \, \frac{\partial^2 x'^{\,i}}{\partial x^j \partial x^k}
(x_0 + \lambda v)
~=~ v^j\,\frac{\partial x'^{\,i}}{\partial x^j} \,
\frac{\,\frac{d^2 \lambda'}{d\lambda^2}\,}{d\lambda'/d\lambda} ~,
$$
we see that ##\frac{d^2 \lambda'}{d\lambda^2}/\frac{d\lambda'}{d\lambda}## at ##(x^i)## depends not only on ##x^i## but also on ##v^i##. Therefore, there must exist a function ##f(x,v)## such that
$$
v^j v^k \, \frac{\partial^2 x'^{\,i}}{\partial x^j \partial x^k}
~=~ v^j \, \frac{\partial x'^{\,i}}{\partial x^j} \,f(x,v) ~.
$$
Strictly, ##f(x,v)## also depends on ##\lambda##, but this dependence is suppressed in the notation here, since we only need the fact that ##f## depends at least on ##x## and ##v##.
And still, the conclusion of the theorem seems wrong to me. It is nowhere stated that we must have n>1, and for n=1, the function f(x)=x^3+x seems to contradict the theorem, [...]
No, that's ##n=2##, not ##n=1##.
Think of the (x,y) plane. A straight line on this plane can be expressed as
$$
y ~=~ y(x) ~=~ y_0 + s x ~.
$$ for some constants ##y_0## and ##s##.
Alternatively, the same straight line can be expressed in terms of a parameter ##\lambda## and constants ##v_x, v_y## as
$$
y = y(\lambda) ~=~ y_0 + \lambda v_y ~,~~~~~
x = x(\lambda) ~=~ \lambda v_x ~,
$$ and eliminating ##\lambda## gives the previous form, with ##s = v_y/v_x##.
That's what's going on here: straight lines are expressed in the parametric form. Your cubic cannot be expressed in this form, hence is in no sense a straight line.
 
  • #40
strangerep said:
From their equation
$$
v^j v^k \, \frac{\partial^2 x'^{\,i}}{\partial x^j \partial x^k}
(x_0 + \lambda v)
~=~ v^j\,\frac{\partial x'^{\,i}}{\partial x^j} \,
\frac{\,\frac{d^2 \lambda'}{d\lambda^2}\,}{d\lambda'/d\lambda} ~,
$$
we see that ##\frac{d^2 \lambda'}{d\lambda^2}/\frac{d\lambda'}{d\lambda}## at ##(x^i)## depends not only on ##x^i## but also on ##v^i##. Therefore, there must exist a function ##f(x,v)## such that
$$
v^j v^k \, \frac{\partial^2 x'^{\,i}}{\partial x^j \partial x^k}
~=~ v^j \, \frac{\partial x'^{\,i}}{\partial x^j} \,f(x,v) ~.
$$
Strictly, ##f(x,v)## also depends on ##\lambda##, but this dependence is suppressed in the notation here, since we only need the fact that ##f## depends at least on ##x## and ##v##.
It is precisely this I don't understand. If we are talking about a single line and its image, then ##v## is a constant vector, a direction vector of the line, and then it doesn't seem meaningful to take a function depending upon it.
If, on the other hand, we are talking about several, perhaps all, lines and their images, then the problem is that the parametric equations of the lines are not unique: we can freely choose between points on the line and parallel direction vectors, and it is hard to see how we can associate one such choice for the image line with one for the original line in a consistent way. How then can ##f(x,v)## be well defined?
strangerep said:
No, that's ##n=2##, not ##n=1##.

[---]

Your cubic cannot be expressed in this form, hence is in no sense a straight line.
No, I am not talking about the curve ##y=f(x)=x^3+x## in ##R^2##. I talk about ##f## as a transformation from ##R^1## to itself. In ##R^1##, there is only one line, ##R^1## itself, and it is mapped onto itself by ##f##.
 
  • #41
Erland said:
No, I am not talking about the curve ##y=f(x)=x^3+x## in ##R^2##. I talk about ##f## as a transformation from ##R^1## to itself. In ##R^1##, there is only one line, ##R^1## itself, and it is mapped onto itself by ##f##.

Remember that for the one dimensional case it doesn't make sense to single out mappings of straight lines to straight lines, since they all are "straight lines"; curvature for one-dimensional objects is only extrinsic, unlike what happens in higher-dimensional spaces.
So even if you want to restrict the function to the real line, you need the 2-dimensional representation, as strangerep pointed out, if you want to make any distinction between linearity and non-linearity of lines (curves).
 
  • #42
TrickyDicky said:
Remember that for the one dimensional case it doesn't make sense to single out mappings of straight lines to straight lines since they all are "straight lines",
That's precisely why it's disturbing that the theorem doesn't assume that the dimension of the vector space is at least 2. Since every ##f:\mathbb R\to\mathbb R## takes straight lines to straight lines, the theorem says that there are numbers a,b such that
$$f(x)=ax+b$$
for all x in the domain. Actually it says that there are numbers a,b,c,d such that
$$f(x)=\frac{ax+b}{cx+d}$$
for all x in the domain, but since we're considering an f with domain ℝ, we must have c=0, and this allows us to define a'=a/d, b'=b/d. Since there are lots of other functions from ℝ to ℝ, the theorem is wrong.

It's possible that the only problem with the theorem is that it left out a statement that says that the dimension of the vector space must be at least 2, but then the proof should contain a step that doesn't work in 1 dimension. (I still haven't studied the proof, so I have no opinion).
 
  • #43
One dimensional vector spaces? That would be scalars. In linear algebra the vector spaces are assumed to be of dimension 2 or higher, aren't they?
 
  • #44
TrickyDicky said:
One dimensional vector spaces? That would be scalars. In linear algebra the vector spaces are assumed to be of dimension 2 or higher, aren't they?
No, they can even be 0-dimensional. That would be a set with only one member. (Denote that member by 0. Define addition and scalar multiplication by 0+0=0, and a0=0 for all scalars a. The triple ({0},addition,scalar multiplication) satisfies the definition of a vector space). 0-dimensional vector spaces are considered "trivial". ℝ is a 1-dimensional real vector space.
 
  • #45
Fredrik said:
No, they can even be 0-dimensional. That would be a set with only one member. (Denote that member by 0. Define addition and scalar multiplication by 0+0=0, and a0=0 for all scalars a. The triple ({0},addition,scalar multiplication) satisfies the definition of a vector space). 0-dimensional vector spaces are considered "trivial". ℝ is a 1-dimensional real vector space.
Sure, I'm not saying they can't be defined in those dimensions; by "assumed" I was referring to the ones usually found in linear transformations involving velocities.
 
Last edited:
  • #46
I think most theorems in linear algebra hold for any finite-dimensional vector space. But I'm sure there are some that only hold when the dimension is ≥2, and some that only hold when it's ≥3.
 
  • #47
Erland said:
[...] If, on the other hand, we are talking about several, perhaps all, lines and their images, then the problem is that the parametric equations of the lines are not unique, we can freely choose between points on the line and parallel direction vectors, and it is hard to see how we can associate one such choice for the image line with one for the original line in a consistent way. How can then ##f(x,v)## be well defined?
We're talking about all lines and their images. The idea is that, for any given line, pick a parameterization, and find mappings such that the image is still a (straight) line, in some parameterization of the same type. The ##f(x,v)## is defined in terms of whatever parameterization we chose initially.

No, I am not talking about the curve ##y=f(x)=x^3+x## in ##R^2##. I talk about ##f## as a transformation from ##R^1## to itself. In ##R^1##, there is only one line, ##R^1## itself, and it is mapped onto itself by ##f##.
But that case is irrelevant to the physics applications here since there's only one component ##x^i## (which I'll just write as ##x##); hence the notion of velocity cannot be defined, since one needs at least ##n=2## for that, so that we can write ##dx/dt##.

In your ##n=1## objection, ##x'## is parallel (or antiparallel) to ##x##. Afaict, this means that the 2nd derivatives in the proof such as
$$
\frac{\partial^2 x'{^i}}{\partial x^j \, \partial x^k}
$$
always vanish. Probably this is a degenerate case, though I haven't tracked it through to find precisely where this affects things. The authors are interested in ##dx/dt## which is an ##n\ge 2## case, hence probably didn't bother with that subtlety. Maybe the proof should have a caveat about ##n\ge 2##, but for the intended physics applications, this doesn't change anything.

BTW, note that Stepanov's proof does not use the parameterization technique used by Guo et al, but rather works directly with 1+1D spacetime, requiring that the condition of zero acceleration is preserved. This is more physically intuitive, and less prone to subtle oversights.
 
Last edited:
  • #48
I may as well go ahead and complete the derivation for the Lorentz transformations (boost). So, continuing from the previous time dilation derivation (post #37) we identify congruent triangles from which an easy derivation of the length contraction follows.

[Attached image: Lorentz_2D.jpg]
 
  • #49
strangerep said:
The most common reason is so-called homogeneity of space and time. By this, the authors mean that position-dependent (and time-dependent) dilations (scale changes) are ruled out arbitrarily.

Personally, I prefer a different definition of spacetime homogeneity: i.e., that it should look the same wherever and whenever you are. IOW, it must be a space of constant curvature.
This includes such things as deSitter spacetime, and admits a larger class of possibilities.

But another way that various authors reach the linearity assumption is to start with the most general transformations preserving inertial motion, which are fractional-linear transformations. (These are the most general transformations which map straight lines to straight lines -- see note #1.) They then demand that the transformations must be well-defined everywhere, which forces the denominator in the FL transformations to be restricted to a constant, leaving us with affine transformations.

In the light of modern cosmology, these arbitrary restrictions are becoming questionable.

--------
Note #1: a simpler version of Fock's proof can be found in Appendix B of this paper:
http://arxiv.org/abs/gr-qc/0703078 by Guo et al.

An even simpler proof for the case of 1+1D can also be found in Appendix 1 of this paper:
http://arxiv.org/abs/physics/9909009 by Stepanov. (Take the main body of this paper with a large grain of salt, but his Appendix 1 seems to be ok, though it still needs the reader to fill in some of the steps -- speaking from personal experience. :-)

I think this post is getting at the central problem. Lorentz transformations are strongly related to a pragmatic necessity: inertial observers must have the sense that the essential properties of the space are preserved (one peculiar example is the length element).

Conversely, does it mean that non-inertial observers must use transformations different from the Lorentz ones? If so, which ones?
 
Last edited by a moderator:
  • #50
Anyone see a simple proof of the following less general statement? If ##\Lambda:\mathbb R^n\to\mathbb R^n## is a bijection that takes straight lines to straight lines, and takes 0 to 0, then ##\Lambda## is linear.

Feel free to add assumptions about differentiability of ##\Lambda## if you think that's necessary.

I've got almost nothing so far. I can see that given an arbitrary vector x and an arbitrary real number t, there's a real number s such that ##\Lambda(tx)=s\Lambda(x)##. This means that there's a function ##s:\mathbb R^n\times\mathbb R\to\mathbb R## such that ##\Lambda(tx)=s(x,t)\Lambda(x)## for all x,t. For all x, we have ##0=\Lambda(0)=\Lambda(0x)=s(x,0)\Lambda(x)##. This implies that ##s(x,0)=0## for all ##x\neq 0##. We should be able to choose our s such that s(0,0)=0 as well.

I don't see how to proceed from here, and I don't really see how to begin with the evaluation of ##\Lambda(x+y)## where x,y are arbitrary. One idea I had was to let r be a number such that x+y is on the line through rx and ry. (If x,y are non-zero, there's always such a number. And if one of x,y is zero, there's nothing to prove). Then there's a number t such that
$$\Lambda(x+y)=(1-t)\Lambda(rx)+t\Lambda(ry)=(1-t)s(x,r)\Lambda(x)+ts(y,r)\Lambda(y).$$ But I don't see how to use this. If we want to turn the above into a "For all x,y" statement, we must write t(x,y) instead of t.

By the way, one of the reasons why I think there should be a simple proof is that this was an exercise in the book I linked to in post #27. Unfortunately the author didn't even mention that the map needs to take 0 to 0, so there's definitely something wrong with the exercise, but perhaps that omission is the only thing wrong with it. The author also assumed that the map is a surjection (onto a vector space W), rather than a bijection.
 
Last edited: