Mazur-Ulam theorem (bijective isometries are affine maps)

  • Context: Graduate
  • Thread starter: Fredrik
  • Tags: Theorem

Discussion Overview

The discussion centers around the Mazur-Ulam theorem, which states that if a function is a bijective isometry between normed spaces, then it is an affine map. Participants explore the proof details, definitions, and implications of the theorem, raising questions about specific claims and the nature of affine maps in both real and complex normed spaces.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • Some participants question the definition of an affine map and how it implies linearity, particularly in the context of the function f and its behavior with respect to scalar multiplication.
  • Concerns are raised about the claim that the map ##\psi(x) = 2z - x## is an isometry, with examples provided that suggest it may not hold in certain cases.
  • There is a discussion about the implications of proving that ##f\left(\frac{a+b}{2}\right)=\frac{1}{2}f(a)+\frac{1}{2}f(b)##, with participants debating whether this condition is sufficient to conclude that f is affine.
  • A participant introduces the idea that if E is a Hilbert space, the proof simplifies, and they provide a detailed argument supporting this claim.
  • Another participant raises a question about the equivalence of conditions defining affine maps in the context of complex normed spaces, suggesting that the equivalence may not hold as it does in the real case.
  • One participant provides a counterexample involving the complex conjugate function to illustrate that the equivalence of conditions defining affine maps may fail in complex spaces.

Areas of Agreement / Disagreement

Participants express differing views on the implications of certain definitions and the validity of claims made in the proof of the Mazur-Ulam theorem. There is no consensus on whether the theorem holds in complex normed spaces, and the discussion remains unresolved regarding the appropriate definition of "affine map" in that context.

Contextual Notes

Participants note that the proof's reliance on specific properties of functions and their behavior under scalar multiplication may not directly translate between real and complex normed spaces, highlighting the need for careful consideration of definitions and conditions.

Fredrik
I've been studying the proof of the Mazur-Ulam theorem in the pdf linked to at the end of this Wikipedia article. I'm struggling with some details in that pdf.

Theorem: Let E and F be arbitrary normed spaces. If ##f:E\to F## is a bijective isometry, then f is an affine map.

Some of the things I don't get:

1. Their definition of "affine map". How does ##f(tx+(1-t)y)=tf(x)+(1-t)f(y)## for all ##x,y\in X## and all ##t\in[0,1]## imply that f-f(0) is linear? I don't see how to deal with (f-f(0))(ax) where ##a\in\mathbb R## is arbitrary.

2. The claim that ##\psi## is an isometry. ##\psi:E\to E## is defined by ##\psi(x)=2z-x## for all ##x\in E##. This map sends an arbitrary point in E to the point that's "on the opposite side of z", i.e. the point y such that y=z+(z-x). They claim that ##\psi## is an isometry, but consider e.g. ##E=\mathbb R##, z=2, x=1. We have ##\psi(1)=2\cdot 2-1=3##, but ##\|\psi(1)\|=3\neq 1=\|1\|##.

3. If ##\psi## isn't an isometry, then I don't see a reason to think that the map the author denotes by g* should be an isometry either. It's defined by ##g^*=\psi\circ g^{-1}\circ\psi\circ g##, where g is a bijective isometry. The step ##\|g^*(z)-z\|\leq\lambda## relies on ##g^*## being an isometry.

4. All they're proving is that for all ##a,b\in E##, we have ##f\left(\frac{a+b}{2}\right)=\frac 1 2 f(a)+\frac 1 2 f(b)##. It's not obvious that this implies that f is affine.
 
Fredrik said:
1. Their definition of "affine map". How does ##f(tx+(1-t)y)=tf(x)+(1-t)f(y)## for all ##x,y\in X## and all ##t\in[0,1]## imply that f-f(0) is linear? I don't see how to deal with (f-f(0))(ax) where ##a\in\mathbb R## is arbitrary.

So the claim is that if ##f## is an affine map such that ##f(0)=0##, then ##f## is linear. First, take ##t\in [0,1]## arbitrary, then
$$f(tx) = f(tx + (1-t)0) = tf(x) + (1-t)f(0) = tf(x)$$

Now it follows that
$$\frac{1}{2}f(x+y) = f\left(\frac{1}{2} x+ \frac{1}{2}y\right) = \frac{1}{2} f(x) + \frac{1}{2}f(y)$$
so ##f(x+y) = f(x) + f(y)## and ##f## is additive.

Now take ##\lambda\geq 0## arbitrary. Then we can write ##\lambda = nt## for some positive integer ##n## and some ##t\in [0,1]##. Then
$$f(\lambda x) = f(ntx) = f(tx + \dots + tx) = f(tx) + \dots + f(tx) = tf(x) + \dots + tf(x) = ntf(x) = \lambda f(x)$$

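Now take ##\lambda<0##, so that ##-\lambda>0## and the previous step applies to ##-\lambda##. Then

$$0 = f(0) = f(\lambda x - \lambda x) = f(\lambda x) + f(-\lambda x) = f(\lambda x) -\lambda f(x)$$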
Thus ##f(\lambda x) = \lambda f(x)## and ##f## is linear.
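
The step that writes ##\lambda = nt## can be made concrete. The following is only a small sketch; the helper name `split_scalar` and the choice ##n=\lceil\lambda\rceil## are assumptions made for illustration, and any ##n\geq\lambda## would do just as well.

```python
import math

def split_scalar(lam):
    """Write lam >= 0 as n * t with n a positive integer and t in [0, 1].

    Hypothetical helper: picks n = ceil(lam), or n = 1 when lam == 0.
    """
    n = max(1, math.ceil(lam))
    return n, lam / n

n, t = split_scalar(2.5)
print(n, t)  # n = 3 and t = 2.5 / 3, which lies in [0, 1]
```

With this choice, additivity handles the factor ##n## and the identity ##f(tx)=tf(x)## for ##t\in[0,1]## handles the factor ##t##.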

2. The claim that ##\psi## is an isometry. ##\psi:E\to E## is defined by ##\psi(x)=2z-x## for all ##x\in E##. This map sends an arbitrary point in E to the point that's "on the opposite side of z", i.e. the point y such that y=z+(z-x). They claim that ##\psi## is an isometry, but consider e.g. ##E=\mathbb R##, z=2, x=1. We have ##\psi(1)=2\cdot 2-1=3##, but ##\|\psi(1)\|=3\neq 1=\|1\|##.

You are verifying the property ##\|\psi(x)\| = \|x\|##, but this is not the property that you want to verify. The property is ##\|\psi(x) - \psi(y)\| =\|x-y\|##, and this is the one that holds for ##\psi##, since ##\psi(x)-\psi(y)=(2z-x)-(2z-y)=y-x##. The two properties are equivalent only for linear maps, and ##\psi## is not linear unless ##z=0##.
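
For a quick numerical sanity check of the distinction, here is a minimal sketch using the example from the question (##E=\mathbb R##, ##z=2##, ##x=1##). The second point ##y=5## is an arbitrary choice added for illustration.

```python
def psi(x, z):
    """Point reflection through z: sends x to 2z - x."""
    return 2 * z - x

z = 2.0
x, y = 1.0, 5.0

# The property checked in the question: psi does not preserve norms.
norm_preserved = abs(psi(x, z)) == abs(x)

# The isometry property: psi preserves distances between points.
distance_preserved = abs(psi(x, z) - psi(y, z)) == abs(x - y)

print(norm_preserved, distance_preserved)  # False True
```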

4. All they're proving is that for all ##a,b\in E##, we have ##f\left(\frac{a+b}{2}\right)=\frac 1 2 f(a)+\frac 1 2 f(b)##. It's not obvious that this implies that f is affine.

So, it suffices to show that if ##f## is continuous (as every isometry is), ##f\left(\frac{a+b}{2}\right)=\frac 1 2 f(a)+\frac 1 2 f(b)## for all ##a,b##, and ##f(0) = 0##, then ##f## is linear.

Note that
$$f\left(\frac{a+0}{2}\right)=\frac 1 2 f(a)+\frac 1 2 f(0)$$
implies that ##f(a/2) = f(a)/2##. And as above, we can prove ##f(a+b) = f(a) + f(b)## now.
By induction, it follows easily that
$$f\left(\frac{1}{2^n}x\right)=\frac{1}{2^n} f(x)$$
Now if ##t= c/2^n## for some ##c\in \{0,...,2^n\}## then it follows easily (since the map is additive) that
$$f(tx) = tf(x)$$

Now, fix ##x## and define ##g:[0,1]\rightarrow F## by ##g(t) = f(tx) - tf(x)##. This function is continuous (because ##f## is an isometry) and vanishes on the dense set ##\{c/2^n~\vert~n\geq 0,\ c\in \{0,...,2^n\}\}##. So ##g## vanishes everywhere. Thus ##f(tx) = tf(x)## for all ##t\in [0,1]##. As above, we can now prove that ##f## is linear.
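
The density of the dyadic points ##c/2^n## in ##[0,1]## is what drives the last step. As a rough illustration (a sketch only; the function name `dyadic_below` and the sample value ##t=1/3## are assumptions made for the example), each ##t## is within ##2^{-n}## of a dyadic point with denominator ##2^n##:

```python
import math

def dyadic_below(t, n):
    """Largest c / 2**n with integer c that does not exceed t."""
    return math.floor(t * 2**n) / 2**n

t = 1 / 3
levels = range(1, 11)
errors = [t - dyadic_below(t, n) for n in levels]

# Each approximation is within 2**-n of t, so the errors shrink to 0.
within_bound = all(0 <= e < 2**-n for n, e in zip(levels, errors))
print(within_bound)
```

Since ##g## is continuous and vanishes at every such dyadic point, it must vanish at the limit ##t## as well.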
 
Awesome reply. I don't know how you manage to explain everything I'm stuck on so fast, but I'm glad that you do.
 
You might be interested that if ##E## is a (real) Hilbert space, then things simplify considerably.

Let ##\varphi## be a surjective isometry, so that ##\|\varphi(x) - \varphi(y)\| = \|x-y\|## for all ##x,y##.

Let ##\psi(x) = \varphi(x) - \varphi(0)##, then ##\psi## is a surjective isometry which satisfies ##\psi(0) = 0##.

Note that in a Hilbert space, we have ##\|x - y\|^2 = \|x\|^2 + \|y\|^2 - 2\langle x,y\rangle##. Thus

$$
\begin{eqnarray*}
2\langle\psi(x),\psi(y)\rangle
& = & \|\psi(x) - \psi(0)\|^2 + \|\psi(y) - \psi(0)\|^2 -\|\psi(x) - \psi(y)\|^2\\
& = & \|x\|^2 + \|y\|^2 - \|x-y\|^2\\
& = & 2\langle x,y\rangle
\end{eqnarray*}
$$

Now let ##z## be arbitrary. Then ##z=\psi(y)## for some ##y##. Then

$$
\begin{eqnarray*}
\langle\psi(\alpha a +\beta b),z\rangle
& = & \langle\psi(\alpha a + \beta b),\psi(y)\rangle\\
& = & \langle\alpha a + \beta b, y\rangle\\
& = & \alpha \langle a,y\rangle + \beta \langle b,y\rangle\\
& = & \alpha \langle\psi(a),\psi(y)\rangle + \beta \langle\psi(b),\psi(y)\rangle\\
& = & \langle\alpha \psi(a) + \beta \psi(b), z\rangle
\end{eqnarray*}
$$

This holds for all ##z##, thus ##\psi(\alpha a + \beta b) = \alpha \psi(a) + \beta \psi(b)## and ##\psi## is linear.
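
The key identity, that a norm-preserving map fixing the origin also preserves inner products, can be checked numerically. The sketch below assumes ##E=\mathbb R^2## with the standard inner product and uses a rotation followed by a translation as a sample isometry. The angle, the shift, and the test vectors are arbitrary choices for illustration only.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def phi(x, theta=0.7, shift=(3.0, -1.0)):
    """A sample surjective isometry of R^2: rotate by theta, then translate."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * x[0] - s * x[1] + shift[0], s * x[0] + c * x[1] + shift[1])

def psi(x):
    """phi with its value at the origin subtracted, so psi(0) = 0."""
    return sub(phi(x), phi((0.0, 0.0)))

def inner_from_norms(u, v):
    """Recover <u, v> from norms via 2<u,v> = |u|^2 + |v|^2 - |u - v|^2."""
    d = sub(u, v)
    return (dot(u, u) + dot(v, v) - dot(d, d)) / 2

x, y = (1.0, 2.0), (-0.5, 4.0)
print(math.isclose(inner_from_norms(x, y), dot(x, y)))   # polarization identity
print(math.isclose(dot(psi(x), psi(y)), dot(x, y)))      # psi preserves <.,.>
```

The comparison uses `math.isclose` because the rotation introduces floating-point rounding.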
 
micromass said:
You might be interested that if ##E## is a (real) Hilbert space, then things simplify considerably.
Yes, I am interested in that. Thanks for posting it.

I have a followup about the first issue I brought up in post #1. After reading your reply, I see that if X and Y are normed spaces over ℝ, the following conditions on a function ##f:X\to Y## are equivalent.

(a) ##f-f(0)## is linear.
(b) There's a linear ##L:X\to Y## and a ##y\in Y## such that ##f(x)=Lx+y## for all ##x\in X##.
(c) For all ##t\in[0,1]## and all ##x,y\in X##, we have ##f(tx+(1-t)y)=tf(x)+(1-t)f(y)##.

My question is, is this still true if X and Y are normed spaces over ℂ? Define F=f-f(0). Your method shows that (c) implies that F is ℝ-linear. But is it ℂ-linear? I think I see how to deal with arbitrary complex numbers if it's true that F(ix)=iF(x) for all x. But I don't see how to deal with F(ix).

As far as I can tell, this has no relevance to the validity of the proof of the Mazur-Ulam theorem that we've been discussing, since we prove (a) directly, not (c). I'm just curious if (c) works as a definition of "affine map" even in the complex case.
 
I don't think it holds. Take ##\mathbb{C}\rightarrow \mathbb{C}:z\rightarrow \overline{z}##. Then this map is easily checked to be affine, but it's not linear. So (a) and (b) are not equivalent anymore.

Note that the same example shows that Mazur-Ulam fails in complex vector spaces.
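
The conjugation example is easy to test with Python's built-in complex numbers. This is a minimal sketch; the sample values of `x`, `y` and `t` are arbitrary and chosen so that the floating-point arithmetic is exact.

```python
def f(z):
    """Complex conjugation on C, viewed as a normed space over C."""
    return z.conjugate()

x, y, t = 1 + 2j, -3 + 0.5j, 0.25

# Isometry: distances are preserved (and f is its own inverse, so bijective).
is_isometry = abs(f(x) - f(y)) == abs(x - y)

# Convex combinations with real t in [0, 1] are preserved.
preserves_convex = f(t * x + (1 - t) * y) == t * f(x) + (1 - t) * f(y)

# C-linearity fails: f(i x) equals -i f(x), not i f(x).
is_c_linear = f(1j * x) == 1j * f(x)

print(is_isometry, preserves_convex, is_c_linear)  # True True False
```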
 
micromass said:
I don't think it holds. Take ##\mathbb{C}\rightarrow \mathbb{C}:z\rightarrow \overline{z}##. Then this map is easily checked to be affine, but it's not linear. So (a) and (b) are not equivalent anymore.

Note that the same example shows that Mazur-Ulam fails in complex vector spaces.

Ah, I should have thought of that example. It satisfies (c), but not (a) or (b). (a) and (b) are so trivially equivalent that you probably forgot that I wrote them down as two separate statements.

(a) and (b) are equivalent, and imply (c). But (c) doesn't imply (a) or (b), if "linear" now means "ℂ-linear".

I'm not entirely clear on which condition is the appropriate definition of "affine map" in the context of complex normed spaces. I suspect that it's (a). In that case, Mazur-Ulam doesn't hold for complex normed spaces. But if it's (c), then it does hold.

I have typed up my version of the proof from the pdf (completed by your insights) for my notes. It seems to me that what we actually prove is that if ##f:X\to Y## is a bijective isometry between normed spaces, then f-f(0) is ℝ-linear. This implies that f-f(0) satisfies (c). But your example shows that in general, it doesn't satisfy (a) (with "linear" meaning "ℂ-linear").
 
I think the right notion of affine map is the following. Let k be a field. Then an affine map of a k-vector space E to a k-vector space F must satisfy

$$f(\lambda_1 x_1 + \dots + \lambda_n x_n) = \lambda_1 f(x_1) + \dots + \lambda_n f(x_n)$$

for all ##x_1,\dots,x_n\in E## and all ##\lambda_1,\dots,\lambda_n\in k## with ##\lambda_1 + \dots + \lambda_n = 1##.

This is again a new definition that didn't show up yet. However, it can be shown that it's always equivalent to (a).
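
One way to see the equivalence with (a) is sketched below, writing ##F=f-f(0)## as in the earlier posts. The choices of coefficients, ##(\lambda, 1-\lambda)## and ##(1,1,-1)##, are mine and each sums to 1; this is a sketch of the argument rather than a full proof.

```latex
% Assume f preserves all combinations whose coefficients sum to 1, and put F = f - f(0).
\begin{align*}
F(\lambda x) &= f(\lambda x + (1-\lambda)\,0) - f(0)
              = \lambda f(x) + (1-\lambda) f(0) - f(0)
              = \lambda F(x),\\
F(x+y) &= f(1\cdot x + 1\cdot y + (-1)\cdot 0) - f(0)
        = f(x) + f(y) - f(0) - f(0)
        = F(x) + F(y).
\end{align*}
% Conversely, if F is k-linear and \lambda_1 + \dots + \lambda_n = 1, then
\begin{align*}
f\Big(\sum_i \lambda_i x_i\Big)
  = F\Big(\sum_i \lambda_i x_i\Big) + f(0)
  = \sum_i \lambda_i F(x_i) + \Big(\sum_i \lambda_i\Big) f(0)
  = \sum_i \lambda_i f(x_i).
\end{align*}
```

Note that the first computation allows every ##\lambda\in k##, including ##\lambda=i## when ##k=\mathbb C##, which is exactly what condition (c) with ##t\in[0,1]## cannot provide.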
 
