[ALGEBRA] Unitary Matrices and length preservation

libelec

Homework Statement



Prove that unitary matrices are the only matrices that preserve the length of vectors.

The Attempt at a Solution



It's an iff, so I have to prove both directions: a) if the matrix is unitary, then it preserves lengths, and b) if the matrix preserves lengths, then it is unitary.

I could only solve a) (using the canonical inner product on R^n):

Let A in R^{n x n} be a unitary matrix and x in R^n. Then (Ax, Ax) = (Ax)^T (Ax) = x^T A^T A x = x^T x, because A^T A = I since A is unitary, and x^T x = (x, x). Therefore, A preserves the length.

But I don't know how to prove b).
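As a numerical sanity check of part a) (my own illustration, not part of the thread; the rotation matrix and the test vector are arbitrary choices), an orthogonal matrix should leave the Euclidean norm of any vector unchanged:

```python
import numpy as np

# A 2x2 rotation is orthogonal: A^T A = I, so it should preserve ||x||.
theta = 0.7  # arbitrary angle, chosen for illustration
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
assert np.allclose(A.T @ A, np.eye(2))  # A^T A = I

x = np.array([3.0, -4.0])  # ||x|| = 5
# part a): ||Ax|| equals ||x|| because x^T A^T A x = x^T x
assert np.isclose(np.linalg.norm(A @ x), np.linalg.norm(x))
```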
 
You can start from the fact that it preserves the length: (Ax, Ax) = (Ax)^T (Ax) = x^T A^T A x = x^T x (as you already said).
What does this tell you about A^T A?

(Sorry, didn't bother to make the superscripts, I'm sure you see what I mean).
 
That A^T A must be I? Because it could also happen that x is an eigenvector of A and of A^T associated with the eigenvalue 1 (which is the reason why I thought that reasoning was going nowhere).
 
That can happen for some particular x. But in general, it does not.
Remember, A^T A is always the same fixed matrix, but x is an arbitrary vector.
 
libelec said:
...But I don't know how to prove b).

It is critical that you use the fact that it is unitary only if it preserves the lengths of all vectors.

Clearly you can have a non-unitary matrix which leaves one dimension alone but say doubles length in another.

Given this, it is then sufficient to show it is true for a basis since the action is linear.

That should be sufficient for you to solve the second part.

[EDIT: I may have been hasty there... let me think about whether this is a good approach.]
[EDIT2: Actually you don't need to invoke a basis per se. You should be able to use the fact that, given (x, Mx) = (x, x) for all x, M must be the identity. That one is easy enough to show by resolving it in terms of a basis and inner-product properties, so you should be able to take it as a standing lemma.]
 
It may be useful to recall this trick, which helps in a lot of problems:

If you have an expression (like A^T A), and you think it should be equal to some other expression (like I), then it is useful to study how the two expressions differ.

e.g. one might look at the properties of A^T A - I.
 
jambaugh said:
[EDIT2: Actually you don't need to invoke a basis per se. You should be able to use the fact that, given (x, Mx) = (x, x) for all x, M must be the identity. That one is easy enough to show by resolving it in terms of a basis and inner-product properties, so you should be able to take it as a standing lemma.]
I assume you'll catch this error yourself shortly, but just in case -- that lemma is very much not true. (Consider pretty much any nontrivial example.)
 
Hurkyl said:
I assume you'll catch this error yourself shortly, but just in case -- that lemma is very much not true. (Consider pretty much any nontrivial example.)

Hmmm... I guess I should have said: (y, Mx) = (y, x) for all x and all y implies M = 1.

But despite the counterexample to my error, clearly it is true that if M is a normal operator, then (x, Mx) = (x, x) for all x must imply that all eigenvalues are one and M is I.

What is more, if it is not true, then there are non-unitary length-preserving matrices... the proof would spoil the homework, so I'll PM it to you.

[EDIT] It's true and usable if M is normal, or in particular Hermitian, which is sufficient for the exercise here. I requested a counterexample via PM, but feel free to present it here. I'm stumped.
 
jambaugh said:
... I'm stumped.

Arrrg! I got it now. 1 + shift operator.
 
  • #10
jambaugh said:
Arrrg! I got it now. 1 + shift operator.

Do you mean 1 + skew-symmetric?
 
  • #11
Dick said:
Do you mean 1 + skew-symmetric?
No, that would still be a normal operator. The shift operator is nilpotent.
(e.g. S_+ = S_x + iS_y, the Pauli spin raising operator)

A general counterexample is 1 + N for any nilpotent matrix N (N^k = 0 for some k). Thus (x, Nx) = 0 since (x, N^k x) = 0, so (x, (1+N)x) = (x, 1x).
Yes I am an idgit!
 
  • #12
jambaugh said:
No, that would still be a normal operator. The shift operator is nilpotent.
(e.g. S_+ = S_x + iS_y, the Pauli spin raising operator)

A general counterexample is 1 + N for any nilpotent matrix N (N^k = 0 for some k). Thus (x, Nx) = 0 since (x, N^k x) = 0, so (x, (1+N)x) = (x, 1x).
Yes I am an idgit!

Sure, it's normal, but what's wrong with that? We are working over a real space. The eigenvectors might not be real. M = [[1,-1],[1,1]] satisfies x^T M x = x^T x for all x. The antisymmetric part drops out. I don't see what nilpotent buys you.
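Dick's counterexample can be checked numerically (a sketch I added; the random sampling is just for illustration): over the reals, the antisymmetric part of a matrix contributes nothing to the quadratic form, so this M passes the x^T M x = x^T x test for every x despite not being the identity.

```python
import numpy as np

# Dick's M = I + (antisymmetric part): x^T M x = x^T x for every real x,
# yet M != I, so "x^T M x = x^T x for all x" alone cannot force M = I.
M = np.array([[1.0, -1.0],
              [1.0,  1.0]])

rng = np.random.default_rng(0)
for _ in range(100):
    x = rng.standard_normal(2)
    assert np.isclose(x @ M @ x, x @ x)  # antisymmetric part drops out

assert not np.allclose(M, np.eye(2))  # ...but M is not the identity
```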
 
  • #13
Dick said:
Sure, it's normal, but what's wrong with that? We are working over a real space. The eigenvectors might not be real. M = [[1,-1],[1,1]] satisfies x^T M x = x^T x for all x. The antisymmetric part drops out. I don't see what nilpotent buys you.

So it does (in real space). And my example doesn't!
Arrrg!
I'm a bigger igit than I thought! (My excuse is I'm trying to absorb CUDA programming right now and it's taking up all my neural resources!)

I assumed (since the OP referred to a unitary rather than an orthogonal matrix) that we were talking about general complex Hilbert spaces and not a real (or complex orthogonal) inner product space. But I see that he did specify a real space.

Your example clearly wouldn't be true for the eigenvectors, e.g. x = (1, i)^T. (Trying to save a little face here!)

Back to the OP's problem...
I can see my suggestions were all wrong.
Perhaps the trick of (A(x+y) | A(x+y)) expanded.
Ahhh yes, that will do the trick and use only the properties of the inner product!
Hope I didn't give too much away there.
 
  • #14
Aha! You never said that you were defining M = A^T A -- I thought you were using M to refer to what the OP called A, so I misunderstood what you were getting at.
 
  • #15
jambaugh said:
So it does (in real space). And my example doesn't!
Arrrg!
I'm a bigger igit than I thought! (My excuse is I'm trying to absorb CUDA programming right now and it's taking up all my neural resources!)

I assumed (since the OP referred to a unitary rather than an orthogonal matrix) that we were talking about general complex Hilbert spaces and not a real (or complex orthogonal) inner product space. But I see that he did specify a real space.

Your example clearly wouldn't be true for the eigenvectors, e.g. x = (1, i)^T. (Trying to save a little face here!)

Back to the OP's problem...
I can see my suggestions were all wrong.
Perhaps the trick of (A(x+y) | A(x+y)) expanded.
Ahhh yes, that will do the trick and use only the properties of the inner product!
Hope I didn't give too much away there.

My example would be true for (1,i)^T (though it's not part of the real space)! Just plug it in. Remember the inner product is the REAL inner product (x,x)=x^Tx. Not the COMPLEX inner product x^(T*)x. If you promote the whole problem to a complex space, then yes, you are right. But be careful when you do that. What's true in the complex space is not necessarily true in the real space.
 
  • #16
Dick said:
My example would be true for (1,i)^T (though it's not part of the real space)! Just plug it in. Remember the inner product is the REAL inner product (x,x)=x^Tx. Not the COMPLEX inner product x^(T*)x. If you promote the whole problem to a complex space, then yes, you are right. But be careful when you do that. What's true in the complex space is not necessarily true in the real space.

If we're talking about unitary operators in the complex extension, then the inner product ( | ) they preserve must be the Hermitian inner product of the Hilbert space (2nd case). The promotion to complex space is ambiguous: either R-orthogonal -> C-orthogonal or R-orthogonal -> unitary. I was keying off the "unitary" term in the OP.

With respect to terminology, I think of your "COMPLEX" one as the "real" inner product in the sense that it yields real norms on all (complex) vectors. Or, more precisely, that the invariance group is a real Lie group U(V) rather than the complex group SO(V;C).
But I think we understand each other beyond the choice of terms.
 
  • #17
Hurkyl said:
It may be useful to recall this trick, which helps in a lot of problems:

If you have an expression (like A^T A), and you think it should be equal to some other expression (like I), then it is useful to study how the two expressions differ.

e.g. one might look at the properties of A^T A - I.

You say I should study x^T A^T A x = x^T x, then x^T A^T A x - x^T x = 0, then x^T (A^T A - I) x = 0?

The problem with that is that x could belong to the kernel of A^T A - I, and I don't see how I could manage to disregard that (the other possibilities: that A^T A is I, or that x is the zero vector of its vector space).
 
  • #18
libelec,
I think you'll find a simpler demonstration of b) by starting with the fact that A will preserve the length of (x+y) for arbitrary x and y. Expand and see what happens.
 
  • #19
You really ought to use that A^T A is self-adjoint.
 
  • #20
libelec said:
You say I should study x^T A^T A x = x^T x, then x^T A^T A x - x^T x = 0, then x^T (A^T A - I) x = 0?

The problem with that is that x could belong to the kernel of A^T A - I, and I don't see how I could manage to disregard that (the other possibilities: that A^T A is I, or that x is the zero vector of its vector space).

If x belongs to the kernel of A^T A - I for all vectors x, then the kernel of A^T A - I is all of the vector space. Undoubtedly you have seen that the only map with this property is the null map O.
 
  • #21
libelec said:
You say I should study x^T A^T A x = x^T x, then x^T A^T A x - x^T x = 0, then x^T (A^T A - I) x = 0?

The problem with that is that x could belong to the kernel of A^T A - I, and I don't see how I could manage to disregard that (the other possibilities: that A^T A is I, or that x is the zero vector of its vector space).

Here's another possibility. Let N be the matrix [[0,1],[-1,0]] in R^2. Then x^T N x = 0 for all x. Why can't A^T A - I be that kind of map? I really don't understand why SOMEONE doesn't bring up the notion of self-adjointness.
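Dick's point can also be checked numerically (my own sketch, not from the thread): the quadratic form of an antisymmetric matrix vanishes identically, so x^T (A^T A - I) x = 0 alone does not pin A^T A - I down to the zero map; what rules out maps like N is that A^T A - I is symmetric.

```python
import numpy as np

# The antisymmetric N = [[0,1],[-1,0]]: x^T N x = 0 for every real x,
# even though N is not the zero map. Symmetry is what excludes such maps,
# since A^T A - I is always symmetric while N is not.
N = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

rng = np.random.default_rng(1)
for _ in range(100):
    x = rng.standard_normal(2)
    assert np.isclose(x @ N @ x, 0.0)  # quadratic form vanishes identically

assert not np.allclose(N, N.T)  # N is not symmetric, unlike A^T A - I
```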
 
  • #22
jambaugh said:
libelec,
I think you'll find a simpler demonstration of b) by starting with the fact that A will preserve the length of (x+y) for arbitrary x and y. Expand and see what happens.

I haven't seen anything that would help me:

<A(x+y)|A(x+y)> = <Ax + Ay|Ax + Ay> = x^T A^T A x + x^T A^T A y + y^T A^T A x + y^T A^T A y = <(x + y)|(x + y)> = x^T x + x^T y + y^T x + y^T y. So:

x^T A^T A x + x^T A^T A y + y^T A^T A x + y^T A^T A y = x^T x + x^T y + y^T x + y^T y.

And then I could only figure out the same thing I tried before with <Ax|Ax>.

Dick said:
You really ought to use that A^T A is self-adjoint.

How? I don't see how that helps. A^T A = (A^T A)^T, then what?
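A numerical illustration of where that expansion leads (a sketch of mine, with a random orthogonal A built via a QR factorization just to have a concrete length-preserving matrix): after the diagonal terms cancel, the cross terms compare M = A^T A against the identity, and probing a matrix with standard basis vectors reads off its entries one by one.

```python
import numpy as np

rng = np.random.default_rng(2)
# A concrete length-preserving A: the Q factor of a QR decomposition is orthogonal.
A, _ = np.linalg.qr(rng.standard_normal((4, 4)))
M = A.T @ A

# Cross terms of the (x+y) expansion: x^T M y + y^T M x = x^T y + y^T x.
x = rng.standard_normal(4)
y = rng.standard_normal(4)
assert np.isclose(x @ M @ y + y @ M @ x, x @ y + y @ x)

# Probing with standard basis vectors e_i, e_j extracts M entrywise:
# e_i^T M e_j = M[i, j], which here must equal the identity's entries.
E = np.eye(4)
for i in range(4):
    for j in range(4):
        assert np.isclose(E[i] @ M @ E[j], np.eye(4)[i, j])
```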
 
  • #23
libelec said:
I haven't seen anything that would help me:

<A(x+y)|A(x+y)> = <Ax + Ay|Ax + Ay> = x^T A^T A x + x^T A^T A y + y^T A^T A x + y^T A^T A y = <(x + y)|(x + y)> = x^T x + x^T y + y^T x + y^T y. So:

x^T A^T A x + x^T A^T A y + y^T A^T A x + y^T A^T A y = x^T x + x^T y + y^T x + y^T y.

And then I could only figure out the same thing I tried before with <Ax|Ax>.



How? I don't see how that helps. A^T A = (A^T A)^T, then what?
You almost have it! Remember your assumption that A is norm-preserving, and remember that your conclusion is the definition of unitarity, i.e. being inner-product preserving. Apply your assumption and seek your conclusion.
 
  • #24
libelec said:
I haven't seen anything that would help me:

<A(x+y)|A(x+y)> = <Ax + Ay|Ax + Ay> = x^T A^T A x + x^T A^T A y + y^T A^T A x + y^T A^T A y = <(x + y)|(x + y)> = x^T x + x^T y + y^T x + y^T y. So:

x^T A^T A x + x^T A^T A y + y^T A^T A x + y^T A^T A y = x^T x + x^T y + y^T x + y^T y.

And then I could only figure out the same thing I tried before with <Ax|Ax>.



How? I don't see how that helps. A^T A = (A^T A)^T, then what?

You have been getting a lot of bad advice and I'm not sure why. Some of these people should know better. You have <x|Mx> = <x|x> where M = A^T A, right? So M is self-adjoint. A self-adjoint operator has a complete set of eigenvectors. Use that.
 
  • #25
Dick said:
You have been getting a lot of bad advice and I'm not sure why. Some of these people should know better. You have <x|Mx>=<x|x> where M=A^TA, right? So M is self adjoint. A self adjoint operator has a complete set of eigenvectors. Use that.

OK, let me know if I just made a mistake, please:

<Ax|Ax> = x^T A^T A x. Since A^T A is always a symmetric matrix over R, it's diagonalizable by an orthogonal matrix P. Then

<Ax|Ax> = x^T A^T A x = x^T P D P^T x = x^T x, with D the diagonal matrix that carries the eigenvalues of A^T A.

Then, each column i of the product matrix is x_i^T P_i sigma_i P_i^T x_i, with sigma_i an eigenvalue of A^T A. Since it's a scalar number, this is the same as sigma_i x_i^T P_i P_i^T x_i, and since P P^T = I, for it's orthogonal, this is equal to sigma_i x_i^T x_i = x_i^T x_i iff sigma_i = 1 for every i. Then A^T A has to be I, iff A is orthogonal (unitary in C).

I think there's something wrong (especially when I consider each column, where I commute the eigenvalue); I don't know if this is what you meant?
 
  • #26
libelec said:
OK, let me know if I just made a mistake, please:

<Ax|Ax> = xTATAx. Since ATA is always a symmetric matrix in R, then it's diagonalizable by an orthogonal matrix P. Then

<Ax|Ax> = xTATAx = xTPDATAPTx = xTx, with DATA the diagonal matrix of ATA that loads its eigenvalues.

Then, each column i of the product matrix is xTiPi\sigmaiPTixi, with \sigmai an eigenvalue of ATA. Since it's a scalar number, this is the same as \sigmaixTiPiPTixi, and since PPT = I, for it's orthogonal, then this is equal to \sigmaixTixi = xiTxi, iff \sigmai = 1 for every i. Then ATA has to be I, iff A is orthogonal (unitary in C).

I think there's something wrong (especially when I consider each column, I commute the eigenvalue), I don't know if this is what you meant?

That's what I meant, all right. But you are overcomplicating the notation. The eigenvectors e_i of A^T A span the vector space (that's what diagonalizable means). If (A^T A) e_i = r_i e_i (r_i being the eigenvalue of e_i), then e_i^T (A^T A) e_i = r_i e_i^T e_i = e_i^T e_i (since x^T (A^T A) x = x^T x). So, sure, all of the eigenvalues are 1. But now any vector v can be written as a linear combination of those eigenvectors. So (A^T A) v = v for all v. So A^T A = I.
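Dick's eigenvector argument can be mirrored numerically (a sketch I added; A is generated via QR just to have a concrete orthogonal example): A^T A is symmetric, its eigenvalues are forced to be 1 by length preservation, and since the eigenvectors span the space, A^T A must be the identity.

```python
import numpy as np

rng = np.random.default_rng(3)
A, _ = np.linalg.qr(rng.standard_normal((5, 5)))  # a sample orthogonal A
M = A.T @ A

# M is symmetric, so eigh returns real eigenvalues and orthonormal eigenvectors.
eigenvalues, eigenvectors = np.linalg.eigh(M)

# Length preservation forces every eigenvalue to be 1...
assert np.allclose(eigenvalues, 1.0)
# ...and since the eigenvectors span the space, M v = v for all v, i.e. M = I.
assert np.allclose(M, np.eye(5))
```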
 
  • #27
Dick said:
That's what I meant, all right. But you are overcomplicating the notation. The eigenvectors e_i of A^T A span the vector space (that's what diagonalizable means). If (A^T A) e_i = r_i e_i (r_i being the eigenvalue of e_i), then e_i^T (A^T A) e_i = r_i e_i^T e_i = e_i^T e_i (since x^T (A^T A) x = x^T x). So, sure, all of the eigenvalues are 1. But now any vector v can be written as a linear combination of those eigenvectors. So (A^T A) v = v for all v. So A^T A = I.

Thanks, finally I understood.
 