Did I Make a Mistake in Computing the Gradient on a Unitary Group?

  • Context: Graduate
  • Thread starter: Kreizhn
  • Tags: Coordinates, Gradient

Discussion Overview

The discussion revolves around the computation of the gradient on a unitary group, specifically focusing on the mathematical framework of Riemannian manifolds and the implications of using complex coordinates in this context. Participants explore the properties of tangent vectors, directional derivatives, and the nature of functions defined on the unitary group.

Discussion Character

  • Technical explanation
  • Conceptual clarification
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant defines the gradient of a smooth function on a matrix Riemannian manifold and expresses it in terms of local coordinates, raising a question about the nature of the resulting values.
  • Another participant expresses confusion regarding the expectation that terms should be real, questioning the implications of local coordinates on the nature of the tangent vectors.
  • A participant seeks clarification on the representation of tangent space elements as derivations and whether the matrix representation of these elements requires a doubling of the coordinate system due to the complex nature of matrices in the unitary group.
  • One reply suggests embedding the unitary group into a real general linear group to compute the gradient, proposing the use of real tangent vectors.
  • Another participant expresses uncertainty about the term "trivializing" and discusses the use of directional derivatives, indicating a need for clarity on how to represent tangent elements as derivations.
  • A participant acknowledges a mistake in their previous post regarding the representation of tangent vectors and explores the possibility of mixing suggestions to embed the unitary group into a real space while maintaining a complex structure.

Areas of Agreement / Disagreement

Participants express differing views on the nature of the terms involved in the gradient computation, with some asserting that they must be real while others question this assumption. The discussion remains unresolved regarding the best approach to represent tangent elements and compute the gradient.

Contextual Notes

There are unresolved questions regarding the assumptions made about the nature of tangent vectors in complex spaces and the implications of using different representations for computations. The discussion also touches on the limitations of the definitions used in the context of the unitary group.

Kreizhn
Hopefully this is a simple enough question.

Let (M,g) be a matrix Riemannian manifold and f: M \to \mathbb R a smooth function. Take p \in M and let \{ X_1,\ldots, X_n \} be a local orthonormal frame for a neighbourhood of p. We can define a gradient of f in a neighbourhood of p as
\nabla f = \sum_{i=1}^n (X_i f) \, X_i

Now in my situation, I'm actually trying to compute this in a local coordinate system. Let \{ x^{jk} \} be a local coordinate system and write X_i = a_i^{jk} \frac{\partial}{\partial x^{jk}} (summation over repeated indices implied). Then our gradient, evaluated at (for simplicity) p, becomes
\begin{align*}
(\nabla f)_p &= \sum_{i=1}^n \underbrace{\left( a_i^{jk} \left.\frac{\partial}{\partial x^{jk}}\right|_p \right)}_{X_i} f \, \underbrace{\left( a_i^{mn} \left.\frac{\partial}{\partial x^{mn}}\right|_p \right)}_{X_i} \\
&= \sum_{i=1}^n a_i^{jk} a_i^{mn} \frac{\partial f}{\partial x^{jk}}(p) \left.\frac{\partial}{\partial x^{mn}}\right|_p
\end{align*}

Now we know that (X_i f)(p) = X_i(p) f \in \mathbb R if we use the derivation definition of tangent vectors. But this term simply corresponds to
a_i^{jk} \frac{\partial f}{\partial x^{jk}}(p) \in \mathbb R

Now here is my problem. Let's take M = U(N), the group of N \times N unitary matrices, and f(p) = \text{tr}[y^\dagger p]\text{tr}[p^\dagger y ]
for some y \in U(N). We know that f(p) \in \mathbb R since \langle y,p \rangle = \text{tr}[y^\dagger p ] is just the Hilbert-Schmidt inner product, so f(p) is nothing more than f(p) = |\langle y,p \rangle|^2. But when I compute the terms
a_i^{jk} \frac{\partial f}{\partial x^{jk}}(p)
I get values in \mathbb C. In fact, I can't see any reason why this HAS to be in \mathbb R. Did I just make a silly mistake?
 
I'm quite confused. How can any of the terms not be real? By definition of local coordinates, the X_i are R-linear combinations of the basis vectors.
 
Hey zhentil,

I'm glad that you've somehow tracked down my other posts as well and responded to them!

It might be that there is a flaw in my understanding. So if I may, perhaps you could tell me if what I'm doing here is correct:

My issue first came when trying to figure out how elements of the tangent space acted as derivations if we assigned them a matrix representation. I figured that the matrix, like the vector, simply represents the coefficients of the tangent element in a prescribed set of coordinates, so that if I have a matrix V \in T_p M for some p \in M then if V has a representation
V = \begin{pmatrix} v^{11} & v^{12} & \cdots & v^{1n} \\ v^{21} & v^{22} & \cdots & v^{2n} \\ \vdots & \vdots & \ddots & \vdots \\ v^{n1} & v^{n2} & \cdots & v^{nn} \end{pmatrix}
then what I'm really expressing is
V = v^{ij} \frac{\partial}{\partial x^{ij} }
for some local coordinate system in a neighbourhood of p. Is this correct?

If it is correct, then the next issue is what happens when we look at matrices in U(N), whose entries are naturally allowed to be in \mathbb C despite the fact that it is a real Lie group with real Lie algebra \mathfrak{u}(N). Is it that each element is v^{ij} = v_{\mathbb R}^{ij} + i v_{i\mathbb R}^{ij}, and this requires me to actually double the size of my coordinate system?
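
One way to get a feel for the coordinate count is to build a real basis of \mathfrak{u}(N) explicitly. The following is a minimal sketch, assuming NumPy and the real inner product \langle A, B \rangle = \text{Re}\,\text{tr}[A^\dagger B]; the helper name skew_hermitian_basis is hypothetical and not from the thread. It suggests that N^2 real coefficients are enough to describe a tangent direction, even though the matrix entries themselves are complex.

```python
import numpy as np

def skew_hermitian_basis(n):
    """Real basis of u(n), orthonormal under <A, B> = Re tr(A^dagger B)."""
    basis = []
    for j in range(n):
        # Diagonal direction: i * E_jj
        d = np.zeros((n, n), dtype=complex)
        d[j, j] = 1j
        basis.append(d)
        for k in range(j + 1, n):
            # Real antisymmetric direction: (E_jk - E_kj) / sqrt(2)
            a = np.zeros((n, n), dtype=complex)
            a[j, k], a[k, j] = 1.0, -1.0
            basis.append(a / np.sqrt(2))
            # Imaginary symmetric direction: i * (E_jk + E_kj) / sqrt(2)
            s = np.zeros((n, n), dtype=complex)
            s[j, k], s[k, j] = 1j, 1j
            basis.append(s / np.sqrt(2))
    return basis

n = 3
basis = skew_hermitian_basis(n)

# Gram matrix with respect to the real inner product Re tr(A^dagger B)
gram = np.array([[np.trace(a.conj().T @ b).real for b in basis] for a in basis])

print(len(basis))                        # number of real directions, n**2
print(np.allclose(gram, np.eye(n**2)))   # orthonormality check
```

A tangent vector at p can then be written as p times a real linear combination of these basis matrices, so the coefficients are real while the matrix representation stays complex.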
 
If you're talking about trivializing the tangent space and computing a Jacobian, you need to use real tangent vectors, i.e. embed U(n) into GL(2n, R), etc.

But why not try computing it using the definition of the directional derivative? I imagine it would be much easier here. You could use complex matrices and avoid the headache.
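
The directional-derivative suggestion can be checked numerically. The following is a minimal sketch, assuming NumPy, a tangent vector of the form v = pA with A skew-Hermitian, and the product-rule formula Df(p)[v] = \text{tr}[y^\dagger v]\text{tr}[p^\dagger y] + \text{tr}[y^\dagger p]\text{tr}[v^\dagger y]; all function names are hypothetical. It also evaluates the sum of v^{jk} \partial f / \partial p^{jk} with the entries of p treated as complex coordinates, which is generally complex and is one plausible source of the complex values described in the question.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(n):
    # The Q factor of a complex Gaussian matrix is unitary.
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

def random_skew_hermitian(n):
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (z - z.conj().T) / 2

def f(p, y):
    # f(p) = tr[y^dagger p] tr[p^dagger y] = |<y, p>|^2
    return (np.trace(y.conj().T @ p) * np.trace(p.conj().T @ y)).real

def directional_derivative(p, y, v):
    # Product rule applied to f in the direction v.
    return (np.trace(y.conj().T @ v) * np.trace(p.conj().T @ y)
            + np.trace(y.conj().T @ p) * np.trace(v.conj().T @ y))

n = 3
p, y = random_unitary(n), random_unitary(n)
v = p @ random_skew_hermitian(n)   # tangent vector at p

d = directional_derivative(p, y, v)

# Central finite difference of f along p + t v (f extends to all complex matrices).
h = 1e-6
fd = (f(p + h * v, y) - f(p - h * v, y)) / (2 * h)

# Derivative with respect to the complex entries p^{jk} only (conjugates held fixed).
partial_wrt_entries = y.conj() * np.trace(p.conj().T @ y)
holomorphic_part = np.sum(v * partial_wrt_entries)

print(d)                  # imaginary part vanishes up to rounding
print(fd)                 # agrees with the real part of d
print(holomorphic_part)   # generally complex; twice its real part recovers d
```

In this sketch the full directional derivative is the holomorphic part plus its complex conjugate, which is why it comes out real even though the individual complex-coordinate terms do not.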
 
I'm not sure what you mean by "trivializing." Do you mean in the same sense as we assign a trivialization to a fibre-bundle in order to associate each fibre to a vector space?

I'm actually trying to use the "directional derivative" approach here. Namely, I'm using the definition of the tangent space that recognizes tangent elements as derivations. I'm just not certain how to make a matrix representation of a tangent element into a derivation.

I realize I made a mistake in the above post, and the V should be
V = v^{ij} \left. \frac{\partial}{\partial x^{ij}} \right|_p
which we can see will send a function to its directional derivative in the direction of V, albeit in this case the v^{ij} are complex.

Can I mix both suggestions and embed U(n) into GL(2n,R) and then use a coordinate basis to make the v^{ij} real? What is the best way to do this? If memory serves, we normally use a symplectic representation of \mathbb C, right? Namely
\mathbb C = \left\{ \begin{pmatrix} a & b \\ -b & a \end{pmatrix} : a,b \in \mathbb R \right\}
So that the decomposition a + ib becomes a I + bJ, where I is the identity element and J is the symplectic element which endows our space with an almost-complex structure. Yes?
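
The block version of this representation can be tested directly. The following is a minimal sketch, assuming NumPy and the same sign convention as above, so that A + iB is sent to the real block matrix with rows (A, B) and (-B, A); the function name realify is hypothetical. It checks that the map respects matrix products, that a unitary matrix lands on an orthogonal one, and that the image of i squares to minus the identity.

```python
import numpy as np

def realify(m):
    """Send A + iB (A, B real n x n) to the real 2n x 2n matrix [[A, B], [-B, A]]."""
    a, b = m.real, m.imag
    return np.block([[a, b], [-b, a]])

rng = np.random.default_rng(1)
n = 3

# Two random unitary matrices from QR factorisations of complex Gaussian matrices.
u, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
w, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))

ru, rw = realify(u), realify(w)
j = realify(1j * np.eye(n))   # image of multiplication by i

print(np.allclose(realify(u @ w), ru @ rw))    # products are preserved
print(np.allclose(ru.T @ ru, np.eye(2 * n)))   # unitary maps to orthogonal
print(np.allclose(j @ j, -np.eye(2 * n)))      # J squared is minus the identity
print(np.allclose(ru @ j, j @ ru))             # the image commutes with J
```

Under these assumptions the image of U(n) sits inside GL(2n,R) as the orthogonal matrices commuting with J, and tangent vectors there have real entries.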
 
