Did I Make a Mistake in Computing the Gradient on a Unitary Group?

Kreizhn · Mar 21, 2011

Hopefully this is a simple enough question.

Let (M,g) be a matrix Riemannian manifold and [itex]f: M \to \mathbb R[/itex] a smooth function. Take [itex]p \in M[/itex] and let [itex]\{ X_1,\ldots, X_n \}[/itex] be a local orthonormal frame for a neighbourhood of p. We can define a gradient of f in a neighbourhood of p as
[tex]\nabla f = \sum_{i=1}^n (X_i f)\, X_i[/tex]

Now in my situation, I'm actually trying to compute this in a local coordinate system. Let [itex]\{ x^{jk} \}[/itex] be a local coordinate system and write [itex]X_i = a_i^{jk} \frac{\partial}{\partial x^{jk}}[/itex]. Then our gradient, evaluated at (for simplicity) p becomes
[tex]\begin{align*} (\nabla f)_p &= \sum_{i=1}^n \underbrace{\left( a_i^{jk} \left.\frac{\partial}{\partial x^{jk}}\right|_p \right)}_{X_i} f \underbrace{\left(a_i^{mn} \left.\frac{\partial}{\partial x^{mn}}\right|_p\right)}_{X_i} \\ &= \sum_{i=1}^n a_i^{jk} a_i^{mn} \frac{\partial f}{\partial x^{jk}}(p) \left.\frac{\partial}{\partial x^{mn}}\right|_p \end{align*}[/tex]

Now we know that [itex](X_i f)(p) = X_i(p) f \in \mathbb R[/itex] if we use the derivation definition of tangent vectors. But this term simply corresponds to
[tex]a_i^{jk} \frac{\partial f}{\partial x^{jk}}(p) \in \mathbb R[/tex]

Now here is my problem. Let's take [itex]M = U(N)[/itex], the group of [itex]N \times N[/itex] unitary matrices, and [itex]f(p) = \text{tr}[y^\dagger p]\,\text{tr}[p^\dagger y ][/itex]
for some [itex]y \in U(N)[/itex]. We know that [itex]f(p) \in \mathbb R[/itex] since [itex]\langle y,p \rangle = \text{tr}[y^\dagger p ][/itex] is just the Hilbert-Schmidt inner product, so f(p) is nothing more than [itex]f(p) = |\langle y,p \rangle|^2[/itex]. But when I compute the terms
[tex]a_i^{jk} \frac{\partial f}{\partial x^{jk}}(p)[/tex]
I get values in [itex]\mathbb C[/itex]. In fact, I can't see any reason why this HAS to be in [itex]\mathbb R[/itex]. Did I just make a silly mistake?

zhentil · Mar 21, 2011

I'm quite confused. How can any of the terms not be real? By definition of local coordinates, the X_i are R-linear combinations of the basis vectors.

Kreizhn · Mar 22, 2011

Hey zhentil,

I'm glad that you've somehow tracked down my other posts as well and responded to them!

It might be that there is a flaw in my understanding. So if I may, perhaps you could tell me if what I'm doing here is correct:

My issue first came when trying to figure out how elements of the tangent space acted as derivations if we assigned them a matrix representation. I figured that the matrix, like the vector, simply represents the coefficients of the tangent element in a prescribed set of coordinates, so that if I have a matrix [itex]V \in T_p M[/itex] for some [itex]p \in M[/itex] then if V has a representation
[tex]V = \begin{pmatrix} v^{11} & v^{12} & \cdots & v^{1n} \\ v^{21} & v^{22} & \cdots & v^{2n} \\ \vdots & \vdots & \ddots & \vdots \\ v^{n1} & v^{n2} & \cdots & v^{nn} \end{pmatrix}[/tex]
then what I'm really expressing is
[tex]V = v^{ij} \frac{\partial}{\partial x^{ij} }[/tex]
for some local coordinate system in a neighbourhood of p. Is this correct?

If it is correct, then the next issue is what happens when we look at matrices in [itex]U(N)[/itex], whose entries are naturally allowed to be in [itex]\mathbb C[/itex] despite the fact that it is a real Lie group with real Lie algebra [itex]\mathfrak{u}(N)[/itex]? Is it that each element is [itex]v^{ij} = v_{\mathbb R}^{ij} + i v_{i\mathbb R}^{ij}[/itex], and that this requires me to actually double the size of my coordinate system?

zhentil · Mar 22, 2011

If you're talking about trivializing the tangent space and computing a Jacobian, you need to use real tangent vectors, i.e. embed U(n) into GL(2n, R), etc.
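To illustrate what "real tangent vectors" can look like here, the following is a rough sketch, not anything from the posts above. It assumes the tangent space at the identity is the space of skew-Hermitian matrices [itex]\mathfrak{u}(N)[/itex] (tangent vectors at p are then p times such a matrix) and that we measure with the real inner product [itex]\text{Re}\,\text{tr}[A^\dagger B][/itex]. The function names `u_n_basis` and `real_coordinates` are made up for illustration. The point is that the basis has [itex]N^2[/itex] elements and every skew-Hermitian matrix has purely real coefficients in it, even though the matrix entries themselves are complex.

```python
import numpy as np

def u_n_basis(n):
    """Return n^2 skew-Hermitian matrices, orthonormal under Re tr(A^dagger B)."""
    basis = []
    for j in range(n):
        # Diagonal generator: i * E_jj
        d = np.zeros((n, n), dtype=complex)
        d[j, j] = 1j
        basis.append(d)
        for k in range(j + 1, n):
            # Real antisymmetric generator: (E_jk - E_kj) / sqrt(2)
            s = np.zeros((n, n), dtype=complex)
            s[j, k], s[k, j] = 1.0, -1.0
            basis.append(s / np.sqrt(2))
            # Imaginary symmetric generator: i (E_jk + E_kj) / sqrt(2)
            t = np.zeros((n, n), dtype=complex)
            t[j, k], t[k, j] = 1j, 1j
            basis.append(t / np.sqrt(2))
    return basis

def real_coordinates(a, basis):
    """Real coefficients of a skew-Hermitian matrix a in the given basis."""
    return np.array([np.trace(b.conj().T @ a).real for b in basis])
```

For example, with `basis = u_n_basis(3)` and any skew-Hermitian `a`, the sum of `c * b` over `zip(real_coordinates(a, basis), basis)` should reproduce `a`, using 9 real numbers rather than 18.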

But why not try computing it using the definition of the directional derivative? I imagine it would be much easier here. You could use complex matrices and avoid the headache.
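A minimal numerical sketch of this suggestion, under the assumption that a tangent vector at p is written as pA with A skew-Hermitian, so that the curve [itex]t \mapsto p\,e^{tA}[/itex] stays in U(N). For [itex]f(p) = |\text{tr}[y^\dagger p]|^2[/itex], differentiating along that curve at t = 0 gives [itex]2\,\text{Re}\left( \overline{\text{tr}[y^\dagger p]}\, \text{tr}[y^\dagger p A] \right)[/itex], which is real by construction. All helper names below are hypothetical; the code just compares this formula with a central finite difference.

```python
import numpy as np

def random_unitary(n, rng):
    # The Q factor of a QR factorisation of a complex matrix is unitary.
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, _ = np.linalg.qr(z)
    return q

def random_skew_hermitian(n, rng):
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (z - z.conj().T) / 2

def exp_skew_hermitian(a, t):
    # a = i*h with h Hermitian, so exp(t*a) = V diag(exp(i*t*w)) V^dagger.
    h = -1j * a
    w, v = np.linalg.eigh(h)
    return (v * np.exp(1j * t * w)) @ v.conj().T

def f(p, y):
    return abs(np.trace(y.conj().T @ p)) ** 2

def directional_derivative(p, y, a):
    # d/dt f(p exp(tA)) at t = 0, from the product rule on <y,p> * conj(<y,p>).
    inner = np.trace(y.conj().T @ p)
    d_inner = np.trace(y.conj().T @ p @ a)
    return 2.0 * (np.conj(inner) * d_inner).real

def finite_difference(p, y, a, h=1e-6):
    # Central difference along the curve t -> p exp(tA), which stays in U(N).
    f_plus = f(p @ exp_skew_hermitian(a, h), y)
    f_minus = f(p @ exp_skew_hermitian(a, -h), y)
    return (f_plus - f_minus) / (2 * h)
```

If the two functions agree for random p, y and A, that suggests complex values in the original computation come from treating the complex matrix entries as if they were real coordinates, not from f itself.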

Kreizhn · Mar 23, 2011

I'm not sure what you mean by "trivializing." Do you mean in the same sense as we assign a trivialization to a fibre-bundle in order to associate each fibre to a vector space?

I'm actually trying to use the "directional derivative" approach here. Namely, I'm using the definition of the tangent space that recognizes tangent elements as derivations. I'm just not certain how to make a matrix representation of a tangent element into a derivation.

I realize I made a mistake in the above post, and the V should be
[tex]V = v^{ij} \left. \frac{\partial}{\partial x^{ij}} \right|_p[/tex]
which we can see will assign a function to its directional derivative in the direction of V, albeit in this case the [itex]v^{ij}[/itex] are complex.

Can I mix both suggestions and embed U(n) into GL(2n,R) and then use a coordinate basis to make the [itex]v^{ij}[/itex] real? What is the best way to do this? If memory serves, we normally use a symplectic representation of [itex]\mathbb C[/itex], right? Namely
[tex]\mathbb C = \left\{ \begin{pmatrix} a & b \\ -b & a \end{pmatrix} : a,b \in \mathbb R \right\}[/tex]
So that the decomposition [itex]a + ib[/itex] becomes [itex]a I + bJ[/itex] where I is the identity element and J is the symplectic element which endows our space with almost-complex structure. Yes?
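As a quick sanity check on this representation, here is a small sketch that applies the same pattern blockwise: a complex n×n matrix A + iB is sent to the real 2n×2n matrix with blocks [[A, B], [-B, A]]. The name `realify` is just an illustrative choice. With this convention the map respects matrix products and sends the conjugate transpose to the ordinary transpose, so a unitary matrix lands on a real orthogonal matrix inside GL(2n,R).

```python
import numpy as np

def realify(z):
    """Map a complex n x n matrix A + iB to the real 2n x 2n block matrix [[A, B], [-B, A]]."""
    a, b = z.real, z.imag
    return np.block([[a, b], [-b, a]])
```

Here `realify(1j * np.eye(n))` plays the role of J: it squares to minus the identity, and `realify(z)` commutes with it for every complex `z`.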
