Formulating a Method of Steepest Ascent on Lie Groups


Discussion Overview

The discussion revolves around the formulation of a method of steepest ascent for optimizing a differentiable function defined on a compact Lie group. Participants explore the challenges of generalizing traditional optimization techniques, particularly the concept of the gradient, to the context of Lie groups.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant proposes starting with a point on the identity component of a compact Lie group and using the logarithm to facilitate optimization, suggesting the addition of a scaled gradient to this logarithm.
  • Another participant suggests fixing a bi-invariant Riemannian metric to define gradients, indicating that this could help in the optimization process.
  • A participant questions whether defining the gradient using a Riemannian metric would place it in the tangent space at the point of interest rather than at the identity, raising concerns about the implications for the optimization method.
  • Another reply confirms the proposed definition of the gradient and suggests using the pushforward of the left-multiplication map to carry vectors from ##T_pG## to ##T_eG##, noting that invariance of the metric should keep this operation well-behaved with respect to the gradient.

Areas of Agreement / Disagreement

Participants express differing views on the appropriate method for defining the gradient in the context of Lie groups, with some proposing the use of a Riemannian metric while others raise concerns about the implications of this choice. The discussion remains unresolved regarding the best approach to generalizing the method of steepest ascent.

Contextual Notes

There are limitations regarding the assumptions about the properties of the Riemannian metric and its implications for the gradient's placement within the tangent spaces of the Lie group.

Mandelbroth
Suppose we have a compact Lie group ##G##, and a differentiable function ##f:G_0\to\mathbb{R}## from the identity component of ##G## to the real numbers. I'm looking to maximize the value of this function.

Being something of a neophyte at optimization, especially of this kind, I decided to stick with something I thought I knew well: the method of steepest ascent. Long story short, I'm having trouble with generalizing the concept to Lie groups.

My idea was fairly simple. We start with some point ##p## on the identity component of ##G##, and then we take the logarithm of this point (that is, we take the inverse of the exponential map) because we want to add something to it. As a note, I justified this by saying that, if the point had more than one value for the logarithm, I could just pick one (every point on ##G_0## has at least one logarithm). Then, I would add some multiple of ##\nabla f_p## to ##\log(p)##, and finish by exponentiating to get ##p'=\exp(c\nabla f_p + \log(p))=\exp(c\nabla f_p)p##. Repeat.
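The update rule above can be tried out numerically on a concrete compact group. The following is a minimal sketch on ##G = SO(3)##, maximizing the illustrative objective ##f(R) = \operatorname{tr}(A^T R)## (the objective, step size, and all names here are assumptions for the example, not from the thread); it applies the multiplicative form of the update, ##p' = \exp(c\,\nabla f_p)\,p##, with Rodrigues' formula standing in for the matrix exponential on skew-symmetric matrices.

```python
import numpy as np

def expm_so3(Omega):
    """Matrix exponential of a 3x3 skew-symmetric matrix (Rodrigues' formula)."""
    w = np.array([Omega[2, 1], Omega[0, 2], Omega[1, 0]])
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3) + Omega
    return (np.eye(3)
            + (np.sin(theta) / theta) * Omega
            + ((1 - np.cos(theta)) / theta**2) * (Omega @ Omega))

def riemannian_grad(A, R):
    """Gradient of f(R) = tr(A^T R) in the Lie algebra so(3),
    w.r.t. the bi-invariant (Frobenius) metric: the skew part of A R^T."""
    M = A @ R.T
    return 0.5 * (M - M.T)

def steepest_ascent(A, R0, step=0.5, iters=200):
    R = R0
    for _ in range(iters):
        R = expm_so3(step * riemannian_grad(A, R)) @ R  # p' = exp(c grad) p
    return R

# Example: target rotation A; ascent from the identity drives R toward A,
# where f attains its maximum tr(A^T A) = 3.
t = 1.0
A = np.array([[np.cos(t), -np.sin(t), 0.0],
              [np.sin(t),  np.cos(t), 0.0],
              [0.0,        0.0,       1.0]])
R = steepest_ascent(A, np.eye(3))
print(np.trace(A.T @ R))  # close to 3
```

Note that the iterate stays exactly on the group by construction: each step left-multiplies by the exponential of a Lie-algebra element, so no logarithm of ##p## is ever needed.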

The problem is, I can't figure out what I would use for the analogue of the gradient. Maybe I'm just not seeing something? I don't know. Any nudge in the right direction would be greatly appreciated. Thank you.
 
Mandelbroth said:
The problem is, I can't figure out what I would use for the analogue of the gradient.

Not sure if this helps (the question is way outside my area of study), but you can always fix a (bi-invariant) Riemannian metric and define gradients using that.
 
jgens said:
Not sure if this helps (the question is way outside my area of study), but you can always fix a (bi-invariant) Riemannian metric and define gradients using that.
I'm unfamiliar with this concept, but Wikipedia claims to know something about this. To fact-check, you're suggesting that I could introduce a Riemannian metric ##g## and define ##\nabla f_p## by ##g_p(\nabla f_p, X_p)=X_p(f)##?

Wouldn't that put ##\nabla f_p## on ##T_p G## and not ##T_e G## (where ##e## is the identity on ##G##), though?
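In local coordinates the defining relation ##g_p(\nabla f_p, X_p) = X_p(f)## pins the gradient down as the metric inverse applied to the differential, ##\nabla f_p = g_p^{-1}\,df_p##. A minimal sketch, with an assumed metric matrix and differential (both values are illustrative):

```python
import numpy as np

# Defining relation g_p(grad, X) = df_p(X) for all X, i.e. G @ grad = df,
# so grad = G^{-1} df.  G is the metric's matrix in the chosen coordinates.
G = np.array([[2.0, 0.5],
              [0.5, 1.0]])   # illustrative positive-definite metric at p
df = np.array([1.0, -1.0])   # illustrative differential df_p in the same basis

grad = np.linalg.solve(G, df)

# Verify the defining relation against a couple of tangent vectors X:
for X in [np.array([1.0, 0.0]), np.array([0.3, -2.0])]:
    assert np.isclose(X @ G @ grad, df @ X)
```

Only when the metric matrix is the identity (e.g. the Euclidean case) does the gradient coincide with the vector of partial derivatives.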
 
Correct definition. You could also use the pushforward of the relevant left-multiplication map to carry vectors in ##T_pG## into ##T_eG##. Invariance of the metric should ensure this is pretty well-behaved with respect to the gradient too.
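For a matrix group the pushforward in question is concrete: left multiplication by ##p^{-1}## carries a tangent vector ##V \in T_pG## to ##p^{-1}V \in T_eG##. A quick numerical check on ##SO(3)## (the choice of group and the sample elements are illustrative), where ##T_eG## consists of the skew-symmetric matrices:

```python
import numpy as np

# A curve through p in SO(3): R(t) = p @ expm(t W) with W skew, so R'(0) = p @ W
# is a tangent vector at p.  Pushing it forward by left multiplication with
# p^{-1} (= p.T for SO(3)) should land back in the Lie algebra.
p = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])          # a 90-degree rotation, p in SO(3)
W = np.array([[ 0.0, -0.3,  0.2],
              [ 0.3,  0.0, -0.1],
              [-0.2,  0.1,  0.0]])        # element of so(3) (skew-symmetric)

V = p @ W                                 # tangent vector in T_p SO(3)
pulled = p.T @ V                          # pushforward of V under L_{p^{-1}}

assert np.allclose(pulled, W)             # recovers the so(3) element
assert np.allclose(pulled, -pulled.T)     # indeed skew-symmetric
```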
 
