Constrained Optimization via Lagrange Multipliers

In summary: the thread discusses a constrained optimization problem in which the constraint function is non-negative, so every feasible point is a minimum of the constraint and its gradient vanishes there. The Lagrange multiplier method then fails, since the gradients of the objective and the constraint cannot be parallel at the optimal point. The participants discuss alternatives such as penalty functions, substitution, and the dual, along with the non-linearity of the constraint and the uniqueness of the solution.
  • #1
Kreizhn
Hi,

I'm trying to do a constrained optimization problem. I shall omit the details as I don't think they're important to my issue. Let [itex] f:\mathbb R^n \to \mathbb R [/itex] and [itex] c:\mathbb R^n \to \mathbb R^+\cup\{0\}[/itex] be differentiable functions, where [itex] \mathbb R^+ = \left\{ x \in \mathbb R : x> 0 \right\} [/itex]. The problem I want to solve is
[tex] \min_{\vec x\in\mathbb R^n} f(\vec x), \quad \text{ subject to } c(\vec x) = 0 [/tex]

Now all constrained optimization procedures that I know of use Lagrange multipliers. In this case, the Lagrangian would be
[tex] L(\vec x,\lambda) = f(\vec x) - \lambda c(\vec x) [/tex]
and the key to this theory is that at the optimal point [itex] \vec x^* [/itex] the gradients are parallel, so that
[tex] \nabla f(\vec x^*) = \lambda \nabla c(\vec x^*) [/tex]
However, since every x that satisfies [itex] c(\vec x)=0[/itex] is a minimum of [itex]c(\vec x)[/itex], we have that [itex] \nabla c(\vec x) =\vec 0 [/itex] for all feasible points [itex] \vec x[/itex]. Hence the gradient of the constraint vanishes at every feasible point and the Lagrange multiplier method breaks down.

Is there a way to fix this? To change the constraint so that the gradient is non-zero? Should I switch to a non-Lagrange multiplier method? If such a method exists, what are they called? Any help would be appreciated.
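The breakdown is easy to see in a toy instance (hypothetical, not the actual c above): take [itex]c(x) = (x_1^2 + x_2^2 - 1)^2[/itex], which is non-negative and vanishes exactly on the unit circle, so every feasible point is a global minimum of c:

```python
import numpy as np

# Toy instance of the issue: c(x) = (x1^2 + x2^2 - 1)^2 >= 0,
# so every feasible point (c = 0) is a global minimum of c.
def c(x):
    return (x[0]**2 + x[1]**2 - 1.0)**2

def grad_c(x):
    # chain rule: grad c = 2*(x1^2 + x2^2 - 1) * (2*x1, 2*x2)
    s = x[0]**2 + x[1]**2 - 1.0
    return np.array([4.0 * s * x[0], 4.0 * s * x[1]])

x_feasible = np.array([1.0, 0.0])   # lies on the unit circle, c = 0
print(c(x_feasible))                # 0.0
print(grad_c(x_feasible))           # [0. 0.] -- the condition grad f = lambda * grad c is unusable
```

Any generic objective has a nonzero gradient on the circle, so no λ can make the gradients parallel there.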
 
  • #2
Kreizhn said:
Hi,

I'm trying to do a constrained optimization problem. I shall omit the details as I don't think they're important to my issue. Let [itex] f:\mathbb R^n \to \mathbb R [/itex] and [itex] c:\mathbb R^n \to \mathbb R^+\cup\{0\}[/itex] be differentiable functions, where [itex] \mathbb R^+ = \left\{ x \in \mathbb R : x> 0 \right\} [/itex]. The problem I want to solve is
[tex] \min_{\vec x\in\mathbb R^n} f(\vec x), \quad \text{ subject to } c(\vec x) = 0 [/tex]

Now all constrained optimization procedures that I know of use Lagrange multipliers. In this case, the Lagrangian would be
[tex] L(\vec x,\lambda) = f(\vec x) - \lambda c(\vec x) [/tex]
and the key to this theory is that at the optimal point [itex] \vec x^* [/itex] the gradients are parallel, so that
[tex] \nabla f(\vec x^*) = \lambda \nabla c(\vec x^*) [/tex]
However, since every x that satisfies [itex] c(\vec x)=0[/itex] is a minimum of [itex]c(\vec x)[/itex], we have that [itex] \nabla c(\vec x) =\vec 0 [/itex] for all feasible points [itex] \vec x[/itex].
Are you saying that, for this particular c, every x with c(x) = 0 is a minimum of c(x)? What about maxima? They will also have [itex]\nabla c= 0[/itex]. And, of course, if both maxima and minima of c occur where c(x) = 0, then c is identically 0. That's not much of a constraint!

Hence the gradient of the constraint vanishes at every feasible point and the Lagrange multiplier method breaks down.

Is there a way to fix this? To change the constraint so that the gradient is non-zero? Should I switch to a non-Lagrange multiplier method? If such a method exists, what are they called? Any help would be appreciated.
Well, you could, going back to Calc I methods, solve the constraint equation for one of the variables in terms of the others. That would reduce to a problem in one less dimension with no constraint.
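The substitution idea can be sketched on a hypothetical separate example (not the poster's problem, whose constraint turns out to resist isolation): minimize f(x, y) = x² + y² subject to x + y = 1, solve the constraint for y, and minimize in one variable.

```python
from scipy.optimize import minimize_scalar

# Constraint x + y = 1  =>  y = 1 - x; substitute into f(x, y) = x^2 + y^2
# and minimize the resulting one-dimensional, unconstrained objective.
res = minimize_scalar(lambda x: x**2 + (1.0 - x)**2)
print(res.x)   # ~0.5, i.e. the constrained minimizer is (x, y) = (1/2, 1/2)
```

This only works when the constraint can actually be solved for one of the variables, which is the sticking point below.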
 
  • #3
Hey. Thanks for the reply and sorry it took me so long to reply in turn.

Perhaps I should stress that c(x) is differentiable. In any case, the fact that the set [itex] S = c^{-1}(0) = \{ x \in \mathbb R^n : c(x) = 0 \} [/itex] consists of minima of c follows from the codomain of c being [itex] \mathbb R^+ \cup \{0\} [/itex]. From this and differentiability it is clear that any such [itex] x \in S [/itex] is a minimum of c(x), and hence [itex] \nabla c(x) \equiv \vec 0, \forall x \in S[/itex]. I'm not sure why you mentioned maxima, since they cannot occur in the feasible set. I can further guarantee that [itex] S \neq \emptyset [/itex].
Thus the feasible set of solutions to
[tex] \min_{\vec x \in \mathbb R^n} f(\vec x)[/tex]
with [itex] c(x) = 0 [/itex] acting as an active constraint is precisely S, so the solution must occur in S. This is where the problem with Lagrange multipliers comes in, since the gradients of f and c cannot possibly be parallel there.

Also, the constraint is highly non-linear, so it is impossible to isolate any of the variables.

Do you know of a method other than, say, penalty functions that deals with constrained optimization without the use of Lagrange multipliers?

I don't think it will help, but here are the functions:
[tex] f(\vec x) = \sum_{i=1}^n x_i^2 [/tex]
[tex] c(\vec x) = 2^{m+1} - 2\Re\text{Tr}\left[X_d^\dagger X_f(x) \right] [/tex]
where [itex] X_d \in SU(2^m) [/itex] and [itex] X_f: \mathbb R^n \to SU(2^m) [/itex] is defined by
[tex] X_f(x) = \prod_{i=1}^n \exp\left[ -i H_i x_i \right] [/tex]
for [itex] H_i \in \mathfrak{su}(2^m), i=1,\ldots, n [/itex].
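Since c(x) ≥ 0, one way to proceed without parallel gradients is a penalty-style relaxation: minimize f + μc directly. Below is a hypothetical small instance (m = 1, n = 2), using Pauli matrices as stand-in generators H_i and a reachable target X_d; this is a sketch of the penalty idea, not the poster's actual data or method.

```python
import numpy as np
from scipy.linalg import expm
from scipy.optimize import minimize

# Stand-in generators (Pauli matrices) and a target built to be reachable.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
H = [sx, sz]
X_d = expm(-1j * 0.3 * sx) @ expm(-1j * 0.7 * sz)

def X_f(x):
    # X_f(x) = prod_i exp(-i H_i x_i), in the order of the product
    U = np.eye(2, dtype=complex)
    for Hi, xi in zip(H, x):
        U = U @ expm(-1j * Hi * xi)
    return U

def c(x):
    # c(x) = 2^{m+1} - 2 Re Tr[X_d^dagger X_f(x)], with m = 1
    return 4.0 - 2.0 * np.real(np.trace(X_d.conj().T @ X_f(x)))

def penalized(x, mu=100.0):
    # f(x) + mu * c(x); c >= 0 means c itself is a valid penalty term
    return np.dot(x, x) + mu * c(x)

res = minimize(penalized, x0=[0.1, 0.1], method="BFGS")
print(c(res.x))   # close to 0: the penalty drives x toward the feasible set
```

The penalty minimizer only approaches S for finite μ; driving μ → ∞ (or using an exact/augmented penalty) is the usual way around that.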
 
  • #4
If you have some sort of feasibility certificate, then every x being a minimum does not make too much sense. Either your function is not convex, so you have local minima, or your objective is somewhat flat, so there is no unique solution. Maybe you need more constraints to make some sense out of that equality constraint. Also, write the equality constraint as [itex]c(x)\leq 0, c(x) \geq 0[/itex] if you didn't already.
 
  • #5
Thanks for the reply trambolin. I'm not certain about what you mean by
...then every x being a minimum does not make too much sense.
The set of feasible points is precisely the set of points satisfying c(x) = 0. Any such x is a minimum of c(x). However, c(x) is not the function we are trying to minimize: we are trying to minimize f(x) given that c(x) = 0. It's as if we were trying to find the best [itex] \vec x [/itex] that simultaneously satisfies
[tex] \min f(\vec x), \min c(\vec x) [/tex]

To my knowledge we do not normally talk about the convexity of the objective function in the case of constrained optimization. The implication that a local minimum of the function is the global minimum is meaningless unless we are fortunate enough that the minimum lies within the feasible set. Whether the uniqueness guarantee of convexity holds in the non-linear case I'm not certain, but nonetheless the issue is that we are unable to find solutions, as Lagrange multiplier methods break down.

Theoretically, there are no more constraints to add. This is equivalent to a fixed arc number, terminal-state time-optimal control problem. The constraint equation is precisely that the Lie-group restricted evolution of the identity matrix driven by the control fields [itex] H_i [/itex] arrive at a desired matrix; that is, that our initial state arrives at the desired final state. We now want to minimize the time in which this happens.
 
  • #6
If you have a feasibility certificate (say first order, Slater, or second order, etc.), you have a feasible set, right? So there are some x such that c(x) = 0. You say every x is a minimum of c(x). Now what I am saying is this: why do you even bother about that? It is f(x) that should be the point of interest. You take any x from the feasible set, plug it into f(x), and that gives you a number, and you minimize over this. Now if f is not convex, or if your set is not path-connected, you run into problems such as getting stuck in a local minimum or not having all the feasible points in the optimization scheme. Try the YALMIP wiki for some additional technical details. What you write here as a question is the essence of optimality. Hence the Lagrange multiplier method does not break down; it tells you that you have found the optimal (arguably local) point.
 
  • #7
What do you mean "why do you even bother with [c(x)]"? This is the essence of equality constrained optimization. The constraints are what define the feasible set; that is, the feasible set is precisely the level set of the constraint function. Furthermore, the theory of Lagrange multipliers for implementing solutions makes use of the critical fact that the solution will occur at a point at which the gradient of the constraints and the objective function are parallel.

Now here's the thing: it is that theory of Lagrange multipliers from which the Karush-Kuhn-Tucker conditions are derived, and consequently the Slater and first- and second-order optimality conditions. The fact that the level set c(x) = 0 also coincides with minima of c(x) means that all such optimality conditions break down. In particular, the statement that if [itex] \mathcal L(x,\lambda) [/itex] is the Lagrangian, then at the optimal point in the state/costate space [itex] (x^*,\lambda^*) [/itex]
[tex] \nabla_x \mathcal L(x^*,\lambda^*) = 0 \Rightarrow \nabla f(x^*) = \lambda^* \nabla c(x^*) [/tex]
This condition will never be satisfied. However, we know a solution exists because not only is the feasible set non-empty (which we can prove), it's also believed (heuristically) to be totally disconnected. Thus while we would be satisfied with an infimizing solution, it seems the set is at most countable within an arbitrary radius of the origin.

Furthermore, since we do not have a priori knowledge of the location of all members of the feasible set, we cannot apply combinatorial optimization techniques to the problem. Hence we rely on relaxation techniques on the feasible domain, which is why Lagrangian methods have hitherto been employed.
 
  • #8
If you need further evidence, note that the proof of the first-order optimality conditions (the KKT conditions) requires that the optimal solution satisfy the linear independence constraint qualification (LICQ). In this case the active constraint gradient is identically zero, which is by itself a linearly dependent set. Hence KKT, and anything derived from it, cannot be used to solve this problem.
 
  • #10
Also, I forgot to ask whether the manifold you optimize over is linear, and whether you have checked the dual.
 
  • #11
Yes, there seem to be some issues in communicating our points across.

The ambient space of the domain is linear as it is just [itex] \mathbb R^n [/itex]. The zero-level set of the constraint is not linear.

I have not checked the dual, but to my understanding maximizing the dual will not give you the solution unless you have a constraint qualification. LICQ most certainly does not hold, so that seems to be moot.
 

1. What is constrained optimization?

Constrained optimization is a mathematical technique used to find the maximum or minimum value of a function subject to certain constraints. These constraints can be in the form of equations or inequalities.

2. What are Lagrange multipliers?

Lagrange multipliers are mathematical tools used to solve constrained optimization problems. They allow us to convert a constrained optimization problem into an unconstrained one, which is easier to solve.

3. How do Lagrange multipliers work?

Lagrange multipliers work by introducing a new variable, known as the multiplier, into the objective function of a constrained optimization problem. This allows us to take into account the constraints while finding the optimal solution.
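As a hypothetical worked example of the procedure just described: to minimize f = x² + 2y² subject to x + y = 3, form the Lagrangian and set all its partial derivatives to zero.

```python
import sympy as sp

# Lagrangian L = f - lam * (constraint); stationarity in x, y, and lam
# recovers both the parallel-gradient condition and the constraint itself.
x, y, lam = sp.symbols("x y lam", real=True)
L = x**2 + 2*y**2 - lam * (x + y - 3)
sol = sp.solve([sp.diff(L, v) for v in (x, y, lam)], [x, y, lam], dict=True)
print(sol)   # x = 2, y = 1, lam = 4
```

Stationarity gives 2x = λ and 4y = λ, so x = 2y, and the constraint then forces (x, y) = (2, 1) with λ = 4.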

4. What is the benefit of using Lagrange multipliers?

The use of Lagrange multipliers simplifies the process of solving constrained optimization problems by converting them into unconstrained problems. This makes it easier to find the optimal solution using traditional optimization techniques.

5. What are some real-world applications of constrained optimization via Lagrange multipliers?

Constrained optimization via Lagrange multipliers is used in a variety of fields, including economics, physics, engineering, and operations research. Some examples of real-world applications include finding the most efficient way to allocate resources in a company, optimizing production processes, and maximizing profits while adhering to certain constraints.
