maistral said:
You have no idea how teary-eyed and satisfied I am right now.
At the bare minimum, I believe I know how to execute the method, and I have at least some idea of how it works. Thank you very much. Now I'm trying to mix the ALM with barrier functions and I'll see what happens. Thank you very much again!
Why does the method work?
Look at solving ##\min f(x)##, subject to ##g(x) =0## by a quadratic penalty method. For a fixed, finite penalty parameter ##m > 0## the unconstrained penalized problem is ##\min Z_m = f(x) + m[g(x)]^2.## In general, for finite ##m## the solution ##x_m## will not satisfy ##g(x) = 0##, but instead will have ##g(x_m) = r_m##, some non-zero value. The optimality condition is ##\nabla_x Z_m = 0##, so ##\nabla_x f(x_m) + 2 m r_m \nabla_x g(x_m) = 0.## This is the first-order condition for the problem ##\min f(x)## subject to ##g(x) = r_m## for the nonzero value ##r_m##; thus, we have the optimal solution of a "nearby" problem.
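To make this concrete, here is a minimal Python sketch (my own illustration, not from the original post) that applies the plain quadratic penalty to the one-dimensional problem used later in this post, ##\min x^2## s.t. ##x - 1 = 0##. The closed-form minimizer ##x_m = m/(1+m)## comes from setting ##dZ_m/dx = 0##; the residual ##r_m = g(x_m)## stays nonzero for every finite ##m##, while the first-order condition of the "nearby" problem holds.

```python
# Plain quadratic penalty applied to:  min x^2  s.t.  g(x) = x - 1 = 0.
# For each fixed penalty parameter m, the penalized objective
#     Z_m(x) = x^2 + m*(x - 1)^2
# has the closed-form minimizer x_m = m / (1 + m), found from dZ_m/dx = 0.
for m in [1.0, 10.0, 100.0, 1000.0]:
    x_m = m / (1.0 + m)          # unconstrained minimizer of Z_m
    r_m = x_m - 1.0              # constraint residual g(x_m); nonzero for finite m
    # First-order condition of the nearby problem  min x^2  s.t.  g(x) = r_m:
    foc = 2.0 * x_m + 2.0 * m * r_m * 1.0   # grad f + 2*m*r_m * grad g, ~ 0
    print(f"m={m:7.1f}  x_m={x_m:.6f}  g(x_m)={r_m:+.6f}  FOC={foc:+.2e}")
```

The printout shows ##x_m## creeping toward 1 only as ##m## grows, which is exactly why the plain penalty method needs large penalty parameters.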
Now comes the clincher: suppose we apply the penalty method to a modified problem ##\min f(x)## subject to ##g(x) = a##. The penalized problem would be ##\min Z_{m,a} = f(x) + m [g(x)-a]^2##. For finite ##m > 0## the solution would not satisfy ##g(x) = a##, but if we are clever and lucky it might satisfy ##g(x) = 0##! In other words, we could get the solution of the ##g = 0## problem by using the penalty problem for ##g = a##, if we somehow choose the right value of ##a##.
Note that ##f(x) + m[g(x)-a]^2 = f(x) + m[g(x)]^2 - 2ma\, g(x) + m a^2##. Apart from the additive constant ##m a^2##, this is the augmented Lagrangian ##f(x) + \lambda g(x) + m[g(x)]^2## with ##\lambda = -2 m a## (the same sign convention as in the example below). That means that if we can choose the right value of ##\lambda##, the augmented penalty problem gives the exact solution right away. We do NOT need to make ##m## large for this to happen. In practice, one may increase ##m## by some modest amount from one step to the next, while at the same time updating the value of ##\lambda##. Workable algorithms along this line can be very fast and efficient.
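As a quick sanity check of that algebra (my own sketch, assuming Python with sympy is available; the particular ##f## and ##g## used are just the example problem below), one can verify symbolically that the shifted penalty and the augmented Lagrangian differ only by the constant ##m a^2##:

```python
import sympy as sp

x, m, a = sp.symbols('x m a')
f = x**2                  # example objective (same as the problem below)
g = x - 1                 # example constraint function g(x)
lam = -2*m*a              # multiplier implied by the shift a

shifted_penalty = f + m*(g - a)**2
aug_lagrangian = f + lam*g + m*g**2
# The difference should be exactly the constant m*a^2:
print(sp.expand(shifted_penalty - aug_lagrangian))   # prints a**2*m
```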
As an example, consider your problem ##\min x^2## s.t. ##x-1 = 0##. I will denote the Lagrange multiplier by ##u## instead of ##\lambda## (a common choice in the optimization literature), so the augmented penalty problem is
$$\min Z_m = x^2 + m(x-1)^2 + u(x-1)$$
For fixed ##m > 0## and ##u##, setting ##dZ_m/dx = 2x + 2m(x-1) + u = 0## gives the solution
$$x = x_{m,u} = \frac{2m-u}{2m+2}.$$
This gives a new Lagrange multiplier
$$u_{\text{new}} = u + 2m (x_{m,u} - 1).$$
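Before iterating, it is worth checking the earlier claim that the right multiplier gives the exact answer without a large penalty: plugging the true multiplier ##u = -2## into the formula for ##x_{m,u}## gives ##x = 1## for every ##m > 0##. A tiny Python sketch (mine, not from the post) confirms it:

```python
# With the correct multiplier u = -2, the minimizer of the augmented
# Lagrangian is exactly x = 1 for ANY m > 0 -- no large penalty is needed.
u = -2.0
for m in [0.1, 1.0, 10.0]:
    x = (2.0 * m - u) / (2.0 * m + 2.0)
    print(f"m={m:5.1f}  x={x}")   # x is 1.0 in every case
```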
Fixing ##m## at ##m = 1## and starting from ##u_0 = 0##, the sequence of solution values ##x_n## and multipliers ##u_n## is:
$$ \begin{array}{rcc}
n & x_n & u_n \\
1 &0.500000 & -1.000000\\
2 &0.750000 &-1.500000 \\
3 &0.875000 &-1.750000 \\
4 & 0.937500 & -1.875000 \\
5 & 0.968750 &-1.937500 \\
6 & 0.984375 & -1.968750 \\
7 & 0.992188 &-1.984375 \\
8 & 0.996094 & -1.992188 \\
9 & 0.998047 & -1.996094 \\
10 & 0.999023 & -1.998047 \\
11 & 0.999512 & -1.999023 \\
12 &0.999756 &-1.999512 \\
\end{array}
$$
We see that ##x_n## and ##u_n## converge to the correct optimal values ##x = 1## and ##u = -2##, all while ##m## is held fixed at the quite small value ##m = 1##. Faster convergence would be obtained by also increasing ##m## from one step to the next.
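For completeness, here is a short Python sketch (my own, not part of the original post) that reproduces the table above using the closed-form minimizer and the multiplier update derived earlier:

```python
# Augmented Lagrangian iteration for  min x^2  s.t.  x - 1 = 0.
# m is held fixed at 1; only the multiplier u is updated each step.
m = 1.0
u = 0.0
for n in range(1, 13):
    x = (2.0 * m - u) / (2.0 * m + 2.0)   # minimizer of x^2 + m*(x-1)^2 + u*(x-1)
    u = u + 2.0 * m * (x - 1.0)           # multiplier update
    print(f"{n:2d}  {x:.6f}  {u:.6f}")
```

Increasing ##m## inside the same loop (say, by a constant factor each step) speeds up the convergence considerably, as noted above.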