What is the Nelder-Mead algorithm and how can it be implemented in pure Python?

  • Context: Python
  • Thread starter: m4r35n357
  • Tags: Pure Python

Discussion Overview

The discussion revolves around the Nelder-Mead algorithm, focusing on its implementation in pure Python without dependencies like Numpy. Participants explore the algorithm's structure, performance considerations, and code refinements, as well as the implications of removing certain dependencies.

Discussion Character

  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant shares a code implementation of the Nelder-Mead algorithm, highlighting its removal of Numpy for simplicity and compactness.
  • Another participant questions the removal of Numpy, suggesting it typically enhances performance for large arrays.
  • A response clarifies that Numpy offers little benefit for this algorithm, since Nelder-Mead itself scales poorly to high-dimensional problems, and expresses a preference for list comprehensions for clarity.
  • Further refinements to the algorithm are discussed, including improvements in type safety, progress reporting, and overall understanding of the code.
  • Participants mention fixing issues in the "shrink/reduce" function and express that the minimal code serves as a learning tool for debugging and experimentation.
  • There is a focus on the algorithm's parameters and their roles, such as stuck thresholds and iteration limits, with suggestions for adjustments based on user needs.

Areas of Agreement / Disagreement

Participants express differing views on the necessity of Numpy, with some advocating for its removal while others highlight its potential benefits. The discussion remains unresolved regarding the optimal implementation choices and the effectiveness of the proposed code changes.

Contextual Notes

Some participants note that the algorithm may not perform well in high-dimensional spaces, which could limit its applicability in certain scenarios. Additionally, there are mentions of specific parameter values that may need adjustment based on the problem context.
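To make the geometry concrete before the full implementations below, here is a minimal self-contained sketch of the reflection step that drives the algorithm. The `reflect` helper and the 2D simplex are illustrative only, not part of the posted code:

```python
# Hypothetical helper illustrating one Nelder-Mead reflection step in 2D:
# the worst vertex is reflected through the centroid of the remaining vertices.
def reflect(centroid, worst, alpha=1.0):
    # xr = x0 + alpha * (x0 - x_worst), computed per coordinate
    return [c + alpha * (c - w) for c, w in zip(centroid, worst)]

simplex = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # vertices sorted best to worst
centroid = [sum(v[i] for v in simplex[:-1]) / 2.0 for i in range(2)]  # exclude worst
print(reflect(centroid, simplex[-1]))  # [1.0, -1.0]
```

The expansion, contraction, and shrink steps in the implementations below are variations on this same per-coordinate arithmetic with different coefficients.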

m4r35n357
Adapted from this code, which is an implementation of the algorithm described here.
Python:
from copy import copy
from sys import stderr, argv

def replace_worst(data, new):
    del data[-1]
    data.append(new)

def nelder_mead(f, x_0, x_δ, stuck_threshold=10e-12, stuck_break=10, max_iterations=0, α=1.0, γ=2.0, ρ=-0.5, σ=0.5):

    # initial simplex
    dim = len(x_0)
    assert dim == len(x_δ)
    dimensions = range(dim)
    prev_best = f(x_0)
    results = [[x_0, prev_best]]
    for i in dimensions:
        x = copy(x_0)
        x[i] += x_δ[i]
        score = f(x)
        results.append([x, score])
    iterations = no_improvement = nt = nr = ne = nc = ns = 0

    while True:
        nt += 1

        # 1. order
        results.sort(key=lambda z: z[1])
        best = results[0][1]

        # break after max_iter
        if max_iterations and iterations >= max_iterations:
            return results[0], nt - 1, nr, ne, nc, ns
        iterations += 1

        # break after no_improvement iterations with no improvement
        print('...best so far:', best, file=stderr)
        if best < prev_best - stuck_threshold:
            no_improvement = 0
            prev_best = best
        else:
            no_improvement += 1
        if no_improvement >= stuck_break:
            return results[0], nt - 1, nr, ne, nc, ns

        # 2. centroid
        x0 = [0.0] * dim
        for result in results[:-1]:
            for i, c in enumerate(result[0]):
                x0[i] += c / (len(results)-1)

        # 3. reflect
        xr = [x0[i] + α * (x0[i] - results[-1][0][i]) for i in dimensions]
        r_score = f(xr)
        if results[0][1] <= r_score < results[-2][1]:
            nr += 1
            replace_worst(results, [xr, r_score])
            continue

        # 4. expand
        if r_score < results[0][1]:
            xe = [x0[i] + γ * (x0[i] - results[-1][0][i]) for i in dimensions]
            e_score = f(xe)
            ne += 1
            replace_worst(results, [xe, e_score] if e_score < r_score else [xr, r_score])
            continue

        # 5. contract
        xc = [x0[i] + ρ * (x0[i] - results[-1][0][i]) for i in dimensions]
        c_score = f(xc)
        if c_score < results[-1][1]:
            nc += 1
            replace_worst(results, [xc, c_score])
            continue

        # 6. shrink
        x1 = results[0][0]
        reduced = []
        for result in results:
            xs = [x1[i] + σ * (result[0][i] - x1[i]) for i in dimensions]
            score = f(xs)
            ns += 1
            reduced.append([xs, score])
        results = reduced

if __name__ == "__main__":
    # Example: python NelderMead.py 3.0 3.0 0.1 0.1
    def himmelblau(z):
        x, y = z
        return (x ** 2 + y - 11) ** 2 + (x + y ** 2 - 7) ** 2

    print(nelder_mead(himmelblau, [float(argv[1]), float(argv[2])], [float(argv[3]), float(argv[4])]))
else:
    print(__name__ + " module loaded", file=stderr)
 
Why did you remove Numpy? Usually that can accelerate an algorithm, right?
 
For working with large arrays, yes, but not in this case (Nelder-Mead fails with many dimensions). Pretty much all my work is "embarrassingly unparallelizable" (or something like that). In my case it is a colossal dependency that I never use. In practice, the list comprehensions and sums keep it compact. BTW I've refactored it for more type safety, better progress and result reporting, and easier understanding; first the library:
Python:
from sys import stderr
from copy import copy
from collections import namedtuple

Point = namedtuple('PointType', ['x', 'f'])

def replace_worst(data, new):
    del data[-1]
    data.append(new)

def nelder_mead(f, x_0, x_δ, ε=10e-12, stuck_break=100, max_iterations=1000, α=1.0, γ=2.0, ρ=-0.5, σ=0.5):

    dim = len(x_0)
    assert dim == len(x_δ)
    dimensions = range(dim)
    prev_best = f(x_0)
    simplex = [Point(x=x_0, f=prev_best)]
    for i in dimensions:
        x = copy(x_0)
        x[i] += x_δ[i]
        simplex.append(Point(x=x, f=f(x)))
    iterations = stuck_counter = nt = nr = ne = nc = ns = 0
    latest = ""

    while True:
        nt += 1

        simplex.sort(key=lambda z: z.f)
        best = simplex[0]
        worst = simplex[-1]
        second_worst = simplex[-2]
        best_value = best.f

        data = '{} {} {}'.format(iterations, simplex, latest)
        if best_value < prev_best:
            stuck_counter = 0
            prev_best = best_value
        else:
            stuck_counter += 1
        if stuck_counter >= stuck_break:
            raise RuntimeError("STUCK for {} steps! ".format(stuck_counter) + data)
        if max_iterations and iterations >= max_iterations:
            raise RuntimeError("UNFINISHED! " + data)
        print(data, file=stderr)
        if sum((best.x[i] - worst.x[i])**2 for i in dimensions) < ε and (best.f - worst.f)**2 < ε:
            return best, nt - 1, nr, ne, nc, ns
        iterations += 1

        centroid = [0.0] * dim
        for result in simplex[:-1]:
            for i, c in enumerate(result.x):
                centroid[i] += c / (len(simplex)-1)

        xr = [centroid[i] + α * (centroid[i] - worst.x[i]) for i in dimensions]
        r_score = f(xr)
        if best.f <= r_score < second_worst.f:
            nr += 1
            replace_worst(simplex, Point(x=xr, f=r_score))
            latest = "reflection"
            continue

        if r_score < best.f:
            xe = [centroid[i] + γ * (centroid[i] - worst.x[i]) for i in dimensions]
            e_score = f(xe)
            ne += 1
            replace_worst(simplex, Point(x=xe, f=e_score) if e_score < r_score else Point(x=xr, f=r_score))
            latest = "expansion" + ("(e)" if e_score < r_score else "(r)")
            continue

        xc = [centroid[i] + ρ * (centroid[i] - worst.x[i]) for i in dimensions]
        c_score = f(xc)
        if c_score < worst.f:
            nc += 1
            replace_worst(simplex, Point(x=xc, f=c_score))
            latest = "contraction"
            continue

        reduced = []
        for vertex in simplex[1:]:
            xs = [best.x[i] + σ * (vertex.x[i] - best.x[i]) for i in dimensions]
            ns += 1
            reduced.append(Point(x=xs, f=f(xs)))
        simplex = reduced
        latest = "reduction"

print(__name__ + " module loaded", file=stderr)
and the test program:
Python:
from sys import argv
from NelderMead import nelder_mead

class Rosenbrock(object):

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def func(self, z):
        x, y = z
        return (self.a - x)**2 + self.b * (y - x**2)**2

def himmelblau(z):
    x, y = z
    return (x**2 + y - 11)**2 + (x + y**2 - 7)**2

if __name__ == "__main__":
    # Example: python Rosenbrock.py -3.0 -3.0 0.1 0.1
    init = [float(argv[1]), float(argv[2])]
    deltas = [float(argv[3]), float(argv[4])]
    print(nelder_mead(Rosenbrock(1.0, 100.0).func, init, deltas))
    print(nelder_mead(himmelblau, init, deltas))
 
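The "list comprehensions and sums" point made earlier can be illustrated in isolation: the centroid of every simplex vertex except the worst fits in a single comprehension with no array library. The `centroid` helper here is a standalone sketch, not a function from the posted code:

```python
# Sketch: average every simplex vertex except the worst (last) one, coordinate
# by coordinate, using only a comprehension and sum -- no numpy required.
def centroid(simplex):
    n = len(simplex) - 1  # number of contributing vertices
    return [sum(v[i] for v in simplex[:-1]) / n for i in range(len(simplex[0]))]

print(centroid([[0.0, 0.0], [2.0, 0.0], [9.0, 9.0]]))  # [1.0, 0.0]
```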
Found and fixed the "shrink/reduce" function, which was not being exercised until now. Also refactored and significantly shortened. I think the termination code makes more sense now too.

BTW this minimal code is just a bit of fun for those who want to kick the tyres, learn and debug etc. without messy dependencies.
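The termination logic mentioned above can be sketched in isolation: stop once the best vertex is within ε of the centroid in every coordinate and the function values across the simplex agree to within ε. The `converged` helper below is illustrative; the actual check is inlined in the loop of the code that follows:

```python
# Illustrative helper mirroring the inlined termination test: the simplex is
# "done" when its best vertex is within eps of the centroid in every
# coordinate AND the best and worst function values differ by less than eps.
def converged(simplex, centroid, eps):
    best_x, best_f = simplex[0]
    worst_f = simplex[-1][1]
    coords_close = max(abs(best_x[i] - centroid[i]) for i in range(len(best_x))) < eps
    return coords_close and abs(best_f - worst_f) < eps

tiny = [([1.0, 1.0], 0.5), ([1.0, 1.0], 0.5), ([1.0, 1.0], 0.5)]
print(converged(tiny, [1.0, 1.0], 1e-9))  # True
```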

Python:
from sys import stderr

def nelder_mead(f, x_0, x_δ, ε, stuck_limit=100, limit=1000, α=1.0, γ=2.0, ρ=-0.5, σ=0.5):
    n = len(x_0)
    assert n == len(x_δ)
    dim = range(n)
    best = f(x_0)
    s = [[x_0, best]]
    for i in dim:
        v = [x for x in x_0]
        v[i] += x_δ[i]
        s.append([v, f(v)])
    count = stuck_count = nr = ne = nc = ns = 0
    latest = ""

    while True:
        s.sort(key=lambda z: z[1])
        c = [0.0] * n
        for v in s[:-1]:
            for i, x in enumerate(v[0]):
                c[i] += x / n

        data = '{} {} {}'.format(count, s, latest)
        if s[0][1] < best:
            stuck_count = 0
            best = s[0][1]
        else:
            stuck_count += 1
        if stuck_count > stuck_limit:
            raise RuntimeError("STUCK for {} steps! ".format(stuck_count) + data)
        if count > limit:
            raise RuntimeError("ABANDONED after {} steps! ".format(count) + data)
        print(data, file=stderr)
        if max([abs(s[0][0][i] - c[i]) for i in dim]) < ε and abs(s[0][1] - s[-1][1]) < ε:
            return s[0], count, nr, ne, nc, ns
        count += 1

        xr = [c[i] + α * (c[i] - s[-1][0][i]) for i in dim]
        fr = f(xr)
        if s[0][1] <= fr < s[-2][1]:
            nr += 1
            del s[-1]
            s.append([xr, fr])
            latest = "reflection"
            continue

        if fr < s[0][1]:
            xe = [c[i] + γ * (c[i] - s[-1][0][i]) for i in dim]
            fe = f(xe)
            ne += 1
            del s[-1]
            s.append([xe, fe] if fe < fr else [xr, fr])
            latest = "expansion" + ("(e)" if fe < fr else "(r)")
            continue

        xc = [c[i] + ρ * (c[i] - s[-1][0][i]) for i in dim]
        fc = f(xc)
        if fc < s[-1][1]:
            nc += 1
            del s[-1]
            s.append([xc, fc])
            latest = "contraction"
            continue

        reduced = [s[0]]
        for v in s[1:]:
            xs = [s[0][0][i] + σ * (v[0][i] - s[0][0][i]) for i in dim]
            reduced.append([xs, f(xs)])
        ns += 1
        s = reduced
        latest = "reduction"

print(__name__ + " module loaded", file=stderr)
Here is the test code, now including a simple constrained optimization at the end:
Python:
from sys import argv, float_info
from NelderMead import nelder_mead

class Rosenbrock(object):

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def func(self, z):
        x, y = z
        return (self.a - x)**2 + self.b * (y - x**2)**2

def himmelblau(z):
    x, y = z
    return (x**2 + y - 11)**2 + (x + y**2 - 7)**2

def basic(z):
    x, y = z
    f = x**2 + y**2 - 4.0
    c = 0.0 if x + y >= 2.0 else float_info.max
    return f + c

if __name__ == "__main__":
    # Example: python Rosenbrock.py 3.0 3.0 0.1 0.1
    init = [float(argv[1]), float(argv[2])]
    deltas = [float(argv[3]), float(argv[4])]
    print(nelder_mead(Rosenbrock(1.0, 100.0).func, init, deltas, 1.0e-9))
    print(nelder_mead(himmelblau, init, deltas, 1.0e-9))
    print(nelder_mead(basic, init, deltas, 1.0e-9))
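The `basic` function above handles the constraint with a simple penalty: any point violating x + y ≥ 2 is assigned an enormous objective value, so reflected or expanded vertices that leave the feasible region are immediately rejected and the simplex retreats back inside. The same idea in isolation (the `penalized` name is illustrative):

```python
from sys import float_info

# The penalty idea from `basic` above: outside the feasible region the
# objective is inflated by a huge value, steering the simplex back inside.
def penalized(z):
    x, y = z
    f = x**2 + y**2 - 4.0                               # unconstrained objective
    penalty = 0.0 if x + y >= 2.0 else float_info.max   # constraint x + y >= 2
    return f + penalty

print(penalized([2.0, 2.0]))           # feasible point: objective only, 4.0
print(penalized([0.0, 0.0]) > 1e100)   # infeasible point: True
```

Because Nelder-Mead only compares function values and never differentiates, this crude step penalty works where gradient-based methods would fail.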