Scipy minimizer function and function executions

In summary, the conversation discusses the use of scipy's minimize function to minimize a given function, and the fact that the objective is evaluated many times during the descent towards the minimum. The responders suggest keeping extensive calculations out of the objective and instead using a separate function to populate a vector beforehand. They describe three ways to get the vector into the scope of the target function: a global variable, the args parameter, and a wrapper function, and provide code examples to illustrate these methods.
  • #1
member 428835
Hi PF

I'm trying to minimize a function func via scipy's minimize function, as shown below.

Python:
import numpy as np
import scipy.optimize as optimize
def func(x):
    y = x[0]**2 + (x[1]-5)**2
    print('hi')
    return y
bnds = [(1, None), (-0.5, 4)]
result = optimize.minimize(func, method='TNC', bounds=bnds, x0=2 * np.ones(2))
print(result.x)

The issue is that on every step of the descent toward the minimum the entire func is evaluated, as evidenced by 'hi' being printed several times. Is there a way to avoid this, so that y is evaluated and stored externally and func does not get executed on every step?

Thanks so much!
 
  • #2
No, the minimize function (and optimization in general) works by evaluating the function at multiple points. How else could it possibly work?
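To make this visible, here is a minimal sketch that counts evaluations of the function from post #1 (scipy also reports this count as result.nfev):
Python:
import numpy as np
from scipy.optimize import minimize

calls = 0

def func(x):
    global calls
    calls += 1
    return x[0] ** 2 + (x[1] - 5) ** 2

result = minimize(func, x0=2 * np.ones(2), method='TNC',
                  bounds=[(1, None), (-0.5, 4)])
print(result.x, 'after', calls, 'evaluations')  # typically dozens of calls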
 
  • #3
In this example it's obvious by inspection that the minimum is at (0,5), but that's not how numerical methods work; you need to evaluate the function (and possibly its derivative, or a numerical approximation of it which requires at least one further function evaluation) at multiple points in order to home in on the minimum.

I would also recommend not feeding any of these iterative methods a function which does anything other than calculate values from its arguments. If you need to do extensive calculations which don't depend on the argument you are optimizing with respect to then you should do those somewhere else before you start the iteration. Extensive calculations which do depend on the argument you are optimizing with respect to must, sadly, be done at each step of the iteration.
 
  • #4
pasmith said:
In this example it's obvious by inspection that the minimum is at (0,5), but that's not how numerical methods work; you need to evaluate the function (and possibly its derivative, or a numerical approximation of it which requires at least one further function evaluation) at multiple points in order to home in on the minimum.

I would also recommend not feeding any of these iterative methods a function which does anything other than calculate values from its arguments. If you need to do extensive calculations which don't depend on the argument you are optimizing with respect to then you should do those somewhere else before you start the iteration. Extensive calculations which do depend on the argument you are optimizing with respect to must, sadly, be done at each step of the iteration.
This is what I'm trying to do (calculate the output of the function and store it as a vector). You see, there are data sheets that the function has to use, but it only needs to look at them once. I'm worried it's looking them up multiple times. How would you write it to ensure it looks up the data only once?
 
  • #5
joshmccraney said:
How would you write it to ensure it looks up the data only once?
Have a separate function that populates the vector, and use optimization on the function that looks up a result from the vector.
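For example (a minimal sketch; populate_vector is a hypothetical stand-in for whatever reads your data sheets):
Python:
import numpy as np
from scipy.optimize import minimize

def populate_vector():
    # One-time work, e.g. reading the data sheets (hypothetical stand-in).
    return np.array([0.0, 5.0])

vec = populate_vector()  # executed exactly once, before the optimization

def func(x):
    # The objective only looks values up from vec on each call.
    return (x[0] - vec[0]) ** 2 + (x[1] - vec[1]) ** 2

result = minimize(func, x0=2 * np.ones(2), method='TNC',
                  bounds=[(1, None), (-0.5, 4)])
print(result.x)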
 
  • #6
anorlunda said:
use optimization on the function that looks up a result from the vector.
Note that there are at least 3 ways to get the vector in the scope of the target function:
  1. Use a global variable - ugly
  2. Pass the vector in a tuple as the args parameter to scipy.optimize.minimize; it will then be passed as a second parameter to every call of the objective function (see the sketch after this list). This also works with other scipy methods that take a function, such as scipy.integrate.solve_ivp.
  3. Return the objective function from a wrapper function that calculates the vector - this is perhaps the most Pythonic way and doesn't depend on support from the method you are passing the function to.
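A minimal sketch of option 2, reusing the vector idea from post #5:
Python:
import numpy as np
from scipy.optimize import minimize

def func(x, vec):
    # vec is supplied through args on every call; it is built only once.
    return (x[0] - vec[0]) ** 2 + (x[1] - vec[1]) ** 2

vec = np.array([0.0, 5.0])  # populated once, outside the objective

result = minimize(func, x0=2 * np.ones(2), args=(vec,),
                  method='TNC', bounds=[(1, None), (-0.5, 4)])
print(result.x)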
 
  • #7
pbuk said:
Return the objective function from a wrapper function that calculates the vector - this is perhaps the most Pythonic way and doesn't depend on support from the method you are passing the function to.
Can you elaborate, please? I'm interested.
 
  • #8
joshmccraney said:
Can you elaborate please? I'm intersted
Something like this
Python:
import numpy as np
from scipy.optimize import minimize
from random import randrange

def getFunc():
    # Set parameters for optimization function.
    a = randrange(5)
    b = randrange(5)
    print ('Parameters', a, b, '\n')

    # Create and return the function to optimize.
    def func(x):
        return x[0] ** 2 + (a * x[1] - b) ** 2
    return func

bounds = [(1, None), (-0.5, 4)]

func = getFunc()

result = minimize(func, method='TNC', bounds=bounds, x0=2 * np.ones(2))

print('Result\n', result)
 
  • #9
Or you could do
Python:
import numpy as np
from scipy.optimize import minimize

class FuncWrapper:
    def __init__(self):
        # Presumably you'd actually pass in a filename and load these
        self.a = 1
        self.b = 2
    def func(self,x):
        return x[0] ** 2 + (a * x[1] - b) ** 2

bounds = [(1, None), (-0.5, 4)]

funcWrapper = FuncWrapper()

result = minimize(funcWrapper.func, method='TNC', bounds=bounds, x0=2 * np.ones(2))

print('Result\n', result)
Also, FuncWrapper is my next band name.
 
  • #10
pbuk said:
Something like this
Python:
import numpy as np
from scipy.optimize import minimize
from random import randrange

def getFunc():
    # Set parameters for optimization function.
    a = randrange(5)
    b = randrange(5)
    print ('Parameters', a, b, '\n')

    # Create and return the function to optimize.
    def func(x):
        return x[0] ** 2 + (a * x[1] - b) ** 2
    return func

bounds = [(1, None), (-0.5, 4)]

func = getFunc()

result = minimize(func, method='TNC', bounds=bounds, x0=2 * np.ones(2))

print('Result\n', result)
This is great! Can you explain the logic a little? I think I'm missing something. It seems that if I minimize getFunc() instead of func, the parameters are printed twice. I'm wondering why parameters a and b are not being printed multiple times as you've written it. Where's the magic happening?
 
  • #11
Ibix said:
Or you could do
Python:
class FuncWrapper:
    def __init__(self):
        # Presumably you'd actually pass in a filename and load these
        self.a = 1
        self.b = 2
    def func(self,x):
        return x[0] ** 2 + (a * x[1] - b) ** 2

bounds = [(1, None), (-0.5, 4)]

funcWrapper = FuncWrapper()

result = minimize(funcWrapper.func, method='TNC', bounds=bounds, x0=2 * np.ones(2))

print('Result\n', result)
Also, FuncWrapper is my next band name.
This actually doesn't run as is. I'm getting an error saying name 'a' is not defined. But I can't figure out why.
 
  • #12
Remember that in Python, instance attributes must be referred to as self.whatever inside method definitions. Then note that I forgot to make a and b self.a and self.b in the func method. Been programming too much Java recently...
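With that fix applied, the method reads:
Python:
    def func(self, x):
        return x[0] ** 2 + (self.a * x[1] - self.b) ** 2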
 
  • #13
Ibix said:
Or you could do
Python:
class FuncWrapper:
    def __init__(self):
        # Presumably you'd actually pass in a filename and load these
        self.a = 1
        self.b = 2
    def func(self,x):
        return x[0] ** 2 + (a * x[1] - b) ** 2

bounds = [(1, None), (-0.5, 4)]

funcWrapper = FuncWrapper()

result = minimize(funcWrapper.func, method='TNC', bounds=bounds, x0=2 * np.ones(2))

print('Result\n', result)
Also, FuncWrapper is my next band name.
So I have another question for you. Let's say there's a summation that needs to be performed with a, b, and x, something like ##c = \sum_{i=0}^{100000} a*x_0##. Is there a way to define that calculation before the function func, so that I don't have to recompute the sum on every call but can instead have it stored as a precomputed value?

For reference, the optimization I'm looking at is here

Python:
import numpy as np
from scipy import optimize
from scipy.stats import t  # Student's t distribution, used for t.pdf below

def test_optimize_arry(data_num, M = 250, lmda = 0.97):
    # for data_num in range(0, 4):
    def L(F):
        # extension = 'csv'
        # path = '/Users/joshmccraney/Desktop/ewma/'
        # rsk_fact_ID = "DCN"

        # 1) DEFINE sigma_0
        sig_0 = F[0]
        for i in range(1, 12):
            sig_0 += F[i]

        # 2) DEFINE sigma^2
        data, time = Import_data(path, extension, rsk_fact_ID)

        # DATA METRICS
        number_of_data = data.shape[1]
        number_of_days = time.shape[0]

        # CHANGE np TIME int ARRAY TO LIST OF STRINGS
        time = time.astype(str).tolist()

        # CALC INITIAL sigma^2, EQ 4 FROM IRM 2.0
        risk_factor_squared = np.square(data[:][0:M])
        initial_sigma_squared = risk_factor_squared.sum(axis=0)/M  # EQ 4

        # CALC sigma^2
        sig = np.vstack( [np.zeros([M-1, number_of_data]), initial_sigma_squared, np.zeros([number_of_days - M, number_of_data])] )  # PREALLOCATE
        for day in range(M, number_of_days):
            # THE CORRECT F TO RETRIEVE (time format: YYYYMMDD)
            i = int(time[day][4:6]) - 1
            # Seasonal sigma
            sig[day,:] = lmda * sig[day - 1,:] + (1 - lmda) * np.square(data[day,:])/F[i]*sig_0

        # 3) DEFINE v
        # CALCULATE v
        v = np.vstack( [np.zeros([M-1, number_of_data]), initial_sigma_squared, np.zeros([number_of_days - M, number_of_data])] ) # PREALLOCATE
        for day in range(M, number_of_days):
            # THE CORRECT F TO RETRIEVE (time format: YYYYMMDD)
            i = int(time[day][4:6]) - 1

            # Seasonal variance
            v[day, :] = sig[day, :]*F[i]/sig_0

        # CLEAR THE FIRST M DAYS SINCE NO VAR AVAILABLE
        data_days = np.delete(data, range(M - 1), 0)
        v_days = np.delete(v, range(M - 1), 0)

        # 4) define L
        L_mat = np.log( t.pdf(data_days/v_days, df = 3) )
        L_sum = L_mat.sum(axis=0)

        # 5) calculate Reg
        # Penalty
        mu = F[12]
        Reg = (F[11] - 2*F[0] + F[1])**2 + (F[10] - 2*F[11] + F[0])**2
        for i in range(1, 11):
            Reg += (F[i-1] - 2*F[i] + F[i+1])**2
        Reg *= -mu/sig_0**2
        Reg += 12/2*np.log(mu)

        print(F)

        # 6) sum the results (see return statement pg 2)
        final_L = L_sum + Reg
        return final_L[data_num]

    # minimizer = ScipyMinimizer(method="BFGS")
    # result = minimizer.minimize(residuals_func=L, bounds=None, x0=2*np.ones(13))
    bnds = [(None, None), (None, None), (None, None), (None, None), (None, None), (None, None), (None, None),
            (None, None), (None, None), (None, None), (None, None), (None, None), (0.5, None)]
    result = optimize.minimize(L, method='TNC', bounds=bnds, x0=2 * np.ones(13))
    print(result.x)

It would be great if I could store the result of line 67 somewhere so that the function can use it immediately rather than performing the data retrieval and summations every time. Does that make sense?
 
  • #14
It is not really optimization, but there is a fantastic Python package that can automatically cache past results of any of your functions, with no modification to your source code other than adding a decorator.

https://towardsdatascience.com/10-fabulous-python-decorators-ab674a732871

I saw that package used in a demo program to calculate all factorials up to 1000. The speed-up was amazing.
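The linked article covers several such decorators; the standard library's functools.lru_cache does the same memoization and illustrates the idea (note it only works for hashable arguments, so not directly for numpy arrays):
Python:
from functools import lru_cache

@lru_cache(maxsize=None)
def factorial(n):
    # Each distinct n is computed once; repeat calls are cache hits.
    return 1 if n < 2 else n * factorial(n - 1)

for i in range(1001):
    factorial(i)  # builds the cache bottom-up, avoiding deep recursion

print(factorial(1000) == 1000 * factorial(999))  # True, both from cache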
 
  • #15
joshmccraney said:
It seems that if I minimize getFunc() instead of func, the parameters are printed twice.
The result of getFunc() is not something that is meaningful to minimize.

joshmccraney said:
I'm wondering why parameters a and b are not being printed multiple times as you've written it. Where's the magic happening?
There is no magic: the code containing the print call is executed only once, as is the code that defines the function you want to minimize. The function you have defined is then called by scipy.optimize.minimize as many times as it needs to.
 
  • #16
joshmccraney said:
##c = \sum_{i=0}^{100000} a*x_0##
For this summation to make sense ## a*x_0 ## would have to depend on ## i ##, which it clearly does not.

joshmccraney said:
For reference, the optimization I'm looking at is here
The function you are trying to optimize is L(F). You can move outside that function anything that does not depend on F, such as
Python:
        data, time = Import_data(path, extension, rsk_fact_ID)
and this code will only be executed once. However, anything that depends on F needs to stay inside the function; for instance, your line 67
Python:
        final_L = L_sum + Reg
depends on F because of the line
Python:
        Reg = (F[11] - 2*F[0] + F[1])**2 + (F[10] - 2*F[11] + F[0])**2
(it also depends on F because L_sum depends on F).
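Putting that together with the wrapper-function pattern from post #8, your code might be restructured like this (a sketch only: Import_data is stubbed out here, and the objective body is a simplified stand-in for your actual calculation):
Python:
import numpy as np
from scipy.optimize import minimize

def Import_data():
    # Hypothetical stub for your real data-sheet loader.
    rng = np.random.default_rng(0)
    return rng.standard_normal((300, 4))

def make_L(data_num, M=250):
    # F-independent work runs exactly once, when make_L is called.
    data = Import_data()
    initial_sigma_squared = np.square(data[0:M]).sum(axis=0) / M

    def L(F):
        # Only F-dependent work remains inside the objective;
        # initial_sigma_squared is captured from the enclosing scope.
        return (F[0] - initial_sigma_squared[data_num]) ** 2 + (F[12] - 1.0) ** 2

    return L

L = make_L(data_num=0)
bnds = [(None, None)] * 12 + [(0.5, None)]
result = minimize(L, x0=2 * np.ones(13), method='TNC', bounds=bnds)
print(result.x)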
 
  • #17
pbuk said:
For this summation to make sense ## a*x_0 ## would have to depend on ## i ##, which it clearly does not.

The function you are trying to optimize is L(F). You can move outside that function anything that does not depend on F, such as
Python:
        data, time = Import_data(path, extension, rsk_fact_ID)
and this code will only be executed once. However, anything that depends on F needs to stay inside the function; for instance, your line 67
Python:
        final_L = L_sum + Reg
depends on F because of the line
Python:
        Reg = (F[11] - 2*F[0] + F[1])**2 + (F[10] - 2*F[11] + F[0])**2
(it also depends on F because L_sum depends on F).
Okay, I thought so. Thanks for this; it's helpful!
 

What is the Scipy minimizer function?

The Scipy minimizer function, scipy.optimize.minimize, finds the minimum value of a given function. It uses various optimization algorithms, taking into account any constraints or bounds specified by the user.

How do I use the Scipy minimizer function?

To use the Scipy minimizer function, first import the minimize function from scipy.optimize. Then define the objective function you want to minimize, along with any constraints or bounds. Finally, call minimize with your parameters; it returns the minimum value found and the corresponding input variables.
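A minimal end-to-end example:
Python:
import numpy as np
from scipy.optimize import minimize

def objective(x):
    # A simple bowl with its minimum at (1, -2).
    return (x[0] - 1) ** 2 + (x[1] + 2) ** 2

result = minimize(objective, x0=np.zeros(2), method='BFGS')
print(result.x)    # input at the minimum, approximately [1, -2]
print(result.fun)  # minimum value found, approximately 0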

What are the available optimization algorithms in the Scipy minimizer function?

Some of the available optimization algorithms in the Scipy minimizer function include Nelder-Mead, Powell, conjugate gradient, BFGS, and Newton-CG. Each algorithm has its own advantages and may work better for certain types of functions or constraints.

Can I use the Scipy minimizer function for non-linear optimization?

Yes, the Scipy minimizer function is suitable for both linear and non-linear optimization problems. It can handle a wide range of functions and constraints, making it a versatile tool for scientific computing and data analysis.

How can I improve the performance of the Scipy minimizer function?

To improve the performance of the Scipy minimizer function, you can try different optimization algorithms or adjust the input parameters, such as the initial guess or convergence tolerance. It may also help to simplify the objective function or reduce the number of constraints if possible.
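For example, a better starting point and a looser tolerance both reduce the work done (a minimal sketch):
Python:
import numpy as np
from scipy.optimize import minimize

def objective(x):
    return (x[0] - 1) ** 2 + (x[1] + 2) ** 2

# A good initial guess, a looser tolerance, and an iteration cap
# can all reduce the number of function evaluations.
result = minimize(objective, x0=np.array([0.9, -1.9]),
                  method='Nelder-Mead', tol=1e-4,
                  options={'maxiter': 200})
print(result.nfev, result.x)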
