Python Scipy minimizer function and function executions

AI Thread Summary
The discussion revolves around using SciPy's minimize function for optimizing a function while minimizing unnecessary evaluations. The original poster seeks a way to avoid repeated calculations in their function, which is evaluated multiple times during the optimization process. It is clarified that numerical optimization inherently requires multiple evaluations to find the minimum, and extensive calculations that do not depend on the optimization variable should be performed outside the optimization function. Suggestions include using a wrapper function or class to store pre-calculated values, ensuring that only necessary computations are executed during optimization. The conversation emphasizes the importance of structuring the optimization function to minimize redundant calculations effectively.
member 428835
Hi PF

I'm trying to minimize a function func via SciPy's minimize function, as shown below.

Python:
import numpy as np
import scipy.optimize as optimize
def func(x):
    y = x[0]**2 + (x[1]-5)**2
    print('hi')
    return y
bnds = [(1, None), (-0.5, 4)]
result = optimize.minimize(func, method='TNC', bounds=bnds, x0=2 * np.ones(2))
print(result.x)

The issue is that the entire func is evaluated at every descent step toward the minimum, as evidenced by 'hi' being printed several times. Is there a way to avoid this, so that y is evaluated and stored externally and is not recomputed at every step?

Thanks so much!
 
No, the minimize function (and optimization in general) works by evaluating the function at multiple points. How else could it possibly work?
 
In this example it's obvious by inspection that the minimum is at (0,5), but that's not how numerical methods work; you need to evaluate the function (and possibly its derivative, or a numerical approximation of it which requires at least one further function evaluation) at multiple points in order to home in on the minimum.

I would also recommend not feeding any of these iterative methods a function which does anything other than calculate values from its arguments. If you need to do extensive calculations which don't depend on the argument you are optimizing with respect to then you should do those somewhere else before you start the iteration. Extensive calculations which do depend on the argument you are optimizing with respect to must, sadly, be done at each step of the iteration.
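In terms of the toy function from the first post, that advice might look like the sketch below, where expensive_setup is a hypothetical stand-in for whatever costly work does not depend on the optimization variable:

Python:
import numpy as np
import scipy.optimize as optimize

def expensive_setup():
    # Hypothetical stand-in for costly work that does not depend on x,
    # e.g. reading data sheets; it runs once, before the iteration starts.
    return 5.0

target = expensive_setup()  # executed exactly once

def func(x):
    # Only cheap, x-dependent work remains inside the objective.
    return x[0]**2 + (x[1] - target)**2

bnds = [(1, None), (-0.5, 4)]
result = optimize.minimize(func, method='TNC', bounds=bnds, x0=2 * np.ones(2))
print(result.x)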
 
pasmith said:
In this example it's obvious by inspection that the minimum is at (0,5), but that's not how numerical methods work; you need to evaluate the function (and possibly its derivative, or a numerical approximation of it which requires at least one further function evaluation) at multiple points in order to home in on the minimum.

I would also recommend not feeding any of these iterative methods a function which does anything other than calculate values from its arguments. If you need to do extensive calculations which don't depend on the argument you are optimizing with respect to then you should do those somewhere else before you start the iteration. Extensive calculations which do depend on the argument you are optimizing with respect to must, sadly, be done at each step of the iteration.
This is what I'm trying to do (calculate the output of the function and store it as a vector). See, there are data sheets that the function has to use, but it only needs to look at those data sheets once. I'm worried it's looking them up multiple times. How would you write it to ensure it's only looking up the data once?
 
joshmccraney said:
How would you write it to ensure it's only looking up the data once?
Have a separate function that populates the vector, and use optimization on the function that looks up a result from the vector.
 
anorlunda said:
use optimization on the function that looks up a result from the vector.
Note that there are at least 3 ways to get the vector into the scope of the target function:
  1. Use a global variable - ugly.
  2. Pass the vector in a tuple as the args parameter of scipy.optimize.minimize; it will then be passed as a second argument to every call of the objective function (see the sketch below). This also works with other SciPy methods that take a function, such as scipy.integrate.solve_ivp.
  3. Return the objective function from a wrapper function that calculates the vector - this is perhaps the most Pythonesque way and doesn't depend on support from the method you are passing the function to.
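For illustration, option 2 might look like this with the toy function from the first post (the vector v here is a hypothetical pre-calculated result):

Python:
import numpy as np
from scipy.optimize import minimize

def func(x, v):
    # v is the pre-calculated vector; minimize passes it through unchanged.
    return x[0]**2 + (v[0] * x[1] - v[1])**2

v = np.array([1.0, 5.0])  # computed once, before the optimization starts

result = minimize(func, x0=2 * np.ones(2), args=(v,), method='TNC',
                  bounds=[(1, None), (-0.5, 4)])
print(result.x)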
 
pbuk said:
Return the objective function from a wrapper function that calculates the vector - this is perhaps the most Pythonesque way and doesn't depend on support from the method you are passing the function to.
Can you elaborate please? I'm interested.
 
joshmccraney said:
Can you elaborate please? I'm interested.
Something like this
Python:
import numpy as np
from scipy.optimize import minimize
from random import randrange

def getFunc():
    # Set parameters for optimization function.
    a = randrange(5)
    b = randrange(5)
    print('Parameters', a, b, '\n')

    # Create and return the function to optimize.
    def func(x):
        return x[0] ** 2 + (a * x[1] - b) ** 2
    return func

bounds = [(1, None), (-0.5, 4)]

func = getFunc()

result = minimize(func, method='TNC', bounds=bounds, x0=2 * np.ones(2))

print('Result\n', result)
 
Or you could do
Python:
class FuncWrapper:
    def __init__(self):
        # Presumably you'd actually pass in a filename and load these
        self.a = 1
        self.b = 2
    def func(self,x):
        return x[0] ** 2 + (a * x[1] - b) ** 2

bounds = [(1, None), (-0.5, 4)]

funcWrapper = FuncWrapper()

result = minimize(funcWrapper.func, method='TNC', bounds=bounds, x0=2 * np.ones(2))

print('Result\n', result)
Also, FuncWrapper is my next band name.
 
  • #10
pbuk said:
Something like this
Python:
import numpy as np
from scipy.optimize import minimize
from random import randrange

def getFunc():
    # Set parameters for optimization function.
    a = randrange(5)
    b = randrange(5)
    print('Parameters', a, b, '\n')

    # Create and return the function to optimize.
    def func(x):
        return x[0] ** 2 + (a * x[1] - b) ** 2
    return func

bounds = [(1, None), (-0.5, 4)]

func = getFunc()

result = minimize(func, method='TNC', bounds=bounds, x0=2 * np.ones(2))

print('Result\n', result)
This is great! Can you explain the logic a little? I think I'm missing something. It seems that if I minimize getFunc() instead of func, the parameters are printed twice. I'm wondering why parameters a and b are not printed multiple times as you've written it. Where's the magic happening?
 
  • #11
Ibix said:
Or you could do
Python:
class FuncWrapper:
    def __init__(self):
        # Presumably you'd actually pass in a filename and load these
        self.a = 1
        self.b = 2
    def func(self,x):
        return x[0] ** 2 + (a * x[1] - b) ** 2

bounds = [(1, None), (-0.5, 4)]

funcWrapper = FuncWrapper()

result = minimize(funcWrapper.func, method='TNC', bounds=bounds, x0=2 * np.ones(2))

print('Result\n', result)
Also, FuncWrapper is my next band name.
This actually doesn't run as is. I'm getting an error saying name 'a' is not defined. But I can't figure out why.
 
  • #12
Remember that in Python, class member variables must be referred to as self.whatever in method definitions; note that I forgot to make a and b into self.a and self.b in the func method. Been programming too much Java recently...
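For completeness, the corrected class (with the missing imports added) is:

Python:
import numpy as np
from scipy.optimize import minimize

class FuncWrapper:
    def __init__(self):
        # Presumably you'd actually pass in a filename and load these
        self.a = 1
        self.b = 2

    def func(self, x):
        # Instance variables must be accessed through self.
        return x[0] ** 2 + (self.a * x[1] - self.b) ** 2

bounds = [(1, None), (-0.5, 4)]
funcWrapper = FuncWrapper()
result = minimize(funcWrapper.func, method='TNC', bounds=bounds, x0=2 * np.ones(2))
print('Result\n', result)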
 
  • #13
Ibix said:
Or you could do
Python:
class FuncWrapper:
    def __init__(self):
        # Presumably you'd actually pass in a filename and load these
        self.a = 1
        self.b = 2
    def func(self,x):
        return x[0] ** 2 + (a * x[1] - b) ** 2

bounds = [(1, None), (-0.5, 4)]

funcWrapper = FuncWrapper()

result = minimize(funcWrapper.func, method='TNC', bounds=bounds, x0=2 * np.ones(2))

print('Result\n', result)
Also, FuncWrapper is my next band name.
So I have another question for you. Let's say there's a summation that needs to be performed with a, b, and x. Something like ##c = \sum_{i=0}^{100000} a*x_0##. Is there a way to define that calculation before the function func, so that I don't have to compute the sum but can instead have this stored as an element?

For reference, the optimization I'm looking at is here

Python:
import numpy as np
import scipy.optimize as optimize
from scipy.stats import t  # provides t.pdf used in the likelihood below

def test_optimize_arry(data_num, M = 250, lmda = 0.97):
    # for data_num in range(0, 4):
    def L(F):
        # extension = 'csv'
        # path = '/Users/joshmccraney/Desktop/ewma/'
        # rsk_fact_ID = "DCN"

        # 1) DEFINE sigma_0
        sig_0 = F[0]
        for i in range(1, 12):
            sig_0 += F[i]

        # 2) DEFINE sigma^2
        data, time = Import_data(path, extension, rsk_fact_ID)

        # DATA METRICS
        number_of_data = data.shape[1]
        number_of_days = time.shape[0]

        # CHANGE np TIME int ARRAY TO LIST OF STRINGS
        time = time.astype(str).tolist()

        # CALC INITIAL sigma^2, EQ 4 FROM IRM 2.0
        risk_factor_squared = np.square(data[:][0:M])
        initial_sigma_squared = risk_factor_squared.sum(axis=0)/M  # EQ 4

        # CALC sigma^2
        sig = np.vstack( [np.zeros([M-1, number_of_data]), initial_sigma_squared, np.zeros([number_of_days - M, number_of_data])] ) # PREALLOCATE
        for day in range(M, number_of_days):
            # THE CORRECT F TO RETRIEVE (time format: YYYYMMDD)
            i = int(time[day][4:6]) - 1
            # Seasonal sigma
            sig[day,:] = lmda * sig[day - 1,:] + (1 - lmda) * np.square(data[day,:])/F[i]*sig_0

        # 3) DEFINE v
        # CALCULATE v
        v = np.vstack( [np.zeros([M-1, number_of_data]), initial_sigma_squared, np.zeros([number_of_days - M, number_of_data])] ) # PREALLOCATE
        for day in range(M, number_of_days):
            # THE CORRECT F TO RETRIEVE (time format: YYYYMMDD)
            i = int(time[day][4:6]) - 1

            # Seasonal variance
            v[day, :] = sig[day, :]*F[i]/sig_0

        # CLEAR THE FIRST M DAYS SINCE NO VAR AVAILABLE
        data_days = np.delete(data, range(M - 1), 0)
        v_days = np.delete(v, range(M - 1), 0)

        # 4) define L
        L_mat = np.log( t.pdf(data_days/v_days, df = 3) )
        L_sum = L_mat.sum(axis=0)

        # 5) calculate Reg
        # Penalty
        mu = F[12]
        Reg = (F[11] - 2*F[0] + F[1])**2 + (F[10] - 2*F[11] + F[0])**2
        for i in range(1, 11):
            Reg += (F[i-1] - 2*F[i] + F[i+1])**2
        Reg *= -mu/sig_0**2
        Reg += 12/2*np.log(mu)

        print(F)

        # 6) sum the results (see return statement pg 2)
        final_L = L_sum + Reg
        return final_L[data_num]

    # minimizer = ScipyMinimizer(method="BFGS")
    # result = minimizer.minimize(residuals_func=L, bounds=None, x0=2*np.ones(13))
    bnds = [(None, None), (None, None), (None, None), (None, None), (None, None), (None, None), (None, None),
            (None, None), (None, None), (None, None), (None, None), (None, None), (0.5, None)]
    result = optimize.minimize(L, method='TNC', bounds=bnds, x0=2 * np.ones(13))
    print(result.x)  # the optimizer stores the solution in result.x

It would be great if I could store the quantity computed on line 67 somewhere, so that the function can immediately retrieve it rather than performing the data retrieval and summations every time. Does that make sense?
 
  • #14
It is not really optimization, but there is a fantastic Python package that can automatically cache the past results of any of your functions without modifying your source code, other than adding a decorator.

https://towardsdatascience.com/10-fabulous-python-decorators-ab674a732871

I saw that package used in a demo program to calculate all factorials up to 1000. The speed-up was amazing.
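(For what it's worth, the standard library can do this too: functools.lru_cache memoizes a function's results. Note that it only helps when the function is called again with exactly the same, hashable arguments, which is not usually the case for an objective function during optimization.)

Python:
from functools import lru_cache

@lru_cache(maxsize=None)  # remember every result already computed
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(100))  # fast: each fib(k) is computed only once, then cached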
 
  • #15
joshmccraney said:
It seems that if I minimize getFunc() instead of func, the parameters are printed twice.
Calling getFunc() a second time runs the set-up code a second time, so minimize(getFunc(), ...) creates (and prints the parameters for) a second function. Call getFunc() once, store the returned function, and pass that to minimize.

joshmccraney said:
I'm wondering why parameters a and b are not printed multiple times as you've written it. Where's the magic happening?
There is no magic: the code containing the print instruction is executed only once, as is the code that defines the function you want to minimize. The function you have defined is then called by scipy.optimize.minimize as many times as it needs.
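A stripped-down sketch of the same pattern may make this clearer:

Python:
def make_f():
    print('setting up')  # runs once, when make_f() is called
    a = 3                # captured by the closure below

    def f(x):
        return a * x     # uses the captured value; no set-up is repeated

    return f

f = make_f()       # prints 'setting up' once
print(f(2), f(5))  # 6 15 -- no further set-up output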
 
  • #16
joshmccraney said:
##c = \sum_{i=0}^{100000} a*x_0##
For this summation to make sense ## a*x_0 ## would have to depend on ## i ##, which it clearly does not.

joshmccraney said:
For reference, the optimization I'm looking at is here
The function you are trying to optimize is L(F). You can move outside that function anything that does not depend on F, such as
Python:
        data, time = Import_data(path, extension, rsk_fact_ID)
and this code will then be executed only once. However, anything that depends on F needs to stay inside the function; for instance, your line 67
Python:
        final_L = L_sum + Reg
depends on F because of the line
Python:
        Reg = (F[11] - 2*F[0] + F[1])**2 + (F[10] - 2*F[11] + F[0])**2
(it also depends on F because L_sum depends on F).
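Applied to your snippet, that restructuring might look like the following sketch, which uses option 3 from earlier in the thread; the names are taken from your code, and the F-dependent body (your steps 1-6) is elided:

Python:
def get_L(path, extension, rsk_fact_ID, data_num, M=250, lmda=0.97):
    # Executed once: load and pre-process everything that does not depend on F.
    data, time = Import_data(path, extension, rsk_fact_ID)
    number_of_data = data.shape[1]
    number_of_days = time.shape[0]
    time = time.astype(str).tolist()
    initial_sigma_squared = np.square(data[:][0:M]).sum(axis=0) / M  # EQ 4

    def L(F):
        # Executed at every iteration: only the F-dependent work
        # (your steps 1-6, unchanged) remains here.
        ...
        return final_L[data_num]

    return L

L = get_L(path, extension, rsk_fact_ID, data_num)
result = optimize.minimize(L, method='TNC', bounds=bnds, x0=2 * np.ones(13))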
 
  • #17
pbuk said:
For this summation to make sense ## a*x_0 ## would have to depend on ## i ##, which it clearly does not.

The function you are trying to optimize is L(F). You can move outside that function anything that does not depend on F, such as
Python:
        data, time = Import_data(path, extension, rsk_fact_ID)
and this code will then be executed only once. However, anything that depends on F needs to stay inside the function; for instance, your line 67
Python:
        final_L = L_sum + Reg
depends on F because of the line
Python:
        Reg = (F[11] - 2*F[0] + F[1])**2 + (F[10] - 2*F[11] + F[0])**2
(it also depends on F because L_sum depends on F).
Okay, I thought so. Thanks for this; it's helpful!
 
