Optimization with Newton's method

AI Thread Summary
The discussion focuses on optimizing a system of equations solved using Newton's method, specifically to minimize the sum of square residuals (SSR) with respect to parameters K1 and K2. The current approach uses a time-consuming binary search or brute-force sweep over K1 and K2 values, taking about 40 minutes per experiment. Participants suggest using gradient descent, which requires the derivatives of the SSR function with respect to K1 and K2, and debate whether these partial derivatives can be obtained from the output of Newton's method. Customizing Newton's method or exploring alternative optimization techniques, such as quasi-random search methods, is also recommended to improve efficiency.
fahraynk
I have a system of equations which I solved with Newton's method.
Call Newton's method a function ##NM = f(K_1, K_2)##. ##K_1## and ##K_2## are inputs and a vector ##x = (x_1, x_2, x_3, x_4)## is the output.

I have another function, SSR, the sum of square residuals. It looks like this:
$$\sum_i \left(\frac{v_{0_i}-v_1X_1-v_3X_3-2v_3X_4}{H_{t_i}}\right)^2$$
##v_1## and ##v_3## are constants; ##v_0## and ##H_t## are experimental values which are known for each of several experiments. I sum over all the experiments.
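
In code form it's just this (a minimal sketch; the array and argument names are stand-ins for my actual data):

```python
import numpy as np

def ssr(X, v0, Ht, v1, v3):
    """Sum of squared residuals over all experiments.

    X      : (X1, X2, X3, X4) from Newton's method (X2 does not
             appear in the residual)
    v0, Ht : arrays of experimental values, one entry per experiment
    v1, v3 : known constants
    """
    X1, X2, X3, X4 = X
    r = (v0 - v1 * X1 - v3 * X3 - 2 * v3 * X4) / Ht
    return np.sum(r ** 2)
```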

Right now, I solve the system by Newton's method for given values of ##K_1## and ##K_2##. I then use a binary search or brute force to check ##K_1## and ##K_2## over a range to find the minimum SSR.

The problem is that it takes a long time, about 40 minutes per experiment, and I have maybe 100 experiments to calculate. I would rather use something like gradient descent.

To do gradient descent I would need the derivative of the SSR function with respect to ##K_1## and ##K_2##.

$$\frac{dSSR}{dK_1}=\frac{dSSR}{dX_1}\frac{dX_1}{dK_1}+ \dots + \frac{dSSR}{dX_4}\frac{dX_4}{dK_1} \\ \frac{dX_n}{dK_1}=\frac{dX_n}{dNM}\frac{dNM}{dK_1}$$

I can approximate the derivative of ##NM## with respect to ##K_1## by ##\frac{NM(K_1+dK_1,\,K_2)-NM(K_1,\,K_2)}{dK_1}##, but I have no idea if it is even possible to take the derivative of ##X_n## with respect to Newton's method!
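
Concretely, the approximation I have in mind looks like this (a rough sketch; newton_solve is a stand-in name for my Newton's-method routine):

```python
import numpy as np

def dNM_dK1(K1, K2, dK=1e-6):
    """Forward-difference approximation of dNM/dK1 (a 4-vector).

    newton_solve(K1, K2) -> (X1, X2, X3, X4) is a placeholder for
    the routine that solves the system by Newton's method.
    """
    X_base = np.asarray(newton_solve(K1, K2))
    X_step = np.asarray(newton_solve(K1 + dK, K2))
    return (X_step - X_base) / dK
```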

Is it possible? Or does anyone know a different way I can optimize this other than brute force and binary search?
 
fahraynk said:
I have a system of equations which I solved with Newton's method.
Call Newton's method a function ##NM = f(K_1, K_2)##. ##K_1## and ##K_2## are inputs and a vector ##x = (x_1, x_2, x_3, x_4)## is the output.

I have another function, SSR, the sum of square residuals. It looks like this:
$$\sum_i \left(\frac{v_{0_i}-v_1X_1-v_3X_3-2v_3X_4}{H_{t_i}}\right)^2$$
##v_1## and ##v_3## are constants; ##v_0## and ##H_t## are experimental values which are known for each of several experiments. I sum over all the experiments.

Let's name the function and state its arguments. Do you mean ##S(X_1,X_3,X_4) = \sum_i \left(\frac{v_{0_i}-v_1X_1-v_3X_3-2v_3X_4}{H_{t_i}}\right)^2## ?

Right now, I solve the system by Newton's method for given values of ##K_1## and ##K_2##.
It would be better to say that you solve the system "using" a given ##K_1## and ##K_2## for the values ##x_1,x_2,x_3,x_4##, if that's what you mean.

I then use a binary search or brute force to check ##K_1## and ##K_2## over a range to find the minimum SSR.

It's best to state optimization problems in the standard form: "Minimize ... with respect to choices of ... subject to the constraints ...".

In your case, the problem appears to be:

Minimize ##S(X_1,X_3,X_4) = \sum_i \left(\frac{v_{0_i}-v_1X_1-v_3X_3-2v_3X_4}{H_{t_i}}\right)^2##
with respect to choices of ##K_1,K_2##
subject to the constraints:
##(X_1,X_2,X_3,X_4) = NM(K_1,K_2)##
where ##NM(K_1,K_2)## is the result of solving some system of equations by Newton's method for the variables ##(X_1,X_2,X_3,X_4)##.
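
Stated in that form, the constraint can be folded into the objective and a general-purpose minimizer applied to a function of ##K_1,K_2## alone. A minimal sketch using scipy's derivative-free Nelder-Mead method (newton_solve and ssr stand in for your actual routines):

```python
from scipy.optimize import minimize

def objective(K):
    X = newton_solve(K[0], K[1])  # (X1, X2, X3, X4) for this (K1, K2)
    return ssr(X)                 # sum of squared residuals

# x0 is an arbitrary starting guess for (K1, K2)
result = minimize(objective, x0=[1.0, 1.0], method="Nelder-Mead")
print(result.x, result.fun)       # minimizing (K1, K2) and its SSR
```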

To do gradient descent I would need the derivative of the SSR function with respect to ##K_1## and ##K_2##.

$$\frac{dSSR}{dK_1}=\frac{dSSR}{dX_1}\frac{dX_1}{dK_1}+ \dots + \frac{dSSR}{dX_4}\frac{dX_4}{dK_1} \\ \frac{dX_n}{dK_1}=\frac{dX_n}{dNM}\frac{dNM}{dK_1}$$

The correct notation would use partial derivatives, even though these are a pain in the neck to write in LaTeX.

##\frac{\partial S}{\partial K_1} = \frac{\partial S}{\partial X_1} \frac{\partial X_1}{\partial K_1} + ...##

I can approximate the derivative of ##NM## with respect to ##K_1## by ##\frac{NM(K_1+dK_1,\,K_2)-NM(K_1,\,K_2)}{dK_1}##, but I have no idea if it is even possible to take the derivative of ##X_n## with respect to Newton's method!

The notation "##\frac{dX_n}{dK1}=\frac{dX_n}{dNM}\frac{dNM}{dK_1}##" makes no sense in the context of your problem.

The vector-valued function ##NM(K_1,K_2)## can be represented as 4 component functions.
##X_1 = NM_1(K_1,K_2)##
##X_2 = NM_2(K_1,K_2)##
##X_3 = NM_3(K_1,K_2)##
##X_4 = NM_4(K_1,K_2)##

##\frac{\partial X_1}{\partial K_1} = \frac{\partial NM_1}{\partial K_1}##
The problem is that it takes a long time, about 40 minutes per experiment
Which step of the problem takes a long time? Using Newton's method? There are many tricks that can be used to speed up programs that solve particular sets of equations. For example, the final result of 2 steps of Newton's method can sometimes be implemented more concisely as a single sequence of code. Custom-made programs for solving particular sets of equations can be much faster than a general-purpose equation solver.
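
As an illustration, a hand-written Newton iteration for a fixed 4-equation system, solving the linear system at each step instead of forming an inverse, is often much faster than a general solver. A sketch, assuming you can supply G and its Jacobian J as functions:

```python
import numpy as np

def newton_root(G, J, x0, tol=1e-10, max_iter=50):
    """Solve G(x) = 0 by Newton's method.

    G : function returning the 4-vector of residuals at x
    J : function returning the 4x4 Jacobian of G at x
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = G(x)
        if np.linalg.norm(g) < tol:
            return x
        # Solve J(x) dx = g instead of computing J(x)^{-1} explicitly
        x = x - np.linalg.solve(J(x), g)
    raise RuntimeError("Newton's method did not converge")
```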

Is it possible? Or does anyone know a different way I can optimize this other than brute force and binary search?
There are quasi-random search methods, such as Simulated Annealing.
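
scipy ships a variant, for example (a sketch: the ##(K_1,K_2)## bounds here are placeholders, and objective is the folded-in SSR from the sketch above):

```python
from scipy.optimize import dual_annealing

# Search the (K1, K2) box globally; no derivatives required
result = dual_annealing(objective, bounds=[(0.01, 100.0), (0.01, 100.0)])
print(result.x, result.fun)
```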

 
Stephen Tashi said:
##\frac{\partial S}{\partial K_1} = \frac{\partial S}{\partial X_1} \frac{\partial X_1}{\partial K_1} + ...##

The notation "##\frac{dX_n}{dK1}=\frac{dX_n}{dNM}\frac{dNM}{dK_1}##" makes no sense in the context of your problem.

The vector-valued function ##NM(K_1,K_2)## can be represented as 4 component functions.
##X_1 = NM_1(K_1,K_2)##
##X_2 = NM_2(K_1,K_2)##
##X_3 = NM_3(K_1,K_2)##
##X_4 = NM_4(K_1,K_2)##

##\frac{\partial X_1}{\partial K_1} = \frac{\partial NM_1}{\partial K_1}##

Thanks for your help!

Ah! I see that I made a big mistake. I think the partial of SSR should be
$$\frac{\partial SSR}{\partial K_1}=\frac{\partial SSR}{\partial X_1}\frac{\partial X_1}{\partial K_1}+ \dots + \frac{\partial SSR}{\partial X_4}\frac{\partial X_4}{\partial K_1}$$
Originally I had the partial of ##X_n## with respect to ##NM##, which was weird like you said.

Newton's method looks like this: $$x = x_{0} - J(K_1,K_2)^{-1}\,G(H_t,G_t,K_1,K_2)$$
##G## is a vector consisting of 4 equations, and ##J^{-1}## is the inverse of the Jacobian matrix.
The output for a particular ##x_n##, ##n \in [1,4]##, is a large equation, ##x_n = f_n(x_1,x_2,x_3,x_4,K_1,K_2,H_t,G_t)##. I think I could either compute the derivative of this function symbolically, a giant mess, or approximate the derivative by taking the value of ##x_n## from Newton's method at two values of ##K_1## close together and dividing by ##dK_1##.
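
Actually, there may be a third option: since the converged ##x## satisfies ##G(x,K_1,K_2)=0##, implicit differentiation gives ##\frac{\partial x}{\partial K_1} = -J^{-1}\frac{\partial G}{\partial K_1}##, reusing the same Jacobian that Newton's method already builds. A sketch (J and dG_dK1 are stand-ins for my actual functions):

```python
import numpy as np

def dx_dK1(x_star, K1, K2, J, dG_dK1):
    """Sensitivity of the converged solution x* to K1.

    Because G(x*, K1, K2) = 0 at convergence, implicit
    differentiation gives  dx*/dK1 = -J^{-1} dG/dK1.
    """
    return -np.linalg.solve(J(x_star, K1, K2), dG_dK1(x_star, K1, K2))
```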

You asked, "Which step of the problem takes a long time?"
Newton's method takes the longest, because for any given initial guess, a certain percentage of the data does not converge. I ended up making a systematic method of guessing that tries every possible combination of guesses, but that is a maximum of 100 guesses in the worst case for a single choice of ##K_1## and ##K_2##. Plus, for every ##K_1## I have to check a ton of ##K_2## values with this method, so it just takes a long time. I could try to develop better guesses, or lower the number of guesses to 50 or so, but I would rather just use gradient descent if possible because it's a cooler algorithm.
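
One thing I might try is warm-starting: when sweeping neighboring ##(K_1,K_2)## values, the previous converged solution is usually a good initial guess for the next solve, which should cut down the number of guesses. Roughly (x_initial and k_pairs are placeholders, and newton_solve here is a stand-in that accepts an initial guess):

```python
# Continuation / warm-starting: reuse the last converged solution
# as the initial guess for the next nearby (K1, K2) pair.
x_guess = x_initial            # starting guess for the first solve
for K1, K2 in k_pairs:         # k_pairs: the (K1, K2) values to sweep
    x_guess = newton_solve(K1, K2, x0=x_guess)
```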
 