Undergrad How to solve a small system of non-linear equations?

  • Thread starter Thread starter Hypatio
  • Start date Start date
  • Tags Tags
    Non-linear System
Click For Summary
To solve a small system of non-linear equations involving variables x, y, and z, the Newton-Raphson method is suggested, although concerns about its convergence are raised. The discussion highlights the importance of choosing appropriate starting guesses to avoid issues with denominators approaching zero. An analytical approach is preferred, with suggestions to rearrange the equations into a linear form in terms of the reciprocals of the variables. The complexity increases when extending the problem to eight unknowns and six knowns, indicating the need for constraints on the unknowns for a unique solution. Overall, the conversation emphasizes iterative methods and careful selection of initial values to improve convergence in solving these equations.
Hypatio
Messages
147
Reaction score
1
If we have the following relations between x, y, and z:

M_{xy}=\frac{2xy}{x+y}
M_{xz}=\frac{2xz}{x+z}
M_{yz}=\frac{2yz}{y+z}

where M_{xy}, M_{xz}, and M_{yz} are known constants, what technique can be used to determine the values of x, y, and z?
 
Physics news on Phys.org
I would use Newton-Raphson.
Guess, make small changes and see whether they get you closer (or calculate the derivatives directly), then calculate a new guess based on Newton-Raphson.
 
Dear @Hypatio,

Do you understand your question is ambiguous ? In the sense it is totally unclear what you are searching: an analytical approach, or a numerical one ? Makes a differnce !
 
BvU said:
Dear @Hypatio,

Do you understand your question is ambiguous ? In the sense it is totally unclear what you are searching: an analytical approach, or a numerical one ? Makes a differnce !
Sorry. The technique is something I will program. I prefer an analytical approach unless it is overly complicated.
 
.Scott said:
I would use Newton-Raphson.
Guess, make small changes and see whether they get you closer (or calculate the derivatives directly), then calculate a new guess based on Newton-Raphson.
Maybe I am making a mistake, but using a Newton-Raphson like method has extremely poor convergence, or will explode.
 
Using the method I suggested, I posed the problem with Mxy,Mxa,Myx = (-5,5,2) and my starting guess was (x,y,z) = (-6,-3,2). I tried to choose a starting guess that would avoid crossing and denominators through zero.

I didn't work. On the 9th iteration, it blew up. "y" got very large as Myz got near 2.4. Perhaps the solution was imaginary.

So I tried it with Mxy,Mxa,Myx = (-12,11,-14) - which I knew had a solution. It seemed to like starting guesses that were in the ball park for magnitude - so I guessed x,y,z=(10,-9,10). It correctly closed on (12.657,-4.07048,9.7363).

So it's a potentially workable approach. I used Excel. I needed to invert a 3x3 matrix (MINVERSE) and multiply a 1x3 with a 3x3 (MMULT).
 
Hypatio said:
Maybe I am making a mistake, but using a Newton-Raphson like method has extremely poor convergence.
Yes. When there's stuff in the denominator that can go to zero - your starting guess needs to be chosen to avoid that. So you might have to produce a look-up table based on the initial Mxy:Myz:Mxz ratios to seed the method.
I would use Matlab or something similar to draw the functions out - so I could see where the problems were.
 
It is also possible that there is a straight solution.
 
You can solve for ##x## in the first equation: ##2xy=M_{xy}(x+y)=xM_{xy}+yM_{xy}## so ##2xy-xM_{xy}=yM_{xy}##. Factoring the lefthandside as ##x(2y-M_{xy})## gives ##x=\frac{yM_{xy}}{2y-M_{xy}}##.

Similarly you can solve for ##z## in the last equation ##z=\frac{yM_{yz}}{2y-M_{yz}}##.

Now substitute both of these into the second equation to get a single equation for ##y##.
 
  • Like
Likes zinq, Hypatio and sysprog
  • #10
Isn't this just an ordinary set of linear equations in the reciprocals, 1/x, 1/y, 1/z ?
 
  • Like
Likes FactChecker, Hypatio, .Scott and 2 others
  • #11
epenguin said:
Isn't this just an ordinary set of linear equations in the reciprocals, 1/x, 1/y, 1/z ?
Yes.
(1/Mxy) is the average of 1/x and 1/y; etc.
 
  • Like
Likes Hypatio
  • #12
Thanks for the helpful comments. However, I was looking at my problem more thoroughly and it looks like what I actually need to solve are the equations:

ab/(a+b) + ef/(e+f)=U \\ ac/(a+c) + eg/(e+g)=V \\ ad/(a+d) + eh/(e+h)=W \\ bc/(b+c) + fg/(f+g)=X \\ bd/(b+d) + fh/(f+h)=Y \\ cd/(c+d) + gh/(g+h)=Z

where U,V,W,X,Y,Z are known and a, b, c, d, e, f, g, h are unknown. These can also be rearranged as

abe+abf+aef+bef-U(ae+af+be+bf)=0 \\<br /> ace+acg+aeg+ceg-V(ae+ag+ce+cg)=0 \\<br /> ade+adh+aeh+deh-W(ae+ah+de+dh)=0 \\<br /> bcf+bcg+bfg+cfg-X(bf+bg+cf+cg)=0 \\<br /> bdf+bdh+bfh+dfh-Y(bf+bh+df+dh)=0 \\<br /> cdg+cdh+cgh+dgh-Z(cg+ch+dg+dh)=0<br />

Infrared showed that my original problem could be solved easily, but I'm not so sure about this one.

Is it possible to simply represent this system as a matrix of coefficients and just invert? But it is not clear to me how to represent the many products of the unknowns in such a system. I am using MATLAB.

My one concern is that there are 8 unknowns and 6 knowns, so perhaps I must first constrain two of the unknowns before it is solveable.
 
  • #14
Once you find a solution for a, b, c, d, e, f, g, and h; there will be another solution by swapping a, b, c, d with e, f, g, h.
With ##a_i = 1/a \space\space##, ##\space b_i=1/b \space\space##, etc, you get the slightly simpler:

##\space\space (a_i+b_i)(e_i+f_i)U = a_i + b_i +e_i + f_i##
##\space\space (a_i+c_i)(e_i+g_i)V = a_i + c_i +e_i + g_i##
##\space\space (a_i+d_i)(e_i+h_i)W = a_i + d_i +e_i + h_i##
##\space\space (b_i+c_i)(f_i+g_i)X= b_i + c_i + f_i + g_i##
##\space\space (b_i+d_i)(f_i+h_i)Y= b_i + d_i +f_i + h_i##
##\space\space (c_i+d_i)(g_i+h_i)Z= c_i + d_i +g_i + h_i##

If I decided on an iterative solution that attempted to close in on the solution, I would make another change:
##\space\space r=\sqrt(U^2+V^2+W^2+X^2+Y^2+Z^2)##
##\space\space U_j = U/r\space; V_j = V/r\space; etc##
##\space\space a_j = a_i r\space; b_j = b_i r\space; etc##

The equation above then follow this pattern:
##\space\space (a_j+b_j)(e_j+f_j)U_j = a_j + b_j +e_j + f_j##

This, combined with sorting the U, V, W, X, Y, Z by values would help identify and select good seed values for the iterations.
 
  • Like
Likes Hypatio and DaveE
  • #15
Hypatio said:
My one concern is that there are 8 unknowns and 6 knowns, so perhaps I must first constrain two of the unknowns before it is solveable.

Anyone would have to know what problem gave rise to the equations to identify suitable constraints if a unique solution is required.
 
  • #16
I was able to make an Newton-Raphson solver for the system

##\space\space (a+b)(e+f)M_1 = a + b +e + f##
##\space\space (a+c)(e+g)M_2 = a + c +e + g##
##\space\space (a+d)(e+h)M_3 = a + d +e + h##
##\space\space (b+c)(f+g)M_4= b + c + f + g##
##\space\space (b+d)(f+h)M_5= b + d +f + h##
##\space\space (c+d)(g+h)M_6= c + d +g + h##

but there seem to be some bad convergence properties. A straight Newton-Raphson often has poor convergence but I've found that if you modulate the stepping rate by a coefficient p

##\space\space x_{n+1} = x_n-p\frac{f(n_n)}{f'(x_n)}##

with p<1 then convergence can improve. However, I've tried a few different sets of ##M_1##...##M_6## values and the optimal choices for p change, and I don't understand why, so I don't know how to make a universal solver.

The following MATLAB code solves it with p=1/2, and two choices of M coefficients. In the uncommented set, the solution is found in about 20 steps, although oscillations persist. The second set of M coefficients (commented) can be partially solved in about 200 steps if p~0.03, but ##M_6## fails to converge. I've also experimented with letting p decrease with timestep, which can help.

It is also worth mentioning that, for some reason, bounding the coefficients in the way that I have is critical for convergence.

Any idea how to make this better? Thanks for all the help.
Code:
maxsteps          =  200; % max timesteps
p                 =  0.5;    % starting rate
dec               =  1;    % decrease rate by this fraction each step

Mest              =  zeros(6,1);  % initialize
dM                =  zeros(6,8);  % initialize

% known input coefficients (M)
M                 =  [32.5;        3.65384615384615; 3.65384615384615; 3.65384615384615; 3.65384615384615; 2.5];
% M                 =  [4.975388788; 0.083993056;      0.013540861;      0.082389867;      0.013327759;      0.013806649];

% min/max/avg unknowns (a,b,c,d,e,f,g,h)
lo                =   min(1./M)./1000;  % min coefficient value
hi                =   max(1./M).*1000;  % max coefficient value
avg               =  mean(1./M);        % avg coefficient value

% unknown ouput coefficients (a,b,c,d,e,f,g,h)
[a,b,c,d,e,f,g,h] =  deal(avg);         % average 1/M to starting guess

% recording
Crecs             =  zeros(maxsteps,8);
Mrecs             =  zeros(maxsteps,6);
Crecs(1,:)        =  [a b c d e f g h];

for t=1:maxsteps
    % current M estimates
    Mest(1)   =  1/(a+b)+1/(e+f);
    Mest(2)   =  1/(a+c)+1/(e+g);
    Mest(3)   =  1/(a+d)+1/(e+h);
    Mest(4)   =  1/(b+c)+1/(f+g);
    Mest(5)   =  1/(b+d)+1/(f+h);
    Mest(6)   =  1/(c+d)+1/(g+h);
    
    % distance to correct M
    deltaM    =  abs(Mest-M);
    
    % average distance to correct M, for change in a coefficient value
    f0(1)     =  (deltaM(1)+deltaM(2)+deltaM(3))/3;   % average distance at new a
    f0(2)     =  (deltaM(1)+deltaM(4)+deltaM(5))/3;   % average distance at new b
    f0(3)     =  (deltaM(2)+deltaM(4)+deltaM(6))/3;   % average distance at new c
    f0(4)     =  (deltaM(3)+deltaM(5)+deltaM(6))/3;   % average distance at new d
    f0(5)     =  (deltaM(1)+deltaM(2)+deltaM(3))/3;   % average distance at new e
    f0(6)     =  (deltaM(1)+deltaM(4)+deltaM(5))/3;   % average distance at new f
    f0(7)     =  (deltaM(2)+deltaM(4)+deltaM(6))/3;   % average distance at new g
    f0(8)     =  (deltaM(3)+deltaM(5)+deltaM(6))/3;   % average distance at new h
    
    % M derivatives for each coefficient
    % a                       b                         c                         d                        e                         f                         g                         h
    dM(1,1)   = -1/(a+b)^2;   dM(1,2)   = -1/(a+b)^2;   dM(1,3)   =  0;           dM(1,4)   =  0;          dM(1,5)   = -1/(e+f)^2;   dM(1,6)   = -1/(e+f)^2;   dM(1,7)   =  0;           dM(1,8)   =  0;
    dM(2,1)   = -1/(a+c)^2;   dM(2,2)   =  0;           dM(2,3)   = -1/(a+c)^2;   dM(2,4)   =  0;          dM(2,5)   = -1/(e+g)^2;   dM(2,6)   =  0;           dM(2,7)   = -1/(e+g)^2;   dM(2,8)   =  0;
    dM(3,1)   = -1/(a+d)^2;   dM(3,2)   =  0;           dM(3,3)   =  0;           dM(3,4)   = -1/(a+d)^2;  dM(3,5)   = -1/(e+h)^2;   dM(3,6)   =  0;           dM(3,7)   =  0;           dM(3,8)   = -1/(e+h)^2;
    dM(4,1)   =  0;           dM(4,2)   = -1/(b+c)^2;   dM(4,3)   = -1/(b+c)^2;   dM(4,4)   =  0;          dM(4,5)   =  0;           dM(4,6)   = -1/(f+g)^2;   dM(4,7)   = -1/(f+g)^2;   dM(4,8)   =  0;
    dM(5,1)   =  0;           dM(5,2)   = -1/(b+d)^2;   dM(5,3)   =  0;           dM(5,4)   = -1/(b+d)^2;  dM(5,5)   =  0;           dM(5,6)   = -1/(f+h)^2;   dM(5,7)   =  0;           dM(5,8)   = -1/(f+h)^2;
    dM(6,1)   =  0;           dM(6,2)   =  0;           dM(6,3)   = -1/(c+d)^2;   dM(6,4)   = -1/(c+d)^2;  dM(6,5)   =  0;           dM(6,6)   =  0;           dM(6,7)   = -1/(g+h)^2;   dM(6,8)   = -1/(g+h)^2;
    
    % positive on negative side is the direction of the root (0)
    for i=1:6
        if Mest(i)<M(i)
            dM(i,:)   = -dM(i,:);
        end
    end
    
    % net derivative
    df        =  sum(dM);

    % Newton-Raphson step
    an        =  a-p*f0(1)/df(1);
    bn        =  b-p*f0(2)/df(2);
    cn        =  c-p*f0(3)/df(3);
    dn        =  d-p*f0(4)/df(4);
    en        =  e-p*f0(5)/df(5);
    fn        =  f-p*f0(6)/df(6);
    gn        =  g-p*f0(7)/df(7);
    hn        =  h-p*f0(8)/df(8);
    
    % bound results: do not allow steps to very high or very low values
    if an<lo; a = (a+lo)/2;   elseif an>hi; a = (a+hi)/2;  else; a = an;  end
    if bn<lo; b = (b+lo)/2;   elseif bn>hi; b = (b+hi)/2;  else; b = bn;  end
    if cn<lo; c = (c+lo)/2;   elseif cn>hi; c = (c+hi)/2;  else; c = cn;  end
    if dn<lo; d = (d+lo)/2;   elseif dn>hi; d = (d+hi)/2;  else; d = dn;  end
    if en<lo; e = (e+lo)/2;   elseif en>hi; e = (e+hi)/2;  else; e = en;  end
    if fn<lo; f = (f+lo)/2;   elseif fn>hi; f = (f+hi)/2;  else; f = fn;  end
    if gn<lo; g = (g+lo)/2;   elseif gn>hi; g = (g+hi)/2;  else; g = gn;  end
    if hn<lo; h = (h+lo)/2;   elseif hn>hi; h = (h+hi)/2;  else; h = hn;  end
    
    % decrease convergence rate
    p      =  p.*dec;
    
    % record
    Crecs(t+1,:)   =  [a b c d e f g h];
    Mrecs(t,:)     =  Mest;
end

fprintf('residual: %f',Mest./M-1)

for i=1:8
    figure(1); plot(1:maxsteps+1,Crecs(:,i)); hold on;
end
title('a, b, c, d, e, f, g, h'); xlabel('steps'); hold off;

for i=1:6
    figure(2); plot(1:maxsteps,Mrecs(:,i)./M(i)); hold on;
end
title('M_{est}/M_{actual}'); xlabel('steps'); hold off;
 
  • #17
There is nothing ambiguous about asking how to solve a system of equations for the variables. Nothing was mentioned about approximate solutions.

I'm not sure why this problem isn't as easy as was first pointed out by Infrared in #9: Solve for x in terms of y in the first equation, then plug in for x in the second equation, leaving just two equations in y and z that can be easily solved for y and z, then x.

Or the startling insight of epenguin in #10, showing that the three equations can be seen as much simpler equations in 1/x, 1y, 1/z. Substituting say U = 1/x, V = 1/y, W = 1/z you get three linear equations in the three unknowns U, V, W and once solved then you know what x, y, z are.

(All the while realizing that this last approach assumes none of x, y, z is equal to 0. But x = y = z = 0 is arguably one (very boring) solution to the original equations.)

There's no upside to making this much more difficult than necessary!
 
Last edited:
  • #18
Each of the 6 equations, from post # 16, has 4 unknowns. I'm not sure how to perform the manipulations to get a straight solution if a,b,c,d,e,f,g, and h are unique, positive, and non-zero.
 
  • #19
I apologize for not reading the entire thread thoroughly, spending some time offline to see that the original question was definitely solvable. (Maybe a new question deserves a separate thread.)
 
  • #20
Hypatio said:
Thanks for the helpful comments. However, I was looking at my problem more thoroughly and it looks like what I actually need to solve are the equations:

ab/(a+b)+ef/(e+f)=Uac/(a+c)+eg/(e+g)=Vad/(a+d)+eh/(e+h)=Wbc/(b+c)+fg/(f+g)=Xbd/(b+d)+fh/(f+h)=Ycd/(c+d)+gh/(g+h)=Z​
Looks like a set of equations for parallel connections of resistors!

From that standpoint you get the following facts:
\frac{ab}{a+b}\leq min(a, b)
if (b\gg a) then \frac{ab}{a+b}\approx a
 

Similar threads

Replies
4
Views
2K
  • · Replies 30 ·
2
Replies
30
Views
2K
  • · Replies 3 ·
Replies
3
Views
2K
  • · Replies 1 ·
Replies
1
Views
1K
  • · Replies 1 ·
Replies
1
Views
2K
  • · Replies 4 ·
Replies
4
Views
2K
  • · Replies 7 ·
Replies
7
Views
2K
  • · Replies 6 ·
Replies
6
Views
1K
  • · Replies 4 ·
Replies
4
Views
3K
  • · Replies 2 ·
Replies
2
Views
2K