How to solve a small system of non-linear equations?

Hypatio · Mar 24, 2020

If we have the following relations between x, y, and z:

[tex]M_{xy}=\frac{2xy}{x+y}[/tex]
[tex]M_{xz}=\frac{2xz}{x+z}[/tex]
[tex]M_{yz}=\frac{2yz}{y+z}[/tex]

where [itex]M_{xy}[/itex], [itex]M_{xz}[/itex], and [itex]M_{yz}[/itex] are known constants, what technique can be used to determine the values of x, y, and z?

.Scott · Mar 24, 2020

I would use Newton-Raphson.
Guess, make small changes and see whether they get you closer (or calculate the derivatives directly), then calculate a new guess based on Newton-Raphson.

BvU · Mar 24, 2020

Dear @Hypatio,

Do you understand your question is ambiguous ? In the sense it is totally unclear what you are searching: an analytical approach, or a numerical one ? Makes a differnce !

Hypatio · Mar 24, 2020

BvU said:

Dear @Hypatio,

Do you understand your question is ambiguous ? In the sense it is totally unclear what you are searching: an analytical approach, or a numerical one ? Makes a differnce !

Sorry. The technique is something I will program. I prefer an analytical approach unless it is overly complicated.

Hypatio · Mar 24, 2020

.Scott said:

I would use Newton-Raphson.
Guess, make small changes and see whether they get you closer (or calculate the derivatives directly), then calculate a new guess based on Newton-Raphson.

Maybe I am making a mistake, but using a Newton-Raphson like method has extremely poor convergence, or will explode.

.Scott · Mar 24, 2020

Using the method I suggested, I posed the problem with Mxy,Mxa,Myx = (-5,5,2) and my starting guess was (x,y,z) = (-6,-3,2). I tried to choose a starting guess that would avoid crossing and denominators through zero.

I didn't work. On the 9th iteration, it blew up. "y" got very large as Myz got near 2.4. Perhaps the solution was imaginary.

So I tried it with Mxy,Mxa,Myx = (-12,11,-14) - which I knew had a solution. It seemed to like starting guesses that were in the ball park for magnitude - so I guessed x,y,z=(10,-9,10). It correctly closed on (12.657,-4.07048,9.7363).

So it's a potentially workable approach. I used Excel. I needed to invert a 3x3 matrix (MINVERSE) and multiply a 1x3 with a 3x3 (MMULT).

.Scott · Mar 24, 2020

Hypatio said:

Maybe I am making a mistake, but using a Newton-Raphson like method has extremely poor convergence.

Yes. When there's stuff in the denominator that can go to zero - your starting guess needs to be chosen to avoid that. So you might have to produce a look-up table based on the initial Mxy:Myz:Mxz ratios to seed the method.
I would use Matlab or something similar to draw the functions out - so I could see where the problems were.

.Scott · Mar 24, 2020

It is also possible that there is a straight solution.

Infrared · Mar 24, 2020

You can solve for ##x## in the first equation: ##2xy=M_{xy}(x+y)=xM_{xy}+yM_{xy}## so ##2xy-xM_{xy}=yM_{xy}##. Factoring the lefthandside as ##x(2y-M_{xy})## gives ##x=\frac{yM_{xy}}{2y-M_{xy}}##.

Similarly you can solve for ##z## in the last equation ##z=\frac{yM_{yz}}{2y-M_{yz}}##.

Now substitute both of these into the second equation to get a single equation for ##y##.

epenguin · Mar 24, 2020

Isn't this just an ordinary set of linear equations in the reciprocals, 1/x, 1/y, 1/z ?

.Scott · Mar 25, 2020

epenguin said:

Isn't this just an ordinary set of linear equations in the reciprocals, 1/x, 1/y, 1/z ?

Yes.
(1/Mxy) is the average of 1/x and 1/y; etc.

Hypatio · Mar 28, 2020

Thanks for the helpful comments. However, I was looking at my problem more thoroughly and it looks like what I actually need to solve are the equations:

[tex]ab/(a+b) + ef/(e+f)=U \\ ac/(a+c) + eg/(e+g)=V \\ ad/(a+d) + eh/(e+h)=W \\ bc/(b+c) + fg/(f+g)=X \\ bd/(b+d) + fh/(f+h)=Y \\ cd/(c+d) + gh/(g+h)=Z[/tex]

where U,V,W,X,Y,Z are known and a, b, c, d, e, f, g, h are unknown. These can also be rearranged as

[tex]abe+abf+aef+bef-U(ae+af+be+bf)=0 \\
ace+acg+aeg+ceg-V(ae+ag+ce+cg)=0 \\
ade+adh+aeh+deh-W(ae+ah+de+dh)=0 \\
bcf+bcg+bfg+cfg-X(bf+bg+cf+cg)=0 \\
bdf+bdh+bfh+dfh-Y(bf+bh+df+dh)=0 \\
cdg+cdh+cgh+dgh-Z(cg+ch+dg+dh)=0
[/tex]

Infrared showed that my original problem could be solved easily, but I'm not so sure about this one.

Is it possible to simply represent this system as a matrix of coefficients and just invert? But it is not clear to me how to represent the many products of the unknowns in such a system. I am using MATLAB.

My one concern is that there are 8 unknowns and 6 knowns, so perhaps I must first constrain two of the unknowns before it is solveable.

Stephen Tashi · Mar 28, 2020

The theoretical way to solve polynomial equations is: http://www.scholarpedia.org/article/Groebner_basis

Perhaps you can find MATLAB library functions that implement that technique.

.Scott · Mar 28, 2020

Once you find a solution for a, b, c, d, e, f, g, and h; there will be another solution by swapping a, b, c, d with e, f, g, h.
With ##a_i = 1/a \space\space##, ##\space b_i=1/b \space\space##, etc, you get the slightly simpler:

##\space\space (a_i+b_i)(e_i+f_i)U = a_i + b_i +e_i + f_i##
##\space\space (a_i+c_i)(e_i+g_i)V = a_i + c_i +e_i + g_i##
##\space\space (a_i+d_i)(e_i+h_i)W = a_i + d_i +e_i + h_i##
##\space\space (b_i+c_i)(f_i+g_i)X= b_i + c_i + f_i + g_i##
##\space\space (b_i+d_i)(f_i+h_i)Y= b_i + d_i +f_i + h_i##
##\space\space (c_i+d_i)(g_i+h_i)Z= c_i + d_i +g_i + h_i##

If I decided on an iterative solution that attempted to close in on the solution, I would make another change:
##\space\space r=\sqrt(U^2+V^2+W^2+X^2+Y^2+Z^2)##
##\space\space U_j = U/r\space; V_j = V/r\space; etc##
##\space\space a_j = a_i r\space; b_j = b_i r\space; etc##

The equation above then follow this pattern:
##\space\space (a_j+b_j)(e_j+f_j)U_j = a_j + b_j +e_j + f_j##

This, combined with sorting the U, V, W, X, Y, Z by values would help identify and select good seed values for the iterations.

epenguin · Mar 29, 2020

Hypatio said:

My one concern is that there are 8 unknowns and 6 knowns, so perhaps I must first constrain two of the unknowns before it is solveable.

Anyone would have to know what problem gave rise to the equations to identify suitable constraints if a unique solution is required.

Hypatio · Apr 2, 2020

I was able to make an Newton-Raphson solver for the system

##\space\space (a+b)(e+f)M_1 = a + b +e + f##
##\space\space (a+c)(e+g)M_2 = a + c +e + g##
##\space\space (a+d)(e+h)M_3 = a + d +e + h##
##\space\space (b+c)(f+g)M_4= b + c + f + g##
##\space\space (b+d)(f+h)M_5= b + d +f + h##
##\space\space (c+d)(g+h)M_6= c + d +g + h##

but there seem to be some bad convergence properties. A straight Newton-Raphson often has poor convergence but I've found that if you modulate the stepping rate by a coefficient p

##\space\space x_{n+1} = x_n-p\frac{f(n_n)}{f'(x_n)}##

with p<1 then convergence can improve. However, I've tried a few different sets of ##M_1##...##M_6## values and the optimal choices for p change, and I don't understand why, so I don't know how to make a universal solver.

The following MATLAB code solves it with p=1/2, and two choices of M coefficients. In the uncommented set, the solution is found in about 20 steps, although oscillations persist. The second set of M coefficients (commented) can be partially solved in about 200 steps if p~0.03, but ##M_6## fails to converge. I've also experimented with letting p decrease with timestep, which can help.

It is also worth mentioning that, for some reason, bounding the coefficients in the way that I have is critical for convergence.

Any idea how to make this better? Thanks for all the help.

Code:

maxsteps          =  200; % max timesteps
p                 =  0.5;    % starting rate
dec               =  1;    % decrease rate by this fraction each step

Mest              =  zeros(6,1);  % initialize
dM                =  zeros(6,8);  % initialize

% known input coefficients (M)
M                 =  [32.5;        3.65384615384615; 3.65384615384615; 3.65384615384615; 3.65384615384615; 2.5];
% M                 =  [4.975388788; 0.083993056;      0.013540861;      0.082389867;      0.013327759;      0.013806649];

% min/max/avg unknowns (a,b,c,d,e,f,g,h)
lo                =   min(1./M)./1000;  % min coefficient value
hi                =   max(1./M).*1000;  % max coefficient value
avg               =  mean(1./M);        % avg coefficient value

% unknown ouput coefficients (a,b,c,d,e,f,g,h)
[a,b,c,d,e,f,g,h] =  deal(avg);         % average 1/M to starting guess

% recording
Crecs             =  zeros(maxsteps,8);
Mrecs             =  zeros(maxsteps,6);
Crecs(1,:)        =  [a b c d e f g h];

for t=1:maxsteps
    % current M estimates
    Mest(1)   =  1/(a+b)+1/(e+f);
    Mest(2)   =  1/(a+c)+1/(e+g);
    Mest(3)   =  1/(a+d)+1/(e+h);
    Mest(4)   =  1/(b+c)+1/(f+g);
    Mest(5)   =  1/(b+d)+1/(f+h);
    Mest(6)   =  1/(c+d)+1/(g+h);
    
    % distance to correct M
    deltaM    =  abs(Mest-M);
    
    % average distance to correct M, for change in a coefficient value
    f0(1)     =  (deltaM(1)+deltaM(2)+deltaM(3))/3;   % average distance at new a
    f0(2)     =  (deltaM(1)+deltaM(4)+deltaM(5))/3;   % average distance at new b
    f0(3)     =  (deltaM(2)+deltaM(4)+deltaM(6))/3;   % average distance at new c
    f0(4)     =  (deltaM(3)+deltaM(5)+deltaM(6))/3;   % average distance at new d
    f0(5)     =  (deltaM(1)+deltaM(2)+deltaM(3))/3;   % average distance at new e
    f0(6)     =  (deltaM(1)+deltaM(4)+deltaM(5))/3;   % average distance at new f
    f0(7)     =  (deltaM(2)+deltaM(4)+deltaM(6))/3;   % average distance at new g
    f0(8)     =  (deltaM(3)+deltaM(5)+deltaM(6))/3;   % average distance at new h
    
    % M derivatives for each coefficient
    % a                       b                         c                         d                        e                         f                         g                         h
    dM(1,1)   = -1/(a+b)^2;   dM(1,2)   = -1/(a+b)^2;   dM(1,3)   =  0;           dM(1,4)   =  0;          dM(1,5)   = -1/(e+f)^2;   dM(1,6)   = -1/(e+f)^2;   dM(1,7)   =  0;           dM(1,8)   =  0;
    dM(2,1)   = -1/(a+c)^2;   dM(2,2)   =  0;           dM(2,3)   = -1/(a+c)^2;   dM(2,4)   =  0;          dM(2,5)   = -1/(e+g)^2;   dM(2,6)   =  0;           dM(2,7)   = -1/(e+g)^2;   dM(2,8)   =  0;
    dM(3,1)   = -1/(a+d)^2;   dM(3,2)   =  0;           dM(3,3)   =  0;           dM(3,4)   = -1/(a+d)^2;  dM(3,5)   = -1/(e+h)^2;   dM(3,6)   =  0;           dM(3,7)   =  0;           dM(3,8)   = -1/(e+h)^2;
    dM(4,1)   =  0;           dM(4,2)   = -1/(b+c)^2;   dM(4,3)   = -1/(b+c)^2;   dM(4,4)   =  0;          dM(4,5)   =  0;           dM(4,6)   = -1/(f+g)^2;   dM(4,7)   = -1/(f+g)^2;   dM(4,8)   =  0;
    dM(5,1)   =  0;           dM(5,2)   = -1/(b+d)^2;   dM(5,3)   =  0;           dM(5,4)   = -1/(b+d)^2;  dM(5,5)   =  0;           dM(5,6)   = -1/(f+h)^2;   dM(5,7)   =  0;           dM(5,8)   = -1/(f+h)^2;
    dM(6,1)   =  0;           dM(6,2)   =  0;           dM(6,3)   = -1/(c+d)^2;   dM(6,4)   = -1/(c+d)^2;  dM(6,5)   =  0;           dM(6,6)   =  0;           dM(6,7)   = -1/(g+h)^2;   dM(6,8)   = -1/(g+h)^2;
    
    % positive on negative side is the direction of the root (0)
    for i=1:6
        if Mest(i)<M(i)
            dM(i,:)   = -dM(i,:);
        end
    end
    
    % net derivative
    df        =  sum(dM);

    % Newton-Raphson step
    an        =  a-p*f0(1)/df(1);
    bn        =  b-p*f0(2)/df(2);
    cn        =  c-p*f0(3)/df(3);
    dn        =  d-p*f0(4)/df(4);
    en        =  e-p*f0(5)/df(5);
    fn        =  f-p*f0(6)/df(6);
    gn        =  g-p*f0(7)/df(7);
    hn        =  h-p*f0(8)/df(8);
    
    % bound results: do not allow steps to very high or very low values
    if an<lo; a = (a+lo)/2;   elseif an>hi; a = (a+hi)/2;  else; a = an;  end
    if bn<lo; b = (b+lo)/2;   elseif bn>hi; b = (b+hi)/2;  else; b = bn;  end
    if cn<lo; c = (c+lo)/2;   elseif cn>hi; c = (c+hi)/2;  else; c = cn;  end
    if dn<lo; d = (d+lo)/2;   elseif dn>hi; d = (d+hi)/2;  else; d = dn;  end
    if en<lo; e = (e+lo)/2;   elseif en>hi; e = (e+hi)/2;  else; e = en;  end
    if fn<lo; f = (f+lo)/2;   elseif fn>hi; f = (f+hi)/2;  else; f = fn;  end
    if gn<lo; g = (g+lo)/2;   elseif gn>hi; g = (g+hi)/2;  else; g = gn;  end
    if hn<lo; h = (h+lo)/2;   elseif hn>hi; h = (h+hi)/2;  else; h = hn;  end
    
    % decrease convergence rate
    p      =  p.*dec;
    
    % record
    Crecs(t+1,:)   =  [a b c d e f g h];
    Mrecs(t,:)     =  Mest;
end

fprintf('residual: %f',Mest./M-1)

for i=1:8
    figure(1); plot(1:maxsteps+1,Crecs(:,i)); hold on;
end
title('a, b, c, d, e, f, g, h'); xlabel('steps'); hold off;

for i=1:6
    figure(2); plot(1:maxsteps,Mrecs(:,i)./M(i)); hold on;
end
title('M_{est}/M_{actual}'); xlabel('steps'); hold off;

zinq · Apr 4, 2020

There is nothing ambiguous about asking how to solve a system of equations for the variables. Nothing was mentioned about approximate solutions.

I'm not sure why this problem isn't as easy as was first pointed out by Infrared in #9: Solve for x in terms of y in the first equation, then plug in for x in the second equation, leaving just two equations in y and z that can be easily solved for y and z, then x.

Or the startling insight of epenguin in #10, showing that the three equations can be seen as much simpler equations in 1/x, 1y, 1/z. Substituting say U = 1/x, V = 1/y, W = 1/z you get three linear equations in the three unknowns U, V, W and once solved then you know what x, y, z are.

(All the while realizing that this last approach assumes none of x, y, z is equal to 0. But x = y = z = 0 is arguably one (very boring) solution to the original equations.)

There's no upside to making this much more difficult than necessary!

Hypatio · Apr 4, 2020

Each of the 6 equations, from post # 16, has 4 unknowns. I'm not sure how to perform the manipulations to get a straight solution if a,b,c,d,e,f,g, and h are unique, positive, and non-zero.

zinq · Apr 4, 2020

I apologize for not reading the entire thread thoroughly, spending some time offline to see that the original question was definitely solvable. (Maybe a new question deserves a separate thread.)

Svein · Apr 28, 2020

Hypatio said:

Thanks for the helpful comments. However, I was looking at my problem more thoroughly and it looks like what I actually need to solve are the equations:

ab/(a+b)+ef/(e+f)=Uac/(a+c)+eg/(e+g)=Vad/(a+d)+eh/(e+h)=Wbc/(b+c)+fg/(f+g)=Xbd/(b+d)+fh/(f+h)=Ycd/(c+d)+gh/(g+h)=Z

Looks like a set of equations for parallel connections of resistors!

From that standpoint you get the following facts:
[itex] \frac{ab}{a+b}\leq min(a, b)[/itex]
[itex] if (b\gg a) then \frac{ab}{a+b}\approx a[/itex]

How to solve a small system of non-linear equations?

What is a non-linear equation?

How do I solve a small system of non-linear equations?

What are some common challenges when solving non-linear equations?

Can I use a calculator to solve non-linear equations?

What real-life applications use non-linear equations?

Similar threads

Hot Threads

Recent Insights