# How to solve a small system of non-linear equations?

• I
Hypatio
If we have the following relations between x, y, and z:

$$M_{xy}=\frac{2xy}{x+y}$$
$$M_{xz}=\frac{2xz}{x+z}$$
$$M_{yz}=\frac{2yz}{y+z}$$

where $M_{xy}$, $M_{xz}$, and $M_{yz}$ are known constants, what technique can be used to determine the values of x, y, and z?

Homework Helper
I would use Newton-Raphson.
Guess, make small changes and see whether they get you closer (or calculate the derivatives directly), then calculate a new guess based on Newton-Raphson.

Homework Helper
Dear @Hypatio,

Do you understand your question is ambiguous ? In the sense it is totally unclear what you are searching: an analytical approach, or a numerical one ? Makes a differnce !

Hypatio
Dear @Hypatio,

Do you understand your question is ambiguous ? In the sense it is totally unclear what you are searching: an analytical approach, or a numerical one ? Makes a differnce !
Sorry. The technique is something I will program. I prefer an analytical approach unless it is overly complicated.

Hypatio
I would use Newton-Raphson.
Guess, make small changes and see whether they get you closer (or calculate the derivatives directly), then calculate a new guess based on Newton-Raphson.
Maybe I am making a mistake, but using a Newton-Raphson like method has extremely poor convergence, or will explode.

Homework Helper
Using the method I suggested, I posed the problem with Mxy,Mxa,Myx = (-5,5,2) and my starting guess was (x,y,z) = (-6,-3,2). I tried to choose a starting guess that would avoid crossing and denominators through zero.

I didn't work. On the 9th iteration, it blew up. "y" got very large as Myz got near 2.4. Perhaps the solution was imaginary.

So I tried it with Mxy,Mxa,Myx = (-12,11,-14) - which I knew had a solution. It seemed to like starting guesses that were in the ball park for magnitude - so I guessed x,y,z=(10,-9,10). It correctly closed on (12.657,-4.07048,9.7363).

So it's a potentially workable approach. I used Excel. I needed to invert a 3x3 matrix (MINVERSE) and multiply a 1x3 with a 3x3 (MMULT).

Homework Helper
Maybe I am making a mistake, but using a Newton-Raphson like method has extremely poor convergence.
Yes. When there's stuff in the denominator that can go to zero - your starting guess needs to be chosen to avoid that. So you might have to produce a look-up table based on the initial Mxy:Myz:Mxz ratios to seed the method.
I would use Matlab or something similar to draw the functions out - so I could see where the problems were.

Homework Helper
It is also possible that there is a straight solution.

Gold Member
You can solve for ##x## in the first equation: ##2xy=M_{xy}(x+y)=xM_{xy}+yM_{xy}## so ##2xy-xM_{xy}=yM_{xy}##. Factoring the lefthandside as ##x(2y-M_{xy})## gives ##x=\frac{yM_{xy}}{2y-M_{xy}}##.

Similarly you can solve for ##z## in the last equation ##z=\frac{yM_{yz}}{2y-M_{yz}}##.

Now substitute both of these into the second equation to get a single equation for ##y##.

zinq, Hypatio and sysprog
Homework Helper
Gold Member
Isn't this just an ordinary set of linear equations in the reciprocals, 1/x, 1/y, 1/z ?

FactChecker, Hypatio, .Scott and 2 others
Homework Helper
Isn't this just an ordinary set of linear equations in the reciprocals, 1/x, 1/y, 1/z ?
Yes.
(1/Mxy) is the average of 1/x and 1/y; etc.

Hypatio
Hypatio
Thanks for the helpful comments. However, I was looking at my problem more thoroughly and it looks like what I actually need to solve are the equations:

$$ab/(a+b) + ef/(e+f)=U \\ ac/(a+c) + eg/(e+g)=V \\ ad/(a+d) + eh/(e+h)=W \\ bc/(b+c) + fg/(f+g)=X \\ bd/(b+d) + fh/(f+h)=Y \\ cd/(c+d) + gh/(g+h)=Z$$

where U,V,W,X,Y,Z are known and a, b, c, d, e, f, g, h are unknown. These can also be rearranged as

$$abe+abf+aef+bef-U(ae+af+be+bf)=0 \\ ace+acg+aeg+ceg-V(ae+ag+ce+cg)=0 \\ ade+adh+aeh+deh-W(ae+ah+de+dh)=0 \\ bcf+bcg+bfg+cfg-X(bf+bg+cf+cg)=0 \\ bdf+bdh+bfh+dfh-Y(bf+bh+df+dh)=0 \\ cdg+cdh+cgh+dgh-Z(cg+ch+dg+dh)=0$$

Infrared showed that my original problem could be solved easily, but I'm not so sure about this one.

Is it possible to simply represent this system as a matrix of coefficients and just invert? But it is not clear to me how to represent the many products of the unknowns in such a system. I am using MATLAB.

My one concern is that there are 8 unknowns and 6 knowns, so perhaps I must first constrain two of the unknowns before it is solveable.

Homework Helper
Once you find a solution for a, b, c, d, e, f, g, and h; there will be another solution by swapping a, b, c, d with e, f, g, h.
With ##a_i = 1/a \space\space##, ##\space b_i=1/b \space\space##, etc, you get the slightly simpler:

##\space\space (a_i+b_i)(e_i+f_i)U = a_i + b_i +e_i + f_i##
##\space\space (a_i+c_i)(e_i+g_i)V = a_i + c_i +e_i + g_i##
##\space\space (a_i+d_i)(e_i+h_i)W = a_i + d_i +e_i + h_i##
##\space\space (b_i+c_i)(f_i+g_i)X= b_i + c_i + f_i + g_i##
##\space\space (b_i+d_i)(f_i+h_i)Y= b_i + d_i +f_i + h_i##
##\space\space (c_i+d_i)(g_i+h_i)Z= c_i + d_i +g_i + h_i##

If I decided on an iterative solution that attempted to close in on the solution, I would make another change:
##\space\space r=\sqrt(U^2+V^2+W^2+X^2+Y^2+Z^2)##
##\space\space U_j = U/r\space; V_j = V/r\space; etc##
##\space\space a_j = a_i r\space; b_j = b_i r\space; etc##

The equation above then follow this pattern:
##\space\space (a_j+b_j)(e_j+f_j)U_j = a_j + b_j +e_j + f_j##

This, combined with sorting the U, V, W, X, Y, Z by values would help identify and select good seed values for the iterations.

Hypatio and DaveE
Homework Helper
Gold Member
My one concern is that there are 8 unknowns and 6 knowns, so perhaps I must first constrain two of the unknowns before it is solveable.

Anyone would have to know what problem gave rise to the equations to identify suitable constraints if a unique solution is required.

Hypatio
I was able to make an Newton-Raphson solver for the system

##\space\space (a+b)(e+f)M_1 = a + b +e + f##
##\space\space (a+c)(e+g)M_2 = a + c +e + g##
##\space\space (a+d)(e+h)M_3 = a + d +e + h##
##\space\space (b+c)(f+g)M_4= b + c + f + g##
##\space\space (b+d)(f+h)M_5= b + d +f + h##
##\space\space (c+d)(g+h)M_6= c + d +g + h##

but there seem to be some bad convergence properties. A straight Newton-Raphson often has poor convergence but I've found that if you modulate the stepping rate by a coefficient p

##\space\space x_{n+1} = x_n-p\frac{f(n_n)}{f'(x_n)}##

with p<1 then convergence can improve. However, I've tried a few different sets of ##M_1##...##M_6## values and the optimal choices for p change, and I don't understand why, so I don't know how to make a universal solver.

The following MATLAB code solves it with p=1/2, and two choices of M coefficients. In the uncommented set, the solution is found in about 20 steps, although oscillations persist. The second set of M coefficients (commented) can be partially solved in about 200 steps if p~0.03, but ##M_6## fails to converge. I've also experimented with letting p decrease with timestep, which can help.

It is also worth mentioning that, for some reason, bounding the coefficients in the way that I have is critical for convergence.

Any idea how to make this better? Thanks for all the help.

Code:
maxsteps          =  200; % max timesteps
p                 =  0.5;    % starting rate
dec               =  1;    % decrease rate by this fraction each step

Mest              =  zeros(6,1);  % initialize
dM                =  zeros(6,8);  % initialize

% known input coefficients (M)
M                 =  [32.5;        3.65384615384615; 3.65384615384615; 3.65384615384615; 3.65384615384615; 2.5];
% M                 =  [4.975388788; 0.083993056;      0.013540861;      0.082389867;      0.013327759;      0.013806649];

% min/max/avg unknowns (a,b,c,d,e,f,g,h)
lo                =   min(1./M)./1000;  % min coefficient value
hi                =   max(1./M).*1000;  % max coefficient value
avg               =  mean(1./M);        % avg coefficient value

% unknown ouput coefficients (a,b,c,d,e,f,g,h)
[a,b,c,d,e,f,g,h] =  deal(avg);         % average 1/M to starting guess

% recording
Crecs             =  zeros(maxsteps,8);
Mrecs             =  zeros(maxsteps,6);
Crecs(1,:)        =  [a b c d e f g h];

for t=1:maxsteps
% current M estimates
Mest(1)   =  1/(a+b)+1/(e+f);
Mest(2)   =  1/(a+c)+1/(e+g);
Mest(3)   =  1/(a+d)+1/(e+h);
Mest(4)   =  1/(b+c)+1/(f+g);
Mest(5)   =  1/(b+d)+1/(f+h);
Mest(6)   =  1/(c+d)+1/(g+h);

% distance to correct M
deltaM    =  abs(Mest-M);

% average distance to correct M, for change in a coefficient value
f0(1)     =  (deltaM(1)+deltaM(2)+deltaM(3))/3;   % average distance at new a
f0(2)     =  (deltaM(1)+deltaM(4)+deltaM(5))/3;   % average distance at new b
f0(3)     =  (deltaM(2)+deltaM(4)+deltaM(6))/3;   % average distance at new c
f0(4)     =  (deltaM(3)+deltaM(5)+deltaM(6))/3;   % average distance at new d
f0(5)     =  (deltaM(1)+deltaM(2)+deltaM(3))/3;   % average distance at new e
f0(6)     =  (deltaM(1)+deltaM(4)+deltaM(5))/3;   % average distance at new f
f0(7)     =  (deltaM(2)+deltaM(4)+deltaM(6))/3;   % average distance at new g
f0(8)     =  (deltaM(3)+deltaM(5)+deltaM(6))/3;   % average distance at new h

% M derivatives for each coefficient
% a                       b                         c                         d                        e                         f                         g                         h
dM(1,1)   = -1/(a+b)^2;   dM(1,2)   = -1/(a+b)^2;   dM(1,3)   =  0;           dM(1,4)   =  0;          dM(1,5)   = -1/(e+f)^2;   dM(1,6)   = -1/(e+f)^2;   dM(1,7)   =  0;           dM(1,8)   =  0;
dM(2,1)   = -1/(a+c)^2;   dM(2,2)   =  0;           dM(2,3)   = -1/(a+c)^2;   dM(2,4)   =  0;          dM(2,5)   = -1/(e+g)^2;   dM(2,6)   =  0;           dM(2,7)   = -1/(e+g)^2;   dM(2,8)   =  0;
dM(3,1)   = -1/(a+d)^2;   dM(3,2)   =  0;           dM(3,3)   =  0;           dM(3,4)   = -1/(a+d)^2;  dM(3,5)   = -1/(e+h)^2;   dM(3,6)   =  0;           dM(3,7)   =  0;           dM(3,8)   = -1/(e+h)^2;
dM(4,1)   =  0;           dM(4,2)   = -1/(b+c)^2;   dM(4,3)   = -1/(b+c)^2;   dM(4,4)   =  0;          dM(4,5)   =  0;           dM(4,6)   = -1/(f+g)^2;   dM(4,7)   = -1/(f+g)^2;   dM(4,8)   =  0;
dM(5,1)   =  0;           dM(5,2)   = -1/(b+d)^2;   dM(5,3)   =  0;           dM(5,4)   = -1/(b+d)^2;  dM(5,5)   =  0;           dM(5,6)   = -1/(f+h)^2;   dM(5,7)   =  0;           dM(5,8)   = -1/(f+h)^2;
dM(6,1)   =  0;           dM(6,2)   =  0;           dM(6,3)   = -1/(c+d)^2;   dM(6,4)   = -1/(c+d)^2;  dM(6,5)   =  0;           dM(6,6)   =  0;           dM(6,7)   = -1/(g+h)^2;   dM(6,8)   = -1/(g+h)^2;

% positive on negative side is the direction of the root (0)
for i=1:6
if Mest(i)<M(i)
dM(i,:)   = -dM(i,:);
end
end

% net derivative
df        =  sum(dM);

% Newton-Raphson step
an        =  a-p*f0(1)/df(1);
bn        =  b-p*f0(2)/df(2);
cn        =  c-p*f0(3)/df(3);
dn        =  d-p*f0(4)/df(4);
en        =  e-p*f0(5)/df(5);
fn        =  f-p*f0(6)/df(6);
gn        =  g-p*f0(7)/df(7);
hn        =  h-p*f0(8)/df(8);

% bound results: do not allow steps to very high or very low values
if an<lo; a = (a+lo)/2;   elseif an>hi; a = (a+hi)/2;  else; a = an;  end
if bn<lo; b = (b+lo)/2;   elseif bn>hi; b = (b+hi)/2;  else; b = bn;  end
if cn<lo; c = (c+lo)/2;   elseif cn>hi; c = (c+hi)/2;  else; c = cn;  end
if dn<lo; d = (d+lo)/2;   elseif dn>hi; d = (d+hi)/2;  else; d = dn;  end
if en<lo; e = (e+lo)/2;   elseif en>hi; e = (e+hi)/2;  else; e = en;  end
if fn<lo; f = (f+lo)/2;   elseif fn>hi; f = (f+hi)/2;  else; f = fn;  end
if gn<lo; g = (g+lo)/2;   elseif gn>hi; g = (g+hi)/2;  else; g = gn;  end
if hn<lo; h = (h+lo)/2;   elseif hn>hi; h = (h+hi)/2;  else; h = hn;  end

% decrease convergence rate
p      =  p.*dec;

% record
Crecs(t+1,:)   =  [a b c d e f g h];
Mrecs(t,:)     =  Mest;
end

fprintf('residual: %f',Mest./M-1)

for i=1:8
figure(1); plot(1:maxsteps+1,Crecs(:,i)); hold on;
end
title('a, b, c, d, e, f, g, h'); xlabel('steps'); hold off;

for i=1:6
figure(2); plot(1:maxsteps,Mrecs(:,i)./M(i)); hold on;
end
title('M_{est}/M_{actual}'); xlabel('steps'); hold off;

zinq
There is nothing ambiguous about asking how to solve a system of equations for the variables. Nothing was mentioned about approximate solutions.

I'm not sure why this problem isn't as easy as was first pointed out by Infrared in #9: Solve for x in terms of y in the first equation, then plug in for x in the second equation, leaving just two equations in y and z that can be easily solved for y and z, then x.

Or the startling insight of epenguin in #10, showing that the three equations can be seen as much simpler equations in 1/x, 1y, 1/z. Substituting say U = 1/x, V = 1/y, W = 1/z you get three linear equations in the three unknowns U, V, W and once solved then you know what x, y, z are.

(All the while realizing that this last approach assumes none of x, y, z is equal to 0. But x = y = z = 0 is arguably one (very boring) solution to the original equations.)

There's no upside to making this much more difficult than necessary!

Last edited:
Hypatio
Each of the 6 equations, from post # 16, has 4 unknowns. I'm not sure how to perform the manipulations to get a straight solution if a,b,c,d,e,f,g, and h are unique, positive, and non-zero.

zinq
I apologize for not reading the entire thread thoroughly, spending some time offline to see that the original question was definitely solvable. (Maybe a new question deserves a separate thread.)

$\frac{ab}{a+b}\leq min(a, b)$
$if (b\gg a) then \frac{ab}{a+b}\approx a$