How to solve a small system of non-linear equations?

In summary: Also, you could add a step to reverse the order of the a_i, b_i, etc in the last four equations to get a better cross section of the solution.In summary, there are multiple techniques that can be used to solve for the values of x, y, and z in a system of equations. These include using Newton-Raphson, analytical methods, or iterative approaches. It is important to carefully consider the problem and choose appropriate starting values to avoid issues such as division by zero. Additionally, techniques such as finding a Groebner basis or using a matrix of coefficients can be helpful in solving more complex systems of equations.
  • #1
Hypatio
151
1
If we have the following relations between x, y, and z:

[tex]M_{xy}=\frac{2xy}{x+y}[/tex]
[tex]M_{xz}=\frac{2xz}{x+z}[/tex]
[tex]M_{yz}=\frac{2yz}{y+z}[/tex]

where [itex]M_{xy}[/itex], [itex]M_{xz}[/itex], and [itex]M_{yz}[/itex] are known constants, what technique can be used to determine the values of x, y, and z?
 
Physics news on Phys.org
  • #2
I would use Newton-Raphson.
Guess, make small changes and see whether they get you closer (or calculate the derivatives directly), then calculate a new guess based on Newton-Raphson.
 
  • #3
Dear @Hypatio,

Do you understand your question is ambiguous ? In the sense it is totally unclear what you are searching: an analytical approach, or a numerical one ? Makes a differnce !
 
  • #4
BvU said:
Dear @Hypatio,

Do you understand your question is ambiguous ? In the sense it is totally unclear what you are searching: an analytical approach, or a numerical one ? Makes a differnce !
Sorry. The technique is something I will program. I prefer an analytical approach unless it is overly complicated.
 
  • #5
.Scott said:
I would use Newton-Raphson.
Guess, make small changes and see whether they get you closer (or calculate the derivatives directly), then calculate a new guess based on Newton-Raphson.
Maybe I am making a mistake, but using a Newton-Raphson like method has extremely poor convergence, or will explode.
 
  • #6
Using the method I suggested, I posed the problem with Mxy,Mxa,Myx = (-5,5,2) and my starting guess was (x,y,z) = (-6,-3,2). I tried to choose a starting guess that would avoid crossing and denominators through zero.

I didn't work. On the 9th iteration, it blew up. "y" got very large as Myz got near 2.4. Perhaps the solution was imaginary.

So I tried it with Mxy,Mxa,Myx = (-12,11,-14) - which I knew had a solution. It seemed to like starting guesses that were in the ball park for magnitude - so I guessed x,y,z=(10,-9,10). It correctly closed on (12.657,-4.07048,9.7363).

So it's a potentially workable approach. I used Excel. I needed to invert a 3x3 matrix (MINVERSE) and multiply a 1x3 with a 3x3 (MMULT).
 
  • #7
Hypatio said:
Maybe I am making a mistake, but using a Newton-Raphson like method has extremely poor convergence.
Yes. When there's stuff in the denominator that can go to zero - your starting guess needs to be chosen to avoid that. So you might have to produce a look-up table based on the initial Mxy:Myz:Mxz ratios to seed the method.
I would use Matlab or something similar to draw the functions out - so I could see where the problems were.
 
  • #8
It is also possible that there is a straight solution.
 
  • #9
You can solve for ##x## in the first equation: ##2xy=M_{xy}(x+y)=xM_{xy}+yM_{xy}## so ##2xy-xM_{xy}=yM_{xy}##. Factoring the lefthandside as ##x(2y-M_{xy})## gives ##x=\frac{yM_{xy}}{2y-M_{xy}}##.

Similarly you can solve for ##z## in the last equation ##z=\frac{yM_{yz}}{2y-M_{yz}}##.

Now substitute both of these into the second equation to get a single equation for ##y##.
 
  • Like
Likes zinq, Hypatio and sysprog
  • #10
Isn't this just an ordinary set of linear equations in the reciprocals, 1/x, 1/y, 1/z ?
 
  • Like
Likes FactChecker, Hypatio, .Scott and 2 others
  • #11
epenguin said:
Isn't this just an ordinary set of linear equations in the reciprocals, 1/x, 1/y, 1/z ?
Yes.
(1/Mxy) is the average of 1/x and 1/y; etc.
 
  • Like
Likes Hypatio
  • #12
Thanks for the helpful comments. However, I was looking at my problem more thoroughly and it looks like what I actually need to solve are the equations:

[tex]ab/(a+b) + ef/(e+f)=U \\ ac/(a+c) + eg/(e+g)=V \\ ad/(a+d) + eh/(e+h)=W \\ bc/(b+c) + fg/(f+g)=X \\ bd/(b+d) + fh/(f+h)=Y \\ cd/(c+d) + gh/(g+h)=Z[/tex]

where U,V,W,X,Y,Z are known and a, b, c, d, e, f, g, h are unknown. These can also be rearranged as

[tex]abe+abf+aef+bef-U(ae+af+be+bf)=0 \\
ace+acg+aeg+ceg-V(ae+ag+ce+cg)=0 \\
ade+adh+aeh+deh-W(ae+ah+de+dh)=0 \\
bcf+bcg+bfg+cfg-X(bf+bg+cf+cg)=0 \\
bdf+bdh+bfh+dfh-Y(bf+bh+df+dh)=0 \\
cdg+cdh+cgh+dgh-Z(cg+ch+dg+dh)=0
[/tex]

Infrared showed that my original problem could be solved easily, but I'm not so sure about this one.

Is it possible to simply represent this system as a matrix of coefficients and just invert? But it is not clear to me how to represent the many products of the unknowns in such a system. I am using MATLAB.

My one concern is that there are 8 unknowns and 6 knowns, so perhaps I must first constrain two of the unknowns before it is solveable.
 
  • #14
Once you find a solution for a, b, c, d, e, f, g, and h; there will be another solution by swapping a, b, c, d with e, f, g, h.
With ##a_i = 1/a \space\space##, ##\space b_i=1/b \space\space##, etc, you get the slightly simpler:

##\space\space (a_i+b_i)(e_i+f_i)U = a_i + b_i +e_i + f_i##
##\space\space (a_i+c_i)(e_i+g_i)V = a_i + c_i +e_i + g_i##
##\space\space (a_i+d_i)(e_i+h_i)W = a_i + d_i +e_i + h_i##
##\space\space (b_i+c_i)(f_i+g_i)X= b_i + c_i + f_i + g_i##
##\space\space (b_i+d_i)(f_i+h_i)Y= b_i + d_i +f_i + h_i##
##\space\space (c_i+d_i)(g_i+h_i)Z= c_i + d_i +g_i + h_i##

If I decided on an iterative solution that attempted to close in on the solution, I would make another change:
##\space\space r=\sqrt(U^2+V^2+W^2+X^2+Y^2+Z^2)##
##\space\space U_j = U/r\space; V_j = V/r\space; etc##
##\space\space a_j = a_i r\space; b_j = b_i r\space; etc##

The equation above then follow this pattern:
##\space\space (a_j+b_j)(e_j+f_j)U_j = a_j + b_j +e_j + f_j##

This, combined with sorting the U, V, W, X, Y, Z by values would help identify and select good seed values for the iterations.
 
  • Like
Likes Hypatio and DaveE
  • #15
Hypatio said:
My one concern is that there are 8 unknowns and 6 knowns, so perhaps I must first constrain two of the unknowns before it is solveable.

Anyone would have to know what problem gave rise to the equations to identify suitable constraints if a unique solution is required.
 
  • #16
I was able to make an Newton-Raphson solver for the system

##\space\space (a+b)(e+f)M_1 = a + b +e + f##
##\space\space (a+c)(e+g)M_2 = a + c +e + g##
##\space\space (a+d)(e+h)M_3 = a + d +e + h##
##\space\space (b+c)(f+g)M_4= b + c + f + g##
##\space\space (b+d)(f+h)M_5= b + d +f + h##
##\space\space (c+d)(g+h)M_6= c + d +g + h##

but there seem to be some bad convergence properties. A straight Newton-Raphson often has poor convergence but I've found that if you modulate the stepping rate by a coefficient p

##\space\space x_{n+1} = x_n-p\frac{f(n_n)}{f'(x_n)}##

with p<1 then convergence can improve. However, I've tried a few different sets of ##M_1##...##M_6## values and the optimal choices for p change, and I don't understand why, so I don't know how to make a universal solver.

The following MATLAB code solves it with p=1/2, and two choices of M coefficients. In the uncommented set, the solution is found in about 20 steps, although oscillations persist. The second set of M coefficients (commented) can be partially solved in about 200 steps if p~0.03, but ##M_6## fails to converge. I've also experimented with letting p decrease with timestep, which can help.

It is also worth mentioning that, for some reason, bounding the coefficients in the way that I have is critical for convergence.

Any idea how to make this better? Thanks for all the help.
Code:
maxsteps          =  200; % max timesteps
p                 =  0.5;    % starting rate
dec               =  1;    % decrease rate by this fraction each step

Mest              =  zeros(6,1);  % initialize
dM                =  zeros(6,8);  % initialize

% known input coefficients (M)
M                 =  [32.5;        3.65384615384615; 3.65384615384615; 3.65384615384615; 3.65384615384615; 2.5];
% M                 =  [4.975388788; 0.083993056;      0.013540861;      0.082389867;      0.013327759;      0.013806649];

% min/max/avg unknowns (a,b,c,d,e,f,g,h)
lo                =   min(1./M)./1000;  % min coefficient value
hi                =   max(1./M).*1000;  % max coefficient value
avg               =  mean(1./M);        % avg coefficient value

% unknown ouput coefficients (a,b,c,d,e,f,g,h)
[a,b,c,d,e,f,g,h] =  deal(avg);         % average 1/M to starting guess

% recording
Crecs             =  zeros(maxsteps,8);
Mrecs             =  zeros(maxsteps,6);
Crecs(1,:)        =  [a b c d e f g h];

for t=1:maxsteps
    % current M estimates
    Mest(1)   =  1/(a+b)+1/(e+f);
    Mest(2)   =  1/(a+c)+1/(e+g);
    Mest(3)   =  1/(a+d)+1/(e+h);
    Mest(4)   =  1/(b+c)+1/(f+g);
    Mest(5)   =  1/(b+d)+1/(f+h);
    Mest(6)   =  1/(c+d)+1/(g+h);
    
    % distance to correct M
    deltaM    =  abs(Mest-M);
    
    % average distance to correct M, for change in a coefficient value
    f0(1)     =  (deltaM(1)+deltaM(2)+deltaM(3))/3;   % average distance at new a
    f0(2)     =  (deltaM(1)+deltaM(4)+deltaM(5))/3;   % average distance at new b
    f0(3)     =  (deltaM(2)+deltaM(4)+deltaM(6))/3;   % average distance at new c
    f0(4)     =  (deltaM(3)+deltaM(5)+deltaM(6))/3;   % average distance at new d
    f0(5)     =  (deltaM(1)+deltaM(2)+deltaM(3))/3;   % average distance at new e
    f0(6)     =  (deltaM(1)+deltaM(4)+deltaM(5))/3;   % average distance at new f
    f0(7)     =  (deltaM(2)+deltaM(4)+deltaM(6))/3;   % average distance at new g
    f0(8)     =  (deltaM(3)+deltaM(5)+deltaM(6))/3;   % average distance at new h
    
    % M derivatives for each coefficient
    % a                       b                         c                         d                        e                         f                         g                         h
    dM(1,1)   = -1/(a+b)^2;   dM(1,2)   = -1/(a+b)^2;   dM(1,3)   =  0;           dM(1,4)   =  0;          dM(1,5)   = -1/(e+f)^2;   dM(1,6)   = -1/(e+f)^2;   dM(1,7)   =  0;           dM(1,8)   =  0;
    dM(2,1)   = -1/(a+c)^2;   dM(2,2)   =  0;           dM(2,3)   = -1/(a+c)^2;   dM(2,4)   =  0;          dM(2,5)   = -1/(e+g)^2;   dM(2,6)   =  0;           dM(2,7)   = -1/(e+g)^2;   dM(2,8)   =  0;
    dM(3,1)   = -1/(a+d)^2;   dM(3,2)   =  0;           dM(3,3)   =  0;           dM(3,4)   = -1/(a+d)^2;  dM(3,5)   = -1/(e+h)^2;   dM(3,6)   =  0;           dM(3,7)   =  0;           dM(3,8)   = -1/(e+h)^2;
    dM(4,1)   =  0;           dM(4,2)   = -1/(b+c)^2;   dM(4,3)   = -1/(b+c)^2;   dM(4,4)   =  0;          dM(4,5)   =  0;           dM(4,6)   = -1/(f+g)^2;   dM(4,7)   = -1/(f+g)^2;   dM(4,8)   =  0;
    dM(5,1)   =  0;           dM(5,2)   = -1/(b+d)^2;   dM(5,3)   =  0;           dM(5,4)   = -1/(b+d)^2;  dM(5,5)   =  0;           dM(5,6)   = -1/(f+h)^2;   dM(5,7)   =  0;           dM(5,8)   = -1/(f+h)^2;
    dM(6,1)   =  0;           dM(6,2)   =  0;           dM(6,3)   = -1/(c+d)^2;   dM(6,4)   = -1/(c+d)^2;  dM(6,5)   =  0;           dM(6,6)   =  0;           dM(6,7)   = -1/(g+h)^2;   dM(6,8)   = -1/(g+h)^2;
    
    % positive on negative side is the direction of the root (0)
    for i=1:6
        if Mest(i)<M(i)
            dM(i,:)   = -dM(i,:);
        end
    end
    
    % net derivative
    df        =  sum(dM);

    % Newton-Raphson step
    an        =  a-p*f0(1)/df(1);
    bn        =  b-p*f0(2)/df(2);
    cn        =  c-p*f0(3)/df(3);
    dn        =  d-p*f0(4)/df(4);
    en        =  e-p*f0(5)/df(5);
    fn        =  f-p*f0(6)/df(6);
    gn        =  g-p*f0(7)/df(7);
    hn        =  h-p*f0(8)/df(8);
    
    % bound results: do not allow steps to very high or very low values
    if an<lo; a = (a+lo)/2;   elseif an>hi; a = (a+hi)/2;  else; a = an;  end
    if bn<lo; b = (b+lo)/2;   elseif bn>hi; b = (b+hi)/2;  else; b = bn;  end
    if cn<lo; c = (c+lo)/2;   elseif cn>hi; c = (c+hi)/2;  else; c = cn;  end
    if dn<lo; d = (d+lo)/2;   elseif dn>hi; d = (d+hi)/2;  else; d = dn;  end
    if en<lo; e = (e+lo)/2;   elseif en>hi; e = (e+hi)/2;  else; e = en;  end
    if fn<lo; f = (f+lo)/2;   elseif fn>hi; f = (f+hi)/2;  else; f = fn;  end
    if gn<lo; g = (g+lo)/2;   elseif gn>hi; g = (g+hi)/2;  else; g = gn;  end
    if hn<lo; h = (h+lo)/2;   elseif hn>hi; h = (h+hi)/2;  else; h = hn;  end
    
    % decrease convergence rate
    p      =  p.*dec;
    
    % record
    Crecs(t+1,:)   =  [a b c d e f g h];
    Mrecs(t,:)     =  Mest;
end

fprintf('residual: %f',Mest./M-1)

for i=1:8
    figure(1); plot(1:maxsteps+1,Crecs(:,i)); hold on;
end
title('a, b, c, d, e, f, g, h'); xlabel('steps'); hold off;

for i=1:6
    figure(2); plot(1:maxsteps,Mrecs(:,i)./M(i)); hold on;
end
title('M_{est}/M_{actual}'); xlabel('steps'); hold off;
 
  • #17
There is nothing ambiguous about asking how to solve a system of equations for the variables. Nothing was mentioned about approximate solutions.

I'm not sure why this problem isn't as easy as was first pointed out by Infrared in #9: Solve for x in terms of y in the first equation, then plug in for x in the second equation, leaving just two equations in y and z that can be easily solved for y and z, then x.

Or the startling insight of epenguin in #10, showing that the three equations can be seen as much simpler equations in 1/x, 1y, 1/z. Substituting say U = 1/x, V = 1/y, W = 1/z you get three linear equations in the three unknowns U, V, W and once solved then you know what x, y, z are.

(All the while realizing that this last approach assumes none of x, y, z is equal to 0. But x = y = z = 0 is arguably one (very boring) solution to the original equations.)

There's no upside to making this much more difficult than necessary!
 
Last edited:
  • #18
Each of the 6 equations, from post # 16, has 4 unknowns. I'm not sure how to perform the manipulations to get a straight solution if a,b,c,d,e,f,g, and h are unique, positive, and non-zero.
 
  • #19
I apologize for not reading the entire thread thoroughly, spending some time offline to see that the original question was definitely solvable. (Maybe a new question deserves a separate thread.)
 
  • #20
Hypatio said:
Thanks for the helpful comments. However, I was looking at my problem more thoroughly and it looks like what I actually need to solve are the equations:

ab/(a+b)+ef/(e+f)=Uac/(a+c)+eg/(e+g)=Vad/(a+d)+eh/(e+h)=Wbc/(b+c)+fg/(f+g)=Xbd/(b+d)+fh/(f+h)=Ycd/(c+d)+gh/(g+h)=Z​
Looks like a set of equations for parallel connections of resistors!

From that standpoint you get the following facts:
[itex] \frac{ab}{a+b}\leq min(a, b)[/itex]
[itex] if (b\gg a) then \frac{ab}{a+b}\approx a[/itex]
 

FAQ: How to solve a small system of non-linear equations?

What is a non-linear equation?

A non-linear equation is an equation in which the variables do not have a constant rate of change. This means that the graph of the equation is not a straight line.

How do I solve a small system of non-linear equations?

To solve a small system of non-linear equations, you can use techniques such as substitution, elimination, or graphing. These techniques involve manipulating the equations to isolate one variable and then substituting it into the other equations to solve for the remaining variables.

What are some common challenges when solving non-linear equations?

Some common challenges when solving non-linear equations include finding the correct solution, dealing with multiple solutions or no solution, and dealing with complex or irrational solutions.

Can I use a calculator to solve non-linear equations?

Yes, you can use a graphing calculator or a scientific calculator with a solve function to help you solve non-linear equations. However, it is important to understand the underlying concepts and techniques for solving these equations rather than relying solely on a calculator.

What real-life applications use non-linear equations?

Non-linear equations are used in many fields of science and engineering, such as physics, chemistry, economics, and biology. They can be used to model complex systems and phenomena, such as population growth, chemical reactions, and motion of objects under the influence of gravity.

Back
Top