Undergrad How to solve a small system of non-linear equations?

Hypatio · Mar 24, 2020

If we have the following relations between x, y, and z:

M_{xy}=\frac{2xy}{x+y}
M_{xz}=\frac{2xz}{x+z}
M_{yz}=\frac{2yz}{y+z}

where M_{xy}, M_{xz}, and M_{yz} are known constants, what technique can be used to determine the values of x, y, and z?

.Scott · Mar 24, 2020

I would use Newton-Raphson.
Guess, make small changes and see whether they get you closer (or calculate the derivatives directly), then calculate a new guess based on Newton-Raphson.

BvU · Mar 24, 2020

Dear @Hypatio,

Do you understand your question is ambiguous ? In the sense it is totally unclear what you are searching: an analytical approach, or a numerical one ? Makes a differnce !

Hypatio · Mar 24, 2020

BvU said:

Dear @Hypatio,

Do you understand your question is ambiguous ? In the sense it is totally unclear what you are searching: an analytical approach, or a numerical one ? Makes a differnce !

Sorry. The technique is something I will program. I prefer an analytical approach unless it is overly complicated.

Hypatio · Mar 24, 2020

.Scott said:

I would use Newton-Raphson.
Guess, make small changes and see whether they get you closer (or calculate the derivatives directly), then calculate a new guess based on Newton-Raphson.

Maybe I am making a mistake, but using a Newton-Raphson like method has extremely poor convergence, or will explode.

.Scott · Mar 24, 2020

Using the method I suggested, I posed the problem with Mxy,Mxa,Myx = (-5,5,2) and my starting guess was (x,y,z) = (-6,-3,2). I tried to choose a starting guess that would avoid crossing and denominators through zero.

I didn't work. On the 9th iteration, it blew up. "y" got very large as Myz got near 2.4. Perhaps the solution was imaginary.

So I tried it with Mxy,Mxa,Myx = (-12,11,-14) - which I knew had a solution. It seemed to like starting guesses that were in the ball park for magnitude - so I guessed x,y,z=(10,-9,10). It correctly closed on (12.657,-4.07048,9.7363).

So it's a potentially workable approach. I used Excel. I needed to invert a 3x3 matrix (MINVERSE) and multiply a 1x3 with a 3x3 (MMULT).

.Scott · Mar 24, 2020

Hypatio said:

Maybe I am making a mistake, but using a Newton-Raphson like method has extremely poor convergence.

Yes. When there's stuff in the denominator that can go to zero - your starting guess needs to be chosen to avoid that. So you might have to produce a look-up table based on the initial Mxy:Myz:Mxz ratios to seed the method.
I would use Matlab or something similar to draw the functions out - so I could see where the problems were.

.Scott · Mar 24, 2020

It is also possible that there is a straight solution.

Infrared · Mar 24, 2020

You can solve for ##x## in the first equation: ##2xy=M_{xy}(x+y)=xM_{xy}+yM_{xy}## so ##2xy-xM_{xy}=yM_{xy}##. Factoring the lefthandside as ##x(2y-M_{xy})## gives ##x=\frac{yM_{xy}}{2y-M_{xy}}##.

Similarly you can solve for ##z## in the last equation ##z=\frac{yM_{yz}}{2y-M_{yz}}##.

Now substitute both of these into the second equation to get a single equation for ##y##.

epenguin · Mar 24, 2020

Isn't this just an ordinary set of linear equations in the reciprocals, 1/x, 1/y, 1/z ?

.Scott · Mar 25, 2020

epenguin said:

Isn't this just an ordinary set of linear equations in the reciprocals, 1/x, 1/y, 1/z ?

Yes.
(1/Mxy) is the average of 1/x and 1/y; etc.

Hypatio · Mar 28, 2020

Thanks for the helpful comments. However, I was looking at my problem more thoroughly and it looks like what I actually need to solve are the equations:

ab/(a+b) + ef/(e+f)=U \\ ac/(a+c) + eg/(e+g)=V \\ ad/(a+d) + eh/(e+h)=W \\ bc/(b+c) + fg/(f+g)=X \\ bd/(b+d) + fh/(f+h)=Y \\ cd/(c+d) + gh/(g+h)=Z

where U,V,W,X,Y,Z are known and a, b, c, d, e, f, g, h are unknown. These can also be rearranged as

abe+abf+aef+bef-U(ae+af+be+bf)=0 \\ ace+acg+aeg+ceg-V(ae+ag+ce+cg)=0 \\ ade+adh+aeh+deh-W(ae+ah+de+dh)=0 \\ bcf+bcg+bfg+cfg-X(bf+bg+cf+cg)=0 \\ bdf+bdh+bfh+dfh-Y(bf+bh+df+dh)=0 \\ cdg+cdh+cgh+dgh-Z(cg+ch+dg+dh)=0 

Infrared showed that my original problem could be solved easily, but I'm not so sure about this one.

Is it possible to simply represent this system as a matrix of coefficients and just invert? But it is not clear to me how to represent the many products of the unknowns in such a system. I am using MATLAB.

My one concern is that there are 8 unknowns and 6 knowns, so perhaps I must first constrain two of the unknowns before it is solveable.

Stephen Tashi · Mar 28, 2020

The theoretical way to solve polynomial equations is: http://www.scholarpedia.org/article/Groebner_basis

Perhaps you can find MATLAB library functions that implement that technique.

.Scott · Mar 28, 2020

Once you find a solution for a, b, c, d, e, f, g, and h; there will be another solution by swapping a, b, c, d with e, f, g, h.
With ##a_i = 1/a \space\space##, ##\space b_i=1/b \space\space##, etc, you get the slightly simpler:

##\space\space (a_i+b_i)(e_i+f_i)U = a_i + b_i +e_i + f_i##
##\space\space (a_i+c_i)(e_i+g_i)V = a_i + c_i +e_i + g_i##
##\space\space (a_i+d_i)(e_i+h_i)W = a_i + d_i +e_i + h_i##
##\space\space (b_i+c_i)(f_i+g_i)X= b_i + c_i + f_i + g_i##
##\space\space (b_i+d_i)(f_i+h_i)Y= b_i + d_i +f_i + h_i##
##\space\space (c_i+d_i)(g_i+h_i)Z= c_i + d_i +g_i + h_i##

If I decided on an iterative solution that attempted to close in on the solution, I would make another change:
##\space\space r=\sqrt(U^2+V^2+W^2+X^2+Y^2+Z^2)##
##\space\space U_j = U/r\space; V_j = V/r\space; etc##
##\space\space a_j = a_i r\space; b_j = b_i r\space; etc##

The equation above then follow this pattern:
##\space\space (a_j+b_j)(e_j+f_j)U_j = a_j + b_j +e_j + f_j##

This, combined with sorting the U, V, W, X, Y, Z by values would help identify and select good seed values for the iterations.

epenguin · Mar 29, 2020

Hypatio said:

My one concern is that there are 8 unknowns and 6 knowns, so perhaps I must first constrain two of the unknowns before it is solveable.

Anyone would have to know what problem gave rise to the equations to identify suitable constraints if a unique solution is required.

Hypatio · Apr 2, 2020

I was able to make an Newton-Raphson solver for the system

##\space\space (a+b)(e+f)M_1 = a + b +e + f##
##\space\space (a+c)(e+g)M_2 = a + c +e + g##
##\space\space (a+d)(e+h)M_3 = a + d +e + h##
##\space\space (b+c)(f+g)M_4= b + c + f + g##
##\space\space (b+d)(f+h)M_5= b + d +f + h##
##\space\space (c+d)(g+h)M_6= c + d +g + h##

but there seem to be some bad convergence properties. A straight Newton-Raphson often has poor convergence but I've found that if you modulate the stepping rate by a coefficient p

##\space\space x_{n+1} = x_n-p\frac{f(n_n)}{f'(x_n)}##

with p<1 then convergence can improve. However, I've tried a few different sets of ##M_1##...##M_6## values and the optimal choices for p change, and I don't understand why, so I don't know how to make a universal solver.

The following MATLAB code solves it with p=1/2, and two choices of M coefficients. In the uncommented set, the solution is found in about 20 steps, although oscillations persist. The second set of M coefficients (commented) can be partially solved in about 200 steps if p~0.03, but ##M_6## fails to converge. I've also experimented with letting p decrease with timestep, which can help.

It is also worth mentioning that, for some reason, bounding the coefficients in the way that I have is critical for convergence.

Any idea how to make this better? Thanks for all the help.

Code:

maxsteps          =  200; % max timesteps
p                 =  0.5;    % starting rate
dec               =  1;    % decrease rate by this fraction each step

Mest              =  zeros(6,1);  % initialize
dM                =  zeros(6,8);  % initialize

% known input coefficients (M)
M                 =  [32.5;        3.65384615384615; 3.65384615384615; 3.65384615384615; 3.65384615384615; 2.5];
% M                 =  [4.975388788; 0.083993056;      0.013540861;      0.082389867;      0.013327759;      0.013806649];

% min/max/avg unknowns (a,b,c,d,e,f,g,h)
lo                =   min(1./M)./1000;  % min coefficient value
hi                =   max(1./M).*1000;  % max coefficient value
avg               =  mean(1./M);        % avg coefficient value

% unknown ouput coefficients (a,b,c,d,e,f,g,h)
[a,b,c,d,e,f,g,h] =  deal(avg);         % average 1/M to starting guess

% recording
Crecs             =  zeros(maxsteps,8);
Mrecs             =  zeros(maxsteps,6);
Crecs(1,:)        =  [a b c d e f g h];

for t=1:maxsteps
    % current M estimates
    Mest(1)   =  1/(a+b)+1/(e+f);
    Mest(2)   =  1/(a+c)+1/(e+g);
    Mest(3)   =  1/(a+d)+1/(e+h);
    Mest(4)   =  1/(b+c)+1/(f+g);
    Mest(5)   =  1/(b+d)+1/(f+h);
    Mest(6)   =  1/(c+d)+1/(g+h);
    
    % distance to correct M
    deltaM    =  abs(Mest-M);
    
    % average distance to correct M, for change in a coefficient value
    f0(1)     =  (deltaM(1)+deltaM(2)+deltaM(3))/3;   % average distance at new a
    f0(2)     =  (deltaM(1)+deltaM(4)+deltaM(5))/3;   % average distance at new b
    f0(3)     =  (deltaM(2)+deltaM(4)+deltaM(6))/3;   % average distance at new c
    f0(4)     =  (deltaM(3)+deltaM(5)+deltaM(6))/3;   % average distance at new d
    f0(5)     =  (deltaM(1)+deltaM(2)+deltaM(3))/3;   % average distance at new e
    f0(6)     =  (deltaM(1)+deltaM(4)+deltaM(5))/3;   % average distance at new f
    f0(7)     =  (deltaM(2)+deltaM(4)+deltaM(6))/3;   % average distance at new g
    f0(8)     =  (deltaM(3)+deltaM(5)+deltaM(6))/3;   % average distance at new h
    
    % M derivatives for each coefficient
    % a                       b                         c                         d                        e                         f                         g                         h
    dM(1,1)   = -1/(a+b)^2;   dM(1,2)   = -1/(a+b)^2;   dM(1,3)   =  0;           dM(1,4)   =  0;          dM(1,5)   = -1/(e+f)^2;   dM(1,6)   = -1/(e+f)^2;   dM(1,7)   =  0;           dM(1,8)   =  0;
    dM(2,1)   = -1/(a+c)^2;   dM(2,2)   =  0;           dM(2,3)   = -1/(a+c)^2;   dM(2,4)   =  0;          dM(2,5)   = -1/(e+g)^2;   dM(2,6)   =  0;           dM(2,7)   = -1/(e+g)^2;   dM(2,8)   =  0;
    dM(3,1)   = -1/(a+d)^2;   dM(3,2)   =  0;           dM(3,3)   =  0;           dM(3,4)   = -1/(a+d)^2;  dM(3,5)   = -1/(e+h)^2;   dM(3,6)   =  0;           dM(3,7)   =  0;           dM(3,8)   = -1/(e+h)^2;
    dM(4,1)   =  0;           dM(4,2)   = -1/(b+c)^2;   dM(4,3)   = -1/(b+c)^2;   dM(4,4)   =  0;          dM(4,5)   =  0;           dM(4,6)   = -1/(f+g)^2;   dM(4,7)   = -1/(f+g)^2;   dM(4,8)   =  0;
    dM(5,1)   =  0;           dM(5,2)   = -1/(b+d)^2;   dM(5,3)   =  0;           dM(5,4)   = -1/(b+d)^2;  dM(5,5)   =  0;           dM(5,6)   = -1/(f+h)^2;   dM(5,7)   =  0;           dM(5,8)   = -1/(f+h)^2;
    dM(6,1)   =  0;           dM(6,2)   =  0;           dM(6,3)   = -1/(c+d)^2;   dM(6,4)   = -1/(c+d)^2;  dM(6,5)   =  0;           dM(6,6)   =  0;           dM(6,7)   = -1/(g+h)^2;   dM(6,8)   = -1/(g+h)^2;
    
    % positive on negative side is the direction of the root (0)
    for i=1:6
        if Mest(i)<M(i)
            dM(i,:)   = -dM(i,:);
        end
    end
    
    % net derivative
    df        =  sum(dM);

    % Newton-Raphson step
    an        =  a-p*f0(1)/df(1);
    bn        =  b-p*f0(2)/df(2);
    cn        =  c-p*f0(3)/df(3);
    dn        =  d-p*f0(4)/df(4);
    en        =  e-p*f0(5)/df(5);
    fn        =  f-p*f0(6)/df(6);
    gn        =  g-p*f0(7)/df(7);
    hn        =  h-p*f0(8)/df(8);
    
    % bound results: do not allow steps to very high or very low values
    if an<lo; a = (a+lo)/2;   elseif an>hi; a = (a+hi)/2;  else; a = an;  end
    if bn<lo; b = (b+lo)/2;   elseif bn>hi; b = (b+hi)/2;  else; b = bn;  end
    if cn<lo; c = (c+lo)/2;   elseif cn>hi; c = (c+hi)/2;  else; c = cn;  end
    if dn<lo; d = (d+lo)/2;   elseif dn>hi; d = (d+hi)/2;  else; d = dn;  end
    if en<lo; e = (e+lo)/2;   elseif en>hi; e = (e+hi)/2;  else; e = en;  end
    if fn<lo; f = (f+lo)/2;   elseif fn>hi; f = (f+hi)/2;  else; f = fn;  end
    if gn<lo; g = (g+lo)/2;   elseif gn>hi; g = (g+hi)/2;  else; g = gn;  end
    if hn<lo; h = (h+lo)/2;   elseif hn>hi; h = (h+hi)/2;  else; h = hn;  end
    
    % decrease convergence rate
    p      =  p.*dec;
    
    % record
    Crecs(t+1,:)   =  [a b c d e f g h];
    Mrecs(t,:)     =  Mest;
end

fprintf('residual: %f',Mest./M-1)

for i=1:8
    figure(1); plot(1:maxsteps+1,Crecs(:,i)); hold on;
end
title('a, b, c, d, e, f, g, h'); xlabel('steps'); hold off;

for i=1:6
    figure(2); plot(1:maxsteps,Mrecs(:,i)./M(i)); hold on;
end
title('M_{est}/M_{actual}'); xlabel('steps'); hold off;

zinq · Apr 4, 2020

There is nothing ambiguous about asking how to solve a system of equations for the variables. Nothing was mentioned about approximate solutions.

I'm not sure why this problem isn't as easy as was first pointed out by Infrared in #9: Solve for x in terms of y in the first equation, then plug in for x in the second equation, leaving just two equations in y and z that can be easily solved for y and z, then x.

Or the startling insight of epenguin in #10, showing that the three equations can be seen as much simpler equations in 1/x, 1y, 1/z. Substituting say U = 1/x, V = 1/y, W = 1/z you get three linear equations in the three unknowns U, V, W and once solved then you know what x, y, z are.

(All the while realizing that this last approach assumes none of x, y, z is equal to 0. But x = y = z = 0 is arguably one (very boring) solution to the original equations.)

There's no upside to making this much more difficult than necessary!

Hypatio · Apr 4, 2020

Each of the 6 equations, from post # 16, has 4 unknowns. I'm not sure how to perform the manipulations to get a straight solution if a,b,c,d,e,f,g, and h are unique, positive, and non-zero.

zinq · Apr 4, 2020

I apologize for not reading the entire thread thoroughly, spending some time offline to see that the original question was definitely solvable. (Maybe a new question deserves a separate thread.)

Svein · Apr 28, 2020

Hypatio said:

Thanks for the helpful comments. However, I was looking at my problem more thoroughly and it looks like what I actually need to solve are the equations:

ab/(a+b)+ef/(e+f)=Uac/(a+c)+eg/(e+g)=Vad/(a+d)+eh/(e+h)=Wbc/(b+c)+fg/(f+g)=Xbd/(b+d)+fh/(f+h)=Ycd/(c+d)+gh/(g+h)=Z

Looks like a set of equations for parallel connections of resistors!

From that standpoint you get the following facts:
\frac{ab}{a+b}\leq min(a, b)
if (b\gg a) then \frac{ab}{a+b}\approx a

Undergrad How to solve a small system of non-linear equations?

Similar threads

Undergrad About the existence of Hamel basis for vector spaces

Undergrad How to define a vector field?

Undergrad Can one find a matrix that's 'unique' to a collection of eigenvectors?

Undergrad Clarifications needed for an exercise in Hungerford's abstract algebra

Undergrad Equivalent definitons for primitive polynomials

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Insights What Exactly is Dirac’s Delta Function? - Insight

Insights Relativator (Circular Slide-Rule): Simulated with Desmos - Insight

Insights Fixing Things Which Can Go Wrong With Complex Numbers