# Approximating Non Linear Systems by Using The Matrix Eponential

1. Jul 17, 2009

### John Creighto

I tried posting this to my blog but the preview function wasn't rendering the formula's correctly and once I posted it I couldn't edit it.

https://www.physicsforums.com/blog.php?b=1152 [Broken]

Therefor I'll post it here instead.

For simplicity let's consider a very simple ODE.

$$\dot{x_1}=a x_1^2$$

We can approximate this first order system with a second order ODE as follows:

$$\left[ \begin{array}{c} \dot{x_1} \\ \dot{x_2} \end{array} \right] = \left[ \begin{array}{ccc} 0 & 1 \\ 0 & \frac{d f(x_1)}{d x_1} \end{array} \right] \left[ \begin{array}{c} x_1 \\ x_2 \end{array} \right]$$

Where

$$x_2=\frac{dx_1}{dt}$$

Or in the simple case mentioned above we have:

$$\left[ \begin{array}{c} \dot{x_1} \\ \dot{x_2} \end{array} \right] = \left[ \begin{array}{ccc} 0 & 1 \\ 0 & 2x_1(t_o) \end{array} \right] \left[ \begin{array}{c} x_1 \\ x_2 \end{array} \right]$$

Using the matrix exponential the solution to the linear approximation of this stem as follows:

$$\left[ \begin{array}{c} x(t) \\ \dot{x}(t) \end{array} \right] = exp \left( \left[ \begin{array}{ccc} 0 & 1 \\ 0 & 2x_1(t_o) \end{array} \right] (t-t_o) \right) \left[ \begin{array}{c} x(t_o)) \\ \dot{x}(t_o) \end{array} \right]$$

Where:

$$\dot{x}(t_o)=ax(t_o)^2$$

Keep in mind the choice of using a second order approximation was somewhat arbitrary. We could of equally well, done a third order approximation as follows:

$$\left[ \begin{array}{c} \dot{x_1} \\ \dot{x_2} \\ \dot{x_3} \end{array} \right] = \left[ \begin{array}{ccc} 0 & 1 & 0\\ 0 & 0 & 1\\ 0 & 2 a & 2 a x(t_o) \end{array} \right] \left[ \begin{array}{c} x_1 \\ x_2 \\ x_3 \end{array} \right]$$

Where

$$x_2=\frac{dx_1}{dt}, x_3=\frac{d^2x_1}{dt^2}$$

and

$$\dot{x}(t_o)=ax(t_o)^2$$
$$\ddot{x}(t_o)=2 a x(t_o) \dot{x}(t_o)=2a^2x(t_o)^3$$

Which can also be solved using the matrix exponential.

Last edited by a moderator: May 4, 2017
2. Jul 18, 2009

### John Creighto

Scaling and Squaring

In the last post it was shown how to approximate a non linear system with a linear system of arbitrary order. This has several advantages over standard ODE numeric algorithms.

1) There is the potential to take larger step sizes.
2) It is useful in estimation problems because we can derive linear relationships between the current state and the future value.
3) We can develop expressions for error propagation
4) It allows if desired the separation of fast and slow dynamics (note that this is also done in ODE solvers for stiff systems)
5) Error propagation can be reduced by the use of scaling and squaring.

1) The computation of the matrix exponential could be complex for high order systems.
2) Numerical problems could arise if there are high order poles or poles which are close together.
3) Some linearities may not be approximated well with linear systems.

Now, well the algorithm allows for the potential of large step sizes this is more a matter of efficiency then numerical accuracy unless the matrix exponential is computed by using matrix denationalization. Matrix diagonalization only tends to be more accurate if the state space matrix is near symmetric (loosely coupled systems). Otherwise scaling and squaring algorithms tend to be more reliable in computing the matrix exponential.

The scaling and squaring principle works on the premises that:

$$exp(A)=exp(A/s)^s$$

Where $$A$$ is a matrix and $$s$$ is a constant.

(note that it is easy to show that is in gneral doesn't work if $$s$$ is a matrix unless $$A$$ and $$S$$ commute.)

$$A$$ is scaled by $$s$$ to be sufficiently small so that $$exp(A/s)$$ can be approximated with a pade approximation. The squaring is done recursively. That is:

$$A^n=A^{n/2}A^{n/2}$$

This reduces numerical error and the number of multiplications.

The choice of $$s$$ is critical because if $$s$$ is too large then there will be too much error introduced by the pade approximation but if $$s$$ is too small then there will be too much error introduced by the squaring part of the algorithm.

It should be noted that when solving the differential equation we are computing the matrix exponential of $$At$$. If $$s$$ is our scaling factor then as long as the time step is not less then $$t/s$$ we will not introduce any additional error by taking intermediate steps. Therefor if our only concern is accuracy we should compute the matrix exponential of $$At/s$$ for each time step.

If we were naive then we could now solve the differential equation as follows:

$$\boldsymbol{x}(t+t/s)=exp(A(t)t/s)\boldsymbol{x}(t)$$

This has the advantage over taking large time steps in that we can recompute the matrix exponential each time step but the disadvantage that it requires n multiplication for n step. This could result in large error propagation.

Well, we may need to do this to initially compute our state space matrix for the next time step once we have initialized the state space matrices we can reculculate the value of the the states based on large time steps using the following recursive relationship.

$$exp(A(t_1,t_2))=exp \left( A \left( t_1,\frac{t_2+t_1}{2} \right) \right) exp \left( A \left( \frac{t_2+t_1}{2},t_1 \right) \right)$$

Where the recursion is terminated when $$t_2-t_1$$ is sufficiently small that we can approximate an average state transition matrix over the interval $$[t_2,t_1]$$. Example approximations include:

$$exp(A(t_1,t_2)) \approx exp(A(t_1)(t_2-t_1))$$
$$exp(A(t_1,t_2)) \approx exp \left( \frac{A(t_1)+A(t_2)}{2} (t_2-t_1) \right) \right)$$

We expect good numerical accuracy if we can make the above approximation when $$t_2-t_1$$ is above or at least not a lot smaller then the chosen scale factor $$s$$.

Intuitively this scaling and squaring should improve numerical accuracy as additions and subtractions will take place with respect to numbers that will likely be more simmilar in magnitude. (I'll see if I can find a proof).

3. Jul 20, 2009

### John Creighto

In my firt Post we considered second order approximations:

$$\left[ \begin{array}{c} \dot{x_1} \\ \dot{x_2} \end{array} \right] = \left[ \begin{array}{ccc} 0 & 1 \\ 0 & \frac{d f(x_1)}{d x_1} \end{array} \right] \left[ \begin{array}{c} x_1 \\ x_2 \end{array} \right]$$

of the system:

$$\dot{x_1}=a x_1^2$$

Giving the linear approximation:

$$\left[ \begin{array}{c} \dot{x_1} \\ \dot{x_2} \end{array} \right] = \left[ \begin{array}{ccc} 0 & 1 \\ 0 & 2x_1(t_o) \end{array} \right] \left[ \begin{array}{c} x_1 \\ x_2 \end{array} \right]$$

In this approximation we:
- Derive an expression for $$\ddot{x}$$ from the first order differential equation.
-The $$\ddot{x}$$ is approximated as a linear function of the first derivative.
-The $$x$$ is computed as the integral of $$\dot{x}$$.

Because the resulting differential exultation computes $$x$$ purely as an integral of $$\dot{x}$$, the computation is metastable. This could unnecessarily increase error propagation. To correct this we should feed back information about $$x$$ into the computation of $$\ddot{x}$$.

The error in our computation of $$\ddot{x}$$ is given by:

$$e(\ddot{x})=\left(\frac{df(x)}{dx}-\frac{df(x(t_o))}{dx}\right)(\dot{x}-\dot{x}_)$$
Or in our example:
$$e(\ddot{x})=2\left(x-x_o\right)(\dot{x}-\dot{x}_o)$$
$$e(\ddot{x})=2\left(x-x_o\right)(x^2-\dot{x}_o)$$
$$e(\ddot{x})=2(x^3-x\dot{x}_o-x_ox^2+\dot{x}_ox_o)$$
differentiating with respect to x:
$$\dot{e}(\ddot{x})=2(3x^2-\dot{x}_o-2x_ox)$$
$$\dot{e}(\ddot{x})=4x_o^2-4x_o^3$$

$$\left[ \begin{array}{c} \dot{x_1} \\ \dot{x_2} \end{array} \right] = \left[ \begin{array}{ccc} 0 & 1 \\ (4x_o^2-4x_o^3) & 2x_1(t_o) \end{array} \right] \left[ \begin{array}{c} x_1 \\ x_2 \end{array} \right] + \left[ \begin{array}{c} 0 \\ -(4x_o^2-4x_o^3)x_o \end{array} \right]$$

Last edited: Jul 20, 2009