# ARMA model/IIR filter in state space

1. Sep 28, 2013

### divB

1. The problem statement, all variables and given/known data

Write an IIR (or ARMA) model in state space representation:

2. Relevant equations

$$y(k) = b_1 y(k-1) + \cdots + b_p y(k-p) + a_0 u(k) + \cdots + a_r u(k-r)$$

3. The attempt at a solution

I found much info how to write it in space state representation but no explanation how to get there. The main problem is how to choose the state vector and how to derive the transition matrix.

For a AR and MA only system, I found the solution (I think): The state vector is just u(k)'s or the y(k)'s: It's just the equation given. Then I used "k+1" instead of "k" in the argument and found that the transition matrix must be the identify matrix.

Can anyone tell me what the "steps" are required to get there? I am confused since I only have one equation given, but I need to come up with two equations ...

Thanks,
divB

2. Sep 29, 2013

### Päällikkö

The idea of state space is that your output (y) at time t depends only on the state of the system (x) at time t and at the input (u) at time t. Furthermore, the state of the system at time t depends only on the state at time t-1 and the input at t-1. This is to say,
y(t) = Cx(t) + Du(t),
x(t) = Ax(t-1) + Bu(t-1)

Intuitively you can think of the equations in the following way: The variable y has no memory, it just reports something instantaneous about the underlying state of the system. However, x does depend on its previous value, thus it has a short memory. We can infact do "tricks" to make this memory arbitrarily long (ie. longer than just the previous value).

So the problem is now how to define the state and if you are familiar with how higher order differential equations are related to systems of first order differential equations, this may be obvious. But let's do a simple problem to illustrate the idea anyway: If y(t) = ay(t-2) + u(t), then this is already in the form of y(t) = Cx(t) + Du(t) if D = 1 and somehow Cx(t) = ay(t-2). Well, by definition x(t) = Ax(t-1) + Bu(t-1), so one might guess that if we can somehow get x to save previous values of y, we will get the wanted result. Now if we store the value of y to x: x(t) = y(t), on the next time step we can recover the previous time step's value: Before storing a new value to x, we can extract the old. However, to get the memory two time steps long we need to do a trick: make x a vector (of length 2) and make internal saves. Again x1(t) = y(t), but x2(t) = x1(t), thus x2 carries the value y had two time steps before. Confused? Let's write the equations out.

Remember that D = 1, and quite obviously C = [0 a] makes sense, so we don't have to deal with so many coefficients inside the state equations (for D the coefficient is unique, for C you can play around with the definition of state and get pretty much anything). This means that the state has to do nothing more than to know what value y had two time steps ago. x(t) = Ax(t-1) + Bu(t-1) by definition. Now A = [0 a; 1 0], B = [1; 0], and we are done. Wait, what, how did I arrive at that? Well we stored the previous value of y to x1: x1(t) = y(t-1) and the previous value of x1 to x2: x2(t) = x1(t-1). The latter is written in the second row of the matrix A and the vector B, but for the first row we need to take the definition of y: x1(t) = y(t-1) = Cx(t-1) + Du(t-1) = ax2(t-1) + u(t-1). I hope it is obvious now that A and B are as claimed before.

It's been a while I did these, hopefully the representation was clear and without errors, and you are now able to tackle the full problem of a general IIR filter. I do seem to remember that the states in x are traditionally defined the opposite way, ie. x1 carries the oldest values of y and xn the latest. This just changes some of the rows in the matrices, but does nothing fundamental, so don't be superworried if Matlab gives you matrices that are a little bit different from what was shown here.

3. Sep 29, 2013

### divB

Hi

Thanks, that's a really great explanation!

I fully understand how I obtain the second equation, y(n)=c^Tx+du, from the initially given difference equation. And I think I can see that for the case of a AR-only or MA-only system the transition matrix is just the identity matrix. Please correct me if I am already wrong here:

MA case:
$$y(k) = [ a_1 , \cdots a_r ] x + 1 u(k) \\ x(k) = [u(k-1) , u(k-2), \cdots, u(k-r) ]^T \\ x(k+1) = I\cdot x(k)$$

Similarly for the AR case where

$$y(k) = [ b_1 , \cdots b_p ] x + 1 u(k) \\ x(k) = [y(k-1) , y(k-2), \cdots, y(k-p) ]^T \\ x(k+1) = I\cdot x(k)$$

However, I am stuck when I have both of them. I think I would do the same having an (r+p)-length state vector. However, intuition tells me, that a max{p,r}-length state vector should suffice.

(BTW: I know that there are many solutions and the (r+p) variant might work)

I tried using $x_1(k) = b_1y(n-1) + a_1 u(n-1)$ etc. as state vector. However, in this case it seems to be impossible to find a matching transition matrix. I think I still do not see how the "receipt" how I can transform any such differential/difference equation to state space :-(

I guess the first step is always using the difference equation and transforming the y(k)= equation into the the y=c^t x + du equation. Is this commonly right? Or is even there a difference, based on which state vector I choose?

Thanks again,
divB

4. Sep 30, 2013

### Päällikkö

While the idea is to write x in terms on y, the final equations cannot depend on y. The idea is to replace y with a well picked x This means that your equation for the state of AR:
$x(k) = [y(k-1), y(k-2), \dots, y(k-p)]^T$
$x(k+1) = Ix(k)$
is wrong.

Now, you do have the right idea, but we just need to mathematically to change it into a more convenient form.

I'll explain again why, in a computer algorithm, and then write the equations formally.
Suppose you have a for loop going from t = 0 to t = T, and that at the end of each loop y is printed. The for loop has only one function, stateComp. The function has two outputs, y, and x. It has two inputs: x (before it is modified by the function) and u, a predefined value.I hope you are familiar with some rudimentary C, so you understand that this is precisely what the code below does:

I assume for notational simplicity that the output y is a scalar (and thus so is also u)
Code (Text):

// Initialization of x (an array), u (from listu, the list of all inputs), y at time t = 0
u = listu[0];
// etc

// Main program
for(int t=1; t <= T; t++) {
stateComp(&x, u);
u = listu[t];
y = matrixMultiply(&C, &x) + D*u;
printf("y(%d) = %f\n", t, y);
}

Programming this stateComp function and setting the constants C and D is equivalent to formulating the problem in state space. Notice that the function does not depend directly on y, so neither can the equations. It's output is the modified x. (I have omitted the fact that the functions would also need as an input, or as a global variable, the length of the vectors, so if you actually want to implement this, keep this in mind). As I am unsure if you are familiar with programming languages or if this analogy is helping at all, I will not write the stateComp function itself.

What I hope the coding example clarified is that x does not depend on y. x is the state of the system, and with y you can output some of its aspects that you are interested in (your state variables might be the position and velocity of a car, but maybe you are only interested in plotting the position). Now onto fixing the equations
$x(k) = [y(k-1), y(k-2), \dots, y(k-p)]^T$
This is a "definition" of x, of how we wish each element x to correspond to y. This is fine. Note that this only conveys an idea, and not how x behaves in time.
$x(k+1) = Ix(k)$
This equation tells you how x changes in time, by definition: It relates x at the next time step to the current one. However, your equation says that x at the next time step is the same as x now. Thus it claims that x never changes. Clearly this is wrong.
Let's combine the two equations and rewrite:
$x(k+1) = [y(k-1), y(k-2), \dots, y(k-p)]^T$
Now this is an x that does depend on time: y changes and with it, x. However, the state is not supposed to depend on y, so this is not a permissible equation. Note that here I only gave reasons of why none of the equations work. So what to do?

If we want
$x(k) = [y(k-1), y(k-2), \dots, y(k-p)]^T$,
then clearly, if we increment the time:
$x(k+1) = [y(k), y(k-1), \dots, y(k-p+1)]^T$,
write the first equation back here and we get:
$x(k+1) = [y(k), x_1(k), \dots, x_{p-1}(k)]^T$,
right? Now what remains is to get rid of y(k) and replace it with a function of x(k). But this is precisely the defining equation of AR (I've dropped the input u for simplicity, I'll let think how to inject that into the equations, for hints see the example in my previous post):
$y(k) = [b_1, \dots, b_p]x$
Thus, simply
$x(k+1) = [ b_1x_1(k) + \dots+ b_px_p(k), x_1(k), \dots, x_{p-1}(k)]^T$,
which you can write in matrix format.

The MA case is also wrong: Looking at the evolution equation of x (x(k+1) = x(k)), it is apparent that the state never changes. The fix is similar to the one above:
If we want the state to remember the old values of u:
$x(k) = [u(k-1), u(k-2), \dots, u(k-p)]^T$,
then
$x(k+1) = [u(k), u(k-1), \dots, u(k-p+1)]^T$,
or in other words
$x(k+1) = [u(k), x_1(k), \dots, x_{p-1}(k)]^T$.
Remember that u(k) dependence is fine, so we are done.

I hope this clarified, not confused, you.

5. Oct 4, 2013

### divB

Thanks for your effort! No I finally got it.