Derivation of Bayesian inference in a hidden Markov model

1. Jul 1, 2008

zyh

Hi everyone, I have a problem while learning Bayesian tracking in a hidden Markov model. First, the hidden Markov model is described here:
http://en.wikipedia.org/wiki/Recursive_Bayesian_estimation

Secondly, the problem I encountered is with the following formula ONE:

$$P(X_{i}|y_{0},\ldots,y_{i-1})=\int P(X_{i},X_{i-1}|y_{0},\ldots,y_{i-1})dX_{i-1}$$

and then get formula TWO
$$=\int P(X_{i}|X_{i-1},y_{0},\ldots,y_{i-1})P(X_{i-1}|y_{0},\ldots,y_{i-1})dX_{i-1}$$

then get formula THREE
$$=\int P(X_{i}|X_{i-1})P(X_{i-1}|y_{0},\ldots,y_{i-1})dX_{i-1}$$

How can I get from formula TWO to formula THREE? Why is
$$P(X_{i}|X_{i-1},y_{0},\ldots,y_{i-1})=P(X_{i}|X_{i-1})?$$
Thank you very much for reading my post; any suggestion is appreciated.
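For a discrete state space, the integral in formula THREE becomes a sum, and the prediction step is easy to compute directly. Here is a minimal numerical sketch in Python; the 3-state transition matrix and the filtered posterior are made-up numbers, purely for illustration.

```python
# A[j][k] = P(X_i = k | X_{i-1} = j)  (hypothetical transition matrix)
A = [[0.9, 0.05, 0.05],
     [0.1, 0.8,  0.1 ],
     [0.2, 0.2,  0.6 ]]

# p_prev[j] = P(X_{i-1} = j | y_0, ..., y_{i-1})  (made-up filtered posterior)
p_prev = [0.5, 0.3, 0.2]

# Formula THREE with the integral replaced by a sum over discrete states:
# P(X_i = k | y_0, ..., y_{i-1}) = sum_j P(X_i = k | X_{i-1} = j) P(X_{i-1} = j | y_0..y_{i-1})
p_pred = [sum(p_prev[j] * A[j][k] for j in range(3)) for k in range(3)]

print(p_pred)        # predicted distribution over X_i
print(sum(p_pred))   # still a valid distribution: sums to 1
```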

2. Jul 2, 2008

Well, it's hard to tell exactly what's going on with your notation, but it looks to me like what you want is the probability of the current state of a Markov chain given all previous states (I don't see anything "hidden" in what you've written). In that case, the way you go from step 2 to step 3 is by application of the Markov assumption. I.e., part of the definition of a Markov chain is that, given the previous state, the current state is conditionally independent of all other previous states. Which is to say $P(x_n|x_{n-1}, \ldots, x_0) = P(x_n|x_{n-1})$. So, you get from step 2 to step 3 by simply substituting this into the expression. Again, this equation is part of the definition of a Markov chain.
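To see the Markov assumption concretely, you can build the full joint distribution of a short chain and check that conditioning on an earlier state changes nothing once the previous state is given. A small Python check, with a hypothetical 2-state chain (the numbers are arbitrary):

```python
pi = [0.6, 0.4]               # P(x_0)  (made-up initial distribution)
T = [[0.7, 0.3], [0.2, 0.8]]  # T[a][b] = P(x_n = b | x_{n-1} = a)

# Joint P(x_0, x_1, x_2) built from the chain rule of a Markov chain
joint = {(a, b, c): pi[a] * T[a][b] * T[b][c]
         for a in (0, 1) for b in (0, 1) for c in (0, 1)}

def cond(c, b, a):
    """P(x_2 = c | x_1 = b, x_0 = a), computed from the joint."""
    num = joint[(a, b, c)]
    den = sum(joint[(a, b, cc)] for cc in (0, 1))
    return num / den

# Conditioning on x_0 makes no difference once x_1 is known:
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            assert abs(cond(c, b, a) - T[b][c]) < 1e-12
print("P(x_2 | x_1, x_0) = P(x_2 | x_1) for every state combination")
```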

3. Jul 2, 2008

zyh

I have learned this topic, "hidden Markov models", from *Computer Vision: A Modern Approach*.
Here are the two assumptions:
1. Only the immediate past matters:

This formula is the same as the one you stated:

$$P(X_{i}|X_{1},\ldots X_{i-1})=P(X_{i}|X_{i-1})$$
2. Measurements depend only on the current state:

$$P(Y_{i},Y_{j},\ldots Y_{k}|X_{i})=P(Y_{i}|X_{i})P(Y_{j},\ldots,Y_{k}|X_{i})$$

Why can I get from step 2 to step 3 by simply substituting "assumption one" into the expression? They are different expressions.

Can
$$P(X_{i}|X_{i-1},y_{0},\ldots,y_{i-1})=P(X_{i}|X_{i-1})$$
be proved from "assumption one"?

4. Jul 7, 2008

Okay, so the $y_i$'s are observations, and the $X_i$'s are (hidden) states, right? To demonstrate the result you're after, I'd rewrite $P(X_i | X_{i-1}, y_0, \ldots, y_{i-1})$ using Bayes' theorem, and then apply the assumptions to simplify the result until you get down to $P(X_i|X_{i-1})$.

5. Jul 7, 2008

zyh

Thanks.
Yes, you are right: $$y_i$$ is the observation from hidden state $$X_i$$, and I want to simplify the result to get down to $$P(X_i|X_{i-1})$$. But I can't get the proof.

6. Jul 7, 2008

Well, why don't you show us what you have so far on the proof, and where you get stuck?

7. Jul 8, 2008

zyh

I want to prove the following equation from the "two assumptions" above:
$$P(X_{i}|X_{i-1},y_{0},\ldots,y_{i-1})=P(X_{i}|X_{i-1})$$

I am stuck on proving it; so far I don't have any ideas.

8. Jul 8, 2008

Well, why don't you try my suggestion of rewriting the left-hand side using Bayes' theorem, and then see if you can apply the assumptions to get what you want?

9. Jul 8, 2008

D H

Staff Emeritus
In a discrete-time Markov process, the state $X_n$ at time $t_n$ depends only on the state at the previous time step, $X_{n-1}$. In a hidden Markov process, the state $X_n$ directly determines the observables $Y_n$ at that time, but *the observables do not affect the state*. There is no feedback. The observables are not a part of the state proper. That is why it is called a hidden Markov model.

10. Jul 8, 2008

zyh

Thanks. Frankly, I tried to use Bayes' theorem several times to prove the equation, but I failed.

$$P(X_{i}|X_{i-1},y_{0},\ldots,y_{i-1})=\frac{P(X_{i},X_{i-1},y_{0},\ldots,y_{i-1})}{P(X_{i-1},y_{0},\ldots,y_{i-1})}=?$$

or
$$P(X_{i}|X_{i-1},y_{0},\ldots,y_{i-1})=\frac{P(X_{i-1},y_{0},\ldots,y_{i-1}|X_{i})P(X_{i})}{P(X_{i-1},y_{0},\ldots,y_{i-1})}=?$$

11. Jul 8, 2008

zyh

Thanks DH. Do you mean that, since "the observables do not affect the state", it's obvious that
$$P(X_{i}|X_{i-1},y_{0},\ldots,y_{i-1})=P(X_{i}|X_{i-1})?$$
Because $X_i$ depends only on $X_{i-1}$, and $y_0,\ldots,y_{i-1}$ can't affect $X_i$.

Is my thinking right?

12. Jul 8, 2008

D H

Staff Emeritus
Yes. That's why I put "the observables do not affect the state" in italics. That is what it means for the state to be hidden.

13. Jul 8, 2008

zyh

OK, Thank you everyone, my problem is totally solved!

14. Jul 9, 2008

Err, not so fast people. The point here was to *prove* that the observables don't affect the state (i.e., the current state is conditionally independent of the previous observables, given the previous state). We know this should be the case, but having D H assure us that it is doesn't add up to a proof. Let's also note that the observables *do* affect the probability of being in a particular state if you *don't* know the previous state, so the situation is not quite as simple as it might appear.

As far as the derivation goes, you need to be a little bit tricky with the application of Bayes' Theorem. Specifically, you want all of the terms to still be conditioned on $X_{i-1}$. I.e.:

$$P(X_i|X_{i-1}, y_0, \ldots, y_{i-1}) = \frac{P(y_0, \ldots, y_{i-1}|X_{i-1}, X_i)P(X_i|X_{i-1})}{P(y_0, \ldots, y_{i-1}|X_{i-1})}$$

15. Jul 9, 2008

zyh

I found the trick, called the "conditional Bayes' theorem": http://en.wikipedia.org/wiki/Bayes'_theorem#Extensions_of_Bayes.27_theorem

$$P(A|B\cap C)=\frac{{P(A\cap B\cap C)}}{{P(B\cap C)}}=\frac{{P(B|A\cap C)\, P(A|C)\, P(C)}}{{P(C)\, P(B|C)}}=\frac{{P(B|A\cap C)\, P(A|C)}}{{P(B|C)}}\,.$$

So, each $y_i$ depends only on $X_i$, and the equation proves that:
$$P(y_{0},\ldots,y_{i-1}|X_{i-1},X_{i})=P(y_{0},\ldots,y_{i-1}|X_{i-1})$$

I think it's a full proof now. Thank you very much; I really appreciate your help, and I have gained a lot of knowledge.
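For anyone who wants to convince themselves of the conditional form of Bayes' theorem numerically, here is a brute-force Python check on an arbitrary random joint distribution over three binary variables (the distribution itself is meaningless; only the identity matters):

```python
import itertools
import random

# Build an arbitrary normalized joint P(A, B, C) over binary variables
random.seed(0)
events = list(itertools.product((0, 1), repeat=3))  # (a, b, c) triples
weights = [random.random() for _ in events]
total = sum(weights)
joint = {e: w / total for e, w in zip(events, weights)}

def p(pred):
    """Probability of the set of (a, b, c) outcomes satisfying pred."""
    return sum(pr for e, pr in joint.items() if pred(*e))

# Check P(A | B, C) = P(B | A, C) P(A | C) / P(B | C) for every outcome
for a0 in (0, 1):
    for b0 in (0, 1):
        for c0 in (0, 1):
            p_abc = p(lambda a, b, c: (a, b, c) == (a0, b0, c0))
            p_bc = p(lambda a, b, c: (b, c) == (b0, c0))
            p_ac = p(lambda a, b, c: (a, c) == (a0, c0))
            p_c = p(lambda a, b, c: c == c0)
            lhs = p_abc / p_bc                               # P(A | B, C)
            rhs = (p_abc / p_ac) * (p_ac / p_c) / (p_bc / p_c)
            assert abs(lhs - rhs) < 1e-12
print("conditional Bayes' theorem verified for every outcome")
```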

16. Oct 14, 2008

zyh

Hi, quadraphonics. I am still confused: why is the formula below proved?

$$P(y_{0},\ldots,y_{i-1}|X_{i-1},X_{i})=P(y_{0},\ldots,y_{i-1}|X_{i-1})$$

It still needs to be proved, but I can't find its proof. Can someone help me? Thanks.

17. Oct 15, 2008

This is a consequence of the two assumptions that go into an HMM. First is the Markov assumption on the hidden variable $X_i$:

$$P(X_i | X_1 , \ldots, X_{i-1}, Y_1, \ldots, Y_{i-1}) = P(X_i | X_{i-1})$$

I.e., the current state is conditionally independent of all previous states and observables, given the previous state. You can also flip this around and show that the current state is independent of all future states and observations, given the next state. The next assumption is a similar one on the observable process $Y_i$:

$$P(Y_i | X_1 , \ldots, X_{N}, Y_1, \ldots, Y_{i-1}, Y_{i+1}, \ldots, Y_{N}) = P(Y_i | X_i)$$

where $N$ is the number of observations. That is, the current observable is conditionally independent of everything else, given the current state. To prove the relation in question, first consider a simple case with just two observations:

$$P(Y_1, Y_2 | X_2, X_3) = P(Y_2|X_2, X_3, Y_1)P(Y_1|X_2, X_3)$$

We want to show that $X_3$ can be eliminated from the LHS. The first term on the RHS is independent of both $X_3$ and $Y_1$ according to the second HMM assumption. However, let's go ahead and leave $Y_1$ in place, since we don't want to eliminate it. To eliminate $X_3$ from the second term, introduce $X_1$ and then integrate it back out:

$$P(Y_1 | X_2, X_3) = \sum_{x_1} P(Y_1 | x_1, X_2, X_3) P(x_1 | X_2, X_3)$$

Now, the first term on the RHS is independent of $X_2$ and $X_3$, again owing to the second HMM assumption. But, again, let's leave $X_2$ in anyhow. The second RHS term is independent of $X_3$, according to the first HMM assumption. If you put that all back together, you should see how to prove the relation.
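The relation can also be confirmed by brute force on a small discrete HMM. In the Python sketch below, the 2-state chain and the observation probabilities are made-up numbers; the full joint is built from exactly the two HMM assumptions, and the conditional independence is then checked for every combination of values:

```python
import itertools

pi = [0.6, 0.4]               # P(X_1)  (made-up initial distribution)
T = [[0.7, 0.3], [0.2, 0.8]]  # T[a][b] = P(X_{n+1} = b | X_n = a)
B = [[0.9, 0.1], [0.3, 0.7]]  # B[a][y] = P(Y_n = y | X_n = a)

# Full joint over (x1, x2, x3, y1, y2, y3), built from the two HMM assumptions
joint = {}
for x1, x2, x3, y1, y2, y3 in itertools.product((0, 1), repeat=6):
    joint[(x1, x2, x3, y1, y2, y3)] = (pi[x1] * T[x1][x2] * T[x2][x3]
                                       * B[x1][y1] * B[x2][y2] * B[x3][y3])

def p(pred):
    """Probability of the set of outcomes satisfying pred."""
    return sum(pr for e, pr in joint.items() if pred(*e))

# Check P(Y_1, Y_2 | X_2, X_3) = P(Y_1, Y_2 | X_2) for all values
for y1_, y2_, x2_, x3_ in itertools.product((0, 1), repeat=4):
    num_lhs = p(lambda x1, x2, x3, y1, y2, y3:
                (y1, y2, x2, x3) == (y1_, y2_, x2_, x3_))
    den_lhs = p(lambda x1, x2, x3, y1, y2, y3: (x2, x3) == (x2_, x3_))
    num_rhs = p(lambda x1, x2, x3, y1, y2, y3:
                (y1, y2, x2) == (y1_, y2_, x2_))
    den_rhs = p(lambda x1, x2, x3, y1, y2, y3: x2 == x2_)
    assert abs(num_lhs / den_lhs - num_rhs / den_rhs) < 1e-12
print("P(Y_1, Y_2 | X_2, X_3) = P(Y_1, Y_2 | X_2) holds for all values")
```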


19. Oct 28, 2008

zyh

Yes, I have read your derivation, in which $$X_3$$ is eliminated by decomposing the term into a sum over the states. That's great!
$$P(Y_1, Y_2 | X_2, X_3) = P(Y_1, Y_2 | X_2)$$