Derivation of Bayesian inference in a Hidden Markov model

In summary: in a hidden Markov model the hidden state evolves as a Markov chain and determines the observables, but the observables do not feed back into the state. The state itself is never observed directly, which is why it is called a hidden Markov model.
  • #1
zyh
Hi everyone, I have a problem while learning Bayesian tracking in a Hidden Markov model. First, the Hidden Markov model is described here:
http://en.wikipedia.org/wiki/Recursive_Bayesian_estimation

Second, the problem I encountered is in the following formula ONE:


[tex]$P(X_{i}|y_{0},\ldots,y_{i-1})=\int P(X_{i},X_{i-1}|y_{0},\ldots,y_{i-1})dX_{i-1}$[/tex]

and then get formula TWO:
[tex]$=\int P(X_{i}|X_{i-1},y_{0},\ldots,y_{i-1})P(X_{i-1}|y_{0},\ldots,y_{i-1})dX_{i-1}$[/tex]

and then formula THREE:
[tex]$=\int P(X_{i}|X_{i-1})P(X_{i-1}|y_{0},\ldots,y_{i-1})dX_{i-1}$[/tex]

How do I get from formula TWO to formula THREE? Why does
[tex]$P(X_{i}|X_{i-1},y_{0},\ldots,y_{i-1})=P(X_{i}|X_{i-1})$[/tex] hold?
Thank you very much for reading my post, any suggestion is appreciated.
 
  • #2
Well, it's hard to tell exactly what's going on with your notation, but it looks to me like what you want is the probability of the current state of a Markov chain given all previous states (I don't see anything "hidden" in what you've written). In that case, the way you go from step 2 to step 3 is by application of the Markov assumption. I.e., part of the definition of a Markov chain is that, given the previous state, the current state is conditionally independent of all other previous states. Which is to say [itex]P(x_n|x_{n-1}, \ldots, x_0) = P(x_n|x_{n-1}) [/itex]. So, you get from step 2 to step 3 by simply substituting this into the expression. Again, this equation is part of the definition of a Markov chain.
 
  • #3
Thank you quadraphonics for your help.
I learned this topic, the "Hidden Markov model", from the book "Computer Vision: A Modern Approach".
Here are the two "assumptions":
  1. only the immediate past matters:

    This formula is the same as you said:

    [tex]$P(X_{i}|X_{1},\ldots X_{i-1})=P(X_{i}|X_{i-1})$[/tex]
  2. measurements depend only on the current state

    [tex]$P(Y_{i},Y_{j},\ldots Y_{k}|X_{i})=P(Y_{i}|X_{i})P(Y_{j},\ldots,Y_{k}|X_{i})$[/tex]

Why can I get from step 2 to step 3 by simply substituting "assumption one" into the expression? They are different expressions.

Can
[tex]$P(X_{i}|X_{i-1},y_{0},\ldots,y_{i-1})=P(X_{i}|X_{i-1})$[/tex] be proved from "assumption one"?
 
  • #4
Okay, so the [itex]y_i[/itex]'s are observations, and the [itex]X_i[/itex]'s are (hidden) states, right? To demonstrate the result you're after, I'd re-write [itex]P(X_i | X_{i-1}, y_0, \ldots, y_{i-1})[/itex] using Bayes' Theorem, and then apply the assumptions to simplify the result until you get down to [itex]P(X_i|X_{i-1})[/itex].
 
  • #5
quadraphonics said:
Okay, so the [itex]y_i[/itex]'s are observations, and the [itex]X_i[/itex]'s are (hidden) states, right? To demonstrate the result you're after, I'd re-write [itex]P(X_i | X_{i-1}, y_0, \ldots, y_{i-1})[/itex] using Bayes' Theorem, and then apply the assumptions to simplify the result until you get down to [itex]P(X_i|X_{i-1})[/itex].

Thanks.
Yes, you are right: [tex]y_i[/tex] is the observation from hidden state [tex]X_i[/tex], and I want to simplify the result to get down to [tex]P(X_i|X_{i-1})[/tex]. But I can't get the proof.
 
  • #6
Well, why don't you show us what you have so far on the proof, and where you get stuck?
 
  • #7
Hi, quadraphonics.
I want to prove the equation below from the "two assumptions" given earlier.
[tex]$P(X_{i}|X_{i-1},y_{0},\ldots,y_{i-1})=P(X_{i}|X_{i-1})$[/tex]

I am stuck on the proof; so far I don't have any ideas.
 
  • #8
Well, why don't you try my suggestion of rewriting the left-hand side using Bayes' theorem, and then see if you can apply the assumptions to get what you want?
 
  • #9
In a discrete-time Markov process, the state [itex]X_n[/itex] at time [itex]t_n[/itex] depends only on the state at the previous time step, [itex]X_{n-1}[/itex]. In a hidden Markov process, the state [itex]X_n[/itex] directly determines the observables [itex]Y_n[/itex] at that time, but [i]the observables do not affect the state[/i]. There is no feedback. The observables are not a part of the state proper. That is why it is called a hidden Markov model.
 
  • #10
quadraphonics said:
Well, why don't you try my suggestion of rewriting the left-hand side using Bayes' theorem, and then see if you can apply the assumptions to get what you want?
Thanks. Frankly, I have tried using Bayes' theorem several times to prove the equation, but I failed.

[tex]P(X_{i}|X_{i-1},y_{0},\ldots,y_{i-1})=\frac{P(X_{i},X_{i-1},y_{0},\ldots,y_{i-1})}{P(X_{i-1},y_{0},\ldots,y_{i-1})}=?[/tex]

or
[tex]P(X_{i}|X_{i-1},y_{0},\ldots,y_{i-1})=\frac{P(X_{i-1},y_{0},\ldots,y_{i-1}|X_{i})P(X_{i})}{P(X_{i-1},y_{0},\ldots,y_{i-1})}=?[/tex]
 
  • #11
D H said:
In a discrete-time Markov process, the state [itex]X_n[/itex] at time [itex]t_n[/itex] depends only on the state at the previous time step, [itex]X_{n-1}[/itex]. In a hidden Markov process, the state [itex]X_n[/itex] directly determines the observables [itex]Y_n[/itex] at that time, but [i]the observables do not affect the state[/i]. There is no feedback. The observables are not a part of the state proper. That is why it is called a hidden Markov model.
Thanks, D H. Do you mean that, since "the observables do not affect the state",
it is obvious that [tex]$P(X_{i}|X_{i-1},y_{0},\ldots,y_{i-1})=P(X_{i}|X_{i-1})$[/tex]? Because [itex]X_i[/itex] depends only on [itex]X_{i-1}[/itex], and [itex]y_0,\ldots,y_{i-1}[/itex] cannot affect [itex]X_i[/itex].

Is my thinking right?
 
  • #12
Yes. That's why I put "the observables do not affect the state" in italics. That is what it means for the state to be hidden.
 
  • #13
OK, Thank you everyone, my problem is totally solved!
 
  • #14
Err, not so fast people. The point here was to *prove* that the observables don't affect the state (i.e., the current state is conditionally independent of the previous observables, given the previous state). We know this should be the case, but having D H assure us that it is doesn't add up to a proof. Let's also note that the observables *do* affect the probability of being in a particular state if you *don't* know the previous state, so the situation is not quite as simple as it might appear.

As far as the derivation goes, you need to be a little bit tricky with the application of Bayes' Theorem. Specifically, you want all of the terms to still be conditioned on [itex]X_{i-1}[/itex]. I.e.:

[tex]
P(X_i|X_{i-1}, y_0, \ldots, y_{i-1}) = \frac{P(y_0, \ldots, y_{i-1}|X_{i-1}, X_i)P(X_i|X_{i-1})}{P(y_0, \ldots, y_{i-1}|X_{i-1})}
[/tex]
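With this rewrite, everything reduces to showing that the likelihood term in the numerator does not actually depend on [itex]X_i[/itex]; once that is established, it cancels with the denominator:

[tex]
P(X_i|X_{i-1}, y_0, \ldots, y_{i-1}) = \frac{P(y_0, \ldots, y_{i-1}|X_{i-1})P(X_i|X_{i-1})}{P(y_0, \ldots, y_{i-1}|X_{i-1})} = P(X_i|X_{i-1})
[/tex]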
 
  • #15
Thanks for your hint.

I found the trick, called the "conditional Bayes' theorem": http://en.wikipedia.org/wiki/Bayes'_theorem#Extensions_of_Bayes.27_theorem

[tex]P(A|B\cap C)=\frac{{P(A\cap B\cap C)}}{{P(B\cap C)}}=\frac{{P(B|A\cap C)\, P(A|C)\, P(C)}}{{P(C)\, P(B|C)}}=\frac{{P(B|A\cap C)\, P(A|C)}}{{P(B|C)}}\,.[/tex]

So, since each [itex]y_i[/itex] depends only on [itex]x_i[/itex], we get:
[tex]P(y_{0},\ldots,y_{i-1}|X_{i-1},X_{i})=P(y_{0},\ldots,y_{i-1}|X_{i-1})[/tex]

I think it's a full proof now. Thank you very much, I really appreciate your help, and I have gained a lot of knowledge.
 
  • #16
Hi, quadraphonics. I'm still confused: why does the formula below hold?

[tex]P(y_{0},\ldots,y_{i-1}|X_{i-1},X_{i})=P(y_{0},\ldots,y_{i-1}|X_{i-1})[/tex]

It still needs to be proved, but I can't find its proof. Can someone help me? Thanks.
 
  • #17
This is a consequence of the two assumptions that go into an HMM. First is the Markov assumption on the hidden variable [itex]X_i[/itex]:

[tex]
P(X_i | X_1 , \ldots, X_{i-1}, Y_1, \ldots, Y_{i-1}) = P(X_i | X_{i-1})
[/tex]

I.e., the current state is conditionally independent of all previous states and observables, given the previous state. You can also flip this around and show that the current state is independent of all future states and observations, given the next state. The next assumption is a similar one on the observable process [itex]Y_i[/itex]:

[tex]
P(Y_i | X_1 , \ldots, X_{N}, Y_1, \ldots, Y_{i-1}, Y_{i+1}, \ldots, Y_{N}) = P(Y_i | X_i)
[/tex]


where [itex]N[/itex] is the number of observations. That is, the current observable is conditionally independent of everything else, given the current state. To prove the relation in question, first consider a simple case with just two observations:

[tex]
P(Y_1, Y_2 | X_2, X_3) = P(Y_2|X_2, X_3, Y_1)P(Y_1|X_2, X_3)
[/tex]

We want to show that [itex] X_3 [/itex] can be eliminated from the LHS. The first term on the RHS is independent of both [itex]X_3 [/itex] and [itex] Y_1 [/itex] according to the second HMM assumption. However, let's go ahead and leave [itex] Y_1 [/itex] in place, since we don't want to eliminate it. To eliminate [itex]X_3 [/itex] from the second term, introduce [itex]X_1 [/itex] and then integrate it back out:

[tex]
P(Y_1 | X_2, X_3) = \sum_{x_1} P(Y_1 | x_1, X_2, X_3) P(x_1 | X_2, X_3)
[/tex]

Now, the first term on the RHS is independent of [itex] X_2[/itex] and [itex]X_3[/itex], again owing to the second HMM assumption. But, again, let's leave [itex] X_2[/itex] in anyhow. The second RHS term is independent of [itex]X_3[/itex], according to the first HMM assumption. If you put that all back together, you should see how to prove the relation.
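Putting those pieces back together, and using only the two HMM assumptions above, the extra state drops out as desired:

[tex]
P(Y_1, Y_2 | X_2, X_3) = P(Y_2|X_2, Y_1)\sum_{x_1} P(Y_1 | x_1, X_2) P(x_1 | X_2) = P(Y_2|X_2, Y_1)P(Y_1|X_2) = P(Y_1, Y_2|X_2)
[/tex]

The same argument, applied one observation at a time, removes [itex]X_i[/itex] from [itex]P(y_0, \ldots, y_{i-1}|X_{i-1}, X_i)[/itex] in the general case.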
 
  • #19
Thank you quadraphonics.
Because I didn't receive the notification email, I'm sorry for the late response. I will read your post carefully.
 
  • #20
Yes, I have read your derivation, in which [tex]X_3[/tex] is eliminated by decomposing into a sum over [tex]X_1[/tex] and marginalizing it back out. That's great!
[tex]P(Y_1, Y_2 | X_2, X_3) =P(Y_1, Y_2 | X_2) [/tex]

Note that you state the two HMM assumptions in slightly different notation from mine in the third post. My original question in my first post can be proved easily from your assumption one, but I think the two forms of the assumptions are equivalent.

Thanks for helping!
 

1. What is a Hidden Markov Model (HMM)?

A Hidden Markov Model is a probabilistic model used to model sequential data where the underlying system is assumed to be a Markov process, meaning that the next state depends only on the current state and not on any previous states. In HMMs, the state of the system is unobservable (hidden) and can only be inferred from the observed data.

2. What is the Bayesian inference approach in HMMs?

Bayesian inference is a statistical method used to update beliefs about a system based on prior knowledge and observed data. In HMMs, the Bayesian inference approach is used to estimate the hidden states of the system given the observed data and prior knowledge about the system's parameters.

3. How is the Hidden Markov Model represented mathematically?

A Hidden Markov Model is represented by a set of states, a set of observations, a transition matrix that defines the probabilities of transitioning between states, an emission matrix that defines the probability of each observation given each state, and an initial state distribution that defines the probabilities of starting in each state.
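As a concrete illustration, here is a minimal sketch of these pieces for a toy two-state model (the states, observations, and probability values below are made up for illustration, not taken from this thread):

[code]
import numpy as np

# Hypothetical two-state HMM used only to illustrate the representation.
states = ["rainy", "sunny"]          # hidden states X
observations = ["umbrella", "none"]  # possible observations Y

pi = np.array([0.6, 0.4])            # initial state distribution P(X_0)
A = np.array([[0.7, 0.3],            # transition matrix: A[i, j] = P(X_t = j | X_{t-1} = i)
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],            # emission matrix: B[i, k] = P(Y_t = k | X_t = i)
              [0.2, 0.8]])
[/code]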

4. What is the derivation of the Bayesian inference in HMMs?

The derivation of Bayesian inference in HMMs involves using Bayes' theorem to calculate the posterior probability of the hidden states given the observed data and prior knowledge about the system. This involves calculating the likelihood of the observed data given the hidden states, the prior probability of the hidden states, and the evidence or marginal likelihood. Bayes' theorem is then used to update the prior probability to the posterior probability.
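As a rough sketch of that recursion in code (assuming the discrete pi, A, and B arrays from the example above; this is an illustrative implementation, not a standard library routine), the prediction step marginalizes over the previous state exactly as in formula THREE, and the update step multiplies by the likelihood of the new observation and renormalizes:

[code]
import numpy as np

def forward_filter(obs_seq, pi, A, B):
    """Return the filtered distributions P(X_i | y_0, ..., y_i) for a discrete HMM."""
    belief = pi * B[:, obs_seq[0]]      # update with the first observation
    belief /= belief.sum()              # normalize by the evidence P(y_0)
    beliefs = [belief.copy()]
    for y in obs_seq[1:]:
        predicted = A.T @ belief        # predict: sum over x_{i-1} of P(X_i|x_{i-1}) P(x_{i-1}|y_0..y_{i-1})
        belief = predicted * B[:, y]    # update: multiply by the likelihood P(y_i | X_i)
        belief /= belief.sum()          # normalize by the marginal likelihood of y_i
        beliefs.append(belief.copy())
    return np.array(beliefs)

# Example usage with the toy matrices above: forward_filter([0, 0, 1], pi, A, B)
[/code]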

5. How is the Viterbi algorithm used in HMMs?

The Viterbi algorithm is used in HMMs to find the most likely sequence of hidden states that could have generated a given sequence of observations. This algorithm uses dynamic programming to efficiently calculate the most probable path through the HMM by considering all possible paths and choosing the one with the highest probability. This is useful for tasks such as speech recognition and part-of-speech tagging.
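For instance, here is a minimal Viterbi sketch over the same kind of discrete matrices (again an illustrative implementation with made-up names, working in log probabilities to avoid numerical underflow):

[code]
import numpy as np

def viterbi(obs_seq, pi, A, B):
    """Return the most likely hidden-state sequence for obs_seq in a discrete HMM."""
    n_states, T = len(pi), len(obs_seq)
    log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
    score = np.zeros((T, n_states))            # best log-probability of any path ending in each state
    back = np.zeros((T, n_states), dtype=int)  # argmax back-pointers for reconstructing the path
    score[0] = log_pi + log_B[:, obs_seq[0]]
    for t in range(1, T):
        cand = score[t - 1][:, None] + log_A   # cand[i, j]: best path ending at i, extended by i -> j
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + log_B[:, obs_seq[t]]
    path = [int(score[-1].argmax())]           # best final state
    for t in range(T - 1, 0, -1):              # backtrack to recover the full state sequence
        path.append(int(back[t, path[-1]]))
    return path[::-1]
[/code]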
