HMM with continuous observation - PDFs to probabilities

In summary, the thread discusses a concern about moving from a discrete-observation Hidden Markov Model to a continuous-observation one: the emission terms become probability density functions instead of probabilities, so the forward and backward variables pick up density units and some of the re-estimated parameters become hard to interpret. The algorithms themselves still work, because the quantities used in re-estimation are ratios in which the density units cancel.
  • #1
rynlee
So I am working with a Hidden Markov Model with continuous observation, and something has been bothering me that I am hoping someone might be able to address.

Going from a discrete-observation HMM to a continuous-observation HMM is actually quite straightforward (for example, see Rabiner's 1989 tutorial on HMMs). You just replace the probability of observing symbol k in state i, b_i(k), with the value of a probability density function (typically Gaussian), b_i(O_t), evaluated at the observation and centered on some mean value.

Here's the thing: if you do that, then the forward and backward coefficients become products of PDFs, instead of products of probabilities, which gives them increasingly higher-order density units. That is quite disconcerting. For example, if you use the Baum-Welch formalism to re-estimate the parameters of the HMM against some training data, you end up setting the probability of being in each state at the first timestep, π_i, equal to a density value (well, a product of two density values; again, see Rabiner), which doesn't seem to make any sense...

Looking at several papers in the literature however, that seems to be what people are doing, nobody comments on the shift from probabilities to probability densities, so I'm hoping that someone can explain how this situation is OK/what I am missing.

Thank you!
 
  • #2
rynlee said:
... if you do that, then the forward and backward coefficients become products of PDFs, instead of products of probabilities, which gives them increasingly higher-order density units...
That doesn't quite sound right; can you give an example?
 
  • #3
Sure. Suppose, for the sake of argument, that your Gaussian mixture model has only one mixture component per state:

N(mu_i, sigma_i, Ot) = 1/(sqrt(2*pi)*sigma_i) * exp(-(Ot-mu_i)^2/(2*sigma_i^2) )

and
b_i(Ot) = N(mu_i, sigma_i, Ot)

for each state 1<= i <= N

Because of the leading term in the Gaussian, b_i(Ot) has units of 1/sigma, as we would expect for a probability density.

So when you calculate alpha inductively,

alpha_j(t) = sum[alpha_i(t-1) * a_ij, i=1:N] * b_j(Ot)

each successive alpha has units of (1/sigma)^t, as opposed to alpha being a unitless probability.

Now, the Baum-Welch algorithm shouldn't care whether alpha and beta carry units or not, since it only uses relative quantities, but it remains highly disconcerting because the interpretation of the parameters breaks down.
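
For concreteness, here is a minimal NumPy sketch of the scaled forward pass Rabiner describes, assuming 1-D Gaussian emissions (the function and variable names here are just illustrative, not from the thread). The unscaled alpha at time t would indeed carry units of (1/sigma)^t, but all of those units end up in the running product of scale factors, i.e. in the likelihood, while the scaled alphas used downstream are dimensionless:

```python
import numpy as np
from scipy.stats import norm

def forward_scaled(pi, A, mu, sigma, obs):
    """Scaled forward pass for an HMM with 1-D Gaussian emissions.

    pi    : (N,) initial state probabilities
    A     : (N, N) transition matrix, A[i, j] = P(state j at t+1 | state i at t)
    mu    : (N,) emission means
    sigma : (N,) emission standard deviations
    obs   : (T,) observed sequence
    """
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    log_likelihood = 0.0

    # b[j] = N(obs[t]; mu[j], sigma[j]) -- a density value, with units 1/sigma
    b = norm.pdf(obs[0], mu, sigma)
    alpha[0] = pi * b
    c = alpha[0].sum()            # scale factor absorbs the density units
    alpha[0] /= c                 # scaled alpha is dimensionless, sums to 1
    log_likelihood += np.log(c)

    for t in range(1, T):
        b = norm.pdf(obs[t], mu, sigma)
        alpha[t] = (alpha[t - 1] @ A) * b
        c = alpha[t].sum()
        alpha[t] /= c
        log_likelihood += np.log(c)

    return alpha, log_likelihood
```

In this sketch the only quantity that keeps density units is the accumulated log of the scale factors (the log-likelihood of the data, which is a log-density for continuous observations); every quantity fed back into re-estimation is a normalized ratio.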
 

What is HMM with continuous observation?

HMM with continuous observation is a statistical model used in machine learning for sequential data. It is a type of Hidden Markov Model (HMM) in which the observations are continuous rather than discrete, so the emission distribution of each state is represented by a probability density function (PDF) instead of a table of symbol probabilities.

How does HMM with continuous observation work?

HMM with continuous observation assumes an underlying hidden state sequence that generates the observed data. The hidden state evolves according to a transition matrix, which specifies the probability of moving from one state to another, and each observation is drawn from a state-dependent emission density. Inference then evaluates how likely the observed data are under the model and which states most plausibly produced them, updating these quantities as each observation is processed.
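
As an illustration of that generative story, here is a minimal sketch assuming 1-D Gaussian emissions; the function name sample_hmm and all parameter values are hypothetical examples, not part of any particular library:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_hmm(pi, A, mu, sigma, T):
    """Sample a hidden state path and continuous observations from
    a toy HMM with 1-D Gaussian emissions."""
    N = len(pi)
    states = np.zeros(T, dtype=int)
    obs = np.zeros(T)

    states[0] = rng.choice(N, p=pi)
    obs[0] = rng.normal(mu[states[0]], sigma[states[0]])
    for t in range(1, T):
        states[t] = rng.choice(N, p=A[states[t - 1]])          # hidden Markov chain
        obs[t] = rng.normal(mu[states[t]], sigma[states[t]])   # continuous emission
    return states, obs

# Example: two states with well-separated means
pi = np.array([0.6, 0.4])
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])
states, obs = sample_hmm(pi, A, mu=np.array([0.0, 5.0]),
                         sigma=np.array([1.0, 1.0]), T=200)
```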

What are the advantages of using HMM with continuous observation?

HMM with continuous observation allows more flexibility in modeling sequential data: real-valued measurements do not have to be quantized into a finite symbol alphabet, so the model can capture a wider range of patterns and dependencies than discrete-observation models. It also allows prior knowledge to be built into the emission densities.

How are PDFs converted to probabilities in HMM with continuous observation?

Strictly speaking, the density values are never converted into standalone probabilities. They are plugged directly into the forward-backward recursions, and every quantity that must be a probability (the state posteriors, the re-estimated initial distribution and transition matrix) is obtained by normalizing, i.e. dividing by a sum of the same density products, so the density units cancel and the result is dimensionless. Equivalently, a density value times a small observation interval can be read as a probability, but that interval drops out of all the ratios the algorithm actually uses.
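
A minimal sketch of that cancellation, assuming scaled forward/backward variables like those in the sketch above (the names alpha_hat, beta_hat, and posteriors are illustrative):

```python
import numpy as np

def posteriors(alpha_hat, beta_hat):
    """State posteriors gamma[t, i] = P(state i at time t | all observations).

    alpha_hat, beta_hat are the scaled forward/backward variables, shape (T, N).
    The per-row renormalization divides out whatever density units the
    unscaled products would have carried, so gamma is a proper probability.
    """
    gamma = alpha_hat * beta_hat
    gamma /= gamma.sum(axis=1, keepdims=True)   # rows sum to 1, dimensionless
    return gamma

# The Baum-Welch re-estimate of the initial distribution uses the first row:
# pi_new = gamma[0], which is a probability vector, not a density.
```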

What are some real-world applications of HMM with continuous observation?

HMM with continuous observation has been used in various fields such as speech recognition, financial market prediction, and bioinformatics. It is also commonly used in natural language processing tasks such as text classification and part-of-speech tagging.
