HMM with continuous observation - PDFs to probabilities

AI Thread Summary
Transitioning from a discrete to a continuous observation Hidden Markov Model (HMM) involves replacing the probability of observing a symbol with a probability density function (PDF), typically Gaussian. This change results in the forward and backward coefficients becoming products of PDFs, leading to higher-order density units, which raises concerns about the interpretation of parameters. Specifically, using the Baum-Welch algorithm for parameter re-estimation can yield a situation where initial state probabilities are treated as PDFs, complicating their interpretation. Despite this issue, literature indicates that many practitioners do not address the implications of this shift from probabilities to probability densities. Understanding how this transition is managed in practice is crucial for ensuring the validity of HMM applications.
rynlee
So I am working with a Hidden Markov Model with continuous observation, and something has been bothering me that I am hoping someone might be able to address.

Going from a discrete-observation HMM to a continuous-observation HMM is actually quite straightforward (see, for example, Rabiner's 1989 tutorial on HMMs). You just change the probability of observing symbol k in state i, b_i(k), to a PDF (typically Gaussian) b_i(v_k) centered on some mean value.

Here's the thing: if you do that, then the forward and backward coefficients become products of PDFs instead of products of probabilities, which gives them increasingly higher-order density units. That is quite disconcerting. For example, if you use the Baum-Welch formalism to re-estimate the parameters defining the HMM by training against some data set, you end up setting the probability of being in each state at the first time step, π_i, equal to a PDF (well, a product of two PDFs; again, see Rabiner), which doesn't make any sense...

Looking at several papers in the literature, however, that seems to be what people are doing, and nobody comments on the shift from probabilities to probability densities. So I'm hoping that someone can explain why this is OK or what I am missing.

Thank you!
 
rynlee said:
... if you do that, then the forward and backward coefficients become products of PDFs, instead of products of probabilities, which gives them increasingly higher-order density units...
That doesn't quite sound right, can you give an example?
 
Sure. Suppose, for the sake of argument, that your Gaussian mixture model has only one mixture component:

N(mu_i, sigma_i, Ot) = 1/(sqrt(2*pi)*sigma_i) * exp(-(Ot-mu_i)^2/(2*sigma_i^2) )

and
b_i(Ot) = N(mu_i, sigma_i, Ot)

for each state 1<= i <= N

Because of the leading term in the Gaussian, b_i(Ot) has units of 1/sigma (the inverse of the units of the observation Ot), as we would expect for a probability density.
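To make the units point concrete, here is a minimal sketch (the numbers and the metre/centimetre choice are made up for illustration, not taken from any real data set). It evaluates the same Gaussian density for the same physical observation expressed in two different units. The value is not a probability: it can exceed 1, and it changes by exactly the unit conversion factor.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Single-component Gaussian density N(mu, sigma, x), as in b_i(Ot)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

# Hypothetical state: mean 1.0 m, standard deviation 0.05 m; observation 1.02 m.
b_metres = gaussian_pdf(1.02, 1.0, 0.05)

# The same state and observation expressed in centimetres.
b_centimetres = gaussian_pdf(102.0, 100.0, 5.0)

# The density carries units of 1/length, so it scales with the conversion factor.
ratio = b_metres / b_centimetres

print(b_metres, b_centimetres, ratio)
```

The density in metres comes out larger than 1, which a probability never could, and the ratio between the two values is the conversion factor of 100 (up to rounding).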

So when you calculate alpha inductively,

alpha_j(t) = sum[alpha_i(t-1) * a_ij, i=1:N] * b_j(Ot)

each successive alpha has units of (1/sigma)^t, as opposed to alpha being a unitless probability.

Now, the Baum-Welch algorithm shouldn't care whether alpha and beta have units, since it only looks at relative quantities, but it remains highly disconcerting because the interpretation of the parameters breaks down.
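The "relative quantities" remark can be checked numerically. The sketch below is an illustration under assumptions: a made-up two-state model, made-up observations, and hypothetical helper names. It runs the unscaled forward and backward recursions with observations in metres and again in centimetres. In Rabiner's re-estimation formula, the new π_i is gamma_1(i) = alpha_i(1) * beta_i(1) / P(O | model), so the density units in the numerator should cancel against those in the normalizer.

```python
import math

def gaussian_pdf(x, mu, sigma):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def forward_backward(obs, pi, A, mus, sigmas):
    """Return (likelihood, gamma_1) using unscaled alpha and beta."""
    N, T = len(pi), len(obs)
    # b[t][i] is the density of observation t under state i.
    b = [[gaussian_pdf(o, mus[i], sigmas[i]) for i in range(N)] for o in obs]
    # Forward pass: alpha_j(t) = sum_i alpha_i(t-1) * a_ij * b_j(Ot)
    alpha = [[pi[i] * b[0][i] for i in range(N)]]
    for t in range(1, T):
        alpha.append([sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * b[t][j]
                      for j in range(N)])
    # Backward pass: beta_i(t) = sum_j a_ij * b_j(O_{t+1}) * beta_j(t+1)
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * b[t + 1][j] * beta[t + 1][j] for j in range(N))
                   for i in range(N)]
    likelihood = sum(alpha[-1])  # has units of (1/length)^T
    gamma1 = [alpha[0][i] * beta[0][i] / likelihood for i in range(N)]
    return likelihood, gamma1

# Hypothetical two-state model and observations, in metres.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.2, 0.8]]
mus, sigmas = [1.0, 1.5], [0.05, 0.10]
obs = [1.02, 1.10, 1.45, 1.52]

c = 100.0  # metres to centimetres
lik_m, gamma_m = forward_backward(obs, pi, A, mus, sigmas)
lik_cm, gamma_cm = forward_backward([c * o for o in obs], pi, A,
                                    [c * m for m in mus], [c * s for s in sigmas])

print(lik_m / lik_cm)     # scales as c ** len(obs)
print(gamma_m, gamma_cm)  # unchanged by the change of units
```

In this sketch the likelihood changes by a factor of c^T when the units change, as the (1/sigma)^t argument predicts, while gamma_1 is the same in both runs and sums to 1. So at least in this example the re-estimated π_i is a ratio of two quantities with identical density units and comes out as a dimensionless probability.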
 