I am working with a Hidden Markov Model with continuous observations, and something has been bothering me that I hope someone can address. Going from a discrete-observation HMM to a continuous-observation HMM is quite straightforward (see, for example, Rabiner's 1989 tutorial on HMMs). You just replace the probability of observing symbol k in state i, b_i(k), with a probability density function (typically a Gaussian around some mean value) evaluated at the observation, b_i(v_k).

Here's the thing: if you do that, the forward and backward coefficients become products of PDF values instead of products of probabilities, so they carry density units of increasingly higher order as the sequence gets longer. That is quite disconcerting. For example, if you use the Baum formalism to re-estimate the parameters defining the HMM by training against some data set, you seem to end up setting the probability of being in each state at the first timestep, π_i, equal to a PDF (well, a product of two PDFs; again, see Rabiner), which doesn't make any sense.

Looking at several papers in the literature, however, that seems to be what people are doing, and nobody comments on the shift from probabilities to probability densities. So I'm hoping that someone can explain why this is OK, or what I am missing. Thank you!
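
To make the concern concrete, here is a minimal sketch of the unscaled forward recursion with Gaussian emission densities. The model parameters, the observation values, and the unit-conversion factor `c` are all made up for illustration; only the recursion itself follows the usual forward algorithm. It shows that expressing the same data in different units (say, metres versus centimetres) changes the final forward coefficients by a factor of `c` raised to the sequence length, which is what I mean by "higher-order density units".

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Gaussian density evaluated at x (has units of 1 / [x])."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def forward(obs, pi, A, means, stds):
    """Unscaled forward pass; returns alpha with shape (T, N)."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    # Initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha[0] = pi * gaussian_pdf(obs[0], means, stds)
    # Induction: alpha_t(j) = (sum_i alpha_{t-1}(i) * a_ij) * b_j(o_t)
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * gaussian_pdf(obs[t], means, stds)
    return alpha

# Hypothetical two-state model (values chosen only for illustration).
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])
means = np.array([0.0, 3.0])
stds = np.array([1.0, 0.5])
obs = np.array([0.1, 2.9, 3.2, -0.4])

c = 100.0  # assumed unit conversion, e.g. metres -> centimetres

alpha_m = forward(obs, pi, A, means, stds)
alpha_cm = forward(c * obs, pi, A, c * means, c * stds)

# Each density picks up a factor 1/c, so alpha_T differs by c**T.
ratio = alpha_m[-1] / alpha_cm[-1]
print(ratio, c ** len(obs))
```

Both entries of `ratio` should match `c ** len(obs)` up to floating-point error, so the forward coefficients are clearly not dimensionless probabilities.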