Hi All,

I need to train an HMM using data with sequences of variable length (5 - 500 symbols per input sequence).

From what I've seen so far, all (or most) training is performed on datasets of fixed-length sequences, although nothing in the HMM structure explicitly requires this.

So, first of all: what am I missing? Is it indeed inadvisable to train an HMM on variable-length data? Does this violate the stochastic assumptions of the EM/Viterbi algorithms?

Next, the model I obtain gives "good" performance on "short" sequences, but as the sequences get longer the performance decreases (and sometimes increases again). I can attribute this to two possible causes:

1) Longer sequences have dynamics that the HMM does not capture, since they are a minority of the training set; hence the "random" prediction behavior.

2) The HMM gets stuck on a short-sequence model (which is another way of phrasing (1), but not exactly).
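
One way to look at these two causes, if performance is measured as likelihood under the model, is to compare the log-likelihood per symbol across length bins. The raw log-likelihood shrinks roughly linearly with sequence length, so it is not directly comparable between a 5-symbol and a 500-symbol sequence. Below is a minimal sketch in plain Python, assuming a discrete-emission HMM. The names `log_likelihood`, `per_symbol_ll_by_length`, `pi`, `A`, `B` and the bin width of 50 are placeholders chosen for illustration, not part of any particular library.

```python
import math
from collections import defaultdict


def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | model) for one sequence.

    Assumed (hypothetical) parameter layout:
      pi[i]   : initial probability of state i
      A[i][j] : transition probability from state i to state j
      B[i][k] : probability that state i emits symbol k
    """
    if not obs:
        raise ValueError("empty observation sequence")
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    total = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step and weight by the emission probability.
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)  # P(o_t | o_1 .. o_{t-1})
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under the model
        total += math.log(scale)
        # Rescale so alpha never underflows, even for long sequences.
        alpha = [a / scale for a in alpha]
    return total


def per_symbol_ll_by_length(sequences, pi, A, B, bin_width=50):
    """Mean log-likelihood per symbol, grouped by sequence-length bin."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for obs in sequences:
        bin_start = (len(obs) // bin_width) * bin_width
        sums[bin_start] += log_likelihood(obs, pi, A, B) / len(obs)
        counts[bin_start] += 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}
```

If the per-symbol value stays flat across bins, the raw drop is mostly a length effect; if it degrades for the long bins, that would point more towards the model not capturing the long-sequence dynamics. This is only a diagnostic under the stated assumptions and does not by itself separate (1) from (2).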

Can someone please advise on the matter?

Thanks!

**Physics Forums | Science Articles, Homework Help, Discussion**

# HMM training with variable length data
