
Hidden Markov Model

  1. Mar 8, 2015 #1
    1. The problem statement, all variables and given/known data
    Consider an HMM with two possible states, “R” and “G” (for “regulatory” and “gene” sequences respectively). Each state emits one character, chosen from the alphabet {A,C,G,T}.

    The transition probabilities of this HMM are:
    $$a_{RG} = a_{GR} = 1/4$$
    $$a_{RR} = a_{GG} = 3/4$$
    The emission probabilities are:
    $$e_R(A) = e_R(C) = e_R(G) = e_R(T) = 1/4$$
    $$e_G(A) = e_G(T) = 2/10$$
    $$e_G(C) = e_G(G) = 3/10$$

    Assume that the initial state of the HMM is “R” or “G” with equal probabilities. Given a sequence S = ACGT and an HMM path π = RGGR, calculate the probability Pr(S, π) of the sequence and the path.

    2. Relevant equations
    $$P(S,\pi) = \prod_{i=1} a_{\pi_{i-1},\pi_i} \, e_{\pi_i}(x_i)$$
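
    As a rough sketch of how the product could be evaluated, the Python below multiplies one transition factor and one emission factor per position. It assumes that the i = 1 transition factor is the initial-state probability of 1/2 (the problem does not say this explicitly), and the names `INIT`, `TRANS`, `EMIT`, and `joint_probability` are made up for this illustration.

```python
from fractions import Fraction

# Parameters copied from the problem statement.
INIT = {"R": Fraction(1, 2), "G": Fraction(1, 2)}
TRANS = {
    ("R", "R"): Fraction(3, 4), ("R", "G"): Fraction(1, 4),
    ("G", "G"): Fraction(3, 4), ("G", "R"): Fraction(1, 4),
}
EMIT = {
    "R": {c: Fraction(1, 4) for c in "ACGT"},
    "G": {"A": Fraction(2, 10), "T": Fraction(2, 10),
          "C": Fraction(3, 10), "G": Fraction(3, 10)},
}


def joint_probability(seq, path):
    """Pr(S, pi), ASSUMING the i = 1 transition factor is the initial probability."""
    if len(seq) != len(path):
        raise ValueError("sequence and path must have the same length")
    # i = 1: initial-state probability times the first emission.
    prob = INIT[path[0]] * EMIT[path[0]][seq[0]]
    # i >= 2: transition into the current state times its emission.
    for prev, cur, sym in zip(path, path[1:], seq[1:]):
        prob *= TRANS[(prev, cur)] * EMIT[cur][sym]
    return prob


print(joint_probability("ACGT", "RGGR"))  # 27/204800
```

    Exact fractions are used instead of floats so the result can be compared directly with a hand calculation.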

    3. The attempt at a solution

    We discussed this equation in class but never actually used it or spent time describing how to wield it. I just don't see how the information I'm given comes together in that equation.

    Thanks for any help.
  3. Mar 9, 2015 #2
    I think that I have found a way to do this, but I want to make sure it is correct.

    I believe the equation is telling me to multiply two probabilities for each character (the probability of emitting that character of ACGT from the current state, R or G, and the probability of the state transition) and then to multiply all of those products together. I'm just not sure about the order.

    So, I have S=ACGT and pi = RGGR


    The first probability is 1/2 for either R or G. Then A, being emitted from R, has probability 1/4. And then there is just a 1/4 probability of the transition R->G for the next character.

    I'm not sure whether the 1/2 should be included at the beginning or not, because of how the last character works. But I have


    So, I've multiplied each emission probability with the probability of R/G swapping. Then multiplied all of those together. The last probability I did not multiply against anything (other than the series) because there is no following change. Though, I wonder if I should multiply by 1/2 since the first choice was probability 1/2 between R and G.
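
    Written out, the product described above would look like the following, assuming the initial factor of 1/2 is included (leaving it out doubles the result):

    $$\Pr(S,\pi) = \tfrac{1}{2}\,\big[e_R(A)\,a_{RG}\big]\big[e_G(C)\,a_{GG}\big]\big[e_G(G)\,a_{GR}\big]\,e_R(T) = \tfrac{1}{2}\cdot\tfrac{1}{4}\cdot\tfrac{1}{4}\cdot\tfrac{3}{10}\cdot\tfrac{3}{4}\cdot\tfrac{3}{10}\cdot\tfrac{1}{4}\cdot\tfrac{1}{4} = \tfrac{27}{204800} \approx 1.32\times10^{-4}$$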

    Is this the correct method?

  4. Mar 10, 2015 #3
    It isn't entirely clear, but judging from the equation you are given, it looks like the RGGR path starts at the second state (i = 1), and the first character is emitted just after entry to that state. (Consider the i=1 term in the product.) If so, you need to consider (and sum over) the state sequences RRGGR and GRGGR.
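
    A minimal sketch of that sum, assuming the extra leading state emits nothing and is R or G with probability 1/2 each (the names `trans`, `emit`, and `path_probability` are hypothetical, chosen only for this example):

```python
from fractions import Fraction

# Parameters copied from the problem statement.
trans = {("R", "R"): Fraction(3, 4), ("R", "G"): Fraction(1, 4),
         ("G", "G"): Fraction(3, 4), ("G", "R"): Fraction(1, 4)}
emit = {"R": dict.fromkeys("ACGT", Fraction(1, 4)),
        "G": {"A": Fraction(1, 5), "C": Fraction(3, 10),
              "G": Fraction(3, 10), "T": Fraction(1, 5)}}


def path_probability(full_path, seq):
    """Probability of full_path emitting seq, where full_path[0] is an
    ASSUMED silent start state chosen with probability 1/2."""
    prob = Fraction(1, 2)
    # Each step: transition into the next state, then emit one character there.
    for prev, cur, sym in zip(full_path, full_path[1:], seq):
        prob *= trans[(prev, cur)] * emit[cur][sym]
    return prob


# Sum over both possible start states in front of the path RGGR.
terms = {p: path_probability(p, "ACGT") for p in ("RRGGR", "GRGGR")}
total = sum(terms.values())
print(terms["RRGGR"], terms["GRGGR"], total)  # 81/819200 27/819200 27/204800
```

    Under these assumptions the two terms add up to 27/204800, the same value obtained by simply putting a factor of 1/2 in front of the RGGR product, because (1/2)(3/4) + (1/2)(1/4) = 1/2.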