HMM Homework: Calculate Pr(S,π) with aRG=aGR=1/4, aRR=aGG=3/4

  • Thread starter: bowlbase
AI Thread Summary
The discussion revolves around calculating the probability Pr(S, π) for a Hidden Markov Model (HMM) with states "R" and "G" and a given sequence S = ACGT and path π = RGGR. The transition probabilities are defined, with aRG and aGR both at 1/4, while aRR and aGG are at 3/4. Emission probabilities vary by state, with each character having specific probabilities based on the state. Participants express uncertainty about the correct application of the formula and whether to include the initial state probabilities in their calculations. Clarification is sought on the correct method for combining probabilities and the implications of the initial state choice on the overall calculation.

Homework Statement


Consider an HMM with two possible states, “R” and “G” (for “regulatory” and “gene” sequences respectively). Each state emits one character, chosen from the alphabet {A,C,G,T}.

The transition probabilities of this HMM are:
aRG = aGR = 1/4
aRR = aGG = 3/4
The emission probabilities are:
eR(A) = eR(C) = eR(G) = eR(T) = 1/4
eG(A) = eG(T) = 2/10 and
eG(C) = eG(G) = 3/10

Assume that the initial state of the HMM is “R” or “G” with equal probabilities. Given a sequence S = ACGT and an HMM path π = RGGR, calculate the probability Pr(S, π) of the sequence and the path.

Homework Equations


$$P(S,\pi) = \prod_{i=1} a_{\pi_{i-1},\pi_i}\, e_{\pi_i}(x_i)$$
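
As a rough illustration of how the formula above combines the given numbers, here is a minimal Python sketch. It assumes that the i=1 transition term is replaced by the initial-state probability of 1/2, which is one common reading of the problem and not something the problem statement confirms. The names `a`, `e`, `init`, and `joint_probability` are made up for this example.

```python
from fractions import Fraction as F

# Transition probabilities a[(from_state, to_state)], from the problem statement.
a = {("R", "R"): F(3, 4), ("R", "G"): F(1, 4),
     ("G", "R"): F(1, 4), ("G", "G"): F(3, 4)}

# Emission probabilities e[state][character], from the problem statement.
e = {"R": {"A": F(1, 4), "C": F(1, 4), "G": F(1, 4), "T": F(1, 4)},
     "G": {"A": F(2, 10), "C": F(3, 10), "G": F(3, 10), "T": F(2, 10)}}

# ASSUMPTION: the first state is drawn with probability 1/2 each,
# and this factor stands in for the i=1 transition term.
init = {"R": F(1, 2), "G": F(1, 2)}


def joint_probability(seq, path):
    """Multiply one (transition * emission) factor per position."""
    assert len(seq) == len(path)
    p = init[path[0]] * e[path[0]][seq[0]]
    for i in range(1, len(seq)):
        p *= a[(path[i - 1], path[i])] * e[path[i]][seq[i]]
    return p


print(joint_probability("ACGT", "RGGR"))  # 27/204800
```

Exact fractions are used so the result can be compared directly with a hand calculation.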

The Attempt at a Solution



We discussed this equation in class but never actually used it or spent time describing how to wield it. I just don't see how the information I'm given comes together in that equation.

Thanks for any help.
 
I think that I have found a way to do this, but I want to make sure it is correct.

I believe that the equation is telling me to multiply two probabilities for each position (the probability of emitting that character of ACGT from state R or G, and the probability of the R/G transition) and then multiply all of those together. I'm just not sure about the order.

So, I have S=ACGT and pi = RGGR

ACGT
RGGR

The first probability is 1/2 for either R or G. Then A, being emitted in R, has probability 1/4. And then there is just a 1/4 probability that R->G for the next character.

I'm not sure if 1/2 should be included from the beginning or not because of how the last character works. But I have

(1/4*1/4)(3/10*3/4)(3/10*1/4)(1/4)

So, I've multiplied each emission probability with the probability of R/G swapping. Then multiplied all of those together. The last probability I did not multiply against anything (other than the series) because there is no following change. Though, I wonder if I should multiply by 1/2 since the first choice was probability 1/2 between R and G.
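
For reference, the product written above can be evaluated directly. The second expression assumes the factor of 1/2 for the initial state is included, which is exactly the open question in this post:

$$\left(\tfrac14\cdot\tfrac14\right)\left(\tfrac{3}{10}\cdot\tfrac34\right)\left(\tfrac{3}{10}\cdot\tfrac14\right)\left(\tfrac14\right) = \frac{27}{102400}, \qquad \tfrac12\cdot\frac{27}{102400} = \frac{27}{204800} \approx 1.32\times10^{-4}$$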

Is this the correct method?

Thanks.
 
It isn't entirely clear, but judging from the equation you are given, it looks like the RGGR path starts at the second state (i = 1), and the first emitted character comes just after entry to that state. (Consider the i=1 term in the product.) If so, you need to consider (and sum over) the sequences RRGGR and GRGGR.
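
The suggestion above can be checked numerically with a short sketch. It assumes a non-emitting state 0 that is R or G with probability 1/2 each, followed by the four emitting states RGGR; the helper name `path_probability` is made up for this example.

```python
from fractions import Fraction as F

# Transition and emission tables from the problem statement.
a = {("R", "R"): F(3, 4), ("R", "G"): F(1, 4),
     ("G", "R"): F(1, 4), ("G", "G"): F(3, 4)}
e = {"R": {"A": F(1, 4), "C": F(1, 4), "G": F(1, 4), "T": F(1, 4)},
     "G": {"A": F(2, 10), "C": F(3, 10), "G": F(3, 10), "T": F(2, 10)}}


def path_probability(seq, path, start):
    """Probability of `path` emitting `seq`, entered from a silent state `start`."""
    p = F(1, 2)  # ASSUMPTION: state 0 is R or G with equal probability
    prev = start
    for state, ch in zip(path, seq):
        p *= a[(prev, state)] * e[state][ch]
        prev = state
    return p


# Sum over the two extended paths RRGGR and GRGGR.
total = sum(path_probability("ACGT", "RGGR", s) for s in "RG")
print(total)  # 27/204800
```

Under this reading the two start states contribute (1/2)(3/4) + (1/2)(1/4) = 1/2 in front of the rest of the product, so the sum matches what one gets by simply including a factor of 1/2 for the initial state.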
 