Find a Splice Site in a DNA Sequence
A DNA sequence consists of base letters A, C, G, and T. Suppose there is a sequence that begins in an exon, contains a splice site, and ends in an intron. If the exons have a uniform base composition, the introns are deficient in C and G, and the splice site consensus nucleotide is G with probability 0.95, the frequency distributions are as follows.
The state machine has states for exon (1), splice (2), intron (3), and end (4), with the following transition probabilities between states.
The emissions are nucleotides A (1), C (2), G (3), T (4), or end (5).
Find the most probable sequence of hidden states (exon, splice, intron, or end) for the DNA sequence.
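As a rough illustration of this step, the following Python sketch runs the Viterbi algorithm on the four-state model described above. Only the uniform exon composition and the 0.95 probability of G at the splice site come from the text. Everything else is an assumption made for the sketch: the intron frequencies (0.4 for A and T, 0.1 for C and G), the 0.05 probability of A at the splice site, the 0.9/0.1 transition probabilities, the `$` character standing in for the end emission, the 26-base input `DNA`, and all identifier names.

```python
import math

# Hidden states, indexed 0..3, and emission symbols ('$' is the assumed end marker).
STATES = ["exon", "splice", "intron", "end"]
SYMBOLS = "ACGT$"

# Assumed transition probabilities: TRANS[i][j] = P(next state j | current state i).
TRANS = [
    [0.9, 0.1, 0.0, 0.0],  # exon   -> exon or splice
    [0.0, 0.0, 1.0, 0.0],  # splice -> intron
    [0.0, 0.0, 0.9, 0.1],  # intron -> intron or end
    [0.0, 0.0, 0.0, 1.0],  # end    -> end
]

# Emission probabilities: EMIT[i][k] = P(symbol k | state i).
# The intron row and the 0.05 for A at the splice site are assumed values.
EMIT = [
    [0.25, 0.25, 0.25, 0.25, 0.0],  # exon: uniform
    [0.05, 0.00, 0.95, 0.00, 0.0],  # splice: G with probability 0.95
    [0.40, 0.10, 0.10, 0.40, 0.0],  # intron: deficient in C and G
    [0.00, 0.00, 0.00, 0.00, 1.0],  # end: emits only the end marker
]

# Assumed illustrative input; it is not given in the text above.
DNA = "CTTCATGTGAAAGCAGACGTAAGTCA"


def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(seq):
    """Return (most probable state index path, its joint log probability)."""
    obs = [SYMBOLS.index(c) for c in seq]
    n = len(STATES)
    # The chain is assumed to start in the exon state with probability 1.
    score = [safe_log(1.0 if s == 0 else 0.0) + safe_log(EMIT[s][obs[0]])
             for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev = score
        ptr = [max(range(n), key=lambda r: prev[r] + safe_log(TRANS[r][s]))
               for s in range(n)]
        score = [prev[ptr[s]] + safe_log(TRANS[ptr[s]][s]) + safe_log(EMIT[s][o])
                 for s in range(n)]
        back.append(ptr)
    # Trace back from the best final state.
    last = max(range(n), key=lambda s: score[s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[last]


path, log_p = viterbi(DNA + "$")
print([STATES[s] for s in path])
print(log_p)
```

Under these assumed parameters, the decoded path stays in the exon state for the first 18 bases, places the splice state on the G at position 19, and labels the remaining 7 bases as intron before reaching the end state. Working in log space avoids numerical underflow, since the joint probability of a 26-base sequence is very small.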
Find the joint probability of the preceding state sequence and the DNA sequence.
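The joint probability of a state path and an emitted sequence is the product of the initial-state probability, each transition probability along the path, and each emission probability. The Python sketch below computes it in log space. As before, only the uniform exon composition and the 0.95 splice-site G frequency come from the text; the remaining probabilities, the `$` end marker, the example `SEQ` and `PATH`, and the single-letter state codes (`E`, `S`, `I`, `X`) are assumptions made for illustration.

```python
import math

# Assumed transition probabilities between exon (E), splice (S), intron (I), end (X).
TRANS = {
    "E": {"E": 0.9, "S": 0.1},
    "S": {"I": 1.0},
    "I": {"I": 0.9, "X": 0.1},
    "X": {"X": 1.0},
}

# Emission probabilities; the intron row and splice-site A value are assumed.
EMIT = {
    "E": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "S": {"A": 0.05, "G": 0.95},
    "I": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    "X": {"$": 1.0},  # '$' is the assumed end marker
}


def joint_log_probability(path, seq):
    """Log of P(path, seq), assuming the chain starts in the exon state."""
    if len(path) != len(seq):
        raise ValueError("path and sequence must have the same length")
    if path[0] != "E":
        return float("-inf")
    # Collect every factor of the product: first emission, then
    # alternating transition and emission probabilities.
    factors = [EMIT[path[0]].get(seq[0], 0.0)]
    for prev, cur, sym in zip(path, path[1:], seq[1:]):
        factors.append(TRANS[prev].get(cur, 0.0))
        factors.append(EMIT[cur].get(sym, 0.0))
    if any(p == 0.0 for p in factors):
        return float("-inf")  # the path cannot produce this sequence
    return sum(math.log(p) for p in factors)


# Assumed illustrative inputs: a 26-base sequence plus end marker, and a
# path with 18 exon states, one splice state, 7 intron states, and the end state.
SEQ = "CTTCATGTGAAAGCAGACGTAAGTCA" + "$"
PATH = "E" * 18 + "S" + "I" * 7 + "X"

log_p = joint_log_probability(PATH, SEQ)
print(log_p, math.exp(log_p))
```

With these assumed values the log probability comes out near -41.22, which corresponds to a joint probability on the order of 10^-18.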