Find a Splice Site in a DNA Sequence
A DNA sequence consists of base letters A, C, G, and T. Suppose there is a sequence that begins in an exon, contains a splice site, and ends in an intron. If the exons have a uniform base composition, the introns are deficient in C and G, and the splice site consensus nucleotide is G with probability 0.95, the frequency distributions are as follows.
In[1]:= | ![]() X |
![]() |
In[2]:= | ![]() X |
The state machine has states for exon (1), splice (2), intron (3), and end (4), with the following transition probabilities between states.
![]() |
In[3]:= | ![]() X |
The emissions are nucleotides A (1), C (2), G (3), T (4), or end (5).
In[4]:= | ![]() X |
In[5]:= | ![]() X |
Find the most probable nucleotide subsequence (exon, splice, intron, or end).
In[6]:= | ![]() X |
Out[6]= | ![]() |
Find the joint probability of the preceding nucleotide sequence and the DNA sequence.
In[7]:= | ![]() X |
Out[7]= | ![]() |