Empirical Positioning of the Elements
The structure of the discrete state-space model (DSM) used to model
3'-processing control sequences in yeast. All state-to-state
transitions not explicitly labeled have a probability of 1.0.
The hexagonal elements are background elements that can take
on any length in the given range with equal probability. The
functional elements e1-e4 are hexamers, with individual nucleotide
frequencies determined through analysis with the Gibbs Sampler.
Probabilities p1-p4 were optimized empirically in analysis of known
processing sites. Cleavage and polyadenylation occur at the center
of the c element. Nucleotide probabilities for the c element were
obtained from the 1,352 training sequences.
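As a rough illustration of the two kinds of elements described above, the sketch below scores a hexamer against per-position nucleotide frequencies and draws a background length uniformly from a range. The frequency table, the function names, and the example range are hypothetical placeholders, not the values or code used by this site.

```python
import random

# Hypothetical per-position nucleotide frequencies for one hexamer element.
# Illustrative numbers only; each column sums to 1.0.
ELEMENT_FREQS = [
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
]


def hexamer_probability(hexamer, freqs):
    """Probability of emitting `hexamer`, treating positions as independent."""
    if len(hexamer) != len(freqs):
        raise ValueError("hexamer length must match the number of columns")
    prob = 1.0
    for base, column in zip(hexamer, freqs):
        prob *= column[base]
    return prob


def sample_background_length(low, high, rng):
    """Draw a background length; every length in [low, high] is equally likely."""
    return rng.randint(low, high)


if __name__ == "__main__":
    rng = random.Random(0)
    print(hexamer_probability("TATATA", ELEMENT_FREQS))
    print(sample_background_length(5, 20, rng))  # range chosen arbitrarily
```

Multiplying the column frequencies assumes the six positions are independent, which is the usual reading of "individual nucleotide frequencies" but is not stated explicitly on this page.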
The positioning of the elements was determined by measuring, for
each hexamer, the distribution of its positions in the region around
each of 1,352 putative 3'-processing sites. Hexamers with similar
positional profiles were then clustered using a k-means algorithm.
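The procedure just described can be sketched as follows: build a normalized positional histogram for every hexamer, then group the histograms with a plain k-means loop. This is a minimal sketch, assuming the input sequences are equal-length windows already aligned on the processing site; the function names, the squared-Euclidean distance, and the random initialization are assumptions, not details taken from the original analysis.

```python
import random
from collections import defaultdict


def positional_profiles(sequences, k=6):
    """Map each k-mer to a normalized histogram of its start positions.

    Assumes all sequences have the same length and are aligned on the
    processing site, so a given index is the same offset in every window.
    """
    n_pos = len(sequences[0]) - k + 1
    counts = defaultdict(lambda: [0] * n_pos)
    for seq in sequences:
        for start in range(n_pos):
            counts[seq[start:start + k]][start] += 1
    return {kmer: [c / sum(hist) for c in hist] for kmer, hist in counts.items()}


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, n_clusters, n_iter=50, seed=0):
    """Basic k-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, n_clusters)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every profile.
        new_labels = [
            min(range(n_clusters), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments unchanged, so the centroids are stable
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(n_clusters):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

A typical use would be `profiles = positional_profiles(windows)` followed by `kmeans(list(profiles.values()), n_clusters=4)`; the choice of four clusters here simply mirrors the four elements e1-e4 and is not a documented parameter.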
The probabilities shown in the model at the top were empirically
optimized to p1 = 0.8, p2 = 0.65, p3 = 0.5, and p4 = 0.65. The figure
below shows the resulting probability of occurrence for each of the
elements e1-e4 in our model.
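To give a feel for what these values imply, the sketch below computes how many of the four elements a modeled site is expected to contain, and how an element's occurrence probability spreads over positions when the adjacent background length is uniform. It assumes that each p value is the probability that the corresponding element is included and that inclusions are independent; the page does not state the exact topology, so both points, along with the function names and the example spacer range, are assumptions.

```python
# Optimized probabilities quoted in the text.
P = {"e1": 0.8, "e2": 0.65, "e3": 0.5, "e4": 0.65}


def count_distribution(probs):
    """dist[k] = probability that exactly k elements are present.

    Assumes each element is included independently with its own probability.
    """
    dist = [1.0]
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)   # element skipped
            new[k + 1] += mass * p       # element included
        dist = new
    return dist


def position_profile(p, spacer_low, spacer_high):
    """Occurrence probability of one element at each spacer distance.

    The element is included with probability p, and the background spacer
    separating it from the next element is uniform on [spacer_low, spacer_high].
    """
    n = spacer_high - spacer_low + 1
    return {d: p / n for d in range(spacer_low, spacer_high + 1)}


if __name__ == "__main__":
    for k, prob in enumerate(count_distribution(P.values())):
        print(k, round(prob, 5))
    # Hypothetical spacer range, for illustration only.
    print(position_profile(P["e1"], 10, 14))
```

Under these assumptions the per-position values of one element always sum to its inclusion probability, so a wider background range flattens the curve without changing its total mass.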