List of Top Sites
Following the overview plot is an ordered list of the most
probable sites for 3'-end processing. As shown below for CYC1,
the list gives the position, e-value, and DSM score of each site.
The e-value is the expected number of sites with this score or
higher given the length of the query sequence. Only sites with
DSM >= 3.0 are listed; this cutoff corresponds roughly to a
per-position probability of 0.008 of occurrence in random sequence.
That probability was obtained empirically by analyzing
2 x 10^5 nucleotides of sequence generated with 2nd-order
statistics (preserving nucleotide, di-nucleotide, and
tri-nucleotide frequencies) from yeast transcripts (including
5'UTR, CDS, and 3'UTR). The e-value is obtained by multiplying
the probability of occurrence by the length of the query sequence.
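
The relationship between the probability of occurrence and the e-value can be sketched in a few lines of Python. This is a minimal illustration, not the predictor's own code: the function names `tail_probability` and `e_value`, and the 750-nucleotide query length, are assumptions made for the example. Only the probability of 0.008 at DSM >= 3.0 comes from the text above.

```python
def tail_probability(random_scores, threshold):
    """Fraction of scored positions in random sequence at or above threshold.

    Assumed form of the empirical estimate: count the positions that reach
    the threshold and divide by the number of positions scored.
    """
    if not random_scores:
        raise ValueError("need at least one score")
    hits = sum(1 for score in random_scores if score >= threshold)
    return hits / len(random_scores)


def e_value(probability, query_length):
    """Expected number of sites at or above a score in a query of this length."""
    return probability * query_length


# 0.008 is the probability quoted for the DSM >= 3.0 cutoff; the query
# length of 750 nucleotides is hypothetical and chosen only for illustration.
expected_sites = e_value(0.008, 750)
print(round(expected_sites, 3))
```

Under these assumed inputs, a site that only just clears the cutoff is expected to turn up several times by chance in a query of that length, which is why the e-value is reported alongside the DSM score.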
Top Sites (for CYC1)
| position | e-val | DSM |
| 554 | 0.09826344 | 4.497124 |
| 559 | 1.266864 | 3.659934 |
| 549 | 4.144594 | 3.197224 |
| 404 | 4.364864 | 3.175544 |
| 402 | 4.993354 | 3.118584 |
| 407 | 5.988874 | 3.040034 |
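
As a quick consistency check, the CYC1 rows can be tested against the description of the list: every site should have DSM >= 3.0, and the rows should run from highest to lowest DSM. The sketch below uses the values from the table; the names `TOP_SITES`, `DSM_CUTOFF`, and `top_sites` are illustrative only and are not part of the tool.

```python
# (position, e-value, DSM) rows copied from the CYC1 table.
TOP_SITES = [
    (554, 0.09826344, 4.497124),
    (559, 1.266864, 3.659934),
    (549, 4.144594, 3.197224),
    (404, 4.364864, 3.175544),
    (402, 4.993354, 3.118584),
    (407, 5.988874, 3.040034),
]

DSM_CUTOFF = 3.0  # cutoff stated in the text


def top_sites(sites, cutoff=DSM_CUTOFF):
    """Keep sites at or above the cutoff, highest DSM score first."""
    kept = [site for site in sites if site[2] >= cutoff]
    return sorted(kept, key=lambda site: site[2], reverse=True)


# The table is already filtered and ordered, so it should come back unchanged.
print(top_sites(TOP_SITES) == TOP_SITES)
```

Because a higher DSM score corresponds to a lower probability of occurrence in random sequence, ordering by decreasing DSM is the same as ordering by increasing e-value for a given query.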