Note: The prediction tools have returned! They can be accessed using the menu at right.

(30 March 2005) The old code was not recoverable, so this is new code, and the model has changed somewhat because of differences in the exact implementation. Qualitatively, at least, the predictions are largely unchanged; however, there will undoubtedly be some differences between the old predictions and the new. One benefit is that the specificity of the new predictions appears to be significantly better. I'm hoping to improve the sophistication of the e-values, but the tools should be usable as is.


As you may notice, these pages are undergoing some changes, so links may occasionally not work. If you find broken links, please send a message to me at jhgraber@jax.org. The existing output for known genes of interest will be converted to the new model in the near future, but not today!

Citing this work

If you use these predictions, please cite "Probabilistic prediction of S. cerevisiae mRNA 3'-processing sites", JH Graber, GD McAllister, and TF Smith, Nucleic Acids Research 30(8):1851-8, 15 April 2002.

Available Analysis

Analysis of 429 pairs of overlapping ORFs
Assessment of putatively spurious predictions

EST sequence data
(FASTA formatted; coding sequence replaced with Ns; each sequence extends from 110 bases upstream to 50 bases downstream of the putative cleavage site)
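
For anyone working with the EST file in a script, here is a minimal Python sketch of reading FASTA records and measuring how much of each sequence is masked with Ns. The function names (read_fasta, masked_fraction) and the sample record are hypothetical and made up for illustration; they are not taken from the actual file.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def masked_fraction(sequence):
    """Return the fraction of positions that are N (masked coding sequence)."""
    if not sequence:
        return 0.0
    return sequence.upper().count("N") / len(sequence)


# Made-up record for illustration only; real records come from the EST file.
sample = ">est_example_1\nNNNNNNNNNNTAAATATATGCA\nTTTGCAAAAA\n"
records = list(read_fasta(io.StringIO(sample)))

for name, seq in records:
    print(name, len(seq), masked_fraction(seq))
```

To read the downloaded file instead of the sample string, pass an open file handle to read_fasta in place of the io.StringIO object.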

Most likely 3'UTR for all predicted yeast genes
(tab-delimited data: each line contains the ORF identifier, the position of the maximum DSM/HMM score in the 500 nt following the stop codon, and the value of that maximum score) Note: this is, of course, only the most likely site of 3'-end processing, based on our methodology. Many genes have multiple sites of nearly equal strength. Also, we will probably miss about 3% of the genes, namely those with UTRs extending beyond our arbitrary cutoff of 500 nt.
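
The three-column layout described above can be read with a few lines of Python. This is only a sketch: it assumes the file has no header row and that the second column is an integer and the third a decimal number. The names UtrPrediction and parse_predictions are hypothetical, and the sample values are invented rather than real predictions.

```python
import csv
import io
from typing import NamedTuple


class UtrPrediction(NamedTuple):
    orf_id: str    # ORF identifier
    position: int  # position of the maximum DSM/HMM score after the stop codon
    score: float   # value of the maximum DSM/HMM score


def parse_predictions(handle):
    """Yield one UtrPrediction per line that has at least three columns."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        yield UtrPrediction(row[0].strip(), int(row[1]), float(row[2]))


# Invented values for illustration only.
sample = "YAL001C\t123\t4.56\nYAL002W\t87\t3.21\n"
predictions = list(parse_predictions(io.StringIO(sample)))

# Example use: find the ORF with the highest maximum score.
best = max(predictions, key=lambda p: p.score)
print(best.orf_id, best.position, best.score)
```

Because only the single strongest site per gene is listed, a script like this cannot recover the alternative sites of nearly equal strength mentioned in the note.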

Sequence of most likely 3'UTR for all predicted yeast genes
(fasta formatted sequences of the UTRs implied by the position of the most likely site of 3'-end processing, based on our methodology.) Same caveats apply as before.

Sequence of all predicted genes, ORF + most likely 3'UTR
(Warning: file size > 5 MB.) Same caveats apply as before.