Our interlinked databases now hold over 36 gigabytes of data. Because of this,
we are unable to post the full contents of the database tables here. However,
we are posting organism-wide sequence data, which is available for public use.
Please cite PACdb if you use the data in your research.
Thanks for your interest and feel free to e-mail us at

if you have any questions or requests.
* The 3'-UTR sequence is the single longest sequence for each gene whose terminating putative cleavage
site is of "High" or "Very High" confidence and has been clustered with any adjacent putative cleavage
sites using an organism-specific distance threshold (see PACdb paper).
** The flanking sequence is given for any site that is of "High" or "Very High" confidence and has been
clustered with any adjacent putative cleavage sites using an organism-specific distance threshold
(see PACdb paper).
FASTA-formatted sequences from PACdb use a "rich defline" that communicates information related to the sequence:
genomic information, 3'-processing information, and gene information. Any of the fields below may be absent
for a given sequence. When a field's value is absent, the field's abbreviation appears in its place
instead. Here are the fields:
Genomic Fields:
genome sequence id (gsid) : could be a chromosome or contig
genome sequence start (gsi)
genome sequence stop (gsf)
3'-Processing Fields:
3'-processing coordinate (pac)
multiplicity, or the number of sequences (ESTs) giving this site (trc)
Gene Fields:
probable gene id (exid)
gene CDS start (tri)
gene CDS stop (trf)
Example:
>1:5695:5895:5895:trc:At1g01010.1:tri:trf
TTCTTTGCTCTGTTTTCTCGCTCCGGAAAAGTTTGAAGTTATATTTTATT
AGTATGTAAAGAAGAGAAAAAGGGGGAAAGAAGAGAGAAGAAAAATGCAG
AAAATCATATATATGAATTGGAAAAAAGTATATGTAATAATAATTAGTGC
ATCGTTTTGTGGTGTAGTTTATATAAATAAAGTGATATATAGTCTTGTAT
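
As an illustration of the defline layout described above, here is a minimal Python sketch that splits a rich defline into named fields. It is not part of PACdb. The function name parse_defline is hypothetical, and the sketch assumes (based only on the single example above) that the fields are colon-separated, that they appear in the order listed, that the genome sequence id itself contains no colon, and that coordinates and multiplicity are integers. An absent field, shown in the defline by its abbreviation, is returned as None.

```python
# Field abbreviations in the order they appear in the example defline.
# ASSUMPTION: this order and the ":" separator are inferred from the one
# example shown on this page, not from a formal specification.
FIELDS = ("gsid", "gsi", "gsf", "pac", "trc", "exid", "tri", "trf")

# ASSUMPTION: coordinates and multiplicity are plain integers.
INT_FIELDS = {"gsi", "gsf", "pac", "trc", "tri", "trf"}


def parse_defline(defline):
    """Split a PACdb-style rich defline into a dict keyed by abbreviation.

    A field whose value is missing is written in the defline as its own
    abbreviation (for example "trc"); such fields are returned as None.
    """
    parts = defline.strip().lstrip(">").split(":")
    if len(parts) != len(FIELDS):
        raise ValueError(
            "expected %d fields, found %d" % (len(FIELDS), len(parts))
        )

    record = {}
    for name, value in zip(FIELDS, parts):
        if value == name:
            # The abbreviation stands in for an absent value.
            record[name] = None
        elif name in INT_FIELDS:
            record[name] = int(value)
        else:
            record[name] = value
    return record


if __name__ == "__main__":
    print(parse_defline(">1:5695:5895:5895:trc:At1g01010.1:tri:trf"))
```

For the example defline, this yields gsid "1", gsi 5695, gsf 5895, pac 5895, and exid "At1g01010.1", with trc, tri, and trf reported as absent.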