NMF analysis pipeline output files:
All output filenames begin with the prefix set by the -o option on the command line. The files generated, in order of analysis step, are:
prefix.stat : mono- and di-nucleotide counts and frequencies from input file
2: WindowCount (assuming w = 3, k = 4):
prefix.w3k4.s2.counts.txt: The critical file for further analysis. A tab-delimited text file containing the positional word count (PWC) matrix, in which rows represent k-mers and columns hold the counts within each aligned window. The first row gives the start position of each window as column headers, and the first column lists the k-mers (ranked by decreasing s-squared).
Additional files that can be ignored for NMF:
prefix.w3k4.s2.txt: the actual s^2 values (similar to chi-squared) that rank the k-mers for output in the .s2.counts.txt file.
prefix.sw3k4.counts.txt: the transformed PWC matrix after smoothing and addition of pseudocounts
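Since the .s2.counts.txt file is the input to later steps, it can help to check that it parses as expected. The following is a minimal Python sketch, not part of the pipeline: the function name read_pwc and the example data are hypothetical, and the code assumes only the layout described above (tab-delimited, one header row of window start positions, k-mers in the first column). It tolerates a header row with or without a label over the k-mer column.

```python
import csv
import io

def read_pwc(handle):
    """Parse a tab-delimited PWC matrix (hypothetical helper).

    Returns (positions, kmers, counts), where counts[i][j] is the count
    of kmers[i] in the window starting at positions[j].
    """
    rows = [r for r in csv.reader(handle, delimiter="\t") if r]
    if len(rows) < 2:
        raise ValueError("expected a header row and at least one k-mer row")
    header, body = rows[0], rows[1:]
    # Number of windows is taken from the first data row (minus the k-mer cell),
    # so a header with or without a label over the k-mer column both work.
    n_windows = len(body[0]) - 1
    positions = header[len(header) - n_windows:]
    kmers = [r[0] for r in body]
    counts = [[float(x) for x in r[1:]] for r in body]
    return positions, kmers, counts

# Made-up example data, for illustration only.
example = "kmer\t-10\t-7\t-4\nACGT\t5\t0\t2\nTTTT\t1\t3\t1\n"
positions, kmers, counts = read_pwc(io.StringIO(example))
```

For a real run, pass an open file handle instead of the StringIO object, e.g. open("prefix.w3k4.s2.counts.txt", newline="").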
4: nnmf (assuming r = 8):
prefix.sw3k4r8.weights.txt: The positioning matrix, giving the probability of observing a given motif at a specific position.
prefix.sw3k4r8.bases.txt: The sequence content, giving the probability of observing each k-mer as a part of each motif.
prefix.sw3k4r8.prog.txt: A file that simply tracks the progress of the analysis (can be deleted after the run is complete).
prefix.sw3k4r8.nweights.txt: Normalized version of the NMF weights, scaled so that all vectors integrate to the same total.
prefix.sw3k4r8.rwords.txt: A text table that separates each column of the bases file and sorts the k-mers by decreasing contribution to the motif. The final output has two columns per motif, k-mer and weight, sorted by decreasing weight.
prefix.sw3k4r8.n.png: Line plot of positioning for normalized nmf weights
prefix.sw3k4r8.w.png: Line plot of positioning for raw nmf weights
prefix.sw3k4r8.motifs: Text listing of the matrices describing the MCMC-derived motifs from the NMF bases
prefix.sw3k4r8.models.txt: Reformatted file with the matrices for the motifs
prefix.sw3k4r8.A.logoEx.txt: random sequences for logo for the first motif
prefix.sw3k4r8.B.logoEx.txt: random sequences for logo for the second motif
prefix.sw3k4r8.A.png: sequence logo image for the first motif
prefix.sw3k4r8.B.png: sequence logo image for the second motif
prefix.sw3k4r8.html: The web page that displays the results.
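
The relationship between the raw NMF output and the .nweights.txt and .rwords.txt files can be illustrated with a short Python sketch. This is not the pipeline's own code. It assumes that "common integration" means each motif's positional weight vector is scaled to sum to 1, and that the bases matrix has one row per k-mer and one column per motif; the function names normalize_rows and rank_words and the example values are hypothetical.

```python
def normalize_rows(weights):
    """Scale each motif's weight vector so it sums to 1 (assumed normalization)."""
    normalized = []
    for vec in weights:
        total = sum(vec)
        # Leave an all-zero vector unchanged to avoid dividing by zero.
        normalized.append([v / total for v in vec] if total > 0 else list(vec))
    return normalized

def rank_words(kmers, bases):
    """For each motif (column of bases), list (k-mer, weight) pairs
    sorted by decreasing weight, as in the two-columns-per-motif table."""
    n_motifs = len(bases[0])
    ranked = []
    for j in range(n_motifs):
        column = [(kmers[i], bases[i][j]) for i in range(len(kmers))]
        ranked.append(sorted(column, key=lambda kw: kw[1], reverse=True))
    return ranked

# Made-up example: two motifs over two positions, three k-mers.
nweights = normalize_rows([[1.0, 3.0], [2.0, 2.0]])
rwords = rank_words(["AA", "AC", "AG"],
                    [[0.1, 0.7], [0.6, 0.2], [0.3, 0.1]])
```

Here nweights holds one normalized vector per motif, and rwords[0] lists the k-mers of the first motif from strongest to weakest contribution.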