REDUCE is an acronym that stands for Regulatory Element Detection Using Correlation with Expression. Based on a simple model for transcriptional regulation by independently acting transcription factors, REDUCE makes it possible to find regulatory elements based on a single microarray experiment. MotifREDUCE in the REDUCE Suite is a more robust and efficient reimplementation of the original REDUCE algorithm by Bussemaker et al. (2001).
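As a rough illustration of the idea behind the name, the sketch below counts occurrences of a candidate motif in each upstream sequence and correlates those counts with the measured expression values. It is a simplified stand-in written for this page, not the MotifREDUCE implementation: the function names (count_motif, pearson), the gene names, and the toy data are all hypothetical.

```python
from math import sqrt


def count_motif(seq, motif):
    """Count occurrences of motif in seq (overlaps allowed, given strand only)."""
    seq, motif = seq.upper(), motif.upper()
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


# Toy data, invented for illustration only.
sequences = {
    "geneA": "TTACGCGTAAACGCGT",
    "geneB": "ACGCGTTTTT",
    "geneC": "TTTTTTTTTT",
    "geneD": "GGGGCCCCAA",
}
expression = {"geneA": 2.1, "geneB": 1.0, "geneC": -0.2, "geneD": 0.1}

genes = sorted(sequences)
counts = [count_motif(sequences[g], "ACGCGT") for g in genes]
values = [expression[g] for g in genes]
r = pearson(counts, values)
print(counts, round(r, 3))
```

A motif whose counts correlate strongly with expression across genes is a candidate regulatory element; the real program additionally assesses significance (see the -p_value option) and searches over many motifs.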
MotifREDUCE [options] -sequence=seqfile -meas=measfile
Required parameters:
-sequence=seqfile --- sequence file in FASTA format
-meas=measfile --- measurement (expression/binding) file in tab-delimited format
Optional parameters:
[-topo_list=topofile] --- name of topology file (up_to_octamers)
[-topo=topology] --- single topology pattern, e.g., X3--X4
[-dicfile=file] --- list of motifs to check against. IUPAC wild cards
allowed; no length limit
[-ntop=integer] --- number of top seed motifs to print out (10)
[-iupac_pos=integer] --- number of positions to check for IUPAC degeneracy (0)
[-iupac_sym=string] --- IUPAC symbols to check against ('KMRSWYBDHVN')
[-output=dir_name] --- path to the output directory (./)
[-p_value=float] --- threshold to stop looking for new motifs (0.001)
[-max_motif=integer] --- maximum # of motifs to search (20)
[-strand=integer] --- 1 | +1 | F | L for the leading strand;
2 | +2 | B for both strands;
-1 | R | C for the reverse-complement strand;
0 | A | D for auto-detection (check 1 and 2)
[-runlog=[stderr|stdout|file]]
--- direct running diagnostics message to stderr,
stdout or a specific file (stderr)
[-help] --- print out this help message
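To make the -strand choices concrete, here is a small sketch of what counting a motif on the leading strand, on the reverse complement, or on both strands could look like. The mapping of option values to strands follows the help text above; the helper names and the decision to simply add the two strand counts are assumptions made for illustration, and MotifREDUCE may handle palindromic motifs or auto-detection differently.

```python
# Translation table for complementing DNA bases (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_on_strand(seq, motif):
    """Count occurrences of motif in seq as given (overlaps allowed)."""
    seq, motif = seq.upper(), motif.upper()
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def count_motif(seq, motif, strand="1"):
    """Count motif occurrences for a -strand style setting (hypothetical helper)."""
    forward = count_on_strand(seq, motif)
    reverse = count_on_strand(reverse_complement(seq), motif)
    if strand in ("1", "+1", "F", "L"):   # leading strand only
        return forward
    if strand in ("-1", "R", "C"):        # reverse complement only
        return reverse
    if strand in ("2", "+2", "B"):        # both strands (assumed: sum of counts)
        return forward + reverse
    # Auto-detection (0 | A | D) is not sketched here.
    raise ValueError("unsupported strand setting: %r" % (strand,))


print(count_motif("AAACCC", "GGG", "1"),
      count_motif("AAACCC", "GGG", "R"),
      count_motif("AAACCC", "GGG", "B"))
```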
Usage:
# use topology file (up_to_heptamers)
mkdir -p results
MotifREDUCE \
-meas=$REDUCE_SUITE/examples/MotifREDUCE/yeast_sample.csv \
-sequence=$REDUCE_SUITE/examples/MotifREDUCE/genome5pns600.fasta \
-topo_list=$REDUCE_SUITE/examples/MotifREDUCE/up_to_heptamers \
-o=results
HTMLSummary -c -o=results
Notes:
- The command-line interface of MotifREDUCE is identical to that of MatrixREDUCE, but MotifREDUCE skips the Levenberg-Marquardt non-linear least-squares optimization of the weight matrices (Ws). The result is a list of motifs, each expressed in matrix form with 1s and 0s.
- On a contemporary laptop computer, the above example takes ~10 s and finds 10 significant motifs.
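The "matrix form with 1s and 0s" mentioned in the first note can be pictured with a short sketch that turns a motif string into an indicator matrix, one row per position and one column per base. The row/column layout, the A, C, G, T column order, and the function name motif_to_matrix are assumptions for illustration; the files written by MotifREDUCE may be laid out differently. The degenerate symbols are the standard IUPAC nucleotide codes listed under -iupac_sym.

```python
# Standard IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "K": "GT", "M": "AC", "R": "AG", "S": "CG", "W": "AT", "Y": "CT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

BASES = "ACGT"  # assumed column order


def motif_to_matrix(motif):
    """Return a list of rows; each row has a 1 for every base allowed at that position."""
    matrix = []
    for symbol in motif.upper():
        allowed = IUPAC[symbol]  # raises KeyError for non-IUPAC symbols
        matrix.append([1 if base in allowed else 0 for base in BASES])
    return matrix


for row in motif_to_matrix("ACGYN"):
    print(row)
```

In this picture a plain motif such as ACG yields exactly one 1 per row, while a degenerate position such as Y (C or T) yields two.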