Author Topic: MotifREDUCE  (Read 90552 times)

xiangjun

MotifREDUCE
« on: September 27, 2016, 02:39:42 pm »
REDUCE stands for Regulatory Element Detection Using Correlation with Expression. Based on a simple model of transcriptional regulation by independently acting transcription factors, REDUCE makes it possible to find regulatory elements from a single microarray experiment. MotifREDUCE in the REDUCE Suite is a more robust and efficient reimplementation of the "original REDUCE algorithm" by Bussemaker et al. (2001).
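
For orientation, the linear model behind REDUCE can be written roughly as below. The notation is illustrative and not taken from this post: A_g is the measured log-ratio for gene g, N_{\mu g} is the number of occurrences of motif \mu in the sequence associated with gene g, F_\mu is the fitted contribution of that motif, and C is a baseline shared by all genes. Motifs are assumed to act independently, so their contributions simply add up.

```latex
A_g = C + \sum_{\mu} F_\mu \, N_{\mu g}
```

Under this reading, a motif is interesting when its counts N_{\mu g} correlate significantly with A_g across genes, and the search stops once no remaining motif passes the p-value threshold.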

MotifREDUCE [options] -sequence=seqfile -meas=measfile

  Required parameters:
    -sequence=seqfile     --- sequence file in FASTA format
    -meas=measfile        --- measurement (expression/binding) file in tab-delimited format

  Optional parameters:
    [-topo_list=topofile]  --- name of topology file (up_to_octamers)
    [-topo=topology]       --- single topology pattern, e.g., X3--X4
    [-dicfile=file]        --- list of motifs to check against. IUPAC wild cards
                                   allowed; no length limit
    [-ntop=integer]        --- number of top seed motifs to print out (10)
    [-iupac_pos=integer]   --- number of positions to check for IUPAC degeneracy (0)
    [-iupac_sym=string]    --- IUPAC symbols to check against ('KMRSWYBDHVN')

    [-output=dir_name]     --- path to the output directory (./)
    [-p_value=float]       --- threshold to stop looking for new motifs (0.001)
    [-max_motif=integer]   --- maximum # of motifs to search (20)
    [-strand=integer]      ---  1 |+1 |F | L for leading strand;
                                2 |+2 |B     for both strands;
                               -1 | R |C     for reverse complementary;
                                0 | A |D     auto-detection (check 1 and 2)

    [-runlog=[stderr|stdout|file]]
                           --- direct running diagnostics message to stderr,
                                   stdout or a specific file (stderr)
    [-help]                --- print out this help message

  Usage:
    # use topology file (up_to_heptamers)
    mkdir -p results
    MotifREDUCE \
        -meas=$REDUCE_SUITE/examples/MotifREDUCE/yeast_sample.csv \
        -sequence=$REDUCE_SUITE/examples/MotifREDUCE/genome5pns600.fasta \
        -topo_list=$REDUCE_SUITE/examples/MotifREDUCE/up_to_heptamers \
        -o=results
    HTMLSummary -c -o=results
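
Before a run, it can help to sanity-check the two required inputs. The sketch below is not part of the REDUCE Suite; the function names are made up for illustration. It assumes that the measurement file has one header row and that its first column holds IDs matching the first word of each FASTA header, neither of which is stated in the help message above.

```shell
#!/bin/sh
# Pre-flight checks for MotifREDUCE inputs (illustrative sketch, not part
# of the REDUCE Suite). Assumptions: the measurement file has one header
# row, and column 1 holds IDs matching the first word of a FASTA header.

# Print the number of FASTA records (lines starting with '>').
count_fasta_records() {
    grep -c '^>' "$1"
}

# Succeed only if every non-empty line has the same number of
# tab-separated fields, and that number is at least two.
check_tab_delimited() {
    awk -F'\t' '
        NF == 0 { next }
        !n { n = NF }
        NF != n || NF < 2 { bad = 1 }
        END { if (bad || !n) exit 1 }
    ' "$1"
}

# Print measurement IDs (column 1, header row skipped) that have no
# matching record in the FASTA file.
missing_ids() {
    awk -F'\t' '
        NR == FNR {
            if (/^>/) { sub(/^>/, ""); split($0, w, /[ \t]/); seen[w[1]] = 1 }
            next
        }
        FNR > 1 && NF > 0 && !($1 in seen) { print $1 }
    ' "$1" "$2"
}
```

For example, `missing_ids genome5pns600.fasta yeast_sample.csv` would list any measured genes without a sequence, and `check_tab_delimited yeast_sample.csv || echo "not tab-delimited"` flags a file with ragged or single-column rows.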

Notes:
  • The command-line interface of MotifREDUCE is identical to that of MatrixREDUCE, but MotifREDUCE skips the Levenberg-Marquardt non-linear least-squares optimization of the weight matrices (Ws). The result is a list of motifs, each expressed in matrix form with 1s and 0s.
  • On a contemporary laptop computer, the above example takes ~10 s and finds 10 significant motifs.
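
The stopping behaviour can be explored by rerunning the example with a stricter threshold. The sketch below uses only flags listed in the help message; the values 0.0001 and 5, the output directory name, and the `run_strict` function itself are arbitrary choices for illustration. If MotifREDUCE is not on the PATH, it only prints the command it would have run.

```shell
#!/bin/sh
# Sketch: rerun the example on both strands with a stricter p-value cutoff
# and a lower motif cap. Only flags from the help message are used; the
# chosen values and the function name are illustrative assumptions.

run_strict() {
    out=${1:-results_strict}                      # output directory (assumed name)
    ex="${REDUCE_SUITE:-.}/examples/MotifREDUCE"  # example data from the Usage section
    set -- MotifREDUCE \
        "-meas=$ex/yeast_sample.csv" \
        "-sequence=$ex/genome5pns600.fasta" \
        "-topo_list=$ex/up_to_heptamers" \
        -strand=2 -p_value=0.0001 -max_motif=5 \
        "-runlog=$out/run.log" "-output=$out"
    if command -v MotifREDUCE >/dev/null 2>&1; then
        mkdir -p "$out" && "$@"
    else
        echo "dry run: $*"                        # tool not installed: show the command only
    fi
}
```

Calling `run_strict` with no argument writes to `results_strict`; `run_strict my_dir` picks another directory. Comparing the motif list with the default run shows how many motifs survive the tighter cutoff.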
« Last Edit: September 27, 2016, 02:44:11 pm by xiangjun »

 

Created and maintained by Dr. Xiang-Jun Lu [律祥俊]. See also http://forum.x3dna.org and http://x3dna.org