Recent Posts

Documentation / Summary of the REDUCE Suite v2.2 programs
« Last post by xiangjun on July 29, 2016, 04:08:51 pm »
The REDUCE Suite v2.2 contains a total of 12 programs, as outlined below. The software is distributed with full source code in ANSI C. For the benefit of users, precompiled binaries for the most common Linux, Mac OS X, and Windows operating systems are also provided. A command-line help message is available for each program via the -h option (e.g., LogoGenerator -h); it also includes sample usages to get you started. If you have any questions about using the Suite, please do not hesitate to post them on the Forum.

Motif discovery and model building:

  • MotifREDUCE — An algorithm that builds a motif-based multivariate linear model. REDUCE is an acronym for Regulatory Element Detection Using Correlation with Expression. Based on a simple model for transcriptional regulation by independently acting transcription factors (Bussemaker et al., 2001), REDUCE makes it possible to discover regulatory motifs from a single microarray experiment. MotifREDUCE is a robust and efficient reimplementation of the original REDUCE algorithm. Required inputs are (i) a genome-wide set of measurements (mRNA expression log-ratios or ChIP fold-enrichments) and (ii) a nucleotide sequence associated with each measurement (e.g., upstream promoter sequence). Outputs are (i) a set of cis-regulatory oligonucleotide motifs and (ii) the corresponding regression coefficients.
  • MatrixREDUCE — A more sophisticated algorithm that builds a multivariate linear model based on weight matrices (Foat et al., 2005, 2006). Required inputs are the same as for MotifREDUCE: (i) a genome-wide set of measurements (mRNA expression log-ratios or ChIP fold-enrichments) and (ii) a nucleotide sequence associated with each measurement (e.g., upstream promoter sequence). Outputs include (i) the binding specificity, in the form of a position-specific affinity matrix (PSAM), and (ii) the condition-specific concentration/activity for each of a set of trans-acting factors (TF).
  • OptimizePSAM — Fits PSAM parameters and coefficients for a single-TF model. MatrixREDUCE makes iterative calls to this program to build a multivariate model.
  • Transfactivity — Fits a multivariate linear model to one or more genome-wide sets of measurements. In contrast to MotifREDUCE/MatrixREDUCE, motifs/PSAMs are not inferred from the data but are instead provided as inputs. This is useful for inferring changes in the (hidden) regulatory activity of one or more TFs of known binding specificity. Transfactivity is a contraction of "trans-factor" and "activity".
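
To make the idea of a motif-based linear model concrete, here is a minimal sketch in Python of the single-motif case: count occurrences of one motif in each sequence, then fit measurement = intercept + slope × count by ordinary least squares. It is an illustration only, not the Suite's implementation. The function names, the toy sequences, and the choice to count only forward-strand matches are all assumptions made for this example; the actual programs build multivariate models and offer many more options.

```python
def count_motif(sequence, motif):
    """Count (possibly overlapping) exact matches of motif on the given strand."""
    k = len(motif)
    return sum(1 for i in range(len(sequence) - k + 1)
               if sequence[i:i + k] == motif)


def fit_single_motif(counts, measurements):
    """Ordinary least squares for: measurement = intercept + slope * count."""
    n = len(counts)
    mean_count = sum(counts) / n
    mean_meas = sum(measurements) / n
    sxx = sum((c - mean_count) ** 2 for c in counts)
    if sxx == 0:
        raise ValueError("motif count is constant across sequences")
    sxy = sum((c - mean_count) * (a - mean_meas)
              for c, a in zip(counts, measurements))
    slope = sxy / sxx
    intercept = mean_meas - slope * mean_count
    return intercept, slope


# Toy data, made up purely for illustration (hypothetical gene IDs and values).
promoters = {
    "g1": "ACGCGTTTACGCGT",
    "g2": "TTTTACGCGTAAAA",
    "g3": "TTTTTTTTTTTTTT",
    "g4": "ACGCGTACGCGTACGCGT",
}
log_ratios = {"g1": 1.0, "g2": 0.5, "g3": 0.0, "g4": 1.5}

genes = sorted(promoters)
counts = [count_motif(promoters[g], "ACGCGT") for g in genes]
intercept, slope = fit_single_motif(counts, [log_ratios[g] for g in genes])
print(counts, intercept, slope)
```

The slope plays the role of the regression coefficient reported for a motif; with motifs or PSAMs supplied as inputs rather than discovered, the same kind of fit corresponds to the Transfactivity use case.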

Visualization of results:
  • HTMLSummary — A utility for visualizing the result of a MatrixREDUCE or MotifREDUCE run in HTML format.
  • LogoGenerator — A versatile and robust command-line tool that generates logo images in a variety of styles (raw data, frequency, conventional bit information, or affinity logo in ΔΔG). The input can be a PSAM or a multiple sequence alignment file in either FASTA or flat format. The output logo image is in EPS format and is converted to PNG format by default for display in a web page (as from HTMLSummary), using the widely and freely available GhostScript tool gs. Other supported image formats include PDF, JPEG, and GIF (the latter via the convert utility program from ImageMagick).
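
For readers unfamiliar with "conventional bit information" logos, the following Python sketch computes the standard per-column information content for DNA, 2 bits minus the Shannon entropy of the base frequencies. It is a generic textbook calculation with no small-sample correction, and it is not claimed to match LogoGenerator's internals; the function name and input format are assumptions for this example.

```python
import math


def column_information_bits(base_counts):
    """Information content (bits) of one alignment column for a 4-letter alphabet.

    base_counts maps each base to its count (or frequency) in the column.
    """
    total = sum(base_counts.values())
    if total <= 0:
        raise ValueError("column has no observations")
    entropy = 0.0
    for count in base_counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return math.log2(4) - entropy


# A fully conserved column carries 2 bits; a uniform column carries none.
print(column_information_bits({"A": 10, "C": 0, "G": 0, "T": 0}))
print(column_information_bits({"A": 5, "C": 5, "G": 5, "T": 5}))
```

In a bit-style logo the total letter height at each position is this value, split among the bases in proportion to their frequencies.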

Affinity-based sequence analysis:
  • AffinityProfile — Convert one or more DNA/RNA sequences to single-nucleotide resolution affinity profiles or a total regional affinity. A set of motifs and/or PSAMs is required as input.
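
As a rough illustration of what a single-nucleotide resolution affinity profile is, the sketch below slides a PSAM along a sequence and, for each window, multiplies the relative affinities of the bases found at each position; summing the profile gives a total regional affinity. This is a simplified sketch in Python, not the program's code. The PSAM values are invented, only the forward strand is scored, and treating unknown bases as zero affinity is an assumption made here.

```python
def affinity_profile(sequence, psam):
    """Relative affinity for every window start along the forward strand.

    psam is a list with one dict per motif position, mapping a base to its
    relative affinity (the best base at each position has the value 1.0).
    """
    width = len(psam)
    profile = []
    for start in range(len(sequence) - width + 1):
        affinity = 1.0
        for offset, weights in enumerate(psam):
            # Assumption: bases missing from the PSAM (e.g. N) contribute 0.
            affinity *= weights.get(sequence[start + offset], 0.0)
        profile.append(affinity)
    return profile


def total_affinity(sequence, psam):
    """Sum of window affinities over the whole sequence."""
    return sum(affinity_profile(sequence, psam))


# Hypothetical two-position PSAM, for illustration only.
toy_psam = [
    {"A": 1.0, "C": 0.5, "G": 0.1, "T": 0.1},
    {"A": 0.2, "C": 1.0, "G": 0.1, "T": 0.1},
]
print(affinity_profile("ACAC", toy_psam))
print(total_affinity("ACAC", toy_psam))
```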

Miscellaneous utilities:
  • Convert2PSAM — Convert commonly used motif (pattern) representations of nucleic acid sequences to PSAM format, which is unique to the REDUCE Suite. It also serves to standardize the various formats to a simplified PWM format for easy communication.
  • Topo2Dictfile — Generate a motif dictionary file according to user-specified topological patterns, allowing for easy user manipulation (deleting/adding specific motifs, introducing IUPAC degeneracy symbols, etc).
  • ProcessFASTA — Process a sequence file in FASTA format to select a list of sequences based on their IDs, convert to reverse complement, combine ID and sequence into a single line, etc.
  • ProcessTdat — Manipulate tab-delimited measurement files (extract a subset of experiments, perform log-transformation, sort entries by ID, etc).
  • ExtractWindows — Extract subsequences from larger sequences (e.g., a chromosome), based on a set of start/end coordinates.
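
Two of the sequence operations mentioned above, reverse complementing and extracting a window by start/end coordinates, can be sketched in a few lines of Python. This only illustrates the operations themselves; the function names are hypothetical, and the 1-based inclusive coordinate convention is an assumption, so check each program's -h output for the convention it actually uses.

```python
# Translation table for DNA bases; N is kept as N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


def extract_window(sequence, start, end):
    """Extract sequence[start..end], assuming 1-based inclusive coordinates."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("window coordinates fall outside the sequence")
    return sequence[start - 1:end]


print(reverse_complement("ACGTN"))
print(extract_window("ACGTACGT", 3, 6))
```
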
Documentation / Set up the REDUCE Suite
« Last post by xiangjun on April 21, 2016, 11:13:30 am »
Starting from REDUCE Suite v2.2, we've streamlined the downloading process. Users just need to register on this Forum and log in to see the member-only download section (at the upper-left corner). The Suite is distributed with its source code (in ANSI C). Moreover, for user convenience, we have compiled the Suite on common operating systems, including Mac OS X, Linux (64-bit), and Windows (via Cygwin).

Assuming you have at least basic knowledge of Linux (Unix/Mac OS X) and know how to use the shell, getting the REDUCE Suite up and running is a simple process. It involves setting an environment variable REDUCE_SUITE so the system knows where the Suite has been installed, and updating your command PATH so that you can run the associated programs conveniently. The whole process is further facilitated by the REDUCE_Suite_setup script, as shown below:

To install the REDUCE Suite, do as follows:
  (0) Download the Suite from
      You also need to have Perl installed for the setup step.
      Note that you *must* register and log in to see the download page.
      Assuming your downloaded tarball is for the macOS:

  (1) tar zxvf REDUCE-Suite-v2.2-macosx-intel.tar.gz
        This will create a directory named REDUCE-Suite-v2.2/

  (2) cd REDUCE-Suite-v2.2/
        You are now in the REDUCE-Suite-v2.2/ directory

  (3) >>> [optional] ONLY IF you compile REDUCE Suite from source <<<
        (3a) cd src/
        (3b) make
        (3c) cd ../   # back to REDUCE-Suite-v2.2/, as with step (2)

  (4) ./bin/REDUCE_Suite_setup    # assuming you are at directory: REDUCE-Suite-v2.2/
        To run the REDUCE Suite, you need to set up the followings:
          o the environment variable REDUCE_SUITE
          o add $REDUCE_SUITE/bin to your command line search path

        for your 'bash' shell, please add the following into ~/.bashrc:
            export REDUCE_SUITE='/Users/xiangjun/Luxes/REDUCE_Suite'
            export PATH='/Users/xiangjun/Luxes/REDUCE_Suite/bin':$PATH

         and then logout and login again, or run the following command:
              source ~/.bashrc

  (5) type MatrixREDUCE -h
           LogoGenerator -h
           HTMLSummary -h
      etc for command-line help and worked examples

  (6) Note: to use HTMLSummary for the summary page, you need to install
            GhostScript. See $REDUCE_SUITE/config/pkg_settings.cfg for
            setting path to the command 'gs'. LogoGenerator generates
            the logo image in EPS format, and uses 'gs' to convert EPS
            into PDF, PNG, or JPG. Additionally, with ImageMagick, you
            can also get the logo image in GIF.

The above instructions should help get you started with the Suite. If you have any questions, please do not hesitate to ask. The Forum has been created for any related questions, comments, or suggestions.
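
If the programs are not found after setup, a quick sanity check of the two settings from step (4) can help. The Python sketch below is not part of the Suite; the function name and messages are hypothetical. It only checks that REDUCE_SUITE is set and that its bin/ subdirectory appears on PATH, given a dictionary of environment variables.

```python
import os


def check_reduce_setup(environ):
    """Return a list of problems found in the given environment mapping."""
    problems = []
    suite_dir = environ.get("REDUCE_SUITE")
    if not suite_dir:
        problems.append("REDUCE_SUITE is not set")
        return problems
    bin_dir = os.path.normpath(os.path.join(suite_dir, "bin"))
    path_entries = environ.get("PATH", "").split(os.pathsep)
    on_path = {os.path.normpath(entry) for entry in path_entries if entry}
    if bin_dir not in on_path:
        problems.append("REDUCE_SUITE bin directory is not on PATH")
    return problems


# Example with a made-up installation directory (an assumption for this sketch).
example_env = {
    "REDUCE_SUITE": "/opt/REDUCE-Suite-v2.2",
    "PATH": os.pathsep.join(["/usr/bin", "/opt/REDUCE-Suite-v2.2/bin"]),
}
print(check_reduce_setup(example_env))
```

To check your own shell session, pass dict(os.environ) instead of the example dictionary; an empty list means both settings look consistent.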

Announcements / REDUCE Suite v2.2 is available
« Last post by xiangjun on February 15, 2016, 12:47:46 pm »
We are pleased to announce the release of the REDUCE Suite v2.2, a set of software tools to model the regulation of gene expression by transcription factors (TF). By directly correlating genome-wide mRNA expression or TF binding data (e.g. ChIP-chip) with associated nucleotide sequences, the REDUCE Suite can discover the sequence-specific binding affinity of a TF from a single experiment, using all measurements simultaneously, and without using any "background" sequence model.

The REDUCE Suite of software programs has been developed and actively maintained by the Laboratory of Dr. Harmen Bussemaker at the Department of Biological Sciences, Columbia University in the City of New York. The suite has its origin in the REDUCE algorithm of Bussemaker et al. in Nature Genetics (2001), which pioneered the use of a motif-based linear regression model to discover cis-regulatory elements (motifs) and infer condition-specific transcription factor activities from a single genome-wide mRNA expression profile. Dr. Barrett Foat, a former graduate student in the Bussemaker Lab, extended REDUCE by adding an optimization procedure to obtain a so-called Position Specific Affinity Matrix (PSAM). He implemented his algorithm in a new program, MatrixREDUCE, using Perl and the GNU Scientific Library (GSL). Dr. Xiang-Jun Lu has completely rewritten and significantly enhanced the code using pure ANSI C to make each component program efficient and the whole package self-contained.

Following the release of MatrixREDUCE v1.0 in late 2006, we have made many significant additions and improvements to the software based on extensive feedback from within and outside the lab. Specifically, we have greatly improved the calculation of P-values based on a heuristic null model described in Foat et al. (2008), and developed a versatile topology-based approach to specify motif patterns. Moreover, we have implemented MotifREDUCE as a standalone, yet more robust and efficient, replacement of the original REDUCE program, and created a command-line driven, general-purpose DNA/RNA-related LogoGenerator. Overall, the suite now consists of more than ten standalone, yet interconnected programs. To better reflect both its roots and its new versatile functionality, the package has been renamed the "REDUCE Suite", currently at version 2.2.

We understand that getting a scientific software tool published is just the beginning; in the long run, it is continuous refinement and adaptation to a changing world that keep a software suite alive. As a matter of fact, the REDUCE Suite contains many unpublished features. Moreover, while the standard "no warranty" applies, we stand firmly behind the software. We strive to get back to your questions, suggestions and bug reports quickly and concretely on the Forum. Browsing the Forum should convince you of our dedication to the REDUCE Suite!

The C source code is in the src/ directory with each tarball. Please refer to the post titled "Set up the REDUCE Suite" on how to compile the code yourself. All REDUCE Suite related questions are welcome on the Forum (only). Do not be shy in sharing openly, but CONCRETELY, any difficult/negative experiences you may have in installing or using the software. By asking your questions on the public Forum, you're benefiting not only yourself but also the user community.

Welcome to the REDUCE Suite, and we look forward to communicating with you on the Forum.

Xiang-Jun Lu & Harmen Bussemaker

PS: The REDUCE Suite v2.2 is stable: its key code components and functionality, without material changes, have been extensively tested and used in real-world applications for nearly a decade. Due to my commitment to the NIH-funded 3DNA/DSSR project, I am supporting the REDUCE Suite in maintenance mode. No new features are planned, but I will promptly address any REDUCE Suite related questions/bugs, exclusively via this open Forum.

Note added on July 17, 2018: FeatureREDUCE is not included in the suite and it is not supported on the Forum.
Created and maintained by Dr. Xiang-Jun Lu [律祥俊].