General Category > Documentation

Other utility programs

(1/1)

xiangjun:
The REDUCE Suite distribution also includes the following auxiliary programs. Simple type the corresponding program name with -h (e.g., Convert2PSAM -h) should provide sufficient information to get one started.

Convert2PSAMAs its name suggests, Convert2PSAM is a utility program that converts other commonly used motif (pattern) representations in nucleic acid sequences to PSAM, which is unique to the REDUCE Suite. It can also be used to standardize the various formats to a simplified PWM format for easy communication.

Topo2DictfileThe default topological pattern mechanism can be used to specify sequence motifs in a compact, convenient, and flexible way. However, it defines the motifs implicitly, has length limit (15 non-gap positions), and does not take into consideration of the IUPAC degenerate symbols. As an example, X6 stands for exactly 4^6 = 4096 combinations, from AAAAAA, AAAAAC, ... TTTTTT. Sometimes, we may need more control by specifying the motifs explicitly in a dictionary file, with arbitrary length and IUPAC symbols. This can be facilitated by Topo2Dictfile by first generating a motif dictionary accordingly to user-specified topological patterns, and then editing it as needed, e.g., deleting some motifs, adding more, or introducing IUPAC degeneracy symbols etc.

ProcessFASTAProcessFASTA is a simple utility program to process a sequence file in FASTA format, e.g., to select a list of sequences based on ids, convert to reverse complementary, combine id and sequence into one-line etc. While such functionalities are surely available in various heavy-duty toolboxes/environments (BioPerl, EMBOSS, BioConductor etc.), none fits ours needs perfectly. We have thus developed this simple utility program mainly for our own convenience.

ProcessTdatThis is simple utility program to process a tab-delimited text file, e.g., to extract a subset, perform log transformation, and sort entries by id order etc. It is created following the same idea as for ProcessFASTA.

ExtractWindowsA simple utility program to extract sequence fragments from a sequence file, probably of a chromosome.

psamdir2listA Perl utility program to generate a list of PSAM in a given directory. The resultant list can be fed into AffinityProfile or Transfactivity.

Millson:
Hi Xiangjun, do you have more resources to share on ProcessFASTA? Cheers.

xiangjun:
Hi,

Type ProcessFASTA -h for more info. Check the source code for technical details.

Best regards,

Xiang-Jun

Navigation

[0] Message Index

Go to full version