Dear Anthony,

Following up on Xiang-Jun's reply, you are correct that the Transfactivity program does not explicitly deal with collinearity. This should not be a problem when Transfactivity is used to infer TF activities for additional expression profiles using one or more PSAMs generated by MatrixREDUCE, as the stepwise PSAM discovery implemented by MatrixREDUCE was explicitly designed to make the PSAMs distinct from each other. In other words, when AffinityProfile is used with a set of PSAMs discovered by MatrixREDUCE to create a matrix containing total affinities for each sequence (which is also the first step performed by Transfactivity), the columns of that matrix will be close to orthogonal. The values of the regression coefficients in a multi-PSAM linear regression will then be close to those obtained in separate single-PSAM fits.
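
To spell out that last point for the idealized case of exactly orthogonal columns, here is a short derivation. The notation is assumed for illustration only: A is the affinity matrix with columns a_j, y is the expression vector, and any intercept is ignored.

```latex
\hat{\beta} = (A^{\top} A)^{-1} A^{\top} y ,
\qquad
a_j^{\top} a_k = 0 \ (j \neq k)
\;\Longrightarrow\;
A^{\top} A = \operatorname{diag}\!\left( \lVert a_1 \rVert^2, \dots, \lVert a_p \rVert^2 \right)
\;\Longrightarrow\;
\hat{\beta}_j = \frac{a_j^{\top} y}{\lVert a_j \rVert^2} .
```

The final expression is exactly the least-squares coefficient from fitting y on column a_j alone, so the joint and separate fits coincide. When the columns are only approximately orthogonal, the two agree approximately.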

Things are potentially different, however, when Transfactivity is used with a set of PSAMs obtained from another source such as Jaspar. In that case, there is no guarantee that the columns of the affinity matrix created by AffinityProfile are independent of each other, and the behavior of the regression could indeed become unstable due to collinearity. We were dealing with exactly this situation in two of our lab’s previous papers. In one case, we implemented L2-penalized regression in R with a design matrix generated by AffinityProfile to deal with collinearity when inferring protein-level activities for a large number of yeast transcription factors (Lee et al., Mol Syst Biol 2010; https://www.ncbi.nlm.nih.gov/pubmed/20865005). In the second case, when we were doing the same for human transcription factors based on a collection of PWMs from Jaspar, we did some additional preprocessing on the design matrix in R as well (Lee et al., PNAS 2014; https://www.ncbi.nlm.nih.gov/pubmed/24706889; see supplemental methods).
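
As a rough illustration of why an L2 penalty helps with collinear affinity columns, here is a toy sketch in plain Python. It is not the R code used in either paper and not part of Transfactivity or AffinityProfile. The function names (`solve`, `ridge_fit`), the made-up data, and the penalty value of 1.0 are all assumptions chosen only to make the effect visible.

```python
# Toy illustration only: hypothetical names and made-up numbers,
# not Transfactivity/AffinityProfile code and not the published analysis.

def solve(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [list(M[i]) + [v[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def ridge_fit(X, y, lam):
    """Return b minimizing ||y - X b||^2 + lam * ||b||^2 (no intercept)."""
    p = len(X[0])
    XtX = [[sum(row[j] * row[k] for row in X) for k in range(p)] for j in range(p)]
    Xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    for j in range(p):
        XtX[j][j] += lam  # the L2 penalty adds lam to the diagonal
    return solve(XtX, Xty)

# Two almost identical "affinity" columns (stand-ins for two similar PSAMs).
a1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
a2 = [1.01, 1.99, 3.01, 3.99, 5.01, 5.99]
# True coefficients are (1, 1); the noise is small relative to the signal.
noise = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
X = [[u, v] for u, v in zip(a1, a2)]
y = [u + v + e for u, v, e in zip(a1, a2, noise)]

ols = ridge_fit(X, y, 0.0)    # lam = 0 is ordinary least squares
ridge = ridge_fit(X, y, 1.0)  # a modest penalty stabilizes the fit
print("OLS coefficients:  ", ols)
print("ridge coefficients:", ridge)
```

With these particular numbers the unpenalized fit lands near (-9, 11) even though the data were generated with coefficients (1, 1), while the penalized fit stays close to (1, 1). In a real analysis the penalty strength would need to be chosen with care, for example by cross-validation.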

I hope this is useful.

Best regards,

Harmen