Show Posts

Messages - jason

Pages: [1]
1
Hello,

I have been using Transfactivity on my data and am very pleased with the results! They look coherent and seem to be giving the "right" answer. I am close to publishing, but wanted to run my current process by you and get your opinion on whether it is valid given the statistical inferences used in Transfactivity.

I calculated effect sizes for my RNA-seq dataset using sleuth (same idea as DESeq2 or edgeR), and fed the raw effect sizes into Transfactivity to make TF motif activity inferences. I subset the results to motifs that passed an arbitrary significance threshold in at least 1 sample, then reported the actual coefficients (f value) of the significant motifs. Since the raw coefficients vary so much in magnitude, I rescaled them by dividing each row by its largest absolute coefficient. Therefore, at least 1 sample will have a value of either -1 or +1 in each row, and everything else is relative to that.
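To make the rescaling step concrete, here is a minimal sketch of what I mean, assuming the coefficients are held as a plain list of rows (one row per motif, one column per sample). The function name `row_normalize` and the example numbers are made up for illustration and are not part of Transfactivity.

```python
def row_normalize(coeffs):
    """Divide each row by its largest absolute value.

    Rows that are entirely zero are returned as zeros to avoid
    dividing by zero.
    """
    scaled = []
    for row in coeffs:
        peak = max(abs(v) for v in row)
        scaled.append([v / peak if peak else 0.0 for v in row])
    return scaled


# Hypothetical coefficients: rows are motifs, columns are samples.
example = [
    [0.5, -2.0, 1.0],
    [4.0, 2.0, -1.0],
]
scaled = row_normalize(example)
for row in scaled:
    print(row)
```

After this step every row has at least one entry equal to -1 or +1, so rows are comparable in shape but no longer in absolute magnitude.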

You can find an illustrated example of this process at this link: https://drive.google.com/open?id=1T6wy3ho5nml7f5tq83u0UDvmsVPw4DF3

However, there are many alternative ways to input the data and ways to report it, and I was just hoping to get your opinions on the options.

Input:

1. Input the effect sizes for each gene as I did above. The problem here is many of the largest effect sizes are not actually significant (usually lowly expressed genes), and this will throw off the TF activity inferences.

2. Input the effect sizes only for genes that passed significance. I presume, though, that the predictions will suffer if I only feed it data for ~500 genes.

3. Input the effect sizes, but use some arbitrary process to get rid of the signal from non-significant genes, such as discarding only the ~200 genes with large effect sizes but no significant p-value, or alternatively just setting the effect size to 0 if it did not pass significance.

4. Input the p-values directly (signed -log10). I tried this and it works decently well, but Transfactivity is obviously expecting to predict the magnitude of gene expression change, not its significance, which can vary wildly based on things like the number of replicates I used. So this seems wrong.

5. Input the row-normalized TPM matrix directly (or normalized count matrix), and then, to figure out which motifs associate with my statistical covariates, feed the Transfactivity coefficients into another regression to predict those that track with my covariates.
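To be explicit about what options 3 and 4 would look like in practice, here is a rough sketch, assuming per-gene effect sizes, p-values, and q-values are available as parallel lists. The function names, the 0.05 cutoff, and the p-value floor are my own illustrative choices, not anything defined by sleuth or Transfactivity.

```python
import math


def mask_nonsignificant(effects, qvalues, alpha=0.05):
    """Option 3 (one variant): set the effect size to 0 unless q < alpha."""
    return [b if q < alpha else 0.0 for b, q in zip(effects, qvalues)]


def signed_neg_log10(effects, pvalues, floor=1e-300):
    """Option 4: -log10(p), carrying the sign of the effect size.

    p-values are clamped at `floor` so that p == 0 does not raise an error.
    """
    return [
        math.copysign(-math.log10(max(p, floor)), b)
        for b, p in zip(effects, pvalues)
    ]


# Hypothetical values for three genes.
effects = [3.0, -0.5, 1.2]
qvalues = [0.5, 0.01, 0.2]
pvalues = [0.1, 0.001, 0.01]

print(mask_nonsignificant(effects, qvalues))
print(signed_neg_log10(effects, pvalues))
```

The first function keeps the magnitude for significant genes only, while the second throws the magnitude away entirely, which is the trade-off I am unsure about.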

Which one would you recommend?


Reporting the output:

1. Report row-normalized coefficients as I did.

2. Report the signed -log10 p-values. As you can see in the PDF example, some motifs are MUCH more significant than others in the results, and this distinction is lost when using the coefficients.


Thanks again for writing and maintaining such a great tool. I hope to hear your opinions.



2
General Discussion / Re: Error converting PWM to PSAM
« on: December 21, 2017, 08:40:13 pm »
It works great! Thanks a ton.

3
General Discussion / Re: Error converting PWM to PSAM
« on: December 19, 2017, 02:08:41 pm »
Great! If all it takes is a simple transpose of the PFM matrix, I could handle the rest, but thank you for offering to upgrade the script! Hopefully other users in the future find it helpful.

-Jason

4
General Discussion / Re: Error converting PWM to PSAM
« on: December 19, 2017, 01:27:00 pm »
I see! I don't have much experience with the PWM format, so I thank you for looking into this.

Is the expected PWM format then only integer values, like a count matrix?

YeTFaSCo also provides the data as a "PFM", or Position Frequency Matrix. In this matrix, all the values are stored as fractions between 0 and 1: http://yetfasco.ccbr.utoronto.ca/1.02/Downloads/Expert_PFMs.tar.gz

The equivalent PFM to YDR146C that you pasted is this:

A       0.403846154     0.653846154     0.0     0.0     1.0     0.0     0.0     0.615384615
T       0.25    0.038461538     0.0     0.0     0.0     0.0     0.0     0.076923077
G       0.25    0.038461538     0.0     0.0     0.0     1.0     0.0     0.076923077
C       0.096153846     0.269230769     1.0     1.0     0.0     0.0     1.0     0.230769231


I also tried converting the PFM to PSAM and got an error, but perhaps I could, for example, scale each PFM to integer values from 0 to 100 and convert that?
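Something like the following is what I have in mind for the scaling, as a rough sketch only. It assumes the PFM is whitespace-separated text with the base letter in the first column, as in the matrix above; `pfm_to_counts` is a name I made up, and I don't know whether Convert2PSAM would actually accept the result.

```python
def pfm_to_counts(pfm_text, total=100):
    """Scale each frequency to an integer count out of `total`.

    Returns a dict mapping base letter to a list of integer counts.
    Because of rounding, a column may not sum to exactly `total`.
    """
    counts = {}
    for line in pfm_text.strip().splitlines():
        fields = line.split()
        counts[fields[0]] = [round(float(x) * total) for x in fields[1:]]
    return counts


# The YDR146C PFM pasted above.
pfm = """
A 0.403846154 0.653846154 0.0 0.0 1.0 0.0 0.0 0.615384615
T 0.25 0.038461538 0.0 0.0 0.0 0.0 0.0 0.076923077
G 0.25 0.038461538 0.0 0.0 0.0 1.0 0.0 0.076923077
C 0.096153846 0.269230769 1.0 1.0 0.0 0.0 1.0 0.230769231
"""

counts = pfm_to_counts(pfm)
for base in "ATGC":
    print(base, counts[base])
```

Note that the last column rounds to 62 + 8 + 8 + 23 = 101 rather than 100, so the scaled matrix is only approximately a count matrix.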

5
General Discussion / Error converting PWM to PSAM
« on: December 18, 2017, 02:01:28 pm »
Hi, I've downloaded a set of PWMs from YeTFaSCo: http://yetfasco.ccbr.utoronto.ca/1.02/Downloads/Expert_PWMs.tar.gz

I would like to use these PWMs with the Transfactivity program.

However, I can't seem to get the Convert2PSAM utility to work on these. I assume that it expects a slightly different PWM format than the one provided by the download, but I can't figure out exactly what format it expects. Could you let me know if I'm doing something wrong?

Here's the command I ran and the error:

 bin/Convert2PSAM -source=pw -inp=data/yetfasco/ALIGNED_ENOLOGO_FORMAT_PWMS/YDR146C_569.pwm -pwmfile=test.xml

<data/yetfasco/ALIGNED_ENOLOGO_FORMAT_PWMS/YDR146C_569.pwm> not in PWM format: [A   0.381537584575116   1.07668300283655   -800   -800   1.68965987938785   -800   -800   0.989220160345073] contains invalid W a


Created and maintained by Dr. Xiang-Jun Lu [律祥俊]. See also http://forum.x3dna.org and http://x3dna.org