8. Expression analysis
Many tools are currently available for identifying differentially expressed transcripts from RNA-Seq data. Of these, edgeR and DESeq2 are both very popular and highly accurate. The edgeR software is part of the R Bioconductor project, and the Trinity package provides support for using it.
Having biological replicates for each of your samples is crucial for accurate detection of differentially expressed transcripts. In general, three or more biological replicates per experimental condition are highly recommended. For this example, let's say that your data set has three biological replicates for each condition.
Create a tab-delimited samples.txt file that gives the name of each condition followed by the name of the biological replicate. The replicate names must match the column headings of your counts matrix, so start by writing the matrix header line to samples.txt:
head -n1 Trinity_trans.counts.matrix | tee samples.txt
A_rep1 A_rep2 A_rep3 B_rep1 B_rep2 B_rep3 C_rep1 C_rep2 C_rep3
Now edit file 'samples.txt' to contain the tab-delimited 2-column format:
sample_name unique_replicate_name
and it should look like so:
cat samples.txt
A	A_rep1
A	A_rep2
A	A_rep3
B	B_rep1
B	B_rep2
B	B_rep3
C	C_rep1
C	C_rep2
C	C_rep3
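If you would rather not edit the file by hand, the two-column layout can be derived from the matrix header. The following is only a sketch, not part of Trinity: the function names make_samples and check_samples are hypothetical, and it assumes that every replicate name ends in "_rep<N>" and that everything before that suffix is the condition name. Adjust the pattern if your naming differs.

```shell
# make_samples MATRIX
# Print "condition<TAB>replicate" for every column heading in MATRIX.
# Assumption: replicate names look like <condition>_rep<N>.
make_samples() {
  head -n1 "$1" | tr '\t' '\n' |
    awk 'NF { cond = $0; sub(/_rep[0-9]+$/, "", cond); print cond "\t" $0 }'
}

# check_samples SAMPLES MATRIX
# Report replicate names in SAMPLES that are not column headings of MATRIX.
# Returns non-zero if any name is missing from the header.
check_samples() {
  head -n1 "$2" | awk -F'\t' '
    NR == FNR { for (i = 1; i <= NF; i++) seen[$i] = 1; next }
    !($2 in seen) { print "not in matrix header: " $2; bad = 1 }
    END { exit bad + 0 }' - "$1"
}

# Only runs when the counts matrix is present in the current directory.
if [ -f Trinity_trans.counts.matrix ]; then
  make_samples Trinity_trans.counts.matrix > samples.txt
  check_samples samples.txt Trinity_trans.counts.matrix
fi
```

The check is worth running even on a hand-edited samples.txt, since a replicate name that does not match a column heading is an easy mistake to make.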