5. Transcript expression quantitation
Abundance estimation using RSEM
To estimate the expression levels of the Trinity-reconstructed transcripts, you can use the strategy supported by the RSEM software: read alignment followed by expectation maximization, which assigns reads according to maximum likelihood estimates. In essence, you first align the original RNA-Seq reads back to the Trinity transcripts, then run RSEM to estimate the number of RNA-Seq fragments that map to each contig. Because the abundance of individual transcripts may differ significantly between samples, the reads from each sample (and each biological replicate) must be processed separately to obtain sample-specific abundance values.
Trinity includes a script, align_and_estimate_abundance.pl, that facilitates running RSEM on Trinity transcript assemblies. The script runs the Bowtie aligner to align the reads to the Trinity transcripts, and RSEM then evaluates those alignments to estimate expression values. Again, you need to run this separately for each sample and biological replicate (i.e., each pair of FASTQ files).
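As a rough illustration of the expectation maximization step, here is a simplified textbook form. It is not RSEM's full model, which also accounts for factors such as fragment length and sequencing error. The symbols are introduced for illustration only: θ_i is the fraction of fragments deriving from transcript i, P(r_n | i) is the probability of read r_n given transcript i, and N is the number of reads.

\[
z_{ni} = \frac{\theta_i \, P(r_n \mid i)}{\sum_j \theta_j \, P(r_n \mid j)},
\qquad
\theta_i \leftarrow \frac{1}{N} \sum_{n=1}^{N} z_{ni}
\]

The two updates are repeated until θ stops changing. The expected fragment count for transcript i is then the sum of z_{ni} over all reads, which is why a read that aligns to several contigs contributes a fractional count to each.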
#$ -S /bin/sh
#$ -cwd
#$ -q amd.q,large.q,intel.q
#$ -l h_vmem=50G
#$ -N RSEM
module load perl/5.22.2
module load trinity/2.2.0
RSEM command:
/cm/shared/apps/trinity/2.2.0/util/align_and_estimate_abundance.pl --seqType fq \
    --left /ibers/ernie/scratch/userName/Fastq_trimmed/reads_1_1.fq.gz.p \
    --right /ibers/ernie/scratch/userName/Fastq_trimmed/reads_1_2.fq.gz.p \
    --transcripts /ibers/ernie/scratch/userName/trinity_out_dir/Trinity.fasta \
    --output_prefix RSEM_name \
    --est_method RSEM --aln_method bowtie --trinity_mode --coordsort_bam \
    --output_dir /ibers/ernie/scratch/userName/outputName \
    --prep_reference
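Because the command has to be repeated for every sample and replicate, it can help to generate it in a loop. The following is a minimal dry-run sketch that only prints one command per sample so you can check the paths before running anything. The sample names in SAMPLES, the reads_N file naming pattern, the rsem_cmd helper and the per-sample RSEM_<sample> output names are assumptions for illustration, not part of Trinity; the options are the same as in the command above.

```shell
#!/bin/sh
# Dry-run sketch: print one align_and_estimate_abundance.pl command per sample.
# SAMPLES, the file naming pattern and the output layout are hypothetical;
# adjust them to match your own data.

SCRIPT=/cm/shared/apps/trinity/2.2.0/util/align_and_estimate_abundance.pl
READS_DIR=/ibers/ernie/scratch/userName/Fastq_trimmed
TRANSCRIPTS=/ibers/ernie/scratch/userName/trinity_out_dir/Trinity.fasta
OUT_ROOT=/ibers/ernie/scratch/userName
SAMPLES="reads_1 reads_2"   # hypothetical sample names

# Print the full command line for a single sample (paired-end reads).
rsem_cmd() {
    name=$1
    echo "$SCRIPT" --seqType fq \
        --left "$READS_DIR/${name}_1.fq.gz.p" \
        --right "$READS_DIR/${name}_2.fq.gz.p" \
        --transcripts "$TRANSCRIPTS" \
        --output_prefix "RSEM_${name}" \
        --est_method RSEM --aln_method bowtie --trinity_mode --coordsort_bam \
        --output_dir "$OUT_ROOT/RSEM_${name}" \
        --prep_reference
}

# One command per sample; inspect the output before running it for real.
for sample in $SAMPLES; do
    rsem_cmd "$sample"
done
```

Once the printed commands look right, removing the echo inside rsem_cmd makes the loop execute them one after another within the job script. Giving each sample its own output prefix and directory keeps one run from overwriting another's results.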