6. Construct an expression matrix

From IBERS Bioinformatics and HPC Wiki
Revision as of 15:31, 6 March 2017 by Val1

Now that you have expression estimates for each transcript in each sample, you're going to pull the expression values together into matrices with transcript IDs in the rows and sample replicate names in the columns. You'll make two matrices: one containing the estimated counts, and another containing the TPM expression values, cross-sample normalized using the TMM method. Trinity's abundance_estimates_to_matrix.pl script does all of this for you. You tell it which method you used for expression estimation and give it the list of individual sample abundance estimate files:

#$ -S /bin/sh
#$ -cwd
#$ -q large.q
#$ -l h_vmem=300G
#$ -N Matrix

Modules you have to load:

module load perl/5.22.2
module load trinity/2.2.0
module load R/R-3.1.2
/cm/shared/apps/trinity/2.2.0/util/abundance_estimates_to_matrix.pl --est_method RSEM --out_prefix S5_Trinity_trans_all \
--name_sample_by_basedir /ibers/ernie/scratch/userName/RSEM_A/A.isoforms.results \
/ibers/ernie/scratch/userName/RSEM_B/B.isoforms.results \
/ibers/ernie/scratch/userName/RSEM_C/C.isoforms.results \
/ibers/ernie/scratch/userName/RSEM_D/D.isoforms.results \
/ibers/ernie/scratch/userName/RSEM_E/E.isoforms.results
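
To make the counts matrix less of a black box, here is a minimal sketch of the kind of join the Trinity script performs. It is not the Trinity implementation: it only collects the expected_count column (column 5 of an RSEM isoforms.results file) from each input and names each sample after its parent directory, in the spirit of --name_sample_by_basedir. It does no TMM normalization. The function name build_counts_matrix, the temporary directory, and every value in the two demo files are hypothetical and exist only for illustration.

```shell
# Hypothetical two-sample demo input in the RSEM isoforms.results layout
# (column 1 = transcript_id, column 5 = expected_count). All values are made up.
work=$(mktemp -d)
mkdir -p "$work/RSEM_A" "$work/RSEM_B"
hdr='transcript_id\tgene_id\tlength\teffective_length\texpected_count\tTPM\tFPKM\tIsoPct\n'
{
    printf "$hdr"
    printf 'TRINITY_DN323_c0_g1_i1\tTRINITY_DN323_c0_g1\t1000\t800.00\t846.00\t10.00\t10.00\t100.00\n'
    printf 'TRINITY_DN2438_c0_g1_i1\tTRINITY_DN2438_c0_g1\t1000\t800.00\t418.00\t5.00\t5.00\t100.00\n'
} > "$work/RSEM_A/A.isoforms.results"
{
    printf "$hdr"
    printf 'TRINITY_DN323_c0_g1_i1\tTRINITY_DN323_c0_g1\t1000\t800.00\t782.00\t10.00\t10.00\t100.00\n'
    printf 'TRINITY_DN2438_c0_g1_i1\tTRINITY_DN2438_c0_g1\t1000\t800.00\t364.00\t5.00\t5.00\t100.00\n'
} > "$work/RSEM_B/B.isoforms.results"

# Join the expected_count column of every input file into one matrix:
# transcripts in rows, one column per sample (named after the parent directory).
build_counts_matrix() {
    awk -F'\t' '
        # First line of each file: register a new sample and skip the header.
        FNR == 1 { nfiles++; n = split(FILENAME, p, "/"); sample[nfiles] = p[n - 1]; next }
        # Data lines: remember transcript order and store the count for this sample.
        { if (!($1 in seen)) { seen[$1] = 1; order[++nt] = $1 }; count[$1, nfiles] = $5 }
        END {
            for (s = 1; s <= nfiles; s++) printf "\t%s", sample[s]
            printf "\n"
            for (t = 1; t <= nt; t++) {
                printf "%s", order[t]
                for (s = 1; s <= nfiles; s++)
                    printf "\t%s", ((order[t], s) in count) ? count[order[t], s] : 0
                printf "\n"
            }
        }
    ' "$@"
}

build_counts_matrix "$work/RSEM_A/A.isoforms.results" "$work/RSEM_B/B.isoforms.results"
```

For real analyses use the Trinity script itself, since it also produces the TMM-normalized expression matrix.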

You should find a matrix file called 'S5_Trinity_trans_all.counts.matrix' (named after the --out_prefix given above), which contains the counts of RNA-Seq fragments mapped to each transcript. Examine the first few lines of the counts matrix:

head -n5 S5_Trinity_trans_all.counts.matrix | column -t
                         A    B    C    D     E
TRINITY_DN323_c0_g1_i1   846  782  792  1403  1397
TRINITY_DN2438_c0_g1_i1  418  364  353  13    10
TRINITY_DN4819_c0_g1_i1  136  128  165  58    64
TRINITY_DN1223_c0_g1_i1  7    4    6    6     9
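
A quick sanity check on a counts matrix is to total each sample column, which gives a rough idea of how many fragments each sample contributes. The sketch below is one way to do that with awk. It assumes a tab-delimited matrix whose header line begins with an empty first field, and it runs on a small demo file built from the four rows shown above rather than on your real matrix. The names demo and col_sums are hypothetical.

```shell
# Hypothetical demo file holding only the four rows shown above.
demo=$(mktemp)
{
    printf '\tA\tB\tC\tD\tE\n'
    printf 'TRINITY_DN323_c0_g1_i1\t846\t782\t792\t1403\t1397\n'
    printf 'TRINITY_DN2438_c0_g1_i1\t418\t364\t353\t13\t10\n'
    printf 'TRINITY_DN4819_c0_g1_i1\t136\t128\t165\t58\t64\n'
    printf 'TRINITY_DN1223_c0_g1_i1\t7\t4\t6\t6\t9\n'
} > "$demo"

# Print one "sample<TAB>total" line per sample column of a counts matrix.
col_sums() {
    awk -F'\t' '
        # Header line: remember the sample names (field 1 is empty).
        NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; n = NF; next }
        # Data lines: add each count to its column total.
        { for (i = 2; i <= NF; i++) sum[i] += $i }
        END { for (i = 2; i <= n; i++) printf "%s\t%d\n", name[i], sum[i] }
    ' "$1"
}

col_sums "$demo"
```

To run it on your own output, pass the path of your counts matrix to col_sums instead of "$demo".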