3. Running Trinity

From IBERS Bioinformatics and HPC Wiki
Revision as of 14:33, 6 March 2017 by Val1 (talk | contribs)

In silico read normalisation:

Large RNA-seq data sets contain a large excess of reads from moderately and highly expressed transcripts, far more than are needed to assemble them. Removing the excess reads lowers memory consumption and speeds up the assembly. In silico normalisation is an effective way of identifying and removing those excess reads (link to diginorm). Trinity has an integrated in silico normalisation step, which is switched on with the --normalize_reads option as in the job script below.

#$ -S /bin/sh
#$ -cwd
#$ -q large.q
#$ -l h_vmem=210G
#$ -N deNovo
#$ -pe multithread 8
module load perl/5.22.2
module load trinity/2.2.0


Trinity --seqType fq --max_memory 200G --left /ibers/ernie/scratch/userName/Fastq_trimmed/Pn_AL_l1_1.fq.gz.p,\
/ibers/ernie/scratch/userName/Fastq_trimmed/reads_1_1.fq.gz.p,\
/ibers/ernie/scratch/userName/Fastq_trimmed/reads_1_1.fq.gz.p,\
/ibers/ernie/scratch/userName/Fastq_trimmed/reads_1_1.fq.gz.p \
--right /ibers/ernie/scratch/userName/Fastq_trimmed/reads_1_2.fq.gz.p,\
/ibers/ernie/scratch/userName/Fastq_trimmed/reads_1_2.fq.gz.p,\
/ibers/ernie/scratch/userName/Fastq_trimmed/reads_1_2.fq.gz.p,\
/ibers/ernie/scratch/userName/Fastq_trimmed/reads_1_2.fq.gz.p --CPU 8 --normalize_reads
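
Typing the comma-separated --left and --right lists by hand is error-prone, so the lists could instead be built from the contents of the trimmed-reads directory. The following is only a sketch under some assumptions: join_reads is a hypothetical helper (not part of Trinity), READ_DIR is an illustrative variable, and the files are assumed to end in _1.fq.gz.p and _2.fq.gz.p with otherwise identical names, so that both lists come out in the same order. NSLOTS is the variable Grid Engine sets to the number of slots granted by -pe. The script prints the Trinity command for checking rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: join every file in a directory that ends in a given
# suffix into one comma-separated list (the form --left/--right are given above).
join_reads() {
    # $1 = directory, $2 = filename suffix, e.g. _1.fq.gz.p
    list=""
    for f in "$1"/*"$2"; do
        [ -e "$f" ] || continue          # nothing matched: skip the literal pattern
        list="${list:+$list,}$f"         # add a comma only between entries
    done
    printf '%s\n' "$list"
}

# Assumed location and naming of the trimmed reads (taken from the paths above).
READ_DIR=${READ_DIR:-/ibers/ernie/scratch/userName/Fastq_trimmed}
LEFT=$(join_reads "$READ_DIR" _1.fq.gz.p)
RIGHT=$(join_reads "$READ_DIR" _2.fq.gz.p)

# Use the slot count granted by the scheduler; fall back to 8 outside a job.
CPU=${NSLOTS:-8}

# Print the command for inspection; remove "echo" to actually run it.
echo Trinity --seqType fq --max_memory 200G \
    --left "$LEFT" --right "$RIGHT" --CPU "$CPU" --normalize_reads
```

Because the shell expands the glob in sorted order, each left file lines up with its mate in the right list only if the pairs differ in nothing but the _1/_2 part of the name. It is worth checking the printed command before submitting the job.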