Difference between revisions of "6.3 Transcriptome reconstruction and expression using cufflinks"

From IBERS Bioinformatics and HPC Wiki

Revision as of 16:59, 15 February 2016

Several tools can reconstruct a transcriptome. In this workshop you are going to use cufflinks, which can assemble transcripts with or without a reference annotation and also quantifies isoform expression in FPKM (fragments per kilobase of exon per million fragments mapped). Let’s load the cufflinks module and check its (rather long) list of parameters:

$ module load cufflinks/2.1.1
$ cufflinks --help

You are going to run cufflinks with the UCSC annotation file for zebrafish, limiting the transcriptome reconstruction strictly to the available annotations. Alternatively, you could use the annotations only as a guide and let the algorithm also look for transcribed loci that are not included in them. In the first case cufflinks reports only isoforms that are included in the annotation, while in the second it reports novel isoforms as well. First, copy the UCSC annotation file for chromosome 12 of Danio rerio:

$ cp -r /ibers/repository/public/courses/Rna-Seq/annotations .

Then let’s create a folder to store the cufflinks output:

$ mkdir cuff_out
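
The two annotation modes described above map onto two cufflinks options: -G (--GTF) restricts the run to the annotated isoforms, while -g (--GTF-guide) uses the annotation only as a guide and also reports novel isoforms. The following is a minimal dry-run sketch that only prints the two command lines so you can compare them; it does not run cufflinks. The file names annotations/chr12.gtf and aligned_reads.bam and the second output folder cuff_out_guided are assumptions made for illustration, so replace them with the files you actually have.

```shell
#!/bin/bash
# Dry-run sketch: prints cufflinks command lines without executing them.
# ANNOTATION and READS are hypothetical file names; substitute your own.
ANNOTATION="annotations/chr12.gtf"
READS="aligned_reads.bam"

# build_cmd MODE OUTDIR
#   MODE "strict" -> -G : report only isoforms present in the annotation
#   MODE "guided" -> -g : use the annotation as a guide, report novel isoforms too
build_cmd() {
    local mode="$1" outdir="$2" flag
    case "$mode" in
        strict) flag="-G" ;;
        guided) flag="-g" ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # -o sets the folder where cufflinks writes its output files
    echo "cufflinks -o $outdir $flag $ANNOTATION $READS"
}

build_cmd strict cuff_out
build_cmd guided cuff_out_guided
```

This workshop uses the first form. Writing the guided run to a separate folder (here the hypothetical cuff_out_guided) keeps the two sets of results from overwriting each other.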

Now you are ready to run cufflinks. The general format of the cufflinks command is:

cufflinks [options]* <aligned_reads.(sam/bam)>

where the input is the file of aligned reads, in either SAM or BAM format. Prepare a script file to launch cufflinks by opening it in a text editor (you could call it, for example, cufflinks_2cells.sh). Copy the scheduler header that you used previously, add the commands that load the required modules, and point the script to your working space. Your script should look something like this: