Difference between revisions of "8. De novo assembly with Velvet"

From IBERS Bioinformatics and HPC Wiki

Revision as of 11:39, 22 February 2016

Velvet is an assembler developed for genome assembly, but it can also be applied to transcriptome assembly. A more recent tool from the same authors, called Oases, was developed specifically for unguided transcriptome reconstruction, but for our purposes the original Velvet is enough. Other de novo transcriptome assemblers exist, for example Trinity, which is generally more accurate but has very large computational requirements, especially for memory. Trinity also offers modules for estimating gene expression and differential expression, while Velvet is limited to transcript reconstruction. We will make a similar usage guide for Trinity in another session.

Velvet is composed of two modules, velveth and velvetg, which must be run one after the other. The first module, velveth, analyzes the reads, decomposes each of them into sub-sequences of fixed length k (called k-mers) and builds an index of all k-mers. The second module, velvetg, takes the output of velveth as its input and assembles overlapping reads into longer sequences called contigs, which correspond to transcripts when the input contains RNA-seq reads. Let’s open a script for running Velvet, write the header, load the modules and point to the working directory:
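
Before turning to the script, the k-mer decomposition described above can be pictured with a small Python sketch. This is only an illustration of the idea, not Velvet's actual implementation: the function names `kmers` and `build_index`, the toy reads and the choice of k = 3 are all assumptions made for the example, and it ignores details a real assembler has to handle, such as reverse-complement strands.

```python
from collections import Counter


def kmers(read, k):
    """Return every overlapping sub-sequence of length k in a read."""
    # A read of length n contains n - k + 1 k-mers (none if n < k).
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_index(reads, k):
    """Count how often each k-mer occurs across all reads (hypothetical helper)."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


# Two toy reads that overlap on "GTAC"; real reads are far longer.
reads = ["ACGTAC", "GTACGG"]
index = build_index(reads, 3)

for kmer, count in sorted(index.items()):
    print(kmer, count)
```

K-mers that occur in both reads, such as GTA and TAC here, are what allow overlapping reads to be joined into a contig. Velvet performs this indexing itself inside velveth, so nothing like this needs to go into the run script.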