9. Using bwa for read mapping and SNP calling

From IBERS Bioinformatics and HPC Wiki

Another popular mapping tool is bwa, which many prefer to bowtie. As an optional exercise, you will learn how to perform a bwa mapping, but you will not use its output in the later steps of the pipeline. An RNA-seq analysis pipeline requires a mapper that can handle reads spanning exon-exon junctions, which present a very large gap (corresponding to an intron) when mapped onto the genome. bwa is not a spliced aligner, so it cannot be used directly for transcriptome analysis. Nevertheless, it performs well in both alignment accuracy and computational speed, and learning how to use it is worthwhile for anyone interested in working with NGS data.

In this tutorial, you will use the bwa alignment for SNP calling, i.e. to identify genomic variations between the reference genome and your sample. SNP calling from RNA-Seq data is limited to transcribed regions and cannot identify variations in promoters and other regulatory regions. It is also heavily affected by gene expression levels, since weakly expressed genes yield few reads. Nevertheless, in some cases it can offer valuable insights. You will run a SNP calling algorithm, varscan, on the output of the bwa mapping.
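The junction problem described above can be illustrated with a small toy example in Python. This is only a sketch: the sequences are made up, and the helper `find_split_alignment` is a hypothetical name introduced here for illustration. It is not how bwa or any spliced aligner works internally. It only shows that a read crossing an exon-exon junction matches the transcript contiguously, while on the genome its two halves are separated by a gap as long as the intron.

```python
# Toy sequences (hypothetical, for illustration only).
exon1 = "ATGGCCATTGTAATGGGCCGC"
intron = "GTAAGT" + "C" * 40 + "TTTTCAG"
exon2 = "TGAAAGGGTGCCCGATAGTTG"

genome = exon1 + intron + exon2   # genomic DNA still contains the intron
transcript = exon1 + exon2        # mature mRNA: the intron is spliced out

# A read covering the last 10 bases of exon 1 and the first 10 of exon 2.
read = transcript[len(exon1) - 10 : len(exon1) + 10]


def find_split_alignment(read, genome, min_anchor=5):
    """Try every split point of the read and return
    (left_start, right_start, gap) for the first split whose two parts
    both match the genome exactly, in order. Return None if no split works.
    """
    for cut in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:cut], read[cut:]
        left_pos = genome.find(left)
        if left_pos == -1:
            continue
        # The right part must start at or after the end of the left part.
        right_pos = genome.find(right, left_pos + len(left))
        if right_pos != -1:
            gap = right_pos - (left_pos + len(left))
            return left_pos, right_pos, gap
    return None


contiguous_on_transcript = transcript.find(read)  # position, or -1 if absent
contiguous_on_genome = genome.find(read)          # position, or -1 if absent
split = find_split_alignment(read, genome)

print("contiguous match on transcript:", contiguous_on_transcript)
print("contiguous match on genome:", contiguous_on_genome)
print("split match on genome (left, right, gap):", split)
print("intron length:", len(intron))
```

In this toy case the read has no contiguous match on the genome, and the gap between its two halves equals the intron length. Real introns are often thousands of bases long, which is why a mapper that is not splice-aware cannot place such reads correctly.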