9.2 Read mapping using bwa

From IBERS Bioinformatics and HPC Wiki
Jump to: navigation, search

Read mapping with bwa is peculiar in which it requires two steps: in the first, bwa produces the mapping using an internal format (called sai format) that is different from the BAM/SAM that you saw before. The sai files need then to be converted in SAM. Moreover, in case of paired end reads, you will need to process them separately then join the outputs. You will do both steps in one script. Let’s open a text file called, for example, bwamapping.sh, in which you can write the usual stuff:

#$ -S /bin/sh
#$ -cwd
#$ -q amd.q,large.q,intel.q
#$ -l h_vmem=16G
#$ -e bwa_mapping.e
#$ -N bwa_mapping
#$ -o bwa_mapping.o
module load bwa/0.7.12
cd /ibers/ernie/scratch/vpl/zebra_fish/bwaMapping

Now let’s write the command to map the forward and reverse fastq files for the 2-cells embryo sample (each command in one line):

bwa aln -n 2 -t 4 ZV9 ../data/2cells_1.trim.fastq > 2cells_1.sai
bwa aln -n 2 -t 4 ZV9 ../data/2cells_2.trim.fastq > 2cells_2.sai

The parameters are:

-n: allowed mismatches

-t: number of threads

danRer10: the index prefix

In the same file you will also write the command to combine the two sai files into one SAM:

bwa sampe danRer10 2cells_1.sai 2cells_2.sai ../data/2cells_1.trim.fastq ../data/2cells_2.trim.fastq > 2cells.bwa.sam

Save and run with qsub. Once finished, check the folder content: you should see two sai files and one large SAM file. The SAM can be converted into BAM format using samtools, and be used for further analyses.