Difference between revisions of "9.2 Read mapping using bwa"

From IBERS Bioinformatics and HPC Wiki
Jump to: navigation, search
(Created page with "Read mapping with '''bwa''' is peculiar in which it requires two steps: in the first, bwa produces the mapping using an internal format (called sai fo...")
 
 
(2 intermediate revisions by the same user not shown)
Line 1: Line 1:
Read  mapping  with  '''bwa'''  is  peculiar  in  which  it  requires  two  steps:  in  the  first,  bwa  produces  the  mapping  using  an  internal  format  (called  sai  format)  that  is  different  from  the '''BAM/SAM'''  that  you  saw  before.  The  sai  files  need  then  to  be  converted  in  SAM.  Moreover, in case of paired end reads, you will need to process them separately then join  the outputs. You will do both steps in one script. Let’s open a text file called, for example,  bwamapping.sh, in which you can write the usual stuff:  
+
Read  mapping  with  '''bwa'''  is  peculiar  in  which  it  requires  two  steps:  in  the  first,  '''bwa''' produces  the  mapping  using  an  internal  format  (called  '''sai''' format)  that  is  different  from  the '''BAM/SAM'''  that  you  saw  before.  The  '''sai''' files  need  then  to  be  converted  in  '''SAM'''.  Moreover, in case of paired end reads, you will need to process them separately then join  the outputs. You will do both steps in one script. Let’s open a text file called, for example,  '''bwamapping.sh''', in which you can write the usual stuff:  
 
   
 
   
#/bin/bash
+
#$ -S /bin/sh
#PBS -l walltime=4:00:00
+
#$ -cwd
#PBS -l select=1:ncpus=4:mpiprocs=1:mem=16GB
+
#$ -q amd.q,large.q,intel.q
#PBS -A <your account>
+
#$ -l h_vmem=16G
#PBS -W group_list="<your account>"
+
#$ -e bwa_mapping.e
module load profile/bio
+
#$ -N bwa_mapping
module load profile/advanced
+
#$ -o bwa_mapping.o
module load bwa
+
module load bwa/0.7.12
 +
cd /ibers/ernie/scratch/vpl/zebra_fish/bwaMapping
 +
Now  let’s  write  the  command  to  map  the  forward  and  reverse  fastq  files  for  the  2-cells  embryo sample (each command in one line):
 +
 +
bwa aln -n 2 -t 4 ZV9 ../data/2cells_1.trim.fastq > 2cells_1.sai
 +
bwa aln -n 2 -t 4 ZV9 ../data/2cells_2.trim.fastq > 2cells_2.sai
 +
The parameters are:
 +
 
 +
'''-n''': allowed mismatches
 +
 
 +
'''-t''': number of threads
 +
 
 +
'''danRer10''': the index prefix
 +
 +
In  the  same  file  you  will  also  write  the  command  to  combine  the  two  sai  files  into  one  SAM:
 +
bwa sampe danRer10 2cells_1.sai 2cells_2.sai ../data/2cells_1.trim.fastq ../data/2cells_2.trim.fastq > 2cells.bwa.sam
 +
Save and run with qsub. Once finished, check the folder content: you should see two '''sai'''  files  and  one  large  '''SAM'''  file.  The  '''SAM'''  can  be  converted  into  '''BAM'''  format  using  samtools, and be used for further analyses.

Latest revision as of 16:07, 20 February 2016

Read mapping with bwa is peculiar in which it requires two steps: in the first, bwa produces the mapping using an internal format (called sai format) that is different from the BAM/SAM that you saw before. The sai files need then to be converted in SAM. Moreover, in case of paired end reads, you will need to process them separately then join the outputs. You will do both steps in one script. Let’s open a text file called, for example, bwamapping.sh, in which you can write the usual stuff:

#$ -S /bin/sh
#$ -cwd
#$ -q amd.q,large.q,intel.q
#$ -l h_vmem=16G
#$ -e bwa_mapping.e
#$ -N bwa_mapping
#$ -o bwa_mapping.o
module load bwa/0.7.12
cd /ibers/ernie/scratch/vpl/zebra_fish/bwaMapping

Now let’s write the command to map the forward and reverse fastq files for the 2-cells embryo sample (each command in one line):

bwa aln -n 2 -t 4 ZV9 ../data/2cells_1.trim.fastq > 2cells_1.sai
bwa aln -n 2 -t 4 ZV9 ../data/2cells_2.trim.fastq > 2cells_2.sai

The parameters are:

-n: allowed mismatches

-t: number of threads

danRer10: the index prefix

In the same file you will also write the command to combine the two sai files into one SAM:

bwa sampe danRer10 2cells_1.sai 2cells_2.sai ../data/2cells_1.trim.fastq ../data/2cells_2.trim.fastq > 2cells.bwa.sam

Save and run with qsub. Once finished, check the folder content: you should see two sai files and one large SAM file. The SAM can be converted into BAM format using samtools, and be used for further analyses.