9.1 Genome indexing for bwa

From IBERS Bioinformatics and HPC Wiki
Revision as of 15:47, 20 February 2016 by Vpl

Let’s first look at the bwa options:

$ module load bwa/0.7.12
$ bwa

The bwa suite provides a number of commands for various steps of the alignment procedure. The general usage is:

bwa <command> [options]

Among the available commands are index for genome indexing, and aln, samse, sampe and mem for read mapping. To see the general usage of a particular command, type bwa followed by the name of the command:

$ bwa index

Usage:

bwa index [-a bwtsw|is] [-c] <in.fasta>
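The same pattern works for the other commands mentioned above. As a minimal sketch, the loop below prints the first lines of the usage text for each of them. It assumes bwa is on your PATH (for example after the module load step) and falls back to printing the command it would run if it is not. bwa writes its usage text to standard error, so the sketch merges it into standard output before trimming it with head.

```shell
# Commands named in the text above
commands="index aln samse sampe mem"

for cmd in $commands; do
    if command -v bwa >/dev/null 2>&1; then
        echo "== bwa $cmd =="
        # Usage text goes to stderr; redirect it so head can trim it
        bwa "$cmd" 2>&1 | head -n 5
    else
        # bwa is not available in this shell: only show what would be run
        echo "bwa not found; would run: bwa $cmd"
    fi
done
```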

You need to specify the genomic sequence file (in.fasta). You can also give a label to identify the index with the -p option; it becomes the prefix of all files written by bwa index (by default the name of the FASTA file is used). You have already copied the folder containing the genomic sequence of zebrafish chromosome 12. To keep things well organized, you can create a folder to store the index (name it for example bwaMapping, or any other name you like):

$ mkdir bwaMapping
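As a rough sketch of how the folder and the index prefix fit together, the lines below build a bwa index call that writes its files into the new folder. The FASTA path genome/chr12.fa and the prefix chr12 are placeholders invented for this illustration, not names from this tutorial; replace them with the location of your own copy of the chromosome 12 sequence. The command only runs if bwa and the FASTA file are both found, otherwise it is just printed.

```shell
# Placeholder names for illustration only; adjust them to your own files
genome_fasta="genome/chr12.fa"     # assumed location of the chromosome 12 FASTA
index_dir="bwaMapping"             # folder created with mkdir above
index_prefix="$index_dir/chr12"    # assumed prefix for the index files

# -p: do not complain if the folder already exists
mkdir -p "$index_dir"

if command -v bwa >/dev/null 2>&1 && [ -f "$genome_fasta" ]; then
    # -p sets the prefix of the index files (.amb, .ann, .bwt, .pac, .sa)
    bwa index -p "$index_prefix" "$genome_fasta"
else
    echo "Would run: bwa index -p $index_prefix $genome_fasta"
fi
```

Keeping the prefix inside its own folder means the index files do not get mixed up with the original sequence files.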

Once you have created the folder, open a new file in a text editor. You can give the file any name you want, for example bwaindex.sh. Write at the beginning of the file the