3. Your data
You will use a dataset derived from sequencing of mRNA from zebrafish (Danio rerio) embryos at two different developmental stages. Sequencing was performed on the Illumina platform and generated 76 bp paired-end sequence data from polyA-selected RNA. Due to the time constraints of the practical you will use only a subset of the reads, but the steps you are going to follow are the same whether you use the whole dataset or only a subset. All the data can be found on the repository at the following path:
/ibers/repository/public/courses/Rna-Seq/
(Original data can be found here: http://www.ebi.ac.uk/ena/data/view/ERR022484 and http://www.ebi.ac.uk/ena/data/view/ERR022485)
First, create a directory called zebrafish (or any other name you like) as a subdirectory of your home directory, to store all data and results. The sequencing data are in a folder called data in the repository; copy this folder into the directory you just created. In this tutorial each step is written out explicitly, but by now you should be able to perform most of these steps, and the ones that follow, using Unix commands on your own, so try by yourself before resorting to help. Don’t worry about doing something wrong: you should not be able to do any major damage. Below is the list of commands. As before, $ is the shell prompt; press Enter after each command. Everything after the # is a comment that the shell ignores, so you don’t need to type it:
$ mkdir zebrafish # create the work folder
$ cd zebrafish # and move into it
$ cp -r /hove/vasilis/repository/.../tutorial/data . # copy the data into the zebrafish folder
$ ls -l data # what’s in the folder
Note that the -r option instructs cp to copy entire folders, including their contents. In the data folder you will find the following data files:
- 2cells_1.fastq and 2cells_2.fastq: these files are derived from paired-end RNA-seq data of a 2-cell zebrafish embryo, the first file containing the forward reads and the second the reverse reads;
- 6h_1.fastq and 6h_2.fastq: these files are derived from paired-end RNA-seq data of zebrafish embryos 6 hours post-fertilization, the first file containing the forward reads and the second the reverse reads.
$ pwd # where are you
$ ls # list the contents of the current folder
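Because each pair of files holds the forward and reverse reads of the same fragments, the two files of a pair should contain the same number of reads. The sketch below is one possible way to check this after copying. It assumes the common FASTQ layout of exactly four lines per read (header, sequence, + separator, qualities) with no wrapped sequence lines; the function names count_reads and check_pair are hypothetical helpers introduced here for illustration, not part of the course material.

```shell
# Hypothetical helper: count the reads in a FASTQ file.
# Assumes every record occupies exactly four lines.
count_reads() {
    lines=$(wc -l < "$1")      # total number of lines in the file
    echo $((lines / 4))        # four lines per read
}

# Hypothetical helper: compare the read counts of a forward/reverse pair.
check_pair() {
    fwd=$(count_reads "$1")
    rev=$(count_reads "$2")
    if [ "$fwd" -eq "$rev" ]; then
        echo "OK: $fwd read pairs"
    else
        echo "MISMATCH: $fwd forward vs $rev reverse reads"
    fi
}
```

After defining the two functions in your shell, you could run, for example, check_pair data/2cells_1.fastq data/2cells_2.fastq from inside the zebrafish folder, and likewise for the 6h files. A mismatch would suggest that the copy was incomplete or that the files do not belong together.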