5. Identification and masking of repeat elements

From IBERS Bioinformatics and HPC Wiki
Revision as of 17:21, 19 March 2016 by Vpl

Repeat identification

Genome annotation usually begins with the identification and masking of repeats. The term "repeat" covers several types of sequence: low-complexity sequences such as homopolymeric runs of nucleotides, viral sequences, and transposable elements, including long interspersed nuclear elements (LINEs) and short interspersed nuclear elements (SINEs).
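To make the simplest repeat class concrete, here is a minimal Python sketch that finds homopolymeric runs in a nucleotide string. The function name `find_homopolymer_runs` and the `min_len` threshold of 6 are assumptions chosen for illustration; dedicated repeat-finding tools use far more elaborate models.

```python
import re


def find_homopolymer_runs(seq, min_len=6):
    """Return (start, end, base) for every run of a single nucleotide.

    Coordinates are 0-based and end-exclusive. min_len is an arbitrary
    illustrative threshold, not a community standard.
    """
    # One captured base followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    # Upper-case first so that runs are found regardless of input case.
    return [(m.start(), m.end(), m.group(1))
            for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    example = "ACGT" + "A" * 8 + "CGT"
    print(find_homopolymer_runs(example))
```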

Masking a genome consists of two steps: 1) building the repeat database and 2) masking the genome using that database.
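As an illustration of what the masking step does to a sequence, the following Python sketch applies a list of repeat intervals to a string. The function `mask_sequence` and its interval format (0-based, end-exclusive) are hypothetical; it only shows the two common conventions, soft masking (repeats in lower case) and hard masking (repeats replaced by N).

```python
def mask_sequence(seq, intervals, soft=True):
    """Mask the given (start, end) intervals of seq.

    Intervals are 0-based and end-exclusive. Soft masking lower-cases the
    repeat; hard masking replaces each repeat base with 'N'.
    """
    chars = list(seq)
    for start, end in intervals:
        # Clamp the interval to the sequence bounds before masking.
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = chars[i].lower() if soft else "N"
    return "".join(chars)


if __name__ == "__main__":
    print(mask_sequence("ACGTACGTAC", [(2, 5)]))              # soft masking
    print(mask_sequence("ACGTACGTAC", [(2, 5)], soft=False))  # hard masking
```

Soft masking keeps the underlying bases available to downstream tools, whereas hard masking discards them.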

To construct the repeat database we use RepeatModeler, a de novo repeat family identification and modeling package.
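A typical RepeatModeler run has two commands: `BuildDatabase` to index the genome, then `RepeatModeler` on that database. The Python sketch below only assembles and prints these command lines; it does not run them. The file name `genome.fasta`, the database name `my_genome_db` and the thread count are placeholders, and the helper `repeatmodeler_commands` is hypothetical. The `-pa` option sets the number of parallel jobs; some newer releases may use a different option name, so check `RepeatModeler -h` for the installed version.

```python
import shlex


def repeatmodeler_commands(genome_fasta, db_name, threads=4):
    """Return the two command lines for a basic RepeatModeler run.

    All argument values are placeholders supplied by the caller.
    """
    # Step 1: build the sequence database RepeatModeler works on.
    build_db = ["BuildDatabase", "-name", db_name, genome_fasta]
    # Step 2: de novo repeat family identification on that database.
    model = ["RepeatModeler", "-database", db_name, "-pa", str(threads)]
    return [build_db, model]


if __name__ == "__main__":
    for cmd in repeatmodeler_commands("genome.fasta", "my_genome_db"):
        # Print shell-quoted commands; pass each list to subprocess.run
        # instead if the tools are installed and on the PATH.
        print(shlex.join(cmd))
```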