baobabLuna: the solution space of sorting by reversals

English
77 Pages
Read an excerpt
Gain access to the library to view online
Learn more

Description

baobabLuna: the solution space of sorting by reversals Documentation Marılia D. V. Braga March 15, 2009

  • co-localised genes

  • classes without

  • rhone-alpes bioinformatics

  • sorting sequences

  • optimal sorting


Subjects

Informations

Published by
Reads 10
Language English
Document size 1 MB
Report a problem
the
solution
baobabLuna: space of sorting
Documentation
Mar´ılia D. V. Braga
March
15,
2009
by
reversals
I
I
Acknowledgments
This work was funded by the European Union Programme Alβan (scholarship no. E05D053131BR), the French projects ANR (REGLIS NT05-3 45205 and MIRI BLAN08-1 335497), INRIA ArcoIris (associated with the UniversityofS˜aoPaulo,Brazil)andRhoˆne-AlpesBioinformaticsCenter(PRABI).
III
IV
Abstract
Calculating the reversal distance and finding one optimal sequence of reversals to transform a genome into another are useful algorithmic tools to analyse real evolutionary scenarios. When gene duplications are not allowed, there are polynomial algorithms to solve both problems. However, the number of different optimal sorting sequences is usually huge and some additional criteria should be taken in consideration in order to obtain a more accurate analysis. One strategy is searching for sequences that respect some biological constraints, such as the common intervals, which are the list of clusters of co-localised genes between the considered genomes - an optimal sequence of reversals that does not break the common intervals may be more realistic than one that does break. Another approach is to explore the whole universe of sorting sequences, but, since this set may be too big to be directly interpreted, a model has been proposed to group the sorting sequences into classes of equivalence, reducing thus the size of the set to be handled. Recently an algorithm to direct generate the classes without enumerating all sequences was proposed (besides one representative, this algorithm is also able to give the number of sequences in each equivalence class). The implementation of this algorithm is one of the most important features of baobabLuna. Although the number of classes is much smaller than the number of sorting sequences, it can also be too big. Thus, to reduce the universe of sequences and classes,baobabLunamakes use of different biological constraints, such as the common intervals (initially and progressively detected). In this work we describebaobabLunaframework to deal with genomes and reversals, that, a java contains the implementations of the mentioned algorithms. We also give details of the technical aspects of baobabLuna, a description of its interface, download and setup instructions and a tutorial of the executable programs.
Keywords:Evolution ; genome rearrangements ; algorithms ; sorting by reversals
V
VI
Contents
1 Introduction
1
2 Methodological background and experiments 3 2.1 Sorting by reversals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 2.1.1 Permutations, intervals and reversals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2.1.2 The breakpoint graph and the reversal distance . . . . . . . . . . . . . . . . . . . . . . . . 7 2.1.3 Safe and unsafe reversals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.1.4 Sorting a signed permutation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 2.1.5 The symmetry of sorting by reversals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.1.6 Component-specific reversals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.2 The space of all optimal sorting sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.2.1 The symmetry of the space of sorting sequences . . . . . . . . . . . . . . . . . . . . . . . . 13 2.2.2 An algorithm to enumerate all sorting sequences . . . . . . . . . . . . . . . . . . . . . . . 14 2.3 Traces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.3.1 The symmetry of traces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 2.3.2 Normal form of a trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 2.3.3 Computing traces by enumerating all sorting sequences . . . . . . . . . . . . . . . . . . . 17 2.3.4 An algorithm to directly enumerate the traces . . . . . . . . . . . . . . . . . . . . . . . . . 18 2.3.5 Component-specific reversals and trace composition . . . . . . . . . . . . . . . . . . . . . 19 2.3.6 Implementation and performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 2.3.7 Final remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 2.4 Biological constraints and applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 2.4.1 Modeling traces with biological constraints . . . . . . . . . . . . . . . . . . . . . . . . . . 25 2.4.2 Common intervals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 2.4.3 Stratification on evolution of sexual chromosomes . . . . . . . . . . . . . . . . . . . . . . . 31 2.4.4 Symmetryversus .asymmetry when applying constraints 35 . . . . . . . . . . . . . . . . . . i