A meeting on the occasion of the visit to the University of São Paulo (USP) of the Bamboo Team under the leadership of MarieFrance Sagot, INRIA RhôneAlpes, France.
We are pleased to invite all the interested researchers to this informal meeting. The participants are welcome to attend the lectures as well as the Problem Session (with submissions on the spot). The meeting will focus on algorithmic aspects of problems arising in bioinformatics and combinatorial optimization.
Date  Time  Activity 
April 2 (Monday) 
10:0010:40  Opening (Talk 1) On BAMBOOs, LIRIOs, AMICIs, summer schools on algorithmics, and ciência sem fronteiras MarieFrance Sagot, INRIA and CNRS Univ. Lyon, France 
10:4011:10 
(Talk 2) The maximum
agreement subtree problem Bhalchandra Thatte, IMEUSP  
11:1011:30  = Coffee Break =  
11:3012:10 
(Talk 3) An algorithm to
generate draft genome sequence
scaffolds João Carlos Setubal, IQUSP 

12:1514:00  == Lunch ==  
14:0014:30 
(Talk 4) Algorithms
and complexity of enumerating minimal
precursor sets in genomewide metabolic networks Vicente Acuña, University of Chile 

14:3015:00 
(Talk 5) Telling stories: enumerating compound relationships from a metabolomics experience Paulo V. Milreu, Univ. C. Bernard, Lyon, France 

15:0016:00  Problem Session  
16:0016:20  = Coffee Break =  
16:2017:40  Group Work  
20:00  * Dinner *  
April 3 (Tuesday) 
9:3010:00 
(Talk 6) Detection of local network motifs Etienne Birmelé, Laboratoire Statistique et Génome, Évry, France 
10:0010:30 
(Talk 7) Minimum ratio cover of matrix columns by extreme rays of its induced cone Alexandre da Silva Freire, IMEUSP 

10:3010:50  = Coffee Break =  
10:5011:20  (Talk 8) Sorting by almostsymmetrical reversals
Ulisses Dias, ICUNICAMP 

11:2011:50 
(Talk 9) On computing
the diameter of realworld directed
(weighted) graphs Andrea Marino, Univ. of Florence, Italy 

11:5012:20 
(Talk 10) Next
generation sequencing (NGS) from the
theoretical perspective Gustavo Akio Sacomoto, Univ. C. Bernard, Lyon, France 

12:2014:00  == Lunch ==  
14:0016:00  Group work  
16:0016:20  = Coffee Break =  
16:20xxxx  Group work 
Abstract: In this talk, I will give a general presentation of the team for those who may not know it yet, of our more recent projects for those who know already the team, and of the things we would like to continue doing with you in the future.
Abstract: Given two binary phylogenetic trees on $n$ leaves, we show
that they have a common subtree on at least $O((\log{n})^{1/2 
\epsilon})$ leaves, thus improving on the previously known bound of
$O(\log\log n)$. To achieve this bound, we combine different special
cases: when one of the trees is balanced or when one of the trees is a
caterpillar, we show a lower bound of $O(\log n)$. Another ingredient
is the proof that every binary tree contains a large balanced subtree
or a large caterpillar, a result that is interesting on its own.
Finally, we also show that, there is an $\alpha > 0$ such that when
both the trees are balanced, they have a common subtree on at least
$O(n^\alpha)$ leaves.
[Joint work with Daniel Martin, Centro de Matemática, Computação e Cognição, Universidade Federal do ABC.]
Abstract:
We present a lineartime algorithm that can generate a set of contig scaffolds
for a draft genome sequence represented in contigs given a reference
genome. The algorithm is aimed at prokaryotic genomes and relies on
the presence of matching sequence patterns between the query and
reference genomes that can be interpreted as the result of largescale
inversions; we call these patterns inversion signatures.
Our algorithm is capable of correctly generating a scaffold if at
least one member of every inversion signature pair is present in
contigs and no inversion signatures have been overwritten in
evolution. The algorithm is also capable of generating scaffolds in
the presence of any kind of inversion, even though in this general case
there is no guarantee that all scaffolds in the scaffold set will be correct. We
compare the performance of SIS, the program that implements the
algorithm, to seven other scaffoldgenerating programs. The results of
our tests show that SIS has overall better performance.
[Joint work with Zanoni Dias and Ulisses Dias, UNICAMP.]
Abstract: In the context of studying whole metabolic networks and their interaction with the environment, the following question arises: given a set of target metabolites T and a set of possible external source metabolites S, which are the minimal subsets of S that are able to produce all the metabolites in T ? Such subsets are called the minimal precursor sets of T. The problem is then whether we can enumerate all of them efficiently.
Abstract: We present a constrained version of the problem of enumerating all maximal directed acyclic subgraphs (DAG) of a graph G. In this version, we enumerate maximal DAGs whose sources and targets belong to a predefined subset of the nodes. We call such DAGs stories. We first show how to compute one story in polynomialtime, and then describe two diferent algorithms to tell all possible stories.
Abstract: Network motifs are small subgraphs of networks which are statistically overrepresented, suggesting that they play a role in the organisation of the network. In biological applications, the motifs found by the existing methods appear to be highly concentrated in some areas of the networks. I'll give a definition of a local network motif based on this observation and discuss the statistical and algorithmic means to detect them.
Abstract: Given a matrix "S" and a subset of columns "R", we study the problem of finding a cover of "R" with extreme rays of the cone "F={v in R^n  Sv=0, v >= 0}", where an extreme ray "v" covers a column "k" if "v_k > 0". In order to measure how proportional a cover is, we introduce two different minimization problems, namely the "minimum global ratio cover" (MGRC) and the "minimum local ratio cover" (MLRC) problems. In both cases, we apply the notion of the "ratio" of a vector "v", which is given by "max v_i / min v_j". We show that these two problems are NPhard, even in the case in which "R = 1". We introduce a mixed integer programming formulation for the MGRC problem, which is solvable in polynomial time if all columns should be covered, and introduce a branchandcut algorithm for the MLRC problem. Finally, we present computational experiments on data obtained from real metabolic networks.
Abstract: Inversions are one of the most frequent largescale rearrangements
observed in actual genomes. While a large body of literature exists on
mathematical problems related to the computation of the inversion
distance between abstract genomes, these works generally do not take
into account that most inversions in bacterial chromosomes are
symmetric or roughly symmetric with respect to the origin of
replication. We define a new problem: how to sort genomes (or
permutations) using almostsymmetric inversions. We show an algorithm
that can sort any permutation using only almostsymmetric inversions.
Two variants of this algorithm are presented that have better
performance in practice. We explore the question of determining the
minimum number of almostsymmetric inversions needed to sort a genome
by presenting lower and upper bounds and results for special
permutation families. The results obtained are the first steps in
exploring this interesting new problem.
[Joint work with Zanoni Dias, João C. Setubal and Lenwood S. Heath.]
Abstract: We show a new algorithm for computing the diameter of directed unweighted graphs. Even though, in the worst case, this algorithm has complexity O(nm), where n is the number of nodes and m is the number of edges of the graph, we experimentally show that in practice our method works in linear time. Moreover, we show how to extend our algorithm to the case of directed weighted graphs and, even in this case, we present some preliminary very positive experimental results.
Abstract: We present an overview of some computational methods used to treat NGS data, both genomic and transcriptomic. In the first part, we present two data structures currently used in the sequence assembly problem: overlap graph and de Bruijn graph. Together with some complexity results. In the second part, we present a method to extract meaningful information from RNAseq data (transcriptomic) without assembly.
Chair: Yoshiko Wakabayashi, USP  Combinatorics and
Combinatorial Optimization Research Group  NUMEC
Cochair: MarieFrance Sagot,
Bamboo Team INRIA RhôneAlpes and
CNRS UMR 5558 LBBE University Lyon 1, France
Local committee: Alexandre da Silva Freire (USP), Cristina Gomes Fernandes (USP), Lourdes Vaz da Silva Netto (NUMECUSP)
IMEUSP 
NUMEC 
Project MaClinC

Bamboo
Team
