Auditorium of NUMEC (Núcleo de Apoio à Pesquisa em Modelagem
Estocástica e Complexidade)
at the Institute of Mathematics and Statistics of the University of São Paulo - IME-USP
We are pleased to invite all the interested researchers to this informal meeting. The participants are welcome to attend the lectures as well as the Problem Session (with submissions on the spot). The meeting will focus on algorithmic aspects of problems arising in bioinformatics and combinatorial optimization.
|9:30-10:00||From Bamboo to Erable (Marie-France Sagot, INRIA & LBBE)|
|10:00-10:30||Precursor sets in metabolic networks (Martin Wannagat, INRIA & LBBE)|
|10:30-11:00||Mirinho: efficient precursor miRNA predictor (Susan Higashi, INRIA & LBBE)|
|11:30-12:00||Algorithms to tell stories (Paulo Vieira Milreu, Researcher at TecSinapse)|
|14:00-14:30||Sorting permutations by prefix and suffix versions of reversals and transpositions (Carla Negri Lintzmayer, IC-UNICAMP)|
|14:30-15:00||Convex recoloring of phylogenetic trees (Phablo F.S. Moura, IME-USP)|
(open to all participants who wish to present problems, to discuss ideas, and possibly start new collaborations)
| March 26
|9:30-10:00||Statistical methods in graphs with applications in biological data (André Fujita, IME-USP)|
|10:00-10:30||Algorithmic issues in cophylogenetic analysis (Blerina Sinaimeri, INRIA & LBBE)|
|11:00-11:30||Topological colored motifs in reaction graphs (Marco Aurélio Stefanes, UFMS)|
|11:30-12:00||Efficient algorithms to list st-paths (Gustavo Sacomoto, INRIA & LBBE)|
The talks will be given in the Auditorium of NUMEC (ground floor). For the group work, the participants may use the Auditorium as well as the ground floor of NUMEC. Another blackboard is available next to the Auditorium.
The Laboratory at the first floor may be used by the authorized participants (who have sent their photos to the organization) to connect to the network of IME (with their own notebooks). Eduroam is also available.
Abstract: A short opening section.
Abstract: Metabolic networks give rise to several questions, e.g.: What are the possible pathways explaining differences of concentration of metabolites under two conditions? What are the possible fluxes that maintains the network in steady state? How can one identify sets of metabolites that are stable in time? What are the possible inputs a metabolic network needs to produce a given target metabolite. In this talk I present some algorithms that treat the last question.
Abstract: The vast majority of the methods for pre-miRNA prediction rely on cubic algorithms. They perform well for a small number of sequences; however, at a genome-wide scale they may become impracticable. It is then crucial to develop faster and more reliable methods to tackle such large scale data. To solve this problem, we developed a new method, called Mirinho. The novelty of the method is its efficiency, made possible by the use of an algorithm of quadratic instead of cubic complexity, and the fact that it is a single step approach which does require the use of post-processing filters. Mirinho is thus much faster than the available methods, 70 times more than the quickest one tested. In the majority of the cases, Mirinho also showed better sensitivity while keeping a precision that has a similar order of magnitude as compared to that of other approaches. Mirinho is also flexible in the sense that it was designed to work for the prediction of both animal and plant pre-miRNAs, while other methods are often tuned to a specific clade. Finally, we also provide to the user the option to enrich the prediction by using small RNA sequencing and conservation data.
Abstract: We present a constrained version of the problem of enumerating all maximal directed acyclic subgraphs (DAG) of a graph G. In this version, we enumerate maximal DAGs whose sources and targets belong to a predefined subset of the nodes. We call such DAGs stories. We first show how to compute one story in polynomial-time, and then describe two di erent algorithms to tell all possible stories.
Abstract: Reversals and transpositions are the most common kinds of genome rearrangements, which allow us to establish the divergence between individuals along evolution. When the rearrangements affect segments from the beginning or from the end of the genome, we say they are prefix or suffix rearrangements, respectively. This paper presents the first approximation algorithms for the problems of Sorting by Prefix Reversals and Suffix Reversals, Sorting by Prefix Transpositions and Suffix Transpositions and Sorting by Prefix Reversals, Prefix Transpositions, Suffix Reversals and Suffix Transpositions, all of them with factor 2.
Abstract: Given a graph G with an arbitrary vertex coloring, the Convex Recoloring Problem (CR) consists in recoloring the minimum number of vertices so that each color induces a connected subgraph. This problem is known to be NP-hard even when G is a path. It was introduced by Moran and Snir in 2005, motivated by studies on phylogenetic trees. We show an integer programming formulation for arbitrary graphs, and then specialize it for trees. We study the facial structure of the associated polytope, present several classes of facet-defining inequalities and the corresponding separation algorithms. We also present a branch-and-cut algorithm that we have implemented for the special case of trees, and show the computational results obtained with a large number of instances. We consider instances which are real phylogenetic trees. The experiments show that this approach can be used to solve instances up to 1500 vertices in 2 hours, comparing favorably to other approaches that have been proposed in the literature. This is joint work with Manoel Campêlo (UFC), Alexandre S. Freire (IC-Unicamp), Karla R. Lima (EACH-USP) and Y. Wakabayashi (USP).
Abstract: The understanding of the biological mechanisms underlying human diseases is one of the main topics of study in biological sciences. One of the challenges consists in understanding diseases by developing methods to statistically analyze and computationally manipulate biological data. This difficulty is generated by large data size, heterogeneity, multidimensionality, and presence of intrinsic noise. In this context, I will present formal statistical methods in graphs (hypothesis test, model selection, parameter estimation, statistical test for clusters, etc) with illustrative applications in functional magnetic resonance imaging and gene expression data.
Abstract: Coevolution happens when two (or more) species reciprocally affect each other’s evolution and it is likely to happen when the species have close ecological interactions with one another. These interactions can be of different kind but in this talk we will focus in parasite/host relationships. In this context we will exploit problems in co-phylogeny, that aims to understand how the evolution of the host has driven the evolution of the parasite by reconstructing their ancient relationships using phylogenetic information. We will formulate the co-phyologeny problems as combinatorial problems and show that despite the increasingly research in this area there are still many hurdles remaining. In this direction we propose some challenging open problems.
Abstract: In the context of structural analysis of metabolic networks, the problem of searching occurrences of motifs plays a central role. In metabolic networks, motifs are associated with both the basic modules of molecular information and the functional and structural features. In this talk we address the topological colored motif search problem in metabolic networks, here represented by a reaction graph. Recently, several variations of this problem have been studied. We present some hardness results for finding motifs and we describe a polynomial algorithm for the case in which the motif is a colorful tree.
Abstract: Listing all the simple st-paths (and cycles) in undirected graphs is a classical problem whose efficient solutions date back to the early 70s. For a graph with n vertices and m edges, containing K st-paths, the best known solution in the literature is given by Johnson's algorithm [SIAM J. Computing, 1975] and takes O((K + 1)(m + n)) time. When the graph is weighted, a natural variant is to list only the st-paths with length bounded by some constant. In this talk, I present some techniques that allows us to design an efficient algorithm to list bounded length st-paths [submitted, 2014], and the first optimal algorithm to list st-paths [SODA, 2013].
Chair: Yoshiko Wakabayashi, USP || Combinatorics and
Combinatorial Optimization Research Group || NUMEC
Co-chair: Marie-France Sagot, INRIA Rhône-Alpes, Université Claude Bernard, Lyon I
Alexandre da Silva Freire, IC-UNICAMP
Lourdes Vaz da Silva Netto, NUMEC-USP (firstname.lastname@example.org)