In this work we review the main techniques for enumeration algorithms and present four examples of enumeration algorithms that can be applied to efficiently solve biological problems modelled as biological networks: enumerating central and peripheral nodes of a network, enumerating stories, enumerating paths or cycles, and enumerating bubbles. Note that the corresponding computational problems we define are of more general interest, and our results hold for arbitrary graphs.
Enumerating all the most and least central vertices in a network according to their eccentricity is an example of an enumeration problem whose number of solutions is polynomial and whose solutions can be listed in polynomial time, very often in linear or almost linear time in practice.
Enumerating stories, i.e. all maximal directed acyclic subgraphs of a graph G whose sources and targets belong to a predefined subset of the vertices, is on the other hand an example of an enumeration problem with an exponential number of solutions that can be solved by a non-trivial brute-force approach. Given a metabolic network, each individual story should explain how some interesting metabolites are derived from others through a chain of reactions, keeping all alternative pathways between sources and targets.
Enumerating cycles or paths in an undirected graph, such as a protein-protein interaction network, is an example of an enumeration problem in which all the solutions can be listed by an optimal algorithm, i.e. the time required to list all the solutions is dominated by the time to read the graph plus the time required to print all of them. Extending this result to directed graphs would make it possible to deal more efficiently with feedback loops and signed path analysis in signed or directed interaction graphs, such as gene regulatory networks.
Finally, enumerating mouths or bubbles with a source s in a directed graph, that is, enumerating all pairs of vertex-disjoint directed paths between the source s and every possible target, is an example of an enumeration problem in which all the solutions can be listed by a linear-delay algorithm, meaning that the delay between any two consecutive solutions is linear. This is achieved by turning the problem into a constrained cycle enumeration problem. Such patterns, in a de Bruijn graph representation of the reads obtained by sequencing, are related to polymorphisms in DNA- or RNA-seq data.
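To make the first of the four problems concrete, the following is a minimal sketch of enumerating the most and least central vertices by eccentricity. It is a naive baseline that runs one breadth-first search per vertex, not the faster method developed in the book. The function names, the adjacency-dictionary input format, and the assumption of a connected, unweighted, undirected graph are all illustrative choices made here.

```python
from collections import deque


def eccentricities(adj):
    """Return {v: eccentricity of v} using one BFS per vertex.

    `adj` is assumed to map each vertex to a list of its neighbours
    in a connected, unweighted, undirected graph.
    """
    ecc = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) != len(adj):
            raise ValueError("graph must be connected")
        # Eccentricity: distance to the farthest vertex from `source`.
        ecc[source] = max(dist.values())
    return ecc


def radial_and_diametral(adj):
    """Return (radius, radial vertices, diameter, diametral vertices)."""
    ecc = eccentricities(adj)
    radius = min(ecc.values())
    diameter = max(ecc.values())
    radial = sorted(v for v, e in ecc.items() if e == radius)
    diametral = sorted(v for v, e in ecc.items() if e == diameter)
    return radius, radial, diameter, diametral


# Hypothetical example: the path a - b - c - d - e.
path = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
print(radial_and_diametral(path))  # (2, ['c'], 4, ['a', 'e'])
```

On the path, the middle vertex c is the only radial (most central) vertex and the two endpoints are the diametral (most peripheral) ones. This baseline costs O(nm) time for n vertices and m edges, which is the cost that a method running in almost linear time in practice would avoid.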
Author(s): Andrea Marino
Series: Atlantis Studies in Computing
Publisher: Atlantis Press
Year: 2015
Language: English
Pages: 151
Tags: Algorithm Analysis and Problem Complexity; Data Mining and Knowledge Discovery; Computational Biology/Bioinformatics
Front Matter....Pages i-xvii
Introduction....Pages 1-9
Front Matter....Pages 11-11
Enumeration Algorithms....Pages 13-35
An Application: Biological Graph Analysis....Pages 37-44
Front Matter....Pages 45-45
Telling Stories: Enumerating Maximal Directed Acyclic Graphs with Constrained Set of Sources and Targets....Pages 47-63
Enumerating Bubbles: Listing Pairs of Vertex Disjoint Paths....Pages 65-77
Enumerating Cycles and (s, t)-Paths in Undirected Graphs....Pages 79-105
Front Matter....Pages 107-107
Enumerating Diametral and Radial Vertices and Computing Diameter and Radius of a Graph....Pages 109-138
Conclusions....Pages 139-140
Back Matter....Pages 141-151