Clustering Challenges In Biological Networks

This document was uploaded by one of our users. The uploader already confirmed that they had the permission to publish it. If you are author/publisher or own the copyright of this documents, please report to us by using this DMCA report form.

Simply click on the Download Book button.

Yes, Book downloads on Ebookily are 100% Free.

Sometimes the book is free on Amazon As well, so go ahead and hit "Search on Amazon"

This volume presents a collection of papers dealing with various aspects of clustering in biological networks and other related problems in computational biology. It consists of two parts, with the first part containing surveys of selected topics and the second part presenting original research contributions. This book will be a valuable source of material to faculty, students, and researchers in mathematical programming, data analysis and data mining, as well as people working in bioinformatics, computer science, engineering, and applied mathematics. In addition, the book can be used as a supplement to any course in data mining or computational/systems biology.

Contents: Surveys of Selected Topics: Fixed-Parameter Algorithms for Graph-Modeled Data Clustering (Hüffner et al.); Probabilistic Distance Clustering: Algorithm and Applications (C Iyigun & A Ben-Israel); Analysis of Regulatory and Interaction Networks from Clusters of Co-expressed Genes (E Yang et al.); Graph-based Approaches for Motif Discovery (E Zaslavsky); Statistical Clustering Analysis: An Introduction (H Zhang); New Methods and Applications: Diversity Graphs (P Blain et al.); Identifying Critical Nodes in Protein-Protein Interaction Networks (V Boginski & C W Commander); Faster Algorithms for Constructing a Concept (Galois) Lattice (V Choi); A Projected Clustering Algorithm and Its Biomedical Application (P Deng & W Wu); Graph Algorithms for Integrated Biological Analysis, with Applications to Type 1 Diabetes Data (J D Eblen et al.); A Novel Similarity-based Modularity Function for Graph Partitioning (Z Feng et al.); Mechanism-based Clustering of Genome-wide RNA Levels: Roles of Transcription and Transcript-Degradation Rates (S Ji et al.); The Complexity of Feature Selection for Consistent Biclustering (O E Kundakcioglu & P M Pardalos); Clustering Electroencephalogram Recordings to Study Mesial Temporal Lobe Epilepsy (C-C Liu et al.); Relating Subjective and Objective Pharmacovigilance Association Measures (R K Pearson); A Novel Clustering Approach: Global Optimum Search with Enhanced Positioning (M P Tan & C A Floudas).

Author(s): Sergiy Butenko, Sergiy Butenko, W Art Chaovalitwongse, Panos M Pardalos
Publisher: World Scientific
Year: 2009

Language: English
Pages: 347
City: New Jersry

Contents......Page 10
Preface......Page 8
Part 1 Surveys of Selected Topics......Page 14
1. Fixed-Parameter Algorithms for Graph-Modeled Data Clustering F. H¨uffner, R. Niedermeier and S. Wernicke......Page 16
1.1. Introduction......Page 17
1.2. Fixed-Parameter Tractability Basics and Techniques......Page 19
1.2.1. Kernelizations......Page 20
1.2.1.1. An Introductory Example......Page 21
1.2.1.2. The Kernelization Concept......Page 22
1.2.2. Depth-Bounded Search Trees......Page 23
1.3.1. Clique......Page 25
1.3.1.1. Finding Maximum Cardinality Cliques......Page 26
1.3.1.2. Enumerating Maximal Cliques......Page 27
1.3.2. Cluster Editing......Page 28
1.3.3. Clique Cover......Page 32
1.4.1. Practical Guidelines......Page 35
1.4.2. Challenges......Page 36
References......Page 37
2.1. Introduction......Page 42
2.2. Probabilistic {d,q}–Clustering......Page 44
2.2.1. Probabilities......Page 45
2.2.3. An Extremal Principle......Page 46
2.2.4. An Extremal Principle for the Cluster Sizes......Page 48
2.2.5. Centers......Page 49
2.3. ThePDQAlgorithm......Page 51
2.4. Estimation of Parameters of Normal Distribution......Page 53
2.4.1. A Comparison of the PDQ Algorithm (Algorithm 1) and the EM Method (Algorithm 2)......Page 54
2.5. Numerical Experiments......Page 55
2.6.1. Fermat–Weber Location Problem......Page 59
2.6.2. Multiple Facility Location Problem......Page 60
2.7. Determining the “Right”Number of Clusters......Page 63
References......Page 64
3. Analysis of Regulatory and Interaction Networks from Clusters of Co-expressed Genes E. Yang, A. Misra, T. J. Maguire and I. P. Androulakis......Page 66
3.1. Identification of Intervention Targets: Regulatory and Interaction Networks......Page 67
3.1.1. Identification of Informative Temporal Expression Patterns......Page 69
3.2.2. Regulatory Network Construction and Analysis......Page 72
3.3.2. Interaction Network Construction and Analysis......Page 82
3.4. Intervention Strategies......Page 88
References......Page 89
4.1. Introduction......Page 96
4.2. Graph-Theoretic Formulation......Page 99
4.3.1. Edge-Modeling Formulation......Page 101
4.3.1.1. Graph Pruning......Page 102
4.3.2. Cost-Aggregating Formulation......Page 103
4.4. Maximum Density Subgraph-based Algorithm......Page 105
4.5. Subtle Motif Algorithms......Page 106
4.5.2. Clique Finding with Consensus Constraint......Page 107
4.6. Discussion......Page 108
References......Page 109
5.1. Introduction......Page 114
5.2.1.1. Euclidean Distance and Minkowski Distance......Page 116
5.2.1.2. Mahalanobis Distance......Page 118
5.2.2. Measures for Variable Clustering......Page 119
5.2.2.2. Mutual Information......Page 120
5.3.1. K-Means Algorithm......Page 122
5.3.2. E-M Algorithm......Page 124
5.3.3. Hierarchical Clustering......Page 125
5.3.4. Self-Organizing Map......Page 128
5.4. Determining the Number of Clusters......Page 132
5.4.1. Model-based Method......Page 133
5.4.2. Scale-based Method......Page 134
References......Page 138
Part 2 New Methods and Applications......Page 140
6. Diversity Graphs P. Blain, C. Davis, A. Holder, J. Silva and C. Vinzant......Page 142
6.2. Notation, Definitions and Preliminary Results......Page 143
6.3. Graphs That Support Diversity......Page 148
6.4. Algorithms and Solutions for the Pure Parsimony Problem......Page 153
6.5. Directions for Future Research......Page 162
References......Page 163
7.1. Introduction......Page 166
7.2. Protein-Protein Interaction Networks......Page 167
7.3. Optimization Approaches for Critical Node Detection......Page 168
7.3.2. Cardinality Constrained Problem......Page 169
7.4.1. Multi-Start Combinatorial Heuristic......Page 171
7.4.2. Genetic Algorithms......Page 172
7.5. Computational Experiments......Page 173
References......Page 178
8.1. Introduction......Page 182
8.2. Background and Terminology on FCA......Page 185
8.3. BasicProperties......Page 186
8.3.1. Defining the Equivalence Classes......Page 187
8.3.2. Characterizations of Closure......Page 188
8.4.1. High-Level Idea......Page 189
8.4.2. Implementation......Page 190
8.4.2.1. Further Improvement: Dynamically Update Adjacency Lists......Page 191
8.5. Variants of theAlgorithm......Page 192
8.5.1. Algorithm 2: Computing All Concepts or Maximal Bipartite Cliques......Page 193
8.6. Discussion......Page 194
References......Page 195
Appendix......Page 198
9. A Projected Clustering Algorithm and Its Biomedical Application P. Deng, Q. Ma and W. Wu......Page 200
9.1. Introduction......Page 201
9.2.1. Density-based Algorithms......Page 203
9.2.2. Distance-based Algorithms......Page 204
9.3. The IPROCLUSAlgorithm......Page 205
9.3.1. Modified Manhattan Segmental Distance......Page 207
9.3.2. Initialization Phase......Page 208
9.3.3. Iterative Phase......Page 209
9.3.3.1. Simplified Replacing Logic......Page 210
9.3.4.1. Dimension Tuning Process......Page 211
9.4. EmpiricalResults......Page 212
9.4.2. Results on Synthetic Datasets......Page 213
9.5. Conclusion......Page 217
References......Page 218
10. Graph Algorithms for Integrated Biological Analysis, with Applications to Type 1 Diabetes Data J. D. Eblen, I. C. Gerling, A. M. Saxton, J. Wu, J. R. Snoddy and M. A. Langston......Page 220
10.1. Overview......Page 221
10.2. Description ofData......Page 222
10.4. Clique and Its Variants......Page 223
10.5. Statistical Evaluation and Biological Relevance......Page 226
10.6. ProteomicData Integration......Page 228
10.7. Remarks......Page 232
References......Page 233
11.1. Introduction......Page 236
11.2. RelatedWork......Page 238
11.3. A Novel Similarity-based Modularity......Page 240
11.4. A Genetic Graph Partitioning Algorithm......Page 242
11.5. AFastAgglomerativeAlgorithm......Page 243
11.6.1. Tests on Synthetic Graphs......Page 244
11.6.2. Real Applications......Page 246
References......Page 248
12. Mechanism-based Clustering of Genome-wide RNA Levels: Roles of Transcription and Transcript-Degradation Rates S. Ji, W. A. Chaovalitwongse, N. Fefferman, W. Yoo and J. E. Perez-Ortin......Page 250
12.1. Introduction......Page 251
12.2.2. Measuring Transcription Rates (TR) Using the Genomic Run-on (GRO) Method......Page 253
12.2.4. The TL-TR Plots......Page 254
12.3. StatisticalAnalysis......Page 255
12.3.1. Calibration of TL Data......Page 256
12.3.2. Calibration of TR Data......Page 257
12.3.3. Kinetic Analysis of the Changes in mRNA Levels......Page 258
12.3.4. Transcript-Degradation to Transcription (D/T) Ratios......Page 259
12.4. Experimental Results......Page 260
12.5. Conclusion and Discussion......Page 264
References......Page 266
13.1. Introduction......Page 270
13.2. Consistent Biclustering......Page 272
13.3. Complexity Results......Page 276
References......Page 278
14. Clustering Electroencephalogram Recordings to Study Mesial Temporal Lobe Epilepsy C.-C. Liu, W. Suharitdamrong, W. A. Chaovalitwongse, G. A. Ghacibeh and P. M. Pardalos......Page 280
14.1. Introduction......Page 281
14.2. Epilepsy as aDynamicalBrainDisorder......Page 282
14.4. Graph-TheoreticModeling forBrainConnectivity......Page 283
14.4.1. Cross–Mutual Information (CMI)......Page 284
14.4.2. Maximum Clique Algorithm......Page 287
14.5. Results......Page 289
References......Page 291
15.1. Introduction......Page 294
15.2. Aggregate Associations......Page 295
15.3. Subjective Associations......Page 299
15.4. Case-Specific Associations......Page 300
15.5. Relations between Measures......Page 301
15.6. Clustering Drugs......Page 303
15.6.1. The Case Study......Page 304
15.6.2. The Clustering Approach......Page 305
15.6.3. Summary of the Results......Page 308
15.7. Interpreting the Clusters......Page 311
15.8. Summary......Page 315
References......Page 318
16. A Novel Clustering Approach: Global Optimum Search with Enhanced Positioning M. P. Tan and C. A. Floudas......Page 320
16.1. Introduction......Page 321
16.2.1. Experimental Data......Page 323
16.2.2.2. Hard Clustering by Global Optimization......Page 324
16.2.2.3. The GOS Algorithm for Clustering......Page 326
16.2.2.4. Determining the Optimal Number of Clusters......Page 330
16.2.3. Proposed Algorithm......Page 331
16.3.1. Description of Comparative Study......Page 333
16.3.2. Intra-cluster Error Sum......Page 334
16.3.3. Inter-cluster Error Sum......Page 335
16.3.5. Optimal Number of Clusters......Page 336
16.3.6. Coherence and Biological Relevance......Page 337
16.3.7. Additional Constraints for Large Datasets......Page 339
16.5. Computational Resources......Page 340
References......Page 341
Index......Page 346