Data mining in biomedicine using ontologies

An ontology is a set of vocabulary terms with explicitly stated meanings and relations to other terms. A growing number of ontologies are being built and used to annotate data in biomedical research. Given the tremendous amount of data being generated, ontologies are now used in numerous ways, including connecting different databases, refining search capabilities, interpreting experimental and clinical data, and inferring knowledge. This resource introduces researchers to the latest developments in bio-ontologies. The book provides the theoretical foundations and examples of ontologies, as well as applications of ontologies in biomedicine, from the molecular level to the clinical level. Readers will also find details on the technological infrastructure for bio-ontologies. The volume presents a wide range of practical bio-ontology information, offering professionals detailed guidance on the clustering of biological data, protein classification, gene and pathway prediction, and text mining.

Author(s): Mihail Popescu, Dong Xu
Series: Artech House Series Bioinformatics & Biomedical Imaging
Publisher: Artech House
Year: 2009

Language: English
Pages: 279

Data Mining in Biomedicine Using Ontologies......Page 2
Contents......Page 6
Foreword......Page 12
Preface......Page 14
1.1 Introduction......Page 18
1.2.2 Recent Definition in Computer Science......Page 19
1.2.3 Origins of Bio-Ontologies......Page 20
1.2.5 Recent Advances in Computer Science......Page 21
1.3.1 Basic Components of Ontologies......Page 22
1.3.2 Components for Humans, Components for Computers......Page 23
1.4.1 The OBO Format and the OBO Consortium......Page 24
1.4.3 OWL and RDF/XML......Page 26
1.5 Spotlight on GO and UMLS......Page 27
1.5.1 The Gene Ontology......Page 28
1.5.2 The Unified Medical Language System......Page 29
1.6 Types and Examples of Ontologies......Page 30
1.6.2 Domain Ontologies......Page 31
1.6.4 Informal Ontologies......Page 32
1.6.6 Application Ontologies......Page 33
1.7 Conclusion......Page 34
References......Page 35
2.1 Introduction......Page 40
2.1.1 History......Page 42
2.1.2 Tversky’s Parameterized Ratio Model of Similarity......Page 44
2.1.3 Aggregation in Similarity Assessment......Page 45
2.2.1 Path-Based Measures......Page 47
2.2.2 Information Content Measures......Page 49
2.2.3 A Relationship Between Path-Based and Information-Content Measures......Page 52
2.3.1 Entity Class Similarity in Ontologies......Page 53
2.3.2 Cross-Ontological Similarity Measures......Page 54
2.3.3 Exploiting Common Disjunctive Ancestors......Page 55
2.4 Conclusion......Page 56
References......Page 57
3.1 Introduction......Page 62
3.2 Relational Fuzzy C-Means (NERFCM)......Page 64
3.3 Correlation Cluster Validity (CCV)......Page 66
3.4 Ontological SOM (OSOM)......Page 67
3.5.1 Test Dataset......Page 69
3.5.2 Clustering of the GPD194 Dataset Using NERFCM......Page 70
3.5.3 Determining the Number of Clusters of GPD194 Dataset Using CCV......Page 71
3.5.4 GPD194 Analysis Using OSOM......Page 73
3.6 Conclusion......Page 76
References......Page 77
4.1 Introduction......Page 80
4.1.1 Analyzing Sequence Data......Page 81
4.1.2 The Protein Phosphatase Family......Page 82
4.2.2 The Datasets......Page 83
4.2.3 The Phosphatase Ontology......Page 84
4.3.1 Protein Phosphatases in Humans......Page 87
4.3.2 Results from the Analysis of A. fumigatus......Page 88
4.3.3 Ontology System Versus A. fumigatus Automated Annotation Pipeline......Page 89
4.4 Ontology Classification in the Comparative Analysis of Three Protozoan Parasites—A Case Study......Page 90
4.4.2 TriTryps Protein Phosphatases......Page 91
4.4.4 Sequence Analysis Results from the TriTryps Phosphatome Study......Page 92
4.4.5 Evaluation of the Ontology Classification Method......Page 94
4.5 Conclusion......Page 95
References......Page 96
5.1 Introduction......Page 100
5.2.1 GO Index-Based Functional Similarity......Page 101
5.2.2 GO Semantic Similarity......Page 102
5.3.1 Gene-Gene Relationship Revealed in Microarray Data......Page 103
5.4.1 Building the Relationship Among Genes Using One Dataset......Page 104
5.4.2 Meta-Analysis of Microarray Data......Page 106
5.4.3 Function Learning from Data......Page 107
5.4.4 Functional-Linkage Network......Page 109
5.5.1 Local Prediction......Page 110
5.5.2 Global Prediction Using a Boltzmann Machine......Page 112
5.6.2 Sequence-Based Prediction......Page 115
5.6.3 Meta-Analysis of Yeast Microarray Data......Page 116
5.6.4 Case Study: Sin1 and PCBP2 Interactions......Page 118
5.7 Transcription Network Feature Analysis......Page 120
5.7.2 Kinetic Model for Time Series Microarray......Page 121
5.7.3 Regulatory Network Reconstruction......Page 122
5.7.4 GO-Enrichment Analysis......Page 123
5.9 Conclusion......Page 124
References......Page 125
6.1 Rule-Based Representation in Biomedical Applications......Page 130
6.2 Ontological Similarity as a Fuzzy Membership......Page 132
6.3 Ontological Fuzzy Rule System (OFRS)......Page 134
6.4 Application of OFRSs: Mapping Genes to Biological Pathways......Page 137
6.4.1 Mapping Gene to Pathways Using a Disjunctive OFRS......Page 138
6.4.2 Mapping Genes to Pathways Using an OFRS in an Evolutionary Framework......Page 144
References......Page 148
7.1 Association Rule Mining and Fuzzy Association Rule Mining Overview......Page 150
7.1.1 Association Rules: Formal Definition......Page 151
7.1.2 Association Rule Mining Algorithms......Page 154
7.1.3 Apriori Algorithm......Page 155
7.1.4 Fuzzy Association Rules......Page 157
7.2.1 Unveiling Biological Associations by Extracting Rules Involving GO Terms......Page 161
7.2.2 Giving Biological Significance to Rule Sets by Using GO......Page 164
7.2.3 Other Joint Applications of Association Rules and GO......Page 167
7.3 Applications for Extracting Knowledge from Microarray Data......Page 169
7.3.1 Association Rules That Relate Gene Expression Patterns with Other Features......Page 170
7.3.2 Association Rules to Obtain Relations Between Genes and Their Expression Values......Page 172
References......Page 174
8.1 Introduction......Page 180
8.2 Representing Background Knowledge—Ontology......Page 181
8.2.1 An Algebraic Approach to Ontologies......Page 182
8.2.2 Modeling Ontologies......Page 183
8.3 Referencing the Background Knowledge—Providing Descriptions......Page 184
8.3.1 Instantiated Ontology......Page 187
8.4.1 Connectivity Clustering......Page 190
8.4.2 Similarity Clustering......Page 194
8.5 Conclusion......Page 198
References......Page 199
9.1 Why Reasoning Matters......Page 202
9.2.1 A Taxonomy of Data and Reasoning......Page 204
9.2.2 Contemporary Reasoners......Page 206
9.2.3 Anatomy as a New Frontier for Biological Reasoners......Page 210
9.3.1 Current Practices......Page 212
9.3.2 Structural Issues That Limit Reasoning......Page 213
9.3.3 A Biological Example: The Maize Tassel......Page 214
9.3.4 Representational Issues......Page 216
9.4 Facilitating Reasoning About Anatomy......Page 222
9.4.2 Layer on Top of the Ontology......Page 223
9.4.3 Change the Representation......Page 224
Acknowledgments......Page 225
References......Page 226
10.1.1 What Is Text Mining?......Page 236
10.2 The Importance of Ontology to Text Mining......Page 237
10.3.1 Introduction to Document Clustering......Page 239
10.3.2 The Graphical Representation Model......Page 240
10.3.3 Graph Clustering for Graphical Representations......Page 245
10.3.4 Text Summarization......Page 247
10.3.5 Document Clustering and Summarization with Graphical Representation......Page 250
10.4 Swanson’s Undiscovered Public Knowledge (UDPK)......Page 252
10.4.1 How Does UDPK Work?......Page 253
10.4.2 A Semantic Version of Swanson’s UDPK Model......Page 254
10.4.3 The Bio-SbKDS Algorithm......Page 255
10.5 Conclusion......Page 263
References......Page 264
About the Editors......Page 266
List of Contributors......Page 267
Index......Page 270