Machine learning is the branch of artificial intelligence that develops algorithms enabling computers to learn. Ensembles are an integral part of machine learning. A typical ensemble comprises several algorithms, each predicting the class label, or the degree of class membership, for an input described by a set of measurable characteristics, often called features. Feature Selection and Ensemble Methods for Bioinformatics: Algorithmic Classification and Implementations offers a unique perspective on the machine learning aspects of microarray gene-expression-based cancer classification. This multidisciplinary text sits at the intersection of computer science and biology and can therefore serve as a reference for researchers and students from both fields. Each chapter describes the process of algorithm design from beginning to end and aims to inform readers of best practices for use in their own research.
Author(s): Oleg Okun, Lambros Skarlas
Edition: 1
Publisher: IGI Global
Year: 2011
Language: English
Pages: 460
Tags: Biological disciplines; Mathematical methods and modeling in biology; Bioinformatics
Title
Copyright Page
Table of Contents
Preface
Biological Background
Gene Expression Data Sets
Introduction to Data Classification
Naïve Bayes
Nearest Neighbor
Classification Tree
Support Vector Machines
Introduction to Feature and Gene Selection
Feature Selection Based on Elements of Game Theory
Kernel-Based Feature Selection with the Hilbert-Schmidt Independence Criterion
Extreme Value Distribution Based Gene Selection
Evolutionary Algorithm for Identifying Predictive Genes
Redundancy-Based Feature Selection
Unsupervised Feature Selection
Differential Evolution for Finding Predictive Gene Subsets
Ensembles of Classifiers
Classifier Ensembles Built on Subsets of Features
Bagging and Random Forests
Boosting and AdaBoost
Ensemble Gene Selection
Introduction to Classification Error Estimation
ROC Curve, Area Under It, Other Classification Performance Characteristics and Statistical Tests
Bolstered Resubstitution Error
Performance Evaluation
Application Examples
End Remarks
About the Contributors
Index