Core Concepts in Data Analysis: Summarization, Correlation and Visualization

Core Concepts in Data Analysis: Summarization, Correlation and Visualization provides in-depth descriptions of those data analysis approaches that either summarize data (principal component analysis and clustering, including hierarchical and network clustering) or correlate different aspects of data (decision trees, linear rules, neural networks, and Bayes rule).

Boris Mirkin takes an unconventional approach and introduces the concept of multivariate data summarization as a counterpart to conventional machine learning prediction schemes, utilizing techniques from statistics, data analysis, data mining, machine learning, computational intelligence, and information retrieval.

Innovations following from his in-depth analysis of the models underlying summarization techniques are introduced, and applied to challenging issues such as the number of clusters, mixed scale data standardization, interpretation of the solutions, as well as relations between seemingly unrelated concepts: goodness-of-fit functions for classification trees and data standardization, spectral clustering and additive clustering, correlation and visualization of contingency data.

The mathematical detail is encapsulated in the so-called “formulation” parts, whereas most material is delivered through “presentation” parts that explain the methods by applying them to small real-world data sets; concise “computation” parts cover the algorithmic and coding issues.

Four layers of active learning and self-study exercises are provided: worked examples, case studies, projects and questions.

Author(s): Boris Mirkin (auth.)
Series: Undergraduate Topics in Computer Science
Edition: 1
Publisher: Springer-Verlag London
Year: 2011

Language: English
Pages: 390
Tags: Discrete Mathematics in Computer Science; Probability and Statistics in Computer Science; Math Applications in Computer Science; Artificial Intelligence (incl. Robotics); Pattern Recognition

Front Matter (pages i-xx)
Introduction: What Is Core (pages 1-30)
1D Analysis: Summarization and Visualization of a Single Feature (pages 31-65)
2D Analysis: Correlation and Visualization of Two Features (pages 67-112)
Learning Multivariate Correlations in Data (pages 113-172)
Principal Component Analysis and SVD (pages 173-219)
K-Means and Related Clustering Methods (pages 221-281)
Hierarchical Clustering (pages 283-313)
Approximate and Spectral Clustering for Network and Affinity Data (pages 315-356)
Back Matter (pages 357-390)