Making sense of data: a practical guide to exploratory data analysis and data mining

Making Sense of Data educates readers on the steps and issues that need to be considered in order to successfully complete a data analysis or data mining project. The author provides clear explanations that guide the reader to make timely and accurate decisions from data in almost every field of study. A step-by-step approach aids professionals in carefully analyzing data and implementing results, leading to the development of smarter business decisions. With a comprehensive collection of methods from both data analysis and data mining disciplines, this book successfully describes the issues that need to be considered, the steps that need to be taken, and appropriately treats technical topics to accomplish effective decision making from data.

Readers are given a solid foundation in the procedures associated with complex data analysis or data mining projects and are provided with concrete discussions of the most universal tasks and technical solutions related to the analysis of data, including:

* Problem definitions
* Data preparation
* Data visualization
* Data mining
* Statistics
* Grouping methods
* Predictive modeling
* Deployment issues and applications

Throughout the book, the author examines why these multiple approaches are needed and how these methods will solve different problems. Processes, along with methods, are carefully and meticulously outlined for use in any data analysis or data mining project.

From summarizing and interpreting data, to identifying non-trivial facts, patterns, and relationships in the data, to making predictions from the data, Making Sense of Data addresses the many issues that need to be considered as well as the steps that need to be taken to master data analysis and mining.

Author(s): Glenn J. Myatt
Edition: 1
Publisher: Wiley-Interscience
Year: 2007

Language: English
Pages: 293
City: Hoboken, N.J.
Tags: Informatics and computer engineering; Artificial intelligence; Data mining

Contents......Page 6
Preface......Page 12
1.1 OVERVIEW......Page 14
1.4 IMPLEMENTATION OF THE ANALYSIS......Page 15
1.6 BOOK OUTLINE......Page 18
1.8 FURTHER READING......Page 20
2.2 OBJECTIVES......Page 21
2.3 DELIVERABLES......Page 22
2.4 ROLES AND RESPONSIBILITIES......Page 23
2.5 PROJECT PLAN......Page 24
2.6.2 Problem......Page 25
2.6.5 Current Situation......Page 26
2.7 SUMMARY......Page 27
2.8 FURTHER READING......Page 29
3.2 DATA SOURCES......Page 30
3.3.1 Data Tables......Page 32
3.3.2 Continuous and Discrete Variables......Page 33
3.3.3 Scales of Measurement......Page 34
3.3.4 Roles in Analysis......Page 35
3.3.5 Frequency Distribution......Page 36
3.4.2 Cleaning the Data......Page 37
3.4.4 Data Transformations......Page 39
3.4.5 Segmentation......Page 44
3.6 EXERCISES......Page 46
3.7 FURTHER READING......Page 48
4.2.2 Contingency Tables......Page 49
4.2.3 Summary Tables......Page 52
4.3.2 Frequency Polygrams and Histograms......Page 53
4.3.3 Scatterplots......Page 56
4.3.4 Box Plots......Page 58
4.3.5 Multiple Graphs......Page 59
4.4 SUMMARY......Page 62
4.5 EXERCISES......Page 65
4.6 FURTHER READING......Page 66
5.1 OVERVIEW......Page 67
5.2.1 Overview......Page 68
5.2.2 Central Tendency......Page 69
5.2.3 Variation......Page 70
5.2.4 Shape......Page 74
5.2.5 Example......Page 75
5.3.1 Overview......Page 76
5.3.2 Confidence Intervals......Page 80
5.3.3 Hypothesis Tests......Page 85
5.3.4 Chi-Square......Page 95
5.3.5 One-Way Analysis of Variance......Page 97
5.4.1 Overview......Page 101
5.4.2 Visualizing Relationships......Page 103
5.4.3 Correlation Coefficient (r)......Page 105
5.4.4 Correlation Analysis for More Than Two Variables......Page 107
5.5 SUMMARY......Page 109
5.6 EXERCISES......Page 110
5.7 FURTHER READING......Page 113
6.1.1 Overview......Page 115
6.1.2 Grouping by Values or Ranges......Page 116
6.1.3 Similarity Measures......Page 117
6.1.4 Grouping Approaches......Page 121
6.2.1 Overview......Page 123
6.2.2 Hierarchical Agglomerative Clustering......Page 124
6.2.3 K-means Clustering......Page 133
6.3.1 Overview......Page 142
6.3.2 Grouping by Value Combinations......Page 143
6.3.3 Extracting Rules from Groups......Page 144
6.3.4 Example......Page 150
6.4.1 Overview......Page 152
6.4.2 Tree Generation......Page 155
6.4.3 Splitting Criteria......Page 157
6.4.4 Example......Page 164
6.6 EXERCISES......Page 166
6.7 FURTHER READING......Page 168
7.1.1 Overview......Page 169
7.1.2 Classification......Page 171
7.1.3 Regression......Page 175
7.1.4 Building a Prediction Model......Page 179
7.1.5 Applying a Prediction Model......Page 180
7.2.2 Simple Linear Regression......Page 182
7.2.3 Simple Nonlinear Regression......Page 185
7.3.1 Overview......Page 189
7.3.2 Learning......Page 191
7.3.3 Predicting......Page 193
7.4 CLASSIFICATION AND REGRESSION TREES......Page 194
7.4.1 Overview......Page 194
7.4.2 Predicting Using Decision Trees......Page 195
7.4.3 Example......Page 197
7.5.2 Neural Network Layers......Page 200
7.5.3 Node Calculations......Page 201
7.5.4 Neural Network Predictions......Page 203
7.5.5 Learning Process......Page 204
7.5.6 Backpropagation......Page 205
7.5.7 Using Neural Networks......Page 209
7.5.8 Example......Page 210
7.6 OTHER METHODS......Page 212
7.7 SUMMARY......Page 217
7.8 EXERCISES......Page 218
7.9 FURTHER READING......Page 222
8.2 DELIVERABLES......Page 223
8.3 ACTIVITIES......Page 224
8.4 DEPLOYMENT SCENARIOS......Page 225
8.6 FURTHER READING......Page 226
9.1 SUMMARY OF PROCESS......Page 228
9.2.2 Problem Definition......Page 231
9.2.3 Data Preparation......Page 233
9.2.4 Implementation of the Analysis......Page 240
9.3.1 Overview......Page 250
9.3.2 Text Data Mining......Page 252
9.4 FURTHER READING......Page 253
A.2 STUDENT’S T-DISTRIBUTION......Page 254
A.3 CHI-SQUARE DISTRIBUTION......Page 258
A.4 F-DISTRIBUTION......Page 262
Appendix B - Answers to Exercises......Page 271
Glossary......Page 278
Bibliography......Page 286
Index......Page 288