Data Mining for Business Analytics: Concepts, Techniques, and Applications in Python

Data Mining for Business Analytics: Concepts, Techniques, and Applications in Python presents an applied approach to data mining concepts and methods, using Python software for illustration.

Readers will learn how to implement a variety of popular data mining algorithms in Python (free and open-source software) to tackle business problems and opportunities.
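
To give a flavor of that workflow (partitioning data, fitting a model, and judging performance on a holdout set, topics covered in Chapters 2, 5, and 9), here is a minimal, self-contained sketch using pandas and scikit-learn. The tiny dataset and column names are invented for illustration and are not taken from the book, whose examples use real datasets and companion utility code.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical data: two predictors and a binary outcome (invented values)
df = pd.DataFrame({
    'Income': [45, 62, 33, 80, 51, 29, 74, 58, 66, 38],
    'Age':    [25, 40, 31, 52, 36, 23, 48, 44, 50, 27],
    'AcceptedLoan': [0, 1, 0, 1, 1, 0, 1, 0, 1, 0],
})
X = df[['Income', 'Age']]
y = df['AcceptedLoan']

# Partition the records into training and validation sets
train_X, valid_X, train_y, valid_y = train_test_split(
    X, y, test_size=0.4, random_state=1)

# Fit a classification tree on the training partition
model = DecisionTreeClassifier(max_depth=2, random_state=1)
model.fit(train_X, train_y)

# Judge classifier performance on the holdout (validation) partition
pred = model.predict(valid_X)
print('Validation accuracy:', accuracy_score(valid_y, pred))
print(confusion_matrix(valid_y, pred))

The same partition, fit, and evaluate pattern recurs throughout the book, with other estimators (k-nearest neighbors, naive Bayes, logistic regression, trees and ensembles, neural nets) swapped in for the classification tree.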

This is the sixth version of this successful text, and the first using Python. It covers both statistical and machine learning algorithms for prediction, classification, visualization, dimension reduction, recommender systems, clustering, text mining, and network analysis. It also includes:

  • A new co-author, Peter Gedeck, who brings both experience teaching business analytics courses using Python and expertise in the application of machine learning methods to the drug-discovery process
  • A new section on ethical issues in data mining
  • Updates and new material based on feedback from instructors teaching MBA, undergraduate, diploma, and executive courses, and from their students
  • More than a dozen case studies demonstrating applications for the data mining techniques described
  • End-of-chapter exercises that help readers gauge and expand their comprehension of, and competence with, the material presented
  • A companion website with more than two dozen data sets, and instructor materials including exercise solutions, PowerPoint slides, and case solutions

Data Mining for Business Analytics: Concepts, Techniques, and Applications in Python is an ideal textbook for graduate and upper-undergraduate level courses in data mining, predictive analytics, and business analytics. This new edition is also an excellent reference for analysts, researchers, and practitioners working with quantitative methods in the fields of business, finance, marketing, computer science, and information technology.

“This book has by far the most comprehensive review of business analytics methods that I have ever seen, covering everything from classical approaches such as linear and logistic regression, through to modern methods like neural networks, bagging and boosting, and even much more business-specific procedures such as social network analysis and text mining. If not the bible, it is at the least a definitive manual on the subject.”

—Gareth M. James, University of Southern California and co-author (with Witten, Hastie and Tibshirani) of the best-selling book An Introduction to Statistical Learning, with Applications in R 

Author(s): Galit Shmueli, Peter C. Bruce, Peter Gedeck, Nitin R. Patel
Publisher: Wiley
Year: 2019

Language: English
Pages: 605
City: Hoboken

DATA MINING FOR BUSINESS ANALYTICS
Contents
Foreword by Gareth James
Foreword by Ravi Bapna
Preface to the Python Edition
Acknowledgments
PART I PRELIMINARIES
CHAPTER 1 Introduction
1.1 What Is Business Analytics?
1.2 What Is Data Mining?
1.3 Data Mining and Related Terms
1.4 Big Data
1.5 Data Science
1.6 Why Are There So Many Different Methods?
1.7 Terminology and Notation
1.8 Road Maps to This Book
Order of Topics
CHAPTER 2 Overview of the Data Mining Process
2.1 Introduction
2.2 Core Ideas in Data Mining
Classification
Prediction
Association Rules and Recommendation Systems
Predictive Analytics
Data Reduction and Dimension Reduction
Data Exploration and Visualization
Supervised and Unsupervised Learning
2.3 The Steps in Data Mining
2.4 Preliminary Steps
Organization of Datasets
Predicting Home Values in the West Roxbury Neighborhood
Loading and Looking at the Data in Python
Python Imports
Sampling from a Database
Oversampling Rare Events in Classification Tasks
Preprocessing and Cleaning the Data
2.5 Predictive Power and Overfitting
Overfitting
Creation and Use of Data Partitions
2.6 Building a Predictive Model
Modeling Process
2.7 Using Python for Data Mining on a Local Machine
2.8 Automating Data Mining Solutions
2.9 Ethical Practice in Data Mining
Data Mining Software: The State of the Market (by Herb Edelstein)
Problems
PART II DATA EXPLORATION AND DIMENSION REDUCTION
CHAPTER 3 Data Visualization
3.1 Introduction
3.2 Data Examples
Example 1: Boston Housing Data
Example 2: Ridership on Amtrak Trains
3.3 Basic Charts: Bar Charts, Line Graphs, and Scatter Plots
Distribution Plots: Boxplots and Histograms
Heatmaps: Visualizing Correlations and Missing Values
3.4 Multidimensional Visualization
Adding Variables: Color, Size, Shape, Multiple Panels, and Animation
Manipulations: Rescaling, Aggregation and Hierarchies, Zooming, Filtering
Reference: Trend Lines and Labels
Scaling Up to Large Datasets
Multivariate Plot: Parallel Coordinates Plot
Interactive Visualization
3.5 Specialized Visualizations
Visualizing Networked Data
Visualizing Hierarchical Data: Treemaps
Visualizing Geographical Data: Map Charts
3.6 Summary: Major Visualizations and Operations, by Data Mining Goal
Prediction
Classification
Time Series Forecasting
Unsupervised Learning
Problems
CHAPTER 4 Dimension Reduction
4.1 Introduction
4.2 Curse of Dimensionality
4.3 Practical Considerations
Example 1: House Prices in Boston
4.4 Data Summaries
Summary Statistics
Aggregation and Pivot Tables
4.5 Correlation Analysis
4.6 Reducing the Number of Categories in Categorical Variables
4.7 Converting a Categorical Variable to a Numerical Variable
4.8 Principal Components Analysis
Example 2: Breakfast Cereals
Principal Components
Normalizing the Data
Using Principal Components for Classification and Prediction
4.9 Dimension Reduction Using Regression Models
4.10 Dimension Reduction Using Classification and Regression Trees
Problems
PART III PERFORMANCE EVALUATION
CHAPTER 5 Evaluating Predictive Performance
5.1 Introduction
5.2 Evaluating Predictive Performance
Naive Benchmark: The Average
Prediction Accuracy Measures
Comparing Training and Validation Performance
Cumulative Gains and Lift Charts
5.3 Judging Classifier Performance
Benchmark: The Naive Rule
Class Separation
The Confusion (Classification) Matrix
Using the Validation Data
Accuracy Measures
Propensities and Cutoff for Classification
Performance in Case of Unequal Importance of Classes
Asymmetric Misclassification Costs
Generalization to More Than Two Classes
5.4 Judging Ranking Performance
Gains and Lift Charts for Binary Data
Decile Lift Charts
Beyond Two Classes
Gains and Lift Charts Incorporating Costs and Benefits
Cumulative Gains as a Function of Cutoff
5.5 Oversampling
Oversampling the Training Set
Evaluating Model Performance Using a Non-oversampled Validation Set
Evaluating Model Performance if Only Oversampled Validation Set Exists
Problems
PART IV PREDICTION AND CLASSIFICATION METHODS
CHAPTER 6 Multiple Linear Regression
6.1 Introduction
6.2 Explanatory vs. Predictive Modeling
6.3 Estimating the Regression Equation and Prediction
Example: Predicting the Price of Used Toyota Corolla Cars
6.4 Variable Selection in Linear Regression
Reducing the Number of Predictors
How to Reduce the Number of Predictors
Regularization (Shrinkage Models)
Appendix: Using Statsmodels
Problems
CHAPTER 7 k-Nearest Neighbors (kNN)
7.1 The k-NN Classifier (Categorical Outcome)
Determining Neighbors
Classification Rule
Example: Riding Mowers
Choosing k
Setting the Cutoff Value
k-NN with More Than Two Classes
Converting Categorical Variables to Binary Dummies
7.2 k-NN for a Numerical Outcome
7.3 Advantages and Shortcomings of k-NN Algorithms
Problems
CHAPTER 8 The Naive Bayes Classifier
8.1 Introduction
Cutoff Probability Method
Conditional Probability
Example 1: Predicting Fraudulent Financial Reporting
8.2 Applying the Full (Exact) Bayesian Classifier
Using the “Assign to the Most Probable Class” Method
Using the Cutoff Probability Method
Practical Difficulty with the Complete (Exact) Bayes Procedure
Solution: Naive Bayes
The Naive Bayes Assumption of Conditional Independence
Using the Cutoff Probability Method
Example 2: Predicting Fraudulent Financial Reports, Two Predictors
Example 3: Predicting Delayed Flights
8.3 Advantages and Shortcomings of the Naive Bayes Classifier
Problems
CHAPTER 9 Classification and Regression Trees
9.1 Introduction
Tree Structure
Decision Rules
Classifying a New Record
9.2 Classification Trees
Recursive Partitioning
Example 1: Riding Mowers
Measures of Impurity
9.3 Evaluating the Performance of a Classification Tree
Example 2: Acceptance of Personal Loan
Sensitivity Analysis Using Cross Validation
9.4 Avoiding Overfitting
Stopping Tree Growth
Fine-tuning Tree Parameters
Other Methods for Limiting Tree Size
9.5 Classification Rules from Trees
9.6 Classification Trees for More Than Two Classes
9.7 Regression Trees
Prediction
Measuring Impurity
Evaluating Performance
9.8 Improving Prediction: Random Forests and Boosted Trees
Random Forests
Boosted Trees
9.9 Advantages and Weaknesses of a Tree
Problems
CHAPTER 10 Logistic Regression
10.1 Introduction
10.2 The Logistic Regression Model
10.3 Example: Acceptance of Personal Loan
Model with a Single Predictor
Estimating the Logistic Model from Data: Computing Parameter Estimates
Interpreting Results in Terms of Odds (for a Profiling Goal)
10.4 Evaluating Classification Performance
Variable Selection
10.5 Logistic Regression for Multi-class Classification
Ordinal Classes
Nominal Classes
Comparing Ordinal and Nominal Models
10.6 Example of Complete Analysis: Predicting Delayed Flights
Data Preprocessing
Model Training
Model Interpretation
Model Performance
Variable Selection
Appendix: Using Statsmodels
Problems
CHAPTER 11 Neural Nets
11.1 Introduction
11.2 Concept and Structure of a Neural Network
11.3 Fitting a Network to Data
Example 1: Tiny Dataset
Computing Output of Nodes
Preprocessing the Data
Training the Model
Example 2: Classifying Accident Severity
Avoiding Overfitting
Using the Output for Prediction and Classification
11.4 Required User Input
11.5 Exploring the Relationship Between Predictors and Outcome
11.6 Deep Learning
Convolutional Neural Networks (CNNs)
Local Feature Map
A Hierarchy of Features
The Learning Process
Unsupervised Learning
Conclusion
11.7 Advantages and Weaknesses of Neural Networks
Problems
CHAPTER 12 Discriminant Analysis
12.1 Introduction
Example 1: Riding Mowers
Example 2: Personal Loan Acceptance
12.2 Distance of a Record from a Class
12.3 Fisher’s Linear Classification Functions
12.4 Classification Performance of Discriminant Analysis
12.5 Prior Probabilities
12.6 Unequal Misclassification Costs
12.7 Classifying More Than Two Classes
Example 3: Medical Dispatch to Accident Scenes
12.8 Advantages and Weaknesses
Problems
CHAPTER 13 Combining Methods: Ensembles and Uplift Modeling
13.1 Ensembles
Why Ensembles Can Improve Predictive Power
Simple Averaging
Bagging
Boosting
Bagging and Boosting in Python
Advantages and Weaknesses of Ensembles
13.2 Uplift (Persuasion) Modeling
A–B Testing
Uplift
Gathering the Data
A Simple Model
Modeling Individual Uplift
Computing Uplift with Python
Using the Results of an Uplift Model
13.3 Summary
Problems
PART V MINING RELATIONSHIPS AMONG RECORDS
CHAPTER 14 Association Rules and Collaborative Filtering
14.1 Association Rules
Discovering Association Rules in Transaction Databases
Example 1: Synthetic Data on Purchases of Phone Faceplates
Generating Candidate Rules
The Apriori Algorithm
Selecting Strong Rules
Data Format
The Process of Rule Selection
Interpreting the Results
Rules and Chance
Example 2: Rules for Similar Book Purchases
14.2 Collaborative Filtering
Data Type and Format
Example 3: Netflix Prize Contest
User-Based Collaborative Filtering: “People Like You”
Item-Based Collaborative Filtering
Advantages and Weaknesses of Collaborative Filtering
Collaborative Filtering vs. Association Rules
14.3 Summary
Problems
CHAPTER 15 Cluster Analysis
15.1 Introduction
Example: Public Utilities
15.2 Measuring Distance Between Two Records
Euclidean Distance
Normalizing Numerical Measurements
Other Distance Measures for Numerical Data
Distance Measures for Categorical Data
Distance Measures for Mixed Data
15.3 Measuring Distance Between Two Clusters
Minimum Distance
Maximum Distance
Average Distance
Centroid Distance
15.4 Hierarchical (Agglomerative) Clustering
Single Linkage
Complete Linkage
Average Linkage
Centroid Linkage
Ward’s Method
Dendrograms: Displaying Clustering Process and Results
Validating Clusters
Limitations of Hierarchical Clustering
15.5 Non-Hierarchical Clustering: The k-Means Algorithm
Choosing the Number of Clusters (k)
Problems
PART VI FORECASTING TIME SERIES
CHAPTER 16 Handling Time Series
16.1 Introduction
16.2 Descriptive vs. Predictive Modeling
16.3 Popular Forecasting Methods in Business
Combining Methods
16.4 Time Series Components
Example: Ridership on Amtrak Trains
16.5 Data-Partitioning and Performance Evaluation
Benchmark Performance: Naive Forecasts
Generating Future Forecasts
Problems
CHAPTER 17 Regression-Based Forecasting
17.1 A Model with Trend
Linear Trend
Exponential Trend
Polynomial Trend
17.2 A Model with Seasonality
17.3 A Model with Trend and Seasonality
17.4 Autocorrelation and ARIMA Models
Computing Autocorrelation
Improving Forecasts by Integrating Autocorrelation Information
Evaluating Predictability
Problems
CHAPTER 18 Smoothing Methods
18.1 Introduction
18.2 Moving Average
Centered Moving Average for Visualization
Trailing Moving Average for Forecasting
Choosing Window Width (w)
18.3 Simple Exponential Smoothing
Choosing Smoothing Parameter
Relation Between Moving Average and Simple Exponential Smoothing
18.4 Advanced Exponential Smoothing
Series with a Trend
Series with a Trend and Seasonality
Series with Seasonality (No Trend)
Problems
PART VII DATA ANALYTICS
CHAPTER 19 Social Network Analytics
19.1 Introduction
19.2 Directed vs. Undirected Networks
19.3 Visualizing and Analyzing Networks
Plot Layout
Edge List
Adjacency Matrix
Using Network Data in Classification and Prediction
19.4 Social Data Metrics and Taxonomy
Node-Level Centrality Metrics
Egocentric Network
Network Metrics
19.5 Using Network Metrics in Prediction and Classification
Link Prediction
Entity Resolution
Collaborative Filtering
19.6 Collecting Social Network Data with Python
19.7 Advantages and Disadvantages
Problems
CHAPTER 20 Text Mining
20.1 Introduction
20.2 The Tabular Representation of Text: Term-Document Matrix and “Bag-of-Words”
20.3 Bag-of-Words vs. Meaning Extraction at Document Level
20.4 Preprocessing the Text
Tokenization
Text Reduction
Presence/Absence vs. Frequency
Term Frequency–Inverse Document Frequency (TF-IDF)
From Terms to Concepts: Latent Semantic Indexing
Extracting Meaning
20.5 Implementing Data Mining Methods
20.6 Example: Online Discussions on Autos and Electronics
Importing and Labeling the Records
Text Preprocessing in Python
Producing a Concept Matrix
Fitting a Predictive Model
Prediction
20.7 Summary
Problems
PART VIII CASES
CHAPTER 21 Cases
21.1 Charles Book Club
The Book Industry
Database Marketing at Charles
Data Mining Techniques
Assignment
21.2 German Credit
Background
Data
Assignment
21.3 Tayko Software Cataloger
Background
The Mailing Experiment
Data
Assignment
21.4 Political Persuasion
Background
Predictive Analytics Arrives in US Politics
Political Targeting
Uplift
Data
Assignment
21.5 Taxi Cancellations
Business Situation
Assignment
21.6 Segmenting Consumers of Bath Soap
Business Situation
Key Problems
Data
Measuring Brand Loyalty
Assignment
21.7 Direct-Mail Fundraising
Background
Data
Assignment
21.8 Catalog Cross-Selling
Background
Assignment
21.9 Time Series Case: Forecasting Public Transportation Demand
Background
Problem Description
Available Data
Assignment Goal
Assignment
Tips and Suggested Steps
References
Data Files Used in the Book
Python Utilities Functions
Index
EULA