This book provides comprehensive coverage of Artificial Intelligence (AI) and Machine Learning (ML) theory and applications. Rather than looking at the field from only a theoretical or only a practical perspective, it unifies both perspectives to give a holistic understanding. The first part introduces the concepts of AI and ML and traces their origins and current state. The second and third parts delve into the conceptual and theoretical aspects of static and dynamic ML techniques. The fourth part describes the practical applications to which the presented techniques can be applied. The fifth part introduces the reader to some of the implementation strategies for solving real-life ML problems.
The book is appropriate for students in graduate and upper-undergraduate courses, as well as for researchers and professionals. It makes minimal use of mathematics to keep the topics intuitive and accessible.
- Presents a full reference to artificial intelligence and machine learning techniques - in theory and application;
- Provides a guide to AI and ML with minimal use of mathematics to make the topics more intuitive and accessible;
- Connects all ML and AI techniques to applications and introduces implementations.
Author: Ameet V Joshi
Edition: 2
Publisher: Springer
Year: 2022
Language: English
Pages: 278
City: Singapore
Preface to the Second Edition
Preface to the First Edition
Acknowledgments
Contents
Part I Introduction
1 Introduction to AI and ML
1.1 Introduction
1.2 What Is AI
1.3 What Is ML
1.4 Organization of the Book
1.4.1 Introduction
1.4.2 Machine Learning
1.4.3 Building End-to-End Pipelines
1.4.4 Artificial Intelligence
1.4.5 Conclusion
2 Essential Concepts in Artificial Intelligence and Machine Learning
2.1 Introduction
2.2 Big Data and Not-So-Big Data
2.2.1 What Is Big Data
2.2.2 Why Should We Treat Big Data Differently?
2.3 Machine Learning Algorithms, Models, and Types of Learning
2.3.1 Supervised Learning
2.3.2 Unsupervised Learning
2.3.3 Reinforcement Learning
2.4 Machine Learning Methods Based on Time
2.4.1 Static Learning
2.4.2 Dynamic Learning
2.5 Dimensionality
2.5.1 Curse of Dimensionality
2.6 Linearity and Nonlinearity
2.7 Occam's Razor
2.8 No Free Lunch Theorem
2.9 Law of Diminishing Returns
2.10 Early Trends in Machine Learning
2.10.1 Expert Systems
2.11 Conclusion
2.12 Exercise
3 Data Understanding, Representation, and Visualization
3.1 Introduction
3.2 Understanding the Data
3.2.1 Understanding Entities
3.2.2 Understanding Attributes
3.2.3 Understanding Data Types
3.3 Representation and Visualization of the Data
3.3.1 Principal Component Analysis
3.3.2 Linear Discriminant Analysis
3.4 Conclusion
4 Implementing Machine Learning Algorithms
4.1 Introduction
4.2 Use of Google Colab
4.2.1 scikit-learn Python Library
4.3 Azure Machine Learning (AML)
4.3.1 How to Start with AML?
4.4 Conclusion
4.5 Exercises
Part II Machine Learning
5 Linear Methods
5.1 Introduction
5.2 Linear and Generalized Linear Models
5.3 Linear Regression
5.3.1 Defining the Problem
5.3.2 Solving the Problem
5.4 Example of Linear Regression
5.5 Regularized Linear Regression
5.5.1 Regularization
5.5.2 Ridge Regression
5.5.3 Lasso Regression
5.6 Generalized Linear Models (GLM)
5.6.1 Logistic Regression
5.7 K-Nearest Neighbor (KNN) Algorithm
5.7.1 Definition of KNN
5.7.2 Classification and Regression
5.7.3 Other Variations of KNN
5.8 Conclusion
5.9 Exercises
6 Perceptron and Neural Networks
6.1 Introduction
6.1.1 Biological Neuron
6.2 Perceptron
6.2.1 Implementing Perceptron
6.3 Multilayered Perceptron or Artificial Neural Network
6.3.1 Feedforward Operation
6.3.2 Nonlinear MLP or Nonlinear ANN
6.3.2.1 Activation Functions
6.3.3 Training MLP
6.3.3.1 Online or Stochastic Learning
6.3.3.2 Batch Learning
6.3.4 Hidden Layers
6.3.5 Implementing MLP
6.4 Radial Basis Function Networks
6.4.1 Interpretation of RBF Networks
6.4.2 Implementing RBF Networks
6.5 Overfitting
6.5.1 Concept of Regularization
6.5.2 L1 and L2 Regularization
6.5.3 Dropout Regularization
6.6 Conclusion
6.7 Exercises
7 Decision Trees
7.1 Introduction
7.2 Why Decision Trees?
7.2.1 Types of Decision Trees
7.3 Algorithms for Building Decision Trees
7.4 Regression Tree
7.4.1 Implementing Regression Tree
7.5 Classification Tree
7.5.1 Defining the Terms
7.5.2 Misclassification Error
7.5.3 Gini Index
7.5.4 Cross-Entropy or Deviance
7.6 CHAID
7.6.1 CHAID Algorithm
7.7 Training Decision Tree
7.7.1 Depth of Decision Tree
7.8 Ensemble Decision Trees
7.8.1 Bagging Ensemble Trees
7.8.2 Random Forest Trees
7.8.2.1 Decision Jungles
7.8.3 Boosted Ensemble Trees
7.8.3.1 AdaBoost
7.8.3.2 Gradient Boosting
7.9 Implementing a Classification Tree
7.10 Conclusion
7.11 Exercises
8 Support Vector Machines
8.1 Introduction
8.2 Motivation and Scope
8.2.1 Extension to Multi-class Classification
8.3 Theory of SVM
8.4 Separability and Margins
8.4.1 Use of Slack Variables
8.5 Implementing Linear SVMs
8.6 Nonlinearity and Use of Kernels
8.6.1 Radial Basis Function
8.6.2 Polynomial
8.6.3 Sigmoid
8.7 Implementing Nonlinear SVMs with Kernels
8.8 Risk Minimization
8.9 Conclusion
8.10 Exercises
Appendix
9 Probabilistic Models
9.1 Introduction
9.2 Discriminative Models
9.2.1 Maximum Likelihood Estimation
9.2.2 Bayesian Approach
9.2.3 Comparison of MLE and Bayesian Approach
9.2.3.1 Solution Using MLE
9.2.3.2 Solution Using Bayes' Approach
9.3 Implementing Probabilistic Models
9.4 Generative Models
9.4.1 Mixture Models
9.4.2 Bayesian Networks
9.5 Some Useful Probability Distributions
9.5.1 Normal or Gaussian Distribution
9.5.2 Bernoulli Distribution
9.5.3 Binomial Distribution
9.5.4 Gamma Distribution
9.5.5 Poisson Distribution
9.6 Conclusion
9.7 Exercises
10 Dynamic Programming and Reinforcement Learning
10.1 Introduction
10.2 Fundamental Equation of Dynamic Programming
10.3 Classes of Problems Under Dynamic Programming
10.4 Reinforcement Learning
10.4.1 Characteristics of Reinforcement Learning
10.4.2 Framework and Algorithm
10.5 Exploration and Exploitation
10.6 Applications of Reinforcement Learning
10.7 Theory of Reinforcement Learning
10.7.1 Variations in Learning
10.7.1.1 Q-Learning
10.7.1.2 SARSA
10.8 Implementing Reinforcement Learning
10.9 Conclusion
10.10 Exercises
11 Evolutionary Algorithms
11.1 Introduction
11.2 Bottleneck with Traditional Methods
11.3 Darwin's Theory of Evolution
11.4 Genetic Programming
11.4.1 Implementing Genetic Programming
11.5 Swarm Intelligence
11.6 Ant Colony Optimization
11.7 Simulated Annealing
11.8 Conclusion
11.9 Exercises
12 Time Series Models
12.1 Introduction
12.2 Stationarity
12.3 Autoregressive Moving Average Models
12.3.1 Autoregressive (AR) Process
12.3.2 Moving Average (MA) Process
12.3.3 Autoregressive Moving Average (ARMA) Process
12.4 Autoregressive Integrated Moving Average (ARIMA) Models
12.5 Implementing AR, MA, ARMA, and ARIMA in Python
12.6 Hidden Markov Models (HMMs)
12.6.1 Applications
12.7 Conditional Random Fields (CRFs)
12.8 Conclusion
12.9 Exercises
13 Deep Learning
13.1 Introduction
13.2 Why Deep Neural Networks?
13.3 Types of Deep Neural Networks
13.4 Convolutional Neural Networks (CNNs)
13.4.1 One-Dimensional Convolution
13.4.2 Two-Dimensional Convolution
13.4.3 Architecture of CNN
13.4.3.1 Convolutional Layer
13.4.3.2 Rectified Linear Unit (ReLU)
13.4.3.3 Pooling
13.4.3.4 Fully Connected Layer
13.4.4 Training CNN
13.4.5 Applications of CNN
13.5 Recurrent Neural Networks (RNNs)
13.5.1 Limitation of RNN
13.5.2 Long Short-Term Memory RNN
13.5.2.1 Forget Gate
13.5.2.2 Input Gate
13.5.2.3 Output Gate
13.5.3 Advantages of LSTM
13.5.4 Applications of LSTM-RNN
13.6 Attention-Based Networks
13.7 Generative Adversarial Networks (GANs)
13.8 Implementing Deep Learning Models
13.9 Conclusion
13.10 Exercises
14 Unsupervised Learning
14.1 Introduction
14.2 Clustering
14.2.1 k-Means Clustering
14.2.2 Improvements to k-Means Clustering
14.2.2.1 Hierarchical k-Means Clustering
14.2.2.2 Fuzzy k-Means Clustering
14.2.3 Implementing k-Means Clustering
14.3 Component Analysis
14.3.1 Independent Component Analysis (ICA)
14.4 Self-Organizing Maps (SOMs)
14.5 Autoencoding Neural Networks
14.6 Conclusion
14.7 Exercises
Part III Building End-to-End Pipelines
15 Featurization
15.1 Introduction
15.2 UCI: Adult Salary Predictor
15.2.1 Feature Details
15.3 Identifying the Raw Data: Separating Information from Noise
15.3.1 Correlation and Causality
15.4 Building Feature Set
15.4.1 Standard Options of Feature Building
15.4.1.1 Numeric Features
15.4.1.2 Categorical Features
15.4.1.3 String Features
15.4.1.4 Datetime Features
15.4.2 Custom Options of Feature Building
15.5 Handling Missing Values
15.6 Visualizing the Features
15.6.1 Numeric Features
15.6.2 Categorical Features
15.6.2.1 Feature: Workclass
15.6.2.2 Feature: Education
15.6.2.3 Other Features
15.7 Conclusion
15.8 Exercises
16 Designing and Tuning Model Pipelines
16.1 Introduction
16.2 Choosing the Technique or Algorithm
16.2.1 Choosing Technique for Adult Salary Classification
16.3 Splitting the Data
16.3.1 Stratified Sampling
16.4 Training
16.4.1 Tuning the Hyperparameters
16.4.2 Cross-Validation
16.5 Implementing Machine Learning Pipeline
16.6 Accuracy Measurement
16.7 Explainability of Features
16.8 Practical Considerations
16.8.1 Data Leakage
16.8.2 Coincidence and Causality
16.9 Conclusion
16.10 Exercises
17 Performance Measurement
17.1 Introduction
17.2 Sample Size
17.3 Metrics Based on Numerical Error
17.3.1 Mean Absolute Error
17.3.2 Mean Squared Error
17.3.3 Root Mean Squared Error
17.3.4 Normalized Error
17.4 Metrics Based on Categorical Error
17.4.1 Accuracy
17.4.2 Precision and Recall
17.4.2.1 F-Score
17.4.2.2 Confusion Matrix
17.4.3 Receiver Operating Characteristic (ROC) Curve Analysis
17.4.4 Precision-Recall Curve
17.5 Hypothesis Testing
17.5.1 Background
17.5.2 Steps in Hypothesis Testing
17.5.3 A/B Testing
17.6 Conclusion
17.7 Exercises
Part IV Artificial Intelligence
18 Classification
18.1 Introduction
18.2 Examples of Real-World Problems in Classification
18.3 Spam Email Detection
18.3.1 Defining Scope
18.3.2 Assumptions
18.3.2.1 Assumptions About the Spam Emails
18.3.2.2 Assumptions About the Genuine Emails
18.3.2.3 Assumptions About Precision and Recall Trade-Off
18.3.3 Skew in the Data
18.3.4 Supervised Learning
18.3.5 Feature Engineering
18.4 Implementing Spam Filter Classifier
18.4.1 Using Azure ML
18.5 Conclusion
18.6 Exercises
19 Regression
19.1 Introduction
19.2 Examples of Real-World Problems
19.3 Predicting Real Estate Prices
19.3.1 Defining Regression-Specific Problem
19.3.2 Aspects in Real Estate Value Prediction
19.3.3 Gathering Labelled Data
19.4 Implementing Regression
19.4.1 Model Performance
19.5 Other Applications of Regression
19.6 Conclusion
19.7 Exercises
20 Ranking
20.1 Introduction
20.2 Measuring Ranking Performance
20.3 Ranking Search Results and Google's PageRank
20.4 Techniques Used in Text-Based Ranking Systems
20.4.1 Keyword Identification/Extraction
20.5 Implementing Keyword Extraction
20.6 Implementing Ranking System
20.7 Conclusion
20.8 Exercises
21 Recommendation Systems
21.1 Introduction
21.2 Collaborative Filtering
21.2.1 Solution Approaches
21.2.2 Information Types
21.2.3 Algorithm Types
21.3 Amazon's Personal Shopping Experience
21.3.1 Context-Based Recommendation
21.3.2 Aspects Guiding the Context-Based Recommendations
21.3.3 Personalization-Based Recommendation
21.4 Netflix's Streaming Video Recommendations
21.5 Implementing Recommendation System
21.5.1 MovieLens Data
21.5.2 Planning the Recommendation Approach
21.5.2.1 Movies
21.5.2.2 Tags
21.5.2.3 Ratings
21.5.3 Implementing Recommendation System
21.5.3.1 K-Nearest Neighbors
21.5.3.2 Alternating Least Squares (ALS) Algorithm
21.6 Conclusion
21.7 Exercises
Part V Conclusion
22 Conclusion and Next Steps
22.1 Overview
References
Index