Deep Learning with Structured Data

Deep learning offers the potential to identify complex patterns and relationships hidden in data of all sorts. Deep Learning with Structured Data shows you how to apply powerful deep learning analysis techniques to the kind of structured, tabular data you'll find in the relational databases that real-world businesses depend on. Filled with practical, relevant applications, this book teaches you how deep learning can augment your existing machine learning and business intelligence systems.

About the technology
Here's a dirty secret: half of the time in most data science projects is spent cleaning and preparing data. But there's a better way: deep learning techniques optimized for tabular data and relational databases deliver insights and analysis without requiring intense feature engineering. Learn the skills to unlock deep learning performance with much less data filtering, validating, and scrubbing.

About the book
Deep Learning with Structured Data teaches you powerful data analysis techniques for tabular data and relational databases. Get started using a dataset based on the Toronto transit system. As you work through the book, you'll learn how easy it is to set up tabular data for deep learning, while solving crucial production concerns like deployment and performance monitoring.

What's inside
• When and where to use deep learning
• The architecture of a Keras deep learning model
• Training, deploying, and maintaining models
• Measuring performance

About the reader
For readers with intermediate Python and machine learning skills.

About the author
Mark Ryan has 20 years of experience leading technical teams in the areas of relational database and machine learning.
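
To give a flavor of the workflow the book covers, here is a minimal, illustrative sketch (not code from the book) of loading tabular data with pandas and training a small Keras model that uses an embedding for a categorical column. The file name delays.csv and the columns route, hour, temperature, and delayed are hypothetical placeholders, not the book's actual streetcar delay dataset.

    import pandas as pd
    from tensorflow import keras
    from tensorflow.keras import layers

    # Hypothetical delay dataset: one categorical column and two continuous columns.
    df = pd.read_csv("delays.csv")                             # placeholder file name
    df["route"] = df["route"].astype("category")
    route_codes = df["route"].cat.codes.values.reshape(-1, 1)  # integer-encode the categorical column
    continuous = df[["hour", "temperature"]].values.astype("float32")
    labels = df["delayed"].values                              # binary target: delayed or not

    # The categorical input goes through an embedding; continuous inputs are used directly.
    n_routes = df["route"].cat.categories.size
    route_in = keras.Input(shape=(1,), name="route")
    cont_in = keras.Input(shape=(2,), name="continuous")
    route_emb = layers.Flatten()(layers.Embedding(n_routes, 4)(route_in))
    x = layers.concatenate([route_emb, cont_in])
    x = layers.Dense(16, activation="relu")(x)
    out = layers.Dense(1, activation="sigmoid")(x)

    model = keras.Model(inputs=[route_in, cont_in], outputs=out)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit({"route": route_codes, "continuous": continuous}, labels,
              epochs=5, validation_split=0.2)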

Author(s): Mark Ryan
Edition: 1
Year: 2020

Language: English
Commentary: Vector PDF
Pages: 264
Tags: Machine Learning; Deep Learning; Python; Keras; TensorFlow; Data Cleaning; pandas; PyTorch; Data Preparation; Model Training; Model Deployment; Data Exploration

Deep Learning with Structured Data
brief contents
contents
preface
acknowledgments
about this book
Who should read this book
How this book is organized: A roadmap
About the code
liveBook discussion forum
about the author
about the cover illustration
Chapter 1: Why deep learning with structured data?
1.1 Overview of deep learning
1.2 Benefits and drawbacks of deep learning
1.3 Overview of the deep learning stack
1.4 Structured vs. unstructured data
1.5 Objections to deep learning with structured data
1.6 Why investigate deep learning with a structured data problem?
1.7 An overview of the code accompanying this book
1.8 What you need to know
Summary
Chapter 2: Introduction to the example problem and Pandas dataframes
2.1 Development environment options for deep learning
2.2 Code for exploring Pandas
2.3 Pandas dataframes in Python
2.4 Ingesting CSV files into Pandas dataframes
2.5 Using Pandas to do what you would do with SQL
2.6 The major example: Predicting streetcar delays
2.7 Why is a real-world dataset critical for learning about deep learning?
2.8 Format and scope of the input dataset
2.9 The destination: An end-to-end solution
2.10 More details on the code that makes up the solutions
2.11 Development environments: Vanilla vs. deep-learning-enabled
2.12 A deeper look at the objections to deep learning
2.13 How deep learning has become more accessible
2.14 A first taste of training a deep learning model
Summary
Chapter 3: Preparing the data, part 1: Exploring and cleansing the data
3.1 Code for exploring and cleansing the data
3.2 Using config files with Python
3.3 Ingesting XLS files into a Pandas dataframe
3.4 Using pickle to save your Pandas dataframe from one session to another
3.5 Exploring the data
3.6 Categorizing data into continuous, categorical, and text categories
3.7 Cleaning up problems in the dataset: missing data, errors, and guesses
3.8 Finding out how much data deep learning needs
Summary
Chapter 4: Preparing the data, part 2: Transforming the data
4.1 Code for preparing and transforming the data
4.2 Dealing with incorrect values: Routes
4.3 Why only one substitute for all bad values?
4.4 Dealing with incorrect values: Vehicles
4.5 Dealing with inconsistent values: Location
4.6 Going the distance: Locations
4.7 Fixing type mismatches
4.8 Dealing with rows that still contain bad data
4.9 Creating derived columns
4.10 Preparing non-numeric data to train a deep learning model
4.11 Overview of the end-to-end solution
Summary
Chapter 5: Preparing and building the model
5.1 Data leakage and features that are fair game for training the model
5.2 Domain expertise and minimal scoring tests to prevent data leakage
5.3 Preventing data leakage in the streetcar delay prediction problem
5.4 Code for exploring Keras and building the model
5.5 Deriving the dataframe to use to train the model
5.6 Transforming the dataframe into the format expected by the Keras model
5.7 A brief history of Keras and TensorFlow
5.8 Migrating from TensorFlow 1.x to TensorFlow 2
5.9 TensorFlow vs. PyTorch
5.10 The structure of a deep learning model in Keras
5.11 How the data structure defines the Keras model
5.12 The power of embeddings
5.13 Code to build a Keras model automatically based on the data structure
5.14 Exploring your model
5.15 Model parameters
Summary
Chapter 6: Training the model and running experiments
6.1 Code for training the deep learning model
6.2 Reviewing the process of training a deep learning model
6.3 Reviewing the overall goal of the streetcar delay prediction model
6.4 Selecting the train, validation, and test datasets
6.5 Initial training run
6.6 Measuring the performance of your model
6.7 Keras callbacks: Getting the best out of your training runs
6.8 Getting identical results from multiple training runs
6.9 Shortcuts to scoring
6.10 Explicitly saving trained models
6.11 Running a series of training experiments
Summary
Chapter 7: More experiments with the trained model
7.1 Code for more experiments with the model
7.2 Validating whether removing bad values improves the model
7.3 Validating whether embeddings for columns improve the performance of the model
7.4 Comparing the deep learning model with XGBoost
7.5 Possible next steps for improving the deep learning model
Summary
Chapter 8: Deploying the model
8.1 Overview of model deployment
8.2 If deployment is so important, why is it so hard?
8.3 Review of one-off scoring
8.4 The user experience with web deployment
8.5 Steps to deploy your model with web deployment
8.6 Behind the scenes with web deployment
8.7 The user experience with Facebook Messenger deployment
8.8 Behind the scenes with Facebook Messenger deployment
8.9 More background on Rasa
8.10 Steps to deploy your model in Facebook Messenger with Rasa
8.11 Introduction to pipelines
8.12 Defining pipelines in the model training phase
8.13 Applying pipelines in the scoring phase
8.14 Maintaining a model after deployment
Summary
Chapter 9: Recommended next steps
9.1 Reviewing what we have covered so far
9.2 What we could do next with the streetcar delay prediction project
9.3 Adding location details to the streetcar delay prediction project
9.4 Training our deep learning model with weather data
9.5 Adding season or time of day to the streetcar delay prediction project
9.6 Imputation: An alternative to removing records with bad values
9.7 Making the web deployment of the streetcar delay prediction model generally available
9.8 Adapting the streetcar delay prediction model to a new dataset
9.9 Preparing the dataset and training the model
9.10 Deploying the model with web deployment
9.11 Deploying the model with Facebook Messenger
9.12 Adapting the approach in this book to a different dataset
9.13 Resources for additional learning
Summary
appendix: Using Google Colaboratory
A.1 Introduction to Colab
A.2 Making Google Drive available in your Colab session
A.3 Making the repo available in Colab and running notebooks
A.4 Pros and cons of Colab and Paperspace
index