Deep learning methods are achieving state-of-the-art results on challenging machine learning problems such as describing photos and translating text from one language to another.
This new laser-focused Ebook, written in the friendly Machine Learning Mastery style that you’re used to, cuts through the math, research papers, and patchwork descriptions of natural language processing.
Using clear explanations, standard Python libraries (Keras and TensorFlow 2), and step-by-step tutorial lessons, you will discover what natural language processing is, the promise of deep learning in the field, how to clean and prepare text data for modeling, and how to develop deep learning models for your own natural language processing projects.
Author(s): Jason Brownlee
Series: Machine Learning Mastery
Edition: 1.1
Publisher: Independently Published
Year: 2017
Language: English
Pages: 397
Copyright
Contents
Preface
I Introductions
Welcome
Who Is This Book For?
About Your Outcomes
How to Read This Book
About the Book Structure
About Python Code Examples
About Further Reading
About Getting Help
Summary
II Foundations
Natural Language Processing
Natural Language
Challenge of Natural Language
From Linguistics to Natural Language Processing
Natural Language Processing
Further Reading
Summary
Deep Learning
Deep Learning is Large Neural Networks
Deep Learning is Hierarchical Feature Learning
Deep Learning as Scalable Learning Across Domains
Further Reading
Summary
Promise of Deep Learning for Natural Language
Promise of Deep Learning
Promise of Drop-in Replacement Models
Promise of New NLP Models
Promise of Feature Learning
Promise of Continued Improvement
Promise of End-to-End Models
Further Reading
Summary
How to Develop Deep Learning Models With Keras
Keras Model Life-Cycle
Keras Functional Models
Standard Network Models
Further Reading
Summary
III Data Preparation
How to Clean Text Manually and with NLTK
Tutorial Overview
Metamorphosis by Franz Kafka
Text Cleaning Is Task Specific
Manual Tokenization
Tokenization and Cleaning with NLTK
Additional Text Cleaning Considerations
Further Reading
Summary
How to Prepare Text Data with scikit-learn
The Bag-of-Words Model
Word Counts with CountVectorizer
Word Frequencies with TfidfVectorizer
Hashing with HashingVectorizer
Further Reading
Summary
How to Prepare Text Data With Keras
Tutorial Overview
Split Words with text_to_word_sequence
Encoding with one_hot
Hash Encoding with hashing_trick
Tokenizer API
Further Reading
Summary
IV Bag-of-Words
The Bag-of-Words Model
Tutorial Overview
The Problem with Text
What is a Bag-of-Words?
Example of the Bag-of-Words Model
Managing Vocabulary
Scoring Words
Limitations of Bag-of-Words
Further Reading
Summary
How to Prepare Movie Review Data for Sentiment Analysis
Tutorial Overview
Movie Review Dataset
Load Text Data
Clean Text Data
Develop Vocabulary
Save Prepared Data
Further Reading
Summary
Project: Develop a Neural Bag-of-Words Model for Sentiment Analysis
Tutorial Overview
Movie Review Dataset
Data Preparation
Bag-of-Words Representation
Sentiment Analysis Models
Comparing Word Scoring Methods
Predicting Sentiment for New Reviews
Extensions
Further Reading
Summary
V Word Embeddings
The Word Embedding Model
Overview
What Are Word Embeddings?
Word Embedding Algorithms
Using Word Embeddings
Further Reading
Summary
How to Develop Word Embeddings with Gensim
Tutorial Overview
Word Embeddings
Gensim Python Library
Develop Word2Vec Embedding
Visualize Word Embedding
Load Google's Word2Vec Embedding
Load Stanford's GloVe Embedding
Further Reading
Summary
How to Learn and Load Word Embeddings in Keras
Tutorial Overview
Word Embedding
Keras Embedding Layer
Example of Learning an Embedding
Example of Using Pre-Trained GloVe Embedding
Tips for Cleaning Text for Word Embedding
Further Reading
Summary
VI Text Classification
Neural Models for Document Classification
Overview
Word Embeddings + CNN = Text Classification
Use a Single Layer CNN Architecture
Dial in CNN Hyperparameters
Consider Character-Level CNNs
Consider Deeper CNNs for Classification
Further Reading
Summary
Project: Develop an Embedding + CNN Model for Sentiment Analysis
Tutorial Overview
Movie Review Dataset
Data Preparation
Train CNN With Embedding Layer
Evaluate Model
Extensions
Further Reading
Summary
Project: Develop an n-gram CNN Model for Sentiment Analysis
Tutorial Overview
Movie Review Dataset
Data Preparation
Develop Multichannel Model
Evaluate Model
Extensions
Further Reading
Summary
VII Language Modeling
Neural Language Modeling
Overview
Problem of Modeling Language
Statistical Language Modeling
Neural Language Models
Further Reading
Summary
How to Develop a Character-Based Neural Language Model
Tutorial Overview
Sing a Song of Sixpence
Data Preparation
Train Language Model
Generate Text
Further Reading
Summary
How to Develop a Word-Based Neural Language Model
Tutorial Overview
Framing Language Modeling
Jack and Jill Nursery Rhyme
Model 1: One-Word-In, One-Word-Out Sequences
Model 2: Line-by-Line Sequence
Model 3: Two-Words-In, One-Word-Out Sequence
Further Reading
Summary
Project: Develop a Neural Language Model for Text Generation
Tutorial Overview
The Republic by Plato
Data Preparation
Train Language Model
Use Language Model
Extensions
Further Reading
Summary
VIII Image Captioning
Neural Image Caption Generation
Overview
Describing an Image with Text
Neural Captioning Model
Encoder-Decoder Architecture
Further Reading
Summary
Neural Network Models for Caption Generation
Image Caption Generation
Inject Model
Merge Model
More on the Merge Model
Further Reading
Summary
How to Load and Use a Pre-Trained Object Recognition Model
Tutorial Overview
ImageNet
The Oxford VGG Models
Load the VGG Model in Keras
Develop a Simple Photo Classifier
Further Reading
Summary
How to Evaluate Generated Text With the BLEU Score
Tutorial Overview
Bilingual Evaluation Understudy Score
Calculate BLEU Scores
Cumulative and Individual BLEU Scores
Worked Examples
Further Reading
Summary
How to Prepare a Photo Caption Dataset For Modeling
Tutorial Overview
Download the Flickr8K Dataset
How to Load Photographs
Pre-Calculate Photo Features
How to Load Descriptions
Prepare Description Text
Whole Description Sequence Model
Word-By-Word Model
Progressive Loading
Further Reading
Summary
Project: Develop a Neural Image Caption Generation Model
Tutorial Overview
Photo and Caption Dataset
Prepare Photo Data
Prepare Text Data
Develop Deep Learning Model
Evaluate Model
Generate New Captions
Extensions
Further Reading
Summary
IX Machine Translation
Neural Machine Translation
What is Machine Translation?
What is Statistical Machine Translation?
What is Neural Machine Translation?
Further Reading
Summary
What are Encoder-Decoder Models for Neural Machine Translation
Encoder-Decoder Architecture for NMT
Sutskever NMT Model
Cho NMT Model
Further Reading
Summary
How to Configure Encoder-Decoder Models for Machine Translation
Encoder-Decoder Model for Neural Machine Translation
Baseline Model
Word Embedding Size
RNN Cell Type
Encoder-Decoder Depth
Direction of Encoder Input
Attention Mechanism
Inference
Final Model
Further Reading
Summary
Project: Develop a Neural Machine Translation Model
Tutorial Overview
German to English Translation Dataset
Preparing the Text Data
Train Neural Translation Model
Evaluate Neural Translation Model
Extensions
Further Reading
Summary
X Appendix
Getting Help
Official Keras Destinations
Where to Get Help with Keras
Where to Get Help with Natural Language
How to Ask Questions
Contact the Author
How to Setup a Workstation for Deep Learning
Overview
Download Anaconda
Install Anaconda
Start and Update Anaconda
Install Deep Learning Libraries
Further Reading
Summary
How to Use Deep Learning in the Cloud
Overview
Setup Your AWS Account
Launch Your Server Instance
Login, Configure and Run
Build and Run Models on AWS
Close Your EC2 Instance
Tips and Tricks for Using Keras on AWS
Further Reading
Summary
XI Conclusions
How Far You Have Come