This textbook presents a concise, accessible and engaging first introduction to deep learning, offering a wide range of connectionist models which represent the current state of the art. The text explores the most popular algorithms and architectures in a simple and intuitive style, explaining the mathematical derivations step by step. Coverage includes convolutional networks, LSTMs, Word2vec, RBMs, DBNs, neural Turing machines, memory networks and autoencoders. Numerous examples in working Python code are provided throughout the book, and the code is also supplied separately at an accompanying website.
Topics and features: introduces the fundamentals of machine learning, and the mathematical and computational prerequisites for deep learning; discusses feed-forward neural networks, and explores modifications to them that can be applied to any neural network; examines convolutional neural networks, and the addition of recurrent connections to a feed-forward neural network; describes the notion of distributed representations, the concept of the autoencoder, and the ideas behind language processing with deep learning; presents a brief history of artificial intelligence and neural networks, and reviews interesting open research problems in deep learning and connectionism.
This clearly written and lively primer on deep learning is essential reading for graduate and advanced undergraduate students of computer science, cognitive science and mathematics, as well as fields such as linguistics, logic, philosophy and psychology.
Author(s): Sandro Skansi
Publisher: Springer
Year: 2018
Language: English
Pages: 191
Preface
References
Contents
1 From Logic to Cognitive Science
1.1 The Beginnings of Artificial Neural Networks
1.2 The XOR Problem
1.3 From Cognitive Science to Deep Learning
1.4 Neural Networks in the General AI Landscape
1.5 Philosophical and Cognitive Aspects
2 Mathematical and Computational Prerequisites
2.1 Derivations and Function Minimization
2.2 Vectors, Matrices and Linear Programming
2.3 Probability Distributions
2.4 Logic and Turing Machines
2.5 Writing Python Code
2.6 A Brief Overview of Python Programming
3 Machine Learning Basics
3.1 Elementary Classification Problem
3.2 Evaluating Classification Results
3.3 A Simple Classifier: Naive Bayes
3.4 A Simple Neural Network: Logistic Regression
3.5 Introducing the MNIST Dataset
3.6 Learning Without Labels: K-Means
3.7 Learning Different Representations: PCA
3.8 Learning Language: The Bag of Words Representation
4 Feedforward Neural Networks
4.1 Basic Concepts and Terminology for Neural Networks
4.2 Representing Network Components with Vectors and Matrices
4.3 The Perceptron Rule
4.4 The Delta Rule
4.5 From the Logistic Neuron to Backpropagation
4.6 Backpropagation
4.7 A Complete Feedforward Neural Network
5 Modifications and Extensions to a Feed-Forward Neural Network
5.1 The Idea of Regularization
5.2 L1 and L2 Regularization
5.3 Learning Rate, Momentum and Dropout
5.4 Stochastic Gradient Descent and Online Learning
5.5 Problems for Multiple Hidden Layers: Vanishing and Exploding Gradients
6 Convolutional Neural Networks
6.1 A Third Visit to Logistic Regression
6.2 Feature Maps and Pooling
6.3 A Complete Convolutional Network
6.4 Using a Convolutional Network to Classify Text
7 Recurrent Neural Networks
7.1 Sequences of Unequal Length
7.2 The Three Settings of Learning with Recurrent Neural Networks
7.3 Adding Feedback Loops and Unfolding a Neural Network
7.4 Elman Networks
7.5 Long Short-Term Memory
7.6 Using a Recurrent Neural Network for Predicting Following Words
8 Autoencoders
8.1 Learning Representations
8.2 Different Autoencoder Architectures
8.3 Stacking Autoencoders
8.4 Recreating the Cat Paper
9 Neural Language Models
9.1 Word Embeddings and Word Analogies
9.2 CBOW and Word2vec
9.3 Word2vec in Code
9.4 Walking Through the Word-Space: An Idea That Has Eluded Symbolic AI
10 An Overview of Different Neural Network Architectures
10.1 Energy-Based Models
10.2 Memory-Based Models
10.3 The Kernel of General Connectionist Intelligence: The bAbI Dataset
11 Conclusion
11.1 An Incomplete Overview of Open Research Questions
11.2 The Spirit of Connectionism and Philosophical Ties
Index