This book provides an overview of recent advances in representation learning theory, algorithms, and applications for natural language processing (NLP), ranging from word embeddings to pre-trained language models. It is divided into four parts. Part I presents representation learning techniques for multiple language entries, including words, sentences, and documents, as well as pre-training techniques. Part II introduces representation techniques related to NLP, including graphs, cross-modal entries, and robustness. Part III introduces representation techniques for knowledge closely related to NLP, including entity-based world knowledge, sememe-based linguistic knowledge, legal domain knowledge, and biomedical domain knowledge. Lastly, Part IV discusses the remaining challenges and future research directions. The theories and algorithms of representation learning presented here can also benefit related domains such as machine learning, social network analysis, the Semantic Web, information retrieval, data mining, and computational biology. This book is intended for advanced undergraduate and graduate students, post-doctoral fellows, researchers, lecturers, and industrial engineers, as well as anyone interested in representation learning and natural language processing. Compared to the first edition, the second edition (1) provides a more detailed introduction to representation learning in Chapter 1; (2) adds four new chapters covering pre-trained language models, robust representation learning, legal knowledge representation learning, and biomedical knowledge representation learning; (3) updates all chapters with recent advances in representation learning; and (4) corrects some errors in the first edition. New content makes up approximately 50% or more of the book compared to the first edition. This is an open access book.
Author(s): Zhiyuan Liu, Yankai Lin, Maosong Sun
Publisher: Springer
Year: 2023
Language: English
Pages: 541
Preface
Acknowledgements
Contents
Acronyms
Symbols and Notations
1 Representation Learning and NLP
1.1 Motivation
1.2 Why Representation Learning Is Important for NLP
1.3 Basic Ideas of Representation Learning
1.4 Development of Representation Learning for NLP
1.5 Learning Approaches to Representation Learning for NLP
1.6 Applications of Representation Learning for NLP
1.7 The Organization of This Book
References
2 Word Representation
2.1 Introduction
2.2 One-Hot Word Representation
2.3 Distributed Word Representation
2.3.1 Brown Cluster
2.3.2 Latent Semantic Analysis
2.3.3 Word2vec
2.3.4 GloVe
2.4 Contextualized Word Representation
2.5 Extensions
2.5.1 Word Representation Theories
2.5.2 Multi-prototype Word Representation
2.5.3 Multisource Word Representation
2.5.4 Multilingual Word Representation
2.5.5 Task-Specific Word Representation
2.5.6 Time-Specific Word Representation
2.6 Evaluation
2.6.1 Word Similarity/Relatedness
2.6.2 Word Analogy
2.7 Summary
References
3 Compositional Semantics
3.1 Introduction
3.2 Semantic Space
3.2.1 Vector Space
3.2.2 Matrix-Vector Space
3.3 Binary Composition
3.3.1 Additive Model
3.3.2 Multiplicative Model
3.4 N-Ary Composition
3.4.1 Recurrent Neural Network
3.4.2 Recursive Neural Network
3.4.3 Convolutional Neural Network
3.5 Summary
References
4 Sentence Representation
4.1 Introduction
4.2 One-Hot Sentence Representation
4.3 Probabilistic Language Model
4.4 Neural Language Model
4.4.1 Feedforward Neural Network Language Model
4.4.2 Convolutional Neural Network Language Model
4.4.3 Recurrent Neural Network Language Model
4.4.4 Transformer Language Model
4.4.5 Extensions
4.5 Applications
4.5.1 Text Classification
4.5.2 Relation Extraction
4.6 Summary
References
5 RETRACTED CHAPTER: Document Representation
6 Sememe Knowledge Representation
6.1 Introduction
6.1.1 Linguistic Knowledge Graphs
6.2 Sememe Knowledge Representation
6.2.1 Simple Sememe Aggregation Model
6.2.2 Sememe Attention over Context Model
6.2.3 Sememe Attention over Target Model
6.3 Applications
6.3.1 Sememe-Guided Word Representation
6.3.2 Sememe-Guided Semantic Compositionality Modeling
6.3.3 Sememe-Guided Language Modeling
6.3.4 Sememe Prediction
6.3.5 Other Sememe-Guided Applications
6.4 Summary
References
7 World Knowledge Representation
7.1 Introduction
7.1.1 World Knowledge Graphs
7.2 Knowledge Graph Representation
7.2.1 Notations
7.2.2 TransE
7.2.3 Extensions of TransE
7.2.4 Other Models
7.3 Multisource Knowledge Graph Representation
7.3.1 Knowledge Graph Representation with Texts
7.3.2 Knowledge Graph Representation with Types
7.3.3 Knowledge Graph Representation with Images
7.3.4 Knowledge Graph Representation with Logic Rules
7.4 Applications
7.4.1 Knowledge Graph Completion
7.4.2 Knowledge-Guided Entity Typing
7.4.3 Knowledge-Guided Information Retrieval
7.4.4 Knowledge-Guided Language Models
7.4.5 Other Knowledge-Guided Applications
7.5 Summary
References
8 Network Representation
8.1 Introduction
8.2 Network Representation
8.2.1 Spectral Clustering Based Methods
8.2.2 DeepWalk
8.2.3 Matrix Factorization Based Methods
8.2.4 Structural Deep Network Methods
8.2.5 Extensions
8.2.6 Applications
8.3 Graph Neural Networks
8.3.1 Motivations
8.3.2 Graph Convolutional Networks
8.3.3 Graph Attention Networks
8.3.4 Graph Recurrent Networks
8.3.5 Extensions
8.3.6 Applications
8.4 Summary
References
9 Cross-Modal Representation
9.1 Introduction
9.2 Cross-Modal Representation
9.2.1 Visual Word2vec
9.2.2 Cross-Modal Representation for Zero-Shot Recognition
9.2.3 Cross-Modal Representation for Cross-Media Retrieval
9.3 Image Captioning
9.3.1 Retrieval Models for Image Captioning
9.3.2 Generation Models for Image Captioning
9.3.3 Neural Models for Image Captioning
9.4 Visual Relationship Detection
9.4.1 Visual Relationship Detection with Language Priors
9.4.2 Visual Translation Embedding Network
9.4.3 Scene Graph Generation
9.5 Visual Question Answering
9.5.1 VQA and VQA Datasets
9.5.2 VQA Models
9.6 Summary
References
10 Resources
10.1 Open-Source Frameworks for Deep Learning
10.1.1 Caffe
10.1.2 Theano
10.1.3 TensorFlow
10.1.4 Torch
10.1.5 PyTorch
10.1.6 Keras
10.1.7 MXNet
10.2 Open Resources for Word Representation
10.2.1 Word2Vec
10.2.2 GloVe
10.3 Open Resources for Knowledge Graph Representation
10.3.1 OpenKE
10.3.2 Scikit-Kge
10.4 Open Resources for Network Representation
10.4.1 OpenNE
10.4.2 GEM
10.4.3 GraphVite
10.4.4 CogDL
10.5 Open Resources for Relation Extraction
10.5.1 OpenNRE
References
11 Outlook
11.1 Introduction
11.2 Using More Unsupervised Data
11.3 Utilizing Fewer Labeled Data
11.4 Employing Deeper Neural Architectures
11.5 Improving Model Interpretability
11.6 Fusing the Advances from Other Areas
References
Correction to: Z. Liu et al., Representation Learning for Natural Language Processing, https://doi.org/10.1007/978-981-15-5573-2