Embeddings in Natural Language Processing: Theory and Advances in Vector Representations of Meaning


Embeddings have undoubtedly been one of the most influential research areas in Natural Language Processing (NLP). Encoding information into a low-dimensional vector representation, which is easily integrable into modern machine learning models, has played a central role in the development of NLP. Embedding techniques initially focused on words, but attention soon shifted to other forms: from graph structures, such as knowledge bases, to other types of textual content, such as sentences and documents. This book provides a high-level synthesis of the main embedding techniques in NLP, in the broad sense. The book starts by explaining conventional word vector space models and word embeddings (e.g., Word2Vec and GloVe) and then moves to other types of embeddings, such as word sense, sentence, document, and graph embeddings. The book also provides an overview of recent developments in contextualized representations (e.g., ELMo and BERT) and explains their potential in NLP. Throughout the book, the reader can find both essential information for understanding a given topic from scratch and a broad overview of the most successful techniques developed in the literature.
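The core idea described above, encoding meaning as low-dimensional vectors so that semantically similar items end up close together, can be illustrated with a minimal sketch. The toy vectors and words below are hypothetical (real models such as Word2Vec or GloVe learn vectors of a few hundred dimensions from large corpora); only the cosine-similarity computation matches standard practice.

```python
import math

# Hypothetical 4-dimensional "embeddings" for illustration only.
# Real word embeddings are learned from corpora, not hand-written.
embeddings = {
    "king":  [0.80, 0.65, 0.10, 0.05],
    "queen": [0.75, 0.70, 0.12, 0.08],
    "apple": [0.05, 0.10, 0.90, 0.70],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; closer to 1.0 means
    the vectors point in more similar directions."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Semantically related words should score higher than unrelated ones.
sim_related = cosine_similarity(embeddings["king"], embeddings["queen"])
sim_unrelated = cosine_similarity(embeddings["king"], embeddings["apple"])
```

In a trained embedding space, this same similarity measure is what powers tasks such as word analogy and semantic search, which the book discusses under intrinsic evaluation.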

Author(s): Mohammad Taher Pilehvar, Jose Camacho-Collados
Series: Synthesis Lectures on Human Language Technologies
Edition: 1
Publisher: Morgan & Claypool
Year: 2020

Language: English
Commentary: Vector PDF
Pages: 175
City: San Rafael, CA
Tags: Machine Learning; Natural Language Processing; Semantic Analysis; word2vec; GloVe; Vector Space Model; Word Embeddings; Contextualized Embeddings; Sense Embeddings; Graph Embeddings; Sentence Embeddings

Preface
Introduction
Semantic Representation
Vector Space Models
The Evolution Path of Representations
Background
Natural Language Processing Fundamentals
Linguistic Fundamentals
Language Models
Deep Learning for NLP
Sequence Encoding
Recurrent Neural Networks (RNNs)
Transformers
Knowledge Resources
WordNet
Wikipedia, Freebase, Wikidata, DBpedia, and Wiktionary
BabelNet and ConceptNet
PPDB: The Paraphrase Database
Word Embeddings
Count-Based Models
Pointwise Mutual Information
Dimensionality Reduction
Random Indexing
Predictive Models
Character Embeddings
Knowledge-Enhanced Word Embeddings
Cross-Lingual Word Embeddings
Sentence-Level Supervision
Document-Level Supervision
Word-Level Supervision
Unsupervised
Evaluation
Intrinsic Evaluation
Extrinsic Evaluation
Graph Embeddings
Node Embedding
Matrix Factorization Methods
Random Walk Methods
Incorporating Node Attributes
Graph Neural Network Methods
Knowledge-Based Relation Embeddings
Unsupervised Relation Embeddings
Applications and Evaluation
Node Embedding
Relation Embedding
Sense Embeddings
Unsupervised Sense Embeddings
Sense Representations Exploiting Monolingual Corpora
Sense Representations Exploiting Multilingual Corpora
Knowledge-Based Sense Embeddings
Evaluation and Application
Contextualized Embeddings
The Need for Contextualization
Background: Transformer Model
Self-Attention
Encoder
Decoder
Positional Encoding
Contextualized Word Embeddings
Earlier Methods
Language Models for Word Representation
RNN-Based Models
Transformer-Based Models: BERT
Masked Language Modeling
Next Sentence Prediction
Training
Extensions
Translation Language Modeling
Context Fragmentation
Permutation Language Modeling
Reducing Model Size
Feature Extraction and Finetuning
Analysis and Evaluation
Self-Attention Patterns
Syntactic Properties
Depth-Wise Information Progression
Multilinguality
Lexical Contextualization
Evaluation
Sentence and Document Embeddings
Unsupervised Sentence Embeddings
Bag of Words
Sentence-Level Training
Supervised Sentence Embeddings
Document Embeddings
Application and Evaluation
Ethics and Bias
Bias in Word Embeddings
Debiasing Word Embeddings
Conclusions
Bibliography
Authors' Biographies