Natural Language Processing and Information Retrieval: Principles and Applications

This book presents the basics and recent advancements in natural language processing and information retrieval in a single volume. It will serve as an ideal reference text for graduate students and academic researchers in interdisciplinary areas of electrical engineering, electronics engineering, computer engineering, and information technology.

This text emphasizes the existing problem domains and possible new directions in natural language processing and information retrieval. It discusses the importance of information retrieval with the integration of machine learning, deep learning, and word embedding. This approach supports the quick evaluation of real-time data. It covers important topics including rumor detection techniques, sentiment analysis using graph-based techniques, social media data analysis, and language-independent text mining.

Features
• Covers aspects of information retrieval in different areas including healthcare, data analysis, and machine translation
• Discusses recent advancements in language- and domain-independent information extraction from textual and/or multimodal data
• Explains models including decision making, random walk, knowledge graphs, word embedding, n-grams, and frequent pattern mining
• Provides integrated approaches of machine learning, deep learning, and word embedding for natural language processing
• Covers the latest datasets for natural language processing and information retrieval for social media platforms such as Twitter

Author(s): Muskan Garg, Sandeep Kumar, Abdul Khader Jilani Saudagar
Publisher: CRC Press
Year: 2023

Language: English
Pages: 271

Cover
Half Title
Series Page
Title Page
Copyright Page
Table of Contents
Preface
Editors
Contributors
1 Federated Learning for Natural Language Processing
1.1 Introduction
1.1.1 Centralized (Standard) Federated Learning
1.1.2 Aggregation Algorithms
1.1.3 DSSGD
1.1.4 FedAvg
1.1.5 FedProx
1.1.6 SCAFFOLD
1.1.7 FedOpt
1.1.8 Other Algorithms
1.1.9 The Cross-Device Setting
1.1.10 The Cross-Silo Setting
1.1.11 Horizontal Federated Learning
1.1.12 Vertical Federated Learning
1.1.13 Federated Transfer Learning
1.2 Split Learning and Split Federated Learning
1.2.1 Vanilla Split Learning
1.2.2 Configurations of Split Learning
1.2.3 SplitFed Learning
1.3 Decentralized (Peer-to-Peer) Federated Learning
1.4 Summary
1.5 NLP Through FL
1.6 Security and Privacy Protection
1.7 FL Platforms and Datasets
1.8 Conclusion
References
2 Utility-Based Recommendation System for Large Datasets Using EAHUIM
2.1 Introduction
2.2 Related Work
2.2.1 Recommendation System
2.2.2 Recommendation Systems in E-Commerce
2.2.3 Recommendation System with Utility Itemset Mining
2.2.4 Prerequisites of Utility Itemset Mining
2.2.5 Problem Definition
2.3 Proposed Model
2.3.1 Design of the Model – Recommendation System with EAHUIM
2.4 Results and Discussion
2.4.1 Setup
2.4.2 Data Collection and Preprocessing
2.4.3 Performance Evaluation
2.4.4 Limitations of the Model
2.4.5 Discussions
2.5 Conclusion
References
3 Anaphora Resolution: A Complete View with Case Study
3.1 Introduction
3.1.1 Issues and Challenges of an Anaphora
3.1.2 Need for Anaphora in NLP Applications
3.1.3 Anaphora
3.1.4 Discourse Anaphora
3.2 Approaches to Anaphora Resolution
3.2.1 Knowledge-Rich Approaches
3.2.2 Corpus Based Approaches
3.2.3 Knowledge-Poor Approaches
3.3 Case Study of Anaphora Resolution in Marathi Text
3.3.1 Development of POS Tagger
3.3.2 Anaphora Resolution System Architecture
3.4 Case Study of Anaphora Resolution in Marathi with Python
3.4.1 Database
3.4.2 Preprocessing
3.4.3 Post-Processing
3.5 Conclusion
References
4 A Review of the Approaches to Neural Machine Translation
4.1 Introduction
4.2 Machine Translation Approaches
4.3 Formulation of the NMT Task
4.4 The Encoder-Decoder Model
4.4.1 Encoder
4.4.2 Decoder
4.5 RNNs as Encoder-Decoder Models
4.5.1 One-Hot Encoding
4.5.2 Variations of RNNs
4.5.3 Discussion and Inferences
4.6 LSTMs: Dealing with Long-Term Dependencies and Vanishing Gradients
4.6.1 GRUs
4.6.2 Limitations of LSTMs
4.7 NMT with Attention
4.8 Recent Developments in NMT
4.8.1 Word Embeddings
4.8.2 CNN-Based NMT
4.8.3 Fully Attention-Based NMT
4.8.4 Transformer-Based Pre-Trained Models
4.8.5 Improved Transformer Models
4.9 NMT in Low-Resource Languages
4.10 Vocabulary Coverage Problem
4.11 Datasets for Machine Translation
4.12 Challenges and Future Scope
References
5 Evolution of Question-Answering System From Information Retrieval: A Scientific Time Travel for Bangla
5.1 Introduction
5.2 The Meaning and Various Ways of Research Done in the Field of Semantics
5.3 Semantic Text Retrieval: Automatic Question-Answering System or Query System
5.4 State-of-the-Art Performance
5.5 Latest Research Works on Question-Answering Systems in Related Major Global Languages
5.6 Latest Research Works on Question-Answering System in Major Indian Languages
5.7 Backend Database or Repository Used in the Question-Answering System
5.8 Different Approaches of Algorithms Used in Bangla QA System Research
5.9 Different Algorithms Used in Bangla QA System Research
5.10 Results Achieved by Various Bangla QA Systems
5.11 Conclusion
References
6 Recent Advances in Textual Code-Switching
6.1 Introduction
6.2 Background
6.2.1 Theoretical Approaches and Types of Code-Switching
6.2.2 Measuring Code-Switching Complexity
6.3 Code-Switching Datasets
6.3.1 Language Identification
6.3.2 POS Tagging
6.3.3 Named Entity Recognition
6.3.4 Chunking and Dependency Parsing
6.3.5 Sentiment Analysis
6.3.6 Question-Answering
6.3.7 Conversational Systems
6.3.8 Machine Translation
6.3.9 Natural Language Inference
6.4 NLP Techniques for Textual Code-Switching
6.4.1 Language Modeling
6.4.2 Language Identification
6.4.3 POS Tagging
6.4.4 Named Entity Recognition
6.4.5 Dependency Parsing
6.4.6 Sentiment Analysis
6.4.7 Natural Language Inference
6.4.8 Machine Translation
6.4.9 Question-Answering
6.5 Evaluation of Code-Switched Systems
6.6 Current Limitations and Future Work
References
7 Legal Document Summarization Using Hybrid Model
7.1 Introduction
7.1.1 Background
7.1.2 Motivation
7.1.3 Problem Definition
7.1.4 Objectives and Scopes
7.1.5 Organization
7.2 Literature Review
7.2.1 Automatic Text Summarization in the Legal Domain
7.3 Methodology
7.3.1 Legal Document Summary
7.3.2 Evaluation
7.4 Experiments and Results
7.4.1 Evaluating Extractive Model
7.4.2 Effects of Different K Values (Summary Length)
7.4.3 Evaluating Abstractive Summary Model
7.4.4 Comparison with Extractive Summarization Models
7.4.5 Comparison with Abstractive Model
7.5 Conclusion
References
8 Concept Network Using Network Text Analysis
8.1 Introduction
8.2 Literature Review
8.3 The Concept Network
8.3.1 Concept-Based Information Retrieval
8.3.2 Concept Networks and Extended Fuzzy Concept Networks
8.3.3 Applications for Fuzzy Concept Knowledge
8.3.4 Building WikiNet: Using Wikipedia as the Source
8.4 Network Text Analysis
8.4.1 Extracting Context Words From Training Documents
8.4.2 Building Bigram Frequency for Text Classification
8.4.3 Detecting Related Articles by Using Bigrams
8.5 Conclusion and Future Direction
References
9 Question-Answering Versus Machine Reading Comprehension: Neural Machine Reading Comprehension Using Transformer Models
9.1 Introduction
9.2 Architecture of Machine Reading Comprehension
9.2.1 Word Embedding
9.2.2 Feature Extraction
9.2.3 Context-Question Interaction
9.2.4 Answer Prediction
9.3 Machine Reading Comprehension Tasks and Classification
9.3.1 Cloze Tests
9.3.2 Multiple-Choice Questions
9.3.3 Span Extraction
9.3.4 Free-Form Answering
9.3.5 Attribute-Based Classification
9.4 Datasets
9.5 Performance Evaluation Metrics
9.5.1 Accuracy
9.5.2 Exact Match
9.5.3 Precision and Recall
9.5.4 F1 Score
9.5.6 ROUGE (Recall-Oriented Understudy for Gisting Evaluation)
9.5.7 BLEU (Bilingual Evaluation Understudy)
9.5.8 METEOR (Metric for Evaluation of Translation with Explicit Ordering)
9.5.9 HEQ (Human Equivalence Score)
9.6 Transformer and BERT
9.6.1 BERT-Based Models
9.7 Results and Discussion
9.8 Conclusion and Future Enhancement
References
10 Online Subjective Question-Answering System Necessity of Education System
10.1 Introduction
10.1.1 Brief on NLP (Natural Language Processing)
10.1.2 Question
10.1.3 Answer
10.1.4 Question-Answering System
10.1.5 Types of Question-Answering
10.1.6 Approaches Used for Developing QA System
10.1.7 Components of Question-Answering
10.1.8 Need for Question-Answering System
10.2 Question Types Follow Into Two Categories
10.2.1 Objective Examination
10.2.2 Subjective Examination
10.2.3 Subjective Examination-Related Work
10.2.4 Why is Subjective Examination Important
10.2.5 Online Education System
10.3 Proposed Model
10.3.1 Text Document
10.3.2 Preprocessing
10.3.3 POS Tagging
10.3.4 Question Generation
10.3.5 User (Students) and User (Teacher)
10.3.6 Model Answer
10.3.7 Answer
10.3.8 Evaluation
10.3.9 Result
10.4 Conclusion
References
Index