Natural Language Processing and Information Retrieval: Principles and Applications

This book presents the basics and recent advancements in natural language processing and information retrieval in a single volume. It will serve as an ideal reference text for graduate students and academic researchers in interdisciplinary areas of electrical engineering, electronics engineering, computer engineering, and information technology.

This text emphasizes the existing problem domains and possible new directions in natural language processing and information retrieval. It discusses the importance of information retrieval with the integration of machine learning, deep learning, and word embedding. This approach supports the quick evaluation of real-time data. It covers important topics including rumor detection techniques, sentiment analysis using graph-based techniques, social media data analysis, and language-independent text mining.

Features
• Covers aspects of information retrieval in different areas including healthcare, data analysis, and machine translation
• Discusses recent advancements in language- and domain-independent information extraction from textual and/or multimodal data
• Explains models including decision making, random walk, knowledge graphs, word embedding, n-grams, and frequent pattern mining
• Provides integrated approaches of machine learning, deep learning, and word embedding for natural language processing
• Covers latest datasets for natural language processing and information retrieval for social media like Twitter

The text is primarily written for graduate students and academic researchers in interdisciplinary areas of electrical engineering, electronics engineering, computer engineering, and information technology.

Author(s): Muskan Garg, Sandeep Kumar, Abdul Khader Jilani Saudagar
Publisher: Taylor & Francis Group
Year: 2023

Language: English
Pages: 252

1 Federated learning for natural language processing

Sergei Ternovykh and Anastasia Nikiforova

1.1 Introduction

1.1.1 Centralized (standard) federated learning

1.1.2 Aggregation algorithms

1.1.3 DSSGD

1.1.4 FedAvg

1.1.5 FedProx

1.1.6 SCAFFOLD

1.1.7 FedOpt

1.1.8 Other algorithms

1.1.9 The cross-device setting

1.1.10 The cross-silo setting

1.1.11 Horizontal federated learning

1.1.12 Vertical federated learning

1.1.13 Federated transfer learning

1.2 Split learning and split federated learning

1.2.1 Vanilla split learning

1.2.2 Configurations of split learning

1.2.3 SplitFed learning

1.3 Decentralized (peer-to-peer) federated learning

1.4 Summary

1.5 NLP through FL

1.6 Security and privacy protection

1.7 FL platforms and datasets

1.8 Conclusion

References

2 Utility-based recommendation system for large datasets using EAHUIM

Vandna Dahiya

2.1 Introduction

2.2 Related work

2.2.1 Recommendation system

2.2.2 Recommendation systems in E-commerce

2.2.3 Recommendation system with utility itemset mining

2.2.4 Prerequisites of utility itemset mining

2.2.5 Problem definition

2.3 Proposed model

2.3.1 Design of the model – recommendation system with EAHUIM

2.4 Results and discussion

2.4.1 Setup

2.4.2 Data collection and preprocessing

2.4.3 Performance evaluation

2.4.4 Limitations of the model

2.4.5 Discussions

2.5 Conclusion

References

3 Anaphora resolution: A complete view with case study

Kalpana B. Khandale and C. Namrata Mahender

3.1 Introduction

3.1.1 Issues and challenges of anaphora

3.1.2 Need for anaphora in NLP applications

3.1.3 Anaphora

3.1.4 Discourse anaphora

3.2 Approaches to anaphora resolution

3.2.1 Knowledge-rich approaches

3.2.2 Corpus-based approaches

3.2.3 Knowledge-poor approaches

3.3 Case study of anaphora resolution in Marathi text

3.3.1 Development of POS tagger

3.3.2 Anaphora resolution system architecture

3.4 Case study of anaphora resolution in Marathi with Python

3.4.1 Database

3.4.2 Preprocessing

3.4.3 Post-processing

3.5 Conclusion

References

4 A review of the approaches to neural machine translation

Preetpal Kaur Buttar and Manoj Kumar Sachan

4.1 Introduction

4.2 Machine translation approaches

4.3 Formulation of the NMT task

4.4 The encoder-decoder model

4.4.1 Encoder

4.4.2 Decoder

4.5 RNNs as encoder-decoder models

4.5.1 One-hot encoding

4.5.2 Variations of RNNs

4.5.3 Discussion and inferences

4.6 LSTMs: dealing with long-term dependencies and vanishing gradients

4.6.1 GRUs

4.6.2 Limitations of LSTMs

4.7 NMT with attention

4.8 Recent developments in NMT

4.8.1 Word embeddings

4.8.2 CNN-based NMT

4.8.3 Fully attention-based NMT

4.8.4 Transformer-based pre-trained models

4.8.5 Improved transformer models

4.9 NMT in low-resource languages

4.10 Vocabulary coverage problem

4.11 Datasets for machine translation

4.12 Challenges and future scope

References

5 Evolution of question-answering system from information retrieval: A scientific time travel for Bangla

Arijit Das and Diganta Saha

5.1 Introduction

5.2 The meaning and various ways of research done in the field of semantics

5.3 Semantic text retrieval: automatic question-answering system or query system

5.4 State-of-the-art performance

5.5 Latest research works on question-answering systems in related major global languages

5.6 Latest research works on question-answering system in major Indian languages

5.7 Backend database or repository used in the question-answering system

5.8 Different approaches of algorithms used in Bangla QA system research

5.9 Different algorithms used in Bangla QA system research

5.10 Results achieved by various Bangla QA systems

5.11 Conclusion

References

6 Recent advances in textual code-switching

Sergei Ternovykh and Anastasia Nikiforova

6.1 Introduction

6.2 Background

6.2.1 Theoretical approaches and types of code-switching

6.2.2 Measuring code-switching complexity

6.3 Code-switching datasets

6.3.1 Language identification

6.3.2 POS tagging

6.3.3 Named entity recognition

6.3.4 Chunking and dependency parsing

6.3.5 Sentiment analysis

6.3.6 Question-answering

6.3.7 Conversational systems

6.3.8 Machine translation

6.3.9 Natural language inference

6.4 NLP techniques for textual code-switching

6.4.1 Language modeling

6.4.2 Language identification

6.4.3 POS tagging

6.4.4 Named entity recognition

6.4.5 Dependency parsing

6.4.6 Sentiment analysis

6.4.7 Natural language inference

6.4.8 Machine translation

6.4.9 Question-answering

6.5 Evaluation of code-switched systems

6.6 Current limitations and future work

References

7 Legal document summarization using hybrid model

Deekshitha and Nandhini K.

7.1 Introduction

7.1.1 Background

7.1.2 Motivation

7.1.3 Problem definition

7.1.4 Objectives and scopes

7.1.5 Organization

7.2 Literature review

7.2.1 Automatic text summarization in the legal domain

7.3 Methodology

7.3.1 Legal document summary

7.3.2 Evaluation

7.4 Experiments and results

7.4.1 Evaluating extractive model

7.4.2 Effects of different K values (summary length)

7.4.3 Evaluating abstractive summary model

7.4.4 Comparison with extractive summarization models

7.4.5 Comparison with abstractive model

7.5 Conclusion

References

8 Concept network using network text analysis

Md Masum Billah, Dipanita Saha, Farzana Bhuiyan, and Mohammed Kaosar

8.1 Introduction

8.2 Literature review

8.3 The concept network

8.3.1 Concept-based information retrieval

8.3.2 Concept networks and extended fuzzy concept networks

8.3.3 Applications for fuzzy concept knowledge

8.3.4 Building WikiNet: using Wikipedia as the source

8.4 Network text analysis

8.4.1 Extracting context words from training documents

8.4.2 Building bigram frequency for text classification

8.4.3 Detecting related articles by using bigrams

8.5 Conclusion and future direction

References

9 Question-answering versus machine reading comprehension: Neural machine reading comprehension using transformer models

Nisha Varghese and M. Punithavalli

9.1 Introduction

9.2 Architecture of machine reading comprehension

9.2.1 Word embedding

9.2.2 Feature extraction

9.2.3 Context-question interaction

9.2.4 Answer prediction

9.3 Machine reading comprehension tasks and classification

9.3.1 Cloze tests

9.3.2 Multiple-choice questions

9.3.3 Span extraction

9.3.4 Free-form answering

9.3.5 Attribute-based classification

9.4 Datasets

9.5 Performance evaluation metrics

9.5.1 Accuracy

9.5.2 Exact match

9.5.3 Precision and recall

9.5.4 F1 score

9.5.6 ROUGE (recall-oriented understudy for gisting evaluation)

9.5.7 BLEU (bilingual evaluation understudy)

9.5.8 METEOR (Metric for Evaluation of Translation with Explicit ORdering)

9.5.9 HEQ (human equivalence score)

9.6 Transformer and BERT

9.6.1 BERT-based models

9.7 Results and discussion

9.8 Conclusion and future enhancement

References

10 Online subjective question-answering system necessity of education system

Madhav A. Kankhar, Bharat A. Shelke, and C. Namrata Mahender

10.1 Introduction

10.1.1 Brief on NLP (natural language processing)

10.1.2 Question

10.1.3 Answer

10.1.4 Question-answering system

10.1.5 Types of question-answering

10.1.6 Approaches used for developing QA system

10.1.7 Components of question-answering

10.1.8 Need for question-answering system

10.2 Question types fall into two categories

10.2.1 Objective examination

10.2.2 Subjective examination

10.2.3 Subjective examination-related work

10.2.4 Why is subjective examination important

10.2.5 Online education system

10.3 Proposed model

10.3.1 Text document

10.3.2 Preprocessing

10.3.3 POS tagging

10.3.4 Question generation

10.3.5 User (students) and user (teacher)

10.3.6 Model answer

10.3.7 Answer

10.3.8 Evaluation

10.3.9 Result

10.4 Conclusion

References

Index