Natural Language Processing Practical using Transformers with Python

Learn how to perform named entity recognition using the Hugging Face Transformers and spaCy libraries in Python. Named Entity Recognition (NER) is a common natural language processing (NLP) task that automatically identifies and classifies predefined entities in a given text. Entities such as person names, organizations, dates and times, and locations are valuable information to extract from unstructured, unlabeled raw text. By the end of this tutorial, you will be able to perform named entity recognition on any given English text with Hugging Face Transformers and spaCy in Python.

spaCy is an open-source Python library for advanced Natural Language Processing (NLP). It is built on the latest research and designed to be used in real-world products. We'll be using two spaCy NER models, namely the regular en_core_web_sm and the transformer-based en_core_web_trf. We'll also use spaCy's NER visualizer. To get started, let's install the required libraries for this tutorial.

Fake News Detection in Python: Exploring the fake news dataset, performing data analysis such as word clouds and n-grams, and fine-tuning a BERT transformer to build a fake news detector in Python using the transformers library. Fake news is the intentional broadcasting of false or misleading claims as news, where the statements are purposely deceitful. Newspapers, tabloids, and magazines have been supplanted by digital news platforms, blogs, social media feeds, and a plethora of mobile news applications. News organizations have benefitted from the increased use of social media and mobile platforms by providing subscribers with up-to-the-minute information, and consumers now have instant access to the latest news. These digital media platforms have grown in prominence because they connect users easily to the rest of the world and let them discuss and share ideas and debate topics such as democracy, education, health, research, and history.

Fake news items on digital platforms are becoming more common and are used for political and financial gain. It is vital to recognize and differentiate between false and accurate news. One method is to have an expert fact-check every piece of information, but this takes time and requires expertise that cannot easily be shared. Another is to use Machine Learning (ML) and Artificial Intelligence (AI) tools to automate the identification of fake news. Online news comes in various unstructured formats (such as documents, videos, and audio), but we will concentrate on text here. With the progress of Machine Learning and Natural Language Processing, we can now recognize the misleading and false character of an article or statement. Several studies and experiments are being conducted to detect fake news across all mediums.

Paraphrase Text using Transformers in Python: Explore different pre-trained transformer models in the transformers library to paraphrase sentences in Python. Paraphrasing is the process of expressing someone else's ideas in your own words. To paraphrase a text, you have to rewrite it without changing its meaning. In this tutorial, we will explore different pre-trained transformer models for automatically paraphrasing text using the Hugging Face transformers library in Python.

The book covers the following topics:
1. Named Entity Recognition
2. Fake News Detection in Python
3. Paraphrase Text using Transformers in Python
4. Text Generation
5. Speech Recognition
6. Machine Translation
7. Train BERT from Scratch
8. Conversational AI Chatbot
9. Fine Tune BERT
10. Perform Text Summarization
11. Sentiment Analysis
12. Translate Languages
13. Perform Text Classification
14. Build a Text Generator
15. Build a Spam Classifier

I have explained every topic in the simplest way, and you can use these topics in multiple places.
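As a small illustration of the n-gram analysis mentioned in the fake news chapter description, the sketch below counts the most frequent word bigrams and trigrams in a handful of headlines using only the Python standard library. It is not taken from the book: the function name top_ngrams, the crude regex tokenizer, and the sample headlines are all assumptions made for this example.

```python
from collections import Counter
import re


def top_ngrams(texts, n=2, k=3):
    """Return the k most common word n-grams across an iterable of strings."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep only runs of letters/apostrophes (a deliberately crude tokenizer).
        tokens = re.findall(r"[a-z']+", text.lower())
        # zip over n shifted copies of the token list yields each window of n consecutive words.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts.most_common(k)


# Hypothetical headlines, invented for illustration only.
headlines = [
    "Breaking news: scientists confirm the moon landing",
    "Breaking news: celebrity denies the moon landing",
    "Local team wins the final",
]

print(top_ngrams(headlines, n=2, k=3))  # most frequent bigrams
print(top_ngrams(headlines, n=3, k=1))  # most frequent trigram
```

On a real dataset, stop-word removal and a proper tokenizer would usually come before counting, since otherwise pairs such as "of the" tend to dominate the results.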
Who this book is for: This book is aimed at tech-savvy students, programming enthusiasts, IT graduates, and computer science professionals who want to become proficient at building Python applications. A prior understanding of basic Python concepts such as variables, expressions, and control structures is required. You can also read Basic Core Python.

Author(s): Tony Snake
Publisher: Independently published
Year: 2022

Language: English
Pages: 275

About the Authors
Table of Contents
Natural Language Processing Practical using Transformers with Python
CHAPTER 1: Named Entity Recognition using Transformers and Spacy in Python
NER with Transformers
NER with SpaCy
Conclusion
SourceCode:
CHAPTER 2: Fake News Detection in Python
Introduction
How Big is this Problem?
The Solution
Data Exploration
Distribution of Classes
Data Cleaning for Analysis
Explorative Data Analysis
Single-word Cloud
Most Frequent Bigram (Two-word Combination)
Most Frequent Trigram (Three-word combination)
Building a Classifier by Fine-tuning BERT
Data Preparation
Tokenizing the Dataset
Loading and Fine-tuning the Model
Model Evaluation
Appendix: Creating a Submission File for Kaggle
Conclusion
SourceCode:
CHAPTER 3: Paraphrase Text using Transformers in Python
Pegasus Transformer
T5 Transformer
Parrot Paraphraser
Conclusion
SourceCode:
CHAPTER 4: Text Generation with Transformers in Python
Conclusion
SourceCode:
CHAPTER 5: Speech Recognition using Transformers in Python
Getting Started
Preparing the Audio File
Performing Inference
Wrapping up the Code
Conclusion
SourceCode:
CHAPTER 6: Machine Translation using Transformers in Python
Using Pipeline API
Manually Loading the Model
Conclusion
SourceCode:
CHAPTER 7: Train BERT from Scratch using Transformers in Python
Picking a Dataset
Training the Tokenizer
Tokenizing the Dataset
Loading the Model
Training
Using the Model
Conclusion
SourceCode:
CHAPTER 8: Conversational AI Chatbot with Transformers in Python
Generating Responses with Greedy Search
Generating Responses with Beam Search
Generating Responses with Sampling
Nucleus Sampling
Conclusion
SourceCode:
CHAPTER 9: Fine Tune BERT for Text Classification using Transformers in Python
Loading the Dataset
Training the Model
Performing Inference
Conclusion
SourceCode:
CHAPTER 10: Perform Text Summarization using Transformers in Python
Using pipeline API
Using T5 Model
Conclusion
SourceCode:
CHAPTER 11: Sentiment Analysis using VADER in Python
Conclusion
SourceCode:
CHAPTER 12: Translate Languages in Python
Translating Text
Translating List of Phrases
Language Detection
Supported Languages
Conclusion
SourceCode:
CHAPTER 13: Perform Text Classification in Python using Tensorflow 2 and Keras
Data Preparation
Building the Model
Training the Model
Testing the Model
Hyperparameter Tuning
Integrating Custom Datasets
SourceCode:
CHAPTER 14: Build a Text Generator using TensorFlow 2 and Keras in Python
Getting Started
Preparing the Dataset
Building the Model
Training the Model
Generating New Text
Conclusion
SourceCode:
CHAPTER 15: Build a Spam Classifier using Keras and TensorFlow in Python
1. Installing and Importing Dependencies
2. Loading the Dataset
3. Preparing the Dataset
4. Building the Model
5. Training the Model
6. Evaluating the Model
SourceCode:
Summary