Applied Text Analysis with Python: Enabling Language-Aware Data Products with Machine Learning

From news and speeches to informal chatter on social media, natural language is one of the richest and most underutilized sources of data. Not only does it come in a constant stream, always changing and adapting in context; it also contains information that is not conveyed by traditional data sources. The key to unlocking natural language is the creative application of text analytics.

This practical book presents a data scientist’s approach to building language-aware products with applied machine learning. You’ll learn robust, repeatable, and scalable techniques for text analysis with Python, including contextual and linguistic feature engineering, vectorization, classification, topic modeling, entity resolution, graph analysis, and visual steering. By the end of the book, you’ll be equipped with practical methods to solve any number of complex real-world problems.

● Preprocess and vectorize text into high-dimensional feature representations
● Perform document classification and topic modeling
● Steer the model selection process with visual diagnostics
● Extract key phrases, named entities, and graph structures to reason about data in text
● Build a dialog framework to enable chatbots and language-driven interaction
● Use Spark to scale processing power and neural networks to scale model complexity

Author(s): Benjamin Bengfort, Tony Ojeda, Rebecca Bilbro
Edition: 1
Publisher: O’Reilly Media
Year: 2018

Language: English
Commentary: True PDF
Pages: 332
Tags: Deep Learning; Natural Language Processing; Python; Chatbots; Graphs; Classification; Clustering; Parallel Programming; Data Visualization; Apache Spark; scikit-learn; Text Wrangling; Text Analysis; Language Corpus