Advanced Data Science and Analytics with Python enables data scientists to continue developing their skills and apply them in business as well as academic settings. The subjects discussed in this book complement and follow up on the topics discussed in Data Science and Analytics with Python. The aim is to cover important advanced areas in data science using tools developed in Python such as scikit-learn, Pandas, NumPy, Beautiful Soup, NLTK, NetworkX and others. The model development is supported by the use of frameworks such as Keras, TensorFlow and Core ML, as well as Swift for the development of iOS and macOS applications.
Features:
Targets readers with a background in programming, who are interested in the tools used in data analytics and data science
Uses Python throughout
Presents tools, alongside solved examples, with steps that the reader can easily reproduce and adapt to their needs
Focuses on the practical use of the tools rather than on lengthy explanations
Provides the reader with the opportunity to use the book whenever needed rather than following a sequential path
The book can be read independently from the previous volume and each of the chapters in this volume is sufficiently independent from the others, providing flexibility for the reader. Each of the topics addressed in the book tackles the data science workflow from a practical perspective, concentrating on the process and results obtained. The implementation and deployment of trained models are central to the book.
Time series analysis, natural language processing, topic modelling, social network analysis, neural networks and deep learning are comprehensively covered. The book discusses the need to develop data products and addresses the subject of bringing models to their intended audiences - in this case, literally to the users' fingertips in the form of an iPhone app.
About the Author
Dr. Jesús Rogel-Salazar is a lead data scientist in the field, working for companies such as Tympa Health Technologies, Barclays, AKQA, IBM Data Science Studio and Dow Jones. He is a visiting researcher at the Department of Physics at Imperial College London, UK and a member of the School of Physics, Astronomy and Mathematics at the University of Hertfordshire, UK.
Author(s): Jesús Rogel-Salazar
Series: Chapman & Hall/CRC Data Mining and Knowledge Discovery Series
Publisher: CRC Press
Year: 2020
Language: English
Pages: xl+383
Cover
Half Title
Series Page
Title Page
Copyright Page
Dedication
Table of Contents
1: No Time to Lose: Time Series Analysis
1.1 Time Series
1.2 One at a Time: Some Examples
1.3 Bearing with Time: Pandas Series
1.3.1 Pandas Time Series in Action
1.3.2 Time Series Data Manipulation
1.4 Modelling Time Series Data
1.4.1 Regression... (Not) a Good Idea?
1.4.2 Moving Averages and Exponential Smoothing
1.4.3 Stationarity and Seasonality
1.4.4 Determining Stationarity
1.4.5 Autoregression to the Rescue
1.5 Autoregressive Models
1.6 Summary
2: Speaking Naturally: Text and Natural Language Processing
2.1 Pages and Pages: Accessing Data from the Web
2.1.1 Beautiful Soup in Action
2.2 Make Mine a Regular: Regular Expressions
2.2.1 Regular Expression Patterns
2.3 Processing Text with Unicode
2.4 Tokenising Text
2.5 Word Tagging
2.6 What Are You Talking About?: Topic Modelling
2.6.1 Latent Dirichlet Allocation
2.6.2 LDA in Action
2.7 Summary
3: Getting Social: Graph Theory and Social Network Analysis
3.1 Socialising Among Friends and Foes
3.2 Let’s Make a Connection: Graphs and Networks
3.2.1 Taking the Measure: Degree, Centrality and More
3.2.2 Connecting the Dots: Network Properties
3.3 Social Networks with Python: NetworkX
3.3.1 NetworkX: A Quick Intro
3.4 Social Network Analysis in Action
3.4.1 Karate Kids: Conflict and Fission in a Network
3.4.2 In a Galaxy Far, Far Away: Central Characters in a Network
3.5 Summary
4: Thinking Deeply: Neural Networks and Deep Learning
4.1 A Trip Down Memory Lane
4.2 No-Brainer: What Are Neural Networks?
4.2.1 Neural Network Architecture: Layers and Nodes
4.2.2 Firing Away: Neurons, Activate!
4.2.3 Going Forwards and Backwards
4.3 Neural Networks: From the Ground Up
4.3.1 Going Forwards
4.3.2 Learning the Parameters
4.3.3 Backpropagation and Gradient Descent
4.3.4 Neural Network: A First Implementation
4.4 Neural Networks and Deep Learning
4.4.1 Convolutional Neural Networks
4.4.2 Convolutional Neural Networks in Action
4.4.3 Recurrent Neural Networks
4.4.4 Long Short-Term Memory
4.4.5 Long Short-Term Memory Networks in Action
4.5 Summary
5: Here Is One I Made Earlier: Machine Learning Deployment
5.1 The Devil in the Detail: Data Products
5.2 Apples and Snakes: Core ML + Python
5.3 Machine Learning at the Core: Apps and ML
5.3.1 Environment Creation
5.3.2 Eeny, Meeny, Miny, Moe: Model Selection
5.3.3 Location, Location, Location: Exploring the Data
5.3.4 Modelling and Core ML: A Crucial Step
5.3.5 Model Properties in Core ML
5.4 Surprise and Delight: Build an iOS App
5.4.1 New Project: Xcode
5.4.2 Push My Buttons: Adding Functionality
5.4.3 Being Picky: The Picker View
5.4.4 Model Behaviour: Core ML + SwiftUI
5.5 Summary
A: Information Criteria
B: Power Iteration
C: The Softmax Function and Its Derivative
C.1 Numerical Stability
D: The Derivative of the Cross-Entropy Loss Function
Bibliography
Index