Gain insight into essential data science skills in a holistic manner using data engineering and associated scalable computational methods. This book covers the most popular Python 3 frameworks for both local and distributed (on-premises and cloud-based) processing. Along the way, you will be introduced to many popular open-source frameworks, such as SciPy, scikit-learn, Numba, and Apache Spark. The book is structured around examples, so you will grasp core concepts via case studies and Python 3 code.
As data science projects get continuously larger and more complex, software engineering knowledge and experience are crucial to produce evolvable solutions. You'll see how to create maintainable software for data science and how to document data engineering practices.
This book is a good starting point for people who want to gain practical data science skills. All the code will be available in the form of IPython notebooks and Python 3 programs, which allow you to reproduce all analyses from the book and customize them for your own purposes. You'll also benefit from advanced topics like Machine Learning, Recommender Systems, and Security in Data Science.
Practical Data Science with Python will empower you to analyze data, formulate proper questions, and produce actionable insights, three core stages in most data science endeavors.
What You'll Learn
Play the role of a data scientist when completing increasingly challenging exercises using Python 3
Work with proven data science techniques/technologies
Review scalable software engineering practices to ramp up data analysis abilities in the realm of Big Data
Apply the theory of probability, statistical inference, and algebra to understand data science practices
Who This Book Is For
Anyone who would like to enter the realm of data science using Python 3.
Author(s): Ervin Varga
Publisher: Apress
Year: 2019
Language: English
Pages: 462
Front Matter
1. Introduction to Data Science
2. Data Engineering
3. Software Engineering
4. Documenting Your Work
5. Data Processing
6. Data Visualization
7. Machine Learning
8. Recommender Systems
9. Data Security
10. Graph Analysis
11. Complexity and Heuristics
12. Deep Learning
Back Matter