Become a Python Data Analyst: Perform exploratory data analysis and gain insight into scientific computing using Python

In this book, we will cover Python libraries such as NumPy, pandas, matplotlib, seaborn, SciPy, and scikit-learn, and apply them in practical data analysis and statistics examples. As you make your way through the chapters, you will learn to efficiently use the Jupyter Notebook to operate and manipulate data using NumPy and the pandas library. In the concluding chapters, you will gain experience in building simple predictive models and carrying out statistical computation and analysis using rich Python tools and proven data analysis techniques.

Author(s): Alvaro Fuentes
Publisher: Packt Publishing
Year: 2018

Language: English
Pages: 180

Cover
Title Page
Copyright and Credits
Packt Upsell
Contributor
Table of Contents
Preface
Chapter 1: The Anaconda Distribution and Jupyter Notebook
The Anaconda distribution
Installing Anaconda
Jupyter Notebook
Creating your own Jupyter Notebook
Notebook user interfaces
Using the Jupyter Notebook
Running code in a code cell
Running markdown syntax in a text cell
Styles and formats
Lists
Useful keyboard shortcuts
Summary
Chapter 2: Vectorizing Operations with NumPy
Introduction to NumPy
Problems and solutions
NumPy arrays
Creating arrays in NumPy
Creating arrays from lists
Creating arrays from built-in NumPy functions
Attributes of arrays
Basic math with arrays
Common manipulations with arrays
Indexing arrays
Slicing arrays
Reshaping arrays
Using NumPy for simulations
Coin flips
Simulating stock returns
Summary
Chapter 3: Pandas - Everyone's Favorite Data Analysis Library
Introduction to the pandas library
Important objects in pandas
Series
Creating a pandas series
DataFrames
Creating a pandas DataFrame
Anatomy of a DataFrame
Operations and manipulations of pandas
Inspection of data
Selection, addition, and deletion of data
Slicing DataFrames
Selection by labels
Answering simple questions about a dataset
Total employees by department in the dataset
Overall attrition rate
Average hourly rate
Average number of years
Employees with the most number of years
Overall employee satisfaction
Answering further questions
Employees with Low JobSatisfaction
Employees with both Low JobSatisfaction and JobInvolvement
Employee comparison
Summary
Chapter 4: Visualization and Exploratory Data Analysis
Introducing Matplotlib
Terminologies in Matplotlib
Introduction to pyplot
Object-oriented interface
Common customizations
Colors
Color names
Setting axis limits
Setting ticks and tick labels
Legend
Annotations
Producing grids, horizontal, and vertical lines
EDA with seaborn and pandas
Understanding the seaborn library
Performing exploratory data analysis
Key objectives when performing data analysis
Types of variable
Analyzing variables individually
Understanding the main variable
Numerical variables
Categorical variables
Relationships between variables
Scatter plot
Box plot
Complex conditional plots
Summary
Chapter 5: Statistical Computing with Python
Introduction to SciPy
Statistics subpackage
Confidence intervals
Probability calculations
Hypothesis testing
Performing statistical tests
Summary
Chapter 6: Introduction to Predictive Analytics Models
Predictive analytics and machine learning
Understanding the scikit-learn library
scikit-learn
Building a regression model using scikit-learn
Regression model to predict house prices
Summary
Other Books You May Enjoy
Index