Effective Pandas: Patterns for Data Manipulation (Treading on Python)

Best practices for manipulating data with Pandas. This book will arm you with years of knowledge and experience, condensed into an easy-to-follow format. Instead of spending months reading blogs and websites and searching mailing lists and groups, you can use this book to learn how to write good Pandas code.

Author(s): Matt Harrison
Edition: 1
Publisher: Independently published
Year: 2021

Language: English
Pages: 379
Tags: Pandas

Introduction
Who this book is for
Data in this Book
Hints, Tables, and Images
Installation
Anaconda
Pip
Jupyter Overview
Summary
Exercises
Data Structures
Summary
Exercises
Series Introduction
The index abstraction
The pandas Series
The NaN value
Optional Integer Support for NaN
Similar to NumPy
Categorical Data
Summary
Exercises
Series Deep Dive
Loading the Data
Series Attributes
Summary
Exercises
Operators (& Dunder Methods)
Introduction
Dunder Methods
Index Alignment
Broadcasting
Iteration
Operator Methods
Chaining
Summary
Exercises
Aggregate Methods
Aggregations
Count and Mean of an Attribute
.agg and Aggregation Strings
Summary
Exercises
Conversion Methods
Automatic Conversion
Memory Usage
String and Category Types
Ordered Categories
Converting to Other Types
Summary
Exercises
Manipulation Methods
.apply and .where
If Else with Pandas
Missing Data
Filling In Missing Data
Interpolating Data
Clipping Data
Sorting Values
Sorting the Index
Dropping Duplicates
Ranking Data
Replacing Data
Binning Data
Summary
Exercises
Indexing Operations
Prepping the Data and Renaming the Index
Resetting the Index
The .loc Attribute
The .iloc Attribute
Heads and Tails
Sampling
Filtering Index Values
Reindexing
Summary
Exercises
String Manipulation
Strings and Objects
Categorical Strings
The .str Accessor
Searching
Splitting
Optimizing .apply with Cython
Replacing Text
Summary
Exercises
Date and Time Manipulation
Date Theory
Loading UTC Time Data
Loading Local Time Data
Converting Local time to UTC
Converting to Epochs
Manipulating Dates
Summary
Exercises
Dates in the Index
Finding Missing Data
Filling In Missing Data
Interpolation
Dropping Missing Values
Shifting Data
Rolling Average
Resampling
Gathering Aggregate Values (But Keeping Index)
Groupby Operations
Cumulative Operations
Summary
Exercises
Plotting with a Series
Plotting in Jupyter
The .plot Attribute
Histograms
Box Plot
Kernel Density Estimation Plot
Line Plots
Line Plots with Multiple Aggregations
Bar Plots
Pie Plots
Styling
Summary
Exercises
Categorical Manipulation
Categorical Data
Frequency Counts
Benefits of Categories
Conversion to Ordinal Categories
The .cat Accessor
Category Gotchas
Generalization
Summary
Exercises
Dataframes
Database and Spreadsheet Analogues
A Simple Python Version
Dataframes
Construction
Dataframe Axis
Summary
Exercises
Similarities with Series and DataFrame
Getting the Data
Viewing Data
Summary
Exercises
Math Methods in DataFrames
Index Alignment
Duplicate Index Entries
Summary
Exercises
Looping and Aggregation
For Loops
Aggregations
The .apply Method
Summary
Exercises
Columns Types, .assign, and Memory Usage
Conversion Methods
Memory Usage
Summary
Exercises
Creating and Updating Columns
Loading the Data
More Column Cleanup
Summary
Exercises
Dealing with Missing and Duplicated Data
Missing Data
Duplicates
Summary
Exercises
Sorting Columns and Indexes
Sorting Columns
Sorting Column Order
Setting and Sorting the Index
Summary
Exercises
Filtering and Indexing Operations
Renaming an Index
Resetting the Index
Dataframe Indexing, Filtering, & Querying
Indexing by Position
Indexing by Name
Filtering with Functions & .loc
.query vs .loc
Summary
Exercises
Plotting with Dataframes
Lines Plots
Bar Plots
Scatter Plots
Area Plots and Stacked Bar Plots
Column Distributions with KDEs, Histograms, and Boxplots
Summary
Exercises
Reshaping Dataframes with Dummies
Dummy Columns
Undoing Dummy Columns
Summary
Exercises
Reshaping By Pivoting and Grouping
A Basic Example
Using a Custom Aggregation Function
Multiple Aggregations
Per Column Aggregations
Grouping by Hierarchy
Grouping with Functions
Summary
Exercises
More Aggregations
Aggregations while Keeping Rows
Filtering Parts of Groups
Summary
Exercises
Cross-tabulation Deep Dive
Cross-tabulation Summaries
Adding Margins
Normalizing Results
Hierarchical Columns with Cross Tabulations
Heatmaps
Summary
Exercises
Melting, Transposing, and Stacking Data
Melting Data
Un-melting Data
Transposing Data
Stacking & Unstacking
Stacking
Flattening Hierarchical Indexes and Columns
Summary
Exercises
Working with Time Series
Loading the Data
Adding Timezone Information
Exploring the Data
Slicing Time Series
Missing Timeseries Data
Exploring Seasonality
Resampling Data
Rules with Offset Aliases
Combining Offset Aliases
Anchored Offset Aliases
Resampling to Finer-grain Frequency
Grouping a Date Column with pd.Grouper
Summary
Exercises
Joining Dataframes
Adding Rows to Dataframes
Adding Columns to Dataframes
Joins
Join Indicators
Merge Validation
Joining Data Example
Dirty Devil Flow and Weather Data
Joining Data
Validating Joined Data
Visualization of Merged Data
Summary
Exercises
Exporting Data
Dirty Devil Data
Reading and Writing
Creating CSV Files
Exporting to Excel
Feather
SQL
JSON
Summary
Exercises
Styling Dataframes
Loading the Data
Sparklines
The .style Attribute
Formatting
Embedding Bar Plots
Highlighting
Heatmaps and Gradients
Captions
CSS Properties
Stickiness and Hiding
Hiding the Index
Summary
Exercises
Debugging Pandas
Checking if Dataframes are Equal
Debugging Chains
Debugging Chains Part II
Debugging Chains Part III
Debugging Chains Part IV
Debugging Apply (and Friends)
Memory Usage
Timing Information
Summary
Exercises
Summary
About the Author
Index
Also Available
One more thing