Pandas has become an essential tool for data analysis. This book covers scenarios that occur in real-life data, so readers can relate the issues to their own work and apply the solutions directly. It is written so that the concepts are easy to understand and work through. It serves both beginners and experienced users: beginners can learn the library from scratch, while experienced users can brush up on the concepts, relate them to their own problems, and understand them in greater depth.
Table of Contents
1 Introduction
2 Advantages
2.1 Speed
2.2 Short code
2.3 Saves time
2.4 Easy
2.5 Versatile
2.6 Efficiency
2.7 Customizable
2.8 Supports multiple formats for I/O
2.9 Python support
3 Installation
3.1 Install Pandas
3.1.1 Installing with Anaconda
3.1.2 Installing with PyPI
3.2 Install Jupyter Notebook
4 Creating DataFrames
4.1 Creating DataFrame using dictionary data
4.2 Creating DataFrame using list data
4.2.1 Creating single column
4.2.2 Creating multiple columns
4.3 Adding custom column name
4.4 Creating DataFrame from list of dictionaries
4.5 Creating DataFrame from other files
4.6 Creating blank DataFrame
5 Basics of DataFrames
5.1 Read data
5.2 Shape of the DataFrame
5.3 Top ‘n’ rows
5.4 Last ‘n’ rows
5.5 Range of entries
5.6 Accessing the columns
5.7 Accessing ‘n’ columns
5.8 Type of column
5.9 Basic operations on column
5.9.1 Maximum
5.9.2 Minimum
5.9.3 Mean
5.9.4 Standard deviation
5.10 Describe the DataFrame
5.11 Conditional operation on columns
5.12 Accessing rows with loc and iloc
5.13 Set index
6 Reading and writing files
6.1 Reading CSV
6.1.1 Reading
6.1.2 Removing header
6.1.3 Adding custom header
6.1.4 Reading specific rows
6.1.5 Reading the data from specific row
6.1.6 Cleaning NA data
6.1.7 Reference
6.2 Writing to CSV
6.2.1 Avoiding index
6.2.2 Writing only specific columns
6.2.3 Avoid writing headers
6.2.4 Reference
6.3 Read Excel
6.3.1 Reading sheets
6.3.2 Passing function to columns
6.3.3 Some basic common functions in read_excel and read_csv
6.3.4 Reference
6.4 Writing Excel files
6.4.1 Writing to a custom sheet name
6.4.2 Avoid index
6.4.3 Avoid headers
6.4.4 Write at a particular row and column
6.4.5 Writing multiple sheets to the same Excel file
6.4.6 Reference
6.5 Reading and writing txt file
6.5.1 Reading txt
6.5.2 Writing to txt
6.6 Reference
6.6.1 https://Pandas.pydata.org/Pandas-docs/stable/user_guide/io.html
7 Working with missing data
7.1 Managing timestamp
7.2 Set index
7.3 Check if data is “na” or “notna”
7.4 Check if data has a missing datetime
7.5 Inserting missing date
7.6 Filling the missing data
7.6.1 Filling a common value to all missing data
7.6.2 Adding missing data to individual columns
7.6.3 Forward fill (row)
7.6.4 Backward fill (row)
7.6.5 Forward fill (column)
7.6.6 Limiting the forward/backward fill
7.6.7 Filling with Pandas objects
7.6.8 Filling for specific range of columns
7.7 Interpolate missing value
7.7.1 Linear interpolate
7.7.2 Time interpolate
7.7.3 Other methods of interpolation
7.7.4 Limiting the interpolation
7.7.5 Interpolation direction
7.7.6 Limit area of interpolation
7.8 Drop the missing value
7.8.1 Drop row with at least 1 missing value
7.8.2 Drop row with all missing values
7.8.3 Set threshold to drop
7.9 Replace the data
7.9.1 Replace a column with new column
7.9.2 Replace with mapping dictionary
7.9.3 Replacing value with NaN
7.9.4 Replace multiple values with NaN
7.9.5 Replacing data as per columns
7.9.6 Regex and replace
7.9.7 Regex on specific columns
8 Groupby
8.1 Creating group object
8.2 Simple operations with group
8.2.1 First
8.2.2 Last
8.2.3 Max
8.2.4 Min
8.2.5 Mean
8.3 Working of groupby
8.4 Iterate through groups
8.4.1 Group details
8.4.2 Iterate for groups
8.5 Get a specific group
8.6 Detailed view of the groups data
8.7 Group by sorting
8.7.1 Sorted data (default)
8.7.2 Unsorted data
8.8 Various functions associated with groupby object
8.9 Length
8.9.1 Len of an object
8.9.2 Length of each group
8.10 Groupby with multi-index
8.10.1 Grouping on level numbers
8.10.2 Grouping on level names
8.11 Grouping DataFrame with index level and columns
8.12 Aggregation
8.12.1 Applying multiple aggregate functions at once
8.12.2 Multiple aggregate functions on selected columns
8.12.3 Renaming the column names for aggregate functions
8.12.4 Named aggregation
8.12.5 Custom agg function on various columns
8.13 Transformation
8.13.1 Custom functions in transformation
8.13.2 Filling missing data
8.14 Window operations
8.14.1 Rolling
8.14.2 Expanding
8.15 Filtration
8.16 Instance methods
8.16.1 Sum, mean, max, min, etc.
8.16.2 Fillna
8.16.3 Fetching nth row
8.17 Apply
8.18 Plotting
8.18.1 Lineplot
8.18.2 Boxplot
9 Concatenation
9.1 Concatenate series
9.2 Concatenate DataFrames
9.3 Managing duplicate index
9.4 Adding keys to DataFrames
9.5 Use of keys
9.6 Adding DataFrame as a new column
9.6.1 Removing unwanted columns in column concatenation
9.6.2 Series in columns
9.7 Rearranging the order of column
9.8 Join DataFrame and series
9.9 Concatenating multiple DataFrames/series
10 Merge
10.1 Merging DataFrames
10.2 Merging different values of “ON” (joining) column
10.2.1 Merging with inner join
10.2.2 Merging with outer join
10.2.3 Merging with left join
10.2.4 Merging with right join
10.3 Knowing the source DataFrame after merge
10.4 Merging DataFrames with same column names
10.5 Other ways of joining
10.5.1 Join
10.5.2 Append
11 Pivot
11.1 Multilevel columns
11.2 Data for selected value (column)
11.3 Error from duplicate values
12 Pivot table
12.1 Aggregate function
12.1.1 List of functions to aggfunc
12.1.2 Custom functions to individual columns
12.2 Apply pivot_table() on desired columns
12.3 Margins
12.3.1 Naming the margin column
12.4 Grouper
12.5 Filling the missing value in pivot table
13 Reshape DataFrame using melt
13.1 Use of melt
13.2 Melt for only one column
13.3 Melt multiple columns
13.4 Custom column name
13.4.1 Custom variable name
13.4.2 Custom value name
14 Reshaping using stack and unstack
14.1 Stack the DataFrame
14.2 Stack custom level of column
14.3 Stack on multiple levels of column
14.4 Dropping missing values
14.5 Unstack the stacked DataFrame
14.5.1 Default unstack
14.5.2 Converting other index levels to column
14.5.3 Unstack multiple indexes
15 Frequency distribution of DataFrame column
15.1 Apply crosstab
15.2 Get total of rows/columns
15.3 Multilevel columns
15.4 Multilevel indexes
15.5 Custom name to rows/columns
15.6 Normalize (percentage) of the frequency
15.7 Analysis using custom function
16 Drop unwanted rows/columns
16.1 Delete row
16.1.1 Delete rows of custom index level
16.1.2 Delete multiple rows
16.2 Drop column
16.2.1 Delete multiple columns
16.2.2 Delete multilevel columns
16.3 Delete both rows & columns
17 Remove duplicate values
17.1 Remove duplicate
17.2 Fetch custom occurrence of data
17.2.1 First occurrence
17.2.2 Last occurrence
17.2.3 Remove all duplicates
17.3 Ignore index
18 Sort the data
18.1 Sort columns
18.2 Sorting multiple columns
18.3 Sorting order
18.4 Positioning missing value
19 Working with date and time
19.1 Creation, working and use of DatetimeIndex
19.1.1 Converting date to timestamp and set as index
19.1.2 Access data for particular year
19.1.3 Access data for particular month
19.1.4 Calculating average closing price for any month
19.1.5 Access a date range
19.1.6 Resampling the data
19.1.7 Plotting the resampled data
19.1.8 Quarterly frequency
19.2 Working with date ranges
19.2.1 Adding dates to the data
19.2.2 Apply the above date range to our data
19.2.3 Generate the missing data with missing dates
19.2.4 Date range with periods
19.3 Working with custom holidays
19.3.1 Adding US holidays
19.3.2 Creating custom calendar
19.3.3 Observance rule
19.3.4 Custom week days
19.3.5 Custom holiday
19.4 Working with date formats
19.4.1 Converting to a common format
19.4.2 Time conversion
19.4.3 Dayfirst formats
19.4.4 Remove custom delimiter in date
19.4.5 Remove custom delimiter in time
19.4.6 Handling errors in datetime
19.4.7 Epoch time
19.5 Working with periods
19.5.1 Annual period
19.5.2 Monthly period
19.5.3 Daily period
19.5.4 Hourly period
19.5.5 Quarterly period
19.5.6 Converting one frequency to another
19.5.7 Arithmetic between two periods
19.6 Period Index
19.6.1 Getting given number of periods
19.6.2 Period index to DataFrame
19.6.3 Extract annual data
19.6.4 Extract a range of periods data
19.6.5 Convert periods to datetime index
19.6.6 Convert DatetimeIndex to PeriodIndex
19.7 Working with time zones
19.7.1 Make a naïve time time-zone aware
19.7.2 Available time zones
19.7.3 Convert one time zone to another
19.7.4 Time zone in a date range
19.7.5 Time zone with dateutil
19.8 Data shifts in DataFrame
19.8.1 Shifting the price down
19.8.2 Shifting by multiple rows
19.8.3 Reverse shifting
19.8.4 Use of shift
19.8.5 DatetimeIndex shift
19.8.6 Reverse DatetimeIndex shift
20 Database
20.1 Working with MySQL
20.1.1 Installations
20.1.2 Create connection
20.1.3 Read table data
20.1.4 Fetching specific columns from table
20.1.5 Execute a query
20.1.6 Insert data to table
20.1.7 Common function to read table and query
20.2 Working with MongoDB
20.2.1 Installations
20.2.2 Create connection
20.2.3 Get records
20.2.4 Fetching specific columns
20.2.5 Insert records
20.2.6 Delete records
21 About Author