pandas is a Python package that provides fast, flexible data structures designed to make working with "relational" or "labeled" data easy and intuitive. It aims to be the fundamental, high-level building block for practical analysis of real-world data in Python. Its broader goal is to become the most powerful and flexible open source data analysis and manipulation tool available in any language. The two main data structures in pandas are the Series, for one-dimensional data, and the DataFrame, for two-dimensional data. Together they handle the vast majority of typical use cases in finance, statistics, social sciences, and many areas of engineering. For R users, the DataFrame provides everything that R's data.frame offers, and much more. pandas is built on NumPy and is designed to integrate well into a scientific computing environment alongside many other third-party libraries. pandas also supports the everyday work of data science, which is typically divided into several stages: collecting and cleaning data, analyzing or modeling it, and then organizing the results in a form suitable for plotting or tabular display. pandas helps with all of these tasks. It has been widely used in production financial applications, and it can also be used to work with big data.
Author(s): CESAR PEREZ LOPEZ
Publisher: CESAR PEREZ
Year: 2023
Language: English
Pages: 396
Table of contents
Introduction
DATA STRUCTURES IN PANDAS. WORKING WITH SERIES AND DATAFRAMES
1.1 DATA STRUCTURES: DATAFRAMES AND SERIES
1.1.1 Starting with Pandas. Data structures
1.1.2 DataFrames
1.1.3 Series
1.1.4 Methods
1.2 Reading and writing tabular data
1.3 SUBSETS OF DATA IN DATAFRAMES
1.3.1 Selection of columns of a DataFrame
1.3.2 Selection of rows of a DataFrame
1.3.3 Selection of specific rows and columns of a DataFrame
1.4 INTRODUCTION TO VISUALISATIONS
1.5 CREATION OF DERIVED COLUMNS
1.6 CALCULATION OF SUMMARY STATISTICS
1.6.1 Summary statistics by category. groupby() method
1.6.2 Count number of records per category. Method value_counts()
1.7 REMODELLING THE DESIGN OF TABLES
1.7.1 Sort rows of a table. sort_values() method
1.7.2 Transform long table format to wide
1.7.3 Pivot tables. Method pivot()
1.7.4 Transform wide table format to long
1.8 COMBINE DATA FROM SEVERAL TABLES
1.8.1 Concatenate tables using a common identifier. concat() method
1.8.2 Join tables using a common identifier. merge() method
1.9 TIME SERIES DATA
1.9.1 Date and time properties. to_datetime() method
1.9.2 Date and time as index. Method pivot()
1.9.3 Resample a time series to another frequency. resample() method
1.10 TEXTUAL DATA: STRINGS
BASIC METHODS IN PANDAS
2.1 CREATING OBJECTS, WORKING WITH DATA AND OPERATIONS
2.1.1 Creation of objects
2.1.2 Showing data
2.1.3 Selection
2.1.4 Missing data
2.1.5 Statistical operations
2.2 METHODS FOR DATA TRANSFORMATION
2.2.1 Merge: concat() method
2.2.2 Merge: merge() method
2.2.3 Grouping. groupby() method
2.2.4 Hierarchical indexing and remodelling
2.2.5 Pivot tables. Method pivot()
2.3 TIME SERIES AND CATEGORICAL DATA
2.3.1 Time series
2.3.2 Categorical data
2.4 Data representation
2.5 Data input/output
METHODS FOR DATA STRUCTURES IN PANDAS
3.1 Introduction to data structures