If you're like many of Excel's 750 million users, you want to do more with your data, such as repeating similar analyses over hundreds of files or combining data from many files into a single analysis. This practical guide shows ambitious non-programmers how to automate and scale the processing and analysis of data in different formats by using Python.
After author Clinton Brownley takes you through Python basics, you'll be able to write simple scripts for processing data in spreadsheets as well as databases. You'll also learn how to use several Python modules for parsing files, grouping data, and producing statistics. No programming experience is necessary.
Create and run your own Python scripts by learning basic syntax
Use Python's csv module to read and parse CSV files
Read multiple Excel worksheets and workbooks with the xlrd module
Perform database operations in SQLite with the built-in sqlite3 module, or in MySQL with the mysqlclient module
Create Python applications to find specific records, group data, and parse text files
Build statistical graphs and plots with matplotlib, pandas, ggplot, and seaborn
Produce summary statistics, and estimate regression and classification models
Schedule your scripts to run automatically in both Windows and Mac environments
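To give a flavor of the kind of script the list above describes, here is a minimal sketch that uses Python's csv module to read CSV data and keep only the rows that meet a condition. It is not taken from the book: the sample data, the column names (Supplier, Invoice, Cost), the function name rows_over, and the threshold are all hypothetical, and the data is read from an in-memory string rather than a file so the example runs on its own.

```python
import csv
import io

# Hypothetical sample data standing in for a CSV file on disk.
SAMPLE_CSV = """Supplier,Invoice,Cost
Supplier X,001-1001,500.00
Supplier Y,50-9501,7009.00
Supplier Z,920-4803,615.00
"""


def rows_over(csv_text, threshold):
    """Return the rows whose Cost value is greater than threshold."""
    # DictReader uses the first line as column headings,
    # so each row is a dictionary keyed by heading.
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if float(row["Cost"]) > threshold]


# Keep only the invoices that cost more than 600.
expensive = rows_over(SAMPLE_CSV, 600.0)
for row in expensive:
    print(row["Supplier"], row["Cost"])
```

With a real file, the io.StringIO wrapper would be replaced by a file opened with open(path, newline=""), which is the form the csv module's documentation recommends.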
Author(s): Clinton W. Brownley
Publisher: O'Reilly Media
Year: 2016
Language: English
Pages: 352
Copyright
Table of Contents
Preface
Why Read This Book? Why Learn These Skills?
Who Is This Book For?
Why Windows?
Why Python?
Base Python and pandas
Anaconda Python
Installing Anaconda Python (Windows or Mac)
Text Editors
Download Book Materials
Overview of Chapters
Conventions Used in This Book
Using Code Examples
Safari® Books Online
How to Contact Us
Acknowledgments
Chapter 1. Python Basics
How to Create a Python Script
How to Run a Python Script
Useful Tips for Interacting with the Command Line
Python’s Basic Building Blocks
Numbers
Strings
Regular Expressions and Pattern Matching
Dates
Lists
Tuples
Dictionaries
Control Flow
Reading a Text File
Create a Text File
Script and Input File in Same Location
Modern File-Reading Syntax
Reading Multiple Text Files with glob
Create Another Text File
Writing to a Text File
Add Code to first_script.py
Writing to a Comma-Separated Values (CSV) File
print Statements
Chapter Exercises
Chapter 2. Comma-Separated Values (CSV) Files
Base Python Versus pandas
Read and Write a CSV File (Part 1)
How Basic String Parsing Can Fail
Read and Write a CSV File (Part 2)
Filter for Specific Rows
Value in Row Meets a Condition
Value in Row Is in a Set of Interest
Value in Row Matches a Pattern/Regular Expression
Select Specific Columns
Column Index Values
Column Headings
Select Contiguous Rows
Add a Header Row
Reading Multiple CSV Files
Count Number of Files and Number of Rows and Columns in Each File
Concatenate Data from Multiple Files
Sum and Average a Set of Values per File
Chapter Exercises
Chapter 3. Excel Files
Introspecting an Excel Workbook
Processing a Single Worksheet
Read and Write an Excel File
Filter for Specific Rows
Select Specific Columns
Reading All Worksheets in a Workbook
Filter for Specific Rows Across All Worksheets
Select Specific Columns Across All Worksheets
Reading a Set of Worksheets in an Excel Workbook
Filter for Specific Rows Across a Set of Worksheets
Processing Multiple Workbooks
Count Number of Workbooks and Rows and Columns in Each Workbook
Concatenate Data from Multiple Workbooks
Sum and Average Values per Workbook and Worksheet
Chapter Exercises
Chapter 4. Databases
Python’s Built-in sqlite3 Module
Insert New Records into a Table
Update Records in a Table
MySQL Database
Insert New Records into a Table
Query a Table and Write Output to a CSV File
Update Records in a Table
Chapter Exercises
Chapter 5. Applications
Find a Set of Items in a Large Collection of Files
Calculate a Statistic for Any Number of Categories from Data in a CSV File
Calculate Statistics for Any Number of Categories from Data in a Text File
Chapter Exercises
Chapter 6. Figures and Plots
matplotlib
Bar Plot
Histogram
Line Plot
Scatter Plot
Box Plot
pandas
ggplot
seaborn
Chapter 7. Descriptive Statistics and Modeling
Datasets
Wine Quality
Customer Churn
Wine Quality
Descriptive Statistics
Grouping, Histograms, and t-tests
Pairwise Relationships and Correlation
Linear Regression with Least-Squares Estimation
Interpreting Coefficients
Standardizing Independent Variables
Making Predictions
Customer Churn
Logistic Regression
Interpreting Coefficients
Making Predictions
Chapter 8. Scheduling Scripts to Run Automatically
Task Scheduler (Windows)
The cron Utility (macOS and Unix)
Crontab File: One-Time Set-up
Adding Cron Jobs to the Crontab File
Chapter 9. Where to Go from Here
Additional Standard Library Modules and Built-in Functions
Python Standard Library (PSL): A Few More Standard Modules
Built-in Functions
Python Package Index (PyPI): Additional Add-in Modules
NumPy
SciPy
Scikit-Learn
A Few Additional Add-in Packages
Additional Data Structures
Stacks
Queues
Graphs
Trees
Where to Go from Here
Appendix A. Download Instructions
Download Python 3
Windows
macOS
Download the xlrd Package
Windows
macOS
Download the MySQL Database Server
Windows
macOS
Setting Up MySQL
Download mysqlclient (Python 3.x)/MySQL-python (Python 2.x)
Windows
macOS
Appendix B. Answers to Exercises
Chapter 1
Bibliography
Index
About the Author
Colophon