Develop insights from data with tidy tools. Import, wrangle, visualize, and model data with the Tidyverse R packages.
This book is intended for data scientists with some familiarity with the R programming language who want to do Data Science using the Tidyverse family of packages. Across five chapters, it covers importing, wrangling, visualizing, and modeling data using the Tidyverse packages, including the new Tidymodels framework. The Tidyverse packages provide a simple but powerful approach to Data Science that scales from the most basic analyses to massive data deployments. The book covers the entire life cycle of a Data Science project and presents specific tidy tools for each stage.
This course introduces a powerful set of Data Science tools known as the Tidyverse. The Tidyverse has revolutionized the way in which data scientists do almost every aspect of their job. We will cover the simple idea of “tidy data” and how this idea serves to organize data for analysis and modeling. We will also cover how non-tidy data can be transformed to tidy data, the Data Science project life cycle, and the ecosystem of Tidyverse R packages that can be used to execute a Data Science project.
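As a minimal sketch of what transforming non-tidy data into tidy data can look like, the example below reshapes a small made-up table with `pivot_longer()` from the tidyr package. The data frame, its column names, and its values are hypothetical and chosen only for illustration; they do not come from the book.

```r
library(tidyr)

# Hypothetical non-tidy table: one column per year, so the variable
# "year" is stored in the column names rather than in a column.
wide <- data.frame(
  country = c("A", "B"),
  `2019` = c(10, 20),
  `2020` = c(11, 22),
  check.names = FALSE
)

# Reshape so that each row is one observation (a country in a year)
# and each variable (country, year, cases) has its own column.
tidy <- pivot_longer(
  wide,
  cols = -country,
  names_to = "year",
  values_to = "cases"
)

print(tidy)
```

The result has four rows and three columns, one row per country and year, which is the shape most Tidyverse functions expect.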
Functional programming is an approach to programming in which computation is treated as the evaluation of mathematical functions. It is declarative: programs are built from expressions (or declarations) rather than statements. Functional programming is often favored because it allows for cleaner, shorter code that is elegant yet still understandable. Ultimately, the goal is simpler code that minimizes the time required for debugging, testing, and maintenance.
R at its core is a functional programming language. If you’re familiar with the apply() family of functions in base R, you’ve carried out some functional programming! Here, we’ll discuss functional programming and use the purrr package, which is designed to enhance functional programming in R. By using functional programming, you’ll be able to minimize redundancy within your code. In practice, this means determining what small building blocks your code needs, each of which becomes a function. These small building-block functions are then combined into more complex structures to form your final program.
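The building-block idea can be sketched roughly as follows, using `map()` from purrr. The helper functions `center`, `scale_range`, and `normalize`, and the input list, are hypothetical names and values invented for this illustration; they are not taken from the book.

```r
library(purrr)

# Two small building-block functions (hypothetical helpers).
center <- function(x) x - mean(x)                   # shift values to mean 0
scale_range <- function(x) x / (max(x) - min(x))    # divide by the range

# Combine the building blocks into a larger function.
normalize <- function(x) scale_range(center(x))

# Made-up input: a named list of numeric vectors.
measurements <- list(
  a = c(1, 2, 3),
  b = c(10, 20, 30)
)

# Apply the same function to every element without writing a loop.
# Base R's lapply(measurements, normalize) would do the same job here.
result <- map(measurements, normalize)

print(result)
```

Because `normalize` is assembled from smaller functions, each piece can be tested on its own, and the same call to `map()` works unchanged if more vectors are added to the list.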
Author(s): Carrie Wright, Shannon Ellis, Stephanie Hicks, Roger D. Peng
Publisher: Leanpub
Year: 2021
Language: English
Pages: 780
Table of Contents
Introduction to the Tidyverse
About This Course
Tidy Data
From Non-Tidy -> Tidy
The Data Science Life Cycle
The Tidyverse Ecosystem
Data Science Project Organization
Data Science Workflows
Case Studies
Importing Data in the Tidyverse
About This Course
Tibbles
Spreadsheets
CSVs
TSVs
Delimited Files
Exporting Data from R
JSON
XML
Databases
Web Scraping
APIs
Foreign Formats
Images
googledrive
Case Studies
Wrangling Data in the Tidyverse
About This Course
Tidy Data Review
Reshaping Data
Data Wrangling
Working With Factors
Working With Dates and Times
Working With Strings
Working With Text
Functional Programming
Exploratory Data Analysis
Case Studies
Visualizing Data in the Tidyverse
About This Course
Data Visualization Background
Plot Types
Making Good Plots
Plot Generation Process
ggplot2: Basics
ggplot2: Customization
Tables
ggplot2: Extensions
Case Studies
Modeling Data in the Tidyverse
About This Course
The Purpose of Data Science
Types of Data Science Questions
Data Needs
Descriptive and Exploratory Analysis
Inference
Linear Modeling
Multiple Linear Regression
Beyond Linear Regression
More Statistical Tests
Hypothesis Testing
Prediction Modeling
The tidymodels Ecosystem
Case Studies
Summary of tidymodels
About the Authors