Modern Data Visualization with R describes the many ways that raw and summary data can be turned into visualizations that convey meaningful insights. It starts with basic graphs such as bar charts, scatter plots, and line charts, but progresses to less well-known visualizations such as tree maps, alluvial plots, radar charts, mosaic plots, effects plots, correlation plots, biplots, and the mapping of geographic data. Both static and interactive graphics are described, and the use of color, shape, shading, grouping, annotation, and animation is covered in detail. The book moves from a default look and feel for graphs to graphs with customized colors, fonts, legends, annotations, and organizational themes.
Features
• Contains a wide breadth of graph types including newer and less well-known approaches
• Connects each graph type to the characteristics of the data and the goals of the analysis
• Moves the reader from simple graphs describing one variable to building visualizations that describe complex relationships among many variables
• Provides newer approaches to creating interactive web graphics via JavaScript libraries
• Details how to customize each graph type to meet users’ needs and those of their audiences
• Gives methods for creating visualizations that are publication ready for print (in color or black and white) and the web
• Suggests best practices
• Offers examples from a wide variety of fields
The book is written for those new to data analysis as well as the seasoned data scientist. It can be used for both teaching and research, and will particularly appeal to anyone who needs to describe data visually and wants to find and emulate the most appropriate method quickly. The reader should have some basic coding experience, but expertise in R is not required. Some of the later chapters (e.g., visualizing statistical models) assume exposure to statistical inference at the level of analysis of variance and regression.
Author(s): Robert Kabacoff
Publisher: CRC Press
Year: 2024
Language: English
Pages: 272
Cover
Half Title
Series Page
Title Page
Copyright Page
Contents
Preface
0.1. Why This Book?
0.2. Acknowledgments
0.3. Supporting Website
1. Introduction
1.1. How to Use This Book
1.2. Pre-requisites
1.3. Setup
2. Data Preparation
2.1. Importing Data
2.1.1. Text Files
2.1.2. Excel Spreadsheets
2.1.3. Statistical Packages
2.1.4. Databases
2.2. Cleaning Data
2.2.1. Selecting Variables
2.2.2. Selecting Observations
2.2.3. Creating/Recoding Variables
2.2.4. Summarizing Data
2.2.5. Using Pipes
2.2.6. Processing Dates
2.2.7. Reshaping Data
2.2.8. Missing Data
3. Introduction to ggplot2
3.1. A Worked Example
3.1.1. ggplot
3.1.2. geoms
3.1.3. grouping
3.1.4. scales
3.1.5. facets
3.1.6. labels
3.1.7. themes
3.2. Placing the Data and Mapping Options
3.3. Graphs as Objects
4. Univariate Graphs
4.1. Categorical
4.1.1. Bar Chart
4.1.2. Pie Chart
4.1.3. Tree Map
4.1.4. Waffle Chart
4.2. Quantitative
4.2.1. Histogram
4.2.2. Kernel Density Plot
4.2.3. Dot Chart
5. Bivariate Graphs
5.1. Categorical vs. Categorical
5.1.1. Stacked Bar Chart
5.1.2. Grouped Bar Chart
5.1.3. Segmented Bar Chart
5.1.4. Improving the Color and Labeling
5.1.5. Other Plots
5.2. Quantitative vs. Quantitative
5.2.1. Scatterplot
5.2.2. Line Plot
5.3. Categorical vs. Quantitative
5.3.1. Bar Chart (on Summary Statistics)
5.3.2. Grouped Kernel Density Plots
5.3.3. Box Plots
5.3.4. Violin Plots
5.3.5. Ridgeline Plots
5.3.6. Mean/SEM Plots
5.3.7. Strip Plots
5.3.8. Cleveland Dot Charts
6. Multivariate Graphs
6.1. Grouping
6.2. Faceting
7. Maps
7.1. Geocoding
7.2. Dot Density Maps
7.2.1. Interactive Maps with Mapview
7.2.2. Static Maps with ggmap
7.3. Choropleth Maps
7.3.1. Data by Country
7.3.2. Data by US State
7.3.3. Data by US County
7.3.4. Building a Choropleth Map Using the sf and ggplot2 Packages and a Shapefile
7.4. Going Further
8. Time-Dependent Graphs
8.1. Time Series
8.2. Dumbbell Charts
8.3. Slope Graphs
8.4. Area Charts
8.5. Stream Graphs
9. Statistical Models
9.1. Correlation Plots
9.2. Linear Regression
9.3. Logistic Regression
9.4. Survival Plots
9.5. Mosaic Plots
10. Other Graphs
10.1. 3-D Scatterplot
10.2. Bubble Charts
10.3. Biplots
10.4. Alluvial Diagrams
10.5. Heatmaps
10.6. Radar Charts
10.7. Scatterplot Matrix
10.8. Waterfall Charts
10.9. Word Clouds
11. Customizing Graphs
11.1. Axes
11.1.1. Quantitative Axes
11.1.2. Categorical Axes
11.1.3. Date Axes
11.2. Colors
11.2.1. Specifying Colors Manually
11.2.2. Color Palettes
11.3. Points and Lines
11.3.1. Points
11.3.2. Lines
11.4. Fonts
11.5. Legends
11.5.1. Legend Location
11.5.2. Legend Title
11.6. Labels
11.7. Annotations
11.7.1. Adding Text
11.7.2. Adding Lines
11.7.3. Highlighting a Single Group
11.8. Themes
11.8.1. Altering Theme Elements
11.8.2. Pre-Packaged Themes
11.9. Combining Graphs
12. Saving Graphs
12.1. Via Menus
12.2. Via Code
12.3. File Formats
12.4. External Editing
13. Interactive Graphs
13.1. plotly
13.2. ggiraph
13.3. Other Approaches
13.3.1. rbokeh
13.3.2. rCharts
13.3.3. highcharter
14. Advice/Best Practices
14.1. Labeling
14.2. Signal-to-Noise Ratio
14.3. Color Choice
14.4. y-Axis Scaling
14.5. Attribution
14.6. Going Further
A. Datasets
A.1. Academic Salaries
A.2. Star Wars
A.3. Mammal Sleep
A.4. Medical Insurance Costs
A.5. Marriage Records
A.6. Fuel Economy Data
A.7. Literacy Rates
A.8. Gapminder Data
A.9. Current Population Survey (1985)
A.10. Houston Crime Data
A.11. Hispanic and Latino Populations
A.12. US Economic Timeseries
A.13. US Population by Age and Year
A.14. Saratoga Housing Data
A.15. NCCTG Lung Cancer Data
A.16. Titanic Data
A.17. JFK Cuban Missile Speech
B. About the Author
C. About the QAC
Bibliography
Index