Urban Informatics: Using Big Data to Understand and Serve Communities

Urban Informatics: Using Big Data to Understand and Serve Communities introduces the reader to the tools of data management, analysis, and manipulation using R statistical software. Designed for courses at the undergraduate level and above, this book is an ideal on-ramp to the study of urban informatics and to translating novel data sets into new insights and practical tools.

Each chapter has an Exploratory Data Assignment that prompts readers to practice their new skills on a data set of their choice. These assignments guide readers through the process of becoming familiar with the contents of a novel data set and communicating meaningful insights from the data to others.

Key Features:

  • A technical curriculum that covers both data management and analytics, as needed to become acquainted with a new data set and reveal its content.
  • Content that is contextualized in real-world applications relevant to community concerns.
  • Unit-level assignments that educators might use as midterms or in other ways. These include Community Experience assignments that prompt students to evaluate the assumptions they have made about their data against real-world information.
  • All data sets are publicly available through the Boston Data Portal.

Author(s): Daniel T. O'Brien
Series: Chapman & Hall/CRC Data Science Series
Edition: 1
Publisher: CRC Press/Chapman & Hall
Year: 2022

Language: English
Pages: 339
City: Boca Raton

Cover
Half Title
Series Page
Title Page
Copyright Page
Contents
Preface
1. Introduction
1.1. This Book: The Practice of Urban Informatics
1.2. The Themes of Urban Informatics
1.3. Novel Digital Data: “Big” Data or Something More?
1.3.1. What Are “Big Data”?
1.4. “Sensing” the Pulse of Communities
1.5. Civic Data Ecosystem
1.6. Policy Innovations: Changing How the City Works
1.7. The New Urban Science: In Search of a Paradigm
1.8. This Book: Learning Objectives and Structure
1.8.1. Learning Objectives
1.8.2. Organization of the Book
1.8.3. Worked Examples in this Book
1.9. Exercises
1.9.1. Problem Set
1.9.2. Exploratory Data Assignment
I. Information
2. Welcome to R
2.1. Worked Example and Learning Objectives
2.2. Getting Set Up with R
2.2.1. What is R?
2.2.2. Installing R
2.2.3. The R Interface
2.2.4. Installing RStudio
2.2.5. The RStudio Interface
2.3. Creating a Project in RStudio
2.3.1. Creating a New Syntax
2.4. How to Work with R
2.4.1. Coding in R
2.4.2. R as Calculator
2.4.3. Functions
2.5. R as Data Management Software
2.5.1. Variables
2.5.2. Vectors
2.5.3. Data Frames
2.5.4. Other Object Classes
2.6. Packages
2.7. Learning R
2.8. Summary
2.9. Exercises
2.9.1. Problem Set
2.9.2. Exploratory Data Assignment
3. Telling a Data Story: Examining Individual Records
3.1. Worked Example and Learning Objectives
3.1.1. Getting Started - A Reminder
3.2. Introducing R Markdown
3.3. Access and Import Data
3.4. Getting Acquainted with a Data Frame’s Structure
3.5. Subsetting Data Objects
3.5.1. Subsetting Data: How and Why?
3.5.2. Subsetting in Base R: Vectors
3.5.3. Subsetting in Base R: Data Frames
3.5.4. Subsetting in tidyverse
3.5.5. Combining Subsets with Other Tools
3.6. Sorting
3.7. Summary
3.8. Exercises
3.8.1. Problem Set
3.8.2. Exploratory Data Assignment
4. The Pulse of the City: Observing Variable Patterns
4.1. Worked Example and Learning Objectives
4.2. Summarizing Variables: How and Why?
4.3. Classes of Variable
4.3.1. The Five Main Classes of Variable
4.3.2. Example Variables from Craigslist
4.3.3. Converting Variable Classes: as. Functions
4.4. Summarizing Variables
4.4.1. Making Tables
4.4.2. The summary() Function
4.4.3. Summary Statistics
4.5. Doing More with Summaries
4.5.1. Summarizing Multiple Variables with apply()
4.5.2. Summarizing across Categories in One Variable: by()
4.6. Summarizing Subsets: The Return of Piping
4.6.1. Summary Statistics with summarise()
4.6.2. Summary Statistics by Categories with group_by()
4.7. Tables as Objects
4.8. Intro to Visualization: ggplot2
4.8.1. Histograms: The Most Popular Univariate Visualizations
4.8.2. Additional Univariate Visualizations
4.8.3. Incorporating Pipes into ggplot2
4.8.4. Stacked Graphs: One Variable across Categories
4.9. Summary
4.10. Exercises
4.10.1. Problem Set
4.10.2. Exploratory Data Assignment
5. Uncovering Information: Making and Creating Variables
5.1. Worked Example and Learning Objectives
5.2. Editing and Creating Variables: How and Why?
5.3. Variables in the Bike Collisions Dataset
5.4. Calculating (and Recalculating) Numeric Variables
5.4.1. One Variable, One Equation
5.4.2. Multiple Variables, One Equation
5.4.3. Multiple Variables, Different Equations
5.4.4. Calculating Variables in tidyverse
5.5. Manipulating Character Variables: stringr
5.5.1. Mutating Strings
5.5.2. Joining Strings
5.6. Creating Categories
5.6.1. Categorizing by Content: str_detect()
5.6.2. Categorizing by Criteria: ifelse()
5.6.3. Categorizing by Levels
5.7. Text Analysis
5.7.1. Step 1: Preparing the Data
5.7.2. Step 2: Creating a Corpus
5.7.3. Step 3: Cleaning the Text
5.7.4. Step 4: Creating a Document Term Matrix
5.7.5. Step 5: Examining Word Frequency
5.7.6. Summary and Usage
5.8. Dealing with Dates
5.9. Returning to ggplot2
5.9.1. Visualizing Weather and Injuries – Customizing Graphs
5.9.2. Visualizing a Third Variable: Facets
5.9.3. Visualizing Word Frequencies: Word Clouds
5.10. Summary
5.11. Exercises
5.11.1. Problem Set
5.11.2. Exploratory Data Assignment
Information: Unit I Summary and Major Assignments
II. Measurement
6. Measuring with Big Data
6.1. Worked Example and Learning Objectives
6.2. Data and Theory: The Responsibility of the Analyst
6.3. Missing Ingredients of Naturally Occurring Data
6.4. Unit of Analysis
6.4.1. Schema
6.5. What We Did: Defining “Neighborhood”
6.6. Isolating Relevant Content
6.6.1. Latent Constructs: A Guide for Measurement
6.6.2. What We Did: Isolating Physical Disorder
6.7. Biases and Validity
6.7.1. Validity
6.7.2. What We Did: Disentangling Physical Disorder from Reporting Tendencies
6.8. Summary
6.9. Exercises
6.9.1. Problem Set
6.9.2. Exploratory Data Assignment
7. Making Measures from Records: Aggregating and Merging Data
7.1. Worked Example and Learning Objectives
7.2. Aggregating and Merging Data: How and Why?
7.3. Introducing the Schema for Historical Census Data
7.4. Aggregation
7.4.1. aggregate()
7.4.2. Aggregation with tidyverse
7.5. Merging
7.5.1. merge()
7.5.2. join functions in tidyverse
7.6. SQL and the sqldf Package
7.6.1. Aggregation with SQL
7.6.2. Merging with SQL
7.7. Creating Custom Functions
7.8. Bivariate Visualizations
7.8.1. geom_point()
7.8.2. geom_density2d()
7.9. Summary
7.10. Exercises
7.10.1. Problem Set
7.10.2. Exploratory Data Assignment
8. Mapping Communities
8.1. Worked Example and Learning Objectives
8.2. Intro to Geographical Information Systems
8.2.1. Structure of Data
8.2.2. Coordinate Projection Systems
8.2.3. Types of Spatial Data
8.2.4. Making a Map: Layering .shps
8.3. Working with Spatial Data in R
8.3.1. The sf Package
8.3.2. Importing Spatial Data into R
8.3.3. Plotting a Shapefile
8.4. Making a Map
8.4.1. Importing and Merging Additional Data
8.4.2. Creating a Base Map
8.4.3. Making a Multi-Layer Map
8.4.4. Customization
8.4.5. Summary and Extensions
8.5. Working with Points
8.5.1. “Map” Records with lat-long
8.5.2. Converting Records into sf Points
8.5.3. Spatial Joining Points with Polygons
8.5.4. Aggregating Spatially Joined Data
8.5.5. Summary and Extensions
8.6. Connecting R to Other GIS Software
8.6.1. Making an Interactive Leaflet Map
8.7. Summary
8.8. Exercises
8.8.1. Problem Set
8.8.2. Exploratory Data Assignment
9. Advanced Visual Techniques
9.1. Worked Example and Learning Objectives
9.2. Data Visualization: How and Why?
9.3. Multiplots
9.3.1. Making Individual Graphics
9.3.2. Coordinating Individual Graphics in a Multiplot
9.4. Streamgraphs
9.4.1. Executing a Streamgraph
9.5. Heat Maps
9.5.1. A Heat Map for Two Categorical Variables
9.6. Correlograms
9.6.1. Creating a Correlogram
9.7. Animations
9.7.1. Animating a Bar Graph
9.7.2. Animating a Line Graph
9.7.3. Animations Redux
9.8. Summary
9.9. Exercises
9.9.1. Problem Set
9.9.2. Exploratory Data Assignment
Measurement: Unit II Summary and Major Assignments
III. Discovery
10. Beyond Measurement: Inferential Statistics (and Correlations)
10.1. Worked Example and Learning Objectives
10.2. Foundations of Inferential Statistics
10.2.1. Inferential Statistics and Samples
10.2.2. Distributions of Numeric Variables
10.2.3. Hypothesis Testing
10.2.4. Effect Sizes and Significance
10.2.5. Inferential Statistics in Practice: Analyzing “Big” Data
10.3. Correlations
10.3.1. Why Use a Correlation?
10.3.2. Effect Size: The Correlation Coefficient (r)
10.3.3. Running Correlations in R
10.3.4. Visualizing Correlation Matrices
10.4. Summary
10.5. Exercises
10.5.1. Problem Set
10.5.2. Exploratory Data Assignment
11. Identifying Inequities across Groups: ANOVA and t-Test
11.1. Worked Example and Learning Objectives
11.2. Identifying Inequities across Groups
11.2.1. Making Statistical Comparisons across Groups
11.3. t-Test: Comparing Two Groups
11.3.1. Why Use a t-Test?
11.3.2. Effect Size: Magnitude of Difference in Means
11.4. Conducting a t-Test in R
11.4.1. Single-Sample t-Test
11.4.2. Two-Sample t-Test
11.5. ANOVA: Comparing Three or More Groups
11.5.1. Why Use an ANOVA?
11.5.2. Effect Size: F-Statistic
11.6. Conducting an ANOVA in R
11.6.1. ANOVA with aov()
11.6.2. Post-Hoc Tests with TukeyHSD()
11.6.3. Communicating ANOVA Results
11.7. Visualizing Differences between Groups
11.7.1. Representing Means
11.7.2. Adding Variability
11.7.3. Comparing Multiple Variables across Groups
11.8. Summary
11.9. Exercises
11.9.1. Problem Set
11.9.2. Exploratory Data Assignment
12. Unpacking Mechanisms Driving Inequities: Multivariate Regression
12.1. Worked Example and Learning Objectives
12.2. Conducting an Equity Analysis
12.2.1. Multivariate Analysis: Modeling Dependent and Independent Variables
12.2.2. Multivariate Relationships: Correlation and Causation
12.3. Regression
12.3.1. Why Use a Regression?
12.3.2. Interpreting Multiple Independent Variables
12.3.3. Effect Size: Evaluating Each Independent Variable and the Model
12.4. Conducting Regressions in R: lm()
12.4.1. Bivariate Regression
12.4.2. Calculating Standardized Betas with lm.beta()
12.4.3. Multivariate Regression
12.4.4. Reporting Regressions
12.4.5. Incorporating Categorical Independent Variables
12.5. Some Extensions to Regression Analysis
12.5.1. Dealing with Non-Normal Dependent Variables
12.5.2. Working with Residuals
12.5.3. Building a Good Model
12.6. Visualizing Regressions
12.7. Summary
12.8. Exercises
12.8.1. Problem Set
12.8.2. Exploratory Data Assignment
Discovery: Unit III Summary and Major Assignments
IV. The Other Tools
13. Advanced Analytic Techniques
13.1. Structure and Learning Objectives
13.2. Network Science
13.2.1. What It Is
13.2.2. How It Works
13.2.3. When to Use It
13.2.4. Ethical Considerations
13.2.5. Major Applications
13.2.6. Additional Reading
13.3. Machine Learning and Artificial Intelligence
13.3.1. What It Is
13.3.2. How It Works
13.3.3. When to Use It
13.3.4. Ethical Considerations
13.3.5. Major Applications
13.3.6. Additional Reading
13.4. Predictive Analytics
13.4.1. What It Is
13.4.2. How It Works
13.4.3. When to Use It
13.4.4. Ethical Considerations
13.4.5. Major Applications
13.4.6. Additional Reading
13.5. Summary
13.6. Exercises
13.6.1. Problem Set
13.6.2. Exploratory Data Assignment
14. Emergent Technologies
14.1. Structure and Learning Objectives
14.2. Sensor Networks
14.2.1. What It Is
14.2.2. How It Works
14.2.3. Ethical Considerations
14.2.4. Major Applications
14.2.5. Additional Readings
14.3. 5G Cellular Networks
14.3.1. What It Is
14.3.2. How It Works
14.3.3. Ethical Considerations
14.3.4. Major Applications
14.3.5. Additional Readings
14.4. Blockchain
14.4.1. What It Is
14.4.2. How It Works
14.4.3. Ethical Considerations
14.4.4. Major Applications
14.4.5. Additional Readings
14.5. Summary
14.6. Exercises
14.6.1. Problem Set
14.6.2. Exploratory Data Assignment
The Other Tools: Unit IV Summary and Major Assignments
Bibliography
Index