Mining Social Media: Finding Stories in Internet Data

BuzzFeed News Senior Reporter Lam Thuy Vo explains how to mine, process, and analyze data from the social web in meaningful ways with the Python programming language.

Did fake Twitter accounts help sway a presidential election? What can Facebook and Reddit archives tell us about human behavior? In Mining Social Media, senior BuzzFeed reporter Lam Thuy Vo shows you how to use Python and key data analysis tools to find the stories buried in social media. Whether you're a professional journalist, an academic researcher, or a citizen investigator, you'll learn how to use technical tools to collect and analyze data from social media sources to build compelling, data-driven stories.

Learn how to:
• Write Python scripts and use APIs to gather data from the social web
• Download data archives and dig through them for insights
• Inspect HTML downloaded from websites for useful content
• Format, aggregate, sort, and filter your collected data using Google Sheets
• Create data visualizations to illustrate your discoveries
• Perform advanced data analysis using Python, Jupyter Notebooks, and the pandas library
• Apply what you've learned to research topics on your own

Social media is filled with thousands of hidden stories just waiting to be told. Learn to use the data-sleuthing tools that professionals use to write your own data-driven stories.

Author(s): Lam Thuy Vo
Edition: 1
Publisher: No Starch Press
Year: 2019

Language: English
Commentary: Vector PDF
Pages: 208
City: San Francisco, CA
Tags: Data Analysis; Data Mining; Python; Data Visualization; JSON; Web Scraping; CSV; Twitter; pandas; Social Media; Facebook; Jupyter; Reddit; Elementary

Brief Contents
Contents in Detail
Acknowledgments
Introduction
What Is Data Analysis?
Who Is This Book For?
Conventions Used in This Book
What This Book Covers
Part I: Data Mining
Part II: Data Analysis
Downloading and Installing Python
Installing on Windows
Installing on macOS
Getting Help When You’re Stuck
Summary
Part I: Data Mining
Chapter 1: The Programming Languages You’ll Need to Know
Frontend Languages
How HTML Works
How CSS Works
How JavaScript Works
Backend Languages
Using Python
Getting Started with Python
Working with Numbers
Working with Strings
Storing Values in Variables
Storing Multiple Values in Lists
Working with Functions
Creating Your Own Functions
Using Loops
Using Conditionals
Summary
Chapter 2: Where to Get Your Data
What Is an API?
Using an API to Get Data
Getting a YouTube API Key
Retrieving JSON Objects Using Your Credentials
Answering a Research Question Using Data
Refining the Data That Your API Returns
Summary
Chapter 3: Getting Data with Code
Writing Your First Script
Running a Script
Planning Out a Script
Libraries and pip
Creating a URL-based API Call
Storing Data in a Spreadsheet
Converting JSON into a Dictionary
Going Back to the Script
Running the Finished Script
Dealing with API Pagination
Templates: How to Make Your Code Reusable
Storing Values That Change in Variables
Storing Code in a Reusable Function
Summary
Chapter 4: Scraping Your Own Facebook Data
Your Data Sources
Downloading Your Facebook Data
Reviewing the Data and Inspecting the Code
Structuring Information as Data
Scraping Automatically
Analyzing HTML Code to Recognize Patterns
Grabbing the Elements You Need
Extracting the Contents
Writing Data into a Spreadsheet
Building Your Rows List
Writing to Your .csv File
Running the Script
Summary
Chapter 5: Scraping a Live Site
Messy Data
Ethical Considerations for Data Scraping
The Robots Exclusion Protocol
The Terms of Service
Technical Considerations for Data Scraping
Reasons for Scraping Data
Scraping from a Live Website
Analyzing the Page’s Contents
Storing the Page Content in Variables
Making the Script Reusable
Practicing Polite Scraping
Summary
Part II: Data Analysis
Chapter 6: Introduction to Data Analysis
The Process of Data Analysis
Bot Spotting
Getting Started with Google Sheets
Modifying and Formatting the Data
Aggregating the Data
Using Pivot Tables to Summarize Data
Using Formulas to Do Math
Sorting and Filtering the Data
Merging Data Sets
Other Ways to Use Google Sheets
Summary
Chapter 7: Visualizing Your Data
Understanding Our Bot Through Charts
Choosing a Chart
Specifying a Time Period
Making a Chart
Conditional Formatting
Single-Color Formatting
Color Scale Formatting
Summary
Chapter 8: Advanced Tools for Data Analysis
Using Jupyter Notebook
Setting Up a Virtual Environment
Organizing the Notebook
Installing Jupyter and Creating Your First Notebook
Working with Cells
What Is pandas?
Working with Series and Data Frames
Reading and Exploring Large Data Files
Looking at the Data
Viewing Specific Columns and Rows
Summary
Chapter 9: Finding Trends in Reddit Data
Clarifying Our Research Objective
Outlining a Method
Narrowing the Data’s Scope
Selecting Data from Specific Columns
Handling Null Values
Classifying the Data
Summarizing the Data
Sorting the Data
Describing the Data
Summary
Chapter 10: Measuring the Twitter Activity of Political Actors
Getting Started
Setting Up Your Environment
Loading the Data into Your Notebook
Lambdas
Filtering the Data Set
Formatting the Data as datetimes
Resampling the Data
Plotting the Data
Summary
Chapter 11: Where to Go from Here
Coding Styles
Statistical Analysis
Other Kinds of Analyses
Conclusion
Index