This Straight to the Point guide provides an introduction to data cleansing, which also goes by names such as data munging and data wrangling. Whatever the name, it basically means doing what needs to be done to make data useful and trustworthy. Data cleansing can include the following tasks:
- Deleting unnecessary headers
- Deleting summary rows
- Filling in gaps
- Flattening a report
- Merging and appending data from multiple sources
- Pulling data from source X to complete data in source Y
- Splitting names from addresses
- Identifying and deleting duplicate records
- Converting units of measurement in multiple sources
Author(s): Oz du Soleil
Publisher: Independent Publishers Group
Year: 2019
Language: English
Commentary: Excel guide: an introduction to data cleansing, data munging, and data wrangling
Pages: 78
Tags: Excel guide, introduction to data cleansing, data munging, data wrangling
About the Author
Acknowledgments
INTRODUCTION
A Data Cleansing Example
Data Cleansing as a Skill
The Straight to the Point Ethos
CORRECTING NAMES: PROPER CASE
COMPARING LISTS: WHAT’S OVER HERE THAT’S NOT OVER THERE?
Invitations and Responses (Match It All Up!)
Determining What’s over There That’s Not over Here
A Word About Strategy
PEELING, PARSING, AND SEGMENTING
Extracting the First Name (Using Flash Fill)
Splitting by a Single Delimiter: Separating the City from the Name
Splitting into Rows: Getting Those People Out of There!
IDENTIFYING DUPLICATE RECORDS: FUZZY MATCHING
Excel’s Duplicate Remover: The Hazard!
Reality, Context, and Strategy: Flagging Records for Review Instead of Clearing Duplicates
MERGING AND APPENDING MULTIPLE WORKBOOKS
FROM USELESS TO USEFUL: FLATTENING A REPORT
Let’s Flatten Some Stuff!
FINAL THOUGHTS
Index