SAS Institute, 2014. — 120 p. — ISBN: 1612909043, 9781612909042
Unstructured data is the most voluminous form of data in the world, and analysts rarely receive it in perfect condition for processing. In other words, you often need to clean, transform, and enhance your source data before you can use it and derive value from it, especially where textual data is concerned. In "Introduction to Regular Expressions in SAS", SAS programmers of virtually all skill levels will learn how to harness the power of Regular Expressions within the SAS programming language for a wide array of everyday applications in unstructured data analysis. This book uses a practical, examples-based approach to walk you through using Regular Expressions for unstructured data processing, and provides you with the foundational information and examples to perform advanced applications. From fuzzy matching to data extraction, this book is a critical reference for any advanced analytics practitioner who needs to leverage SAS software to effectively process their data. This book is part of the SAS Press Program.
About the Author
K. Matthew Windham, CAP, is the director of analytics at NTELX Inc., an analytics and technology solutions consulting firm located in the Washington, DC area. His focus is on helping clients improve their daily operations through the application of mathematical and statistical modeling, data and text mining, and optimization. A longtime SAS user, Matt enjoys leveraging the breadth of the SAS platform to create innovative, predictive analytics solutions. During his career, Matt has led consulting teams in mission-critical environments to provide rapid, high-impact results. He has also architected and delivered analytics solutions across the federal government, with a particular focus on the US Department of Defense and the US Department of the Treasury. Matt is a Certified Analytics Professional (CAP) who received his BS in Applied Mathematics from N.C. State University and his MS in Mathematics and Statistics from Georgetown University.