The Fundamentals of People Analytics: With Applications in R

Human capital is an organization’s most important asset. Without the knowledge and skills of people, an organization can accomplish nothing. The acquisition, development, and retention of critical talent have become increasingly complex and challenging, and organizations are making significant investments to gain a deeper, data-informed understanding of the organizational phenomena that affect the bottom line.

This open access book prepares current and aspiring analytics professionals to address this need effectively by curating key concepts spanning the entire analytics lifecycle, along with step-by-step instructions for applying them to real-world problems using ubiquitous, freely available open-source software. The book does not assume prior knowledge of statistics, how to query databases, or how to write performant code; early chapters include an introduction to R and SQL as well as an overview of statistical foundations.

By the end of this book, readers will be able to:
• Design and conduct empirical research
• Query and wrangle data using SQL
• Profile, clean, and analyze data using R
• Apply appropriate statistical and ML models to a range of people analytics use cases
• Package and present analyses to communicate impactful insights to stakeholders

Author(s): Craig Starbuck
Publisher: Springer
Year: 2023

Language: English
Pages: 385
City: Cham

Foreword
Preface
Contents
Getting Started
Guiding Principles
Pro-Employee Thinking
Quality
Prioritization
Tooling
Data Sets
Employees
Turnover Trends
Survey Responses
4D Framework
Introduction to R
Getting Started
Installing R
Installing R Studio
Installing Packages
Loading Data
Case Sensitivity
Help
Objects
Comments
Testing Early and Often
Vectors
Vectorized Operations
Matrices
Factors
Data Frames
Lists
Loops
User-Defined Functions (UDFs)
Graphics
Review Questions
Introduction to SQL
Basics
Aggregate Functions
Joins
Subqueries
Virtual Tables
Window Functions
Common Table Expressions (CTEs)
Review Questions
Research Design
Research Questions
Research Hypotheses
Internal and External Validity
Research Methods
Research Designs
Experimental Research
Quasi-Experimental Research
Non-Experimental Research
Review Questions
Measurement and Sampling
Variable Types
Independent Variables (IV)
Dependent Variables (DV)
Control Variables (CV)
Moderating Variables
Mediating Variables
Endogenous vs. Exogenous Variables
Measurement Scales
Discrete Variables
Nominal
Ordinal
Continuous Variables
Interval
Ratio
Sampling Methods
Probability Sampling
Simple Random Sampling
Stratified Random Sampling
Cluster Sampling
Systematic Sampling
Non-Probability Sampling
Convenience (Accidental) Sampling
Quota Sampling
Purposive (Judgmental) Sampling
Sampling and Nonsampling Error
Sampling Error
Selection Bias
Nonsampling Error
Nonresponse Bias
Nontruthful Responses
Measurement Error
Scale Reliability and Validity
Reliability
Validity
Face Validity
Content Validity
Construct Validity
Criterion-Related Validity
Review Questions
Data Preparation
Data Extraction
Data Architecture
Data Lake
Data Warehouse
Data Mart
Database Normalization
Modern Data Infrastructure
Data Screening and Cleaning
Missingness
Outliers
Low Variability
Inconsistent Categories
Data Binning
One-Hot Encoding
Feature Engineering
Review Questions
Descriptive Statistics
Univariate Analysis
Measures of Central Tendency
Mean
Median
Mode
Range
Measures of Spread
Variance
Standard Deviation
Quartiles
Skewness
Kurtosis
Bivariate Analysis
Covariance
Correlation
Review Questions
Statistical Inference
Introduction to Probability
Probability Distributions
Discrete Probability Distributions
Continuous Probability Distributions
Conditional Probability
Central Limit Theorem
Confidence Intervals
Hypothesis Testing
Alpha
Type I & II Errors
p-Values
Bonferroni Correction
Statistical Power
Review Questions
Analysis of Differences
Parametric vs. Nonparametric Tests
Differences in Discrete Data
Chi-Square Test
Fisher's Exact Test
Differences in Continuous Data
Independent Samples t-Test
Mann-Whitney U Test
Paired Samples t-Test
Wilcoxon Signed-Rank Test
Analysis of Variance (ANOVA)
Factorial ANOVA
Review Questions
Linear Regression
Assumptions and Diagnostics
Sample Size
Simple Linear Regression
Multiple Linear Regression
Collinearity Diagnostics
Variable Selection
Moderation
Mediation
Review Questions
Linear Model Extensions
Model Comparisons
Hierarchical Regression
Multilevel Models
Polynomial Regression
Review Questions
Logistic Regression
Binomial Logistic Regression
Multinomial Logistic Regression
Ordinal Logistic Regression
Review Questions
Predictive Modeling
Cross-Validation
Validation Set Approach
Leave-One-Out
k-Fold
Model Performance
Classification
Forecasting
Bias–Variance Tradeoff
Tree-Based Algorithms
Decision Trees
Random Forests
Predictive Modeling
Classification
Forecasting
Review Questions
Unsupervised Learning
Factor Analysis
Exploratory Factor Analysis (EFA)
Confirmatory Factor Analysis (CFA)
Clustering
K-Means Clustering
Hierarchical Clustering
Review Questions
Data Visualization
Best Practices
Color Palette
Chart Borders
Zero Baseline
Intuitive Layout
Preattentive Attributes
Step-by-Step Visual Upgrade
Step 1: Build Bar Chart with Defaults
Step 2: Remove Legend
Step 3: Assign Colors Strategically
Step 4: Add Axis Titles and Margins
Step 5: Add Left-Justified Title
Step 6: Remove Background
Step 7: Remove Axis Ticks
Step 8: Mute Titles
Step 9: Flip Axes
Step 10: Sort Data
Visualization Types
Tables
Heatmaps
Scatterplots
Line Graphs
Slopegraphs
Bar Charts
Combination Charts
Waterfall Charts
Waffle Charts
Sankey Diagrams
Pie Charts
3D Visuals
Elegant Data Visualization
Review Questions
Data Storytelling
Know Your Audience
Production Status
Structural Elements
TL;DR
Purpose
Methodology
Results
Limitations
Next Steps
The Appendix
Q&A
Review Questions
Appendix
4D Framework
Discover
Client
Primary Objective
Problem Statement
Guiding Theories
Research Questions
Research Hypotheses
Assumptions
Cadence
Aggregation
Deliverable
Filters and Dimensions
Design
Data Privacy
Data Sources and Elements
Data Quality
Variables
Analysis Method
Dependencies
Change Management
Sign-Off
Develop
Development Patterns
Productionalizable Code
Unit Testing
User Acceptance Testing (UAT)
Deliver
Data Visualization
Step-by-Step Visual Upgrade
Tables
Heatmaps
Scatterplots
Line Charts
Slopegraphs
Bar Charts
Combination Charts
Waterfall Charts
Waffle Charts
Sankey Diagrams
Pie Charts
Bibliography
Index