Author(s): Giuseppe Ciaburro
Year: 0
Language: English
Pages: 416
Cover......Page 1
Title Page......Page 2
Copyright and Credits......Page 3
Packt Upsell......Page 4
Contributors......Page 5
Table of Contents......Page 7
Preface......Page 11
Chapter 1: Getting Started with Regression......Page 16
Going back to the origin of regression......Page 17
Regression in the real world......Page 21
Understanding regression concepts......Page 23
Regression versus correlation......Page 27
Discovering different types of regression......Page 30
The R environment......Page 33
Installing R......Page 36
Using precompiled binary distribution......Page 38
Installing on Linux......Page 39
RStudio......Page 40
The R stats package......Page 43
The car package......Page 44
The MASS package......Page 46
The caret package......Page 47
The glmnet package......Page 48
The sgd package......Page 49
The lars package......Page 50
Summary......Page 51
Chapter 2: Basic Concepts – Simple Linear Regression......Page 52
Association between variables – covariance and correlation......Page 53
Searching linear relationships......Page 60
Least squares regression......Page 64
Creating a linear regression model......Page 73
Exploring model results......Page 78
Diagnostic plots......Page 81
Modeling a perfect linear association......Page 87
Summary......Page 91
Chapter 3: More Than Just One Predictor – MLR......Page 92
Multiple linear regression concepts......Page 93
Building a multiple linear regression model......Page 99
Categorical variables......Page 105
Building a model......Page 106
Gradient Descent and linear regression......Page 112
Gradient Descent......Page 114
The sgd package......Page 116
Linear regression with SGD......Page 117
Polynomial regression......Page 121
Summary......Page 128
Chapter 4: When the Response Falls into Two Categories – Logistic Regression......Page 130
Understanding logistic regression......Page 131
The logit model......Page 133
Simple logistic regression......Page 136
Multiple logistic regression......Page 144
Customer satisfaction analysis with multiple logistic regression......Page 145
Multiple logistic regression with categorical data......Page 152
Multinomial logistic regression......Page 165
Summary......Page 171
Chapter 5: Data Preparation Using R Tools......Page 172
A first look at data......Page 173
Change datatype......Page 176
Removing empty cells......Page 178
Missing values......Page 179
Treatment of NaN values......Page 182
Finding outliers in data......Page 183
Scale of features......Page 190
Min–max normalization......Page 191
Z-score standardization......Page 194
Discretization in R......Page 196
Data discretization by binning......Page 197
Data discretization by histogram analysis......Page 200
Principal Component Analysis......Page 204
Summary......Page 215
Chapter 6: Avoiding Overfitting Problems – Achieving Generalization......Page 217
Understanding overfitting......Page 218
Overfitting detection – cross-validation......Page 221
Feature selection......Page 236
Stepwise regression......Page 237
Regression subset selection......Page 245
Ridge regression......Page 253
Lasso regression......Page 262
ElasticNet regression......Page 269
Summary......Page 272
Chapter 7: Going Further with Regression Models......Page 273
Robust linear regression......Page 274
Bayesian linear regression......Page 282
Basic concepts of probability......Page 283
Bayes' theorem......Page 289
Bayesian model using BAS package......Page 291
Count data model......Page 299
Poisson distributions......Page 300
Poisson regression model......Page 302
Modeling the number of warp breaks per loom......Page 304
Summary......Page 310
Chapter 8: Beyond Linearity – When Curving Is Much Better......Page 311
Nonlinear least squares......Page 312
Multivariate Adaptive Regression Splines......Page 318
Generalized Additive Model......Page 330
Regression trees......Page 337
Support Vector Regression......Page 346
Summary......Page 350
Chapter 9: Regression Analysis in Practice......Page 352
Random forest regression with the Boston dataset......Page 353
Exploratory analysis......Page 354
Multiple linear model fitting......Page 364
Random forest regression model......Page 368
Classifying breast cancer using logistic regression......Page 373
Exploratory analysis......Page 375
Model fitting......Page 380
Regression with neural networks......Page 390
Exploratory analysis......Page 392
Neural network model......Page 396
Summary......Page 408
Other Books You May Enjoy......Page 409
Index......Page 412