Machine Learning for Factor Investing

Machine learning (ML) is progressively reshaping the fields of quantitative finance and algorithmic trading. ML tools are increasingly adopted by hedge funds and asset managers, notably for alpha signal generation and stock selection. The technicality of the subject can make it hard for non-specialists to jump on the bandwagon, as the jargon and coding requirements may seem out of reach. Machine Learning for Factor Investing: Python Version bridges this gap. It provides a comprehensive tour of modern ML-based investment strategies that rely on firm characteristics. The book covers a wide array of subjects, ranging from economic rationales to rigorous portfolio backtesting, and encompasses both data processing and model interpretability. Common supervised learning algorithms such as tree models and neural networks are explained in the context of style investing, and the reader can also dig into more complex techniques like autoencoders for asset returns, Bayesian additive trees, and causal models. All topics are illustrated with self-contained Python code samples and snippets applied to a large public dataset that contains over 90 predictors. The material, along with the content of the book, is available online so that readers can reproduce and enhance the examples at their convenience. If you have even a basic knowledge of quantitative finance, this combination of theoretical concepts and practical illustrations will help you learn quickly and deepen your financial and technical expertise.

Author(s): Guillaume Coqueret; Tony Guida
Publisher: CRC Press LLC
Year: 2023

Language: English
Pages: 358
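
To give a flavor of the style of examples the description mentions, here is a minimal, self-contained Python sketch (not taken from the book) of the kind of workflow it covers: predicting returns from firm characteristics with a tree ensemble. The column names and the simulated data below are placeholders, not the book's actual dataset.

# Hypothetical sketch of the book's general workflow: supervised learning
# on firm characteristics. Data and column names are simulated placeholders.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
n_stocks, n_features = 500, 90          # the book's public dataset has over 90 predictors
X = pd.DataFrame(rng.uniform(0, 1, size=(n_stocks, n_features)),
                 columns=[f"char_{i}" for i in range(n_features)])
# Toy one-month-ahead returns driven by two characteristics plus noise
y = 0.02 * X["char_0"] - 0.01 * X["char_1"] + rng.normal(0, 0.05, n_stocks)

model = RandomForestRegressor(n_estimators=200, max_depth=4, random_state=0)
model.fit(X, y)
pred = model.predict(X)                 # in practice, predictions are made out-of-sample
print(pred[:5])

In the book, such predictions would then feed a portfolio construction and backtesting step, as outlined in Parts II and III of the table of contents below.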

I Introduction

1 Notations and data

1.1 Notations

1.2 Dataset

2 Introduction

2.1 Context

2.2 Portfolio construction: the workflow

2.3 Machine learning is no magic wand

3 Factor investing and asset pricing anomalies

3.1 Introduction

3.2 Detecting anomalies

3.2.1 Challenges

3.2.2 Simple portfolio sorts

3.2.3 Factors

3.2.4 Fama-MacBeth regressions

3.2.5 Factor competition

3.2.6 Advanced techniques

3.3 Factors or characteristics?

3.4 Hot topics: momentum, timing, and ESG

3.4.1 Factor momentum

3.4.2 Factor timing

3.4.3 The green factors

3.5 The links with machine learning

3.5.1 Short list of recent references

3.5.2 Explicit connections with asset pricing models

3.6 Coding exercises

4 Data preprocessing

4.1 Know your data

4.2 Missing data

4.3 Outlier detection

4.4 Feature engineering

4.4.1 Feature selection

4.4.2 Scaling the predictors

4.5 Labelling

4.5.1 Simple labels

4.5.2 Categorical labels

4.5.3 The triple barrier method

4.5.4 Filtering the sample

4.5.5 Return horizons

4.6 Handling persistence

4.7 Extensions

4.7.1 Transforming features

4.7.2 Macroeconomic variables

4.7.3 Active learning

4.8 Additional code and results

4.8.1 Impact of rescaling: graphical representation

4.8.2 Impact of rescaling: toy example

4.9 Coding exercises

II Common supervised algorithms

5 Penalized regressions and sparse hedging for minimum variance portfolios

5.1 Penalized regressions

5.1.1 Simple regressions

5.1.2 Forms of penalizations

5.1.3 Illustrations

5.2 Sparse hedging for minimum variance portfolios

5.2.1 Presentation and derivations

5.2.2 Example

5.3 Predictive regressions

5.3.1 Literature review and principle

5.3.2 Code and results

5.4 Coding exercise

6 Tree-based methods

6.1 Simple trees

6.1.1 Principle

6.1.2 Further details on classification

6.1.3 Pruning criteria

6.1.4 Code and interpretation

6.2 Random forests

6.2.1 Principle

6.2.2 Code and results

6.3 Boosted trees: Adaboost

6.3.1 Methodology

6.3.2 Illustration

6.4 Boosted trees: extreme gradient boosting

6.4.1 Managing loss

6.4.2 Penalization

6.4.3 Aggregation

6.4.4 Tree structure

6.4.5 Extensions

6.4.6 Code and results

6.4.7 Instance weighting

6.5 Discussion

6.6 Coding exercises

7 Neural networks

7.1 The original perceptron

7.2 Multilayer perceptron

7.2.1 Introduction and notations

7.2.2 Universal approximation

7.2.3 Learning via back-propagation

7.2.4 Further details on classification

7.3 How deep we should go and other practical issues

7.3.1 Architectural choices

7.3.2 Frequency of weight updates and learning duration

7.3.3 Penalizations and dropout

7.4 Code samples and comments for vanilla MLP

7.4.1 Regression example

7.4.2 Classification example

7.4.3 Custom losses

7.5 Recurrent networks

7.5.1 Presentation

7.5.2 Code and results

7.6 Other common architectures

7.6.1 Generative adversarial networks

7.6.2 Autoencoders

7.6.3 A word on convolutional networks

7.6.4 Advanced architectures

7.7 Coding exercise

8 Support vector machines

8.1 SVM for classification

8.2 SVM for regression

8.3 Practice

8.4 Coding exercises

9 Bayesian methods

9.1 The Bayesian framework

9.2 Bayesian sampling

9.2.1 Gibbs sampling

9.2.2 Metropolis-Hastings sampling

9.3 Bayesian linear regression

9.4 Naïve Bayes classifier

9.5 Bayesian additive trees

9.5.1 General formulation

9.5.2 Priors

9.5.3 Sampling and predictions

9.5.4 Code

III From predictions to portfolios

10 Validating and tuning

10.1 Learning metrics

10.1.1 Regression analysis

10.1.2 Classification analysis

10.2 Validation

10.2.1 The variance-bias tradeoff: theory

10.2.2 The variance-bias tradeoff: illustration

10.2.3 The risk of overfitting: principle

10.2.4 The risk of overfitting: some solutions

10.3 The search for good hyperparameters

10.3.1 Methods

10.3.2 Example: grid search

10.3.3 Example: Bayesian optimization

10.4 Short discussion on validation in backtests

11 Ensemble models

11.1 Linear ensembles

11.1.1 Principles

11.1.2 Example

11.2 Stacked ensembles

11.2.1 Two-stage training

11.2.2 Code and results

11.3 Extensions

11.3.1 Exogenous variables

11.3.2 Shrinking inter-model correlations

11.4 Exercise

12 Portfolio backtesting

12.1 Setting the protocol

12.2 Turning signals into portfolio weights

12.3 Performance metrics

12.3.1 Discussion

12.3.2 Pure performance and risk indicators

12.3.3 Factor-based evaluation

12.3.4 Risk-adjusted measures

12.3.5 Transaction costs and turnover

12.4 Common errors and issues

12.4.1 Forward-looking data

12.4.2 Backtest overfitting

12.4.3 Simple safeguards

12.5 Implication of non-stationarity: forecasting is hard

12.5.1 General comments

12.5.2 The no free lunch theorem

12.6 First example: a complete backtest

12.7 Second example: backtest overfitting

12.8 Coding exercises

IV Further important topics

13 Interpretability

13.1 Global interpretations

13.1.1 Simple models as surrogates

13.1.2 Variable importance (tree-based)

13.1.3 Variable importance (agnostic)

13.1.4 Partial dependence plot

13.2 Local interpretations

13.2.1 LIME

13.2.2 Shapley values

13.2.3 Breakdown

14 Two key concepts: causality and non-stationarity

14.1 Causality

14.1.1 Granger causality

14.1.2 Causal additive models

14.1.3 Structural time series models

14.2 Dealing with changing environments

14.2.1 Non-stationarity: yet another illustration

14.2.2 Online learning

14.2.3 Homogeneous transfer learning

15 Unsupervised learning

15.1 The problem with correlated predictors

15.2 Principal component analysis and autoencoders

15.2.1 A bit of algebra

15.2.2 PCA

15.2.3 Autoencoders

15.2.4 Application

15.3 Clustering via k-means

15.4 Nearest neighbors

15.5 Coding exercise

16 Reinforcement learning

16.1 Theoretical layout

16.1.1 General framework

16.1.2 Q-learning

16.1.3 SARSA

16.2 The curse of dimensionality

16.3 Policy gradient

16.3.1 Principle

16.3.2 Extensions

16.4 Simple examples

16.4.1 Q-learning with simulations

16.4.2 Q-learning with market data

16.5 Concluding remarks

16.6 Exercises

V Appendix

17 Data description

18 Solutions to exercises

18.1 Chapter 3

18.2 Chapter 4

18.3 Chapter 5

18.4 Chapter 6

18.5 Chapter 7: the autoencoder model and universal approximation

18.6 Chapter 8

18.7 Chapter 11: ensemble neural network

18.8 Chapter 12

18.8.1 EW portfolios

18.8.2 Advanced weighting function

18.9 Chapter 15

18.10 Chapter 16

Bibliography

Index