Hands-On Data Analysis in R for Finance

This textbook is an introduction to data science and data analysis applied to finance, using R and its most recent, freely available extension libraries. It is aimed at undergraduate students majoring in data science and/or finance, at graduate students, and at practitioners or professionals who need a desk reference.

• Assumes no prior knowledge of R;

• The content has been tested in actual university classes;

• Makes the reader proficient in advanced methods such as machine learning, time series analysis, principal component analysis and more;

• Gives comprehensive and detailed explanations of how to use the most recent free resources, such as financial and statistics libraries or open databases on the internet.

Author(s): Jean-Francois Collard
Publisher: CRC Press/Chapman & Hall
Year: 2022

Language: English
Pages: 413
City: Boca Raton

Cover
Half Title
Title Page
Copyright Page
Dedication
Contents
List of Figures
Preface
1. Your Working Environment
1.1. RStudio
1.2. R Notebooks
1.3. Packages
1.4. Specialized Packages for Finance
2. Reading Data in R
2.1. Reading Input (Data) Files
2.2. Reading Excel Files
2.3. Reading Tables
2.4. Packages Come With Datasets
2.5. Reading XML Data
2.6. JSON
2.7. Chapter-End Summary
3. Financial Data
3.1. Yahoo! Finance
3.2. Federal Reserve Economic Data (FRED)
3.3. Nasdaq
3.4. Other Data Sources
4. Introduction to R
4.1. Expressions
4.2. Creating New Variables
4.3. Data Types and Type Conversion
4.4. Vectors
4.5. Matrices
4.6. Lists
4.7. Data Frames
4.8. Time Series
4.9. Data Wrangling
4.10. Exercises
4.10.1. Formatting
4.10.2. Format Conversion
4.10.3. Wrangling Using pivot_longer
4.10.4. Computing Daily Returns From Daily Prices
4.10.5. Histogram of Apple’s Daily Returns
5. Functions
5.1. Calling Existing Functions
5.2. Creating New Functions
5.3. Function Composition (a.k.a. Piping)
5.4. Optimization
5.5. Manipulating Character Strings
5.6. Key Statistics Functions
5.7. Empirical Distributions
5.8. Chapter-End Summary
5.9. Exercises
5.9.1. Histogram of Oil Returns
5.9.2. ECDF of Oil Prices
5.9.3. Peak of Oil Prices
5.9.4. Qnorm
5.9.5. Returns vs Log Returns
5.9.6. Skew and Kurtosis
5.9.7. Function to Calculate Returns
5.9.8. Risk Limit
5.9.9. Probability of Reaching a Profit Target
5.9.10. Finding Most Significant Outlier
6. Data Transformation
6.1. Selecting Rows: Slicing
6.2. Group_by
6.3. Filter
6.4. Arrange
6.5. Rename
6.6. Mutate
6.7. Summarize
6.8. Contingency Tables
6.9. Aggregate
6.10. Chapter-End Summary
6.11. Exercises
6.11.1. Filtering on Either of Two Conditions
6.11.2. Performance by Sector
6.11.3. Ordering and Plotting Returns
6.11.4. Removing NAs
6.11.5. Removing Outliers
6.11.6. Deutsche Bank’s Long-Term Debt
7. Merging Data Sets
7.1. Inner Join
7.2. Left Join
7.3. Right Join
7.4. Full Join (a.k.a. Outer Join)
7.5. Merging Nasdaq Datasets
7.6. Chapter-End Summary
7.7. Exercises
7.7.1. The Zacks EE Dataset
7.7.2. Merging Dividend and Split Data
8. Graphing Using Ggplot
8.1. The Grammar of Ggplot Commands
8.2. Geometric Objects
8.3. Separating by Color
8.4. Separating by Size
8.5. Separating by Shape
8.6. Curves of Best Fit
8.7. Case Study: The House Price Dataset
8.8. Case Study: The Ocean Portfolio
8.9. Exercises
8.9.1. Change the Marker Shape by Region
8.9.2. Change the Marker Color by Price
8.9.3. Market Cap by Countries
9. Returns and Returns-based Statistics
9.1. Single-Period Returns
9.2. Multiple Periods
9.3. Prices and Adjusted Prices
9.4. Returns
9.5. Volatility
9.6. Sharpe
9.7. Drawdowns
9.8. Benchmark-Relative Performance and Risk
9.9. Rolling Correlations
9.10. Normality of Return Distributions
9.11. Fitting A Distribution
9.12. Are Differences in Returns Significant?
9.13. Exercises
9.13.1. Verifying GM’s and Ford’s Returns
9.13.2. Computing Monthly Percentage Changes of Oil Prices
9.13.3. Comparing Returns and Log-Returns
9.13.4. Worst and Best Days for Bitcoin
9.13.5. Bull Beta
10. Portfolios
10.1. Building Portfolios Using Tidyquant
10.2. Building Portfolios Using PerformanceAnalytics
10.3. Portfolio Optimization
10.4. Exercises
10.4.1. Correlation Matrix
10.4.2. Improving the Portfolio Growth Graph
10.4.3. Portfolio of Hedge Funds
10.4.4. Larger Search Space
11. Modeling Returns & Simulations
11.1. Normal and Log-normal Models
11.2. Log-normal Model – Multi-period Return
11.3. Random Walk
11.4. Geometric Random Walk
11.5. Toward Simulations
11.6. The Multiple Questions Simulations Can Answer
11.7. Exercises
11.7.1. Probability of a Loss
12. Linear and Polynomial Regression
12.1. The House Price Dataset
12.2. Multi-linear Regression
12.3. Collinearity
12.4. Variance Inflation Factor
12.5. ANOVA
12.6. Response Transformation
12.7. Linear Regression with Categorical Variables
12.8. Polynomial Regression
12.9. Exercises
12.9.1. Collinearity
12.9.2. Order of Independent Variables in Multi-linear Regressions
13. Fixed Income
13.1. Present Value
13.2. Present Value of Coupon Bonds
13.3. Exercises
13.3.1. Alternative Formula for the Present Value of a Coupon Bond
13.3.2. Modified Duration
13.3.3. Yield to Maturity
14. Principal Component Analysis
14.1. Directions of Most Variance
14.2. Application to a Full Example
14.3. How Much Variance is Explained by Each Principal Component?
14.4. Chapter-End Summary
14.5. Exercises
14.5.1. PCA on Rates
14.5.2. PCA on ACWI
15. Options
15.1. European Options
15.2. American Options
15.3. Embedded Optionality in Callable Bonds
15.4. Exercises
15.4.1. Black-Scholes
15.4.2. Plot d1 as a Function of Time
16. Value at Risk
16.1. Parametric VaR
16.2. Nonparametric VaR
16.3. Calculating VaR Using the Covariance Matrix
16.4. Conditional Value at Risk
16.5. Calculating VaR Using PerformanceAnalytics
16.6. Calculating VaR Using Tidyquant
16.7. Chapter-End Summary
16.8. Exercises
16.8.1. How Sensitive is VaR to α, Revisited
16.8.2. Comparing VaR Methods
16.8.3. Comparing CVaR Methods
16.8.4. Rolling VaR
16.8.5. Non-parametric VaR
17. Time Series Analysis
17.1. ACFs and PACFs
17.2. But What Are These Autoregressive (AR) and Moving Average (MA) Models?
17.3. Fitting a Model
17.4. Forecasting
17.5. First Differencing, or Integrated Model?
17.6. A Digression: The Intuition of the ACF Values
18. Machine Learning
18.1. Supervised Algorithms
18.2. KNN
18.3. Logistic Regression
18.4. Decision Tree
18.5. Regression Trees (Supervised)
18.6. K-Means Clustering
18.7. Hierarchical Clustering
18.8. Chapter-End Summary
18.9. Exercises
18.9.1. K-means Clustering on GICS Industries
18.9.2. Hierarchical Clustering on P/CF and ROE
19. Presenting the Results of Your Analyses
19.1. Markdown Documents
19.2. Shiny
20. Appendix: Main Packages Seen in this Book
Index