Adventures In Financial Data Science: The Empirical Properties Of Financial And Economic Data

This book provides insights into the true nature of financial and economic data, and is a practical guide on how to analyze a variety of data sources. The focus of the book is on finance and economics, but it also illustrates the use of quantitative analysis and data science in many different areas. Lastly, the book includes practical information on how to store and process data and provides a framework for data-driven reasoning about the world.

The book begins with entertaining tales from Graham Giller's career in finance: speculating in UK government bonds at the Oxford Post Office, accidentally creating a global instant messaging system that went "viral" before anybody knew what that meant, being the person who forgot to hit "enter" to run a hundred-million-dollar statistical arbitrage system, what he decoded from his brief time spent with Jim Simons, and giving Michael Bloomberg a tutorial on Granger causality. The majority of the content is a narrative of analytic work done on financial, economic, and alternative data, structured around both Dr Giller's professional career and some of the things that simply interested him. The goal is to stimulate interest in predictive methods, to give accurate characterizations of the true properties of financial, economic, and alternative data, and to share what Richard Feynman described as "The Pleasure of Finding Things Out."

Author(s): Graham L. Giller
Series: World Scientific Series In Finance, 19
Edition: 2
Publisher: World Scientific
Year: 2022

Language: English
Pages: 511
City: Singapore

Contents
Preface
About the Author
List of Figures
List of Tables
Chapter 1. Biography and Beginnings
1.1. About this Book
1.1.1. What this Book is?
1.1.2. What this Book is Not?
1.2. Family
1.2.1. About Me
1.2.2. Grandad and the Oil Tanker
1.2.3. Grandma, Bletchley Park, and Partial Differential Equations
1.2.4. Grandpa, Dad, and Caring about People
1.2.5. My Family
1.3. Oxford, Physics and Bond Trading
1.3.1. Physics
1.3.2. Bond Trading
1.3.3. Electronics
1.4. Morgan Stanley and P.D.T.
1.4.1. The Japanese Warrants Desk
1.4.2. Creating a Global Instant Messaging Infrastructure by Accident
1.4.3. Statistical Arbitrage
1.4.4. Process Driven Trading
1.5. Self Employed
1.6. Professional Data Science
1.6.1. Consulting
1.6.2. Bloomberg
1.6.3. Artificial Intelligence
1.6.4. Data Science is Actually Science
1.6.5. Statistics and the Multiverse
1.6.6. We Learn as We Work
Chapter 2. Financial Data
2.1. Modeling Asset Prices as Stochastic Processes
2.1.1. Geometric Brownian Motion
2.1.2. Randomness in Finance
2.1.3. The Golden Rule of Prediction
2.1.4. Linear Additive Noise
2.2. Abnormality of Financial Distributions
2.2.1. Calendars and Returns
2.2.2. Martingales and Markov Processes
2.2.3. Daily Returns of the S&P 500 Index
2.2.4. Temporal Invariance
2.2.5. Heteroskedasticity
2.2.6. GARCH
2.2.7. Stock Market Returns are Not Normally Distributed
2.3. The US Stock Market Through Time
2.3.1. Drift and Momentum
2.3.2. Kurtosis
2.3.3. Recessions
2.3.4. The Basic Properties of Index Returns
2.4. Interest Rates
2.4.1. The Long-Term Properties of Three Month Treasury Bill Rates
2.4.1.1. The Time Series of Rates
2.4.1.2. The Distribution of Changes in Interest Rates
2.4.1.3. A Daily Vasicek-GARCH(1, 1) Model with Quiet Days Censored
2.4.1.4. Testing for Independence in the Direction of Changes of Interest Rates
2.4.1.5. A Compound Model for Interest Rate Changes
2.4.1.6. Estimation of the Markov Chain for Rate Change Direction
2.4.1.7. Estimation of the Scale Distribution for Rate Changes
2.4.2. The Basic Properties of Interest Rates
2.5. LIBOR and Eurodollar Futures
2.5.1. LIBOR
2.5.1.1. My impressions of LIBOR from the 1990s
2.5.1.2. LIBOR as a Time Series Compared to Treasury Rates
2.5.1.3. Whittle Test for Independence of the Sequences of Direction of Daily Change of LIBOR Rates
2.5.1.4. Estimated Markov Chain Transition Matrix for Direction of Daily Change of LIBOR Rates
2.5.2. Biased Expectations in Eurodollar Futures
2.5.2.1. Futures Pricing Relationships
2.5.2.2. A Simple Test for Biased Expectations for Near Quarterly Eurodollar Futures
2.5.2.3. Tests for Biased Expectations for Other Quarterly Eurodollar Futures
2.5.2.4. A Model for the Daily Change in Eurodollar Futures Prices
2.5.2.5. Estimation of the Eurodollar Futures Variance Model Contract by Contract
2.5.2.6. Test of a Model for Eurodollar Futures Price Changes Including a Risk Premium and Momentum
2.6. Asymmetric Response
2.6.1. Motivation
2.6.2. The Functional Form of $\Delta\sigma^2_t(r_{t-1})$
2.6.3. Asymmetric Response in Index Returns
2.6.3.1. Fitting the GJR Model to the Entire History of the S&P 500 Index
2.6.3.2. Piecewise Quadratic GARCH
2.6.3.3. Visualization of the Form of $\Delta\sigma^2_t(r_{t-1})$
2.6.3.4. Asymmetric Response in Index Returns through Time
2.6.4. Asymmetric Response in Rates
2.6.4.1. Treasury Bill Rates
2.6.4.2. Exchange Rates: British Pounds
2.6.5. Asymmetric Response in Individual Stock Returns
2.6.5.1. A Note on Survivorship Bias
2.6.5.2. Abnormality of Individual Stock Returns
2.6.5.3. Heteroskedasticity in Individual Stock Returns
2.6.5.4. Asymmetric Response in Individual Stock Volatility
2.6.5.5. Autocorrelation in Individual Stock Returns
2.6.5.6. The Market Factor in Individual Stock Returns
2.6.5.7. The Basic Properties of Individual Equity Returns
2.6.6. An Agent Based Model for Asymmetric Response
2.6.6.1. Stock Returns with Volatility Generated by Noise Traders, Stock Specific News and the Market
2.6.6.2. An Autoregressive Model for the Number of Noise Traders
2.6.6.3. A Model for Downside Response in Variance
2.6.6.4. The Aggregate Individual Stock Variance Process
2.6.6.5. Index Variance Reduction due to the Response of Noise Traders to Positive Return Shocks
2.6.7. Concluding Remarks
2.7. Equity Index Options
2.7.1. The Returns of Basic Option Strategies
2.7.1.1. Definition of Strategies
2.7.1.2. Distributions of Basic Option Strategy Monthly Returns
2.7.1.3. Bootstrapping the Excess Return of a Call Buying Strategy
2.7.2. The Price at which the Entire Option Market Breaks Even
2.7.2.1. The Market Clearing Price for an Option Market
2.7.2.2. The Average Discount between Option Market Forward Prices and Spot Prices
2.7.2.3. Properties of the Risk Premium Rate of Option Market Forward Prices
2.8. The VIX Index
2.8.1. The Relationship between Implied Volatility and Empirical Volatility
2.8.1.1. Mean Reverting Nature of GARCH Models
2.8.1.2. Computing the Average Forward Variance over 30 Days
2.8.1.3. Comparison of the VIX with an Empirically Valid Volatility Model
2.8.1.4. Test of the Variance Linearity Hypothesis for the VIX
2.8.2. The Relationship between Market Returns and Implied Volatility
2.8.2.1. Simple Cross-Sectional Regression between Returns of the VIX and the Market
2.8.2.2. A Word of Caution on Polynomial Models
2.9. Microwave Latency Arbitrage
2.9.1. The Signal
2.9.1.1. Basic Metrology
2.9.1.2. Price Changing Ticks
2.9.1.3. Response of SPY to Price Changing Ticks in ES Futures
2.9.1.4. The Effect of Book Imbalance
2.9.1.5. Microstructure is the Key
2.10. What I’ve Learned about Financial Data
Chapter 3. Economic Data and Other Time-Series Analysis
3.1. Non-Farm Payrolls
3.1.1. Time-Series Prediction of Non-Farm Payrolls before COVID-19
3.1.2. Details of the GJR-GARCH Model
3.1.3. Student’s t Distribution
3.1.4. A GJR-GARCH Model with Student’s t Distribution
3.1.5. Bootstrapping the Change in Akaike Information Criterion
3.1.6. Predictive Model Selection Using the AIC
3.1.7. Out-of-Sample Performance of the Model
3.1.8. Time-Series Prediction of Non-Farm Payrolls During the Coronavirus Outbreak
3.1.9. A Distribution with Even Fatter Tails
3.2. Initial Claims
3.2.1. Predicting Initial Claims from Google Search Trends
3.2.2. Google Trends
3.2.3. Details of the Covariance between Google Trends and Initial Claims
3.2.4. Time Varying Coefficients Models
3.2.5. Fitting a Simple Linear Model and a TVC Model to the Google Trends Data
3.3. Twitter
3.3.1. Do Twitter Users Believe Traders Have Hot Hands?
3.3.1.1. The Experiment
3.3.1.2. Analysis of Variance
3.3.2. #nfpguesses
3.3.3. Lexical Sentiment
3.3.3.1. Twitter Sentiment Analysis for Predicting the Market
3.3.3.2. Analysis of Twitter Users in 2012
3.3.3.3. Characteristics of Sentiment Graded Tweets
3.3.3.4. Relationship of Sentiment Graded Tweets to Market Index Returns
3.3.3.5. My Opinion on Lexical Sentiment and the Market
3.3.4. Sentiment and the Macroeconomy
3.3.4.1. Cloud Services for NLP
3.3.4.2. Collecting Sentiment Data from Twitter
3.3.4.3. Time Series of Twitter Organic Sentiment for the USA
3.3.4.4. Correlation of Sentiment with Initial Claims
3.3.4.5. Panel Regression Analysis
3.3.4.6. Panel Regression of Initial Claims onto Twitter Sentiment
3.4. Analysis of Climate Data
3.4.1. The Central England Temperature Series
3.4.1.1. A Bootstrap Analysis of the Non-Parametric Regression
3.4.2. A Seasonal Autoregressive Model for the Central England Temperature
3.4.2.1. Issues with Calendar Dates in Long-Time Series
3.4.2.2. A Note on Product Models and Additive Models
3.4.2.3. Selection of the Autoregressive and Seasonal Model Order for the Pre-Industrial Period
3.4.2.4. Structure of the Optimized Model
3.4.2.5. Non-Normality in the Central England Temperature
3.4.2.6. A Note on High Order Autoregressive Models
3.4.3. Heteroskedasticity in the Central England Temperature
3.4.3.1. Autocorrelation Function of the Residuals
3.4.3.2. Autocorrelation Function of the Squared Residuals
3.4.3.3. Structural Heteroskedasticity
3.4.3.4. Confidence Intervals and Likelihood Ratios
3.4.3.5. Identification of the Optimal Model with Structural Heteroskedasticity
3.4.4. Comparing Models Estimated in the Pre-Industrial and Industrial Periods
3.4.4.1. Errors in the Variables
3.4.4.2. Comparing Parameters Estimated for Two Distinct Periods
3.4.4.3. Consistency of the Autoregressive Models
3.4.4.4. Consistency of the Temperature Trend
3.4.5. Forecasting the Central England Temperature in the Modern Period
3.4.5.1. Forecasting Skill
3.4.5.2. Comparison of the Predictive Models
3.4.6. Does the Central England Temperature Support an Upward Trend?
3.4.7. Do Sunspot Numbers Explain Temperature Changes?
3.4.7.1. The Mean Monthly Sunspot Number
3.4.7.2. The Box–Cox Transformation
3.4.7.3. The Statistics of Counting
3.4.7.4. Fitting a Model with Transformed Sunspot Number
3.4.7.5. Results
3.4.8. My Motivation for this Work and What I Learned
3.5. Sunspots
3.5.1. The Solar Cycle
3.5.2. Markov Switching Models
3.5.2.1. General Structure
3.5.2.2. Specific Model for Sunspots
3.5.2.3. Interpretation of the Model
3.5.3. The Fitted Model
3.5.3.1. Selection of the Autoregressive Model Order
3.5.3.2. Estimated Parameters for the Optimal Model
3.5.3.3. Accuracy of In-Sample Predictions
3.5.3.4. Predictions of Sunspot Number in the Modern Period
3.5.4. Summary of This Work
Chapter 4. Politics, Schools, Public Health, and Language
4.1. Presidential Elections
4.1.1. The New York Times Data
4.1.1.1. The 2016 Election
4.1.2. Discrete Dependent Variables
4.1.2.1. PROBIT and Logit Models
4.1.2.2. Maximum Likelihood Estimation
4.1.2.3. Other Classification Algorithms
4.1.3. A Generalized Linear Model
4.1.3.1. The Baseline Model
4.1.3.2. Testing Predictors Independently
4.1.3.3. Results of the Independent Predictor Regressions
4.1.3.4. Joint Model for Elections from 1896 to 2012
4.1.3.5. Precision and Recall
4.1.3.6. Bootstrapping the F-Score Optimization
4.1.4. A Naïve Bayes Classifier for Presidential Elections
4.1.4.1. Naïve Bayes
4.1.4.2. Training a Naïve Bayes Classifier
4.1.4.3. Precision and Recall for Naïve Bayes
4.1.4.4. Correlation between Classifier Probabilities
4.1.5. The Trend Toward Taller Candidates
4.2. School Board Elections
4.2.1. How the Election Works
4.2.2. The Data
4.2.3. A Model for Independent Win Probabilities
4.2.4. Estimation Results
4.2.4.1. Results of the Independent Wins Regression
4.2.4.2. Predicted Rankings for the 2019 Election
4.2.5. A Poisson Model for Vote Counts
4.2.5.1. The Model for Candidate Vote Counts
4.2.6. Estimation Results
4.2.6.1. Results of the Independent Vote Counts Regression
4.2.6.2. Expression of the Outcome in terms of Vote Share
4.2.7. Election Results
4.3. Analysis of Public Health Data
4.3.1. The Data
4.3.2. Body Mass Index
4.3.2.1. “Normal” BMI
4.3.2.2. The Relationship between Human Body Weight and Height
4.3.2.3. The Extreme Value Distribution (Gumbel Distribution)
4.3.2.4. A Model for Human Body Weight
4.3.2.5. Estimated Parameters
4.3.2.6. Correcting for the Dependence of Weight on Age
4.3.3. Testing Dietary and Activity Variables
4.3.3.1. Analytical Approach
4.3.3.2. Alcohol Consumption
4.3.3.3. Exercise
4.3.3.4. French Fries
4.3.3.5. What Should My Weight Be?
4.3.4. Survey Design and Causality
4.3.5. Variation of Parameter Estimates with Survey Year
4.3.5.1. Demographic Parameters
4.3.5.2. Behavioural Parameters
4.3.6. Variation of Population Aggregates with Survey Year
4.3.7. What I Learned from this Analysis
4.4. Statistical Analysis of Language
4.4.1. Zipf’s Law and Random Composition
4.4.2. Empirical Distributions of Words in English Language Corpora
4.4.2.1. The Natural Language Toolkit (NLTK)
4.4.2.2. The Corpora
4.4.2.3. An Extended Zipf–Mandelbrot Law
4.4.2.4. Frequency Analysis
4.4.2.5. Empirical Results
4.4.2.6. Conclusions from Frequency Analysis
4.4.3. Simulation of the Frequency-Rank Distribution for Naive Pseudowords Generated from the Brown Corpus
4.4.3.1. Alphabet Statistics
4.4.3.2. Pseudoword Generation
4.4.3.3. Bootstrapping the Brown Corpus Pseudowords
4.4.3.4. Analysis of Unordered Pseudowords
4.4.3.5. Analysis of Semi-Ordered Pseudowords
4.4.3.6. Summary of the Pseudoword Studies
4.5. Learning from a Mixed Bag of Studies
Chapter 5. Demographics and Survey Research
5.1. Machine Learning Models for Gender Assignment
5.1.1. The Data
5.1.1.1. Creating Features
5.1.1.2. Random Subsampling
5.1.2. Logistic Regression
5.1.3. A Regression Tree Model
5.1.4. Machine Learning and Language
5.1.5. A Note on Non-Binary Gender
5.2. Bayesian Estimation of Demographics
5.2.1. Methodology
5.2.2. Interesting Marginal Results
5.2.2.1. The Probable Age of “Veronica”
5.2.2.2. The Likely Gender of “Leslie”
5.3. Working with Patreon
5.3.1. The Data
5.3.2. Raking Weights
5.3.3. Patreon Cancellation Surveys and Consumer Sentiment
5.3.3.1. The University of Michigan’s Index of Consumer Sentiment
5.3.3.2. “Not Financial Instability”
5.3.3.3. The Relationship between Consumer Sentiment and Pledge Cancellation
5.3.3.4. A Granger Causality Analysis of Patreon Data
5.3.3.5. A Linear Predictive Model
5.3.3.6. Results of the Granger Test
5.4. Survey and Opinion Research
5.4.1. Consumer Sentiment
5.4.1.1. Exploratory Data Analysis
5.4.1.2. A Time-Varying Coefficients Model
5.4.2. Consumer Expectations of Inflation
5.4.2.1. Exploratory Data Analysis
5.4.2.2. A Time-Varying Coefficients Model
5.4.3. The Residuals to the Time-Varying Coefficients Model
5.4.4. Response Times and Question Complexity
5.4.4.1. Hick’s Law
5.4.4.2. Empirical Results
5.4.5. Respondent Honesty
5.4.5.1. The Giller Investments Employment Situation Survey
5.4.5.2. Analysis of Cross-tabulations
5.5. Working with China Beige Book
5.5.1. Modeling Fundamental Company Data
5.5.2. The China Beige Book Data
5.5.3. Predicting the Quarterly Revenues of Wynn Resorts
5.5.3.1. A Simple Model
5.5.3.2. Results
5.5.3.3. The Effect of Coronavirus
5.6. Generalized Autoregressive Dirichlet Multinomial Models
5.6.1. The Dirichlet Distribution
5.6.2. The Multinomial Distribution
5.6.3. The Dirichlet–Multinomial Distribution
5.6.4. The Temporal Evolution of Opinion and Sentiment
5.6.5. Differing Time-Horizons
5.6.6. Use of the Dirichlet–Multinomial Distribution
5.6.7. Simple ARMA(1, 1) Models
5.6.8. Special Cases
5.6.9. Properties of the Dirichlet Distribution and the Dirichlet Multinomial Distribution
5.6.10. Unconditional Means
5.6.11. Unconditional Mean of the Norm of the Concentration Vector
5.6.12. Lead 1 Forecasting
5.6.13. Lead m Forecasting and Stationarity
5.6.14. Aggregate Conditions for Stationarity of the Process
5.6.15. In Conclusion
5.7. Presidential Approval Ratings
5.7.1. The Data
5.7.2. Combining Opinion Polls
5.7.3. The Model
5.7.4. Estimated Parameters
5.7.5. Interpretation of the Model
5.7.6. Validation of the Framework
Acknowledgements
Chapter 6. Coronavirus
6.1. Discrete Stochastic Compartment Models
6.1.1. Continuous Time Compartment Models
6.1.2. Discrete Time Stochastic Models
6.1.2.1. Defining the Model
6.1.2.2. The Data
6.1.2.3. Estimating the Model
6.1.2.4. The Reproduction Ratio and Case Fatality Rate
6.2. Fitting Coronavirus in New Jersey
6.2.1. Early Models and Predictions
6.2.2. Modeling Coronavirus with Piecewise Constant Parameters
6.2.2.1. Monmouth County
6.2.2.2. New Jersey
6.2.2.3. United States
6.3. Independent Models by State
6.3.1. Independent Analysis of the States
6.3.1.1. Variation of Reproduction Rate by State
6.3.1.2. The Effect of Population on Reproduction Rate
6.3.1.3. The Effect of Population Size on Case Fatality Rate
6.4. Geospatial and Topological Models
6.4.1. Topology versus Geography
6.4.2. A Map of New Jersey Based on Connectivity
6.4.3. A Topological Compartment Model for COVID-19
6.4.3.1. Defining the Model
6.4.3.2. Estimating the Model
6.4.3.3. Adding Piecewise Continuous Propagation
6.4.4. Estimating the Topological Model for Central Jersey
6.4.4.1. Results of the Regression
6.4.4.2. Comparison of the Models for Monmouth County, New Jersey
6.4.5. A Map of the United States Based on Connectivity
6.4.6. Estimating the Topological Model for the Midwest
6.4.6.1. Individual Time Series for the States
6.4.6.2. The Regional Reproduction Rate and Case Fatality Rate
6.4.6.3. Piecewise Constant Scale for the Reproduction Rate
6.4.6.4. Comparison of Estimates of the Number of Infected People in Nebraska
6.4.6.5. Estimate of the Propagation Matrix β
6.5. Looking Back at this Work
6.5.1. How Public Coronavirus Data has Changed
6.5.2. Ex Post Analysis of My Coronavirus Predictions
6.5.2.1. Ex Ante Predictions
6.5.2.2. Ex Post Analysis
6.6. COVID Partisanship in the United States
6.6.1. State by State Results
6.6.2. Meta-Analysis of the Reproduction Rate
6.6.2.1. ANOVA Analysis
6.6.2.2. Panel Regression Analysis
6.7. Final Conclusions
Chapter 7. Theory
7.1. Some Remarks on the PDT Trading Algorithm
7.2. Cosine Similarity
7.2.1. Definition of Random Bitstreams
7.2.2. The Support and Inner Product of Bitstreams
7.2.2.1. Definition
7.2.2.2. Distribution
7.2.2.3. Relationship to the Inner Product
7.2.2.4. Covariance of the Inner Product with its Arguments
7.2.3. The Cosine Similarity of Two Bitstreams
7.2.3.1. Definition
7.2.3.2. Relationship to the Support
7.2.3.3. A Tighter Upper Bound
7.2.3.4. A Normal Approximation to the Sampling Distribution of the Cosine Similarity
7.2.3.5. Numerical Simulations
7.2.3.6. Support Adjusted Cosine Similarity
7.3. The Construction and Properties of Ellipsoidal Probability Density Functions
7.3.1. The Distance Metric Framework
7.3.1.1. Normal and Multinormal Distributions
7.3.1.2. A Method for Generalization from Other Univariate Distributions
7.3.2. A Generalization of Spherical Polar Coordinates to an Arbitrary Number of Dimensions
7.3.2.1. The Lower Dimensions
7.3.2.2. A General Scheme
7.3.2.3. Transformation of the Volume Integral
7.3.3. The Normalization of a Multivariate Distribution based on a Distance Metric Transformation of a Univariate Distribution
7.3.3.1. Solution for a General Distribution
7.3.3.2. Evaluation of the Product Factor
7.3.4. A Test Statistic for the Identification of Multivariate Distributions
7.3.4.1. Definition
7.3.4.2. An Example: The Multinormal Statistic
7.3.5. Measures of Location, Dispersion and Shape for Multivariate Distributions with Ellipsoidal Symmetry
7.3.5.1. The Population Mean, Mode and Median
7.3.5.2. The Population Covariance Matrix
7.3.5.3. Measures of Distributional Shape
7.3.6. The Characteristic Function and the Moment Generating Function
7.3.6.1. The Characteristic Function
7.3.6.2. The Moment Generating Function
7.3.6.3. The Roots of the Gradient of the Moment Generating Function
7.3.6.4. Example: The Multinormal Solution
7.3.7. Maximum Likelihood Estimation
7.3.7.1. The Maximum Likelihood Estimator of the Population Mean
7.3.7.2. The Sample Mean Differential Equation
7.3.7.3. A General Solution to the Sample Mean Differential Equation
7.3.7.4. The Maximum Likelihood Estimator of the Covariance Parameter
7.3.7.5. The Maximum Likelihood Estimator of the Distributional Parameters
7.3.8. The Simulation of Multivariate Distributions
7.3.8.1. General Methodology
7.4. The Generalized Error Distribution
7.4.1. The Univariate Generalized Error Distribution
7.4.1.1. Definition
7.4.2. A Standardized Generalized Error Distribution
7.4.3. A Multivariate Generalization
7.4.3.1. Construction of a Multivariate Distribution
7.4.3.2. Moments of the Constructed Distribution
7.4.3.3. The Multivariate Kolmogorov Test Statistic
7.4.3.4. Maximum Likelihood Regression
7.5. Frictionless Asset Allocation with Ellipsoidal Distributions
7.5.1. Utility Theory and Portfolio Choice
7.5.1.1. Asset Allocation Under Uncertainty
7.5.1.2. Negative Exponential Utility and Frictionless Trading
7.5.2. Ellipsoidal Distributions
7.5.2.1. General Considerations
7.5.2.2. The Scaling Functions
7.5.2.3. The Optimal Portfolio
7.5.3. The Generalized Error Distribution
7.5.3.1. Calculation of the Scaling Functions for a Single Asset
7.6. Asset Allocation with Realistic Distributions of Returns
7.6.1. Markowitz Style Mean-Variance Efficient Portfolios
7.6.1.1. Mean Variance Efficient Portfolio Selection
7.6.2. Portfolio Selection in the Real World
7.6.3. The Motivation for this Work
Epilogue
E.1. The Nature of Business
E.2. The Analysis of Data
E.3. Summing Things Up
Appendix A. How I Store and Process Data
A.1. Databases
A.2. Programming and Analytical Languages
A.3. Analytical Workflows
A.4. Hardware Choices
Appendix B. Some of the Data Sources I’ve Used for This Book
B.1. Financial Data
B.2. Economic Data
B.3. Social Media and Internet Activity
B.4. Physical Data
B.5. Health and Demographics Data
B.6. Political Data
Bibliography
Index