Algorithms for Decision Making

This document was uploaded by one of our users. The uploader already confirmed that they had the permission to publish it. If you are author/publisher or own the copyright of this documents, please report to us by using this DMCA report form.

Simply click on the Download Book button.

Yes, Book downloads on Ebookily are 100% Free.

Sometimes the book is free on Amazon As well, so go ahead and hit "Search on Amazon"

A broad introduction to algorithms for decision making under uncertainty, introducing the underlying mathematical problem formulations and the algorithms for solving them.

Automated decision-making systems or decision-support systems—used in applications that range from aircraft collision avoidance to breast cancer screening—must be designed to account for various sources of uncertainty while carefully balancing multiple objectives. This textbook provides a broad introduction to algorithms for decision making under uncertainty, covering the underlying mathematical problem formulations and the algorithms for solving them.
 
The book first addresses the problem of reasoning about uncertainty and objectives in simple decisions at a single point in time, and then turns to sequential decision problems in stochastic environments where the outcomes of our actions are uncertain. It goes on to address model uncertainty, when we do not start with a known model and must learn how to act through interaction with the environment; state uncertainty, in which we do not know the current state of the environment due to imperfect perceptual information; and decision contexts involving multiple agents. The book focuses primarily on planning and reinforcement learning, although some of the techniques presented draw on elements of supervised learning and optimization. Algorithms are implemented in the Julia programming language. Figures, examples, and exercises convey the intuition behind the various approaches presented.

Author(s): Mykel J. Kochenderfer, Tim A. Wheeler, Kyle H. Wray
Edition: 1
Publisher: The MIT Press
Year: 2022

Language: English
Commentary: This work is subject to a Creative Commons CC-BY-NC-ND license and can be downloaded from https://algorithmsbook.com/
Pages: 700
City: Cambridge, MA
Tags: Algorithms; Genetic Algorithms; Bayesian Networks; Bayesian Inference; Kalman Filtering; Decision Making; Statistics; Gradient Descent; Numerical Methods; Maximum Likelihood Estimation; Applied Mathematics; Uncertainty; Naïve Bayes; Multi-Agent Systems

Preface
Acknowledgments
Introduction
Decision Making
Applications
Methods
History
Societal Impact
Overview
I Probabilistic Reasoning
Representation
Degrees of Belief and Probability
Probability Distributions
Joint Distributions
Conditional Distributions
Bayesian Networks
Conditional Independence
Summary
Exercises
Inference
Inference in Bayesian Networks
Inference in Naive Bayes Models
Sum-Product Variable Elimination
Belief Propagation
Computational Complexity
Direct Sampling
Likelihood Weighted Sampling
Gibbs Sampling
Inference in Gaussian Models
Summary
Exercises
Parameter Learning
Maximum Likelihood Parameter Learning
Bayesian Parameter Learning
Nonparametric Learning
Learning with Missing Data
Summary
Exercises
Structure Learning
Bayesian Network Scoring
Directed Graph Search
Markov Equivalence Classes
Partially Directed Graph Search
Summary
Exercises
Simple Decisions
Constraints on Rational Preferences
Utility Functions
Utility Elicitation
Maximum Expected Utility Principle
Decision Networks
Value of Information
Irrationality
Summary
Exercises
II Sequential Problems
Exact Solution Methods
Markov Decision Processes
Policy Evaluation
Value Function Policies
Policy Iteration
Value Iteration
Asynchronous Value Iteration
Linear Program Formulation
Linear Systems with Quadratic Reward
Summary
Exercises
Approximate Value Functions
Parametric Representations
Nearest Neighbor
Kernel Smoothing
Linear Interpolation
Simplex Interpolation
Linear Regression
Neural Network Regression
Summary
Exercises
Online Planning
Receding Horizon Planning
Lookahead with Rollouts
Forward Search
Branch and Bound
Sparse Sampling
Monte Carlo Tree Search
Heuristic Search
Labeled Heuristic Search
Open-Loop Planning
Summary
Exercises
Policy Search
Approximate Policy Evaluation
Local Search
Genetic Algorithms
Cross Entropy Method
Evolution Strategies
Isotropic Evolutionary Strategies
Summary
Exercises
Policy Gradient Estimation
Finite Difference
Regression Gradient
Likelihood Ratio
Reward-to-Go
Baseline Subtraction
Summary
Exercises
Policy Gradient Optimization
Gradient Ascent Update
Restricted Gradient Update
Natural Gradient Update
Trust Region Update
Clamped Surrogate Objective
Summary
Exercises
Actor-Critic Methods
Actor-Critic
Generalized Advantage Estimation
Deterministic Policy Gradient
Actor-Critic with Monte Carlo Tree Search
Summary
Exercises
Policy Validation
Performance Metric Evaluation
Rare Event Simulation
Robustness Analysis
Trade Analysis
Adversarial Analysis
Summary
Exercises
III Model Uncertainty
Exploration and Exploitation
Bandit Problems
Bayesian Model Estimation
Undirected Exploration Strategies
Directed Exploration Strategies
Optimal Exploration Strategies
Exploration with Multiple States
Summary
Exercises
Model-Based Methods
Maximum Likelihood Models
Update Schemes
Exploration
Bayesian Methods
Bayes-Adaptive Markov Decision Processes
Posterior Sampling
Summary
Exercises
Model-Free Methods
Incremental Estimation of the Mean
Q-Learning
Sarsa
Eligibility Traces
Reward Shaping
Action Value Function Approximation
Experience Replay
Summary
Exercises
Imitation Learning
Behavioral Cloning
Data Set Aggregation
Stochastic Mixing Iterative Learning
Maximum Margin Inverse Reinforcement Learning
Maximum Entropy Inverse Reinforcement Learning
Generative Adversarial Imitation Learning
Summary
Exercises
IV State Uncertainty
Beliefs
Belief Initialization
Discrete State Filter
Kalman Filter
Extended Kalman Filter
Unscented Kalman Filter
Particle Filter
Particle Injection
Summary
Exercises
Exact Belief State Planning
Belief-State Markov Decision Processes
Conditional Plans
Alpha Vectors
Pruning
Value Iteration
Linear Policies
Summary
Exercises
Offline Belief State Planning
Fully Observable Value Approximation
Fast Informed Bound
Fast Lower Bounds
Point-Based Value Iteration
Randomized Point-Based Value Iteration
Sawtooth Upper Bound
Point Selection
Sawtooth Heuristic Search
Triangulated Value Functions
Summary
Exercises
Online Belief State Planning
Lookahead with Rollouts
Forward Search
Branch and Bound
Sparse Sampling
Monte Carlo Tree Search
Determinized Sparse Tree Search
Gap Heuristic Search
Summary
Exercises
Controller Abstractions
Controllers
Policy Iteration
Nonlinear Programming
Gradient Ascent
Summary
Exercises
V Multiagent Systems
Multiagent Reasoning
Simple Games
Response Models
Dominant Strategy Equilibrium
Nash Equilibrium
Correlated Equilibrium
Iterated Best Response
Hierarchical Softmax
Fictitious Play
Gradient Ascent
Summary
Exercises
Sequential Problems
Markov Games
Response Models
Nash Equilibrium
Fictitious Play
Gradient Ascent
Nash Q-Learning
Summary
Exercises
State Uncertainty
Partially Observable Markov Games
Policy Evaluation
Nash Equilibrium
Dynamic Programming
Summary
Exercises
Collaborative Agents
Decentralized Partially Observable Markov Decision Processes
Subclasses
Dynamic Programming
Iterated Best Response
Heuristic Search
Nonlinear Programming
Summary
Exercises
Appendices
Mathematical Concepts
Measure Spaces
Probability Spaces
Metric Spaces
Normed Vector Spaces
Positive Definiteness
Convexity
Information Content
Entropy
Cross Entropy
Relative Entropy
Gradient Ascent
Taylor Expansion
Monte Carlo Estimation
Importance Sampling
Contraction Mappings
Graphs
Probability Distributions
Computational Complexity
Asymptotic Notation
Time Complexity Classes
Space Complexity Classes
Decidability
Neural Representations
Neural Networks
Feedforward Networks
Parameter Regularization
Convolutional Neural Networks
Recurrent Networks
Autoencoder Networks
Adversarial Networks
Search Algorithms
Search Problems
Search Graphs
Forward Search
Branch and Bound
Dynamic Programming
Heuristic Search
Problems
Hex World
2048
Cart-Pole
Mountain Car
Simple Regulator
Aircraft Collision Avoidance
Crying Baby
Machine Replacement
Catch
Prisoner's Dilemma
Rock-Paper-Scissors
Traveler's Dilemma
Predator-Prey Hex World
Multicaregiver Crying Baby
Collaborative Predator-Prey Hex World
Julia
Types
Functions
Control Flow
Packages
Convenience Functions
References
Index