This monograph explores the analysis and design of model-free optimal control systems based on reinforcement learning (RL) theory, presenting new methods that overcome recent challenges faced by RL. New developments in the design of sensor-data-efficient RL algorithms are demonstrated; these algorithms not only reduce the number of sensors required by means of output feedback, but also provide optimality and stability guarantees. A variety of practical challenges is considered, including disturbance rejection, control constraints, and communication delays. Ideas from game theory are incorporated to solve output feedback disturbance rejection problems, and concepts from low gain feedback control are employed to develop RL controllers that achieve global stability under control constraints.
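To fix ideas, here is a minimal sketch of the discrete-time linear quadratic regulation (LQR) setting that Chapter 2 builds on (standard notation, not reproduced from the book): for a linear system and quadratic cost

\[
x_{k+1} = A x_k + B u_k, \qquad J = \sum_{k=0}^{\infty} \big( x_k^\top Q x_k + u_k^\top R u_k \big),
\]

the optimal value function is quadratic, \( V^*(x) = x^\top P x \), with \( P \) given by the discrete-time algebraic Riccati equation and the optimal feedback gain \( K^* = (R + B^\top P B)^{-1} B^\top P A \). Q-learning sidesteps the Riccati equation, and hence the need to know \( A \) and \( B \), by estimating the quadratic Q-function of the current policy \( u = -Kx \) from the Bellman equation

\[
\mathcal{Q}(x_k, u_k) = x_k^\top Q x_k + u_k^\top R u_k + \mathcal{Q}\big(x_{k+1}, -K x_{k+1}\big)
\]

using measured data along trajectories. The output feedback designs in the book go a step further, parameterizing the state by a finite history of inputs and outputs so that the Q-function can be learned without measuring \( x_k \) directly.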
Output Feedback Reinforcement Learning Control for Linear Systems will be a valuable reference for graduate students, control theorists working on optimal control systems, engineers, and applied mathematicians.
Author(s): Syed Ali Asad Rizvi, Zongli Lin
Series: Control Engineering
Publisher: Birkhäuser
Year: 2022
Language: English
Pages: 303
City: Cham
Preface
Contents
Notation and Acronyms
1 Introduction to Optimal Control and Reinforcement Learning
1.1 Introduction
1.2 Optimal Control of Dynamic Systems
1.2.1 Dynamic Programming Method
1.2.2 The Linear Quadratic Regulation Problem
1.2.3 Iterative Numerical Methods
1.3 Reinforcement Learning Based Optimal Control
1.3.1 Principles of Reinforcement Learning
1.3.2 Reinforcement Learning for Automatic Control
1.3.3 Advantages of Reinforcement Learning Control
Optimality and Adaptivity
Model-Free Control
Large Spectrum of Applications
1.3.4 Limitations of Reinforcement Learning Control
1.3.5 Reinforcement Learning Algorithms
1.4 Recent Developments and Challenges in Reinforcement Learning Control
1.4.1 State Feedback versus Output Feedback Designs
1.4.2 Exploration Signal/Noise and Estimation Bias
1.4.3 Discounted versus Undiscounted Cost Functions
1.4.4 Requirement of a Stabilizing Initial Policy
1.4.5 Optimal Tracking Problems
1.4.6 Reinforcement Learning in Continuous Time
1.4.7 Disturbance Rejection
1.4.8 Distributed Reinforcement Learning
1.5 Notes and References
2 Model-Free Design of Linear Quadratic Regulator
2.1 Introduction
2.2 Literature Review
2.3 Discrete-Time LQR Problem
2.3.1 Iterative Schemes Based on State Feedback
2.3.2 Model-Free Output Feedback Solution
2.3.3 State Parameterization of Discrete-Time Linear Systems
2.3.4 Output Feedback Q-function for LQR
2.3.5 Output Feedback Based Q-learning for the LQR Problem
2.3.6 Numerical Examples
2.4 Continuous-Time LQR Problem
2.4.1 Model-Based Iterative Schemes for the LQR Problem
2.4.2 Model-Free Schemes Based on State Feedback
2.4.3 Model-Free Output Feedback Solution
2.4.4 State Parameterization
2.4.5 Learning Algorithms for Continuous-Time Output Feedback LQR Control
2.4.6 Exploration Bias Immunity of the Output Feedback Learning Algorithms
2.4.7 Numerical Examples
2.5 Summary
2.6 Notes and References
3 Model-Free H∞ Disturbance Rejection and Linear Quadratic Zero-Sum Games
3.1 Introduction
3.2 Literature Review
3.3 Discrete-Time Zero-Sum Game and H∞ Control Problem
3.3.1 Model-Based Iterative Algorithms
3.3.2 State Parameterization of Discrete-Time Linear Systems Subject to Disturbances
3.3.3 Output Feedback Q-function for Zero-Sum Game
3.3.4 Output Feedback Based Q-learning for Zero-Sum Game and H∞ Control Problem
3.3.5 A Numerical Example
3.4 Continuous-Time Zero-Sum Game and H∞ Control Problem
3.4.1 Model-Based Iterative Schemes for Zero-Sum Game and H∞ Control Problem
3.4.2 Model-Free Schemes Based on State Feedback
3.4.3 State Parameterization
3.4.4 Learning Algorithms for Output Feedback Differential Zero-Sum Game and H∞ Control Problem
3.4.5 Exploration Bias Immunity of the Output Feedback Learning Algorithms
3.4.6 Numerical Examples
3.5 Summary
3.6 Notes and References
4 Model-Free Stabilization in the Presence of Actuator Saturation
4.1 Introduction
4.2 Literature Review
4.3 Global Asymptotic Stabilization of Discrete-Time Systems
4.3.1 Model-Based Iterative Algorithms
4.3.2 Q-learning Based Global Asymptotic Stabilization Using State Feedback
4.3.3 Q-learning Based Global Asymptotic Stabilization by Output Feedback
4.3.4 Numerical Simulation
4.4 Global Asymptotic Stabilization of Continuous-Time Systems
4.4.1 Model-Based Iterative Algorithms
4.4.2 Learning Algorithms for Global Asymptotic Stabilization by State Feedback
4.4.3 Learning Algorithms for Global Asymptotic Stabilization by Output Feedback
4.4.4 Numerical Simulation
4.5 Summary
4.6 Notes and References
5 Model-Free Control of Time Delay Systems
5.1 Introduction
5.2 Literature Review
5.3 Problem Description
5.4 Extended State Augmentation
5.5 State Feedback Q-learning Control of Time Delay Systems
5.6 Output Feedback Q-learning Control of Time Delay Systems
5.7 Numerical Simulation
5.8 Summary
5.9 Notes and References
6 Model-Free Optimal Tracking Control and Multi-Agent Synchronization
6.1 Introduction
6.2 Literature Review
6.3 Q-learning Based Linear Quadratic Tracking
6.4 Experience Replay Based Q-learning for Estimating the Optimal Feedback Gain
6.5 Adaptive Tracking Law
6.6 Multi-Agent Synchronization
6.7 Numerical Examples
6.8 Summary
6.9 Notes and References
References
Index