This book presents sequential decision theory from a novel algorithmic information theory perspective. While the former suits active agents in known environments, the latter suits passive prediction in unknown environments. The book introduces these two different ideas and removes their limitations by unifying them into a single parameter-free theory of an optimal reinforcement learning agent embedded in an unknown environment. Most AI problems can easily be formulated within this theory, reducing conceptual problems to purely computational ones. The problem classes considered include sequence prediction, strategic games, function minimization, and reinforcement and supervised learning. The discussion includes formal definitions of intelligence order relations, the horizon problem, and relations to other approaches. One intention of this book is to excite a broader AI audience about abstract algorithmic information theory concepts, and conversely to inform theorists about exciting applications to AI.
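As a rough sketch of the unification the blurb describes (paraphrased from the AI literature, not quoted from the book; notation may differ slightly from the book's own), the resulting AIXI agent selects its next action by an expectimax over all programs consistent with its history, weighted by algorithmic probability. Here U denotes a universal chronological Turing machine, ℓ(q) the length of program q, o_i r_i the observation-reward pairs, and m the horizon:
\[
  a_k \;:=\; \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
  \bigl(r_k + \cdots + r_m\bigr)
  \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
\]
The inner sum over programs plays the role of Solomonoff's universal prior, while the alternating max/sum structure is standard sequential decision theory; their combination is what yields the parameter-free agent.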
Author(s): Marcus Hutter
Series: Texts in Theoretical Computer Science. An EATCS Series
Publisher: Springer
Year: 2005
Language: English
Pages: 301
Cover......Page 1
Texts in Theoretical Computer Science. An EATCS Series......Page 2
Universal Artificial Intelligence......Page 4
ISBN 3540221395......Page 5
Preface......Page 6
Contents......Page 10
Tables, Figures, Theorems, …......Page 16
Notation......Page 18
1 A Short Tour Through the Book......Page 22
1.1 Introduction......Page 23
1.2.1 Introduction......Page 24
1.2.2 Algorithmic Information Theory......Page 25
1.2.3 Uncertainty & Probabilities......Page 26
1.2.4 Algorithmic Probability & Universal Induction......Page 27
1.3 Universal Sequence Prediction......Page 28
1.3.2 Loss Bounds......Page 29
1.3.3 Optimality Properties......Page 30
1.3.4 Miscellaneous......Page 31
1.4.2 Value Functions & Optimal Policies......Page 32
1.4.3 Sequential Decision Theory & Reinforcement Learning......Page 33
1.5.1 The Universal AIXI Model......Page 34
1.5.2 On the Optimality of AIXI......Page 35
1.5.3 Value-Related Optimality Results......Page 36
1.5.4 Markov Decision Processes......Page 38
1.6.1 Introduction......Page 39
1.6.5 Supervised Learning from Examples (EX)......Page 40
1.7.1 The Fastest & Shortest Algorithm for All Well-Defined Problems......Page 41
1.7.2 Time-Bounded AIXI Model......Page 43
1.8 Discussion......Page 45
1.9 History & References......Page 47
2 Simplicity & Uncertainty......Page 50
2.1.1 Examples of Induction Problems......Page 51
2.1.2 Ockham, Epicurus, Hume, Bayes, Solomonoff......Page 52
2.1.3 Problem Setup......Page 53
2.2.1 Definitions and Notation......Page 54
2.2.2 Turing Machines......Page 55
2.2.3 Kolmogorov Complexity......Page 57
2.2.4 Computability Concepts......Page 59
2.3.1 Frequency Interpretation: Counting......Page 61
2.3.2 Objective Interpretation: Probabilities to Describe Uncertain Events......Page 62
2.3.3 Subjective Interpretation: Probabilities to Describe Degrees of Belief......Page 64
2.4.1 The Universal Prior M......Page 66
2.4.2 Universal Sequence Prediction......Page 68
2.4.3 Universal (Semi)Measures......Page 69
2.4.4 Martin-Löf Randomness......Page 75
2.5 History & References......Page 76
2.6 Problems......Page 81
3 Universal Sequence Prediction......Page 86
3.1 Introduction......Page 88
3.2.1 Random Sequences......Page 89
3.2.2 Universal Prior Probability Distribution......Page 90
3.2.3 Universal Posterior Probability Distribution......Page 91
3.2.4 Convergence of Random Sequences......Page 92
3.2.5 Distance Measures between Probability Distributions......Page 93
3.2.6 Convergence of ξ to μ......Page 95
3.2.7 Convergence in Martin-Löf Sense......Page 97
3.2.8 The Case where μ ∉ ℳ......Page 101
3.2.9 Probability Classes ℳ......Page 102
3.3.2 Total Expected Numbers of Errors......Page 103
3.3.3 Proof of Theorem 3.36......Page 105
3.4.1 Unit Loss Function......Page 107
3.4.2 Loss Bound of Merhav & Feder......Page 109
3.4.4 Proof of Theorem 3.48......Page 110
3.4.5 Convergence of Instantaneous Losses......Page 112
3.4.6 General Loss......Page 113
3.5.1 Introduction......Page 114
3.5.2 Games of Chance......Page 115
3.5.4 Information-Theoretic Interpretation......Page 116
3.6.1 Lower Error Bound......Page 117
3.6.2 Pareto Optimality of ξ......Page 120
3.6.3 Balanced Pareto Optimality of ξ......Page 122
3.6.4 On the Optimal Choice of Weights......Page 123
3.7 Miscellaneous......Page 124
3.7.1 Multistep Predictions......Page 125
3.7.2 Continuous Probability Classes ℳ......Page 127
3.7.4 Prediction with Expert Advice......Page 129
3.7.5 Outlook......Page 131
3.8 Summary......Page 132
3.9.1 How to Deal with μ = 0......Page 133
3.9.2 Entropy Inequalities (Lemma 3.11)......Page 134
3.9.3 Error Inequality (Theorem 3.36)......Page 136
3.9.4 Binary Loss Inequality for z≤½ (3.57)......Page 137
3.9.6 General Loss Inequality (3.53)......Page 138
3.11 Problems......Page 140
4 Agents in Known Probabilistic Environments......Page 146
4.1.1 The Cybernetic Agent Model......Page 147
4.1.3 AI Model for Known Deterministic Environment......Page 149
4.1.4 AI Model for Known Prior Probability......Page 151
4.2.1 Probability Distributions......Page 153
4.2.2 Explicit Form of the AIμ Model......Page 154
4.2.3 Equivalence of Functional and Explicit AI Model......Page 155
4.3.1 Factorizable Environments......Page 156
4.3.2 Constants and Limits......Page 159
4.3.3 Sequential Decision Theory......Page 160
4.4 Problems......Page 161
5 The Universal Algorithmic Agent AIXI......Page 162
5.1.1 Definition of the AIXI Model......Page 163
5.1.2 Universality of M^AI and ξ^AI......Page 165
5.1.3 Convergence of ξ^AI to μ^AI......Page 166
5.1.4 Intelligence Order Relation......Page 167
5.2 On the Optimality of AIXI......Page 168
5.3.2 (Pseudo) Passive μ and the HeavenHell Example......Page 170
5.3.3 The OnlyOne Example......Page 171
5.3.4 Asymptotic Learnability......Page 172
5.3.6 Other Concepts......Page 173
5.4.1 The AIρ Models: Preliminaries......Page 174
5.4.2 Pareto Optimality of AIξ......Page 175
5.4.3 Self-Optimizing Policy ρ^ξ w.r.t. Average Value......Page 177
5.5 Discounted Future Value Function......Page 180
5.6 Markov Decision Processes (MDP)......Page 186
5.7 The Choice of the Horizon......Page 190
5.8 Outlook......Page 193
5.10 Converting Functions into Chronological Semimeasures......Page 194
5.11 Proof of the Entropy Inequality......Page 196
5.12 History & References......Page 198
5.13 Problems......Page 199
6 Important Environmental Classes......Page 206
6.1 Repetition of the AIμ/ξ Models......Page 207
6.2 Sequence Prediction (SP)......Page 208
6.2.1 Using the AIμ Model for Sequence Prediction......Page 209
6.2.2 Using the AIξ Model for Sequence Prediction......Page 211
6.3.1 Introduction......Page 213
6.3.3 Using the AIμ Model for Game Playing......Page 214
6.3.5 Using the AIξ Model for Game Playing......Page 216
6.4.1 Applications/Examples......Page 218
6.4.2 The Greedy Model FMGμ......Page 219
6.4.3 The General FMμ/ξ Model......Page 220
6.4.4 Is the General Model Inventive?......Page 222
6.4.5 Using the AI Models for Function Minimization......Page 223
6.4.6 Remark on TSP......Page 224
6.5.2 Supervised Learning with the AIμ/ξ Model......Page 225
6.6 Other Aspects of Intelligence......Page 227
6.7 Problems......Page 228
7 Computational Aspects......Page 230
7.1.1 Introduction & Main Result......Page 231
7.1.2 Levin Search......Page 233
7.1.3 Fast Matrix Multiplication......Page 234
7.1.4 Applicability of the Fast Algorithm M^ε_p*......Page 235
7.1.5 The Fast Algorithm M^ε_p*......Page 236
7.1.6 Time Analysis......Page 237
7.1.8 Algorithmic Complexity and the Shortest Algorithm......Page 239
7.1.10 Summary & Outlook......Page 241
7.2.1 Introduction......Page 242
7.2.2 Time-Limited Probability Distributions......Page 243
7.2.4 Extended Chronological Programs......Page 245
7.2.5 Valid Approximations......Page 246
7.2.7 The Universal Time-Bounded AIXItl Agent......Page 247
7.2.8 Limitations and Open Questions......Page 248
7.2.9 Remarks......Page 249
8 Discussion......Page 252
8.1.1 Results......Page 253
8.1.2 Comparison to Other Approaches......Page 255
8.2.1 Miscellaneous......Page 256
8.2.2 Prior Knowledge......Page 257
8.2.4 How AIXI(tl) Deals with Encrypted Information......Page 258
8.2.5 Mortal Embodied Agents......Page 259
8.3.1 On the Foundations of Machine Learning......Page 260
8.3.2 In a World Without Occam......Page 261
8.4 Outlook & Open Questions......Page 262
8.5 Assumptions, Problems, Limitations......Page 263
8.5.1 Assumptions......Page 264
8.5.3 Limitations......Page 265
8.6.2 On the Existence of Objective Probabilities......Page 266
8.6.3 Free Will versus Determinism......Page 267
8.7 Conclusions......Page 269
Bibliography......Page 272
Index......Page 286