Having taken Professor Darwiche's course on Bayesian Networks, I was excited to get my hands on this book, which is a culmination of the notes from that class and his research on the subject.
This is an excellent text, with very clear explanations and step-by-step pseudocode descriptions of the important algorithms.
The first few chapters lay the probabilistic foundations needed for understanding Bayesian networks and the conditional independences such networks encode.
Chapter 5 gives examples from several domains of using Bayesian networks to model systems and answer queries about them.
After this, the book gets into the meat of its primary focus: efficient probabilistic inference in Bayesian networks.
It lays out algorithms for exact inference based on variable elimination, jointrees, and recursive conditioning, along with the complexity and trade-offs of the different approaches.
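To give a flavor of the simplest of these, here is a minimal sketch of variable elimination on a hypothetical two-variable network A -> B; the factor representation and CPT numbers below are my own illustration, not the book's pseudocode.

    from itertools import product

    def multiply(f, g):
        # Pointwise product of two factors; each factor is a (variables,
        # table) pair whose table maps tuples of 0/1 values to numbers.
        fv, ft = f
        gv, gt = g
        ov = fv + tuple(v for v in gv if v not in fv)
        table = {}
        for vals in product((0, 1), repeat=len(ov)):
            a = dict(zip(ov, vals))
            table[vals] = ft[tuple(a[v] for v in fv)] * gt[tuple(a[v] for v in gv)]
        return ov, table

    def sum_out(f, var):
        # Eliminate var from factor f by summing it out.
        fv, ft = f
        ov = tuple(v for v in fv if v != var)
        table = {}
        for vals, p in ft.items():
            a = dict(zip(fv, vals))
            key = tuple(a[v] for v in ov)
            table[key] = table.get(key, 0.0) + p
        return ov, table

    # Hypothetical CPTs for the network A -> B: P(A) and P(B | A).
    fA = (("A",), {(0,): 0.7, (1,): 0.3})
    fB = (("B", "A"), {(0, 0): 0.8, (1, 0): 0.2, (0, 1): 0.1, (1, 1): 0.9})

    # Eliminating A from the product P(A) * P(B | A) leaves the marginal P(B).
    print(sum_out(multiply(fA, fB), "A"))
    # ~ (('B',), {(0,): 0.59, (1,): 0.41}), up to float rounding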
It also details refinements, such as network pruning, that can reduce networks in some cases for even better performance.
It then covers approximate inference techniques, including stochastic sampling and belief propagation.
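As a small taste of the sampling side, here is a minimal sketch of likelihood weighting, one of the importance-sampling schemes covered in Chapter 15, estimating P(A=1 | B=1) in the same hypothetical A -> B network; the numbers are again made up for illustration.

    import random

    def likelihood_weighting(n, seed=0):
        # Sample A from its prior P(A=1) = 0.3 and weight each sample by
        # the likelihood P(B=1 | A=a) of the evidence; the weighted
        # average of A then estimates P(A=1 | B=1).
        rng = random.Random(seed)
        num = den = 0.0
        for _ in range(n):
            a = 1 if rng.random() < 0.3 else 0  # sample A ~ P(A)
            w = 0.9 if a == 1 else 0.2          # weight = P(B=1 | A=a)
            num += w * a
            den += w
        return num / den

    print(likelihood_weighting(100_000))  # converges to 0.27/0.41 ~ 0.659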
Chapter 14 on belief propagation is especially good, with its discussions of the semantics of belief propagation, generalized belief propagation, and an alternative formulation of generalized belief propagation: edge-deletion belief propagation.
The last few chapters delve into learning the structure and parameters of Bayesian networks.
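In the simplest case the book treats, maximum-likelihood parameter estimation from complete data (Chapter 17), the estimates reduce to counting and normalizing; the sketch below uses a made-up dataset over the same hypothetical A -> B network.

    from collections import Counter

    def ml_estimate(data):
        # ML estimate of P(B=1 | A=a) from complete (a, b) cases is the
        # empirical frequency of b = 1 among cases with that value of a.
        counts = Counter(data)
        return {a: counts[(a, 1)] / (counts[(a, 0)] + counts[(a, 1)])
                for a in (0, 1)}

    data = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 0)]  # made-up cases
    print(ml_estimate(data))  # {0: 0.333..., 1: 0.666...}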
All in all, this book gives the reader in-depth knowledge of exact and approximate inference in Bayesian networks, along with a good overview of learning these models and applying them in various domains.
Author(s): Adnan Darwiche
Edition: 1
Publisher: Cambridge University Press
Year: 2009
Language: English
Pages: 562
Tags: Mathematics;Probability Theory and Mathematical Statistics;Mathematical Statistics;
Preface
Acknowledgments
1.1 Automated reasoning
1.1.1 The limits of deduction
1.1.2 Assumptions to the rescue
1.2 Degrees of belief
1.2.1 Deciding after believing
1.3.1 Initial reactions
1.3.2 A second chance
1.4 Bayesian networks
1.4.1 Modeling with Bayesian networks
1.4.2 Reasoning with Bayesian networks
Advanced inference algorithms
1.5 What is not covered in this book
2.2 Syntax of propositional sentences
2.3.1 Worlds, models, and events
2.3.2 Logical properties
2.3.4 Equivalences and reductions
2.4 The monotonicity of logical reasoning
2.5 Multivalued variables
2.6 Variable instantiations and related notations
2.7 Logical forms
2.7.2 Unit resolution
Bibliographic remarks
2.8 Exercises
3.2 Degrees of belief
3.2.1 Properties of beliefs
3.3 Updating beliefs
3.4 Independence
3.4.1 Conditional independence
3.4.2 Variable independence
3.5 Further properties of beliefs
3.6 Soft evidence
3.6.1 The “all things considered” method
3.6.2 The “nothing else considered” method
3.6.3 More on specifying soft evidence
3.6.4 Soft evidence as a noisy sensor
3.7.1 Distribution and density functions
3.7.2 The Bayes factor of a continuous observation
Bibliographic remarks
3.8 Exercises
4.2 Capturing independence graphically
4.3 Parameterizing the independence structure
4.4 Properties of probabilistic independence
4.5 A graphical test of independence
4.5.1 Complexity of d-separation
4.5.2 Soundness and completeness of d-separation
4.6 More on DAGs and independence
4.6.2 Independence MAPs
Bibliographic remarks
4.7 Exercises
4.8 Proofs
5.2.1 Probability of evidence
5.2.2 Prior and posterior marginals
Soft evidence
5.2.3 Most probable explanation (MPE)
5.2.4 Maximum a posteriori hypothesis (MAP)
5.3 Modeling with Bayesian networks
5.3.1 Diagnosis I: Model from expert
5.3.2 Diagnosis II: Model from expert
5.3.3 Sensitivity analysis
5.3.4 Network granularity
5.3.5 Diagnosis III: Model from design
Fault modes revisited
Posterior marginals
Integrating time
5.3.6 Reliability: Model from design
Logical constraints
Lifetime distributions and component reliability
5.3.7 Channel coding
MAP or posterior-marginal (PM) decoders?
Noise models and soft evidence
Convolutional codes
Turbo codes
5.3.8 Commonsense knowledge
5.3.9 Genetic linkage analysis
Recombination events
Genetic linkage and gene maps
From pedigrees to Bayesian networks
5.4 Dealing with large CPTs
5.4.1 Micro models
Decision trees and graphs
Deterministic CPTs
5.5 The significance of network parameters
Bibliographic remarks
5.6 Exercises
6.2 The process of elimination
6.3 Factors
6.4 Elimination as a basis for inference
6.5 Computing prior marginals
6.6 Choosing an elimination order
6.7 Computing posterior marginals
6.8 Network structure and complexity
6.9.1 Pruning nodes
6.9.3 Network pruning
6.9.4 Computing marginals after pruning: An example
6.10 Bucket elimination
6.11 Exercises
6.12 Proofs
7.1 Introduction
7.2 Factor elimination
7.3 Elimination trees
7.4 Separators and clusters
7.5 A message-passing formulation
7.5.1 Multiple marginals and message reuse
7.5.3 Joint marginals and evidence
7.5.4 The polytree algorithm
7.6 The jointree connection
7.7 The jointree algorithm: A classical view
7.7.1 The Shenoy-Shafer architecture
7.7.2 Using nonminimal jointrees
7.7.3 Complexity of the Shenoy-Shafer architecture
7.7.4 The Hugin architecture
7.7.5 Complexity of the Hugin architecture
Bibliographic remarks
7.8 Exercises
7.9 Proofs
8.2 Cutset conditioning
8.3 Recursive conditioning
8.3.1 Recursive decomposition
8.3.2 Caching computations
8.4 Any-space inference
8.5 Decomposition graphs
8.6 The cache allocation problem
8.6.1 Cache allocation by systematic search
A lower bound
8.6.2 Greedy cache allocation
Bibliographic remarks
8.7 Exercises
8.8 Proofs
9.2 Moral graphs
9.3 Elimination orders
9.3.1 Elimination heuristics
9.3.2 Optimal elimination prefixes
9.3.3 Lower bounds on treewidth
Depth-first search
Best-first search
9.3.5 From jointrees to elimination orders
9.3.6 From dtrees to elimination orders
9.4 Jointrees
9.4.1 From elimination orders to jointrees
Constructing the clusters
Assembling the jointree
9.4.2 From dtrees to jointrees
9.4.3 Jointrees as a factorization tool
9.5 Dtrees
9.5.1 From elimination orders to dtrees
9.5.2 Constructing dtrees by hypergraph partitioning
9.5.3 Balancing dtrees
9.6 Triangulated graphs
Bibliographic remarks
9.7 Exercises
9.8 Lemmas
9.9 Proofs
10.1 Introduction
10.2 Computing MPE instantiations
10.2.1 Computing MPE by variable elimination
Recovering an MPE instantiation
An example of computing MPE
10.2.2 Computing MPE by systematic search
Branch-and-bound search
Generating upper bounds by node splitting
Reducing the search space
10.2.3 Reduction to W-MAXSAT
MAP and constrained width
10.3.2 Computing MAP by systematic search
Improving the bound
10.3.3 Computing MAP by local search
Bibliographic remarks
10.4 Exercises
10.5 Proofs
11.1 Introduction
11.2 Complexity classes
11.3 Showing hardness
Membership of D-MAP in NP^PP
11.5 Complexity of MAP on polytrees
11.6 Reducing probability of evidence to weighted model counting
11.6.1 The first encoding
11.6.2 The second encoding
11.7 Reducing MPE to W-MAXSAT
11.8 Exercises
11.9 Proofs
12.1 Introduction
12.2 Circuit semantics
12.3 Circuit propagation
12.3.1 Evaluation and differentiation passes
12.3.2 Computing MPEs
12.4 Circuit compilation
12.4.1 The circuits of variable elimination
12.4.2 Circuits embedded in a jointree
12.4.3 The circuits of CNF encodings
12.5 Exercises
12.6 Proofs
13.2 The impact of local structure on inference complexity
13.2.2 Determinism
13.2.3 Evidence
13.3.1 Encoding network structure
One parameters
Putting it all together
13.3.3 Encoding evidence
13.4 Conditioning with local structure
13.4.2 Context-specific caching
13.4.3 Determinism
13.5 Elimination with local structure
13.5.1 Algebraic decision diagrams
13.5.2 ADD operations
Restrict
From tabular to ADD representations
13.5.3 Variable elimination using ADDs
13.5.4 Compiling arithmetic circuits using ADDs
Bibliographic remarks
13.6 Exercises
14.2 The belief propagation algorithm
14.3 Iterative belief propagation
14.4.1 The Kullback-Leibler divergence
14.4.2 Optimizing the KL divergence
14.5 Generalized belief propagation
14.6 Joingraphs
14.7 Iterative joingraph propagation
14.8 Edge-deletion semantics of belief propagation
14.8.1 Edge parameters
14.8.3 Searching for edge parameters
14.8.4 Choosing edges to delete (or recover)
14.8.5 Approximating the probability of evidence
Bibliographic remarks
14.9 Exercises
14.10 Proofs
15.2 Simulating a Bayesian network
15.3 Expectations
15.3.1 Probability as an expectation
15.4 Direct sampling
15.4.1 Bounds on the absolute error
15.4.2 Bounds on the relative error
15.4.3 Rao-Blackwell sampling
15.5 Estimating a conditional probability
15.6 Importance sampling
15.6.1 Likelihood weighting
15.6.2 Particle filtering
Likelihood weighting
Using particles
Particle filtering on more general DBNs
15.7 Markov chain simulation
15.7.1 Markov chains
15.7.2 Gibbs sampling
Computing Gibbs estimates
15.7.3 A Gibbs sampler for Bayesian networks
Bibliographic remarks
15.8 Exercises
15.9 Proofs
16.2 Query robustness
16.2.1 Network-independent robustness
More on single parameter changes
16.2.2 Network-specific robustness
16.2.3 Robustness of MPEs
16.3 Query control
16.3.1 Single parameter changes
16.3.2 Multiple parameter changes
Bibliographic remarks
16.4 Exercises
16.5 Proofs
17.1 Introduction
17.2 Estimating parameters from complete data
17.3 Estimating parameters from incomplete data
17.3.1 Expectation maximization
17.3.2 Gradient ascent
17.3.3 The missing-data mechanism
17.4 Learning network structure
17.4.1 Learning tree structures
17.4.2 Learning DAG structures
17.5 Searching for network structure
17.5.1 Local search
Greedy search
Optimal search
Bibliographic remarks
17.6 Exercises
17.7 Proofs
18.1 Introduction
18.2 Meta-networks
18.2.2 Data as evidence
18.2.3 Parameter independence
18.3 Learning with discrete parameter sets
18.3.1 Computing Bayesian estimates
18.3.2 Closed forms for complete data
18.3.3 Dealing with incomplete data
Using MAP estimates
18.4.1 Dirichlet priors
18.4.2 The semantics of continuous parameter sets
18.4.3 Bayesian learning
18.4.4 Computing Bayesian estimates
18.4.5 Closed forms for complete data
Using MAP estimates
The marginal likelihood: Large-sample approximations
18.5 Learning network structure
18.5.1 The BD score
18.5.2 The BDe score
18.5.3 Searching for a network structure
Bibliographic remarks
18.6 Exercises
18.7 Proofs
Logic
Factors
Appendix B Concepts from Information Theory
Appendix C Fixed Point Iterative Methods
Appendix D Constrained Optimization
Bibliography
Index