Bayesian Networks: An Introduction (Wiley Series in Probability and Statistics)

Bayesian Networks: An Introduction provides a self-contained introduction to the theory and applications of Bayesian networks, a topic of interest and importance for statisticians, computer scientists and those involved in modelling complex data sets. The material has been extensively tested in classroom teaching and assumes a basic knowledge of probability, statistics and mathematics. All notions are carefully explained, and exercises are provided throughout.

Features include:

  • An introduction to the Dirichlet distribution, exponential families and their applications.
  • A detailed description of learning algorithms and conditional Gaussian distributions using junction tree methods.
  • A discussion of Pearl’s intervention calculus, with an introduction to the notions of ‘see’ and ‘do’ conditioning; a small illustrative sketch follows this list.
  • Clear definitions of all concepts, illustrated with examples and exercises, with solutions provided online.
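
To give a flavour of the ‘see’ and ‘do’ distinction, here is a minimal Python sketch contrasting conditioning by observation with conditioning by intervention on a three-variable network Z -> X, Z -> Y, X -> Y, in which Z confounds the effect of X on Y. The sketch is not taken from the book, and all probability values are invented for illustration.

```python
# A minimal sketch (not from the book) of 'see' versus 'do' conditioning
# on binary variables with DAG Z -> X, Z -> Y, X -> Y.
# All probability values are made up for illustration.

p_z = {0: 0.6, 1: 0.4}                        # P(Z = z)
p_x1_given_z = {0: 0.2, 1: 0.7}               # P(X = 1 | Z = z)
p_y1_given_xz = {(0, 0): 0.1, (0, 1): 0.5,
                 (1, 0): 0.4, (1, 1): 0.9}    # P(Y = 1 | X = x, Z = z)

def joint(z, x, y):
    """Joint probability from the DAG factorization P(Z) P(X|Z) P(Y|X,Z)."""
    px = p_x1_given_z[z] if x == 1 else 1 - p_x1_given_z[z]
    py = p_y1_given_xz[(x, z)] if y == 1 else 1 - p_y1_given_xz[(x, z)]
    return p_z[z] * px * py

# 'See': ordinary conditioning on the observation X = 1.
num = sum(joint(z, 1, 1) for z in (0, 1))
den = sum(joint(z, 1, y) for z in (0, 1) for y in (0, 1))
p_see = num / den

# 'Do': truncated factorization for the intervention do(X = 1),
# P(Y = 1 | do(X = 1)) = sum_z P(Z = z) P(Y = 1 | X = 1, Z = z).
p_do = sum(p_z[z] * p_y1_given_xz[(1, z)] for z in (0, 1))

print(f"P(Y = 1 | X = 1)     = {p_see:.2f}")   # 0.75, observation
print(f"P(Y = 1 | do(X = 1)) = {p_do:.2f}")    # 0.60, intervention
```

With these made-up numbers the two quantities differ (0.75 versus 0.60); the gap is the confounding through Z that the back door criterion of Section 9.8 is designed to handle.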

This book will prove a valuable resource for postgraduate students of statistics, computer engineering, mathematics, data mining, artificial intelligence, and biology.

Researchers and users of comparable modelling or statistical techniques, such as neural networks, will also find this book of interest.

Author(s): Timo Koski, John Noble
Series: Wiley Series in Probability and Statistics
Edition: 1
Publisher: Wiley
Year: 2009

Language: English
Pages: 368

Contents

Preface
1 Graphical models and probabilistic reasoning
1.1 Introduction
1.2 Axioms of probability and basic notations
1.3 The Bayes update of probability
1.4 Inductive learning
1.4.1 Bayes’ rule
1.4.3 Pearl’s method of virtual evidence
1.5 Interpretations of probability and Bayesian networks
1.6 Learning as inference about parameters
1.7 Bayesian statistical inference
1.8 Tossing a thumb-tack
1.9 Multinomial sampling and the Dirichlet integral
Notes
Exercises: Probabilistic theories of causality, Bayes’ rule, multinomial sampling and the Dirichlet density
2 Conditional independence and d-separation
2.1 Joint probabilities
2.2 Conditional independence
2.3.1 Graphs
2.3.2 Directed acyclic graphs and probability distributions
2.4 The Bayes ball
2.4.1 Illustrations
2.5 Potentials
2.6 Bayesian networks
2.7 Object oriented Bayesian networks
2.8 d-Separation and conditional independence
2.9 Markov models and Bayesian networks
2.10 I-maps and Markov equivalence
2.10.1 The trek and a distribution without a faithful graph
Notes
Exercises: Conditional independence and d-separation
3 Evidence, sufficiency and Monte Carlo methods
3.1 Hard evidence
3.2 Soft evidence and virtual evidence
3.2.1 Jeffrey’s rule
3.2.2 Pearl’s method of virtual evidence
3.3 Queries in probabilistic inference
3.4 Bucket elimination
3.5.1 Bayesian sufficient statistics
3.5.2 Prediction sufficiency
3.5.3 Prediction sufficiency for a Bayesian network
3.6 Time variables
3.7 A brief introduction to Markov chain Monte Carlo methods
3.7.1 Simulating a Markov chain
3.7.2 Irreducibility, aperiodicity and time reversibility
3.7.3 The Metropolis-Hastings algorithm
3.7.4 The one-dimensional discrete Metropolis algorithm
Notes
Exercises: Evidence, sufficiency and Monte Carlo methods
4 Decomposable graphs and chain graphs
4.1 Definitions and notations
4.2 Decomposable graphs and triangulation of graphs
4.3 Junction trees
4.4 Markov equivalence
4.5 Markov equivalence, the essential graph and chain graphs
Notes
Exercises: Decomposable graphs and chain graphs
5 Learning the conditional probability potentials
5.1 Initial illustration: maximum likelihood estimate for a fork connection
5.2 The maximum likelihood estimator for multinomial sampling
5.3 MLE for the parameters in a DAG: the general setting
5.4 Updating, missing data, fractional updating
Notes
Exercises: Learning the conditional probability potentials
6 Learning the graph structure
6.1 Assigning a probability distribution to the graph structure
6.2 Markov equivalence and consistency
6.2.1 Establishing the DAG isomorphic property
6.3 Reducing the size of the search
6.3.1 The Chow-Liu tree
6.3.2 The Chow-Liu tree: A predictive approach
6.3.3 The K2 structural learning algorithm
6.3.4 The MMHC algorithm
6.4 Monte Carlo methods for locating the graph structure
6.5 Women in mathematics
Notes
Exercises: Learning the graph structure
7 Parameters and sensitivity
7.1 Changing parameters in a network
7.2 Measures of divergence between probability distributions
7.3 The Chan-Darwiche distance measure
7.3.1 Comparison with the Kullback-Leibler divergence and Euclidean distance
7.3.2 Global bounds for queries
7.3.3 Applications to updating
7.4 Parameter changes to satisfy query constraints
7.4.1 Binary variables
7.5 The sensitivity of queries to parameter changes
Notes
Exercises: Parameters and sensitivity
8 Graphical models and exponential families
8.1 Introduction to exponential families
8.2 Standard examples of exponential families
8.3 Graphical models and exponential families
8.4 Noisy ‘or’ as an exponential family
8.5 Properties of the log partition function
8.6 Fenchel-Legendre conjugate
8.7 Kullback-Leibler divergence
8.8 Mean field theory
8.9 Conditional Gaussian distributions
8.9.2 Some results on marginalization
8.9.3 CG regression
Notes
Exercises: Graphical models and exponential families
9 Causality and intervention calculus
9.1 Introduction
9.2 Conditioning by observation and by intervention
9.3 The intervention calculus for a Bayesian network
9.4 Properties of intervention calculus
9.5 Transformations of probability
9.6 A note on the order of ‘see’ and ‘do’ conditioning
9.7 The ‘Sure Thing’ principle
9.8 Back door criterion, confounding and identifiability
Notes
Exercises: Causality and intervention calculus
10 The junction tree and probability updating
10.1 Probability updating using a junction tree
10.2 Potentials and the distributive law
10.2.1 Marginalization and the distributive law
10.3 Elimination and domain graphs
10.4 Factorization along an undirected graph
10.5 Factorizing along a junction tree
10.5.1 Flow of messages: initial illustration
10.6 Local computation on junction trees
10.7 Schedules
10.8 Local and global consistency
10.9 Message passing for conditional Gaussian distributions
10.10 Using a junction tree with virtual evidence and soft evidence
Notes
Exercises: The junction tree and probability updating
11 Factor graphs and the sum product algorithm
11.1 Factorization and local potentials
11.1.1 Examples of factor graphs
11.2 The sum product algorithm
11.3 Detailed illustration of the algorithm
Notes
Exercise: Factor graphs and the sum product algorithm
References
Index