Note to Readers
Welcome to The Probability Lifesaver. My goal is to write a book introducing
students to the material through lots of worked out examples and code, and to
have lots of conversations about not just why equations and theorems are true, but
why they have the form they do. In a sense, this is a sequel to Adrian Banner’s
successful The Calculus Lifesaver. In addition to many worked out problems, there
are frequent explanations of proofs of theorems, with great emphasis placed on
discussing why certain arguments are natural and why we should expect certain forms
for the answers. Knowing why something is true, and how someone thought to prove
it, makes it more likely for you to use it properly and discover new relations yourself.
The book highlights at great lengths the methods and techniques behind proofs, as
these will be useful for more than just a probability class. See, for example, the
extensive entries in the index on proof techniques, or the discussion on Markov’s
inequality in §17.1. There are also frequent examples of computer code to investigate
probabilities. This is the twenty-first century; if you cannot write simple code you
are at a competitive disadvantage. Writing short programs helps us check our math
in situations where we can get a closed form solution; more importantly, it allows
us to estimate the answer in situations where the analysis is very involved and nice
solutions may be hard to obtain (if possible at all!).
The book is designed to be used either as a supplement to any standard probability
book, or as the primary textbook. The first part of the book, comprising six chapters,
is an introduction to probability. The first chapter is meant to introduce many of the
themes through fun problems; we’ll encounter many of the key ideas of the subject
which we’ll see again and again. The next chapter then gives the basic probability
laws, followed by a chapter with examples. This way students get to real problems in
the subject quickly, and are not overloaded with the development of the theory. After
this examples chapter we have another theoretical chapter, followed by two more
examples loaded chapters (which of course do introduce some theory to tackle these
problems).
The next part is the core of most courses, introducing random variables. It starts
with a review of useful techniques, and then goes through the “standard” techniques
to study them.
Specific, special distributions are the focus of Part III. There are many more
distributions that can be added, but a line has to be drawn somewhere. There’s a
nice mix of continuous and discrete, and after reading these chapters you’ll be ready
to deal with whatever new distributions you meet.
The next part is on convergence theorems. As this is meant to supplement or
serve as a first course, we don’t get into as much detail as possible, but we do prove
Markov’s inequality, Chebyshev’s theorem, the Weak and Strong Laws of Large
Numbers, Stirling’s formula, and the Central Limit Theorem (CLT). The last is a
particularly important topic. As such, we give a lot of detail here and in an appendix,
as the needed techniques are of interest in their own right; for those interest in more
see the online resources (which include an advanced chapter on complex analysis
and the CLT).
The last part is a hodgepodge of material to give the reader and instructor
some flexibility. We start with a chapter on hypothesis testing, as many classes
are a combined probability and statistics course. We then do difference equations,
continuing a theme from Chapter 1. I really like the Method of Least Squares.
This is more statistics, but it’s a nice application of linear algebra and multivariable
calculus, and assuming independent Gaussian distribution of errors we get a chisquare
distribution, which makes it a nice fit in a probability course. We touch upon
some famous problems and give a quick guide to coding (there’s a more extensive
introduction to programming in the online supplemental notes). In the twenty-first
century you absolutely must be able to do basic coding. First, it’s a great way to
check your answers and find missing factors. Second, if you can code you can get a
feel for the answer, and that might help you in guessing the correct solution. Finally,
though, often there is no simple closed form solution, and we have no choice but to
resort to simulation to estimate the probability. This then connects nicely with the
first part of this section, hypothesis testing: if we have a conjectured answer, do our
simulations support it? Analyzing simulations and data are central in modern science,
and I strongly urge you to continue with a statistics course (or, even better, courses!).
Finally, there are very extensive appendixes. This is deliberate. A lot of people
struggle with probability because of issues with material and techniques from
previous courses, especially in proving theorems. This is why the first appendix on
proof techniques is so long and detailed. Next is a quick review of needed analysis
results, followed by one on countable and uncountable sets; in mathematics the
greatest difficulties are when we encounter infinities, and the purpose here is to give
a quick introduction to some occurrences of the infinite in probability. We then end
the appendices by briefly touching on how complex analysis arises in probability,
in particular, in what is needed to make our proofs of the Central Limit Theorem
rigorous. While this is an advanced appendix, it’s well worth the time as mastering it
will give you a great sense of what comes next in mathematics, as well as hopefully
help you appreciate the beauty and complexity of the subject.
There is a lot of additional material I’d love to include, but the book is already
quite long with all the details; fortunately they’re freely available on the Web and
I encourage you to consider them. Just go to
http://press.princeton.edu/titles/11041.html
for a wealth of resources, including all my previous courses (with videos of all
lectures and additional comments from each day).
Returning to the supplemental material, the first is a set of practice calculus
problems and solutions. Doing the problems is a great way of testing how well
you know the material we’ll need. There are also some advanced topics that are
beyond many typical first courses, but are accessible and thus great supplements.
Next is the Change of Variable formula. As many students forget almost all of their
Multivariable Calculus, it’s useful to have this material easily available online. Then
comes the distribution of longest runs. I’ve always loved that topic, and it illustrates
some powerful techniques. Next is the Median Theorem. Though the Central Limit
Theorem deservedly sits at the pinnacle of a course, there are times its conditions
are not met and thus the Median Theorem has an important role. Finally, there is the
Central Limit Theorem itself. In a first course we can only prove it in special cases,
which begs the question of what is needed for a full proof. Our purpose here is to
introduce you to some complex analysis, a wonderful topic in its own right, and both
get a sense of the proof and a motivation for continuing your mathematical journey
forward.
Enjoy!
Author(s): Steven J. Miller
Publisher: Princeton University Press
Year: 2017
Language: english
Pages: 737
City: Princeton, Oxford
Contents
Note to Readers xv
How to Use This Book xix
I General Theory 1
1 Introduction 3
1.1 Birthday Problem 4
1.1.1 Stating the Problem 4
1.1.2 Solving the Problem 6
1.1.3 Generalizing the Problem and Solution: Efficiencies 11
1.1.4 Numerical Test 14
1.2 From Shooting Hoops to the Geometric Series 16
1.2.1 The Problem and Its Solution 16
1.2.2 Related Problems 21
1.2.3 General Problem Solving Tips 25
1.3 Gambling 27
1.3.1 The 2008 Super Bowl Wager 28
1.3.2 Expected Returns 28
1.3.3 The Value of Hedging 29
1.3.4 Consequences 31
1.4 Summary 31
1.5 Exercises 34
2 Basic Probability Laws 40
2.1 Paradoxes 41
2.2 Set Theory Review 43
2.2.1 Coding Digression 47
2.2.2 Sizes of Infinity and Probabilities 48
2.2.3 Open and Closed Sets 50
2.3 Outcome Spaces, Events, and the Axioms of Probability 52
2.4 Axioms of Probability 57
2.5 Basic Probability Rules 59
2.5.1 Law of Total Probability 60
2.5.2 Probabilities of Unions 61
2.5.3 Probabilities of Inclusions 64
2.6 Probability Spaces and σ-algebras 65
2.7 Appendix: Experimentally Finding Formulas 70
2.7.1 Product Rule for Derivatives 71
2.7.2 Probability of a Union 72
2.8 Summary 73
2.9 Exercises 73
3 Counting I: Cards 78
3.1 Factorials and Binomial Coefficients 79
3.1.1 The Factorial Function 79
3.1.2 Binomial Coefficients 82
3.1.3 Summary 87
3.2 Poker 88
3.2.1 Rules 88
3.2.2 Nothing 90
3.2.3 Pair 92
3.2.4 Two Pair 95
3.2.5 Three of a Kind 96
3.2.6 Straights, Flushes, and Straight Flushes 96
3.2.7 Full House and Four of a Kind 97
3.2.8 Practice Poker Hand: I 98
3.2.9 Practice Poker Hand: II 100
3.3 Solitaire 101
3.3.1 Klondike 102
3.3.2 Aces Up 105
3.3.3 FreeCell 107
3.4 Bridge 108
3.4.1 Tic-tac-toe 109
3.4.2 Number of Bridge Deals 111
3.4.3 Trump Splits 117
3.5 Appendix: Coding to Compute Probabilities 120
3.5.1 Trump Split and Code 120
3.5.2 Poker Hand Codes 121
3.6 Summary 124
3.7 Exercises 124
4 Conditional Probability, Independence, and
Bayes’ Theorem 128
4.1 Conditional Probabilities 129
4.1.1 Guessing the Conditional Probability Formula 131
4.1.2 Expected Counts Approach 132
4.1.3 Venn Diagram Approach 133
4.1.4 The Monty Hall Problem 135
4.2 The General Multiplication Rule 136
4.2.1 Statement 136
4.2.2 Poker Example 136
4.2.3 Hat Problem and Error Correcting Codes 138
4.2.4 Advanced Remark: Definition of Conditional Probability 138
4.3 Independence 139
4.4 Bayes’ Theorem 142
4.5 Partitions and the Law of Total Probability 147
4.6 Bayes’ Theorem Revisited 150
4.7 Summary 151
4.8 Exercises 152
5 Counting II: Inclusion-Exclusion 156
5.1 Factorial and Binomial Problems 157
5.1.1 “How many” versus “What’s the probability” 157
5.1.2 Choosing Groups 159
5.1.3 Circular Orderings 160
5.1.4 Choosing Ensembles 162
5.2 The Method of Inclusion-Exclusion 163
5.2.1 Special Cases of the Inclusion-Exclusion Principle 164
5.2.2 Statement of the Inclusion-Exclusion Principle 167
5.2.3 Justification of the Inclusion-Exclusion Formula 168
5.2.4 Using Inclusion-Exclusion: Suited Hand 171
5.2.5 The At Least to Exactly Method 173
5.3 Derangements 176
5.3.1 Counting Derangements 176
5.3.2 The Probability of a Derangement 178
5.3.3 Coding Derangement Experiments 178
5.3.4 Applications of Derangements 179
5.4 Summary 181
5.5 Exercises 182
6 Counting III: Advanced Combinatorics 186
6.1 Basic Counting 187
6.1.1 Enumerating Cases: I 187
6.1.2 Enumerating Cases: II 188
6.1.3 Sampling With and Without Replacement 192
6.2 Word Orderings 199
6.2.1 Counting Orderings 200
6.2.2 Multinomial Coefficients 202
6.3 Partitions 205
6.3.1 The Cookie Problem 205
6.3.2 Lotteries 207
6.3.3 Additional Partitions 212
6.4 Summary 214
6.5 Exercises 215
II Introduction to Random Variables 219
7 Introduction to Discrete Random Variables 221
7.1 Discrete Random Variables: Definition 221
7.2 Discrete Random Variables: PDFs 223
7.3 Discrete Random Variables: CDFs 226
7.4 Summary 233
7.5 Exercises 235
8 Introduction to Continuous Random Variables 238
8.1 Fundamental Theorem of Calculus 239
8.2 PDFs and CDFs: Definitions 241
8.3 PDFs and CDFs: Examples 243
8.4 Probabilities of Singleton Events 248
8.5 Summary 250
8.6 Exercises 250
9 Tools: Expectation 254
9.1 Calculus Motivation 255
9.2 Expected Values and Moments 257
9.3 Mean and Variance 261
9.4 Joint Distributions 265
9.5 Linearity of Expectation 269
9.6 Properties of the Mean and the Variance 274
9.7 Skewness and Kurtosis 279
9.8 Covariances 280
9.9 Summary 281
9.10 Exercises 281
10 Tools: Convolutions and Changing Variables 285
10.1 Convolutions: Definitions and Properties 286
10.2 Convolutions: Die Example 289
10.2.1 Theoretical Calculation 289
10.2.2 Convolution Code 290
10.3 Convolutions of Several Variables 291
10.4 Change of Variable Formula: Statement 294
10.5 Change of Variables Formula: Proof 297
10.6 Appendix: Products and Quotients
of Random Variables 302
10.6.1 Density of a Product 302
10.6.2 Density of a Quotient 303
10.6.3 Example: Quotient of Exponentials 304
10.7 Summary 305
10.8 Exercises 305
11 Tools: Differentiating Identities 309
11.1 Geometric Series Example 310
11.2 Method of Differentiating Identities 313
11.3 Applications to Binomial Random Variables 314
11.4 Applications to Normal Random Variables 317
11.5 Applications to Exponential
Random Variables 320
11.6 Summary 322
11.7 Exercises 323
III Special Distributions 325
12 Discrete Distributions 327
12.1 The Bernoulli Distribution 328
12.2 The Binomial Distribution 328
12.3 The Multinomial Distribution 332
12.4 The Geometric Distribution 335
12.5 The Negative Binomial Distribution 336
12.6 The Poisson Distribution 340
12.7 The Discrete Uniform Distribution 343
12.8 Exercises 346
13 Continuous Random Variables:
Uniform and Exponential 349
13.1 The Uniform Distribution 349
13.1.1 Mean and Variance 350
13.1.2 Sums of Uniform Random Variables 352
13.1.3 Examples 354
13.1.4 Generating Random Numbers Uniformly 356
13.2 The Exponential Distribution 357
13.2.1 Mean and Variance 357
13.2.2 Sums of Exponential Random Variables 361
13.2.3 Examples and Applications of Exponential Random
Variables 364
13.2.4 Generating Random Numbers from
Exponential Distributions 365
13.3 Exercises 367
14 Continuous Random Variables: The Normal Distribution 371
14.1 Determining the Normalization Constant 372
14.2 Mean and Variance 375
14.3 Sums of Normal Random Variables 379
14.3.1 Case 1: μX = μY = 0 and σ2
X
= σ2
Y
= 1 380
14.3.2 Case 2: General μX,μY and σ2
X, σ2
Y 383
14.3.3 Sums of Two Normals: Faster Algebra 385
14.4 Generating Random Numbers from
Normal Distributions 386
14.5 Examples and the Central Limit Theorem 392
14.6 Exercises 393
15 The Gamma Function and Related Distributions 398
15.1 Existence of (s) 398
15.2 The Functional Equation of (s) 400
15.3 The Factorial Function and (s) 404
15.4 Special Values of (s) 405
15.5 The Beta Function and the Gamma Function 407
15.5.1 Proof of the Fundamental Relation 408
15.5.2 The Fundamental Relation and (1/2) 410
15.6 The Normal Distribution and the Gamma Function 411
15.7 Families of Random Variables 412
15.8 Appendix: Cosecant Identity Proofs 413
15.8.1 The Cosecant Identity: First Proof 414
15.8.2 The Cosecant Identity: Second Proof 418
15.8.3 The Cosecant Identity: Special Case s=1/2 421
15.9 Cauchy Distribution 423
15.10 Exercises 424
16 The Chi-square Distribution 427
16.1 Origin of the Chi-square Distribution 429
16.2 Mean and Variance of X ∼χ2(1) 430
16.3 Chi-square Distributions and Sums of Normal Random
Variables 432
16.3.1 Sums of Squares by Direct Integration 434
16.3.2 Sums of Squares by the Change of Variables Theorem 434
16.3.3 Sums of Squares by Convolution 439
16.3.4 Sums of Chi-square Random Variables 441
16.4 Summary 442
16.5 Exercises 443
IV Limit Theorems 447
17 Inequalities and Laws of Large Numbers 449
17.1 Inequalities 449
17.2 Markov’s Inequality 451
17.3 Chebyshev’s Inequality 453
17.3.1 Statement 453
17.3.2 Proof 455
17.3.3 Normal and Uniform Examples 457
17.3.4 Exponential Example 458
17.4 The Boole and Bonferroni Inequalities 459
17.5 Types of Convergence 461
17.5.1 Convergence in Distribution 461
17.5.2 Convergence in Probability 463
17.5.3 Almost Sure and Sure Convergence 463
17.6 Weak and Strong Laws of Large Numbers 464
17.7 Exercises 465
18 Stirling’s Formula 469
18.1 Stirling’s Formula and Probabilities 471
18.2 Stirling’s Formula and Convergence of Series 473
18.3 From Stirling to the Central Limit Theorem 474
18.4 Integral Test and the Poor Man’s Stirling 478
18.5 Elementary Approaches towards
Stirling’s Formula 482
18.5.1 Dyadic Decompositions 482
18.5.2 Lower Bounds towards Stirling: I 484
18.5.3 Lower Bounds toward Stirling II 486
18.5.4 Lower Bounds towards Stirling: III 487
18.6 Stationary Phase and Stirling 488
18.7 The Central Limit Theorem and Stirling 490
18.8 Exercises 491
19 Generating Functions and Convolutions 494
19.1 Motivation 494
19.2 Definition 496
19.3 Uniqueness and Convergence of
Generating Functions 501
19.4 Convolutions I: Discrete Random Variables 503
19.5 Convolutions II: Continuous Random Variables 507
19.6 Definition and Properties of Moment Generating
Functions 512
19.7 Applications of Moment Generating Functions 520
19.8 Exercises 524
20 Proof of the Central Limit Theorem 527
20.1 Key Ideas of the Proof 527
20.2 Statement of the Central Limit Theorem 529
20.3 Means, Variances, and Standard Deviations 531
20.4 Standardization 533
20.5 Needed Moment Generating Function Results 536
20.6 Special Case: Sums of Poisson
Random Variables 539
20.7 Proof of the CLT for General Sums via MGF 542
20.8 Using the Central Limit Theorem 544
20.9 The Central Limit Theorem and
Monte Carlo Integration 545
20.10 Summary 546
20.11 Exercises 548
21 Fourier Analysis and the Central Limit Theorem 553
21.1 Integral Transforms 554
21.2 Convolutions and Probability Theory 558
21.3 Proof of the Central Limit Theorem 562
21.4 Summary 565
21.5 Exercises 565
V Additional Topics 567
22 Hypothesis Testing 569
22.1 Z-tests 570
22.1.1 Null and Alternative Hypotheses 570
22.1.2 Significance Levels 571
22.1.3 Test Statistics 573
22.1.4 One-sided versus Two-sided Tests 576
22.2 On p-values 579
22.2.1 Extraordinary Claims and p-values 580
22.2.2 Large p-values 580
22.2.3 Misconceptions about p-values 581
22.3 On t-tests 583
22.3.1 Estimating the Sample Variance 583
22.3.2 From z-tests to t-tests 584
22.4 Problems with Hypothesis Testing 587
22.4.1 Type I Errors 587
22.4.2 Type II Errors 588
22.4.3 Error Rates and the Justice System 588
22.4.4 Power 590
22.4.5 Effect Size 590
22.5 Chi-square Distributions, Goodness of Fit 590
22.5.1 Chi-square Distributions and Tests of Variance 591
22.5.2 Chi-square Distributions and t-distributions 595
22.5.3 Goodness of Fit for List Data 595
22.6 Two Sample Tests 598
22.6.1 Two-sample z-test: Known Variances 598
22.6.2 Two-sample t-test: Unknown but Same Variances 600
22.6.3 Unknown and Different Variances 602
22.7 Summary 604
22.8 Exercises 605
23 Difference Equations, Markov Processes,
and Probability 607
23.1 From the Fibonacci Numbers to Roulette 607
23.1.1 The Double-plus-one Strategy 607
23.1.2 A Quick Review of the Fibonacci Numbers 609
23.1.3 Recurrence Relations and Probability 610
23.1.4 Discussion and Generalizations 612
23.1.5 Code for Roulette Problem 613
23.2 General Theory of Recurrence Relations 614
23.2.1 Notation 614
23.2.2 The Characteristic Equation 615
23.2.3 The Initial Conditions 616
23.2.4 Proof that Distinct Roots Imply Invertibility 618
23.3 Markov Processes 620
23.3.1 Recurrence Relations and Population Dynamics 620
23.3.2 General Markov Processes 622
23.4 Summary 622
23.5 Exercises 623
24 The Method of Least Squares 625
24.1 Description of the Problem 625
24.2 Probability and Statistics Review 626
24.3 The Method of Least Squares 628
24.4 Exercises 633
25 Two Famous Problems and Some Coding 636
25.1 The Marriage/Secretary Problem 636
25.1.1 Assumptions and Strategy 636
25.1.2 Probability of Success 638
25.1.3 Coding the Secretary Problem 641
25.2 Monty Hall Problem 642
25.2.1 A Simple Solution 643
25.2.2 An Extreme Case 644
25.2.3 Coding the Monty Hall Problem 644
25.3 Two Random Programs 645
25.3.1 Sampling with and without Replacement 645
25.3.2 Expectation 646
25.4 Exercises 646
Appendix A Proof Techniques 649
A.1 How to Read a Proof 650
A.2 Proofs by Induction 651
A.2.1 Sums of Integers 653
A.2.2 Divisibility 655
A.2.3 The Binomial Theorem 656
A.2.4 Fibonacci Numbers Modulo 2 657
A.2.5 False Proofs by Induction 659
A.3 Proof by Grouping 660
A.4 Proof by Exploiting Symmetries 661
A.5 Proof by Brute Force 663
A.6 Proof by Comparison or Story 664
A.7 Proof by Contradiction 666
A.8 Proof by Exhaustion (or Divide and Conquer) 668
A.9 Proof by Counterexample 669
A.10 Proof by Generalizing Example 669
A.11 Dirichlet’s Pigeon-Hole Principle 670
A.12 Proof by Adding Zero or Multiplying by One 671
Appendix B Analysis Results 675
B.1 The Intermediate and Mean Value Theorems 675
B.2 Interchanging Limits, Derivatives, and Integrals 678
B.2.1 Interchanging Orders: Theorems 678
B.2.2 Interchanging Orders: Examples 679
B.3 Convergence Tests for Series 682
B.4 Big-Oh Notation 685
B.5 The Exponential Function 688
B.6 Proof of the Cauchy-Schwarz Inequality 691
B.7 Exercises 692
Appendix C Countable and Uncountable Sets 693
C.1 Sizes of Sets 693
C.2 Countable Sets 695
C.3 Uncountable Sets 698
C.4 Length of the Rationals 700
C.5 Length of the Cantor Set 701
C.6 Exercises 702
Appendix D Complex Analysis and the Central Limit
Theorem 704
D.1 Warnings from Real Analysis 705
D.2 Complex Analysis and Topology Definitions 706
D.3 Complex Analysis and Moment Generating Functions 711
D.4 Exercises 715
Bibliography 717
Index 721