The massive research effort known as the Human Genome Project is an attempt to record the sequence of the three billion nucleotides that make up the human genome and to identify individual genes within this sequence. The description and classification of sequences depend heavily on mathematical and statistical models. This short textbook briefly describes several ways in which mathematics and statistics are being used in genome analysis and sequencing.
Author(s): Jerome K. Percus
Edition: 1
Year: 2001
Language: English
Pages: 151
Half-title......Page 3
Series-title......Page 5
Title......Page 7
Copyright......Page 8
Contents......Page 9
Preface......Page 11
1.1. DNA Sequences......Page 13
1.2. Restriction Fragments......Page 18
1.3. Clone Libraries......Page 21
2.1. Fingerprint Assembly......Page 25
2.2. Anchoring......Page 30
2.3. Restriction-Fragment-Length Polymorphism (RFLP) Analysis......Page 35
2.4. Pooling......Page 40
2.5. Reprise......Page 48
3.1. Local Properties of DNA......Page 54
3.2.1. Longest Repeat......Page 61
3.2.2. Displaced Correlations......Page 65
3.2.3. Nucleotide-Level Criteria......Page 66
Within-Window Correlations......Page 73
Factorial Moments......Page 76
Patch Model......Page 77
Hidden Markov Models......Page 79
Walking Markov Model......Page 81
3.3.1. Spectral Analysis......Page 85
3.3.2. Entropic Criteria......Page 90
4.1. Basic Matching......Page 93
4.1.1. Mutual-Exclusion Model......Page 94
4.1.2. Independence Model......Page 97
4.1.3. Direct Asymptotic Evaluation......Page 99
4.1.4. Extreme-Value Technique......Page 101
4.2.1. Score Distribution......Page 104
4.2.2. Penalty-Free Limit......Page 109
4.2.3. Effect of Indel Penalty......Page 111
4.2.4. Score Acquisition......Page 112
4.3.1. Locating a Common Pattern......Page 115
4.3.2. Assessing Significance......Page 117
4.3.3. Category Analysis......Page 120
Bayesian Analysis......Page 123
Hopfield Networks......Page 128
5.1. Thermal Behavior......Page 130
5.2. Dynamics......Page 135
5.3. Effect of Heterogeneity......Page 138
Bibliography......Page 141
Index......Page 149