Bioinformatics: Sequence Alignment and Markov Models

GET FULLY UP-TO-DATE ON BIOINFORMATICS: THE TECHNOLOGY OF THE 21ST CENTURY

Bioinformatics showcases the latest developments in the field along with all the foundational information you'll need. It provides in-depth coverage of a wide range of autoimmune disorders and detailed analyses of suffix trees, plus late-breaking advances regarding biochips and genomes.

Featuring helpful gene-finding algorithms, Bioinformatics offers key information on sequence alignment, HMMs, HMM applications, protein secondary structure, microarray techniques, and drug discovery and development. Helpful diagrams accompany mathematical equations throughout, and exercises appear at the end of each chapter to facilitate self-evaluation.

This thorough, up-to-date resource features:

  • Worked-out problems illustrating concepts and models
  • End-of-chapter exercises for self-evaluation
  • Material based on student feedback
  • Illustrations that clarify difficult math problems
  • A list of bioinformatics-related websites

Bioinformatics covers:

  • Sequence representation and alignment
  • Hidden Markov models
  • Applications of HMMs
  • Gene finding
  • Protein secondary structure prediction
  • Microarray techniques
  • Drug discovery and development
  • Internet resources and public domain databases

Author(s): Kal Sharma
Edition: 1
Publisher: McGraw-Hill Professional
Year: 2008

Language: English
Pages: 338

Contents
Preface
Acknowledgments
1 Preliminaries
1.1.1 Amino Acids and Proteins
1.1.2 Structures of Proteins
1.1.3 Sequence Distribution of Insulin
1.1.4 Bioseparation Techniques
1.1.5 Nucleic Acids and Genetic Code
1.1.6 Genomes: Diversity, Size, and Structure
1.2 Probability and Statistics
1.2.1 Three Definitions of Probability
1.2.3 Independent Events and Bernoulli’s Theorem
1.2.4 Discrete Probability Distributions
1.2.5 Continuous Probability Distributions
1.2.6 Statistical Inference and Hypothesis Testing
1.3 Which Is Larger, 2^n or n^2?
1.4 Big O Notation and Asymptotic Order of Functions
Summary
References and Sources
Exercises
Part 1 Sequence Alignment and Representation
2.1 Introduction to Pairwise Sequence Alignment
2.2 Why Study Sequence Alignment
2.3 Alignment Grading Function
2.4 Optimal Global Alignment of a Pair of Sequences
2.5 Dynamic Programming
2.7 Dynamic Arrays and O(N) Space
2.8 Subquadratic Algorithms for Longest Common Subsequence Problems
2.9 Optimal Local Alignment of a Pair of Sequences
2.10 Affine Gap Model
2.11 Greedy Algorithms for Pairwise Alignment
2.12 Other Alignment Methods
2.13 PAM and BLOSUM Matrices
Summary
References
Exercises
3.1 Suffix Trees
3.2 Algorithm for Suffix Tree Representation of a Sequence
3.3 Streaming a Sequence Against a Suffix Tree
3.4 String Algorithms
3.5 Suffix Trees in String Algorithms
3.6 Look-up Tables
Summary
References
Exercises
4.1 What Is Multiple-Sequence Alignment?
4.3 Optimal MSA by Dynamic Programming
4.5 What Are NP Complete Problems?
4.6 Center-Star-Alignment Algorithm [4]
4.7 Progressive Alignment Methods
4.8 The Consensus Sequence
4.10 Geometry of Multiple Sequences
References
Exercises
Part 2 Probability Models
5.1 Introduction
5.2 kth-order Markov Chain
5.3 DNA Sequence and Geometric Distribution [2–4]
5.4 Three Questions in the HMM
5.6 Decoding Problem and Viterbi Algorithm
5.7 Relative Entropy
5.8 Probabilistic Approach to Phylogeny
5.9 Sequence Alignment Using HMMs
5.10 Protein Families
5.11 Wheel HMMs to Model Periodicity in DNA
5.12 Generalized HMM (GHMM)
5.14 Multiple Alignments
5.15 Classification Using HMMs
5.16 Signal Peptide and Signal Anchor Prediction by HMMs
5.17 Markov Model and Chargaff's Parity Rules
Summary
References
Exercises
6.1 Introduction
6.2 Relative Entropy Site-Selection Problem
6.3 Maximum-Subsequence Problem
6.4 Interpolated Markov Model (IMM)
6.5 Shine-Dalgarno (SD) Site Finding
6.6 Gene Annotation Methods
6.7 Secondary Structures of Proteins
Summary
References
Exercises
Part 3 Measurement Techniques
7.1 Introduction
7.2 Microarray Detection
7.3 Microarray Surfaces
7.4 Phosphoramidite Synthesis
7.5 Microarray Manufacture
7.6 Normalization for cDNA Microarray Data
Summary
References
Exercises
8.1 Role of Electrophoresis in the Measurement of Sequence Distribution
8.2 Fick’s Laws of Molecular Diffusion
8.3 Generalized Fick’s Law of Diffusion
8.4 Electrophoresis Apparatus
8.5 Electrophoretic Term, Ballistic Term, and Fick Term in the Governing Equation
Summary
References
Exercises
A: Internet Hotlinks to Public-Domain Databases
B: PERL for Bioinformaticists