Computer access is the only way to retrieve up-to-date sequences, and this book shows researchers puzzled by the maze of URLs, sites, and searches how to use internet technology to find and analyze genetic data. The book describes the different types of databases, how to use a specific database to find a sequence that you need, and how to analyze the data to compare it with your own work. The content also covers sequence, phenotype, mutation, and genetic linkage databases; simple repetitive DNA sequences; gene feature identification; and prediction of the structure and function of proteins from sequence information. This book will be invaluable to those starting a career in life sciences research as well as to established researchers wishing to make full use of available resources.
Key Features
* Describes a wide range of databases: DNA, RNA, protein, pathways, and gene expression
* Enables readers to access the information they need from databases on the web
* Includes a directory of URLs for easy reference
Author(s): Martin J. Bishop
Publisher: ACADEMIC PRESS
Year: 1999
Language: English
Pages: 311
Front Cover......Page 1
Genetics Databases......Page 4
Copyright Page......Page 5
Contents......Page 6
1.1 Internet resources......Page 16
1.2 Organisms and proteins......Page 17
1.3 Phenotypes and genotypes......Page 19
1.4 Physical mapping......Page 21
1.5 Expression profiling......Page 22
1.6 Multiprotein complexes and pathways......Page 23
1.7 Sequence, structure and function......Page 24
2.2 The main sequence databases......Page 26
2.3 Rate of database growth......Page 49
2.4 Problems with the data......Page 50
References......Page 52
3.1 Introduction......Page 54
3.3 Definitions......Page 55
3.4 Types of databases......Page 56
3.5 Using mutation databases......Page 64
3.6 Exercises......Page 65
3.8 Conclusion......Page 66
References......Page 67
4.1 Introduction......Page 68
4.2 Measures dependent on a model of coding DNA......Page 70
4.3 Measures independent of a model of coding DNA......Page 84
4.4 Coding statistics in gene identification programs......Page 91
References......Page 94
5.1 Introduction......Page 96
5.2 Properties of amino acids......Page 97
5.3 Empirically derived amino acid relationships......Page 102
5.4 Relationship to the genetic code......Page 107
5.5 Multiple sequence alignments......Page 110
References......Page 117
6.1 Introduction......Page 120
6.2 Dotplots......Page 122
6.3 Alignments......Page 126
6.4 Motif-based approaches......Page 130
6.5 Conclusion......Page 133
References......Page 134
7.1 Introduction......Page 136
7.2 Microsatellites in databases for population genetic analyses......Page 137
7.3 Genetic distances......Page 142
7.4 Population sizes and gene flow......Page 144
7.5 Tandem repeat block expansion diseases – a continuum from trinucleotides to minisatellites? Implications for database usage in population genetics......Page 146
References......Page 147
8.1 Introduction......Page 150
8.2 Biologically interesting sequence features......Page 151
8.3 Sequence analysis methods......Page 159
8.4 Computer programs, databases and WWW servers......Page 167
8.5 An example......Page 173
References......Page 177
9.1 Introduction......Page 180
9.2 Selecting the sequences to align......Page 182
9.3 Automatic sequence alignment......Page 184
9.4 Using Clustal W and Clustal X......Page 187
9.5 Editing and viewing multiple alignments......Page 196
References......Page 197
10.2 Specialized RNA-related databases......Page 200
10.3 Tools for analysis: RNA structure and prediction......Page 206
10.4 Future directions......Page 207
References......Page 209
11.2 Protein evolution and function......Page 214
11.3 Protein structure and function......Page 217
11.4 Functions of enzymatic and regulatory domains......Page 218
11.5 Classical genetics and protein functions......Page 219
References......Page 225
12.1 Introduction......Page 230
12.3 The Cambridge Structural Database......Page 231
12.6 A typical PDB entry......Page 232
12.7 Protein structure classification resources......Page 236
12.9 Available classification schemes......Page 237
12.10 Constructing the CATH classification......Page 238
12.11 Making use of structural databases......Page 243
12.12 How does threading work?......Page 248
12.13 Conclusions......Page 252
References......Page 253
13.1 Introduction......Page 256
13.2 Data definition......Page 257
13.3 PKR functions and data......Page 259
References......Page 261
14.1 Introduction......Page 262
14.2 Gene-expression assays......Page 264
14.3 Database scope......Page 265
14.4 Gene-expression data......Page 267
14.5 Access and submission......Page 272
14.6 Specific database synopses......Page 274
14.7 Conclusion......Page 282
References......Page 283
15.1 Introduction......Page 284
15.2 Genes......Page 285
15.3 Proteins......Page 288
15.4 Pathways......Page 292
15.5 Summary......Page 294
References......Page 295
Appendix: List of URLs in Text and Tables......Page 296
Index......Page 304