Massive data acquisition technologies, such as genome sequencing, high-throughput drug screening, and DNA arrays, are revolutionizing biology and medicine. This concise, user-friendly, interdisciplinary guide to DNA microarray technology serves as both an introduction and a reference for biologists and computational scientists. The authors describe the underlying technologies and draw attention to the "noise" and pitfalls present in the data generated. They also survey the data mining techniques and algorithms available for interpreting the data, along with the advantages and disadvantages of each in different situations.
Author(s): Pierre Baldi, G. Wesley Hatfield
Edition: 1
Year: 2002
Language: English
Pages: 230
Cover
Half-title
Title
Copyright
Contents
Preface
Audience and prerequisites
Content and general outline of the book
Acknowledgements
1 A brief history of genomics
In situ synthesized oligonucleotide arrays
Pre-synthesized DNA arrays
Filter-based DNA arrays
Non-conventional gene expression profiling technologies
REFERENCES
Reading data from a fluorescent signal
Reading data from a radioactive signal
Differences among samples
RNA isolation procedures
Special considerations for gene expression profiling in bacteria
A comparison of the use of targets prepared with message-specific primers and targets prepared with random hexamer…
Rapid turnover of mRNA in bacterial cells
Preparation of bacterial targets for in situ synthesized DNA arrays
Target preparation with non-polyadenylated mRNA from bacterial cells
Problems associated with target preparation with polyadenylated mRNA from eukaryotic cells
A total RNA solution for target preparation from eukaryotic cells
Data acquisition for nylon filter experiments
Data acquisition for Affymetrix GeneChip glass slide experiments
Normalization to total or ribosomal RNA
Normalization by global scaling
REFERENCES
Problems and common approaches
The Bayesian probabilistic framework
Gaussian model for array data
The conjugate prior
Full-Bayesian treatment versus hypothesis testing
Parameter point estimates
Hyperparameter point estimates and implementation
Simulations
Extensions
REFERENCES
Problems and approaches
Visualization, dimensionality reduction, and principal component analysis
Clustering overview
Supervised/unsupervised
Number of clusters
Cost function and probabilistic interpretation
Tree visualization
Mixture models and EM
DNA arrays and regulatory regions
REFERENCES
7 The design, analysis, and interpretation of gene expression profiling experiments
Experimental design
Identification of differentially expressed genes
Determination of the source of errors in DNA array experiments
Estimation of the global false positive level for a DNA array experiment
An ad hoc empirical method
A computational method
Improved statistical inference from DNA array data using a Bayesian statistical framework
The Bayesian approach allows the identification of more true positives with fewer replicates.
Deciding a cutoff level for differentially expressed genes worthy of further experimentation
Nylon filter data vs. Affymetrix GeneChip data
Application of clustering and visualization methods
Identification of differential gene expression patterns resulting from two-variable perturbation experiments
Hierarchical clustering and principal component analysis
Interpretation of clustering results
Clustering data sets resulting from multiple-variable perturbation experiments
REFERENCES
Introduction
Molecular reactions
Graphs and pathways
Metabolic networks
Protein networks
Regulatory networks
Computational models of regulatory networks
Discrete models: Boolean networks
Continuous models: Differential equations
Learning or model fitting
Qualitative modeling
Partial differential equations
Stochastic equations
Probabilistic models: Bayesian networks
Software and databases
The search for general principles
REFERENCES
cDNA synthesis for the preparation of P-labeled bacterial or eukaryotic targets for hybridization to pre-synthesized…
Cy3, Cy5 target labeling of RNA for hybridization to pre-synthesized glass slide arrays
Hybridization of Cy3-, Cy5-labeled targets to glass slide arrays
mRNA enrichment and biotin labeling methods for hybridization of bacterial targets to in situ synthesized Affymetrix…
Biotin labeling methods for hybridization of eukaryotic targets to Affymetrix GeneChips
The non-informative improper prior
The beta distribution
Gaussian process models
Covariance parameterization
Kernel methods and support vector machines
Kernel selection
Weight selection
REFERENCES
Tools, forums, and pointers
Commercially available technical information
Appendix D
REFERENCES
Index