This book discusses the design of DNA microarray studies and the analysis of the gene expression profile data they produce. It addresses design and analysis issues for both major classes of DNA microarrays: cDNA microarrays and oligonucleotide arrays. DNA microarrays are a new technology that is revolutionizing biological and biomedical research. Most biologists are trying to analyze their own data using a variety of commercial and public domain software. This book provides an authoritative review of the available methods and presents them in a manner that is intelligible to biologists.
Author(s): Richard M. Simon, Edward L. Korn, Lisa M. McShane, Michael D. Radmacher, George W. Wright, Yingdong
Edition: 1
Year: 2004
Language: English
Pages: 210
Contents......Page 8
Acknowledgments......Page 6
1 Introduction......Page 12
2.2 Measuring Label Intensity......Page 16
2.3 Labeling Methods......Page 17
2.4 Printed Microarrays......Page 18
2.5 Affymetrix GeneChip Arrays......Page 20
2.6 Other Microarray Platforms......Page 21
3.1 Introduction......Page 22
3.2.1 Class Comparison......Page 23
3.3 Comparing Two RNA Samples......Page 24
3.4 Sources of Variation and Levels of Replication......Page 25
3.5 Pooling of Samples......Page 27
3.6.1 The Reference Design......Page 28
3.6.2 The Balanced Block Design......Page 30
3.6.3 The Loop Design......Page 31
3.7 Reverse Labeling (Dye Swap)......Page 32
3.8 Number of Biological Replicates Needed......Page 34
4.1 Image Generation......Page 40
4.2.2 Gridding......Page 41
4.2.3 Segmentation......Page 42
4.2.4 Foreground Intensity Extraction......Page 43
4.2.5 Background Correction......Page 44
4.2.6 Image Output File......Page 45
4.3 Image Analysis for Affymetrix GeneChip......Page 46
5.1 Introduction......Page 50
5.2.2 Spots Flagged at Image Analysis......Page 51
5.2.3 Spot Size......Page 52
5.2.4 Weak Signal......Page 53
5.2.5 Large Relative Background Intensity......Page 54
5.3 Gene Level Quality Control for Two-Color Arrays......Page 55
5.3.2 Probe Quality Control Based on Duplicate Spots......Page 56
5.3.3 Low Variance Genes......Page 57
5.4 Array-Level Quality Control for Two-Color Arrays......Page 58
5.5 Quality Control for GeneChip Arrays......Page 59
5.6 Data Imputation......Page 61
6.2.1 Biologically Defined Housekeeping Genes......Page 64
6.2.2 Spiked Controls......Page 65
6.3 Normalization Methods for Two-Color Arrays......Page 66
6.3.1 Linear or Global Normalization......Page 67
6.3.2 Intensity-Based Normalization......Page 68
6.3.3 Location-Based Normalization......Page 70
6.4.1 Linear or Global Normalization......Page 72
6.4.2 Intensity-Based Normalization......Page 73
7.1 Introduction......Page 76
7.2 Examining Whether a Single Gene is Differentially Expressed Between Classes......Page 77
7.2.1 t-Test......Page 78
7.2.2 Permutation Tests......Page 79
7.2.3 More Than Two Classes......Page 82
7.2.4 Paired-Specimen Data......Page 84
7.3 Identifying Which Genes Are Differentially Expressed Between Classes......Page 86
7.3.1 Controlling for No False Positives......Page 87
7.3.2 Controlling the Number of False Positives......Page 91
7.3.3 Controlling the False Discovery Proportion......Page 92
7.4 Experiments with Very Few Specimens from Each Class......Page 95
7.5 Global Tests of Gene Expression Differences Between Classes......Page 97
7.6 Experiments with a Single Specimen from Each Class......Page 99
7.7 Regression Model Analysis; Generalizations of Class Comparison......Page 101
7.8 Evaluating Associations of Gene Expression to Survival......Page 102
7.9 Models for Nonreference Designs on Dual-Label Arrays......Page 103
8.1 Introduction......Page 106
8.2 Feature Selection......Page 108
8.3.2 Discriminant Analysis......Page 109
8.3.3 Variants of Diagonal Linear Discriminant Analysis......Page 112
8.3.4 Nearest Neighbor Classification......Page 114
8.3.5 Classification Trees......Page 115
8.3.6 Support Vector Machines......Page 117
8.3.7 Comparison of Methods......Page 118
8.4.1 Bias of the Re-Substitution Estimate......Page 119
8.4.2 Cross-Validation and Bootstrap Estimates of Error Rate......Page 121
8.4.3 Reporting Error Rates......Page 123
8.4.5 Validation Dataset......Page 124
8.5 Example......Page 125
8.6 Prognostic Prediction......Page 129
9.1 Introduction......Page 132
9.2 Similarity and Distance Metrics......Page 133
9.3.1 Classical Multidimensional Scaling......Page 136
9.4.1 Hierarchical Clustering......Page 142
9.4.2 k-Means Clustering......Page 149
9.4.3 Self-Organizing Maps......Page 153
9.4.4 Other Clustering Procedures......Page 156
9.5 Assessing the Validity of Clusters......Page 157
9.5.1 Global Tests of Clustering......Page 159
9.5.2 Estimating the Number of Clusters......Page 161
9.5.3 Assessing Reproducibility of Individual Clusters......Page 163
A.1 Introduction......Page 168
B.2 Bittner Melanoma Data......Page 176
B.4 Perou Breast Data......Page 177
B.5 Tamayo HL-60 Data......Page 178
B.6 Hedenfalk Breast Cancer Data......Page 179
C.1 Software Description......Page 180
C.2 Analysis of Bittner Melanoma Data......Page 182
C.3 Analysis of Perou Breast Cancer Chemotherapy Data......Page 189
C.4 Analysis of Hedenfalk Breast Cancer Data......Page 193
References......Page 196