Pacific Symposium on Biocomputing 2004: Hawaii, USA 6-10 January 2004

The Pacific Symposium on Biocomputing (PSB 2004) is an international, multidisciplinary conference for the presentation and discussion of current research on the theory and application of computational methods in problems of biological significance. The rigorously peer-reviewed papers and presentations are collected in this archival proceedings volume. PSB is a forum for the presentation of work on databases, algorithms, interfaces, visualization, modeling and other computational methods, as applied to biological problems, with emphasis on applications in data-rich areas of molecular biology. PSB 2004 brings together top researchers from the US, the Asia-Pacific region and the rest of the world to exchange research findings and address open issues in all aspects of computational biology.

Author(s): Russ B. Altman, A. Keith Dunker, Lawrence Hunter, Tiffany A. Jung, Teri E. Klein
Year: 2004

Language: English
Pages: 608

Preface......Page 12
A wealth of public data has become available.......Page 20
Recent measurement technology advances have enabled analysis of splice variants.......Page 21
1 Introduction......Page 22
2 Exon Profiling with a Polony Gel......Page 24
3 Distinguishing Splice Variants with Short Oligonucleotides......Page 25
3.1 Assay Designs with Unique Probes......Page 26
3.2 Pooling to Minimize Hybridization Costs......Page 27
3.3 A Tradeoff in Assay Design......Page 28
4 Empirical Results......Page 30
5 Discussion......Page 31
References......Page 32
1 Introduction......Page 34
2.1 Gene structures and cDNA organization......Page 35
2.2 Protein Sequence Analysis......Page 36
3 Results......Page 37
3.1 Relation between exon boundaries and transmembrane protein regions......Page 38
3.2 Effects of alternative splicing on transmembrane protein regions......Page 39
3.3 Case Study 1: Alternative splicing of GPCRs......Page 41
3.4 Case Study 2: Alternative splicing and nonsense-mediated decay......Page 42
4 Conclusions......Page 43
References......Page 44
1 Introduction......Page 46
2.1 Overview......Page 47
2.2 Mapping the cluster of expressed sequences to genomic sequence......Page 48
2.3 Aligning expressed sequences to genomic sequence and to each other......Page 51
2.4 Splicing and alternative splicing detection in PO-MSAs......Page 52
3 Results......Page 55
4 Discussion and conclusions......Page 56
References......Page 57
1 Introduction......Page 59
2 Methods......Page 60
2.2 Splice junction consistency check......Page 61
2.4 Sequence data resources......Page 63
3.1 Detection of Known Constitutive and Alternative Splice Patterns......Page 64
3.2 Characterization of Novel Alternative Splice Patterns......Page 66
4 Discussion......Page 67
References......Page 69
1 Introduction......Page 71
Search Region Arrangements......Page 73
Pattern Models......Page 74
2.3 Search Space......Page 75
2.4 Score Function......Page 76
3.1 Alternative 5’ Splice Site......Page 77
3.2 Alternative 3’ Splice Site......Page 79
3.3 Cassette......Page 80
4. Discussion......Page 81
References......Page 82
Transcriptome and Genome Conservation of Alternative Splicing Events in Humans and Mice C.W. Sugnet, W.J. Kent, M. Ares Jr., and D. Haussler......Page 83
1 Introduction......Page 84
2.1 Constructing Splicing Graphs.......Page 85
2.2 Comparing Orthologous Splicing Graphs.......Page 87
3 Results......Page 88
3.1 Conservation of Genomic Sequences Near Alternative Splicing Events.......Page 92
Acknowledgments......Page 93
References......Page 94
1 Introduction......Page 95
2.1 The RASL Approach to Profiling Alternative Splicing......Page 96
2.3 Annotation of ASEs Using MAASE......Page 98
2.4 MAASE Database......Page 100
Acknowledgements......Page 103
References......Page 104
Session Introduction F. de la Vega, K.K. Kidd, and A. Collins......Page 106
Acknowledgements......Page 108
References......Page 109
1 Introduction......Page 110
2 System and Availability......Page 111
3 Algorithm......Page 112
4 Implementation......Page 115
5 Discussion......Page 118
References......Page 119
1 Introduction......Page 121
Breakdown of the Haplotype Reconstruction Problem......Page 123
Markov Chains......Page 124
Handling Missing Data......Page 125
3 Haplotype Reconstruction Algorithm......Page 126
Test setting......Page 127
Evaluation of the models......Page 128
Acknowledgments......Page 131
References......Page 132
1 Introduction......Page 133
2.1 Notation......Page 134
2.3 Cost function with symmetric no-call regions......Page 138
Research Approach......Page 139
Discussion......Page 142
Acknowledgments......Page 143
References......Page 144
1 Introduction......Page 145
2.1 The LD measure D'......Page 147
2.3 Estimation of the Confidence interval and the Coverage......Page 148
2.6 Variance Estimation by Zapata et al. [2]......Page 149
2.7 Adjustment for the Confidence Interval......Page 150
3 Results......Page 151
4 Discussions......Page 155
References......Page 156
1 Introduction......Page 157
2 Generic Genotyping Techniques......Page 159
2.1 Problem Formulation......Page 160
3.1 An Approximation Scheme......Page 161
3.2 Practical Heuristic Approaches......Page 164
4 Results......Page 165
5 Concluding Remarks......Page 166
References......Page 167
1 Introduction......Page 169
2 Methods......Page 170
3 Results......Page 172
4 Discussion......Page 177
References......Page 180
Session Introduction O. Bodenreider, J.A. Mitchell, and A.T. McCray......Page 181
1 Introduction......Page 183
2.1 OBO and the Gene Ontology......Page 184
2.2 Axiomatising part-of for Anatomy......Page 186
3 The GO Schema......Page 188
4.1 Homology Data......Page 191
4.2 Methodology......Page 192
5 Conclusions......Page 193
References......Page 194
1 Introduction......Page 195
2 Mouse Phenotype Ontology......Page 196
3.1 Tools Summary......Page 198
3.3 Translation of existing ontologies into Protege-2000......Page 199
3.5 A typical example of implementation......Page 200
4 Proposed New Schema......Page 202
5 Discussion......Page 203
7 Acknowledgements......Page 204
References......Page 205
1 Introduction......Page 207
2 Motivations for an Evidence Ontology......Page 208
3 Overview of Pathway Tools......Page 209
4 Pathway Tools Implementation of the Evidence Ontology......Page 210
5 The Evidence Ontology......Page 211
5.1 The Hierarchy of Evidence Codes......Page 213
5.3 Object and Relational Implementations of Evidence Tuples......Page 216
6 Use of the Evidence Ontology within EcoCyc and MetaCyc......Page 217
References......Page 218
1 Introduction......Page 219
2.3 The Unified Medical Language System® (UMLS) and Norm......Page 221
c. Mapping......Page 222
e. Semantic Processing.......Page 223
4 Results and Discussion......Page 224
4.1 Quantitative Evaluation......Page 225
4.2 Qualitative Evaluation and Discussion......Page 226
5 Caveats and Implications for Future Work......Page 227
References......Page 228
1.1 Motivation......Page 231
1.2 What it means to have (compositional) structure......Page 232
2.1 Incidence of inclusion of terms in other terms......Page 233
2.2 Characteristics of complements......Page 235
3 Implications and conclusions......Page 237
3.1 Aids to the evaluation and curation of GO......Page 238
3.2 Enriching GO’s conceptual representations......Page 239
Acknowledgements......Page 241
References......Page 242
1.1 The problem: We want to rescue the “baby” from the “bathwater”......Page 243
1.2 Summary of Analysis......Page 244
2.1.1 Case 1: Specialisations......Page 245
2.1.3 Case 3: Representing context explicitly......Page 246
2.2 Implementation in OWL for cases 2-3......Page 247
2.3 Case 4: dealing with unpredictable number of exceptions, possibly with exceptions to the exceptions - representations requiring hybrid reasoning and an “ontology indexed knowledge base”......Page 249
3 Results and Discussion......Page 250
References......Page 253
1 Introduction......Page 255
2.1 Model Organism Databases......Page 256
2.2 Name Recognition Systems......Page 257
3.1 Creating a Lexical Resource and Measuring Its Ambiguity......Page 258
3.2 Evaluating Recall and Ambiguity......Page 259
4.1 Ambiguity of the Lexical Resources......Page 260
4.2 Recall......Page 261
4.3 Ambiguities in the Output......Page 262
5 Discussion......Page 263
6 Conclusions......Page 265
References......Page 266
1 Introduction......Page 267
2 Materials......Page 268
3.2 Acquiring implicit knowledge......Page 269
3.3 Identifying the origin of semantic relations......Page 270
4.2 Origin of the semantic relations acquired......Page 271
4.5 Inferred semantic relations......Page 273
5.1 Specificity and common features of the various methods generating relations......Page 274
5.2.1 Ontology auditing, validation, and maintenance......Page 275
5.2.2 Integration of multiple ontologies......Page 276
Acknowledgements......Page 277
References......Page 278
Session Introduction A. Hartemink and E. Segal......Page 279
1.1 Problem definition......Page 281
1.3 An overview of our method......Page 282
2.2 Feature vectors for sequence......Page 283
2.3 Indexing feature vectors......Page 284
3.1 Index search......Page 285
3.3 Post-processing......Page 287
4.1 Quality test......Page 288
4.2 Performance test......Page 290
5 Discussion......Page 291
References......Page 292
1 Introduction......Page 293
2.1 Operon length......Page 295
2.3 Gene expression data......Page 297
2.4 Bayesian classifier......Page 299
3 Prediction accuracy......Page 300
4 Conclusion......Page 302
Acknowledgments......Page 303
References......Page 304
1 Introduction......Page 305
2.1 Framework Overview......Page 307
2.3 Training a Joint Sequence Text Classifier......Page 308
Sequence Kernel......Page 309
Support Vector Machines......Page 310
3 Results for Protein Localization......Page 311
3.2 Increasing the Set of Localization Annotated Sequences......Page 312
3.3 Evaluation the Joint Text Sequence Classifier......Page 313
3.4 Identifying Regions Relevant to Localization......Page 314
References......Page 315
1 Introduction......Page 317
2 Related Work......Page 319
Kernel Methods......Page 320
Kernel Methods for Data Fusion......Page 321
4 Experimental Design......Page 323
5 Results......Page 325
6 Discussion......Page 326
References......Page 327
1 Introduction......Page 329
2 Overview of Our Method and Data Used......Page 331
3.2 Extract Contact Segment Pairs......Page 332
4.1 Seeded Sub-grouping and Consensus Motif Discovery......Page 335
4.2 Iterative Refinement......Page 336
5 Implementation and Results......Page 337
6 Conclusion and Further Work......Page 339
References......Page 340
2 Introduction......Page 341
3.1 Probabilistic model......Page 342
3.2 An EM algorithm to train parameters......Page 344
3.3 Implementation......Page 345
4.1 A test case from the budding yeasts......Page 346
4.3 Success of motif discovery is dependent on evolutionary distance......Page 347
4.4 The unified framework is preferable to using evolutionary information separately......Page 349
Acknowledgements......Page 350
References......Page 351
1 Introduction......Page 353
2 Bayesian Network Model with Protein Complex......Page 354
3 Criterion and Algorithm for Estimating a Gene Network......Page 356
4.1 Cell Cycle Pathway in KEGG......Page 358
4.2 Gene Network with 350 Cell Cycle Genes......Page 359
5 Discussion......Page 361
References......Page 363
1 Regulatory Elements and Sequence Sources......Page 365
1.2 Regulatory Elements from Heterogeneous Data......Page 366
2 Expectation-Maximization for Heterogeneous Data......Page 368
3 Experimental Results......Page 370
4 Conclusion......Page 373
References......Page 374
1 Introduction......Page 377
1.1 Related work......Page 378
2 Methods......Page 379
2.2 Sequences chosen by length......Page 380
Program parameters......Page 382
Binding data......Page 383
Results on shuffled data......Page 384
3.3 Future work......Page 386
References......Page 387
1. Structural Genomics......Page 389
1 Introduction......Page 392
References......Page 391
2 Methods......Page 393
3.1 Progress......Page 394
3.2 Target characteristics......Page 396
3.3 Structure characteristics......Page 397
4 Discussion......Page 401
References......Page 402
1 Introduction......Page 404
2 Review......Page 405
3 Overview of BAYESPROT......Page 406
4.2 Dataset II......Page 407
4.3 Feature Vectors or Global Descriptors of Amino Acid Sequence......Page 408
5.2 TAN Bayesian Classifier......Page 409
5.3 Mean Probability Voting......Page 410
6.1 Results......Page 411
7.1 Dataset I: Comparison with Ding and Dubchak(2001)......Page 412
7.2 Dataset II: Comparison with Markowetz et al.(2003)......Page 413
8 Conclusions and future work......Page 414
References......Page 415
1 Introduction......Page 416
2 Infinite Gaussian Mixture Models......Page 418
3 Methods......Page 420
4.1 Globin Sequences......Page 421
4.2 Globin Sequences of Known Structure......Page 423
4.3 G-Coupled Protein Receptors (GPCRs)......Page 424
5 Discussion......Page 426
References......Page 427
1.1 Spatial Motif Discovery in Proteins......Page 428
1.2 Related Work......Page 429
2.1 Labeled Graph......Page 430
2.2 Canonical Representation of Graphs......Page 432
2.3.1 Mutual Information and Coherent Induced Subgraphs......Page 433
2.3.2 Coherent Subgraph Mining Algorithm......Page 434
3.3 Datasets and Coherent Subgraph Mining......Page 435
3.5 Identification of Fingerprints for the Serine Protease Family......Page 436
4 Conclusions and Future Work......Page 437
References......Page 438
1 Introduction and Basic Definitions......Page 440
2 The Expected Structure of rRNA Molecules......Page 443
3 Identifying Good Predictions......Page 446
4 Possible Improvements......Page 448
5 Conclusions......Page 449
References......Page 450
1 Introduction......Page 452
2.1 Datasets......Page 453
2.2 Contrast Classifiers......Page 454
2.3 Training Contrast Classifiers for Bias Detection in PDB......Page 455
2.5 Using Contrast Classifiers to Explore Bias in PDB......Page 456
3.2 Distributions of Contrast Classifier Outputs......Page 457
References......Page 462
1 Introduction......Page 464
2.1 Problem Formulation......Page 465
2.2 Lower Bound Algorithms......Page 467
2.3 Upper Bound Algorithms......Page 469
3 Results......Page 471
4 Conclusions......Page 473
References......Page 475
1 Introduction......Page 476
2.1 Alignment of RDC data with structural fold......Page 478
2.3 Principal alignment frame search and fold recognition......Page 480
4 Discussion......Page 482
4.2 Combination of RDC data and predicted secondary structure for fold recognition......Page 484
4.4 Comparisons with DipoCoup......Page 485
Acknowledgments......Page 486
References......Page 487
Session Introduction T. Ideker, E. Neumann, and V. Schachter......Page 488
1 Introduction......Page 491
2 Method......Page 493
3.1 Data Set 1......Page 496
3.2 Data Set 2......Page 498
References......Page 501
1 Introduction......Page 503
2.1 Network model description......Page 504
2.3 Experimental approach......Page 505
2.4 Algorithm.......Page 506
2.5 Estimation of the variance of the parameters.......Page 507
2.8 Simulated data......Page 508
3.1 Identification of networks......Page 509
4 Discussion......Page 511
References......Page 514
1 Introduction......Page 515
2 Chain Functions......Page 517
3 Reconstruction of Chain Functions......Page 518
3.2 Reconstructing the Regulator Set and the Function......Page 519
3.3 Using High-Order Experiments......Page 521
4 Combining Several Chains......Page 522
6 Concluding Remarks......Page 525
References......Page 526
1 Introduction......Page 527
2.2 Gene Perturbation......Page 528
2.3 A look at the data from the Davidson lab......Page 529
3.1 The flowchart......Page 530
4.2 The Complete Regulatory Network......Page 532
4.3 Network reduction......Page 534
5.2 Incorporation of auxiliary information......Page 535
References......Page 536
1 Introduction......Page 538
2 Molecular mechanism of autoreactive lymphocyte recruitment in brain venules......Page 539
3 Kinetics models of cell adhesion......Page 541
4 The BioSpi model implementation and results......Page 543
4.1 Specification......Page 545
References......Page 548
1 Introduction......Page 550
2 The CVQ Model......Page 552
3 Bayesian Model Selection......Page 553
4 Variational Bayesian Learning......Page 554
5 Analysis of Simulated Data......Page 556
6 Application in Microarray Data Analysis......Page 557
7 Discussion......Page 559
Reference......Page 560
Introduction......Page 562
Representing biotransformations and rules......Page 564
Extracting transformation rules from reaction data......Page 565
Biotransformation rule application......Page 566
Implementation......Page 567
Results and Discussion......Page 568
Conclusion......Page 571
References......Page 572
1 Introduction......Page 574
2.1 Preliminaries......Page 575
2.3 Correctness and Time Complexity......Page 577
3.1 Application to Heat Shock Data......Page 580
3.2 Computational Possibilities and Limitations......Page 581
4 Conclusion......Page 582
References......Page 583
1 Introduction......Page 585
1.1 Formal Methods in Biology......Page 586
1.2 Pathway Logic......Page 587
2.1 Activation of Raf1 at Level I......Page 588
2.2 Activation of Raf1 at Level II......Page 590
3 Using the Pathway Logic Model......Page 594
References......Page 596
1. Introduction......Page 598
2. Methods......Page 600
3. Applications......Page 604
4. Discussion......Page 607
Reference......Page 608