Advances in Knowledge Discovery and Data Mining, Part II: 14th Pacific-Asia Conference, PAKDD 2010, Hyderabad, India, June 21-24, 2010, Proceedings

This book constitutes the proceedings of the 14th Pacific-Asia Conference, PAKDD 2010, held in Hyderabad, India, in June 2010.

Author(s): Mohammed J. Zaki, Jeffrey Xu Yu, B. Ravindran, Vikram Pudi
Series: Lecture Notes in Computer Science - Lecture Notes in Artificial Intelligence
Publisher: Springer
Year: 2010

Language: English
Pages: 443

Cover......Page 1
Lecture Notes in Artificial Intelligence 6119......Page 2
Advances in Knowledge Discovery and Data Mining, Part II......Page 3
ISBN-10 3642136710......Page 4
Preface......Page 5
PAKDD 2010 Conference Organization......Page 7
Table of Contents – Part II......Page 13
Table of Contents – Part I......Page 18
Introduction......Page 23
Problem Setting and Motivation......Page 24
Transformation from Must-Link Constraints......Page 26
Dimension Reduction......Page 28
Analysis of Experiments......Page 31
Conclusions and Future Works......Page 34
Introduction......Page 36
Related Work......Page 37
Data Indexing and Nearest Neighbours Retrieval......Page 38
Distributed Geodesic Distances Definition......Page 40
Approximating the Multidimensional Scaling......Page 41
Experiments......Page 43
Conclusion......Page 47
Introduction......Page 49
Related Works......Page 50
Distributed Progressive Sequential Pattern Mining......Page 51
Candidate Computing Job......Page 52
Experimental Results......Page 54
Conclusions......Page 55
Computing Trends and Challenges Using GPU......Page 57
HAC Algorithms......Page 58
HAC Implementation Results and Discussions......Page 59
Research Issues with Clustering Algorithms on CUDA......Page 60
CUDA Process Block Size and Threads......Page 61
Analysis of Threads in CUDA for Data Parallelism......Page 62
Conclusion......Page 63
References......Page 64
Introduction......Page 65
Schema Matching......Page 67
Subsequence Similarity Search......Page 68
Methods......Page 69
Simulated ERPs......Page 70
Sequence Post-processing......Page 71
Sequence Similarity Search......Page 72
Results......Page 73
Conclusion and Future Work......Page 74
References......Page 75
Introduction......Page 77
Support Vector Machines......Page 78
TSVM-RFE......Page 79
An Illustration......Page 80
Leukemia Data......Page 81
Colon Data......Page 82
Conclusions......Page 83
Introduction......Page 85
Background and Related Work......Page 86
Approach......Page 87
Knowledge Sphere Creation......Page 88
Gather Relevant SVs......Page 89
Performance Evaluation......Page 90
Conclusion......Page 92
Introduction......Page 93
Archetypes, Classes, and Sub-classes in EverQuest II......Page 94
Player Performance Prediction in EverQuest II......Page 95
Dataset......Page 96
Discretization Improves Prediction Coverage......Page 97
Comparison of Prediction Models......Page 98
Future Directions......Page 101
Introduction......Page 103
Related Work......Page 104
Proposed Method for Gene Selection......Page 106
Experimental Setup and Results......Page 107
Conclusion......Page 109
Introduction......Page 111
KNN Classification with Weighted Instances......Page 113
Learning the Instance Weights by WDKNN Algorithm......Page 114
Experiments on UCI Data Sets......Page 117
Effect of Noise......Page 120
Conclusions......Page 121
Introduction......Page 123
Two-Dimensional FLD (2DFLD) Method for Feature Extraction......Page 124
Key Idea and the Algorithm......Page 126
Experimental Results......Page 127
Experiments on the AT&T Face Database......Page 128
Experiments on the UMIST Face Database......Page 131
Conclusion......Page 133
References......Page 134
Introduction......Page 135
Learning Gradients......Page 137
Gaussian Processes Regression......Page 138
Gradients Estimation Model with Gaussian Processes......Page 139
Learning Kernel Hyperparameters......Page 141
Error Bar Estimation......Page 142
High-Dimensional Data Set......Page 144
Conclusions......Page 146
Introduction......Page 147
Radviz's Algorithm......Page 148
Radviz-Dependent DA......Page 149
Experimental Setting......Page 150
Conclusions......Page 154
Introduction......Page 155
Mining Interesting Subgraphs......Page 156
Density Computation on Graphs......Page 157
Density Based Clusters on Graphs......Page 159
Graph-Theoretic View and Algorithmic Aspects......Page 162
Complexity Analysis......Page 163
Experiments......Page 164
Conclusion......Page 167
Introduction......Page 169
Itemset-Sharing Subgraph (ISS) Set Enumeration Problem......Page 172
ISS Enumeration......Page 173
ISS Set Combination......Page 174
Results for a Synthetic Network......Page 176
Results for a Citation Network......Page 178
Related Work......Page 179
Concluding Remarks......Page 180
Introduction......Page 182
Implementing a Basic Utility within the Network Layer: BFS......Page 184
Extending to Graph Mining: Quasi Clique Detection......Page 185
The RCR Strategy in SQL-Based Approach......Page 186
Evaluation of Quasi Clique Detection......Page 187
Conclusions and Future Work......Page 189
Introduction......Page 190
The Most Reliable k-terminal Subgraph Problem......Page 191
Algorithms......Page 192
Experiments......Page 194
Test Set-Up......Page 195
Results......Page 196
Conclusions......Page 198
Introduction......Page 200
Representation of Graph Sequences......Page 201
Mining Frequent Transformation Subsequences......Page 203
Proposed Method: GTRACE2......Page 205
Experiment and Discussion......Page 208
Conclusion......Page 209
Introduction......Page 211
Related Work......Page 212
Incorporating Document-Level Constraints......Page 213
Algorithm Derivation......Page 214
Algorithm Correctness and Convergence......Page 215
Evaluation Metrics......Page 218
Clustering Results......Page 219
Conclusions and Future Work......Page 221
Introduction......Page 223
Related Work......Page 225
Rule Synthesizing Based on Both Items and Rules......Page 226
Item Based Clustering Algorithm......Page 227
Weighting Model for Rule Synthesizing......Page 229
Experimental Results on Similar Databases......Page 231
Experimental Results on Dissimilar Databases......Page 232
Scalability Assessment......Page 233
Conclusions......Page 234
Introduction......Page 236
Related Work......Page 237
FONT......Page 239
Experiments......Page 240
Performance Comparisons......Page 241
Conclusions......Page 242
Introduction......Page 244
Related Work......Page 245
In-Page Link-Structures......Page 246
Hierarchical Clustering......Page 247
Effectiveness......Page 248
Efficiency......Page 249
Conclusion......Page 250
References......Page 251
Introduction......Page 252
Number Suffix Arrays......Page 253
Number Clustering by Dirichlet Process Mixture Models......Page 255
Experiments: Synonym Extraction......Page 257
Results: Speed and Accuracy of the Algorithm......Page 258
References......Page 259
Introduction......Page 260
Related Work......Page 261
BK-FIRM......Page 262
Feature and Opinion Learner......Page 263
Opinion-Based Query Processor......Page 267
Evaluation Methods......Page 268
References......Page 270
Introduction......Page 271
Related Works......Page 273
Blog Representation and Query Generation......Page 274
Topic-Opinion Mixture Model......Page 275
Experiment Setup......Page 277
Experimental Results......Page 278
References......Page 281
Introduction......Page 283
Learning Paradigms and Related Work......Page 284
The Proposed Algorithm......Page 285
Substring-Group Feature Extracting......Page 286
Feature Selecting......Page 287
Evaluation Metrics......Page 288
Multilingual Characteristics......Page 289
Transductive Learning vs. Inductive Learning......Page 290
References......Page 292
Introduction......Page 294
Decentralised Frameworks......Page 295
Decentralisation in Structured Peer-to-Peer Networks......Page 296
Decentralisation in Unstructured Peer-to-Peer Networks......Page 298
Outline of Experiment......Page 299
Evaluation Method......Page 300
Results and Discussion......Page 301
Conclusion......Page 302
References......Page 303
Introduction......Page 305
Feature Selection Methods......Page 307
Mood Classification Results......Page 308
Mood Pattern Discovery......Page 310
Conclusion......Page 311
Introduction......Page 313
Interpretation of Evidence in a Document......Page 314
Definitions of 'Subjective Logic' and Our Conceptualization......Page 315
Data Processing......Page 317
Results......Page 318
References......Page 320
Introduction......Page 321
Related Work......Page 322
Perceptron Learning......Page 323
Hoeffding Perceptron Tree......Page 324
Comparative Experimental Evaluation......Page 325
Real-World Data......Page 326
Results......Page 328
References......Page 331
Introduction......Page 333
Background: Novel Class Detection with MineClass......Page 335
ActMiner Algorithm......Page 337
Data Selection for Labeling......Page 338
Experiments......Page 340
Baseline Approach......Page 341
Evaluation......Page 342
References......Page 346
Introduction......Page 347
Bayesian Classification and the Bayes Tree......Page 348
Machine Learning and Statistical Approaches......Page 349
Experiments......Page 352
Conclusion......Page 355
References......Page 356
Introduction......Page 357
Problem Statement......Page 358
Description of the Structure......Page 359
Updating the Structure......Page 360
Frequent Itemset Synthesis......Page 361
Synthetic Datasets......Page 362
Real Dataset......Page 363
References......Page 364
Introduction......Page 365
Related Work......Page 366
Representativeness of Samples......Page 367
Insertion of the Samples in CluStream......Page 369
Median Evaluation......Page 371
Classification Evaluation......Page 372
Runtime Evaluation......Page 373
Conclusion......Page 374
References......Page 375
Introduction......Page 376
Preliminaries......Page 377
WDTW: A Weighted Algorithm for Dynamic Time Warping......Page 378
Experimental Results......Page 382
References......Page 383
Introduction......Page 384
Kernels and Normalization Methods......Page 385
Kernel Definition and the Cosine Normalization......Page 386
Kernel Normalization of Order t......Page 387
Properties of Normalized Kernels as Similarity Indices......Page 389
Basic Properties of Kt......Page 390
Metric Properties of Kt......Page 391
Kernel PCA Based k-means Clustering......Page 392
Experiments Settings......Page 393
Conclusion and Future Work......Page 394
References......Page 395
Motivation and Related Work......Page 396
Primer of Graph Theory......Page 398
Kernels on Graphs......Page 399
Experiments......Page 402
Results and Analysis......Page 403
Conclusions and Future Work......Page 406
References......Page 407
Introduction......Page 408
Metric Learning......Page 410
Scoring Based (Dis-)Similarity Learning......Page 412
Experiments......Page 415
Conclusions and Future Work......Page 418
References......Page 419
Introduction......Page 420
Search Semantics and Results......Page 421
Clustering Algorithms......Page 423
GC: Graph-Based Clustering Algorithm for XML Keyword Search......Page 424
CC: Core-Driven Clustering Algorithm for XML Keyword Search......Page 425
LCC: Loosened Core-Driven Clustering Algorithm for XML Keyword Search......Page 427
Ranking of Results......Page 428
Experiments......Page 429
Conclusion......Page 430
Introduction......Page 432
Data Description......Page 433
Proposed Method......Page 434
Feature Extraction......Page 435
Laws and Observations......Page 436
CliqueStar......Page 438
Conclusion......Page 441
References......Page 442