Advances in Knowledge Discovery and Data Mining: 10th Pacific-Asia Conference, PAKDD 2006, Singapore, April 9-12, 2006. Proceedings

The Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) is a leading international conference in the area of data mining and knowledge discovery. This year marks the tenth anniversary of the successful annual series of PAKDD conferences held in the Asia Pacific region. It was with pleasure that we hosted PAKDD 2006 in Singapore again, since the inaugural PAKDD conference was held in Singapore in 1997. PAKDD 2006 continues its tradition of providing an international forum for researchers and industry practitioners to share their new ideas, original research results and practical development experiences from all aspects of KDD and data mining, including data cleaning, data warehousing, data mining techniques, knowledge visualization, and data mining applications. This year, we received 501 paper submissions from 38 countries and regions in Asia, Australasia, North America and Europe, of which we accepted 67 (13.4%) papers as regular papers and 33 (6.6%) papers as short papers. The distribution of the accepted papers was as follows: USA (17%), China (16%), Taiwan (10%), Australia (10%), Japan (7%), Korea (7%), Germany (6%), Canada (5%), Hong Kong (3%), Singapore (3%), New Zealand (3%), France (3%), UK (2%), and the rest from various countries in the Asia Pacific region.

Author(s): David J. Hand (auth.), Wee-Keong Ng, Masaru Kitsuregawa, Jianzhong Li, Kuiyu Chang (eds.)
Series: Lecture Notes in Computer Science 3918: Lecture Notes in Artificial Intelligence
Edition: 1
Publisher: Springer-Verlag Berlin Heidelberg
Year: 2006

Language: English
Pages: 879
Tags: Artificial Intelligence (incl. Robotics); Database Management; Information Storage and Retrieval; Probability and Statistics in Computer Science; Multimedia Information Systems; Computer Appl. in Administrative Data Processing

Front Matter
Protection or Privacy? Data Mining and Personal Data....Pages 1-10
The Changing Face of Web Search....Pages 11-11
Data Mining for Surveillance Applications....Pages 12-14
A Multiclass Classification Method Based on Output Design....Pages 15-19
Regularized Semi-supervised Classification on Manifold....Pages 20-29
Similarity-Based Sparse Feature Extraction Using Local Manifold Learning....Pages 30-34
Generalized Conditional Entropy and a Metric Splitting Criterion for Decision Trees....Pages 35-44
RNBL-MN: A Recursive Naive Bayes Learner for Sequence Classification....Pages 45-54
TRIPPER: Rule Learning Using Taxonomies....Pages 55-59
Using Weighted Nearest Neighbor to Benefit from Unlabeled Data....Pages 60-69
Constructive Meta-level Feature Selection Method Based on Method Repositories....Pages 70-80
Variable Randomness in Decision Tree Ensembles....Pages 81-90
Further Improving Emerging Pattern Based Classifiers Via Bagging....Pages 91-96
Improving on Bagging with Input Smearing....Pages 97-106
Boosting Prediction Accuracy on Imbalanced Datasets with SVM Ensembles....Pages 107-118
DeLi-Clu: Boosting Robustness, Completeness, Usability, and Efficiency of Hierarchical Clustering by a Closest Pair Ranking....Pages 119-128
Iterative Clustering Analysis for Grouping Missing Data in Gene Expression Profiles....Pages 129-138
An EM-Approach for Clustering Multi-Instance Objects....Pages 139-148
Mining Maximal Correlated Member Clusters in High Dimensional Database....Pages 149-159
Hierarchical Clustering Based on Mathematical Optimization....Pages 160-173
Clustering Multi-represented Objects Using Combination Trees....Pages 174-178
Parallel Density-Based Clustering of Complex Objects....Pages 179-188
Neighborhood Density Method for Selecting Initial Cluster Centers in K-Means Clustering....Pages 189-198
Uncertain Data Mining: An Example in Clustering Location Data....Pages 199-204
Parallel Randomized Support Vector Machine....Pages 205-214
ε-Tube Based Pattern Selection for Support Vector Machines....Pages 215-224
Self-adaptive Two-Phase Support Vector Clustering for Multi-Relational Data Mining....Pages 225-229
One-Class Support Vector Machines for Recommendation Tasks....Pages 230-239
Heterogeneous Information Integration in Hierarchical Text Classification....Pages 240-249
FISA: Feature-Based Instance Selection for Imbalanced Text Classification....Pages 250-254
Dynamic Category Profiling for Text Filtering and Classification....Pages 255-264
Detecting Citation Types Using Finite-State Machines....Pages 265-274
A Systematic Study of Parameter Correlations in Large Scale Duplicate Document Detection....Pages 275-284
Comparison of Documents Classification Techniques to Classify Medical Reports....Pages 285-291
XCLS: A Fast and Effective Clustering Algorithm for Heterogenous XML Documents....Pages 292-302
Clustering Large Collection of Biomedical Literature Based on Ontology-Enriched Bipartite Graph Representation and Mutual Refinement Strategy....Pages 303-312
Level-Biased Statistics in the Hierarchical Structure of the Web....Pages 313-322
Cleopatra: Evolutionary Pattern-Based Clustering of Web Usage Data....Pages 323-333
Extracting and Summarizing Hot Item Features Across Different Auction Web Sites....Pages 334-345
Clustering Web Sessions by Levels of Page Similarity....Pages 346-350
iWed: An Integrated Multigraph Cut-Based Approach for Detecting Events from a Website....Pages 351-360
Enhancing Duplicate Collection Detection Through Replica Boundary Discovery....Pages 361-370
Summarization and Visualization of Communication Patterns in a Large-Scale Social Network....Pages 371-379
Patterns of Influence in a Recommendation Network....Pages 380-389
Constructing Decision Trees for Graph-Structured Data by Chunkingless Graph-Based Induction....Pages 390-399
Combining Smooth Graphs with Semi-supervised Classification....Pages 400-409
Network Data Mining: Discovering Patterns of Interaction Between Attributes....Pages 410-414
SGPM: Static Group Pattern Mining Using Apriori-Like Sliding Window....Pages 415-424
Mining Temporal Indirect Associations....Pages 425-434
Mining Top-K Frequent Closed Itemsets Is Not in APX....Pages 435-439
Quality-Aware Association Rule Mining....Pages 440-449
IMB3-Miner: Mining Induced/Embedded Subtrees by Constraining the Level of Embedding....Pages 450-461
Maintaining Frequent Itemsets over High-Speed Data Streams....Pages 462-467
Generalized Disjunction-Free Representation of Frequents Patterns with at Most k Negations....Pages 468-472
Mining Interesting Imperfectly Sporadic Rules....Pages 473-482
Improved Negative-Border Online Mining Approaches....Pages 483-492
Association-Based Dissimilarity Measures for Categorical Data: Limitation and Improvement....Pages 493-498
Is Frequency Enough for Decision Makers to Make Decisions?....Pages 499-503
Ramp: High Performance Frequent Itemset Mining with Efficient Bit-Vector Projection Technique....Pages 504-508
Evaluating a Rule Evaluation Support Method Based on Objective Rule Evaluation Indices....Pages 509-519
Scoring Method for Tumor Prediction from Microarray Data Using an Evolutionary Fuzzy Classifier....Pages 520-529
Efficient Discovery of Structural Motifs from Protein Sequences with Combination of Flexible Intra- and Inter-block Gap Constraints....Pages 530-539
Finding Consensus Patterns in Very Scarce Biosequence Samples from Their Minimal Multiple Generalizations....Pages 540-545
Kernels on Lists and Sets over Relational Algebra: An Application to Classification of Protein Fingerprints....Pages 546-551
Mining Quantitative Maximal Hyperclique Patterns: A Summary of Results....Pages 552-556
A Nonparametric Outlier Detection for Effectively Discovering Top-N Outliers from Engineering Data....Pages 557-566
A Fast Greedy Algorithm for Outlier Mining....Pages 567-576
Ranking Outliers Using Symmetric Neighborhood Relationship....Pages 577-593
Construction of Finite Automata for Intrusion Detection from System Call Sequences by Genetic Algorithms....Pages 594-602
An Adaptive Intrusion Detection Algorithm Based on Clustering and Kernel-Method....Pages 603-610
Weighted Intra-transactional Rule Mining for Database Intrusion Detection....Pages 611-620
On Robust and Effective K-Anonymity in Large Databases....Pages 621-636
Achieving Private Recommendations Using Randomized Response Techniques....Pages 637-646
Privacy-Preserving SVM Classification on Vertically Partitioned Data....Pages 647-656
Data Mining Using Relational Database Management Systems....Pages 657-667
Bias-Free Hypothesis Evaluation in Multirelational Domains....Pages 668-672
Enhanced DB-Subdue: Supporting Subtle Aspects of Graph Mining Using a Relational Approach....Pages 673-678
Multimedia Semantics Integration Using Linguistic Model....Pages 679-688
A Novel Indexing Approach for Efficient and Fast Similarity Search of Captured Motions....Pages 689-698
Mining Frequent Spatial Patterns in Image Databases....Pages 699-703
Image Classification Via LZ78 Based String Kernel: A Comparative Study....Pages 704-712
Distributed Pattern Discovery in Multiple Streams....Pages 713-718
COMET: Event-Driven Clustering over Multiple Evolving Streams....Pages 719-723
Variable Support Mining of Frequent Itemsets over Data Streams Using Synopsis Vectors....Pages 724-728
Hardware Enhanced Mining for Association Rules....Pages 729-738
A Single Index Approach for Time-Series Subsequence Matching That Supports Moving Average Transform of Arbitrary Order....Pages 739-749
Efficient Mining of Emerging Events in a Dynamic Spatiotemporal Environment....Pages 750-754
A Multi-Hierarchical Representation for Similarity Measurement of Time Series....Pages 755-764
Multistep-Ahead Time Series Prediction....Pages 765-774
Sequential Pattern Mining with Time Intervals....Pages 775-779
A Wavelet Analysis Based Data Processing for Time Series of Data Mining Predicting....Pages 780-789
Intelligent Particle Swarm Optimization in Multi-objective Problems....Pages 790-800
Hidden Space Principal Component Analysis....Pages 801-805
Neighbor Line-Based Locally Linear Embedding....Pages 806-815
Predicting Rare Extreme Values....Pages 816-820
Domain-Driven Actionable Knowledge Discovery in the Real World....Pages 821-830
Evaluation of Attribute-Aware Recommender System Algorithms on Data with Varying Characteristics....Pages 831-840
An Intelligent System Based on Kernel Methods for Crop Yield Prediction....Pages 841-846
A Machine Learning Application for Human Resource Data Mining Problem....Pages 847-856
Towards Automated Design of Large-Scale Circuits by Combining Evolutionary Design with Data Mining....Pages 857-866
Mining Unexpected Associations for Signalling Potential Adverse Drug Reactions from Administrative Health Databases....Pages 867-876
Back Matter