This book constitutes the refereed proceedings of the 12th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2008, held in Osaka, Japan, in May 2008.
The 37 revised long papers, 40 revised full papers, and 36 revised short papers presented together with 1 keynote talk and 4 invited lectures were carefully reviewed and selected from 312 submissions. The papers present new ideas, original research results, and practical development experiences from all KDD-related areas including data mining, data warehousing, machine learning, databases, statistics, knowledge acquisition, automatic scientific discovery, data visualization, causal induction, and knowledge-based systems.
Author(s): Christos Faloutsos (auth.), Takashi Washio, Einoshin Suzuki, Kai Ming Ting, Akihiro Inokuchi (eds.)
Series: Lecture Notes in Computer Science 5012: Lecture Notes in Artificial Intelligence
Edition: 1
Publisher: Springer-Verlag Berlin Heidelberg
Year: 2008
Language: English
Pages: 1102
Tags: Artificial Intelligence (incl. Robotics); Data Mining and Knowledge Discovery; Information Storage and Retrieval; Probability and Statistics in Computer Science; Multimedia Information Systems; Computer Appl. in Administrative Data Proce
Front Matter
Graph Mining: Laws, Generators and Tools....Pages 1-1
Efficient Algorithms for Mining Frequent and Closed Patterns from Semi-structured Data....Pages 2-13
Supporting Creativity: Towards Associative Discovery of New Insights....Pages 14-25
Cost-Sensitive Classifier Evaluation Using Cost Curves....Pages 26-29
Prospective Scientific Methodology in Knowledge Society....Pages 30-39
SubClass: Classification of Multidimensional Noisy Data Using Subspace Clusters....Pages 40-52
Mining Quality-Aware Subspace Clusters....Pages 53-63
A Decremental Approach for Mining Frequent Itemsets from Uncertain Data....Pages 64-75
Multi-class Named Entity Recognition Via Bootstrapping with Dependency Tree-Based Patterns....Pages 76-87
Towards Region Discovery in Spatial Datasets....Pages 88-99
Accurate and Efficient Retrieval of Multimedia Time Series Data Under Uniform Scaling and Time Warping....Pages 100-111
Feature Construction Based on Closedness Properties Is Not That Simple....Pages 112-123
On Addressing Accuracy Concerns in Privacy Preserving Association Rule Mining....Pages 124-135
Privacy-Preserving Linear Fisher Discriminant Analysis....Pages 136-147
Unsupervised Change Analysis Using Supervised Learning....Pages 148-159
ANEMI: An Adaptive Neighborhood Expectation-Maximization Algorithm with Spatial Augmented Initialization....Pages 160-171
Minimum Variance Associations — Discovering Relationships in Numerical Data....Pages 172-183
An Efficient Unordered Tree Kernel and Its Application to Glycan Classification....Pages 184-195
Generation of Globally Relevant Continuous Features for Classification....Pages 196-208
Mining Bulletin Board Systems Using Community Generation....Pages 209-221
Extreme Support Vector Machine Classifier....Pages 222-233
LCM over ZBDDs: Fast Generation of Very Large-Scale Frequent Itemsets Using a Compact Graph-Based Representation....Pages 234-246
Unusual Pattern Detection in High Dimensions....Pages 247-259
Person Name Disambiguation in Web Pages Using Social Network, Compound Words and Latent Topics....Pages 260-271
Mining Correlated Subgraphs in Graph Databases....Pages 272-283
A Minimal Description Length Scheme for Polynomial Regression....Pages 284-295
Handling Numeric Attributes in Hoeffding Trees....Pages 296-307
Scaling Record Linkage to Non-uniform Distributed Class Sizes....Pages 308-319
Large-Scale k-Means Clustering with User-Centric Privacy Preservation....Pages 320-332
Semi-Supervised Local Fisher Discriminant Analysis for Dimensionality Reduction....Pages 333-344
An Efficient Algorithm for Finding Similar Short Substrings from Large Scale String Data....Pages 345-356
Ambiguous Frequent Itemset Mining and Polynomial Delay Enumeration....Pages 357-368
Characteristic-Based Descriptors for Motion Sequence Recognition....Pages 369-380
Protecting Privacy in Incremental Maintenance for Distributed Association Rule Mining....Pages 381-392
SEM: Mining Spatial Events from the Web....Pages 393-404
BOAI: Fast Alternating Decision Tree Induction Based on Bottom-Up Evaluation....Pages 405-416
Feature Selection by Nonparametric Bayes Error Minimization....Pages 417-428
A Framework for Modeling Positive Class Expansion with Single Snapshot....Pages 429-440
A Decomposition Algorithm for Learning Bayesian Network Structures from Data....Pages 441-453
Learning Classification Rules for Multiple Target Attributes....Pages 454-465
A Mixture Model for Expert Finding....Pages 466-478
On Privacy in Time Series Data Mining....Pages 479-493
Exploiting Propositionalization Based on Random Relational Rules for Semi-supervised Learning....Pages 494-502
On Discrete Data Clustering....Pages 503-510
Automatic Training Example Selection for Scalable Unsupervised Record Linkage....Pages 511-518
Analyzing PETs on Imbalanced Datasets When Training and Testing Class Distributions Differ....Pages 519-526
Improving the Robustness to Outliers of Mixtures of Probabilistic PCAs....Pages 527-535
Exploratory Hot Spot Profile Analysis Using Interactive Visual Drill-Down Self-Organizing Maps....Pages 536-543
Maintaining Optimal Multi-way Splits for Numerical Attributes in Data Streams....Pages 544-553
Efficient Mining of High Utility Itemsets from Large Datasets....Pages 554-561
Tradeoff Analysis of Different Markov Blanket Local Learning Approaches....Pages 562-571
Forecasting Urban Air Pollution Using HMM-Fuzzy Model....Pages 572-581
Relational Pattern Mining Based on Equivalent Classes of Properties Extracted from Samples....Pages 582-591
Evaluating Standard Techniques for Implicit Diversity....Pages 592-599
A Simple Characterization on Serially Constructible Episodes....Pages 600-607
Bootstrap Based Pattern Selection for Support Vector Regression....Pages 608-615
Tracking Topic Evolution in On-Line Postings: 2006 IBM Innovation Jam Data....Pages 616-625
PAID: Packet Analysis for Anomaly Intrusion Detection....Pages 626-633
A Comparison of Different Off-Centered Entropies to Deal with Class Imbalance for Decision Trees....Pages 634-643
FIsViz: A Frequent Itemset Visualizer....Pages 644-652
A Tree-Based Approach for Frequent Pattern Mining from Uncertain Data....Pages 653-661
Connectivity Based Stream Clustering Using Localised Density Exemplars....Pages 662-672
Learning User Purchase Intent from User-Centric Data....Pages 673-680
Query Expansion for the Language Modelling Framework Using the Naïve Bayes Assumption....Pages 681-688
Fast Online Estimation of the Joint Probability Distribution....Pages 689-696
Fast k Most Similar Neighbor Classifier for Mixed Data Based on Approximating and Eliminating....Pages 697-704
Entity Network Prediction Using Multitype Topic Models....Pages 705-714
Using Supervised and Unsupervised Techniques to Determine Groups of Patients with Different Doctor-Patient Stability....Pages 715-722
Local Projection in Jumping Emerging Patterns Discovery in Transaction Databases....Pages 723-730
Applying Latent Semantic Indexing in Frequent Itemset Mining for Document Relation Discovery....Pages 731-738
G-TREACLE: A New Grid-Based and Tree-Alike Pattern Clustering Technique for Large Databases....Pages 739-748
A Clustering-Oriented Star Coordinate Translation Method for Reliable Clustering Parameterization....Pages 749-758
Constrained Clustering for Gene Expression Data Mining....Pages 759-766
Concept Lattice–Based Mutation Control for Reactive Motifs Discovery....Pages 767-776
Mining a Complete Set of Both Positive and Negative Association Rules from Large Databases....Pages 777-784
Designing a System for a Process Parameter Determined through Modified PSO and Fuzzy Neural Network....Pages 785-794
Data-Aware Clustering Hierarchy for Wireless Sensor Networks....Pages 795-802
A More Topologically Stable Locally Linear Embedding Algorithm Based on R*-Tree....Pages 803-812
Sparse Kernel-Based Feature Weighting....Pages 813-820
Term Committee Based Event Identification within News Topics....Pages 821-829
Locally Linear Online Mapping for Mining Low-Dimensional Data Manifolds....Pages 830-838
A Creditable Subspace Labeling Method Based on D-S Evidence Theory....Pages 839-848
Discovering New Orders of the Chemical Elements through Genetic Algorithms....Pages 849-857
What Is Frequent in a Single Graph?....Pages 858-863
A Cluster-Based Genetic-Fuzzy Mining Approach for Items with Multiple Minimum Supports....Pages 864-869
A Selective Classifier for Incomplete Data....Pages 870-876
Detecting Near-Duplicates in Large-Scale Short Text Databases....Pages 877-883
Customer Churn Time Prediction in Mobile Telecommunication Industry Using Ordinal Regression....Pages 884-889
Rule Extraction with Rough-Fuzzy Hybridization Method....Pages 890-895
I/O Scalable Bregman Co-clustering....Pages 896-903
Jumping Emerging Patterns with Occurrence Count in Image Classification....Pages 904-909
Mining Non-coincidental Rules without a User Defined Support Threshold....Pages 910-915
Transaction Clustering Using a Seeds Based Approach....Pages 916-922
Using Ontology-Based User Preferences to Aggregate Rank Lists in Web Search....Pages 923-931
The Application of Echo State Network in Stock Data Mining....Pages 932-937
Text Categorization of Multilingual Web Pages in Specific Domain....Pages 938-944
Efficient Joint Clustering Algorithms in Optimization and Geography Domains....Pages 945-950
Active Learning with Misclassification Sampling Using Diverse Ensembles Enhanced by Unlabeled Instances....Pages 951-957
A New Model for Image Annotation....Pages 958-963
Unmixed Spectrum Clustering for Template Composition in Lung Sound Classification....Pages 964-969
Forward Semi-supervised Feature Selection....Pages 970-976
Automatic Extraction of Basis Expressions That Indicate Economic Trends....Pages 977-984
A New Framework for Taxonomy Discovery from Text....Pages 985-991
R-Map: Mapping Categorical Data for Clustering and Visualization Based on Reference Sets....Pages 992-998
Mining Changes in Patent Trends for Competitive Intelligence....Pages 999-1005
Seeing Several Stars: A Rating Inference Task for a Document Containing Several Evaluation Criteria....Pages 1006-1014
Structure-Based Hierarchical Transformations for Interactive Visual Exploration of Social Networks....Pages 1015-1021
CP-Tree: A Tree Structure for Single-Pass Frequent Pattern Mining....Pages 1022-1027
Combining Context and Existing Knowledge When Recognizing Biological Entities – Early Results....Pages 1028-1034
Semantic Video Annotation by Mining Association Patterns from Visual and Speech Features....Pages 1035-1041
Cell-Based Outlier Detection Algorithm: A Fast Outlier Detection Algorithm for Large Datasets....Pages 1042-1048
Fighting WebSpam: Detecting Spam on the Graph Via Content and Link Features....Pages 1049-1055
A Framework for Discovering Spatio-temporal Cohesive Networks....Pages 1056-1061
Efficient Mining of Minimal Distinguishing Subgraph Patterns from Graph Databases....Pages 1062-1068
Combined Association Rule Mining....Pages 1069-1074
Enriching WordNet with Folksonomies....Pages 1075-1080
A New Credit Scoring Method Based on Rough Sets and Decision Tree....Pages 1081-1089
Analyzing the Propagation of Influence and Concept Evolution in Enterprise Social Networks through Centrality and Latent Semantic Analysis....Pages 1090-1098
Back Matter