This volume contains the proceedings of the International Conference on Advanced Data Mining and Applications (ADMA 2009), held in Beijing, China, during August 17–19, 2009. We are pleased to have a very strong program. Acceptance into the conference proceedings was extremely competitive. From the 322 submissions from 27 countries and regions, the Program Committee selected 34 full papers and 47 short papers for presentation at the conference and inclusion in the proceedings. The contributed papers cover a wide range of data mining topics and a diverse spectrum of interesting applications. The Program Committee worked very hard to select these papers through a rigorous review process and extensive discussion, and finally composed a diverse and exciting program for ADMA 2009.

An important feature of the main program was the truly outstanding keynote speakers program. Edward Y. Chang, Director of Research, Google China, gave a talk titled "Confucius and 'Its' Intelligent Disciples". Being right at the forefront of data mining applications to the world's largest knowledge and data base, the Web, Dr. Chang described how Google's Knowledge Search product helps to improve the scalability of machine learning for Web-scale applications. Charles X. Ling, a seasoned researcher in data mining from the University of Western Ontario, Canada, talked about his innovative applications of data mining and artificial intelligence to gifted child education.
Author(s): Edward Y. Chang (auth.), Ronghuai Huang, Qiang Yang, Jian Pei, João Gama, Xiaofeng Meng, Xue Li (eds.)
Series: Lecture Notes in Computer Science 5678: Lecture Notes in Artificial Intelligence
Edition: 1
Publisher: Springer-Verlag Berlin Heidelberg
Year: 2009
Language: English
Pages: 807
Tags: Data Mining and Knowledge Discovery; Information Storage and Retrieval; Information Systems and Communication Service; Information Systems Applications (incl. Internet); Pattern Recognition; Artificial Intelligence (incl. Robotics)
Front Matter
Confucius and “Its” Intelligent Disciples....Pages 1-1
From Machine Learning to Child Learning....Pages 2-2
Sensitivity Based Generalization Error for Supervised Learning Problems with Application in Feature Selection....Pages 3-3
Data Mining in Financial Markets....Pages 4-4
Cluster Analysis Based on the Central Tendency Deviation Principle....Pages 5-18
A Parallel Hierarchical Agglomerative Clustering Technique for Billingual Corpora Based on Reduced Terms with Automatic Weight Optimization....Pages 19-30
Automatically Identifying Tag Types....Pages 31-42
Social Knowledge-Driven Music Hit Prediction....Pages 43-54
Closed Non Derivable Data Cubes Based on Non Derivable Minimal Generators....Pages 55-66
Indexing the Function: An Efficient Algorithm for Multi-dimensional Search with Expensive Distance Functions....Pages 67-78
Anti-germ Performance Prediction for Detergents Based on Elman Network on Small Data Sets....Pages 79-90
A Neighborhood Search Method for Link-Based Tag Clustering....Pages 91-103
Mining the Structure and Evolution of the Airport Network of China over the Past Twenty Years....Pages 104-115
Mining Class Contrast Functions by Gene Expression Programming....Pages 116-127
McSOM: Minimal Coloring of Self-Organizing Map....Pages 128-139
Chinese Blog Clustering by Hidden Sentiment Factors....Pages 140-151
Semi Supervised Image Spam Hunter: A Regularized Discriminant EM Approach....Pages 152-164
Collaborative Filtering Recommendation Algorithm Using Dynamic Similar Neighbor Probability....Pages 165-174
Calculating Similarity Efficiently in a Small World....Pages 175-187
A Framework for Multi-Objective Clustering and Its Application to Co-Location Mining....Pages 188-199
Feature Selection in Marketing Applications....Pages 200-208
Instance Selection by Border Sampling in Multi-class Domains....Pages 209-221
Virus Propagation and Immunization Strategies in Email Networks....Pages 222-233
Semi-supervised Discriminant Analysis Based on Dependence Estimation....Pages 234-245
Nearest Neighbor Tour Circuit Encryption Algorithm Based Random Isomap Reduction....Pages 246-252
Bayesian Multi-topic Microarray Analysis with Hyperparameter Reestimation....Pages 253-264
Discovery of Correlated Sequential Subgraphs from a Sequence of Graphs....Pages 265-276
Building a Text Classifier by a Keyword and Wikipedia Knowledge....Pages 277-287
Discovery of Migration Habitats and Routes of Wild Bird Species by Clustering and Association Analysis....Pages 288-301
GOD-CS: A New Grid-Oriented Dissection Clustering Scheme for Large Databases....Pages 302-313
Study on Ensemble Classification Methods towards Spam Filtering....Pages 314-325
Crawling Deep Web Using a New Set Covering Algorithm....Pages 326-337
A Hybrid Statistical Data Pre-processing Approach for Language-Independent Text Classification....Pages 338-349
A Potential-Based Node Selection Strategy for Influence Maximization in a Social Network....Pages 350-361
A Novel Component-Based Model and Ranking Strategy in Constrained Evolutionary Optimization....Pages 362-373
A Semi-supervised Topic-Driven Approach for Clustering Textual Answers to Survey Questions....Pages 374-385
An Information-Theoretic Approach for Multi-task Learning....Pages 386-396
Online New Event Detection Based on IPLSA....Pages 397-408
Discovering Knowledge from Multi-relational Data Based on Information Retrieval Theory....Pages 409-416
A Secure Protocol to Maintain Data Privacy in Data Mining....Pages 417-426
Transfer Learning with Data Edit....Pages 427-434
Exploiting Temporal Authors Interests via Temporal-Author-Topic Modeling....Pages 435-443
Mining User Position Log for Construction of Personalized Activity Map....Pages 444-452
A Multi-Strategy Approach to KNN and LARM on Small and Incrementally Induced Prediction Knowledge....Pages 453-460
Predicting Click Rates by Consistent Bipartite Spectral Graph Model....Pages 461-468
Automating Gene Expression Annotation for Mouse Embryo....Pages 469-478
Structure Correlation in Mobile Call Networks....Pages 479-486
Initialization of the Neighborhood EM Algorithm for Spatial Clustering....Pages 487-495
Classification Techniques for Talent Forecasting in Human Resource Management....Pages 496-503
A Combination Classification Algorithm Based on Outlier Detection and C4.5....Pages 504-511
A Local Density Approach for Unsupervised Feature Discretization....Pages 512-519
A Hybrid Method of Multidimensional Scaling and Clustering for Determining Genetic Influence on Phenotypes....Pages 520-527
Mining Frequent Patterns from Network Data Flow....Pages 528-535
Several SVM Ensemble Methods Integrated with Under-Sampling for Imbalanced Data Learning....Pages 536-544
Crawling and Extracting Process Data from the Web....Pages 545-552
Asymmetric Feature Selection for BGP Abnormal Events Detection....Pages 553-560
Analysis and Experimentation of Grid-Based Data Mining with Dynamic Load Balancing....Pages 561-568
Incremental Document Clustering Based on Graph Model....Pages 569-576
Evaluating the Impact of Missing Data Imputation....Pages 577-586
Discovery of Significant Classification Rules from Incrementally Inducted Decision Tree Ensemble for Diagnosis of Disease....Pages 587-594
Application of the Cross-Entropy Method to Dual Lagrange Support Vector Machine....Pages 595-602
A Predictive Analysis on Medical Data Based on Outlier Detection Method Using Non-Reduct Computation....Pages 603-610
VisNetMiner: An Integration Tool for Visualization and Analysis of Networks....Pages 611-618
Anomaly Detection Using Time Index Differences of Identical Symbols with and without Training Data....Pages 619-626
An Outlier Detection Algorithm Based on Arbitrary Shape Clustering....Pages 627-635
A Theory of Kernel Extreme Energy Difference for Feature Extraction of EEG Signals....Pages 636-643
Semantic Based Text Classification of Patent Documents to a User-Defined Taxonomy....Pages 644-651
Mining Compressed Repetitive Gapped Sequential Patterns Efficiently....Pages 652-660
Mining Candlesticks Patterns on Stock Series: A Fuzzy Logic Approach....Pages 661-670
JCCM: Joint Cluster Communities on Attribute and Relationship Data in Social Networks....Pages 671-679
Similarity Evaluation of XML Documents Based on Weighted Element Tree Model....Pages 680-687
Quantitative Comparison of Similarity Measure and Entropy for Fuzzy Sets....Pages 688-695
Investigation of Damage Identification of 16Mn Steel Based on Artificial Neural Networks and Data Fusion Techniques in Tensile Test....Pages 696-703
OFFD: Optimal Flexible Frequency Discretization for Naïve Bayes Classification....Pages 704-712
Handling Class Imbalance Problems via Weighted BP Algorithm....Pages 713-720
Orthogonal Centroid Locally Linear Embedding for Classification....Pages 721-728
CCBitmaps: A Space-Time Efficient Index Structure for OLAP....Pages 729-735
Rewriting XPath Expressions Depending on Path Summary....Pages 736-744
Combining Statistical Machine Learning Models to Extract Keywords from Chinese Documents....Pages 745-754
Privacy-Preserving Distributed k-Nearest Neighbor Mining on Horizontally Partitioned Multi-Party Data....Pages 755-762
Alleviating Cold-Start Problem by Using Implicit Feedback....Pages 763-771
Learning from Video Game: A Study of Video Game Play on Problem-Solving....Pages 772-779
Image Classification Approach Based on Manifold Learning in Web Image Mining....Pages 780-787
Social Influence and Role Analysis Based on Community Structure in Social Network....Pages 788-795
Feature Selection Method Combined Optimized Document Frequency with Improved RBF Network....Pages 796-803
Back Matter