For more than a decade, data warehousing and knowledge discovery have been key technologies for decision-making in companies. Since 1999, owing to the importance of these technologies in academia and industry, the Data Warehousing and Knowledge Discovery (DaWaK) conference series has become an international forum for both practitioners and researchers to share their findings, publish their relevant results and debate in depth research issues and experiences on data warehousing and knowledge discovery systems and applications. The 8th International Conference on Data Warehousing and Knowledge Discovery (DaWaK 2006) continued the series of successful conferences dedicated to these topics. In this edition, DaWaK aimed to provide a sound balance between data warehousing and knowledge discovery. In data warehousing, the papers cover different research problems, such as advanced techniques in OLAP visualization and multidimensional modelling, innovation of ETL processes and integration problems, materialized view optimization, very large data warehouse processing, integration of data warehouses and data mining applications, and data warehousing for real-life applications, e.g., medical applications and spatial applications. In data mining and knowledge discovery, the papers focus on a variety of topics, including data stream analysis and mining, ontology-based mining techniques, mining frequent itemsets, clustering, association and classification, and patterns. These proceedings contain the technical papers which were selected for presentation at the conference. We received 198 abstracts, and finally received 146 papers from 36 countries.
Author(s): Christian Thomsen, Torben Bach Pedersen (auth.), A Min Tjoa, Juan Trujillo (eds.)
Series: Lecture Notes in Computer Science 4081 : Information Systems and Applications, incl. Internet/Web, and HCI
Edition: 1
Publisher: Springer-Verlag Berlin Heidelberg
Year: 2006
Language: English
Pages: 582
Tags: Database Management; Information Storage and Retrieval; Information Systems Applications (incl. Internet); Computer Communication Networks; Artificial Intelligence (incl. Robotics); Business Information Systems
Front Matter....Pages -
ETLDiff: A Semi-automatic Framework for Regression Test of ETL Software....Pages 1-12
Applying Transformations to Model Driven Data Warehouses....Pages 13-22
Bulk Loading a Linear Hash File....Pages 23-32
Dynamic View Selection for OLAP....Pages 33-44
Preview: Optimizing View Materialization Cost in Spatial Data Warehouses....Pages 45-54
Preprocessing for Fast Refreshing Materialized Views in DB2....Pages 55-64
A Multiversion-Based Multidimensional Model....Pages 65-74
Towards Multidimensional Requirement Design....Pages 75-84
Multidimensional Design by Examples....Pages 85-94
Extending Visual OLAP for Handling Irregular Dimensional Hierarchies....Pages 95-105
A Hierarchy-Driven Compression Technique for Advanced OLAP Visualization of Multidimensional Data Cubes....Pages 106-119
Analysing Multi-dimensional Data Across Autonomous Data Warehouses....Pages 120-133
What Time Is It in the Data Warehouse?....Pages 134-144
Computing Iceberg Quotient Cubes with Bounding....Pages 145-154
An Effective Algorithm to Extract Dense Sub-cubes from a Large Sparse Cube....Pages 155-164
On the Computation of Maximal-Correlated Cuboids Cells....Pages 165-174
Warehousing Dynamic XML Documents....Pages 175-184
Integrating Different Grain Levels in a Medical Data Warehouse Federation....Pages 185-194
A Versioning Management Model for Ontology-Based Data Warehouses....Pages 195-206
Data Warehouses in Grids with High QoS....Pages 207-217
Mining Direct Marketing Data by Ensembles of Weak Learners and Rough Set Methods....Pages 218-227
Efficient Mining of Dissociation Rules....Pages 228-237
Optimized Rule Mining Through a Unified Framework for Interestingness Measures....Pages 238-247
An Information-Theoretic Framework for Process Structure and Data Mining....Pages 248-259
Mixed Decision Trees: An Evolutionary Approach....Pages 260-269
ITER: An Algorithm for Predictive Regression Rule Extraction....Pages 270-279
COBRA: Closed Sequential Pattern Mining Using Bi-phase Reduction Approach....Pages 280-291
A Greedy Approach to Concurrent Processing of Frequent Itemset Queries....Pages 292-301
Two New Techniques for Hiding Sensitive Itemsets and Their Empirical Evaluation....Pages 302-311
EStream: Online Mining of Frequent Sets with Precise Error Guarantee....Pages 312-321
Granularity Adaptive Density Estimation and on Demand Clustering of Concept-Drifting Data Streams....Pages 322-331
Classification of Hidden Network Streams....Pages 332-341
Adaptive Load Shedding for Mining Frequent Patterns from Data Streams....Pages 342-351
An Approximate Approach for Mining Recently Frequent Itemsets from Data Streams....Pages 352-362
Learning Classifiers from Distributed, Ontology-Extended Data Sources....Pages 363-373
A Coherent Biomedical Literature Clustering and Summarization Approach Through Ontology-Enriched Graphical Representations....Pages 374-383
Automatic Extraction for Creating a Lexical Repository of Abbreviations in the Biomedical Literature....Pages 384-393
Priority-Based k-Anonymity Accomplished by Weighted Generalisation Structures....Pages 394-404
Achieving k-Anonymity by Clustering in Attribute Hierarchical Structures....Pages 405-416
Calculation of Density-Based Clustering Parameters Supported with Distributed Processing....Pages 417-426
Cluster-Based Sampling Approaches to Imbalanced Data Distributions....Pages 427-436
Efficient Mining of Large Maximal Bicliques....Pages 437-448
Automatic Image Annotation by Mining the Web....Pages 449-458
Privacy Preserving Spatio-Temporal Clustering on Horizontally Partitioned Data....Pages 459-468
Discovering Semantic Sibling Associations from Web Documents with XTREEM-SP....Pages 469-480
Difference Detection Between Two Contrast Sets....Pages 481-490
EGEA: A New Hybrid Approach Towards Extracting Reduced Generic Association Rule Set (Application to AML Blood Cancer Therapy)....Pages 491-502
AISS: An Index for Non-timestamped Set Subsequence Queries....Pages 503-512
A Method for Feature Selection on Microarray Data Using Support Vector Machine....Pages 513-523
Providing Persistence for Sensor Data Streams by Remote WAL....Pages 524-533
Support Vector Machine Approach for Fast Classification....Pages 534-543
Document Representations for Classification of Short Web-Page Descriptions....Pages 544-553
GARC: A New Associative Classification Approach....Pages 554-565
Conceptual Modeling for Classification Mining in Data Warehouses....Pages 566-575
Back Matter....Pages -