The Encyclopedia of Data Warehousing and Mining provides a comprehensive, critical, and descriptive examination of concepts, issues, trends, and challenges in the rapidly expanding field of data warehousing and mining (DWM). It brings together more than 350 contributors from 32 countries and includes 1,800 terms and definitions and more than 4,400 references. This authoritative publication offers in-depth coverage of the evolution, theories, methodologies, functionalities, and applications of DWM in interdisciplinary areas such as healthcare informatics, artificial intelligence, financial modeling, and applied statistics, making it a single source of knowledge and the latest discoveries in the field.
Author(s): John Wang
Edition: 2nd
Publisher: Information Science Reference
Year: 2008
Language: English
Pages: 2542
Title Page......Page 2
Editorial Advisory Board......Page 4
List of Contributors......Page 5
Contents......Page 17
Contents by Topic......Page 37
Foreword......Page 63
Preface......Page 64
About the Editor......Page 81
Action Rules Mining......Page 82
Active Learning with Multiple Views......Page 87
Adaptive Web Presence and Evolution through Web Log Analysis......Page 93
Aligning the Warehouse and the Web......Page 99
Analytical Competition for Managing Customer Relations......Page 106
Analytical Knowledge Warehousing for Business Intelligence......Page 112
Anomaly Detection for Inferring Social Structure......Page 120
The Application of Data-Mining to Recommender Systems......Page 126
Applications of Kernel Methods......Page 132
Architecture for Symbolic Object Warehouse......Page 139
Association Bundle Identification......Page 147
Association Rule Hiding Methods......Page 152
Association Rule Mining......Page 157
On Association Rule Mining for the QSAR Problem......Page 164
Association Rule Mining of Relational Data......Page 168
Association Rules and Statistics......Page 175
Audio and Speech Processing for Data Mining......Page 179
Audio Indexing......Page 185
An Automatic Data Warehouse Conceptual Design Approach......Page 191
Automatic Genre-Specific Text Classification......Page 201
Automatic Music Timbre Indexing......Page 209
A Bayesian Based Machine Learning Application to Task Analysis......Page 214
Behavioral Pattern-Based Customer Segmentation......Page 221
Best Practices in Data Warehousing......Page 227
Bibliomining for Library Decision-Making......Page 234
Bioinformatics and Computational Biology......Page 241
Biological Image Analysis via Matrix Approximation......Page 247
Bitmap Join Indexes vs. Data Partitioning......Page 252
Bridging Taxonomic Semantics to Accurate Hierarchical Classification......Page 259
A Case Study of a Data Warehouse in the Finnish Police......Page 264
Classification and Regression Trees......Page 273
Classification Methods......Page 277
Classification of Graph Structures......Page 283
Classifying Two-Class Chinese Texts in Two Steps......Page 289
Cluster Analysis for Outlier Detection......Page 295
Cluster Analysis in Fitting Mixtures of Curves......Page 300
Cluster Analysis with General Latent Class Model......Page 306
Cluster Validation......Page 312
Clustering Analysis of Data with High Dimensionality......Page 318
Clustering Categorical Data with k-Modes......Page 327
Clustering Data in Peer-to-Peer Systems......Page 332
Clustering of Time Series Data......Page 339
On Clustering Techniques......Page 345
Comparing Four-Selected Data Mining Software......Page 350
Compression-Based Data Mining......Page 359
Computation of OLAP Data Cubes......Page 367
Conceptual Modeling for Data Warehouse and OLAP Applications......Page 374
Constrained Data Mining......Page 382
Constraint-Based Association Rule Mining......Page 388
Constraint-Based Pattern Discovery......Page 394
Context-Driven Decision Mining......Page 401
Context-Sensitive Attribute Evaluation......Page 409
Control-Based Database Tuning Under Dynamic Workloads......Page 414
Cost-Sensitive Learning......Page 420
Count Models for Software Quality Estimation......Page 427
Data Analysis for Oil Production Prediction......Page 434
Data Confidentiality and Chase-Based Knowledge Discovery......Page 442
Data Cube Compression Techniques: A Theoretical Review......Page 448
A Data Distribution View of Clustering Algorithms......Page 455
Data Driven vs. Metric Driven Data Warehouse Design......Page 463
Data Mining and Privacy......Page 469
Data Mining and the Text Categorization Framework......Page 475
Data Mining Applications in Steel Industry......Page 481
Data Mining Applications in the Hospitality Industry......Page 487
Data Mining for Fraud Detection System......Page 492
Data Mining for Improving Manufacturing Processes......Page 498
Data Mining for Internationalization......Page 505
Data Mining for Lifetime Value Estimation......Page 512
Data Mining for Model Identification......Page 519
Data Mining for Obtaining Secure E-Mail Communications......Page 526
Data Mining for Structural Health Monitoring......Page 531
Data Mining for the Chemical Process Industry......Page 539
Data Mining in Genome Wide Association Studies......Page 546
Data Mining in Protein Identification by Tandem Mass Spectrometry......Page 553
Data Mining in Security Applications......Page 560
Data Mining in the Telecommunications Industry......Page 567
Data Mining Lessons Learned in the Federal Government......Page 573
A Data Mining Methodology for Product Family Design......Page 578
Data Mining on XML Data......Page 587
Data Mining Tool Selection......Page 592
Data Mining with Cubegrades......Page 600
Data Mining with Incomplete Data......Page 607
Data Pattern Tutor for AprioriAll and PrefixSpan......Page 612
Data Preparation for Data Mining......Page 619
Data Provenance......Page 625
Data Quality in Data Warehouses......Page 631
Data Reduction with Rough Sets......Page 637
Data Streams......Page 642
Data Transformation for Normalization......Page 647
Data Warehouse Back-End Tools......Page 653
Data Warehouse Performance......Page 661
Data Warehousing and Mining in Supply Chains......Page 667
Data Warehousing for Association Mining......Page 673
Database Queries, Data Mining, and OLAP......Page 679
Database Sampling for Data Mining......Page 685
Database Security and Statistical Database Security......Page 691
Data-Driven Revision of Decision Models......Page 698
Decision Tree Induction......Page 705
Deep Web Mining through Web Services......Page 712
DFM as a Conceptual Model for Data Warehouse......Page 719
Direction-Aware Proximity on Graphs......Page 727
Discovering an Effective Measure in Data Mining......Page 735
Discovering Knowledge from XML Documents......Page 744
Discovering Unknown Patterns in Free Text......Page 750
Discovery Informatics from Data to Knowledge......Page 757
Discovery of Protein Interaction Sites......Page 764
Distance-Based Methods for Association Rule Mining......Page 770
Distributed Association Rule Mining......Page 776
Distributed Data Aggregation Technology for Real-Time DDoS Attacks Detection......Page 782
Distributed Data Mining......Page 790
Document Indexing Techniques for Text Mining......Page 797
Dynamic Data Mining......Page 803
Dynamical Feature Extraction from Brain Activity Time Series......Page 810
Efficient Graph Matching......Page 817
Enclosing Machine Learning......Page 825
Enhancing Web Search through Query Expansion......Page 833
Enhancing Web Search through Query Log Mining......Page 839
Enhancing Web Search through Web Structure Mining......Page 845
Ensemble Data Mining Methods......Page 851
Ensemble Learning for Regression......Page 858
Ethics of Data Mining......Page 864
Evaluation of Data Mining Methods......Page 870
Evaluation of Decision Rules by Qualities for Decision-Making Systems......Page 876
The Evolution of SDI Geospatial Data Clearinghouses......Page 883
Evolutionary Approach to Dimensionality Reduction......Page 891
Evolutionary Computation and Genetic Algorithms......Page 898
Evolutionary Data Mining for Genomics......Page 904
Evolutionary Development of ANNs for Data Mining......Page 910
Evolutionary Mining of Rule Ensembles......Page 917
On Explanation-Oriented Data Mining......Page 923
Extending a Conceptual Multidimensional Model for Representing Spatial Data......Page 930
Facial Recognition......Page 938
Feature Extraction/Selection in High-Dimensional Spectral Data......Page 944
Feature Reduction for Support Vector Machines......Page 951
Feature Selection......Page 959
Financial Time Series Data Mining......Page 964
Flexible Mining of Association Rules......Page 971
Formal Concept Analysis Based Clustering......Page 976
Frequent Sets Mining in Data Stream Environments......Page 982
Fuzzy Methods in Data Mining......Page 988
A General Model for Data Warehouses......Page 994
A Genetic Algorithm for Selecting Horizontal Fragments......Page 1001
Genetic Programming......Page 1007
Genetic Programming for Automatically Constructing Data Mining Algorithms......Page 1013
Global Induction of Decision Trees......Page 1018
Graph-Based Data Mining......Page 1024
Graphical Data Mining......Page 1031
Guide Manifold Alignment by Relative Comparisons......Page 1038
Guided Sequence Alignment......Page 1045
Hierarchical Document Clustering......Page 1051
Histograms for OLAP and Data-Stream Queries......Page 1057
Homeland Security Data Mining and Link Analysis......Page 1063
Humanities Data Warehousing......Page 1068
Hybrid Genetic Algorithms in Data Mining Applications......Page 1074
Imprecise Data and the Data Mining Process......Page 1080
Incremental Learning......Page 1087
Incremental Mining from News Streams......Page 1094
Inexact Field Learning Approach for Data Mining......Page 1100
Information Fusion for Scientific Literature Classification......Page 1104
Information Veins and Resampling with Rough Set Theory......Page 1115
Instance Selection......Page 1122
Integration of Data Mining and Operations Research......Page 1127
Integration of Data Sources through Data Mining......Page 1134
Integrative Data Analysis for Biological Discovery......Page 1139
Intelligent Image Archival and Retrieval System......Page 1147
Intelligent Query Answering......Page 1154
On Interacting Features in Subset Selection......Page 1160
On Interactive Data Mining......Page 1166
Interest Pixel Mining......Page 1172
An Introduction to Kernel Methods......Page 1178
The Issue of Missing Values in Data Mining......Page 1183
Knowledge Acquisition from Semantically Heterogeneous Data......Page 1191
Knowledge Discovery in Databases with Diversity of Data Types......Page 1198
Learning Bayesian Networks......Page 1205
Learning Exceptions to Refine a Domain Expertise......Page 1210
Learning from Data Streams......Page 1218
Learning Kernels for Semi-Supervised Clustering......Page 1223
Learning Temporal Information from Text......Page 1227
Learning with Partial Supervision......Page 1231
Legal and Technical Issues of Privacy Preservation in Data Mining......Page 1239
Leveraging Unlabeled Data for Classification......Page 1245
Locally Adaptive Techniques for Pattern Classification......Page 1251
Mass Informatics in Differential Proteomics......Page 1257
Materialized View Selection for Data Warehouse Design......Page 1263
Matrix Decomposition Techniques for Data Privacy......Page 1269
Measuring the Interestingness of News Articles......Page 1275
Metaheuristics in Data Mining......Page 1281
Meta-Learning......Page 1288
A Method of Recognizing Entity and Relation......Page 1297
Microarray Data Mining......Page 1305
Minimum Description Length Adaptive Bayesian Mining......Page 1312
Mining 3D Shape Data for Morphometric Pattern Discovery......Page 1317
Mining Chat Discussions......Page 1324
Mining Data Streams......Page 1329
Mining Data with Group Theoretical Means......Page 1338
Mining Email Data......Page 1343
Mining Generalized Association Rules in an Evolving Environment......Page 1349
Mining Generalized Web Data for Discovering Usage Patterns......Page 1356
Mining Group Differences......Page 1363
Mining Repetitive Patterns in Multimedia Data......Page 1368
Mining Smart Card Data from an Urban Transit Network......Page 1373
Mining Software Specifications......Page 1384
Mining the Internet for Concepts......Page 1391
Model Assessment with ROC Curves......Page 1397
Modeling Quantiles......Page 1405
Modeling Score Distributions......Page 1411
Modeling the KDD Process......Page 1418
A Multi-Agent System for Handling Adaptive E-Services......Page 1427
Multiclass Molecular Classification......Page 1433
Multidimensional Modeling of Complex Data......Page 1439
Multi-Group Data Classification via MILP......Page 1446
Multi-Instance Learning with MultiObjective Genetic Programming......Page 1453
Multilingual Text Mining......Page 1461
Multiple Criteria Optimization in Data Mining......Page 1467
Multiple Hypothesis Testing for Data Mining......Page 1471
Music Information Retrieval......Page 1477
Neural Networks and Graph Transformations......Page 1484
New Opportunities in Marketing Data Mining......Page 1490
Non-Linear Dimensionality Reduction Techniques......Page 1497
A Novel Approach on Negative Association Rules......Page 1506
Offline Signature Recognition......Page 1512
OLAP Visualization: Models, Issues, and Techniques......Page 1520
Online Analytical Processing Systems......Page 1528
Online Signature Recognition......Page 1537
Ontologies and Medical Terminologies......Page 1544
Order Preserving Data Mining......Page 1551
Outlier Detection......Page 1557
Outlier Detection Techniques for Data Mining......Page 1564
Path Mining and Process Mining for Workflow Management Systems......Page 1570
Pattern Discovery as Event Association......Page 1578
Pattern Preserving Clustering......Page 1586
Pattern Synthesis for Nonparametric Pattern Recognition......Page 1592
Pattern Synthesis in SVM Based Classifier......Page 1598
The Personal Name Problem and a Data Mining Solution......Page 1605
Perspectives and Key Technologies of Semantic Web Search......Page 1613
A Philosophical Perspective on Knowledge Creation......Page 1619
Physical Data Warehousing Design......Page 1627
Positive Unlabelled Learning for Document Classification......Page 1633
Predicting Resource Usage for Capital Efficient Marketing......Page 1639
Preference Modeling and Mining for Personalization......Page 1651
Privacy Preserving OLAP and OLAP Security......Page 1656
Privacy-Preserving Data Mining......Page 1663
Process Mining to Analyze the Behaviour of Specific Users......Page 1670
Profit Mining......Page 1679
Program Comprehension through Data Mining......Page 1684
Program Mining Augmented with Empirical Properties......Page 1691
Projected Clustering for Biological Data Analysis......Page 1698
Proximity-Graph-Based Tools for DNA Clustering......Page 1704
Pseudo-Independent Models and Decision Theoretic Knowledge Discovery......Page 1713
Quality of Association Rules by Chi-Squared Test......Page 1720
Quantization of Continuous Data for Pattern Based Rule Extraction......Page 1727
Realistic Data for Testing Rule Mining Algorithms......Page 1734
Real-Time Face Detection and Classification for ICCTV......Page 1740
Reasoning about Frequent Patterns with Negation......Page 1748
Receiver Operating Characteristic (ROC) Analysis......Page 1756
Reflecting Reporting Problems and Data Warehousing......Page 1763
Rough Sets and Data Mining......Page 1777
Sampling Methods in Approximate Query Answering Systems......Page 1783
Scalable Non-Parametric Methods for Large Data Sets......Page 1789
Scientific Web Intelligence......Page 1795
Seamless Structured Knowledge Acquisition......Page 1801
Search Engines and their Impact on Data Warehouses......Page 1808
Search Situations and Transitions......Page 1816
Secure Building Blocks for Data Privacy......Page 1822
Secure Computation for Privacy Preserving Data Mining......Page 1828
Segmentation of Time Series Data......Page 1834
Segmenting the Mature Travel Market with Data Mining Tools......Page 1840
Semantic Data Mining......Page 1846
Semantic Multimedia Content Retrieval and Filtering......Page 1852
Semi-Structured Document Classification......Page 1860
Semi-Supervised Learning......Page 1868
Sentiment Analysis of Product Reviews......Page 1875
Sequential Pattern Mining......Page 1881
Soft Computing for XML Data Mining......Page 1887
Soft Subspace Clustering for High-Dimensional Data......Page 1891
Spatio-Temporal Data Mining for Air Pollution Problems......Page 1896
Spectral Methods for Data Clustering......Page 1904
Stages of Knowledge Discovery in E-Commerce Sites......Page 1911
Statistical Data Editing......Page 1916
Statistical Metadata Modeling and Transformations......Page 1922
Statistical Models for Operational Risk......Page 1929
Statistical Web Object Extraction......Page 1935
Storage Systems for Data Warehousing......Page 1940
Subgraph Mining......Page 1946
Subsequence Time Series Clustering......Page 1952
Summarization in Pattern Mining......Page 1958
Supporting Imprecision in Database Systems......Page 1965
A Survey of Feature Selection Techniques......Page 1969
Survival Data Mining......Page 1977
Symbiotic Data Miner......Page 1984
Tabu Search for Variable Selection in Classification......Page 1990
Techniques for Weighted Clustering Ensembles......Page 1997
Temporal Event Sequence Rule Mining......Page 2004
Temporal Extension for a Conceptual Multidimensional Model......Page 2010
Text Categorization......Page 2017
Text Mining by Pseudo-Natural Language Understanding......Page 2023
Text Mining for Business Intelligence......Page 2028
Text Mining Methods for Hierarchical Document Indexing......Page 2038
Theory and Practice of Expectation Maximization (EM) Algorithm......Page 2047
Time-Constrained Sequential Pattern Mining......Page 2055
Topic Maps Generation by Text Mining......Page 2060
Transferable Belief Model......Page 2066
Tree and Graph Mining......Page 2071
Uncertainty Operators in a Many-Valued Logic......Page 2078
A User-Aware Multi-Agent System for Team Building......Page 2085
Using Dempster-Shafer Theory in Data Mining......Page 2092
Using Prior Knowledge in Data Mining......Page 2100
Utilizing Fuzzy Decision Trees in Decision Making......Page 2105
Variable Length Markov Chains for Web Usage Mining......Page 2112
Vertical Data Mining on Very Large Data Sets......Page 2117
Video Data Mining......Page 2123
View Selection in DW and OLAP: A Theoretical Review......Page 2129
Visual Data Mining from Visualization to Visual Information Mining......Page 2137
Visualization of High-Dimensional Data with Polar Coordinates......Page 2143
Visualization Techniques for Confidence Based Data......Page 2149
Web Design Based on User Browsing Patterns......Page 2155
Web Mining in Thematic Search Engines......Page 2161
Web Mining Overview......Page 2166
Web Page Extension of Data Warehouses......Page 2171
Web Usage Mining with Web Logs......Page 2177
Wrapper Feature Selection......Page 2184
XML Warehousing and OLAP......Page 2190
XML-Enabled Association Analysis......Page 2198
Index......Page 2204