Data Mining: Multimedia, Soft Computing, and Bioinformatics

Author(s): Sushmita Mitra, Tinku Acharya

Language: English
Commentary: +OCR
Pages: 420

Data Mining: Multimedia, Soft Computing, and Bioinformatics......Page 4
Copyright......Page 5
Contents......Page 8
Preface......Page 16
1.1 INTRODUCTION......Page 20
1.2 KNOWLEDGE DISCOVERY AND DATA MINING......Page 24
1.3 DATA COMPRESSION......Page 29
1.4 INFORMATION RETRIEVAL......Page 31
1.5 TEXT MINING......Page 33
1.6 WEB MINING......Page 34
1.7 IMAGE MINING......Page 35
1.8 CLASSIFICATION......Page 37
1.9 CLUSTERING......Page 38
1.10 RULE MINING......Page 39
1.11 STRING MATCHING......Page 40
1.12 BIOINFORMATICS......Page 42
1.13 DATA WAREHOUSING......Page 43
1.14 APPLICATIONS AND CHALLENGES......Page 44
1.15 CONCLUSIONS AND DISCUSSION......Page 47
REFERENCES......Page 49
2.1 INTRODUCTION......Page 54
2.2.1 Relevance......Page 56
2.2.2 Fuzzy sets......Page 58
2.2.3 Neural networks......Page 63
2.2.4 Neuro-fuzzy computing......Page 72
2.2.5 Genetic algorithms......Page 74
2.2.6 Rough sets......Page 78
2.2.7 Wavelets......Page 80
2.3 ROLE OF FUZZY SETS IN DATA MINING......Page 81
2.3.2 Granular computing......Page 82
2.3.3 Association rules......Page 83
2.3.5 Data summarization......Page 84
2.3.6 Image mining......Page 85
2.4.2 Rule evaluation......Page 86
2.4.5 Information retrieval......Page 88
2.5 ROLE OF GENETIC ALGORITHMS IN DATA MINING......Page 89
2.5.2 Association rules......Page 90
2.6 ROLE OF ROUGH SETS IN DATA MINING......Page 91
2.7 ROLE OF WAVELETS IN DATA MINING......Page 92
2.8 ROLE OF HYBRIDIZATIONS IN DATA MINING......Page 93
2.9 CONCLUSIONS AND DISCUSSION......Page 96
REFERENCES......Page 97
3.1 INTRODUCTION......Page 108
3.2.1 Discrete memoryless model and entropy......Page 110
3.2.2 Noiseless Source Coding Theorem......Page 111
3.3 CLASSIFICATION OF COMPRESSION ALGORITHMS......Page 113
3.4 A DATA COMPRESSION MODEL......Page 114
3.5 MEASURES OF COMPRESSION PERFORMANCE......Page 115
3.5.2 Quality metric......Page 116
3.6.1 Run-length coding......Page 118
3.6.2 Huffman coding......Page 119
3.7 PRINCIPAL COMPONENT ANALYSIS FOR DATA COMPRESSION......Page 122
3.8.1 Predictive coding......Page 124
3.8.2 Transform coding......Page 126
3.8.3 Wavelet coding......Page 128
3.9 IMAGE COMPRESSION STANDARD: JPEG......Page 131
3.10 THE JPEG LOSSLESS CODING ALGORITHM......Page 132
3.11.1 Color space conversion......Page 135
3.11.2 Source image data arrangement......Page 137
3.11.3 The baseline compression algorithm......Page 138
3.11.4 Decompression process in baseline JPEG......Page 145
3.11.5 JPEG2000: Next generation still picture coding standard......Page 148
3.12 TEXT COMPRESSION......Page 150
3.12.1 The LZ77 algorithm......Page 151
3.12.2 The LZ78 algorithm......Page 152
3.12.3 The LZW algorithm......Page 155
3.12.4 Other applications of Lempel-Ziv coding......Page 158
REFERENCES......Page 159
4.1 INTRODUCTION......Page 162
4.1.1 Some definitions and preliminaries......Page 163
4.1.2 String matching problem......Page 165
4.1.3 Brute force string matching......Page 167
4.2.1 String matching with finite automata......Page 169
4.2.2 Knuth-Morris-Pratt algorithm......Page 171
4.2.3 Boyer-Moore algorithm......Page 177
4.2.4 Boyer-Moore-Horspool algorithm......Page 180
4.2.5 Karp-Rabin algorithm......Page 184
4.3 STRING MATCHING IN BIOINFORMATICS......Page 188
4.4 APPROXIMATE STRING MATCHING......Page 190
4.4.1 Basic definitions......Page 191
4.4.2 Wagner-Fischer algorithm for computation of string distance......Page 192
4.4.3 Text search with k-differences......Page 195
4.5 COMPRESSED PATTERN MATCHING......Page 196
REFERENCES......Page 198
5.1 INTRODUCTION......Page 200
5.2 DECISION TREE CLASSIFIERS......Page 203
5.2.1 ID3......Page 206
5.2.3 Serial PaRallelizable INduction of decision Trees (SPRINT)......Page 208
5.2.5 Overfitting......Page 211
5.2.7 Extracting classification rules from trees......Page 213
5.2.8 Fusion with neural networks......Page 214
5.3.2 Naive Bayesian classifier......Page 215
5.3.3 Bayesian belief network......Page 217
5.4.1 Minimum distance classifiers......Page 218
5.4.3 Locally weighted regression......Page 220
5.4.4 Radial basis functions (RBFs)......Page 221
5.4.6 Granular computing and CBR......Page 222
5.5 SUPPORT VECTOR MACHINES......Page 223
5.6 FUZZY DECISION TREES......Page 224
5.6.1 Classification......Page 226
5.6.2 Rule generation and evaluation......Page 231
5.6.3 Mapping of rules to fuzzy neural network......Page 233
5.6.4 Results......Page 235
5.7 CONCLUSIONS AND DISCUSSION......Page 239
REFERENCES......Page 240
6.1 INTRODUCTION......Page 246
6.2.2 Binary objects......Page 248
6.2.4 Symbolic objects......Page 250
6.3.1 Partitional clustering......Page 251
6.3.2 Hierarchical clustering......Page 254
6.4 SCALABLE CLUSTERING ALGORITHMS......Page 256
6.4.1 Clustering large applications......Page 257
6.4.2 Density-based clustering......Page 258
6.4.3 Hierarchical clustering......Page 260
6.4.4 Grid-based methods......Page 262
6.5.1 Fuzzy sets......Page 263
6.5.2 Neural networks......Page 265
6.5.3 Wavelets......Page 267
6.5.4 Rough sets......Page 268
6.5.5 Evolutionary algorithms......Page 269
6.6 CLUSTERING WITH CATEGORICAL ATTRIBUTES......Page 270
6.6.2 Robust Hierarchical Clustering with Links (ROCK)......Page 271
6.6.3 c-modes algorithm......Page 272
6.7.1 Conceptual clustering......Page 274
6.7.2 Agglomerative symbolic clustering......Page 275
6.7.3 Cluster validity indices......Page 276
6.7.4 Results......Page 278
6.8 CONCLUSIONS AND DISCUSSION......Page 280
REFERENCES......Page 281
7.1 INTRODUCTION......Page 286
7.2.1 Apriori algorithm......Page 288
7.2.3 Some extensions......Page 291
7.3 DEPTH-FIRST SEARCH METHODS......Page 292
7.4 INTERESTING RULES......Page 294
7.5 MULTILEVEL RULES......Page 295
7.6 ONLINE GENERATION OF RULES......Page 296
7.7 GENERALIZED RULES......Page 297
7.8 SCALABLE MINING OF RULES......Page 299
7.9.2 Temporal association rules......Page 300
7.9.4 Localized associations......Page 301
7.10 FUZZY ASSOCIATION RULES......Page 302
7.11 CONCLUSIONS AND DISCUSSION......Page 307
REFERENCES......Page 308
8.1 INTRODUCTION......Page 312
8.2 CONNECTIONIST RULE GENERATION......Page 313
8.2.1 Neural models......Page 314
8.2.2 Neuro-fuzzy models......Page 315
8.2.3 Using knowledge-based networks......Page 316
8.3.1 Rough fuzzy MLP......Page 321
8.3.2 Modular knowledge-based network......Page 324
8.3.3 Evolutionary design......Page 327
8.3.4 Rule extraction......Page 329
8.3.5 Results......Page 330
REFERENCES......Page 334
9.1 INTRODUCTION......Page 338
9.2 TEXT MINING......Page 339
9.2.1 Keyword-based search and mining......Page 340
9.2.2 Text analysis and retrieval......Page 341
9.2.3 Mathematical modeling of documents......Page 342
9.2.4 Similarity-based matching for documents and queries......Page 344
9.2.5 Latent semantic analysis......Page 345
9.2.6 Soft computing approaches......Page 347
9.3 IMAGE MINING......Page 348
9.3.1 Content-Based Image Retrieval......Page 349
9.3.2 Color features......Page 351
9.3.3 Texture features......Page 356
9.3.4 Shape features......Page 357
9.3.5 Topology......Page 359
9.3.6 Multidimensional indexing......Page 361
9.3.7 Results of a simple CBIR system......Page 362
9.4 VIDEO MINING......Page 364
9.4.1 MPEG-7: Multimedia content description interface......Page 366
9.4.2 Content-based video retrieval system......Page 367
9.5 WEB MINING......Page 369
9.5.1 Search engines......Page 370
9.5.2 Soft computing approaches......Page 372
REFERENCES......Page 376
10.1 INTRODUCTION......Page 384
10.2.1 Deoxyribonucleic acid......Page 386
10.2.2 Amino acids......Page 387
10.2.3 Proteins......Page 388
10.3 INFORMATION SCIENCE ASPECTS......Page 390
10.3.1 Protein folding......Page 391
10.3.2 Protein structure modeling......Page 392
10.3.4 Homology search......Page 393
10.4 CLUSTERING OF MICROARRAY DATA......Page 397
10.4.1 First-generation algorithms......Page 398
10.4.2 Second-generation algorithms......Page 399
10.6 ROLE OF SOFT COMPUTING......Page 400
10.6.2 Predicting protein tertiary structure......Page 401
10.6.4 Classifying gene expression data......Page 404
10.7 CONCLUSIONS AND DISCUSSION......Page 405
REFERENCES......Page 406
Index......Page 411
About the Authors......Page 418