Information Storage and Retrieval Systems: Theory and Implementation (The Information Retrieval Series, Vol. 8)

This book provides a theoretical and practical explanation of the latest advancements in information retrieval and their application to existing systems. It takes a system approach, discussing all aspects of an Information Retrieval System. The major difference between this edition and the first is the addition of descriptions of the automated indexing of multimedia documents, since items in information retrieval are now considered to be a combination of text with graphics, audio, image and video data types. The growth of the Internet and the availability of enormous volumes of data in digital form have generated intense interest in techniques that help the user locate data. The Internet and its associated hypertext-linked structure are put into perspective as a new type of information retrieval data structure. The total system approach also includes discussion of the human interface and the importance of information visualization for identifying relevant information. With large quantities of multimedia (audio, video, images) available on the Internet, Information Retrieval Systems need to address multi-modal retrieval.

The primary use of this book is as a college text on Information Retrieval Systems. In addition to the theoretical aspects, the book maintains a theme of practicality that puts into perspective the importance and use of the theory in systems used by anyone on the Internet. The student will gain an understanding of what is achievable with existing technologies and of the deficient areas that warrant additional research. The text covers all of the major aspects of information retrieval in sufficient detail to allow students to implement a simple Information Retrieval System.

Author(s): Gerald J. Kowalski, Mark T. Maybury
Series: The Information Retrieval Series
Edition: 2nd
Publisher: Springer
Year: 2000

Language: English
Pages: 336
Tags: Informatics and Computer Engineering; Artificial Intelligence; Data Mining

ISBN: 0792379241
CONTENTS
PREFACE
1 Introduction to Information Retrieval Systems
1.1 Definition of Information Retrieval System
1.2 Objectives of Information Retrieval Systems
1.3.1 Item Normalization
1.3.2 Selective Dissemination of Information
1.3.4 Index Database Search
1.4 Relationship to Database Management Systems
1.5 Digital Libraries and Data Warehouses
1.6 Summary
2 Information Retrieval System Capabilities
2.1 Search Capabilities
2.1.1 Boolean Logic
2.1.2 Proximity
2.1.3 Contiguous Word Phrases
2.1.5 Term Masking
2.1.6 Numeric and Date Ranges
2.1.7 Concept/Thesaurus Expansion
2.1.8 Natural Language Queries
2.1.9 Multimedia Queries
2.2.1 Ranking
2.2.3 Highlighting
2.3.1 Vocabulary Browse
2.3.2 Iterative Search and Search History Log
2.3.4 Multimedia
2.4 Z39.50 and WAIS Standards
2.5 Summary
3 Cataloging and Indexing
3.1.1 History
3.1.2 Objectives
3.2 Indexing Process
3.2.1 Scope of Indexing
3.3 Automatic Indexing
3.3.1 Indexing by Term
3.3.2 Indexing by Concept
3.3.3 Multimedia Indexing
3.4 Information Extraction
3.5 Summary
4 Data Structure
4.1 Introduction to Data Structures
4.2 Stemming Algorithms
4.2.1 Introduction to the Stemming Process
4.2.2 Porter Stemming Algorithm
4.2.3 Dictionary Look-Up Stemmers
4.2.4 Successor Stemmers
4.2.5 Conclusions
4.3 Inverted File Structure
4.4 N-Gram Data Structures
4.4.1 History
4.4.2 N-Gram Data Structure
4.5 PAT Data Structure
4.6 Signature File Structure
4.7 Hypertext and XML Data Structures
4.7.1 Definition of Hypertext Structure
4.7.2 Hypertext History
4.7.3 XML
4.8 Hidden Markov Models
4.9 Summary
5.1 Classes of Automatic Indexing
5.2.1 Probabilistic Weighting
5.2.2 Vector Weighting
5.2.3 Bayesian Model
5.3 Natural Language
5.3.1 Index Phrase Generation
5.3.2 Natural Language Processing
5.4 Concept Indexing
5.5 Hypertext Linkages
5.6 Summary
6 Document and Term Clustering
6.1 Introduction to Clustering
6.2 Thesaurus Generation
6.2.1 Manual Clustering
6.2.2 Automatic Term Clustering
6.3 Item Clustering
6.4 Hierarchy of Clusters
6.5 Summary
7 User Search Techniques
7.1 Search Statements and Binding
7.2 Similarity Measures and Ranking
7.2.1 Similarity Measures
7.2.2 Hidden Markov Models Techniques
7.2.3 Ranking Algorithms
7.3 Relevance Feedback
7.4 Selective Dissemination of Information Search
7.5 Weighted Searches of Boolean Systems
7.6 Searching the INTERNET and Hypertext
7.7 Summary
8 Information Visualization
8.1 Introduction to Information Visualization
8.2.1 Background
8.2.2 Aspects of the Visualization Process
8.3 Information Visualization Technologies
8.4 Summary
9.1 Introduction to Text Search Techniques
9.2 Software Text Search Algorithms
9.3 Hardware Text Search Systems
9.4 Summary
10 Multimedia Information Retrieval
10.1 Spoken Language Audio Retrieval
10.2 Non-Speech Audio Retrieval
10.3 Graph Retrieval
10.4 Imagery Retrieval
10.5 Video Retrieval
10.6 Summary
11.1 Introduction to Information System Evaluation
11.2 Measures Used in System Evaluations
11.3 Measurement Example - TREC Results
11.4 Summary
REFERENCES