This book constitutes the refereed proceedings of the 11th International Conference on String Processing and Information Retrieval, SPIRE 2004, held in Padova, Italy, in October 2004.
The 28 revised full papers and 16 revised short papers presented were carefully reviewed and selected from 123 submissions. The papers address current issues in string pattern searching and matching, string discovery, data compression, data mining, text mining, machine learning, information retrieval, digital libraries, and applications in various fields, such as bioinformatics, speech and natural language processing, Web links and communities, and multilingual data.
Author(s): Amihood Amir, Ayelet Butman (auth.), Alberto Apostolico, Massimo Melucci (eds.)
Series: Lecture Notes in Computer Science 3246
Edition: 1
Publisher: Springer-Verlag Berlin Heidelberg
Year: 2004
Language: English
Pages: 334
Tags: Information Storage and Retrieval; Artificial Intelligence (incl. Robotics); Database Management; Data Structures; Coding and Information Theory; Algorithm Analysis and Problem Complexity
Front Matter
Efficient One Dimensional Real Scaled Matching....Pages 1-9
Linear Time Algorithm for the Longest Common Repeat Problem....Pages 10-17
Automaton-Based Sublinear Keyword Pattern Matching....Pages 18-29
Techniques for Efficient Query Expansion....Pages 30-42
Inferring Query Performance Using Pre-retrieval Predictors....Pages 43-54
A Scalable System for Identifying Co-derivative Documents....Pages 55-67
Searching for a Set of Correlated Patterns....Pages 68-69
Linear Nondeterministic Dawg String Matching Algorithm (Abstract)....Pages 70-71
Permuted and Scaled String Matching....Pages 72-73
Bit-Parallel Branch and Bound Algorithm for Transposition Invariant LCS....Pages 74-75
A New Feature Normalization Scheme Based on Eigenspace for Noisy Speech Recognition....Pages 76-78
Fast Detection of Common Sequence Structure Patterns in RNAs....Pages 79-92
An Efficient Algorithm for the Longest Tandem Scattered Subsequence Problem....Pages 93-100
Automatic Document Categorization Based on k-NN and Object-Based Thesauri....Pages 101-112
Indexing Text Documents Based on Topic Identification....Pages 113-124
Cross-Comparison for Two-Dimensional Text Categorization....Pages 125-126
DDOC: Overlapping Clustering of Words for Document Classification....Pages 127-128
Evaluation of Web Page Representations by Content Through Clustering....Pages 129-130
Evaluating Relevance Feedback and Display Strategies for Searching on Small Displays....Pages 131-133
Information Extraction by Embedding HMM to the Set of Induced Linguistic Features....Pages 134-135
Finding Cross-Lingual Spelling Variants....Pages 136-137
An Efficient Index Data Structure with the Capabilities of Suffix Trees and Suffix Arrays for Alphabets of Non-negligible Size....Pages 138-149
An Alphabet-Friendly FM-Index....Pages 150-160
Concurrency Control and I/O-Optimality in Bulk Insertion....Pages 161-170
Processing Conjunctive and Phrase Queries with the Set-Based Model....Pages 171-182
Metric Indexing for the Vector Model in Text Retrieval....Pages 183-195
Negations and Document Length in Logical Retrieval....Pages 196-207
An Improvement and an Extension on the Hybrid Index for Approximate String Matching....Pages 208-209
First Huffman, Then Burrows-Wheeler: A Simple Alphabet-Independent FM-Index....Pages 210-211
Metric Indexes for Approximate String Matching in a Dictionary....Pages 212-213
Simple Implementation of String B-Trees....Pages 214-215
Alphabet Permutation for Differentially Encoding Text....Pages 216-217
A Space-Saving Linear-Time Algorithm for Grammar-Based Compression....Pages 218-229
Simple, Fast, and Efficient Natural Language Adaptive Compression....Pages 230-241
Searching XML Documents Using Relevance Propagation....Pages 242-254
Dealing with Syntactic Variation Through a Locality-Based Approach....Pages 255-266
Efficient Extraction of Structured Motifs Using Box-Links....Pages 267-268
Efficient Computation of Balancedness in Binary Sequence Generators....Pages 269-270
On Asymptotic Finite-State Error Repair....Pages 271-272
New Algorithms for Finding Monad Patterns in DNA Sequences....Pages 273-285
Motif Extraction from Weighted Sequences....Pages 286-297
Longest Motifs with a Functionally Equivalent Central Block....Pages 298-309
On the Transformation Distance Problem....Pages 310-320
On Classification of Strings....Pages 321-330
Back Matter