This volume contains papers selected for presentation at the 6th IAPR Workshop on Document Analysis Systems (DAS 2004), held during September 8–10, 2004 at the University of Florence, Italy. Several papers represent the state of the art in a broad range of “traditional” topics such as layout analysis, applications to graphics recognition, and handwritten documents. Other contributions address the description of complete working systems, which is one of the strengths of this workshop. Some papers extend the application domains to other media, like the processing of Internet documents. The peculiarity of this 6th workshop was the large number of papers related to digital libraries and to the processing of historical documents, a task which frequently requires the analysis of color documents. A total of 17 papers are associated with these topics, whereas two years ago (in DAS 2002) only a couple of papers dealt with these problems. In our view there are three main reasons for this new wave in the DAS community. From the scientific point of view, several research fields reached a thorough knowledge of techniques and problems that can be effectively solved, and this expertise can now be applied to new domains. Another incentive has been provided by several research projects funded by the EC and the NSF on topics related to digital libraries.
Author(s): Henry S. Baird, Venugopal Govindaraju, Daniel P. Lopresti (auth.), Simone Marinai, Andreas R. Dengel (eds.)
Series: Lecture Notes in Computer Science 3163
Edition: 1
Publisher: Springer-Verlag Berlin Heidelberg
Year: 2004
Language: English
Pages: 568
Tags: Pattern Recognition; Information Storage and Retrieval; Image Processing and Computer Vision; Simulation and Modeling; Computer Appl. in Administrative Data Processing
Front Matter
Document Analysis Systems for Digital Libraries: Challenges and Opportunities....Pages 1-16
The Trinity College Dublin 1872 Online Catalogue....Pages 17-27
DL Architecture for Indic Scripts....Pages 28-38
A Semantic-Based System for Querying Personal Digital Libraries....Pages 39-46
Toward Personalized Digital Library for Providing “Information JIT”....Pages 47-50
Tilting at Windmills: Adventures in Attempting to Reconstruct Don Quixote....Pages 51-62
A Segmentation-Free Recognition Technique to Assist Old Greek Handwritten Manuscript OCR....Pages 63-74
Automatic Metadata Retrieval from Ancient Manuscripts....Pages 75-89
A Complete Approach to the Conversion of Typewritten Historical Documents for Digital Archives....Pages 90-101
An Adaptive Binarization Technique for Low Quality Historical Documents....Pages 102-113
Segmentation of Handwritten Characters for Digitalizing Korean Historical Documents....Pages 114-124
Self-organizing Maps and Ancient Documents....Pages 125-134
Enriching Historical Manuscripts: The Bovary Project....Pages 135-146
Word Grouping in Document Images Based on Voronoi Tessellation....Pages 147-157
Multi-component Document Image Coding Using Regions-of-Interest....Pages 158-169
Physical Layout Analysis of Complex Structured Arabic Documents Using Artificial Neural Nets....Pages 170-178
An Integrated Approach for Automatic Semantic Structure Extraction in Document Images....Pages 179-190
Multi-view HAC for Semi-supervised Document Image Classification....Pages 191-200
Configurable Text Stamp Identification Tool with Application of Fuzzy Logic....Pages 201-212
Layout and Content Extraction for PDF Documents....Pages 213-224
Automatic Extraction of Filled-In Items from Bank-Check Images....Pages 225-228
Bleed-Through Removal from Degraded Documents Using a Color Decorrelation Method....Pages 229-240
Colour Map Classification for Archive Documents....Pages 241-251
Serialized k-Means for Adaptative Color Image Segmentation....Pages 252-263
Adaptive Region Growing Color Segmentation for Text Using Irregular Pyramid....Pages 264-275
Preprocessing and Segmentation of Bad Quality Machine Typed Documents....Pages 276-285
Ensembles of Classifiers for Handwritten Word Recognition Specialized on Individual Handwriting Style....Pages 286-297
Information Retrieval System for Handwritten Documents....Pages 298-309
Word–Wise Script Identification from Indian Documents....Pages 310-321
Recognizing Freeform Digital Ink Annotations....Pages 322-331
Post-processing of Handwritten Pitman’s Shorthand Using Unigram and Heuristic Approaches....Pages 332-336
Multiscale Handwriting Characterization for Writers’ Classification....Pages 337-341
A Hybrid Approach to Detect Graphical Symbols in Documents....Pages 342-353
Performance Evaluation of Symbol Recognition....Pages 354-365
The Search for Genericity in Graphics Recognition Applications: Design Issues of the Qgar Software System....Pages 366-377
Attributed Graph Matching Based Engineering Drawings Retrieval....Pages 378-388
A Platform to Extract Knowledge from Graphic Documents. Application to an Architectural Sketch Understanding Scenario....Pages 389-400
A Graph-Based Framework for Web Document Mining....Pages 401-412
XML Documents Within a Legal Domain: Standards and Tools for the Italian Legislative Environment....Pages 413-424
Rule-Based Structural Analysis of Web Pages....Pages 425-437
Extracting Table Information from the Web....Pages 438-441
A Neural Network Classifier for Junk E-Mail....Pages 442-450
Results of a Study on Invoice-Reading Systems in Germany....Pages 451-462
A Document Analysis System Based on Text Line Matching of Multiple OCR Outputs....Pages 463-471
DocMining: A Document Analysis System Builder....Pages 472-483
Automatic Fax Routing....Pages 484-495
Contextual Swarm-Based Multi-layered Lattices: A New Architecture for Contextual Pattern Recognition....Pages 496-507
Natural Language Processing of Patents and Technical Documentation....Pages 508-520
Document Image Retrieval in a Question Answering System for Document Images....Pages 521-532
A Robust Braille Recognition System....Pages 533-545
Document Image Watermarking Based on Weight-Invariant Partition Using Support Vector Machine....Pages 546-554
Video Degradation Model and Its Application to Character Recognition in e-Learning Videos....Pages 555-558
Unity Is Strength: Coupling Media for Thematic Segmentation....Pages 559-562
Back Matter