Computer-readable documents have become ubiquitous in everyday life, from legacy documents that have been digitized to new documents that have been created electronically. As the number of electronic documents continues to grow, so does the importance of digital methods for processing and managing these documents.
This comprehensive text/reference provides a broad review of the issues involved in handling and processing digital documents. Examining the full range of a document's lifetime, the book covers acquisition, representation, security, pre-processing, layout analysis, understanding, analysis of single components, information extraction, filing, indexing and retrieval. Background knowledge of the area is not required, beyond familiarity with basic concepts of computer science and mathematics; deeper technical content is provided in discrete subsections that are not essential for an understanding of other parts of the book.
Topics and features:
- With a Foreword by Professor George Nagy of Rensselaer Polytechnic Institute, New York, USA
- Provides a list of acronyms and a glossary of technical terms
- Contains appendices covering key concepts in machine learning, and providing a case study on building an intelligent system for digital document and library management
- Discusses issues of security, and legal aspects of digital documents
- Examines core issues of document image analysis, and image processing techniques of particular relevance to digitized documents
- Reviews the resources available for natural language processing, in addition to techniques of linguistic analysis for content handling
- Investigates methods for extracting and retrieving data/information from a document, including representation at a semantic level
Undergraduate and graduate students will find the text a valuable general reference on the subject, and researchers will discover how their specific area of interest is interrelated with other disciplines involved in digital document processing. The book also supplies a repertoire of potential technological solutions for professionals working on digital documents.
Dr. Stefano Ferilli is an associate professor at the University of Bari, Italy, where he is Director of the Interdepartmental Center for Logic and Applications.