Information extraction (IE) and text summarization (TS) are powerful technologies for finding relevant pieces of information in text and presenting them to the user in condensed form. The ongoing information explosion makes IE and TS critical for successful functioning within the information society.
These technologies face particular challenges due to the inherent multi-source nature of the information explosion. The technologies must now handle not isolated texts or individual narratives, but rather large-scale repositories and streams---in general, in multiple languages---containing a multiplicity of perspectives, opinions, or commentaries on particular topics, entities or events. There is thus a need to adapt existing techniques and develop new ones to deal with these challenges.
This volume contains a selection of papers that present a variety of methodologies for content identification and extraction, as well as for content fusion and regeneration. The chapters cover various aspects of the challenges, depending on the nature of the information sought (names vs. events) and the nature of the sources (news streams vs. image captions vs. scientific research papers, etc.). This volume aims to offer a broad and representative sample of studies from this very active research field.
Author(s): Horacio Saggion, Thierry Poibeau (auth.), Thierry Poibeau, Horacio Saggion, Jakub Piskorski, Roman Yangarber (eds.)
Series: Theory and Applications of Natural Language Processing
Edition: 1
Publisher: Springer-Verlag Berlin Heidelberg
Year: 2013
Language: English
Pages: 324
Tags: Data Mining and Knowledge Discovery; Computational Linguistics; Information Storage and Retrieval; Information Systems Applications (incl. Internet); Semantics
Front Matter....Pages i-xx
Front Matter....Pages 1-1
Automatic Text Summarization: Past, Present and Future....Pages 3-21
Information Extraction: Past, Present and Future....Pages 23-49
Front Matter....Pages 51-51
Learning to Match Names Across Languages....Pages 53-71
Computational Methods for Name Normalization Using Hypocoristic Personal Name Variants....Pages 73-91
Entity Linking: Finding Extracted Entities in a Knowledge Base....Pages 93-115
A Study of the Effect of Document Representations in Clustering-Based Cross-Document Coreference Resolution....Pages 117-134
Front Matter....Pages 135-135
Interactive Topic Graph Extraction and Exploration of Web Content....Pages 137-161
Predicting Relevance of Event Extraction for the End User....Pages 163-176
Open-Domain Multi-Document Summarization via Information Extraction: Challenges and Prospects....Pages 177-201
Front Matter....Pages 203-203
Generating Update Summaries: Using an Unsupervized Clustering Algorithm to Cluster Sentences....Pages 205-227
Multilingual Statistical News Summarization....Pages 229-252
A Bottom-Up Approach to Sentence Ordering for Multi-Document Summarization....Pages 253-276
Improving Speech-to-Text Summarization by Using Additional Information Sources....Pages 277-297
Multi-Document Summarization Techniques for Generating Image Descriptions: A Comparative Analysis....Pages 299-320
Back Matter....Pages 321-323