CICLing 2008 (www.CICLing.org) was the 9th Annual Conference on Intelligent Text Processing and Computational Linguistics. The CICLing conferences are intended to provide a wide-scope forum for the discussion of both the art and craft of natural language processing research and the best practices in its applications.

This volume contains the papers accepted for oral presentation at the conference, as well as several of the best papers accepted for poster presentation. Other papers accepted for poster presentation were published in special issues of other journals (see the information on the website). Since 2001 the CICLing proceedings have been published in Springer’s Lecture Notes in Computer Science series, as volumes 2004, 2276, 2588, 2945, 3406, 3878, and 4394.

The book consists of 12 sections, representative of the main tasks and applications of Natural Language Processing:

– Language resources
– Morphology and syntax
– Semantics and discourse
– Word sense disambiguation and named entity recognition
– Anaphora and co-reference
– Machine translation and parallel corpora
– Natural language generation
– Speech recognition
– Information retrieval and question answering
– Text classification
– Text summarization
– Spell checking and authoring aid

A total of 204 papers by 438 authors from 39 countries were submitted for evaluation (see Tables 1 and 2). Each submission was reviewed by at least two independent Program Committee members. This volume contains revised versions of 52 papers by 129 authors from 24 countries selected for inclusion in the conference program (the acceptance rate was 25.5%).
Author(s): Aleš Horák, Piek Vossen, Adam Rambousek (auth.), Alexander Gelbukh (eds.)
Series: Lecture Notes in Computer Science 4919
Edition: 1
Publisher: Springer-Verlag Berlin Heidelberg
Year: 2008
Language: English
Pages: 670
Tags: Information Storage and Retrieval; Artificial Intelligence (incl. Robotics); Language Translation and Linguistics; Mathematical Logic and Formal Languages; Document Preparation and Text Processing
Front Matter
A Distributed Database System for Developing Ontological and Lexical Resources in Harmony....Pages 1-15
Verb Class Discovery from Rich Syntactic Data....Pages 16-27
Growing TreeLex....Pages 28-39
Acquisition of Elementary Synonym Relations from Biological Structured Terminology....Pages 40-51
A Comparison of Co-occurrence and Similarity Measures as Simulations of Context....Pages 52-63
Various Criteria of Collocation Cohesion in Internet: Comparison of Resolving Power....Pages 64-72
Why Don’t Romanians Have a Five O’clock Tea, Nor Halloween, But Have a Kind of Valentines Day?....Pages 73-84
SIGNUM: A Graph Algorithm for Terminology Extraction....Pages 85-95
Arabic Morphology Parsing Revisited....Pages 96-105
A Probabilistic Model for Guessing Base Forms of New Words by Analogy....Pages 106-116
Unsupervised and Knowledge-Free Learning of Compound Splits and Periphrases....Pages 117-127
German Decompounding in a Difficult Corpus....Pages 128-139
Clause Boundary Identification Using Conditional Random Fields....Pages 140-150
Natural Language as the Basis for Meaning Representation and Inference....Pages 151-170
Layer Structures and Conceptual Hierarchies in Semantic Representations for NLP....Pages 171-182
Deep Lexical Semantics....Pages 183-193
On Ontology Based Abduction for Text Interpretation....Pages 194-205
Analysis of Joint Inference Strategies for the Semantic Role Labeling of Spanish and Catalan....Pages 206-218
A Preliminary Study on the Robustness and Generalization of Role Sets for Semantic Role Labeling....Pages 219-230
XTM: A Robust Temporal Text Processor....Pages 231-240
What We Are Talking about and What We Are Saying about It....Pages 241-262
Trusting Politicians’ Words (for Persuasive NLP)....Pages 263-274
Sense Annotation in the Penn Discourse Treebank....Pages 275-286
A Semantics-Enhanced Language Model for Unsupervised Word Sense Disambiguation....Pages 287-298
Discovering Word Senses from Text Using Random Indexing....Pages 299-310
Domain Information for Fine-Grained Person Name Categorization....Pages 311-321
Language Independent First and Last Name Identification in Person Names....Pages 322-333
Mixing Statistical and Symbolic Approaches for Chemical Names Recognition....Pages 334-343
Portuguese Pronoun Resolution: Resources and Evaluation....Pages 344-350
Semantic and Syntactic Features for Dutch Coreference Resolution....Pages 351-361
Stat-XFER: A General Search-Based Syntax-Driven Framework for Machine Translation....Pages 362-375
Statistical Machine Translation into a Morphologically Complex Language....Pages 376-387
Translation Paraphrases in Phrase-Based Machine Translation....Pages 388-398
n-Best Reranking for the Efficient Integration of Word Sense Disambiguation and Statistical Machine Translation....Pages 399-410
Learning Finite State Transducers Using Bilingual Phrases....Pages 411-422
Learning Spanish-Galician Translation Equivalents Using a Comparable Corpus and a Bilingual Dictionary....Pages 423-433
Context-Based Sentence Alignment in Parallel Corpora....Pages 434-444
Bilingual Segmentation for Alignment and Translation....Pages 445-453
Dynamic Translation Memory: Using Statistical Machine Translation to Improve Translation Memory Fuzzy Matches....Pages 454-465
Identification of Transliterated Foreign Words in Hebrew Script....Pages 466-477
Innovative Approach for Engineering NLG Systems: The Content Determination Case Study....Pages 478-487
Comparison of Different Modeling Units for Language Model Adaptation for Inflected Languages....Pages 488-499
Word Distribution Analysis for Relevance Ranking and Query Expansion....Pages 500-511
Hybrid Method for Personalized Search in Scientific Digital Libraries....Pages 512-521
Alignment-Based Expansion of Textual Database Fields....Pages 522-531
Detecting Expected Answer Relations through Textual Entailment....Pages 532-543
Improving Question Answering by Combining Multiple Systems Via Answer Validation....Pages 544-554
Evaluation of Internal Validity Measures in Short-Text Corpora....Pages 555-567
Arabic/English Multi-document Summarization with CLASSY—The Past and the Future....Pages 568-581
Lexical Cohesion Based Topic Modeling for Summarization....Pages 582-592
Terms Derived from Frequent Sequences for Extractive Text Summarization....Pages 593-604
Real-Word Spelling Correction with Trigrams: A Reconsideration of the Mays, Damerau, and Mercer Model....Pages 605-616
Non-interactive OCR Post-correction for Giga-Scale Digitization Projects....Pages 617-630
Linguistic Support for Revising and Editing....Pages 631-642
The Role of PP Attachment in Preposition Generation....Pages 643-654
EFL Learner Reading Time Model for Evaluating Reading Proficiency....Pages 655-664
Back Matter