Exploring Formal Models of Linguistic Data Structuring. Enhanced Solutions for Knowledge Management Systems Based on NLP Applications

The principal aim of this research is to describe the extent to which formal models for linguistic data structuring are crucial in Natural Language Processing (NLP) applications. We will pay particular attention to Knowledge Management Systems (KMSs) designed for the Internet and to the enhanced solutions they may require. To deal with these topics appropriately, we will describe how to build computational linguistics applications that help humans establish and maintain an advantageous relationship with technologies, especially those that are based on or produce man-machine interactions in natural language. We will explore the positive relationship that may exist between well-structured Linguistic Resources (LRs) and KMSs, in order to argue that a KMS whose information architecture is based on the formalization of linguistic data works better and is more consistent. First of all, it is indispensable to state that, in order to structure efficient and effective Information Retrieval (IR) tools, understanding and formalizing the combinatory mechanisms of natural language seems to be the first operation to achieve, not least because any piece of information produced by humans on the Internet is necessarily a linguistic act. Therefore, in this research work we will also discuss the NLP structuring of a Hybrid Model of linguistic formalization, which we hope will prove to be a useful tool to support, improve and refine KMSs.

More specifically, in section 1 we will describe how to structure language resources implementable inside KMSs, to what extent they can improve the performance of these systems, and how natural language formalization methods deal with the problem of linguistic data structuring. In section 2 we will proceed with a brief review of computational linguistics, paying particular attention to specific software packages such as Intex, Unitex, NooJ, and Cataloga, which are developed according to the Lexicon-Grammar (LG) method, a linguistic theory established during the 1960s by Maurice Gross. In section 3 we will describe some specific works useful for monitoring the state of the art in Linguistic Data Structuring Models, Enhanced Solutions for KMSs, and NLP Applications for KMSs. In section 4 we will address problems related to natural language formalization methods, describing mainly Transformational-Generative Grammar (TGG) and LG, plus other methods based on statistical approaches and ontologies. In section 5 we will propose a Hybrid Model usable in NLP applications in order to create effective enhanced solutions for KMSs. Specific features and elements of our hybrid model will be shown through some results of experimental research work. The case study we will present concerns a very complex NLP problem that has been little explored in recent years, i.e. the treatment of Multi-Word Units (MWUs). In section 6 we will close our research by evaluating its results and presenting possible perspectives for future work.

Keywords: Knowledge Management System, Natural Language Processing, Linguistic Formal Model, Hybrid Formal Model.

Author(s): Federica Marano
Series: X Ciclo – Nuova Serie 2008-2011
Publisher: Università degli Studi di Salerno
Year: 2011

Language: English
Pages: 168
City: Salerno
Tags: Knowledge Management System; Natural Language Processing; NLP; Linguistic Formal Model; Hybrid Formal Model

Foreword
Introduction
The Relationship between Linguistic Resources and Knowledge Management Systems
1 Well-structured Linguistic Resources for effective Knowledge Management Systems
The Point of View of Computational Linguistics
2 A Brief Review of Computational Linguistics
2.1 A Short Survey on Some Main Computational Linguistics Subfields
2.2 Lexicon-Grammar, a Frame for Computational Linguistics
2.3 Lexicon-Grammar: Resources, Tools and Software for Computational Linguistics
A State of the Art
3 Natural Language Formalization
3.1 Models of Linguistic Data Structuring
3.1.1 PAULA XML: Interchange Format for Linguistic Annotations
3.1.2 EXMARaLDA
3.1.3 TUSNELDA
3.2 Enhanced Solutions for Knowledge Management Systems
3.2.1 Defining Knowledge Management and Knowledge Management System Structure
3.2.2 Different Types of Knowledge
3.2.3 From Knowledge Management to Enhanced Knowledge Management Systems
3.2.4 Knowledge Management Systems
3.2.5 KMSs and Data-Driven Decision Support Systems
3.3 NLP Applications for Knowledge Management Systems
3.3.1 WordNet
3.3.2 FrameNet
3.3.3 KIM
Formal Models for Linguistic Data
4 The Question of Linguistic Data Structuring Formal Models
4.1 “On the Failure of Generative Grammar”
4.2 Lexicon-Grammar: a Theoretical and Methodological Challenge in the Formal Modelling of Linguistic Data
4.3 Statistical Models: Faster Methods of Data Processing
4.3.1 Statistical Analysis Tools and Procedures
4.4 Ontology-Based Models: a Survey on Classification Tools
Hybrid Model of Linguistic Formalization for Knowledge Management
5 Hybrid Model of NLP
5.1 Linguistic Pre-processing of Data for NLP Applications
5.1.1 Linguistic Resources and Tools in Translation Processes
Discussions and Conclusions
References