Modeling with Words. Learning, Fusion and Reasoning within a Formal Linguistic Representation Framework

Publisher: Springer, 2003, 240 pp.
The development of high-performance computers and the corresponding advances in global communications have led to an explosion in data collection, transmission and storage. Large-scale multidimensional databases are being generated to describe a wide variety of systems. These can range from engineering applications such as computer vision, to scientific data such as that from the genome project, to customer and price modelling in business and finance. In all of these cases the data is useless without methods of analysis by which we can discover the important underlying trends and relationships, integrate other background information, and then carry out inference on the learnt models. For a number of reasons we argue that in order to fulfill these requirements we should move towards a modelling paradigm that is as close to natural language as possible.
In recent years the area of machine learning has focused on the development of induction algorithms that maximize predictive accuracy. However, since there has been little emphasis on knowledge representation the models derived are typically ‘black box’ and therefore difficult to understand and interpret. For many applications a high level of predictive accuracy is all that is required. However, in a large number of cases, including many critical applications, a clear understanding of the prediction mechanisms is vital if there is to be sufficient confidence in the model for it to be used as a decision-making tool. Model transparency of this kind is best achieved within a natural-language-based modelling framework that allows for the representation of both uncertainty and fuzziness. We must be aware, however, of an inherent trade-off between model accuracy and transparency. Simple models, while the most transparent, are often inadequate to capture the complex dependencies that exist in many practical modelling problems. Alternatively, more complex models are much more difficult to represent in a clear and understandable manner. This trade-off is best managed by close collaboration with domain experts who can provide the modeller with an unbiased assessment of the transparency of their models while also establishing what level of accuracy is necessary for the current problem. Another important justification for learning models at a linguistic level is that it facilitates their fusion with background knowledge obtained from domain experts.
In any data modelling problem there is almost certain to be some expert knowledge available, derived from either an in-depth understanding of the underlying physical processes or from years of practical experience. In expert systems the emphasis is placed almost entirely on this expert information, with data being used only to optimize the performance of the model. On the other hand, in machine learning, background knowledge is largely ignored, except perhaps in the limited role of constraining prior distributions in Bayesian methods. As part of modelling with words we propose that there should be a high-level fusion of expert- and data-derived knowledge. By integrating these two types of information it should be possible to improve on the performance of models that are based solely on one or the other. Furthermore, the effective use of background knowledge can allow for the application of simpler learning algorithms, producing simpler, and hence more transparent, models.
Given a model of a data problem it is highly desirable that practitioners be able to interrogate it in order to evaluate interesting hypotheses. Since these hypotheses are most likely to be in natural-language form, to achieve this a high-level inference mechanism on linguistic terms is required. Such an inference process is, in essence, what Zadeh calls ‘computing with words.’ The nature of any reasoning mechanism at this level will depend on the nature of the data models. For example, if the models take the form of a fuzzy rule base then methods similar to those proposed by Zadeh may be appropriate. Alternatively, if the model consists of conceptual graphs then graph matching and other similar methods from conceptual graph theory will need to be used. However, no matter what methodology is applied it must be formally well defined and based on a clear underlying semantics. In this respect modelling with words differs from natural language since we require a much more formal representation and reasoning framework for the former than for the latter. In fact this high level of formal rigor is necessary if we are to obtain models that are sufficiently transparent to satisfy practitioners of their validity in critical applications. Certainly, a modelling process cannot be truly transparent if there are significant doubts regarding the meaning of the underlying concepts used or the soundness of the learning and reasoning mechanisms employed. This formal aspect of modelling with words is likely to mean that some of the flexibility and expressiveness of natural language will need to be sacrificed. The goal, however, is to maintain rigor within a representation framework that captures many of the important characteristics of natural language so as to allow relative ease of translation between the two domains. This is very similar to the idea behind Zadeh’s ‘precisiated natural language.’
Modelling with words can be defined in terms of the trilogy of learning, fusion and reasoning, as carried out within a formal linguistic representation framework. As such this new paradigm gives rise to a number of interesting and distinct challenges within each of these three areas. In learning, how can the dual goals of good predictive accuracy and a high level of transparency be reconciled? Also, how can we scale our linguistic algorithms to high-dimensional data problems? In fusion, what are the most effective methods for integrating linguistic expert knowledge with data-derived knowledge, and how does this process constrain the representation of both types of knowledge? In reasoning, what sound and useful rules of inference can be identified and what type of queries can they evaluate? In general, how can we effectively integrate fuzzy and probabilistic uncertainty in data modelling and what type of knowledge representation framework is most appropriate?
This volume contains a collection of papers that begin to address some of these issues in depth. Papers by E. Hernandez et al. and A. Laurent et al. investigate the use of fuzzy decision trees to derive linguistic rules from data. H. Ishibuchi et al. and R. Alcala et al. describe how genetic algorithms can be used to improve the performance of fuzzy models. The area of fuzzy conceptual graphs is the topic of papers by T. Cao and P. Paulson et al. Linguistic modelling and reasoning frameworks based on random sets are discussed in papers by J. Lawry and F. Diaz-Hermida et al., and Q. Shen introduces an algorithm according to which rough sets can be used to identify important attributes. The application of fuzzy sets to text classification is investigated by Y. Chen, and J. Rossiter discusses the paradigm of humanist computing and its relationship to modelling with words.
Random Set-Based Approaches for Modelling Fuzzy Operators
A General Framework for Induction of Decision Trees under Uncertainty
Combining Rule Weight Learning and Rule Selection to Obtain Simpler and More Accurate Linguistic Fuzzy Models
Semantics-Preserving Dimensionality Reduction in Intelligent Modelling
Conceptual Graphs for Modelling and Computing with Generally Quantified Statements
Improvement of the Interpretability of Fuzzy Rule Based Systems: Quantifiers, Similarities and Aggregators
Humanist Computing: Modelling with Words, Concepts, and Behaviours
A Hybrid Framework Using SOM and Fuzzy Theory for Textual Classification in Data Mining
Combining Collaborative and Content-Based Filtering Using Conceptual Graphs
Random Sets and Appropriateness Degrees for Modelling with Labels
Interpretability Issues in Fuzzy Genetics-Based Machine Learning for Linguistic Modelling

Author(s): Lawry J., Shanahan J., Ralescu A. (eds.)

Language: English
Tags: Computer Science and Engineering; Artificial Intelligence; Data Mining