Requirements-Oriented Methodology for Evaluating Ontologies

Author(s): Jonathan Yu
Publisher: RMIT University
Year: 2008

Language: English
Pages: 267
City: Melbourne
Tags: Ontologies

1 Introduction
1.1 Ontology evaluation
1.2 Challenges in ontology evaluation
1.3 Thesis overview
2 Ontologies and their evaluation
2.1 Ontologies
2.1.1 Simple and structured ontologies
2.1.2 Ontology specification languages and OWL
2.1.3 Ontology granularity
2.1.4 Ontologies used in applications
Data integration and interoperability
Navigation systems and web applications
Information and multimedia retrieval systems
Knowledge management, organisational memory and group memory
Software specification and development
Teaching systems and eLearning
2.2 Ontology engineering methodologies
2.2.1 Formal method for ontology engineering
2.2.2 Methontology
2.2.3 On-To-Knowledge
2.2.4 SENSUS-based ontology methodology
2.2.5 Discussion
2.3 Ontology evaluation methodologies
2.3.1 OntoClean
Meta-properties
2.3.2 OntoMetric
2.3.3 Software evaluation methodologies
Factors-Criteria-Metric framework (FCM)
The Goal Question Metric methodology (GQM)
2.4 Measures
2.4.1 Ontology evaluation criteria
2.4.2 Ontology evaluation measures
Detailed descriptions of selected ontology measures
Discussion
2.4.3 Validating measures
2.5 Discussion
2.6 Summary
3 The ROMEO methodology
3.1 The ROMEO methodology
3.2 Ontology requirements
3.2.1 Establishing the roles of the ontology
3.2.2 Obtaining a set of ontology requirements
3.3 Questions
3.3.1 Criteria-questions
3.4 Measures
3.4.1 Suggested mappings between criteria-questions and existing measures
3.5 Discussion
3.6 Summary
4 Lonely Planet
4.1 Content management for Lonely Planet
4.1.1 Travel guidebooks and their issues
Achieving consistent vocabulary
Achieving consistent book structure
Achieving consistent content across guidebooks
4.1.2 Digital content and its issues
4.1.3 Previous experience of Lonely Planet in reusing ontologies
Appropriate representation of geographic places
Right level of content granularity
4.1.4 Roles of the suitable ontology
4.2 ROMEO ontology requirements for Lonely Planet
4.2.1 Identifying ontology requirements
4.2.2 Ontology requirement 1: Controlled vocabulary of names, places and terms
4.2.3 Ontology requirement 2: Flexible classification of geographic items
4.2.4 Ontology requirement 3: Appropriate granularity
4.3 ROMEO questions for Lonely Planet
4.3.1 Questions for ‘Controlled vocabulary of names, places and terms’
4.3.2 Questions for ‘Flexible classification of geographic items’
4.3.3 Questions for ‘Appropriate granularity’
4.4 ROMEO measures for Lonely Planet
4.4.1 Measures for ‘How many identical concepts are modelled using different names?’
4.4.2 Measures for ‘How many identical instances are modelled using different names?’
4.4.3 Measures for ‘Do the relationships between concepts in the ontology adequately cover the relationships between concepts in the domain?’
4.4.4 Measures for ‘Does the ontology have an appropriate level of granularity with regard to its concepts compared with the domain being modelled?’
4.4.5 Measures for ‘Does the ontology have an appropriate level of granularity with regard to its instances compared with the domain being modelled?’
4.5 Summary
5 Wikipedia
5.1 Wikipedia and its categories
5.1.1 Wikipedia content, policies and guidelines
Wikipedia policies and guidelines
5.1.2 Navigating and exploring articles
5.1.3 Wikipedia categories
Design of category structure
Wikipedia category structure as an ontology
5.2 ROMEO ontology requirements for Wikipedia
5.2.1 Identifying ontology requirements
5.2.2 Ontology requirement 1: Adequate level of category intersection
5.2.3 Ontology requirement 2: Categories should be appropriately grouped
5.2.4 Ontology requirement 3: Avoiding cycles in the category structure
5.2.5 Ontology requirement 4: Ensure the set of categories is complete
5.2.6 Ontology requirement 5: Ensure categories associated in articles are correct
5.3 Questions
5.3.1 Questions for ‘Adequate level of category intersection’
5.3.2 Questions for ‘Categories should be appropriately grouped’
5.3.3 Questions for ‘Avoiding cycles in the category structure’
5.3.4 Questions for ‘Ensure a complete set of categories’
5.3.5 Questions for ‘Ensure categories associated in articles are correct’
5.4 Measures
5.4.1 Measures for ‘Does the category structure have an adequate intersection of categories?’
5.4.2 Measures for ‘Does the ontology capture concepts of the domain correctly?’
5.4.3 Measures for ‘How many cycles are found in the ontology?’
5.4.4 Measures for ‘Does the ontology have concepts missing with regard to the relevant frames of reference?’
5.4.5 Measures for ‘Is the set of categories correctly associated with a given article?’
5.5 Summary
6 Empirical validation
6.1 The validation process
6.1.1 The validation environment
6.1.2 Obtaining comparable ontologies
6.1.3 Select appropriate tasks and benchmarking standards
6.2 Validating granularity mapping
6.2.1 Experimental setup
Ontologies used
Tasks
6.2.2 Results
Outcomes of validation experiment
6.3 Validating intersectedness mapping
6.3.1 Experimental setup
Ontologies used
Tasks and domains
6.3.2 Analysis of varied ontologies
6.3.3 Benchmarking
Significance testing
6.3.4 Results
Best method for obtaining untangled ontology
Comparing Subtree a (base) and Subtree b (untangled)
6.3.5 Outcome of validation experiment
6.4 Summary
7 Conclusions and future work
7.1 The ROMEO methodology for ontology evaluation
7.2 Empirical validation of ontology evaluation methods
7.3 Summary
Glossary
A ROMEO templates and suggested mappings
A.1 ROMEO template
A.2 Suggested mappings between questions and measures
B ROMEO analysis: Lonely Planet
B.1 Role of the ontology
B.2 Requirements
B.3 Questions
B.4 Measures
C ROMEO analysis: Wikipedia
C.1 Role of the ontology
C.2 Requirements
C.3 Questions
C.4 Measures
D Requirements gathering: Wikipedia
D.1 Excerpts from Meta:Categorization requirements
D.1.1 Goals (or, “Why implement categories?”)
D.1.2 Theoretical basis
D.2 Excerpts from Wikipedia:Categorization
D.2.1 When to use categories
D.2.2 Guidelines
Some general guidelines
Categories vs. lists vs. info boxes
Categories applied to articles on people
Categories do not form a tree
Cycles should usually be avoided
D.2.3 Grouping categories
E Wikipedia Browsing Experiment
E.1 User handouts
Bibliography