Aldo Gangemi, Domenico M. Pisanelli, Gerardo Steve
Consiglio Nazionale delle Ricerche - ITBM, Viale Marx 15, 00137, Roma, Italy
{aldo,nico}@color.irmkant.rm.cnr.it, steve@relay.itbm.rm.cnr.it
Sharing and reusing large subsets of medical terminology is needed in various areas: knowledge-based systems, information retrieval, standardization, etc. The main obstacle to sharing and reusing medical terminologies is the lack of conceptual integration of terms. In fact, the intended meaning of a term varies with the context in which it appears and with the context of use. Interdisciplinary research in ontology provides good evidence that generic ontologies specified from the literature are the grounding for the conceptual integration of terminologies. Following our experience in engineering a methodology for terminology integration, we suggest that the contextual dependency of terms should be overcome by means of a collaborative modelling environment, a distributed approach, an expressive language and a sound methodology. ON9, our current medical ontology library, evolved using expressive languages like GRAIL, Ontolingua, Loom and OCML. It also took advantage of tools for the distributed negotiation of ontologies, like Ontosaurus.
In this paper we sketch the motivations that led us to design a methodology for engineering terminological ontologies, and we describe the languages and tools that have been used to construct the current ON9 library of ontologies. Sharing and reusing large subsets of medical terminology has become a necessity in various areas: knowledge-based systems, information retrieval, standardization, etc. The main obstacle to sharing and reusing medical terminologies is the lack of conceptual integration of terms. In fact, the intended meaning of a term varies with the context in which it appears and with the context of use. Interdisciplinary research in ontology provides good evidence that generic ontologies specified from the literature are the grounding for the conceptual integration of terminologies. Detailed generic theories require rich languages and tools, as well as collaborative effort, to be used extensively. Consequently, we strongly commit to groundedness of conceptualization (according to the definition given by Harnad (1990)), expressive languages, modular architectures, and distributed tools for collaborative modelling.
Our research primarily concerns the integration and reuse of terminological ontologies in medicine. Terminological ontologies are crucial for activities such as vocabulary standardization (CEN, 1995), natural language (lexical) processing, terminology server design (GALEN, 1992-4; Humphreys and Lindberg, 1992), conceptual modelling of subdomains (Rossi Mori et al. 1997), knowledge integration, sharing, and reuse (Gennari et al., 1994; Falasconi and Stefanelli, 1994; Swartout et al. 1996; Valente and Breuker 1996), and multi-agent system development (Falasconi et al., 1996).
Our sources for building terminological ontologies are medical terminology systems. Most medical terminology systems do not have a terminological ontology. This does not mean that terminology systems are not founded on a conceptualization, but only that their conceptualization is left to the interpretation of the experts who use the systems. In the following we present our approach towards an explicit conceptualization of the domain ontologies of terminology systems. Our aim is to build a library of grounded terminological ontologies (of representation, generic, and domain kind).
We started working on medical language processing in 1989 and produced a schema for a machine-dictionary of medicine (Rossi Mori et al. 1990), which provides a normalization of medical vocabulary by decomposing the morphological units of terms. Those efforts revealed that term normalization must be followed by a conceptual account of the normalized term (Gangemi et al. 1992): decomposed terms required the explicitation of their intended meaning in order to provide a sensible conceptualization. For example, the term "viral hepatitis" can be decomposed into "vir-", "-al", "hepat-", "-itis" (the component morphemes), and we could normalize the components into "virus", "<adjective>", "liver", and "inflammation". However, we still need the interpretation of the relations among the components, and the classification of the components within a comprehensive domain terminology. The explicitation of meaning was initially carried out in an informal way, for example by analyzing the work done in medical terminologies or so-called "coding systems". It was clear that a conceptualization for a term was context-dependent, where "context" had to be intended in a wide sense, including:
Such context dependency (or "situatedness") convinced us to look for a broad perspective, including primarily a methodology for explicitating the conceptualization of a term only as far as its context requires, and secondarily languages and tools for implementing a conceptualization.
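The normalization step described above for "viral hepatitis" can be illustrated with a minimal Python sketch. It is not part of the machine-dictionary itself: the table NORMALIZED and the function normalize are hypothetical names, the table contains only the four morphemes of the example, and the decomposition is supplied by hand rather than computed.

```python
# Hypothetical morpheme table: only the four entries of the
# "viral hepatitis" example; a real machine-dictionary is far larger.
NORMALIZED = {
    "vir-": "virus",
    "-al": "<adjective>",
    "hepat-": "liver",
    "-itis": "inflammation",
}

def normalize(morphemes):
    """Map each morpheme to its normalized component, keeping the order."""
    return [NORMALIZED[m] for m in morphemes]

# The decomposition is given by hand; automatic segmentation of a term
# is outside the scope of this sketch.
viral_hepatitis = ["vir-", "-al", "hepat-", "-itis"]
print(normalize(viral_hepatitis))
# prints ['virus', '<adjective>', 'liver', 'inflammation']
```

The output is a flat list: nothing in it says that the virus causes the inflammation or that the inflammation is located in the liver. This is the gap that the conceptual account of the normalized term has to fill.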
In the following we give: in §2, our proposal of some methodological issues to be supported in engineering terminological ontologies; in §3, a practical description of the ONIONS methodology; in §4, a brief overview of the WWW toolkits we have experimented with; in §5, a sketch of our current ON9 ontology library, integrated from the terminological ontologies of five terminology systems.
The conceptualization activity -- the activity providing a specification to terms -- poses severe problems to modellers when concepts must be shared by different users in different contexts. Local conceptualizations are not suitable to support the tasks of making standards, writing guide-lines, reusing, integrating or sharing knowledge. To this purpose, we need:
(1) Procedures for capturing terminological knowledge: knowledge conceptualized in existing models may cover different areas, and this coverage is hardly predictable. Moreover, it is unclear how much coverage is needed in a standard model; terminological knowledge has various contextual constraints, and only the relevant ones should be conceptualized. We need a methodology for capturing all and only the knowledge we need for a scope, and for tracing the borders among the different areas covered by different models, but not in the sense of managing conflicting knowledge in different areas or even inside the same area (this is not a terminological issue).
(2) Procedures for explicitating domain ontologies and the related generic ontologies: the intended meaning of concepts in a local conceptualization is tailored to the local needs, thus a different conceptualization might have a different intended meaning. Moreover, as far as standards or guidelines are concerned, a conceptualization has to be acceptable to an entire community, not only to a local task. We need a methodology for conceptualizing the intended meaning of local conceptualizations under a unified, "multi-local" conceptualization. To this purpose, we require a library of generic ontologies to be constructed (see issue (6)). Also, we need procedures for reusing existing generic ontologies or formalizing other existing, but informal ones.
(3) A common, expressive language for expressing the resulting conceptualization: the language used is not neutral to the resulting conceptualization, thus different languages pose problems of translatability. A common language should have the expressivity to translate the constraints posed by other languages; it should be very expressive in any case.
(4) A viable implementation for concept classification: concepts defined in an ontology should be classified, for example according to the structural resemblance of their definitions (Mac Gregor, 1994). Structural resemblance is the most used strategy for concept classification within the description logics domain and poses hard problems of complexity; for example, it has been demonstrated that classification in languages with a certain expressivity is undecidable (Schmidt-Strauss, 1989). On the other hand, common practice has given various arguments in support of a more liberal strategy, especially if the "normal case" (instead of the "worst case") is adopted for testing the tractability of a language (see Speel (1995)).
(5) The explicitation of representation ontologies (a conceptualization of the intended meaning of the so-called "Meta-Level Categories" (MLC), like "class", "slot", "property", "relationship", etc.): an ontology uses formal languages which eventually result in additional constraints provided by the MLC of those languages. MLC are used with a different semantics in different languages and their interpretation is usually left to the intuition of the modeller. We need a semantic analysis of meta-level categories and good guidelines for applying them in the conceptualization activity.
(6) A library (modular) architecture for ontological theories: when the number of domain concepts exceeds a certain size, the maintenance of a unique domain ontology is very difficult, both from a computational and from a conceptual viewpoint. The problem is even harder when a domain ontology is a specialization of a generic ontology library (indeed, as it should be), because generic knowledge might be specified in generic ontologies which are not compatible among them, if they are taken as wholes. For this reason, we need to be modular. Generic modules could be included as wholes in domain modules, or they could only provide some concepts to domain ontologies (they could be used).
(7) Guidelines for the distributed modelling of ontologies: ontological engineering requires some decisions that are somewhat arbitrary:
a) about the ontologies to be included in a library;
b) about the definitions to be included in generic ontologies;
c) about which definitions are to be specified from generic to domain ontologies.
Although a rationale for a) b) c) is supplied by a methodology which deals with issue (1) and (2), the actual decisions to be taken in the application of this methodology are preferably to be discussed among various groups or institutions; for example, among an expert of a subdomain, a knowledge engineer, a philosopher and an industrial partner, all involved in making a terminological standard for surgical device concepts.
(8) Tools for on-line availability of ontologies (browsing and editing): once we have a methodology, a library architecture and a distributed modelling activity, one should find the fittest tools for carrying out such an activity. The ideal toolbox should provide:
a) Internet-available ontology libraries;
b) remote on-line browsing and editing of ontologies and definitions of concepts, possibly with interfaces customized to the expertise level of the user;
c) an interactive tool for collaborative discussion about the libraries.
We examine issues (1) (2) (3) (5) in more detail in (Steve and Gangemi, 1996; Steve and Gangemi, 1997; Gangemi et al. 1997). A relevant effort in the direction of (3) has been made by Gruber (1993). The problems in (5) have been studied by several authors (Gangemi et al. 1996; Gruber 1993; Guarino et al. 1994; Sowa 1996). The issues in (1) and (2) have received little attention in AI until recent times (Uschold and King, 1995; Steve and Gangemi, 1996; Valente and Breuker, 1996). Issue (4) is a classic subdomain of AI, taken into account by so-called description logics (Schmidt-Strauss, 1989; Brachman et al., 1991; Mac Gregor, 1994). In medicine, an important effort has been made in Europe by some CEN committees which address issues (1) (2) (3) (5) at various degrees of depth. The CANON group (Evans et al., 1994) mainly addressed (3); the MoSe pre-standard (CEN, 1995) provides some guidelines for (1) and (2); other groups are writing standard conceptual systems in medical sub-domains dealing with the issues in (2) (Rossi Mori et al., 1997). Issue (2) has been treated in (Grüninger, 1996; Steve and Gangemi, 1996; Borst and Akkermans, 1997), but it is mainly founded on the literature of naïve physics, linguistics and philosophy. Issues (6), (7) and (8) are a trademark of the quite recent research in ontological engineering (Falasconi and Stefanelli, 1994; Farquhar et al., 1996; Swartout et al., 1996). Issue (7) is also investigated in the current continuation of GALEN, GALEN-IN-USE.
Some research projects have a position in most of the issues presented. We could classify such approaches to ontology modelling by means of several features related to the above issues:
features related to the conceptual tradeoff between:
features related to the formal tradeoff between:
other features related to:
In order to quantitatively synthesize the concern of these features for some research projects (with no claim of completeness or precision: this is only a general indication), we present our assessment in the graphs in Fig. 1 and 2, and in Tab. 1. The values of these sets of features are on a conventional scale from 0 to 1. Fig. 1 shows the assessment of features related to the conceptual tradeoff. GALEN (GALEN, 1992-4), ON9 (Gangemi et al., 1997) and the UMLS Semantic Network (USN, Humphreys and Lindberg, 1992) are specifically designed for medical terminological ontologies; CYC (Lenat and Guha, 1990) and SENSUS (Swartout et al., 1996) are designed for non-specific terminological ontologies, especially for natural language processing; Games-II (Falasconi and Stefanelli, 1994), Kactus (Laresgoiti, 1996) and PhysSys (Borst and Akkermans, 1997) are mainly designed for domain knowledge-modelling ontologies (the first with application to medicine); formal ontology (for example: Borgo et al., 1996; Varzi and Casati, 1995) is not a specific project, but rather a wide, interdisciplinary research program to provide solid bases to generic ontologies.
Fig. 2 shows the assessment of features related to the formal tradeoff; Tab. 1 shows the "yes-no" assessment for other features. Consider that the validity of such comparisons is relative: the aims and contexts of different projects generate peculiar motivations; for example, only four of the nine systems listed have a main concern with terminological ontologies, which are a secondary aspect in the others. Nevertheless, some generalities can be described. So-called "bottom-up" approaches tend to privilege terminological coverage, tractability of the language and syntactic simplicity, and to be more distributed, while so-called "top-down" approaches tend to privilege conceptual principles, expressivity of the language and metalinguistic exactness, and to be more modular. Some stay in the middle, getting the most from both approach types.
Fig.1: The conceptual tradeoff in some ontology systems.
Fig.2: The formal tradeoff in some ontology systems
Tab.1: Other methodological features of some ontology systems
A main concern of our research is to provide a terminological ontology to the most important terminology systems in medicine. To this purpose, we developed a methodology which addresses the above issues. We defined ONIONS with the goal of:
ONIONS then led to the successful integration of the most general concepts (more than five thousand) of five terminology systems. A complete description is reported in other papers (Gangemi et al., 1997; Steve et al. 1997) and an account of the operative and philosophical requirements that motivated the design of ONIONS can be found in Steve and Gangemi (1996). We adopted Ontolingua and Loom as formalisms for representing the results of the integration of our terminology systems. Ontolingua (Gruber, 1993) -- a language developed from KIF (Neches et al., 1991) -- allows expressivity for both frame-like and axiomatic constraints. Loom (Mac Gregor, 1991) is a quite expressive implementation of a description logic with classification services.
The most relevant need to satisfy was to have an ontology open to revisions without maintenance troubles, together with a "buy-what-you-need" approach: if one only talks about anatomy, why inflate his/her ontology with all the material about chemicals? Such an approach also allows negotiability, i.e. if one does not agree on a certain part of a conceptualization, the whole ontology does not have to be discarded. Therefore we put an emphasis on modularization, providing an architecture that allows alternatives and conflicts without losing the reference to the generic ontologies that are included or used in the modules.
In Figure 3 we introduce, in an abstract and schematic form, the basics of the ONIONS methodology. The motivations of such a methodology and its feasibility are a matter of wider discussion and are only briefly recalled here.
Here we describe the properties of a terminology system at different development states; the description is therefore largely independent of whether the phases we designed are exactly the right ones to realize those properties. Figure 3 is a schematic account of some of the previous issues, which envisages a methodology with six phases and a set of input and output states in the building of a terminology system. Such states are described by a set of structural and ontological properties. We name a property "ontological" if it concerns the principles of a conceptual system.
Figure 3. The methodological phases to build a terminology system which can face the new communication needs: knowledge integration, re-use, sharing. The output of any phase is a special state of one or more terminology systems, described by both structural and ontological properties. Such states are independently re-usable for a specific purpose.
Each ONIONS phase Mi makes a terminology system evolve from a state Si into a state Si+1. Pi and Oi are respectively the structural and ontological properties of Si systems. Hence, such properties allow a classification of existing terminology systems according to their structural and ontological properties.
The methodology has been tested on relevant portions of ICD10, SNOMED-III, and GMN, and on the USN and GCM terminological ontologies. Other specialized corpora of medical terms have been conceptualized as well (e.g. surgical procedures (Rossi Mori et al., 1997)). Currently we are extending the models to cover the entire UMLS Metathesaurus[TM]. Depending on the purpose of the integration, a terminology system may reach a state - e.g. S4 - and stay there without needing a further evolution. P and O properties do not just repeat the issues presented before, because methodological phases are designed to account mainly for issues (1)(2)(3)(5), and are motivated by the organization of existing terminology systems.
It should be emphasized that the lifecycle presented here is that of the ONIONS methodology, and actual terminology systems might not follow it strictly. A system might be in a state Si without having passed through the previous ones, or it might be in a hybrid state where its structural property is Pj and its ontological property is Ok with j ≠ k.
As an example of application of the ONIONS methodology, let us show the transition from state S4 to state S5 in the case of the definition of "viral hepatitis A".
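The bookkeeping of states and phases just described can be made concrete with a small Python sketch. It is only an illustration under our own assumptions: SystemState and apply_phase are hypothetical names, and the rule that a phase Mi applies only to a non-hybrid system in state Si is one possible reading of the lifecycle, not a requirement stated by ONIONS.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SystemState:
    """State of a terminology system: indices of its P and O properties."""
    structural: int   # j in Pj
    ontological: int  # k in Ok

    def is_hybrid(self) -> bool:
        # A hybrid state mixes properties of different states (j != k).
        return self.structural != self.ontological

def apply_phase(state: SystemState, i: int) -> SystemState:
    """Phase Mi takes a system from state Si to state Si+1.

    Assumption of this sketch: the phase is refused for hybrid systems
    and for systems that are not in state Si.
    """
    if state.is_hybrid() or state.structural != i:
        raise ValueError(f"phase M{i} expects a system in state S{i}")
    return SystemState(i + 1, i + 1)

s4 = SystemState(4, 4)
s5 = apply_phase(s4, 4)
print(s5)                             # SystemState(structural=5, ontological=5)
print(SystemState(5, 3).is_hybrid())  # True
```

Keeping the two indices separate is what allows existing terminology systems to be classified even when they never followed the lifecycle in order.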
State S4 definition is expressed in Ontolingua as:
;;; Definition (1)
(define-class viral-hepatitis-a (?vh)
  "the inflammation of liver caused by virus A; it has an incubation of 15 to 50 days and is accompanied by jaundice"
  :class-slots ((subclass-of viral-hepatitis-a inflammation))
  :instance-slots ((has-incubation viral-hepatitis-a 15-to-50-days)
                   (affects viral-hepatitis-a liver)
                   (caused-by viral-hepatitis-a virus-a)
                   (is-accompanied-by viral-hepatitis-a jaundice))
  :issues ((:see-also "in SNOMED, the code is D-0521"
                      "in ICD9-CM the code is 070.1")
           (:generic-theories
             "inflammation requires a multiple account within a theory of biologic functions and a theory of biologic morphologies"
             "affects requires a theory of actants and a theory of functions"
             "caused-by requires a theory of causality"
             "the patient status is not mentioned"
             "anatomy is not mentioned: at least, a part-whole theory is required")))
The methodological phase M4 consists in:
1) the construction (or the reuse, if available) of a library of generic ontologies to account for the (:issues (:generic-theories)) requirements recorded at S4 (for example those given in (1)); this amounts to building a well-grounded top-level (Sowa, 1996; Guarino, 1997);
2) the inclusion in the domain ontology of generic ontologies specifying relevant conceptual principles;
3) the assignment of sound meta-level categories to the classes and relations in the library.
This leads to S5. At state S5 definitions are axiomatized (P5); ontologically, definitions have an explicit semantics (O5) of both the top-level concepts (the concepts provided by generic ontologies) and of the meta-level categories (the concepts provided by a representation ontology). As explained in S4, no classic terminology system has S5 features. An S5 definition for viral-hepatitis-A is:
;;; Definition (2)
(define-class viral-hepatitis-a (?vh)
  "the inflammation process of liver caused by virus A; it has an incubation of 15 to 50 days and is accompanied by jaundice"
  :def (and (inflammation-process ?vh)
            (exists ?vir
              (and (has-a-cause ?vh ?vir)
                   (virus-a ?vir)))
            (exists ?liv
              (and (is-embodied-in ?vh ?liv)
                   (and (liver ?liv)
                        (exists ?pat
                          (and (part ?liv ?pat)
                               (*patient ?pat))))))
            (exists ?inc
              (and (is-constitutive-phase-of ?inc ?vh)
                   (and (incubation ?inc)
                        (= (temporal-value ?inc) ?n)
                        (>= ?n 15)
                        (=< ?n 50))))
            (forall (?jau ?pat)
              (=> (and (jaundice ?jau)
                       (*patient ?pat)
                       (is-embodied-in ?jau ?pat))
                  (occurs-in ?jau ?vh))))
  :issues ((:see-also "in SNOMED-II the code is D-0521"
                      "in ICD9-CM the code is 070.1")))
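To make the quantifier structure of the S5 definition easier to follow, here is a minimal Python sketch that reads each conjunct of the :def expression as a test over a toy finite world. It is an illustration only, not part of ONIONS or Ontolingua: the names WORLD, LONG_INCUBATION and is_viral_hepatitis_a, the individuals (vh1, liv1, ...) and the dictionary layout are all assumptions, and the class *patient is written "patient".

```python
# Toy world (all individuals and facts are hypothetical).
WORLD = {
    "individuals": {"vh1", "vir1", "liv1", "pat1", "inc1", "jau1"},
    "classes": {
        "inflammation-process": {"vh1"}, "virus-a": {"vir1"},
        "liver": {"liv1"}, "patient": {"pat1"},
        "incubation": {"inc1"}, "jaundice": {"jau1"},
    },
    "relations": {
        "has-a-cause": {("vh1", "vir1")},
        "is-embodied-in": {("vh1", "liv1"), ("jau1", "pat1")},
        "part": {("liv1", "pat1")},
        "is-constitutive-phase-of": {("inc1", "vh1")},
        "occurs-in": {("jau1", "vh1")},
    },
    "temporal_value": {"inc1": 28},  # incubation length in days
}

def is_viral_hepatitis_a(vh, world):
    """Evaluate the conjuncts of the S5 definition for individual `vh`."""
    inds = world["individuals"]
    days = world["temporal_value"]

    def isa(x, cls):
        return x in world["classes"].get(cls, set())

    def holds(rel, x, y):
        return (x, y) in world["relations"].get(rel, set())

    # (exists ?vir ...): some cause of vh is a virus A
    caused_by_virus_a = any(
        holds("has-a-cause", vh, v) and isa(v, "virus-a") for v in inds)
    # (exists ?liv ... (exists ?pat ...)): vh is embodied in a liver
    # that is part of some patient
    in_patient_liver = any(
        holds("is-embodied-in", vh, l) and isa(l, "liver")
        and any(holds("part", l, p) and isa(p, "patient") for p in inds)
        for l in inds)
    # (exists ?inc ...): an incubation phase lasting 15 to 50 days
    incubation_ok = any(
        holds("is-constitutive-phase-of", i, vh) and isa(i, "incubation")
        and i in days and 15 <= days[i] <= 50
        for i in inds)
    # (forall (?jau ?pat) ...): every jaundice embodied in a patient
    # occurs in vh
    jaundice_ok = all(
        holds("occurs-in", j, vh)
        for j in inds for p in inds
        if isa(j, "jaundice") and isa(p, "patient")
        and holds("is-embodied-in", j, p))
    return (isa(vh, "inflammation-process") and caused_by_virus_a
            and in_patient_liver and incubation_ok and jaundice_ok)

# Same world, but with an incubation outside the 15-50 day range.
LONG_INCUBATION = {**WORLD, "temporal_value": {"inc1": 60}}

print(is_viral_hepatitis_a("vh1", WORLD))            # True
print(is_viral_hepatitis_a("vh1", LONG_INCUBATION))  # False
```

The sketch follows the definition literally, including the fact that the universally quantified patient is not tied to the patient whose liver is affected.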
S5 definitions are usually more detailed than the lower level ones. This is caused by the reference to generic ontologies, which constrain the modeller to explicitate what is usually "collapsed" in local definitions. A typical case here is the passage from the relation "has-incubation" in (1) to the complex quantified statement in (2). Local definitions do not need to "say it all". But when different local definitions must be integrated, some implicit parts have to be expanded. In fact, definition (2) differs from (1) in several aspects, because it gives an answer to the calls specified in the :issues of (1) and deals with the forms not expressible in the frame syntax; in particular, we needed:
(a) concepts which are subsumed by other concepts already defined in a generic ontology;
(b) an ontologically sound representation of the complex instance in (1);
(c) the specification that only some of the instances occurring as the second argument of the relation is-accompanied-by, when the first instance is a viral-hepatitis-A, are instances of jaundice;
(d) some quantified expressions to talk of a generic patient whose liver is infected and shows some symptoms after an incubation period: this cannot be represented in simple frame style. One should link patient with inflammation, liver, jaundice, incubation and virus-A.
(a) and (b) are solved by specializing the appropriate concepts from the required generic ontologies:
(c) and (d) are solved by conjoining the three universally quantified implication expressions (within the :def keyword) corresponding to the three slot-value-type expressions (within the :axiom-def keyword), with two more expressions, existentially quantified, which account for the incubation period and the jaundice sign. In Loom, we may obtain a classifiable translation to the entire (2) by using just the Tbox language:
(defconcept viral-hepatitis-a
  :context infectious-diseases
  :is-primitive
    (:and inflammation-process
          (:some has-a-cause virus-a)
          (:some is-embodied-in (:and liver (:some part *patient)))
          (:some has-constitutive-phase
                 (:and incubation
                       (:the temporal-value (:and day (:through 15 50)))))
          (:all has-occurrence-of
                (:and jaundice (:some is-embodied-in *patient))))
  :annotations ((documentation "the inflammation process of liver caused by virus A; it has an incubation of 15 to 50 days and is accompanied by jaundice")))
In the Loom language, :is-primitive means that viral-hepatitis-A is a primitive concept (it does not have a complete definition); slots and types are introduced by means of the :and, :some, :all and :the keywords. Full classification for the predicate calculus is not available in Loom, but it is being implemented in PowerLoom (MacGregor, 1994). Thus, Loom provides efficient syntax and semantics for managing a consistent subset of FOPL with its classifier.
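The idea of classifying concepts by the structural resemblance of their definitions can be sketched in Python for a very small concept language. This is not Loom's classifier: the sketch assumes concepts built only from conjunctions of primitive names and existential (:some) restrictions, treats every concept as fully defined, and ignores :all, :the and numeric ranges. The names Concept and subsumes are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Concept:
    """Conjunction of primitive names and existential (:some) restrictions."""
    atoms: frozenset = frozenset()
    some: tuple = ()  # pairs (role, Concept)

def subsumes(general: Concept, specific: Concept) -> bool:
    """True if every conjunct of `general` is matched inside `specific`."""
    if not general.atoms <= specific.atoms:
        return False
    # Each (:some role filler) of the general concept needs a (:some ...)
    # on the same role in the specific one whose filler is subsumed.
    return all(
        any(r == role and subsumes(filler, f) for r, f in specific.some)
        for role, filler in general.some)

# "An inflammation process embodied in a liver" (hypothetical concept).
liver_inflammation = Concept(
    frozenset({"inflammation-process"}),
    (("is-embodied-in", Concept(frozenset({"liver"}))),))

# A fragment of the viral-hepatitis-a definition, without the
# incubation and jaundice constraints.
hepatitis_a_fragment = Concept(
    frozenset({"inflammation-process"}),
    (("has-a-cause", Concept(frozenset({"virus-a"}))),
     ("is-embodied-in",
      Concept(frozenset({"liver"}),
              (("part", Concept(frozenset({"patient"}))),)))))

print(subsumes(liver_inflammation, hepatitis_a_fragment))  # True
print(subsumes(hepatitis_a_fragment, liver_inflammation))  # False
```

In a classifier, repeated subsumption tests of this kind place a new concept below its most specific subsumers in the taxonomy; the cost of richer constructs is what motivates the expressivity versus tractability tradeoff discussed in issue (4).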
In §2, under issues (7) and (8), we proposed some requirements; in particular, we claimed that modelling terminological ontologies needs a toolbox for distributed collaboration. In §3 we have shown the complexity of the term conceptualization activity: several decisions have to be taken on terminology system analysis, formal choices, theories to include, literature to scan, translations to perform, etc. Those decisions can be validated only by the collaborative effort of interdisciplinary experts. For this reason, we formulated four required functions:
(a) ontology libraries available on the Internet; (b) on-line remote accessing of libraries for editing, saving, and exporting ontologies and concept definitions; (c) interfaces to libraries which are customized to the expertise level of the user; (d) an interactive tool for collaborative discussion about the libraries: where different modellers could experiment and face each other about the effects of ontological choices on terminology integration, as well as about the constraints posed by terminology integration on ontological choices.
Several tools have these functions. During the development of ON9 and its former versions (ON6-8 (Steve and Gangemi, 1996)), we experimented with some of them.
Function (a) is supported by Ontolingua 4.0 (for main features, see Tab. 2), a Common Lisp implementation of Ontolingua released in 1994, but no longer distributed. We appreciate its high expressive power, which allows both first and second order logic expressions, as well as frame-like expressions. We still use it to write the primary sources of our ontologies. Ontolingua 4.0 can translate ontologies into Loom, KIF, Generic Frame Protocol (GFP), and other languages, and creates HTML directories containing hypertextual versions of: the source files of ontologies, the ontology reports and the individual concept definitions in GFP.
The main drawback of Ontolingua is the lack of inferential capabilities. In fact, due to the high expressive power, one may lose the control of the consistency among concepts and among ontologies: sometimes we have experienced this when translating from Ontolingua to Loom. On the other hand, even such a drawback can be an advantage if one is not interested, at least in a first phase, in spending a lot of time in revising theory inclusions and definition allocations.
Function (b) (together with (a)) is currently supported by various tools, e.g. the Stanford Ontolingua Server (http://www.ksl.stanford.edu). A centralized Server is the current policy of the Ontolingua developers. The Server allows on-line remote access to libraries for editing, saving, and exporting ontologies, all with a convenient interface, but it provides fewer predicate calculus construct types than Ontolingua 4.0; on the other hand, the developers have enhanced the frame-like constructs.
Figure 4: Loom definition and related context library for "viral-hepatitis-A" through Ontosaurus
Ontosaurus (or "Loom-HTTP") is an ontology server implemented using CL-HTTP (Mallery, 1994), a Common Lisp Web server, the Loom knowledge representation system (Mac Gregor, 1991), and Lisp code that interfaces Loom to CL-HTTP. Ontosaurus incorporates Loom, and thus takes advantage of Loom's reasoning capabilities, especially for concept classification. On the other hand, having an operational KR system forces the knowledge base to be kept coherent and thus makes multiple simultaneous edits to it (a part of our function (d)) difficult, as explicitly recognized by the developers. Ontosaurus includes translators to Ontolingua, KIF, KRSS (Patel-Schneider and Swartout, 1993) and C++ (with obvious limitations in translatability), among others. We are currently using Ontosaurus to perform function (b): it offers many semantic services for the conceptualization activity; also, it is quite portable and thus sharable with collaborating centers. Since our primary sources are written in Ontolingua, we translated them into Loom. The original translator from Ontolingua 4.0 is helpful, but substantial hand revision must be performed for some constructs. Examples of Ontosaurus are in Figs. 4 through 6. The definition of viral-hepatitis-a is shown in Fig. 4: the upper frame contains the main commands for browsing, loading, editing and saving Loom contexts; the left frame is a "reference" frame where one can put some useful file; the right frame is the actual working frame. Fig. 5 shows a different view of the same concept, which provides taxonomical information on the left and the applicable relations on the right. Fig. 6 shows a piece of the editing environment.
Figure 5 Loom taxonomy and related roles for "viral-hepatitis-A" through Ontosaurus
Function (c)is partly handled by the existing tools: all their interfaces help accessing, retrieving, editing ontologies. GFP is also a very intuitive format for frame definitions. On the other hand, physicians are not interested in understanding the logical nuances of the languages presented; in the current project GALEN-IN-USE, a special intermediate tool has been created to let medical experts make their models: even a friendly syntax as the GRAIL's resulted slightly awkward.
WW Lab Server seems promising because it is compliant with most of our requirements. Moreover, OCML has the same expressive power of Ontolingua 4.0, thus we could concentrate all functions in one toolbox, then a (forthcoming) OCML to Loom translator would map ontologies to the semantic services of a classifier.
ON9 (available at http://saussure.irmkant.rm.cnr.it/onto/ON9/index.html)is a library of ontologies designed by means of the ONIONS methodology.
Figure 7 shows an inclusion lattice of some ON9 ontologies: the representation ontologies provided by default in Ontolingua are "frame-ontology" and the set of kif-ontologies. We defined the ontologies "structuring-concepts", "meta-level-concepts" and "semantic-field-ontology" to link the representation ontologies with the generic ontology library. The sets of "structural ontologies" and of "structuring ontologies" contain generic ontologies. Generic ontologies are variously included in domain ontologies. In particular, integrated-medical-ontology includes all the generic ontologies, which have been used to integrate the terminological ontologies of the five terminology systems.
Figure 7. A significant subset of the inclusion lattice of the ON9 library of ontologies. Ontologies are represented by black circles. Thick grey frames or circles are sets of ontologies (some explicitly show the elements). The semantics of black arrows is included-in (applied differently by Ontolingua or Loom, see text). The dashed grey arrow means integrated-in.
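The inclusion lattice can be read as a directed graph in which an ontology inherits whatever the ontologies it includes provide. The following Python fragment is a minimal sketch of that reading, not part of ON9 or of any of the tools discussed: only the ontology names come from the text, while the individual edges, the `INCLUDES` table and the `all_included` helper are hypothetical, and treating inclusion as transitive is itself an assumption (the caption notes that Ontolingua and Loom apply included-in differently).

```python
from typing import Dict, Set

# Hypothetical fragment of an inclusion lattice: each ontology maps to the
# ontologies it directly includes. The edges are illustrative assumptions.
INCLUDES: Dict[str, Set[str]] = {
    "frame-ontology": set(),
    "structuring-concepts": {"frame-ontology"},
    "meta-level-concepts": {"frame-ontology"},
    "meronymy": {"structuring-concepts", "meta-level-concepts"},
    "integrated-medical-ontology": {"meronymy"},
}


def all_included(ontology: str, includes: Dict[str, Set[str]]) -> Set[str]:
    """Return every ontology reachable from `ontology` through inclusion."""
    seen: Set[str] = set()
    stack = list(includes.get(ontology, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            # Follow the inclusions of the included ontology as well.
            stack.extend(includes.get(current, ()))
    return seen


if __name__ == "__main__":
    print(sorted(all_included("integrated-medical-ontology", INCLUDES)))
```

Under these assumed edges, loading integrated-medical-ontology would bring in meronymy, both linking ontologies and frame-ontology.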
The current ON9 ontology library consists of five identifiable sets of models:
1) the intermediate byproducts of the ONIONS integration of the top-levels of the five terminology systems: conceptual primitives (from phase M1), taxonomical inclusions (from phase M2), and formal Local Definitions (LD) (from phase M3). For example, the LDs in USN include taxonomic constraints and some constraints ("templates") on domain and range of relations, stated within class definitions. The following is the formalization of the LD of "organism" in the theory "o-umls" (the Ontolingua translation of USN):
(4)
(define-class Organism (?x)
  "Generally, a living individual, including all plants and animals"
  :class-slots ((subclass-of Physical-Object))
  :instance-slots ((affected-by Organism Acquired-Abnormality)
                   (affected-by Organism Biologic-Function)
                   (affected-by Organism Congenital-Abnormality)
                   (has-part Organism Anatomical-Structure)
                   (has-process Organism Biologic-Function)
                   (has-property Organism Organism-Attribute)
                   (interacts-with Organism Organism)
                   (interacts-with Organism Organism))
  :issues ((:generic-theories
            "has-part requires a part-whole ontology"
            "has-process and is-affected-by require an ontology of actants")))
Other terminology systems are poorer; for example, the similar SNOMED-III concept, "living organisms", is given as a primitive.
2) a library of Generic Ontologies (GO) to be used in the integration process (Fig. 7). This work has been carried out with a minimalistic strategy: only those parts of theories which are useful for the integration process are "bought". For example, given the need to buy some theory of parts and wholes, we chose a subset of the so-called Calculus of Individuals from the philosophical literature (Leonard and Goodman, 1940) and some specific notions of part from the cognitive science literature (Gerstl et al., 1996), and formalized them in a theory, "meronymy". The following is an Ontolingua definition of "overlaps" from the Calculus of Individuals; it uses some second-order predicates for properties of relations and some first-order axioms of equivalence (here stated under the keyword :iff-def):
(in-theory 'meronymy)
(define-relation common-part-with (?x ?y)
  "this is the minimal definition for 'overlapping' in classical extensional mereology, and should be compliant with both Leonard-Goodman calculus of individuals and Tarski's axioms"
  :axiom-def (and (reflexive-relation common-part-with)
                  (symmetric-relation common-part-with)
                  (alias common-part-with overlaps))
  :iff-def (exists ?z (and (part ?z ?x) (part ?z ?y))))
3) the Integrated Medical Ontology (IMO), including some ontologies from GO and some Domain Ontologies (DO). For example, a definition corresponding to (4) is specified in the theory "biologic-objects" as follows:
(6)
(in-theory 'biologic-objects)
(define-class organism (?org)
  "the type concept for living objects in the biologic layer (cf. M Blois [$M-I&M] and N Hartmann [$]P-GF)"
  :axiom-def (and (!type organism)
                  (subclass-of organism biologic-object)
                  (value-type organism has-component abnormal-body-part)
                  (value-type organism embodies abnormal-function)
                  (value-type organism embodies pathologic-function))
  :def (and (exists ?phy (and (embodies ?org ?phy)
                              (physiologic-function ?phy)))
            (exists ?bp (and (part ?bp ?org)
                             (body-part ?bp)))))
Formula (6) makes use of a dedicated second-order predicate (we defined it in theory: meta-level-ontology), which assigns a meta-level category and is expressed in this specification by !type; of some second-order axioms in the way of (6); and of some first-order axioms, stated under the keyword :constraints, which specify more complex constraints (see also §3.).
4) the mappings between each LD and the IMO. For example, having both (4) and (6), (4) is modified by adding a constraint as follows:
(in-theory 'o-umls)
(define-class organism (?x)
  etc. etc. {see (4)}
  :constraints (integrated-in organism organism biologic-objects))
which states that "organism" in the theory: o-umls is integrated in "organism" in the theory: biologic-objects, which is a module in the ON9 library. "integrated-in" is a ternary relation. Obviously, all the concepts and relationships appearing in the o-umls definition have an integration mapping in some ON9 module.
5) some specialized domain ontologies: surgical procedures, clinical activities, infectious diseases, clinical guidelines, etc., using a subset of modules from IMO (Fig. 7).
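Two of the definitions in the list above can be illustrated with a small executable model. The following Python fragment is only a sketch under stated assumptions, not the Ontolingua semantics: it evaluates the :iff-def of common-part-with (two individuals overlap when some individual is part of both) over a finite, hand-written parthood relation, and it stores integrated-in as a ternary record (local concept, integrated concept, target module) keyed by the local theory. The anatomical facts, the `PART` and `MAPPINGS` tables and the helper names are hypothetical; only the relation names and the o-umls/biologic-objects example come from the text.

```python
from typing import Dict, NamedTuple, Optional, Set, Tuple

# Toy parthood facts as (part, whole) pairs. Assumption: the set is already
# closed under reflexivity and transitivity; the anatomy is illustrative only.
PART: Set[Tuple[str, str]] = {
    ("body", "body"), ("liver", "liver"), ("heart", "heart"),
    ("hepatic-lobe", "hepatic-lobe"),
    ("liver", "body"), ("heart", "body"),
    ("hepatic-lobe", "liver"), ("hepatic-lobe", "body"),
}


def parts_of(whole: str, part: Set[Tuple[str, str]] = PART) -> Set[str]:
    """All individuals recorded as parts of `whole`."""
    return {p for (p, w) in part if w == whole}


def overlaps(x: str, y: str, part: Set[Tuple[str, str]] = PART) -> bool:
    """True iff some z is part of both x and y (the :iff-def above)."""
    return bool(parts_of(x, part) & parts_of(y, part))


class IntegratedIn(NamedTuple):
    """One integrated-in mapping, read as a ternary relation."""
    local_concept: str
    integrated_concept: str
    module: str


# Hypothetical mapping table keyed by (local theory, local concept).
MAPPINGS: Dict[Tuple[str, str], IntegratedIn] = {
    ("o-umls", "organism"): IntegratedIn("organism", "organism", "biologic-objects"),
}


def integration_target(theory: str, concept: str) -> Optional[Tuple[str, str]]:
    """Return (module, integrated concept) for a local concept, if mapped."""
    mapping = MAPPINGS.get((theory, concept))
    return None if mapping is None else (mapping.module, mapping.integrated_concept)
```

With these toy facts, overlaps("liver", "body") holds because the liver is part of both, while overlaps("liver", "heart") does not; reflexivity and symmetry follow from the closed parthood set and from the symmetric form of the definition.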
From the ONIONS experience of developing terminological ontologies in recent years, we can claim that:
a) from the viewpoint of conceptual integration of terminologies, the ontologies produced through ONIONS may support:
b) from the viewpoint of reuse and maintenance, the ontologies produced through ONIONS may support:
We presented an overview of ontology languages and explained why we consider rich expressivity a prerequisite. Our experience suggests that representing a terminological ontology requires complex formal specifications involving full first-order sentences, some second-order sentences about situation and contextual change, pervasive existential quantification, definition of meta-level categories of the representation language, etc. We have found that Ontolingua, OCML, and Loom are well suited to these purposes.
c) from the viewpoint of cooperative ontology modeling and validation on the WWW, use and integration of ON9 should be negotiated or customized by:
We also presented an overview of toolboxes for ontology construction and explained why we consider collaborative modeling capabilities an even stronger prerequisite. We currently use Ontosaurus to fit our needs, and we plan to use WW Lab Server to test real-time interactive modeling collaboration. Although ON9 is still being tested by experts, there is no doubt that acceptance, rejection and extension are fundamental phases in the process of ontology validation, extension and update. The need for extensive off-line human intervention in the search, choice, and formalization of generic ontologies can be seen as an unavoidable bottleneck in ONIONS ontology modelling. An appealing alternative would be to adopt in the generic library a systemic approach, which is widely shared and formally available. Our analysis shows, however, that system theory, widely used in engineering domains (the usual configuration of component-state-event-process), does not fit the medical domain. The basic principles motivating the conceptualization of terminology in medical domains refer also to other theories, such as those provided (mostly in informal ways) by linguistics, philosophy, and cognitive science.
This paper has benefited from the useful comments and suggestions of Enrico Motta and Fabrizio Giacomelli. Our research is partly supported by the Italian National Research Council Special Project ONTOINT (ONTOlogical tools for Information iNTegration).
Borgo S, Guarino N, Masolo C. (1996) Stratified Ontologies: The Case of Physical Objects. in Vet (ed.) Proceedings of ECAI96 (Workshop on Ontological Engineering), New York, John Wiley.
Borst P, Akkermans H, Top J. (1997) Engineering Ontologies. International Journal of Human-Computer Studies, 46.
Brachman R, McGuinness DL, Patel-Schneider PF, et al. (1991) Living with Classic. in JF Sowa (ed.): Principles of Semantic Networks, San Mateo, CA, Morgan Kaufmann.
CEN prENV 12264:1995 (1995) Medical Informatics - Categorial structure of systems of concepts - Model for representation of semantics (Document For Formal Vote). Brussels: CEN.
Cohn AG, Randell DA, Cui Z. (1996) Taxonomies of Logically Defined Qualitative Spatial Relations. International Journal of Human-Computer Studies, 43.
Coté RA, Rothwell DJ, Brochu L (eds) (1994) SNOMED International, 3rd ed., 4 vols. Northfield, Ill: College of American Pathologists.
Doyle J, Patil R. (1991) Two Theses of Knowledge Representation: Language Restrictions, Taxonomic Classifications, and the Utility of Representation Services. Artificial Intelligence, 49.
Evans DA, Cimino JJ, Huff SM, Bell DS for the CANON Group (1994) Toward a Medical-Concept Representation Language. Journal of the American Medical Informatics Association, 1:207-17.
Falasconi S, Stefanelli M. (1994) A Library of Medical Ontologies. in Proceedings of ECAI94 (Workshop on Comparison of Implemented Ontologies), New York, John Wiley.
Falasconi S, Lanzola G, Stefanelli M. (1996) Using Ontologies in Multi-Agent Systems. in B Gaines, M Musen (eds), Proceedings of Knowledge Acquisition Workshop, Banff, participants edition.
Farquhar A, Fikes R, Rice J. (1996) The Ontolingua Server: a Tool for Collaborative Ontology Construction. in B Gaines, M Musen (eds), Proceedings of Knowledge Acquisition Workshop, Banff, participants edition.
Fillmore CJ. (1971) Types of Lexical Information. in: DD Steinberg, LA Jakobovits (eds): Semantics: an Interdisciplinary Reader in Philosophy, Linguistics and Psychology, Cambridge UP.
Gabrieli E. (1989) A New Electronic Medical Nomenclature. Journal of Medical Systems, 3.
Gaines BR. (1994) Class Library Implementation of an Open Architecture Knowledge Support System. International Journal of Human-Computer Studies, 41.
GALEN Project (1992-4) Documentation available from the main contractor Rector AL, Medical Informatics Group, Dept. of Computer Science, Univ. Manchester, Manchester M13 9PL, UK.
Gangemi A, Galanti M, Galeazzi E, Rossi Mori A. (1992) Compositional Semantics for Medical Records. in R Scherrer, S Mandil (eds.): Proceedings of MedInfo92, Amsterdam: Elsevier Science Publishers.
Gangemi A, Steve G, Giacomelli F. (1996) ONIONS: An Ontological Methodology for Taxonomic Knowledge Integration. in Vet (ed.) Proceedings of ECAI96 (Workshop on Ontological Engineering), New York, John Wiley.
Gangemi A, Steve G, Pisanelli DM, Giacomelli F. (1997) Ontological Integration of Terminologies with ONIONS. in PJ Charrel, H Kangassalo (eds.) Proceedings of European-Japanese Seminar on Information Modeling and Knowledge Bases, Toulouse, participant edition.
Gennari JH, Tu SW, Rothenfluh TE, Musen M. (1994) Mapping Domains to Methods in Support of Reuse. International Journal of Human-Computer Studies, 41, 399-424.
Gerstl P, Pribbenow S. (1996) Midwinters, Endgames, and Body Parts: A Classification of Part-Whole Relations. International Journal of Human-Computer Studies, 43.
Gruber T. (1993) A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition, 5:188-220.
Grüninger M. (1996) Designing and Evaluating Generic Ontologies. in Vet (ed.) Proceedings of ECAI96 (Workshop on Ontological Engineering), New York, John Wiley.
Guarino N, Carrara M, Giaretta P. (1994) An Ontology of Meta-Level Categories. in J Doyle, E Sandewall and P Torasso (eds.), Principles of Knowledge Representation and Reasoning: Proceedings of KR94. San Mateo, CA, Morgan Kaufmann.
Harnad S. (1990) The Symbol Grounding Problem. Physica D, 42.
Humphreys BL, Lindberg DA. (1992) The Unified Medical Language System Project. in Lun KC et al. (eds): Proceedings of MedInfo92, Amsterdam: Elsevier Science Publishers.
Laresgoiti I, Anjewierden A, Bernaras A, et al. (1996) Ontologies as Vehicles for Reuse: a mini-experiment. in B Gaines, M Musen (eds), Proceedings of KAW96.
Lenat D, Guha R. (1990) Building Large Knowledge-Based Systems. Reading, MA: Addison-Wesley.
Leonard HS, Goodman N. (1940) The Calculus of Individuals and its Uses. Journal of Symbolic Logic, 5.
Mac Gregor RM. (1991) The Evolving Technology of Classification-based Knowledge Representation Systems. in JF Sowa (ed.): Principles of Semantic Networks, San Mateo, CA, Morgan Kaufmann.
Mac Gregor RM. (1994) A Description Classifier for the Predicate Calculus. in Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI 94), New York: John Wiley.
Mallery JC. (1994) A Common LISP Hypermedia Server. in Proceedings of WWW94, participants edition.
Martil R, Turner T, Terpstra P. (1995) Knowledge Reuse in Technical Domains: The KACTUS Project. in Proceedings of The Impact of Ontologies on Reuse, Interoperability and Distributed Processing, Unicom Seminar, London.
McGuire JG, Kuokka DR, Weber JC et al. (1993) SHADE: Technology for Knowledge-Based Collaborative Engineering. Journal of Concurrent Engineering, 1.
Miller GA, Johnson-Laird PN. (1976) Language and Perception. Cambridge UP.
Motta E. (1995) KBS Modeling in OCML. Modeling Languages for KBS, VU Amsterdam Technical report.
Neches R et al. (1991) Enabling Technology for Knowledge Sharing. AI Magazine, fall 91.
Patel-Schneider PF, Swartout B. (1993) Draft of the Description Logic Specification from the KRSS group of the DARPA Knowledge Sharing Effort.
Prince G. (1982) Narratology. Berlin: De Gruyter.
Rector A, Gangemi A, Glowinski A, et al. (1994) The GALEN CORE Model Schemata for Anatomy: Towards a Re-Usable Application-Independent Model of Medical Concepts. in Proceedings of MIE94, Lisbon, participants edition.
Riva A, Ramoni M. (1996) LispWeb: a Specialized HTTP Server for Distributed AI Applications. Computer Networks and ISDN Systems, 28.
Rossi Mori A, Thornton A, Gangemi A. (1990) An Entity-relationship Model for a European Machine-Dictionary of Medicine. in RA Miller (ed.): Proceedings of SCAMC90, New York: IEEE Press.
Rossi Mori A, Gangemi A, Steve G, et al. (1997) An Ontological Analysis of Surgical Deeds. in Garbay C et al. (eds) Proceedings of Artificial Intelligence in Europe AIME97, Berlin: Springer Verlag.
Schmidt-Schauss M. (1989) Subsumption in KL-ONE is Undecidable. in Proceedings of the IEEE wks. on principles of KBS, Denver, participants edition.
Sowa JF. (1996) Top-Level Ontological Categories. International Journal of Human-Computer Studies, 43.
Speel PH. (1995) Selecting Description Logics for Real Applications. in A Borgida, M Lenzerini, et al. (eds): Proceedings of International Workshop on Description Logics, Rome, participants edition.
Stedman's (1995) Stedman's Medical Dictionary, 26th Edition. Baltimore: Williams and Wilkins.
Steve G, Gangemi A. (1996) ONIONS Methodology and the Ontological Commitment of Medical Ontology ON8.5. in B Gaines, M Musen (eds), Proceedings of Knowledge Acquisition Workshop, Banff, participants edition.
Steve G, Gangemi A. (1997) Some Theses on Ontological Engineering in the Context of ONIONS Methodology. in A Farquhar (ed.) Proceedings AAAI97 Spring Symposium on Ontological Engineering, Stanford, participants edition.
Steve G, Gangemi A, Pisanelli DM. (1997) Integrating Medical Terminologies with ONIONS Methodology. in Kangassalo H (ed) Information Model and Knowledge Bases VIII, Amsterdam, IOS-Press.
Swartout B, Patil R, Knight K, Russ T. (1996) Toward Distributed Use of Large-Scale Ontologies. in B Gaines, M Musen (eds), Proceedings of Knowledge Acquisition Workshop, Banff, participants edition.
Tate A. (1996) Towards a Plan Ontology. Journal of the Italian AI Association.
Uschold M, King M. (1995) Towards a Methodology for Building Ontologies. Proceedings of IJCAI95 Workshop on Basic Ontological Issues.
Valente A, Breuker J. (1996) Towards Principled Core Ontologies. in B Gaines, M Musen (eds), Proceedings of Knowledge Acquisition Workshop, Banff, participants edition.
Van Heijst G, Schreiber ATh, Wielinga BG. (1997) Using Explicit Ontologies in KBS Development. Int. Journal of Human-Computer Studies.
Varzi A, Casati R. (1995) Holes and other Superficialities. Cambridge, MA: MIT Press.
WHO (1994) International Classification of Diseases 10th revision. Geneva: WHO press.
Zdrahal Z, Domingue J. (1997) The World Wide Design lab: an Environment for Distributed Collaborative Design. Proceedings of Int. Conference on Engineering Design, Tampere, participants edition.