Martin Molina1, Yuval Shahar2, Jose Cuena1, Mark A. Musen2
1Department of Artificial Intelligence, Technical University of Madrid,
Campus de Montegancedo S/N, Boadilla del Monte 28660, Madrid, SPAIN
{mmolina, jcuena}@dia.fi.upm.es
2Section on Medical Informatics, School of Medicine,
Stanford University, Stanford, CA 94305, USA
{shahar, musen}@camis.stanford.edu
Abstract. The paper presents a case study that compares two of the existing knowledge modeling platforms: PROTÉGÉ-II and KSM. These two software environments allow a developer to build a knowledge level model and to create the final operational version. Both environments have been used to develop real world models (e.g., for medical and traffic domains respectively). In the paper, we first describe a knowledge model following the KADS methodology that has been defined to carry out real-time decision support tasks in domains such as traffic control. Then, we present how the model is defined and operationalized using, first, the PROTÉGÉ-II environment and, second, the KSM environment. Finally, we discuss similarities and differences between both approaches.
This jump from the knowledge level model to the final operational version may be simplified by using some of the existing knowledge modeling platforms that, besides proposing their particular modeling paradigms, assist the developer in creating the operational version. Two of these platforms are PROTÉGÉ-II [Musen et al., 95], [Puerta et al., 93] and KSM [Cuena, Molina, 96], [Molina, Cuena, 94], [Molina, 93]. Both environments have already been used in real-world problems. For instance, PROTÉGÉ-II has been used to create applications in several clinical domains [Tu et al., 95], while KSM has been used for building knowledge-based systems in the traffic control domain [Molina et al., 94], [Cuena et al., 96] (such as a real-time decision-support system for an urban traffic control system in Madrid that is currently working on-line). In this paper our goal is to compare both knowledge modeling platforms. We use a particular knowledge model for real-time decision support that is presented following the KADS methodology (section 2). Section 3 describes this model using the PROTÉGÉ-II approach and section 4 shows the same model from the KSM perspective. Finally, section 5 describes similarities and differences between both environments.
The next three sections present the knowledge model for real-time decision support of such a system through the three main views proposed by the KADS methodology: the task view, the inference view and the domain ontology. This model corresponds to an abstraction of an existing real system for traffic control called TRYS [Cuena et al., 96].
2.1. The Task View
A real-time decision support application controlling a system, as we consider in this example, offers three main functions that answer the following three questions [Cuena, Hernández, 96]: (1) what is happening?, i.e., what problems the current state of the system presents, (2) what may happen if?, i.e., what will happen in the near future if some external conditions change according to certain hypotheses, and (3) what to do if?, i.e., what should be done in order to improve the current state of the system considering hypotheses about external conditions. These functions are supported by three tasks-methods-subtasks trees (Fig. 1).
The first question is answered by a diagnosis task that analyzes the current state of the physical system to detect and diagnose existing problems. For instance, in the context of traffic control this task detects the presence of a traffic problem (e.g., a congested area) and finds out its cause (e.g., the presence of an accident). However, given the complexity of some real systems, sometimes the surveillance cannot be done for the whole system at once but must be done for separate components (e.g., in the traffic domain, each individual road is supervised separately instead of directly monitoring the whole network). The result of the total diagnosis is the union of the local diagnoses for the different components. Thus, the diagnosis task is carried out by a method that divides the whole area into simpler components (using the select component task) and then applies local diagnosis to each one (using the local diagnosis task). This second task is carried out by following the heuristic classification method, which decomposes the diagnosis task into three subtasks: abstract problem, identify malfunction and refine malfunction. The first subtask abstracts input data from detectors to detect existing problems (e.g., in the traffic domain, data such as the current speed at a certain location make it possible to detect the presence of a queue on a highway). Such information acts as symptoms of existing problems. Then, those symptoms are used to find out the causes (e.g., an accident or a saturated off-ramp) that explain the presence of the problem, following two consecutive steps: first, identify the type of malfunction and, then, refine this classification in order to determine specific characteristics.
The second question is answered by a prediction task that forecasts the short-term future behaviour of the system in order to determine the severity of the existing problems, optionally accepting as input different hypotheses of external actions (e.g., in the traffic domain, hypotheses of different traffic demands or changes in traffic lights). Like the previous diagnosis task, this task separates the whole system into components, considering two different subtasks: the select component task and the local prediction task. The local prediction task is divided into two subtasks: predict behaviour, for determining the future state of the system, and abstract problems, the same task that was applied for diagnosis but considering the state after the prediction (this identifies future problems and, consequently, the severity of the current situation). The predict behaviour subtask is carried out by a method that uses a model of the component to simulate its behaviour. It (1) predicts external actions (when they are not received as input), (2) estimates the effect of the current (or hypothesised) control actions, and (3) simulates the behaviour by using a system model and the existing malfunctions. Note that the use of different hypotheses allows the user to study the problem under different external conditions in a conversational process where the user proposes different external actions and the system answers with their impacts.
Figure 1: Task structure of the knowledge model (the local prediction subtask, with the asterisk, is used twice).
Finally, the third question is answered by a design task that finds solutions that can eliminate current problems. This task, like the others, separates the whole system into components. However, in this case, the global solution cannot simply be the addition of local solutions; instead, they must be synthesised considering their interactions. In the case of the traffic domain, a local solution (e.g., a path recommendation) to decrease a traffic queue in a particular area can increase an existing queue in another area. Thus, the main task is divided into three subtasks: the select component subtask, which selects one component at a time, the local design subtask, which finds a local solution for a given component, and the global design subtask, which integrates local solutions to define global proposals. In turn, the local design task is carried out by a generate-and-test method that, first, proposes local solutions for current problems (e.g., increasing the green time of a traffic light) according to heuristics and, second, tests (by simulation) the effect of the proposed control actions in order to select the best solutions. Then, the global design task is carried out by a propose-and-revise method that, first, generates a combination of local solutions in the propose step and, then, analyzes the combination to verify whether it satisfies compatibility constraints. When an incompatibility is detected, revision steps modify the global proposal using knowledge about priorities among components. In summary, this task structure integrates several tasks offering the three main functions of the model for real-time decision support. The problem-solving methods included in this model are the following.
Component-enumeration. This is an elemental method to analyze the complete system component by component. The method establishes that a certain task for a system can be performed in three simpler steps. First, select a component of the system and, second, apply the task locally to this component. These two steps are repeated for all the components. Finally, a third step integrates all the answers for the whole system. This method has been applied to the three main tasks: problem diagnosis, problem prediction and solution design. For the first two tasks, the last step has been removed given that in these cases the integration step is just the addition of local outputs.
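The three steps of component-enumeration can be illustrated with a minimal Python sketch. The function name, the road names and the toy diagnosis results are assumptions made for illustration only; they are not taken from TRYS or from either platform.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

C = TypeVar("C")  # a component of the system
R = TypeVar("R")  # one local result produced for a component


def component_enumeration(
    components: Iterable[C],
    local_task: Callable[[C], List[R]],
    integrate: Optional[Callable[[List[List[R]]], List[R]]] = None,
) -> List[R]:
    """Apply local_task to every component, then integrate the local answers."""
    # Steps 1 and 2: select each component in turn and apply the task locally.
    local_results = [local_task(component) for component in components]
    if integrate is None:
        # Diagnosis and prediction: integration is the plain addition of local outputs.
        return [result for results in local_results for result in results]
    # Step 3 (solution design): a dedicated integration step combines local answers.
    return integrate(local_results)


# Hypothetical traffic example: each road reports its own problems.
def diagnose_road(road: str) -> List[str]:
    known_problems = {"road-A": ["queue at road-A"], "road-B": []}
    return known_problems.get(road, [])


diagnosis = component_enumeration(["road-A", "road-B"], diagnose_road)
```

Leaving `integrate` unset mirrors the diagnosis and prediction cases, where the third step reduces to collecting the local outputs.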
Heuristic-classification. This method is used to carry out the task called component diagnosis. The first step is done by the basic task called abstract problems that interprets sensor data using as domain knowledge abstraction functions. The second step is done by the basic task called identify malfunctions that matches observables and symptoms using problem scenarios. Finally, the third step is done by the elementary task called refine malfunctions that finds out particular details of detected malfunctions considering sensor data.
Interpret-future-state. This method is used to carry out the task called local prediction. It considers a first task that predicts future states of the system, called predict behaviour, in terms of values of internal variables. Then, a second task interprets those states in order to detect the presence of future problems. This second task is called abstract problems and is the same task that is used to diagnose problems.
Model-based-simulation. This method is used to carry out the task called predict behaviour. It considers the existence of a simulator of the system's behaviour. The method performs three basic steps: predict external actions, estimate control action effects and simulate. The first step uses a model that includes historical information about external actions. The second step estimates the effect of the current control actions using a control actions model. Finally, the simulate step is the simulation of the system's behaviour that determines the future state.
Generate-&-test. This method is used to carry out the task called local solution design. The method, first, generates a set of control actions that may solve a given problem and, then, it tests (by simulation) the impact of each proposal in order to accept or reject it. The generate step is performed by the basic task called generate control action that uses a control action model to propose control actions that may solve certain problems. The test step is done by the task called test local prediction, that makes a prediction of the impact of the proposed control action. Then, a third step (the select local action task) selects the best proposals considering their impacts.
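As a rough illustration of the generate, test and select steps, the following Python sketch treats the impact of an action as a single number. The action names, the impact values and the acceptance threshold are hypothetical, and the lookup table stands in for the simulation performed by the local prediction task.

```python
from typing import Callable, List, Optional, Tuple


def generate_and_test(
    problem: str,
    generate: Callable[[str], List[str]],
    predict_impact: Callable[[str, str], float],
    min_improvement: float = 0.0,
) -> Optional[Tuple[str, float]]:
    """Return the best (action, impact) pair, or None if every proposal is rejected."""
    # Generate step: heuristic proposals for the given problem.
    candidates = generate(problem)
    # Test step: predict the impact of each proposal and reject the useless ones.
    tested = [(action, predict_impact(problem, action)) for action in candidates]
    accepted = [(a, impact) for a, impact in tested if impact > min_improvement]
    # Select step: keep the proposal with the largest predicted improvement.
    return max(accepted, key=lambda pair: pair[1], default=None)


# Hypothetical data: predicted reduction of queue length, in vehicles.
PREDICTED_REDUCTION = {
    "increase green time by 10 s": 12.0,
    "show path recommendation": 20.0,
    "do nothing": 0.0,
}

best = generate_and_test(
    "queue on off-ramp",
    generate=lambda problem: list(PREDICTED_REDUCTION),
    predict_impact=lambda problem, action: PREDICTED_REDUCTION[action],
)
```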
Propose-&-revise. This method is used to carry out the task called global solution design. With the propose-&-revise strategy first a global solution is proposed as the direct addition of local solutions for each component with problems. This step is done by the basic task called propose solution. Then, the proposal is analyzed to verify whether it presents incompatibilities. This step is done by the basic task called verify solution that uses as knowledge a set of compatibility constraints. Finally, a remedy step is done in order to eliminate the detected incompatibilities. This is done by the basic task called remedy solution using as domain knowledge a fixes model. This process is repeated until a minimum set of coherent solutions is generated.
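The propose, verify and remedy cycle can be sketched in Python as follows. This is a simplified illustration under stated assumptions: constraints are reduced to pairs of incompatible actions, each component offers an ordered list of alternative actions, and the fix always asks the lower-priority component for its next alternative. All names and data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set


def propose_and_revise(
    local_solutions: Dict[str, List[str]],   # component -> alternatives, best first
    incompatible: Set[FrozenSet[str]],       # pairs of actions that must not coexist
    priority: Dict[str, int],                # larger number = higher priority
) -> Dict[str, str]:
    """Combine local solutions, revising low-priority components on conflicts."""
    choice = {component: 0 for component in local_solutions}
    max_rounds = sum(len(alts) for alts in local_solutions.values()) + 1
    for _ in range(max_rounds):
        # Propose step: take the currently chosen alternative of each component.
        proposal = {
            c: alts[choice[c]]
            for c, alts in local_solutions.items()
            if choice[c] < len(alts)          # exhausted components propose nothing
        }
        # Verify step: look for pairs of actions that violate a constraint.
        violations = [
            (a, b)
            for a in proposal
            for b in proposal
            if a < b and frozenset((proposal[a], proposal[b])) in incompatible
        ]
        if not violations:
            return proposal
        # Remedy step: the lower-priority component moves to its next alternative.
        a, b = violations[0]
        loser = a if priority[a] < priority[b] else b
        choice[loser] += 1
    raise RuntimeError("no compatible combination found")


global_solution = propose_and_revise(
    local_solutions={
        "road-A": ["recommend path P"],
        "road-B": ["shorten green on P", "lengthen green on Q"],
    },
    incompatible={frozenset({"recommend path P", "shorten green on P"})},
    priority={"road-A": 2, "road-B": 1},
)
```

In this toy run the first proposal combines a path recommendation with a capacity reduction on the same path, so the lower-priority component (road-B) is asked for another action.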
2.2. The Inference View
The knowledge sources (elementary tasks) corresponding to the previous knowledge model for decision support are shown in figure 2. This table shows the set of such knowledge sources with their respective input and output metaclasses. The first knowledge source, called select component, is used to separate the whole system model into smaller components. It receives as input the whole system where the decision support model is going to reason and generates as output an individual component, using as domain knowledge a system model. For instance, in the context of traffic control, a system is a road network and a component is a particular road.
Knowledge Sources | Input Metaclasses | Output Metaclasses | Domain Knowledge |
Select Component | system | component | system model |
Abstract Problems | component, observables | symptoms | abstraction functions |
Identify Malfunctions | component, symptoms, observables | type of malfunctions | problem scenarios |
Refine Malfunctions | component, type of malfunction, symptoms, observables | malfunctions | system model |
Predict External Actions | component, observables | external actions | environment model |
Estimate Control Action Effect | component, control action | control action effects | control actions model |
Simulate | component, observables, malfunctions, external actions, control action effects | future observables | system model |
Propose Local Solution | component, malfunctions, current control actions, problem severity | local solution proposal | control actions model |
Select Local Solution | component, local solution proposals, estimated impact | local solution | actions preferences |
Propose Solution | local solutions | global solution proposal | |
Verify Solution | global solution proposal | violations | compatibility model |
Remedy Solution | global solution proposal, violations | global solution | fixes model |
Figure 2: Knowledge sources, metaclasses and domain knowledge of the knowledge model
The next three knowledge sources are used to diagnose problems. The knowledge source called abstract problems is an abstraction step that interprets sensor data. It receives as input the component and observables (e.g., traffic speed and traffic flow at certain locations), and generates qualitative information about the state (e.g., the presence of a queue) that plays the role of symptoms. This task uses abstraction functions that relate raw data to higher level parameters. The knowledge source called identify malfunctions receives the component, symptoms and observables, and identifies types of malfunctions in the physical system (such as accidents or saturated sections) that may condition the system's behaviour. This task uses a set of generic problem scenarios that cover the types of malfunctions that the system may present. The next knowledge source, called refine malfunctions, is used to determine details about the detected type of malfunction (e.g., the specific location of an accident). It receives as input the component, the type of malfunction, symptoms and observables, and it generates as output specific malfunctions using as domain knowledge a system model.
The functionality of prediction is carried out by three main knowledge sources. The knowledge source called predict external actions estimates the future external actions when they are not established as hypotheses by the user. For instance, in the context of traffic, this task estimates the short-term demand of the traffic network using historical information. Therefore, this knowledge source receives as input the component and observables and generates hypotheses of external actions, using an external action model that includes historical behaviour. The knowledge source called estimate control action effect determines the effect of current (or hypothesised) control actions (e.g., the effect on capacity due to changes in the green duration of certain traffic lights or the diversion effect of path recommendation messages). The next knowledge source, called simulate, performs a behaviour simulation using as input observables, malfunctions, external actions and control action effects. The output of this knowledge source is the future state of the system represented by a set of future observables.
Finally, the functionality of designing solutions uses, among others, five particular knowledge sources. The knowledge source called generate local solution includes heuristic knowledge to propose candidate local control actions for existing problems (e.g., when there is a problem of capacity, a solution is to increase the green time of certain traffic lights). These proposals are tested by simulation using the local prediction task and, then, the knowledge source select local solution is used to choose the best control actions according to their impacts. The knowledge source called propose solution integrates local solutions proposed for individual components. The knowledge source called verify solution analyzes the previous global proposal to detect incompatibilities (using a set of constraints) and, finally, the knowledge source called remedy solution fixes the incompatibilities following the propose-and-revise strategy.
2.3. The Ontology View
The ontology view presents a declarative description of the domain in which the decision support model is defined. The domain is described here in abstract terms, i.e., for a generic physical system. The domain is divided into the following modules, which are associated with the corresponding knowledge sources (Figure 2). Figure 3 presents a summary of the ontology view including the following areas.
System model. The system model includes basic concepts modeling the physical structure. This model is mainly used by the knowledge source called simulate, which makes predictions about the behaviour of the system, and it is also used by two other tasks: select component (to choose each time a particular component for applying a task) and refine malfunction (to find out details of an existing type of malfunction). For instance, in the traffic domain this model includes concepts defining the network structure such as nodes (entry and exit points), sections, links connecting sections (with several types of links such as on-ramp link and off-ramp link), OD pairs connecting origins and destinations, paths that establish itineraries between OD pairs and road areas integrating the previous elements. There are also detectors installed on sections and control devices such as traffic lights and changeable message signs (CMS). All these elements are combined using relations such as the section-sensor relation that associates a sensor with the section where it is installed. One possible formulation of such a model uses a concept-attribute-facet representation with relations, such as the DDL language [Schreiber et al. 93].
Abstraction functions. These functions abstract numerical information into higher level qualitative parameters. In general, the representation used here includes: (1) the formulation of numerical functions to compute new parameters using lower level parameters, and (2) the qualitative interpretation of numerical parameters (considering noise and uncertainty in input data). In particular, abstraction knowledge for qualitative interpretation could be represented by fuzzy functions. For instance, in the traffic domain, the CONGESTED value for the circulation regime parameter (which may be FLUID, CONGESTED or UNSTABLE) is abstracted from the parameters speed and occupancy according to a two-dimensional possibility function.
Problem scenarios. These scenarios represent patterns of malfunctions that the system may present. Each pattern includes measurable conditions from which the presence of a particular type of malfunction can be concluded. Thus, each pattern may be formulated by a frame-like representation that includes slots corresponding to observables (e.g., speed, occupancy and other measures at different locations) and symptoms abstracted from observables (e.g., circulation regime, saturation level, etc.). When a frame matches the current situation, the presence of the corresponding malfunction (e.g., a traffic accident) can be concluded.
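A two-dimensional possibility function of this kind might look like the following Python sketch. The shape of the function (two linear ramps combined with the minimum) and all threshold values are assumptions chosen for illustration; the text does not specify them.

```python
def ramp_down(x: float, full: float, zero: float) -> float:
    """Possibility 1.0 at or below `full`, 0.0 at or above `zero`, linear in between."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def ramp_up(x: float, zero: float, full: float) -> float:
    """Possibility 0.0 at or below `zero`, 1.0 at or above `full`, linear in between."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


def congested_possibility(speed_kmh: float, occupancy_pct: float) -> float:
    """Possibility that the circulation regime is CONGESTED (hypothetical thresholds)."""
    low_speed = ramp_down(speed_kmh, full=20.0, zero=50.0)
    high_occupancy = ramp_up(occupancy_pct, zero=20.0, full=40.0)
    # Both conditions must hold, so the fuzzy conjunction (minimum) is used.
    return min(low_speed, high_occupancy)
```

With these assumed thresholds, a section at 15 km/h and 45% occupancy is fully CONGESTED, while one at 35 km/h and 30% occupancy is CONGESTED with possibility 0.5, which leaves room for the noise and uncertainty mentioned above.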
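The frame-matching idea can be sketched in Python as below. The slot names and the conditions attached to each scenario are invented for illustration and do not reproduce the actual scenarios of the traffic application.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

Value = Union[float, str]


@dataclass
class ProblemScenario:
    """A frame: one condition per slot, plus the malfunction it indicates."""
    malfunction: str
    slots: Dict[str, Callable[[Value], bool]]

    def matches(self, situation: Dict[str, Value]) -> bool:
        # The frame matches when every slot is present and satisfies its condition.
        return all(
            name in situation and condition(situation[name])
            for name, condition in self.slots.items()
        )


# Hypothetical scenarios mixing a symptom (circulation regime) with observables.
SCENARIOS = [
    ProblemScenario(
        "accident",
        {
            "circulation_regime": lambda v: v == "CONGESTED",
            "downstream_speed": lambda v: v > 70.0,
        },
    ),
    ProblemScenario(
        "saturated off-ramp",
        {
            "circulation_regime": lambda v: v == "CONGESTED",
            "off_ramp_saturation": lambda v: v == "HIGH",
        },
    ),
]


def identify_malfunctions(situation: Dict[str, Value]) -> List[str]:
    """Return the malfunction types whose frames match the current situation."""
    return [s.malfunction for s in SCENARIOS if s.matches(situation)]


found = identify_malfunctions(
    {"circulation_regime": "CONGESTED", "downstream_speed": 90.0, "off_ramp_saturation": "LOW"}
)
```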
Figure 3: Summary of the ontology view. It presents the set of domain areas with their representation.
Environment model. This model includes historical knowledge about the environment (e.g., in the case of traffic, there are scenarios of traffic demand, i.e., the number of vehicles going from certain origins to certain destinations). The model includes temporal references so that predictions can be made. This is represented by using hierarchies of patterns of external actions associated with temporal intervals, in such a way that it is possible to determine short-term future scenarios of external actions for a given present state.
Control actions model. This model includes knowledge about control actions and associates with them (1) the system state conditions that are compatible with the control action, and (2) the estimated effect of the control actions. The state conditions can be viewed as the context in which the action has the estimated effect. The representation used here is a collection of structured triplets <control-actions, state-conditions, estimated-effect> that are used by different inference steps to provide different functions (e.g., determining the effect of a given set of control actions or, on the other hand, selecting certain control actions that achieve a set of required effects). In the traffic domain, an example of a triplet includes as control-actions a set of messages presented to the drivers recommending an alternative path to a certain destination, as state-conditions a logical expression formulating that the main path must present problems and the alternative path must be free, and as estimated-effect the percentage of vehicles that is expected to follow the recommendation.
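The two uses of the triplets mentioned above (estimating effects and selecting actions) can be illustrated with a short Python sketch. The message text, the state variables, the 15% figure and the rule that effects of several triplets are added together are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List

State = Dict[str, str]


@dataclass
class ControlTriplet:
    control_actions: FrozenSet[str]
    state_conditions: Callable[[State], bool]
    estimated_effect: Dict[str, float]


# Hypothetical triplet for a path recommendation on a changeable message sign.
TRIPLETS = [
    ControlTriplet(
        control_actions=frozenset({"CMS-1: use alternative path"}),
        state_conditions=lambda s: s["main_path"] == "CONGESTED"
        and s["alternative_path"] == "FLUID",
        estimated_effect={"diverted_share": 0.15},
    ),
]


def estimate_effects(active: FrozenSet[str], state: State) -> Dict[str, float]:
    """Effect of a given set of control actions in the given state."""
    effects: Dict[str, float] = {}
    for t in TRIPLETS:
        # A triplet applies when all its actions are active and its context holds.
        if t.control_actions <= active and t.state_conditions(state):
            for name, value in t.estimated_effect.items():
                effects[name] = effects.get(name, 0.0) + value  # assumed additive
    return effects


def select_actions(effect: str, minimum: float, state: State) -> List[FrozenSet[str]]:
    """Action sets that achieve at least `minimum` of the required effect."""
    return [
        t.control_actions
        for t in TRIPLETS
        if t.state_conditions(state) and t.estimated_effect.get(effect, 0.0) >= minimum
    ]


state = {"main_path": "CONGESTED", "alternative_path": "FLUID"}
effects = estimate_effects(frozenset({"CMS-1: use alternative path"}), state)
options = select_actions("diverted_share", 0.10, state)
```

Reading the same triplets in both directions is what lets one model serve both the estimate control action effect step and the proposal of local solutions.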
Control actions preferences. This model includes knowledge about preferences of control actions according to their impact. This knowledge is formulated by using priority tables that establish levels of acceptance considering the type of control action and the estimated effect on the existing problem.
Compatibility model. As mentioned previously, the decision support model presented in this paper considers that the global solution to existing problems is determined as the integration of local solutions for the affected components of the system. However, the global solution cannot simply be the addition of local solutions, given that they may interact among themselves. Thus, a model including compatibility constraints is required in order to analyze the correctness of the synthesis. An example of an incompatible situation in the traffic domain is the presence of the following two actions: the first one recommends following path P to a certain destination (by presenting messages to drivers on changeable message panels), and the second control action reduces the capacity of a section belonging to path P by decreasing the duration of the green time of a particular traffic light. Obviously, these two control actions are incompatible given that one of them increases the demand on the path, while the other decreases its capacity.
Fixes model. Finally, the fixes model is used to solve constraint violations among local proposals. These fixes are used by a propose-and-revise method that modifies proposals of global solutions to successively eliminate detected violations by applying the corresponding fixes. In the case of traffic control, this model is based on a priority scheme of components. Thus, when an incompatibility is detected, the component with lower priority is asked to propose a new control action (depending also on other factors such as the severity of the problem).
Figure 4: Example of the main screen of the PROTÉGÉ-II environment
PROTÉGÉ-II [Musen et al., 1995; Eriksson et al., 1995; Tu et al., 1995] is a development environment for knowledge-based systems. The framework includes a methodology for the construction of such systems that employs a library of reusable computational components. These components include problem-solving methods and nondecomposable methods called mechanisms. Methods decompose tasks into subtasks that can be solved by either methods or mechanisms. Methods have well defined ontologies and knowledge roles that constrain the computational process by forcing a limited ontological view of the tasks and the domain [Eriksson et al., 1995]. PROTÉGÉ-II also includes several tools that support the automatic generation and customization of knowledge-acquisition tools, specific to a particular task (e.g., skeletal-plan execution), domain (e.g., clinical-guideline-based care), and problem-solving method (e.g., episodic skeletal-plan refinement). In this section we show how PROTÉGÉ-II may be used to build an operative model for the previous real-time decision support tasks.
The three main steps in PROTÉGÉ-II for building the knowledge model are: (1) the formulation of the method ontology, which includes the set of knowledge roles used by the method, (2) the definition of the domain ontology (to some extent method-independent) to which the method will be applied, and (3) the formulation of mapping relations between both ontologies. The next three sections describe these steps in more detail and, then, a fourth section explains how this model is operationalized.
Figure 5: The global method ontology for the decision support model from the PROTÉGÉ-II perspective. Numbers associated with concepts indicate the method ontology to which each concept belongs. Number 1 represents the method ontology corresponding to the task problem diagnosis, number 2 is for the task problem prediction and number 3 is for solution design.
3.1. The Method Ontology
PROTÉGÉ-II views the model as a structure of tasks-method-subtasks. For the case study of this paper, it is the same structure that was presented using the KADS methodology (Figure 1) in which there are three task trees (sharing some subtasks) associated to the three main functions of the model. The current version of the PROTÉGÉ-II tool assumes that initially there is an existing software component implementing the problem-solving method that will be applied to a particular domain. For the case study of this paper, there are three complex methods corresponding to the three main tasks (problem diagnosis, problem prediction and solution design). Each method has its own method ontology, although they share some elements. With PROTÉGÉ-II, the developer defines each ontology using a particular representation of object-slot-facet-value with a class-subclass-instance organization [Tu et al., 1995]. PROTÉGÉ-II provides a friendly user interface with which the developer formulates such an ontology.
Figure 5 presents a summary of the three ontologies in a global one. In the figure, numbers associated with concepts indicate the method ontology to which each concept belongs. For instance, the concept component belongs to the three method ontologies (numbers 1, 2 and 3), the concept problem scenario belongs to the method ontology corresponding to the problem diagnosis task (number 1), and the concept external-action belongs to the method ontologies corresponding to the tasks problem prediction and solution design (numbers 2 and 3).
3.2. The Domain Ontology
For the particular case of the decision support knowledge model that we present here, we consider a traffic domain to which the problem-solving knowledge will be applied (corresponding to the TRYS system). Actually, PROTÉGÉ-II considers two different ontologies at this level: the domain ontology, which includes a declarative description of the domain, and the application ontology, which includes the previous ontology plus other elements required for the method.
Figure 6 shows the application ontology for the traffic domain (attributes and facets are not explicit in the figure). For instance, sections, links and nodes are elements for defining the traffic network structure. Traffic lights and changeable message signs (CMS) are control devices. Incidents and recurring problems are types of problems.
Figure 6: The application ontology of the traffic domain from the PROTÉGÉ-II perspective.
3.3. The Mapping Relations
Mapping relations establish associations between the application ontology and the method ontology [Gennari et al., 1994]. In the example of this paper, mapping relations associate traffic domain elements with elements of the decision support method. For instance, abstraction functions of the method are mapped into facets of attributes of sections (e.g., circulation regime and saturation) that belong to the domain ontology. These facets are tables that make it possible to carry out the abstraction using specific qualitative ranges and their corresponding numerical intervals. Likewise, the concept detector of the method ontology has an attribute called observable that is mapped into the attributes speed and flow of the concept sensor belonging to the domain ontology. Other mapping relations are established by the following direct associations (where the left hand side includes a concept of the method ontology and the right hand side is a component of the domain ontology): <system, traffic-network>, <component, road-area>, <element, network-element>, <detector, sensor>, <effector, control-device>, <problem-scenario, problem-scenario>, <environment-action, demand-scenario>, <control-action, action-effect>, <compatibility-constraint, compatibility-constraint> and <fix, priority-scheme>.
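To make the role of the mapping relations concrete, the following Python sketch stores the direct associations listed above in a dictionary and applies the detector/sensor attribute mapping to one domain instance. The dictionary layout and the helper function are illustrative assumptions; they do not reproduce the format used by the PROTÉGÉ-II mapping editor.

```python
from typing import Dict, List, Tuple

# Direct associations: method-ontology concept -> domain-ontology concept.
CONCEPT_MAPPINGS: Dict[str, str] = {
    "system": "traffic-network",
    "component": "road-area",
    "element": "network-element",
    "detector": "sensor",
    "effector": "control-device",
    "problem-scenario": "problem-scenario",
    "environment-action": "demand-scenario",
    "control-action": "action-effect",
    "compatibility-constraint": "compatibility-constraint",
    "fix": "priority-scheme",
}

# Attribute mapping: (method concept, attribute) -> (domain concept, attributes).
ATTRIBUTE_MAPPINGS: Dict[Tuple[str, str], Tuple[str, List[str]]] = {
    ("detector", "observable"): ("sensor", ["speed", "flow"]),
}


def to_method_view(method_concept: str, method_attr: str, domain_instance: dict) -> dict:
    """Read the value of a method attribute from a domain instance."""
    domain_concept, domain_attrs = ATTRIBUTE_MAPPINGS[(method_concept, method_attr)]
    if domain_instance["concept"] != domain_concept:
        raise ValueError(f"expected a {domain_concept} instance")
    return {attr: domain_instance[attr] for attr in domain_attrs}


# Hypothetical sensor instance from the traffic domain.
sensor = {"concept": "sensor", "speed": 42.0, "flow": 1300.0}
observable = to_method_view("detector", "observable", sensor)
```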
3.4. The Operational Version of the Knowledge Model
As mentioned above, PROTÉGÉ-II assumes that a problem-solving method is already supported by an existing software component programmed in a certain language (e.g., C or CLIPS). In order to operationalize the model, first, the developer uses an ontology editor (with visual facilities) for constructing both ontologies (either ontology might already exist, of course). This editor translates the ontology formulation into a computable version. Then, the developer uses the application ontology and facilities provided by PROTÉGÉ-II to build a knowledge acquisition interface. PROTÉGÉ-II automatically generates a knowledge-acquisition tool that will be used later to acquire the particular domain knowledge [Tu et al., 1995]. Finally, the developer uses a mapping-relation editor to formulate connections between the method and application ontologies. This information is used by PROTÉGÉ-II to automatically build the final executable application. Thus, the PROTÉGÉ-II tool may be viewed as a collection of utilities that are applied in sequence (with possible backtracking) until the final executable application is built.
Figure 7: Example screen of the KSM environment.
KSM is a software environment that assists a developer in building, operationalizing and reusing knowledge models. In this section we show how to formulate and operationalize the knowledge model for real-time decision support defined in previous sections. Basically, in KSM three perspectives must be defined to formulate a model: (1) the knowledge area perspective, which may be viewed as a modular description of the ontologies involved in the model, (2) the task perspective, similar to the task layer of KADS, and (3) the vocabulary perspective, which includes the basic terms shared by other knowledge modules. KSM provides computable constructs, called primitives of representation, that implement basic problem-solving techniques. The knowledge level model is operationalized by associating these components with knowledge areas.
Figure 8: The knowledge-area perspective of the decision support knowledge model
4.1. The Knowledge Area Perspective
The PROTÉGÉ-II approach underlines the method-domain duality of knowledge models, a separation that is useful for reuse. However, methods and domain ontologies are not totally independent (e.g., the use of a particular method usually requires the inclusion of new components in the domain ontology, and the formulation of a given domain ontology constrains the number of methods that may be used efficiently in that domain). Therefore, a new perspective showing this dependence in a particular model is useful. KSM provides this view, called the knowledge-area perspective (Figure 8). This perspective presents a general image of the model in which each module represents what we call a knowledge area. In general, a knowledge area identifies a body of expertise that explains a certain problem-solving behaviour of an intelligent agent. Typically, a knowledge area identifies a professional skill, a qualification or a speciality of an expert. Knowledge areas are not passive modules; they provide different services represented by a set of tasks. A knowledge area can be viewed as a body that encapsulates a set of modular domain ontologies together with the set of tasks and methods that these ontologies may accept. The domain-method interaction was already considered in the concept of generic task [Chandrasekaran, 86], although generic tasks are centered on the task to be done, while a knowledge area is centered on the domain knowledge that supports a set of tasks. The whole knowledge model is a hierarchical structure of knowledge areas with a top-level area representing the entire model. This area is divided (using the part-of relation) into more detailed subareas that are in turn divided into simpler areas, and so on, developing the whole hierarchy (where some areas may belong to more than one higher-level area).
A bottom-level area is called a primary knowledge area and corresponds to an elementary module that may be directly operationalized by using basic software tools. Note that a structure of knowledge areas can be established at a generic level, so it can be reused for building different applications.
In the case of the model presented in this paper, this organization includes a top-level area representing the whole knowledge about the system required for decision support. This area offers the three main functions of the model: problem diagnosis, problem prediction and solution design. The area is decomposed into two subareas: component knowledge, which includes knowledge about a particular component, and global knowledge, which includes knowledge about the entire system (such as the knowledge required for integrating local solutions). The component area is decomposed into simpler areas: abstraction knowledge, to abstract input data; problem scenarios, which classifies existing problems in order to identify malfunctions; behaviour knowledge, which includes deep knowledge about a component for simulating its behaviour; control actions knowledge, which allows the effect of current control actions to be estimated; the component model, which is required for simulation; and actions preferences, to choose the best actions. In turn, the behaviour knowledge area includes three subareas: one for modeling the environment knowledge, another for modeling the component knowledge itself and another for control actions. The global knowledge, on the other hand, is divided into the compatibility model, which includes constraints about control actions, and the fixes knowledge area, to be used when an incompatibility is detected.
Note that this structure includes at the bottom level the same areas identified in the ontology view (Figure 3) with the associated elementary tasks. Thus, in KSM the developer establishes a partition of the domain ontology and treats each module as a primary area. The higher-level areas are aggregations of these elements. Therefore, the knowledge-area view may be considered, on the one hand, as a structure imposed on the ontology view. On the other hand, knowledge areas are modules that include tasks, so they also serve to integrate subsets of tasks in a more synthetic organization.
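The hierarchy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `KnowledgeArea` and its methods are hypothetical names, not part of KSM, and the shared membership of the component model and control actions areas (under both component knowledge and behaviour knowledge) is our reading of the description. Only the task names that appear in the paper are attached to areas.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeArea:
    """Hypothetical stand-in for a KSM knowledge area; not KSM's real API."""
    name: str
    tasks: list = field(default_factory=list)
    subareas: list = field(default_factory=list)  # part-of relation

    def is_primary(self) -> bool:
        # A primary area is a bottom-level area without subareas.
        return not self.subareas

    def primary_areas(self) -> list:
        # Depth-first walk; an area shared by two parents is reported once.
        seen = []
        stack = [self]
        while stack:
            area = stack.pop()
            if not area.is_primary():
                stack.extend(reversed(area.subareas))
            elif area.name not in seen:
                seen.append(area.name)
        return seen


abstraction = KnowledgeArea("abstraction knowledge", ["abstract problem"])
scenarios = KnowledgeArea("problem scenarios", ["identify malfunction"])
component_model = KnowledgeArea("component model", ["refine malfunction"])
environment = KnowledgeArea("environment model")
control_actions = KnowledgeArea("control actions")
preferences = KnowledgeArea("actions preferences")
compatibility = KnowledgeArea("compatibility model")
fixes = KnowledgeArea("fixes model")

behaviour = KnowledgeArea(
    "behaviour knowledge",
    subareas=[environment, component_model, control_actions],
)
component = KnowledgeArea(
    "component knowledge",
    subareas=[abstraction, scenarios, behaviour,
              control_actions, component_model, preferences],
)
global_knowledge = KnowledgeArea("global knowledge",
                                 subareas=[compatibility, fixes])
system = KnowledgeArea(
    "system knowledge",
    tasks=["problem diagnosis", "problem prediction", "solution design"],
    subareas=[component, global_knowledge],
)

primary = system.primary_areas()
```

Under these assumptions, `primary` holds the eight bottom-level areas, each listed once even when it belongs to more than one higher-level area.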
4.2 The Task Perspective
The previous structure of knowledge areas in KSM is complemented by an explicit task perspective, which is similar to that of KADS and PROTÉGÉ-II (Figure 1). For each task, there is a tree of task-method-subtasks showing a functional description. The main difference is that tasks in KSM are part of knowledge areas. Thus, for instance, the task problem prediction (one of the three main tasks) is associated with the system knowledge area. Likewise, the task identify malfunction is associated with the primary area problem scenarios. One advantage of this explicit association between tasks and knowledge areas is that it reduces the number of tasks. In general, tasks that select parts of the model ontology can be removed, given that there are explicit areas for these parts. For instance, the task select-component is necessary (in the KADS and PROTÉGÉ-II approaches) to choose, in succession, a component for diagnosis, prediction and design. In the KSM approach, however, there is a knowledge area for each component, so this task is unnecessary and is not present in this version.
Methods in KSM are associated with tasks (as in PROTÉGÉ-II) and are formulated using a particular language called Link (supported by an interpreter at run time). This language represents the control knowledge that defines problem-solving methods. For instance, the particular version of the heuristic-classification method for this model is formulated as follows:
METHOD heuristic classification
  INPUT observables
  OUTPUT malfunction
  DATA FLOW
    (abstraction knowledge) abstract problem
      INPUT observables
      OUTPUT symptoms
    (problem scenarios) identify malfunction
      INPUT symptoms
      OUTPUT type of malfunction
    (component model) refine malfunction
      INPUT observables, type of malfunction
      OUTPUT malfunction
  CONTROL FLOW
    START -> (abstraction knowledge) abstract problem, (problem scenarios) identify malfunction.
    (problem scenarios) identify malfunction IS no problem -> END no problem.
    (problem scenarios) identify malfunction IS problem -> (component model) refine malfunction, END no problem.
Note in the example that the data flow section defines how tasks are connected (where tasks are associated with knowledge areas). The control flow section uses rules to establish the execution order of tasks (in this case, the task refine malfunction is not executed if no problem is detected after executing the task identify malfunction).
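The control behaviour of this method can be paraphrased in Python. The following is a minimal sketch, not the Link interpreter: the three task functions, the occupancy threshold, and the label "congestion" are invented for illustration, and the END state labels of the Link rules are not modeled. It only shows how the data flow connects the tasks and how the control rule skips refine malfunction when no problem is identified.

```python
# Hypothetical task implementations, one per (knowledge area, task) pair.
def abstract_problem(observables):
    # (abstraction knowledge) abstract problem: observables -> symptoms
    level = "high" if observables["occupancy"] > 50 else "low"
    return {"occupancy": level}


def identify_malfunction(symptoms):
    # (problem scenarios) identify malfunction: symptoms -> (state, type)
    if symptoms["occupancy"] == "high":
        return "problem", "congestion"
    return "no problem", None


def refine_malfunction(observables, malfunction_type):
    # (component model) refine malfunction: adds detail from the raw data
    return f"{malfunction_type} (occupancy {observables['occupancy']}%)"


def heuristic_classification(observables):
    """Run the three tasks in the order given by the control rules."""
    trace = ["abstract problem", "identify malfunction"]
    symptoms = abstract_problem(observables)
    state, malfunction_type = identify_malfunction(symptoms)
    if state == "no problem":
        # Control rule: stop here, refine malfunction is never executed.
        return None, trace
    trace.append("refine malfunction")
    return refine_malfunction(observables, malfunction_type), trace


congested = heuristic_classification({"occupancy": 80})
free_flow = heuristic_classification({"occupancy": 20})
```

In this sketch `congested` carries a refined malfunction and a three-task trace, while `free_flow` stops after the second task.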
4.3 The Vocabulary Perspective
Finally, the vocabulary perspective includes sets of vocabularies, which are groups of terms that are used by several knowledge areas. For this model, there is a vocabulary that includes the set of basic concepts about elementary components of the system (e.g. sections, links, nodes, paths, sensors, traffic-lights, etc.). Note, however, that a vocabulary is not a description of the whole domain knowledge. It only includes basic terms shared between different knowledge areas. In the case of clinical applications, such a vocabulary, often called a controlled vocabulary, is essential for sharing and reusing domain-specific knowledge.
In KSM, vocabularies are formulated using a particular language called Concel, which uses a concept-attribute-facet representation together with an organization in classes, subclasses and instances. For instance, part of this representation for the ontology of the traffic domain is:
Concept section subclass of object.
  Attributes:
    sensor (instance of sensor),
    speed {high, low},
    saturation {high, low},
    occupancy {high, low},
    capacity (integer range 0 10000),
    demand (integer range 0 10000),
    lanes (integer range 1 6).
Concept path subclass of object.
  Attributes:
    sections (instances of section),
    demand (range 0 10000).
Concept sensor subclass of object.
  Attributes:
    speed (integer range 0 200),
    flow (integer range 0 10000),
    occupancy (integer range 0 100).
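To make the role of facets concrete, the following Python sketch encodes some of the attributes above and checks instances against them. It is an illustration only: `IntRange`, `OneOf` and `invalid_attributes` are hypothetical names, Concel itself is not shown, and the sensor attribute of section is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntRange:
    """Facet: integer value within an inclusive range."""
    low: int
    high: int

    def accepts(self, value) -> bool:
        is_int = isinstance(value, int) and not isinstance(value, bool)
        return is_int and self.low <= value <= self.high


@dataclass(frozen=True)
class OneOf:
    """Facet: value drawn from an enumerated set of symbols."""
    values: frozenset

    def accepts(self, value) -> bool:
        return value in self.values


HIGH_LOW = OneOf(frozenset({"high", "low"}))

# Attribute facets copied from the Concel fragment above.
SENSOR = {
    "speed": IntRange(0, 200),
    "flow": IntRange(0, 10000),
    "occupancy": IntRange(0, 100),
}
SECTION = {
    "speed": HIGH_LOW,
    "saturation": HIGH_LOW,
    "occupancy": HIGH_LOW,
    "capacity": IntRange(0, 10000),
    "demand": IntRange(0, 10000),
    "lanes": IntRange(1, 6),
}


def invalid_attributes(concept: dict, instance: dict) -> list:
    """Names of attributes that are unknown or violate their facet."""
    return sorted(
        name for name, value in instance.items()
        if name not in concept or not concept[name].accepts(value)
    )


ok_sensor = invalid_attributes(SENSOR, {"speed": 90, "flow": 1200, "occupancy": 35})
bad_section = invalid_attributes(SECTION, {"lanes": 8, "speed": "high"})
```

Here `ok_sensor` is empty, while `bad_section` reports the lanes attribute because 8 falls outside the range 1 to 6.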
4.4 The Operational Version of the Knowledge Model
The knowledge model in KSM is supported by reusable computable constructs called primitives of representation. A primitive of representation implements a basic problem-solving technique with a particular knowledge representation (e.g., a primitive using rules with backward and forward chaining). A primitive may be viewed as a generalization of the concept of shell that also covers non-knowledge-based techniques. Each primitive includes a knowledge acquisition user interface, inference procedures and explanation facilities. In order to produce the operational version of the knowledge model, the developer associates a primitive with each primary knowledge area. KSM has an open library of such primitives to support the knowledge model. The library can be extended with other, more specific primitives. In particular, for the knowledge model described in this paper, Figure 9 shows the primitive associated with each primary area. Note that the open library makes it possible to manage a multi-representation environment in which the most appropriate technique can be selected for each case. This is particularly important in real systems, where efficiency must be taken into account.
So far, the described structure of knowledge areas, tasks, vocabularies and primitives of representation is general and reusable, i.e., it may be applied to different domains. To develop the model for a particular domain (e.g., traffic control), the developer creates an isomorphic structure of knowledge areas specialized in this domain as an instantiation of the general description. For each generic knowledge area there will be one or more domain knowledge areas, following the same relations established by the generic model. Figure 10 shows an example of a domain model in the traffic control domain corresponding to the generic model for decision support. The top-level area has been instantiated as the traffic-network area, and the component knowledge area has been instantiated for two different highways. Each bottom-level area contains a knowledge base that has to be written by the developer.
Primary Knowledge Area | Primitive of Representation |
Abstraction Knowledge | Numerical and Fuzzy Functions |
Problem Scenarios | Hierarchy of frames |
Component Model | Graph with ad-hoc procedures |
Environment Model | Hierarchies of frames |
Control Actions | Logic Clauses |
Actions Preferences | Rule Base |
Compatibility Model | Constraint Base |
Fixes Model | Rule Base |
Figure 9: Computational support of primary knowledge areas by using primitives of representation.
Finally, the knowledge acquisition facilities provided by each primitive are used to build particular knowledge bases. This means that the final user interface for knowledge acquisition is viewed as the union of the individual user interfaces of the primitives. In addition, the developer creates domain conceptual vocabularies (which include subclasses and instances of classes defined in generic vocabularies), and she may redefine at the domain level the generic control knowledge defined in methods using the Link language. After this process, the model is ready to be executed for solving problems.
Figure 10: A domain knowledge-area perspective for a particular application.
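The instantiation step can be illustrated with a short Python sketch. It assumes the area-to-primitive assignment of Figure 9 and assumes that the six component-level primary areas are copied once per component while the two global areas exist once; the function `instantiate` and the component names "highway A" and "highway B" are hypothetical placeholders, not names from the actual application.

```python
# Primitive of representation chosen for each generic primary area (Figure 9).
PRIMITIVE_FOR_AREA = {
    "Abstraction Knowledge": "Numerical and Fuzzy Functions",
    "Problem Scenarios": "Hierarchy of frames",
    "Component Model": "Graph with ad-hoc procedures",
    "Environment Model": "Hierarchies of frames",
    "Control Actions": "Logic Clauses",
    "Actions Preferences": "Rule Base",
    "Compatibility Model": "Constraint Base",
    "Fixes Model": "Rule Base",
}

# Assumption: these areas belong to component knowledge, the rest are global.
COMPONENT_AREAS = [
    "Abstraction Knowledge", "Problem Scenarios", "Component Model",
    "Environment Model", "Control Actions", "Actions Preferences",
]
GLOBAL_AREAS = ["Compatibility Model", "Fixes Model"]


def new_domain_area(generic_area: str) -> dict:
    # Each domain area keeps its primitive and starts with an empty
    # knowledge base that the developer must fill in.
    return {"primitive": PRIMITIVE_FOR_AREA[generic_area], "knowledge_base": []}


def instantiate(components: list) -> dict:
    """Create one domain area per (component, generic area) plus global areas."""
    domain = {}
    for component in components:
        for area in COMPONENT_AREAS:
            domain[f"{area} ({component})"] = new_domain_area(area)
    for area in GLOBAL_AREAS:
        domain[area] = new_domain_area(area)
    return domain


traffic_network = instantiate(["highway A", "highway B"])
```

With two components, the sketch yields fourteen domain areas, each bound to the same primitive as its generic counterpart but holding its own knowledge base.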
PROTÉGÉ-II establishes a clear separation between methods and domains, which provides a good framework for method reuse. The developer may reuse parts of existing models to create new applications, reusing methods in other domains or applying new methods to the same domain. Mapping relations improve reuse because they increase the flexibility for associating domains and methods. They also make the acquisition process independent of the method: the generated, customized knowledge acquisition tool acquires expertise in domain terms that are friendlier to the domain expert. The customization is partially due to the creation of an application ontology [Gennari et al., 94] that combines method and domain terms, and to the graphical interface customization tool, which further enables the use of domain-specific terms for method-specific knowledge roles. On the other hand, although the current version of the PROTÉGÉ-II environment does not provide a technique for assembling methods, the paradigm on which it is based (task-methods-subtasks-mechanisms) offers an appropriate context in which to consider this process. In particular, subtasks are assembled by higher-level methods that include, as control knowledge, the way in which subtasks must be executed. The detailed process of assembly in PROTÉGÉ-II is still an open question that, for the moment, is solved by ad hoc procedures.
Figure 11: Comparison between modeling approaches in PROTÉGÉ-II and KSM
KSM, in turn, shares several modeling elements with PROTÉGÉ-II (such as methods and tasks), but there are some differences. For instance, KSM introduces a new perspective made of knowledge areas that play a double role: on the one hand, they organize the domain ontology view in a modular structure and, on the other, they include the set of tasks that share the knowledge represented by each area. Knowledge areas can thus be viewed as higher-level blocks that integrate modular ontologies and sets of methods based on the affinity between them. In KSM, reuse is viewed as a process of specialization of knowledge areas, which presents some similarities to reuse in PROTÉGÉ-II. However, in PROTÉGÉ-II reuse is more explicit, since it presents the mapping relations, and more flexible, because it can cope with certain differences between the structures of the method ontology and the domain ontology. On the other hand, KSM provides a technique for assembling knowledge components in order to create more complex methods. This is done by using composite knowledge areas, which include local control knowledge formulating the assembly. KSM uses a particular language, Link, to define such control knowledge. Link provides a way to define the subtask connection (using data flows) and the execution order (using control rules). In KSM, instead of being independent entities as in PROTÉGÉ-II, tasks and methods are associated with knowledge areas, and different instances of them are created by the duplication process carried out during knowledge modeling. This gives a simple solution for associating each task with its own knowledge base so that, unlike in PROTÉGÉ-II, it is not necessary to select the appropriate knowledge base dynamically during execution.
Aspect | PROTÉGÉ-II | KSM |
Modeling Components | Main view: Task Structure | Main view: Knowledge-Area Structure |
Modeling Components | Task | Task |
Modeling Components | Method | Method |
Modeling Components | Modular ontologies with a set of tasks | Knowledge Area |
Modeling Components | Mechanism | Primitive of Representation (set of mechanisms sharing a basic ontology) |
Modeling Components | Method Ontology | Set of Knowledge Bases and Conceptual Vocabularies |
Modeling Components | Domain Ontology | Instantiation of Knowledge Bases and Conceptual Vocabularies into a domain |
Modeling Components | Mapping Relations | Implicit associations between generic and domain models |
Operational Facilities | Mechanism as reusable software component | Primitive of Representation as reusable software component |
Operational Facilities | Ontology formulation with user interface facilities | Ontology formulation using Concel language and local languages provided by primitives |
Operational Facilities | Method assembly by ad-hoc programming | Method assembly by using the Link interpreter (with data-flows and control rules) |
Operational Facilities | Knowledge acquisition interface developed as a customization following the domain ontology | Knowledge acquisition interface as the union of individual interfaces of primitives of representation |
Significant Applications | Medical diagnosis | Traffic management decision support (TRYS system) |
Significant Applications | Patient monitoring | Assistance in emergencies in the hydrology domain (CYRAH system) |
Significant Applications | Mechanical design (VT system) | Mechanical design (VT system) |
Figure 12: This table shows a correspondence between different features of PROTÉGÉ-II and KSM
In the PROTÉGÉ-II approach, the developer creates his/her own KA tool using the facilities provided by PROTÉGÉ-II. This gives great flexibility to build and maintain the user interface for knowledge acquisition of the final application. However, since these facilities are not universal, some features cannot be built using this approach (for instance, interfaces for knowledge acquisition with complex graphics, images, etc.). In KSM, on the other hand, the final user interface for knowledge acquisition is viewed as the union of the local user interfaces of each component. Each primitive has its own interface, and the developer can only adapt it by renaming some terms in order to define its role in the final application. This second approach is less flexible, because it must accept the knowledge representation and user interface imposed by the original primitive, but it is more general, because it can accept a wide range of modules of very different nature, including symbolic representations (rules, frames, constraints, graphs, etc.), parametric representations (neural networks, spreadsheets, etc.) and even conventional algorithmic modules.
Figure 11 shows a comparison between the knowledge modeling approaches followed by PROTÉGÉ-II and KSM. Basically, in PROTÉGÉ-II the model is viewed as a collection of main tasks, each of which is decomposed into subtasks by associating the corresponding methods. This develops a task tree for each main functionality of the model. At the lowest level of each tree there are mechanisms supporting elementary tasks. Each mechanism has its own method ontology, which is associated with the domain ontology by using explicit mapping relations. Thus, the same domain knowledge can be shared by different methods with different mapping relations. KSM, like PROTÉGÉ-II, formulates a structure of tasks and methods. However, elementary tasks are supported by inferences on knowledge bases associated with primitives of representation. Note, for instance, that the same knowledge base may be used to support different tasks (with different inferences). Each knowledge base, apart from its local representation, may be associated with a conceptual vocabulary that includes concepts shared by other knowledge bases. Finally, the knowledge-area structure provides a higher-level organization of primitives. In addition, knowledge areas include sets of tasks (this is not shown in the figure). Comparing both approaches, they follow a similar organization of tasks and methods but present two main differences. First, PROTÉGÉ-II makes the association between domain and method components more explicit by using mapping relations, which are not present in KSM. Second, the set of knowledge bases plus conceptual vocabularies in KSM may be viewed as the domain ontology of PROTÉGÉ-II. Thus, KSM introduces at this level an ontology organization structured in knowledge areas with different levels of aggregation.
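The idea that one piece of domain knowledge can feed different methods through different mapping relations can be sketched in Python as follows. This is a deliberately simplified illustration: mapping relations are reduced to slot renaming, and the function `apply_mapping`, the role names and the sample values are all hypothetical rather than taken from PROTÉGÉ-II.

```python
def apply_mapping(domain_record: dict, mapping: dict) -> dict:
    """Rename domain slots to the method roles listed in `mapping`.

    Domain slots that the mapping does not mention are not passed on.
    """
    return {role: domain_record[slot] for slot, role in mapping.items()}


# One instance of the domain ontology (traffic terms, invented values).
section_reading = {"section": "S1", "occupancy": 72, "speed": 35}

# Two hypothetical methods, each with its own method ontology, reached
# from the same domain terms through different mapping relations.
ABSTRACTION_MAPPING = {"section": "parameter_source", "occupancy": "raw_value"}
SIMULATION_MAPPING = {"section": "node", "speed": "state"}

abstraction_input = apply_mapping(section_reading, ABSTRACTION_MAPPING)
simulation_input = apply_mapping(section_reading, SIMULATION_MAPPING)
```

Each method sees only its own terms, while the domain record is stored once. In KSM, by contrast, this correspondence is implicit in the relation between generic and domain knowledge areas.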
In summary, Figure 12 shows a correspondence between the different features of both environments. Concerning modeling components, the central view of the knowledge model in PROTÉGÉ-II is the task structure, while in KSM it is the knowledge-area structure. Tasks and methods are equivalent in both approaches. The knowledge area of KSM may be viewed in PROTÉGÉ-II as a modular ontology plus the set of tasks that this ontology may support. The elementary knowledge block in PROTÉGÉ-II is the mechanism, and in KSM it is the primitive of representation (which may be viewed as a collection of mechanisms that share the same method ontology). The method ontology of PROTÉGÉ-II is equivalent to the set of knowledge bases (corresponding to the different primitives) plus conceptual vocabularies in KSM. The domain ontology of PROTÉGÉ-II is roughly equivalent to knowledge bases and conceptual vocabularies instantiated on a particular domain. Finally, KSM does not include the explicit mapping relations of PROTÉGÉ-II, but they are implicit in the correspondence between generic knowledge areas and domain knowledge areas.
Both environments also provide facilities for operationalizing knowledge models. The basic reusable software component in PROTÉGÉ-II is the mechanism, and in KSM it is the primitive of representation. In PROTÉGÉ-II, ontologies are formulated by using a window-based editor that presents graphical information following an object-oriented representation. In KSM, ontologies are formulated by using the Concel language (similar to the representation of PROTÉGÉ-II) plus the particular representation provided by each primitive of representation. In PROTÉGÉ-II, complex methods are built by assembling simpler methods through ad hoc programs. In KSM this process is done by using the Link language and its interpreter. Finally, the knowledge acquisition interface is customized in PROTÉGÉ-II by using a graphical editor that follows the domain ontology. In KSM this interface is the union of the knowledge acquisition interfaces of the primitives. Both environments have been used to build real knowledge-based applications in different domains, which has demonstrated the validity of both approaches (Figure 12 lists some of these applications) [Rothenfluh et. al, 96].
In summary, the paper compares two different software platforms for knowledge modeling and operationalization: the PROTÉGÉ-II and KSM environments. Both environments have proved to be useful tools in real-world problems (e.g., medical and traffic control domains). The paper presents a sufficiently complex case study, about decision support for real-time systems, following the KADS methodology. The knowledge model is a generalization of an existing real model used in the traffic domain. It includes three main tasks: diagnosis, prediction and design. The paper illustrates how the model may be formulated following the PROTÉGÉ-II and KSM approaches in order to determine analogies and differences between them.
The comparison shows, first, that both environments are capable of supporting the complex generic knowledge model of the example. Both environments support the definition of a generic and reusable model by providing descriptive entities (such as tasks, methods and ontologies in both environments, among other more specific ones) and provide computational facilities (e.g., mechanisms, ontology editor, knowledge acquisition interface builder, etc. in PROTÉGÉ-II, or primitives of representation, Concel language, Link interpreter, etc. in KSM) for applying and operationalizing the model in a particular domain. The significant differences include two advantages of PROTÉGÉ-II that are not present in KSM: (1) PROTÉGÉ-II includes an explicit association between method and domain ontologies (by using mapping relations), which facilitates reuse and knowledge acquisition, and (2) PROTÉGÉ-II helps the developer to customize a knowledge acquisition tool following the domain ontology, which provides a good level of flexibility to create, maintain and reuse the final application. On the other hand, the advantages of KSM (not present in PROTÉGÉ-II) are: (1) it provides a new perspective for knowledge modeling, the knowledge area, which presents complementary and synthetic images of the model integrating subsets of tasks and modular ontologies, and (2) it provides a solution for assembling methods to define more complex methods (by using the Link language and its interpreter).
Thus, this work shows that the existing duality in knowledge models (the task-method view and the domain-ontology view) is well separated in PROTÉGÉ-II. On the other hand, given that these views are not totally independent, a new perspective is useful for showing their coupling (this is supported by the knowledge-area perspective of KSM). Therefore, it would seem that the optimal framework for the development of knowledge-based systems should include both views. Each view has its advantages. For instance, in a traffic-control application, the knowledge engineer might need to view the method-specific properties (e.g., temporal persistence) of the instantiation of the same method-specific concept (e.g., abstract parameters) or even the same domain-specific abstract parameter (e.g., circulation regime) across several domain-specific contexts (e.g., at different locations). Such a view is supported naturally by the PROTÉGÉ-II framework, which also supports multiple-level inheritance of common properties. Alternatively, the domain expert might need to review all of the knowledge roles across particular knowledge areas at different degrees of aggregation (e.g., different highways or different road networks). Such a view would be better supported by KSM. Both views should exist for full support of development and maintenance.
[Bradshaw et al., 93] Bradshaw, J.M., Ford K.M., Adams-Webber J.R., Boose J.H.: "New approaches to constructivist knowledge acquisition tool development" Int. J. Intell. Syst. 8 (2). 1993. Also in "Knowledge Acquisition as Modeling", Ford K.M. and Bradshaw J.M. (eds) Wiley, New York, 1993.
[Chandrasekaran, 86] Chandrasekaran, B.: "Generic Tasks in Knowledge Based Reasoning: High Level Building Blocks for Expert Systems Design". IEEE Expert, 1986.
[Cuena, Hernández, 96] Cuena J., Hernández J.: "An Exercice of Knowledge Oriented Design: Architecture for Real Time Decision Support Systems". To be published as chapter of the book "Knowledge-based Systems: Advanced Concepts, Techniques and Applications". S. G. Tzafestas (Ed.). Publisher World Scientific Publishing Company. 1996.
[Cuena, Molina, 96] Cuena J., Molina M.: "KSM: An Environment for Design of Structured Knowledge Models". To be published as chapter of the book "Knowledge-based Systems: Advanced Concepts, Techniques and Applications". S. G. Tzafestas (Ed.). Publisher World Scientific Publishing Company. 1996.
[Cuena et al., 96] Cuena J., Hernández J., Molina M.: "Knowledge Oriented Design of an Application for Real Time Management: The TRYS System". Proc. European Conference on Artificial Intelligence (ECAI'96). Budapest 1996.
[Eriksson et al., 95] Eriksson, H., Shahar, Y., Tu, S.W., Puerta, A.R., and Musen, M.A.: "Task modeling with reusable problem-solving methods". Artificial Intelligence 79 (2):293-326. 1995.
[Gennari et al., 94] Gennari, J.H., Tu, S.W., Rothenfluh, T.E., and Musen, M.A.: "Mapping domains to methods in support of reuse". International Journal of Human-Computer Studies, 41:399-424, 1994.
[Molina, 93] Molina M.: "Desarrollo de Aplicaciones a Nivel Cognitivo Mediante Entornos de Conocimiento Estructurado". Technical University of Madrid. PhD. Dissertation. 1993.
[Molina, Cuena, 94] Molina M., Cuena J.: "Knowledge Oriented and Object Design: The Experience of KSM". Proc. 9th Banff Knowledge Acquisition for Knowledge-based Systems Workshop. 1995.
[Molina et al., 95] Molina M., Logi F., Ritchie S., Cuena J.: "An Architecture Integrating Symbolic and Connectionist Models for Traffic Management Decision Support". Proc VI International Conference on Applications of Advanced Technologies in Transportation Engineering. 1995.
[Musen et al., 95] Musen, M.A., Gennari, J.H., Eriksson, H., Tu, S.W., Puerta, A. R.: "PROTÉGÉ-II: Computer Support For Development Of Intelligent Systems From Libraries of Components". In Proceedings of MEDINFO '95, Eighth World Congress on Medical Informatics, 766-770, Vancouver BC. 1995.
[Puerta et al., 93] Puerta A.R., Tu S.W., Musen M.A.: "Modeling Tasks with Mechanisms". International Journal of Intelligent Systems, Vol. 8, 1993.
[Rothenfluh et. al, 96] Rothenfluh, T.E., Gennari, J.H., Eriksson, H., Puerta, A.R., Tu, S.W., and Musen, M.A.: "Reusable ontologies, knowledge-acquisition tools, and performance systems: PROTÉGÉ-II solutions to Sisyphus-2". International Journal of Human-Computer Studies, in press.
[Schreiber et al., 93] Schreiber G., Wielinga B.J, Breuker J.A. (eds): "KADS: A principled approach to knowledge-based system development". Academic Press. 1993.
[Shahar, Musen, 93] Shahar, Y. and Musen, M.A.: "RÉSUMÉ: A temporal-abstraction system for patient monitoring", Computers and Biomedical Research 26(3):255-273, 1993. Reprinted in: J.H. van Bemmel and T. McRay, eds., 1994, Yearbook of Medical Informatics 1994, 443-461. Stuttgart: F.K. Schattauer and The International Medical Informatics Association.
[Tu et al., 95] Tu, S.W., Eriksson, H., Gennari, J., Shahar, Y., and Musen, M.A.: "Ontology-based configuration of problem-solving methods and generation of knowledge-acquisition tools: Application of PROTÉGÉ-II to protocol-based decision support". Artificial Intelligence in Medicine 7(3):257-289. 1995.
[Wielinga et al., 92] Wielinga B.J., Schreiber A.T., Breuker J.A.: "KADS: A modelling approach to knowledge engineering". Knowledge Acquisition. 4, 5-53. 1992.