Providing Advice on the Acquisition and Reuse of Knowledge Bases in Problem Solving

 
S. White and D. Sleeman

 

Department of Computing Science
 University of Aberdeen
 Aberdeen AB24 3UE
 Scotland, UK
 {swhite, dsleeman}@csd.abdn.ac.uk
 
 
 
Abstract. Many tools and techniques have been developed for the systematic acquisition of domain knowledge, including knowledge elicitation (KE) methods to acquire knowledge from a human expert, machine learning (ML) algorithms that infer knowledge from data, and knowledge base refinement (KBR) tools that refine existing knowledge bases (KBs). As the number and sophistication of knowledge acquisition tools increases, it becomes progressively more difficult for users (notably domain experts) to choose between them for particular applications, especially when more than one is needed. We recognise the importance of driving this process by the epistemological requirements of the problem solver(s) which have been selected to solve a particular task. To support this approach, we introduce the MUSKRAT toolbox, which includes an advisory system coupled to several knowledge acquisition tools and problem solvers. This advice-giving system compares the requirements of the selected problem solver with available sources of information (knowledge, data, human expert). As a result, it may recommend either the reuse of existing knowledge bases, or the application of one or more knowledge acquisition tools, based on their knowledge-level descriptions. In this paper, we present the MUSKRAT framework and illustrate it with a detailed description of the prototype currently being implemented, which includes three problem solvers and four knowledge acquisition tools.
 

1. INTRODUCTION

Many tools and techniques have been developed for the systematic acquisition of domain knowledge, including knowledge elicitation (KE) methods to acquire knowledge from a human expert, machine learning (ML) algorithms that infer knowledge from data, and knowledge base refinement (KBR) tools that refine existing knowledge bases (KBs). As these tools become more sophisticated and are enhanced to deal with real-world applications, their differences tend to become less apparent. KE tools, which were originally simple implementations of manual methods, now perform more tasks automatically, while ML and KBR tools now recognise that many interesting tasks cannot be completely automated and so interact more with their users. This, in addition to the large number of available Knowledge Acquisition (KA) techniques, makes it very difficult for many users (notably domain experts) to choose an appropriate tool for their particular application, especially when more than one is needed to solve their problem.

We present a framework for problem solving and knowledge acquisition in which the epistemological requirements of the problem solvers drive the KA process. We aim to describe the kinds of data and knowledge which each problem solver requires, such that an advisory system will be able to recommend how the available information and knowledge sources can be transformed by a series of KA tools into the knowledge required by the problem solver. We illustrate the feasibility of the framework by building MUSKRAT, a MUltiStrategy Knowledge Refinement and Acquisition Toolbox (Graner and Sleeman, 1993). MUSKRAT is an open architecture which incorporates a number of KA tools and problem solvers, as well as an advisory system that provides means-ends guidance.

This paper explains the origins and intentions of this research. The following section presents its background; section 3 describes the basic MUSKRAT framework; section 4 details the requirements of means-ends analysis, and section 5 reports on current progress with our MUSKRAT prototype. Finally, in section 6, we summarise our work and indicate some possible future research directions.

2. BACKGROUND

This work originated in the Machine Learning Toolbox (MLT) project (Kodratoff et al., 1992). The aim of that project was to use machine learning with real industrial problems by building a collection of ML tools. The ML toolbox also includes a number of support tools, including an advice-giving system, the Consultant, developed at the University of Aberdeen (Craw et al., 1992; Sleeman et al., 1995). The Consultant questions its user about the task to be solved, the data and background knowledge that can be provided, etc., and recommends one or more suitable learning tools. Although it was found to perform satisfactorily, the Consultant suffers from two major limitations. Firstly, since its recommendations are the result of differential (or comparative) knowledge of the toolbox algorithms, it is difficult to extend with knowledge of other algorithms. This means that whenever a new algorithm is added to the toolbox, the whole of the Consultant's knowledge base must be reconsidered, instead of just the knowledge pertaining to the new algorithm. Secondly, the Consultant has no understanding of the problem that the user wants to solve in his application domain. The user must decide what knowledge is required to solve the problem by first defining a learning task, and only then can the Consultant help him with the choice of a suitable tool.

Having a model of the target problem solver is useful for guiding the KA process. This is generally acknowledged in the KE community:
 

``Currently the main theories of knowledge acquisition are all model based to a certain extent. The model based approach to knowledge acquisition covers the idea that abstract models of the tasks that expert systems have to perform can highly facilitate knowledge acquisition.'' (van Heijst et al., 1992).

However, this is not always accepted by ML researchers; ML systems that use an explicit model of problem solving are rare (Ganascia, Thomas and Laublet, 1993). The reason is that it is difficult to use an abstract, knowledge-level model of a problem solver to guide ML effectively. The knowledge acquired by ``manual'' knowledge elicitation can often easily be adapted to fit the requirements of a particular problem solver (e.g., in terms of knowledge representation), but that obtained from an automatic ML tool can only be used after the knowledge has been transformed by either a human or a program. We therefore decided that MUSKRAT, a knowledge acquisition toolbox which includes KE, ML and KBR tools, should also have access to problem solvers, which will serve as the targets of the KA process. This is in contrast, for instance, with KEW, the Knowledge Engineering Workbench produced by the ACKnowledge project (Reichgelt and Shadbolt, 1992). KEW, which focuses on KE techniques, does not include a problem solver, but instead uses Generalised Directive Models to guide tool selection (van Heijst et al., 1992).

The integration of learning and problem solving is also a major issue in the field of integrated systems (SIGART Bulletin, 1991; VanLehn, 1988). Some systems integrate several KA tools with a problem solver (as in PRODIGY (Carbonell, Knoblock and Minton, 1991)). Others integrate KA and problem solving in a single component, using a uniform technique (THEO (Mitchell et al., 1991), SOAR (Laird et al., 1991)). In both cases, the knowledge base is tied to a particular problem solver. In contrast, MUSKRAT integrates existing, stand-alone KA tools with existing, stand-alone problem solvers, so that the knowledge can be tested independently and shared among several problem solvers. Knowledge sharing and reuse is further supported by the fact that all the knowledge acquired by the system is expressed in a single representation language, such as CKRL (Common Knowledge Representation Language) (Morik, Causse and Boswell, 1991).

Finally, the selection of an appropriate ML tool is also an issue in many multistrategy learning systems (Michalski and Tecuci, 1991). Some multistrategy systems include several ML techniques (for instance both symbolic and sub-symbolic algorithms) which are applied successively to generate a single knowledge base. In MUSKRAT, the knowledge to be acquired is structured into several knowledge bases, each of which is obtained with an appropriate technique. Other multistrategy systems use highly discriminating selection criteria to opt for the most suitable KA tool, but we are not aware of any system that chooses from as broad a range of techniques as MUSKRAT, including KE, ML and KBR tools.
 

3. THE MUSKRAT FRAMEWORK

In the MUSKRAT framework, we assume that problem solving proceeds as follows:
  1. A task is first identified; that is, a problem to be solved in a particular domain.
  2. A suitable problem solver is selected to solve this task. If no single problem solver can be identified, it may be necessary to split the application task into sub-tasks that can each be solved by a problem solver.
  3. For each selected problem solver, the required data sets and knowledge bases are determined. This amounts to a knowledge-level analysis of each problem solver, which need only be done once since it does not depend on a particular application task.
  4. The available knowledge sources (human expert, examples, knowledge bases, etc.) are compared with the requirements of each selected problem solver. This may define one or more KA tasks.
  5. A tool is selected to solve each KA task; i.e., to bridge the gap between required and available knowledge. This presupposes a knowledge-level analysis of available tools.
  6. The selected KA tools are applied.
  7. The selected problem solver(s) are applied.
These steps can be repeated in a cycle, especially if information acquired in step 6 is needed to refine the decisions made in step 2, or if one of the KA steps fails, or if the user rejects the solution proposed in step 7.

The MUSKRAT system is designed to support steps 3 to 5. It assumes that a problem solver has been selected for a particular task or sub-task, and directs the acquisition of knowledge for that problem solver. The system consists of any number of problem solvers, any number of KA tools, and a guidance module, the KA selector/advisor.

The tool selection process starts with a description of the problem and the subsequent selection of a corresponding problem solver. An advisory system may assist the user with this choice. This is not currently part of MUSKRAT, but a module similar to KEW's advice and guidance module (van Heijst et al., 1992) should be applicable.

Once a problem solver has been selected, MUSKRAT knows which KB(s) are required. This follows because each problem solver specifies the knowledge it requires in terms of its functionalities and representation. These requirements are expressed in a formalism which provides descriptors for both knowledge-level and symbol-level features. We are currently defining such a formalism to describe the effects and requirements of the particular tools included in the MUSKRAT prototype.

The next step is to identify the available knowledge sources. We consider three broad categories of knowledge sources: available knowledge refers to knowledge that is already in the form required for a KB, e.g., a set of rules. It may be directly usable or require transformation or refinement. Note that knowledge bases are seldom available initially, but when MUSKRAT is used iteratively as part of a problem solving cycle, ``available knowledge'' refers to that acquired during a previous iteration. Available data refers to data that is relevant to the problem and from which useful information could be extracted, although it does not meet the requirements of the KB. Typically, this may consist of past cases, i.e., previously solved problems similar to the one at hand, from which insight into the new problem can be gained. Alternatively, if the problem is to diagnose faults in a complex system, ``available data'' may refer to a model of the system which is useful (or perhaps necessary) to perform diagnosis. Note that the distinction between knowledge and data is not intrinsic but depends on the KB requirements. For instance, a set of past cases is regarded as knowledge if it is to be used by a case-based reasoner that can use it directly, but only as data for a rule-based system which is unable to reason with cases. Finally, an expert is a person who can provide various forms of knowledge, possibly with the help of a KE tool and/or a knowledge engineer.

The KA selector/advisor is the central component of MUSKRAT (see Figure 1). It compares the requirements of the selected problem solver with the characteristics of available knowledge sources and recommends the use of one or more KA tools which should create the desired KBs. For that purpose, it needs a knowledge-level description of each available KA tool and performs a means-ends analysis to decide which one is most capable of reducing the differences for each of the required KBs. This is described in more detail in the following section.

Figure 1: The KA Selector
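
The comparison performed by the KA selector/advisor can be illustrated with a minimal sketch. It is not the MUSKRAT implementation: the names `Source` and `find_ka_tasks`, the role labels and the sample sources are all hypothetical, and a real advisor would inspect knowledge-level descriptions rather than simple labels.

```python
from dataclasses import dataclass

# Hypothetical illustration only; none of these names come from MUSKRAT.

@dataclass(frozen=True)
class Source:
    name: str
    kind: str   # "knowledge", "data" or "expert"
    role: str   # the required role this source could help to fill

def find_ka_tasks(required_roles, sources):
    """Compare a problem solver's requirements with the available sources.

    Returns (reusable, ka_tasks): `reusable` maps a role to an available
    knowledge base that already fills it; `ka_tasks` maps every other role
    to the names of the sources from which it might be acquired.
    """
    reusable, ka_tasks = {}, {}
    for role in required_roles:
        ready = [s for s in sources if s.role == role and s.kind == "knowledge"]
        if ready:
            reusable[role] = ready[0].name
        else:
            ka_tasks[role] = [s.name for s in sources if s.role == role]
    return reusable, ka_tasks

sources = [
    Source("dish-rules.ckrl", "knowledge", "rule base"),
    Source("past-menus", "data", "set of examples"),
    Source("chef", "expert", "set of examples"),
]
reusable, ka_tasks = find_ka_tasks(
    ["rule base", "set of examples", "domain theory"], sources
)
```

In this example the rule base can be reused directly, the set of examples defines a KA task with two candidate sources, and the domain theory defines a KA task for which no source has yet been identified.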

Once the KA tools have been recommended, it is the user's responsibility to run these tools on the specified inputs to acquire the required knowledge bases. When this has been done for each of the KA tools, the user should be able to run the recommended problem solver. If any of the above stages fails it is currently the user who has to decide what course of action to take. (When the system is built we will note carefully the actions taken by users, with a view to a subsequent semi-automation of this stage.) Additionally, it is the user's responsibility to evaluate the solution obtained by the problem solver and, if necessary, to initiate a further KA cycle.
 

4. MEANS-ENDS ANALYSIS

Means-Ends Analysis is a recursive problem solving procedure in which at each step the difference between the current problem solving state and a goal state is identified and then a problem solving step (`operator') is selected on its merit for reducing this difference (Sundermeyer, 1991). In MUSKRAT we are concerned with the application of knowledge acquisition tools and problem solvers, so our `problem solving state' refers to the availability of useful knowledge bases. The goal state is given by the ability to run a problem solver successfully on the given problem using these knowledge bases. The `operators' are KA tools (and, in general, other transformation processes) which bring the available knowledge bases nearer to those which are required by the problem solver. An optimal means-ends guidance system would recognise which of the following three cases applies to each required knowledge base:
  1. A suitable knowledge base is already available and can be used without modification.
  2. No suitable knowledge base is available, so the knowledge must be acquired.
  3. An available knowledge base could be made suitable by modifying it.
The first case demands a detailed inspection of a knowledge base's contents in order to assess its suitability, or fitness for purpose (discussed in section 4.1). If such a content-based inspection fails because a suitable knowledge base is not available, then that knowledge is acquired, as indicated by case 2. This activity involves two important issues: the selection of suitable techniques for acquiring the knowledge (examined in section 4.2), and the acquisition of that knowledge in such a way that it already complies with constraints associated with the targeted problem solver. We do not intend to address this latter issue directly, although we may gain insights into the nature of the problem as a side-effect of our main interests (i.e., fitness for purpose and KA tool selection). Neither do we intend to support the modification of existing knowledge bases (case 3), since the first two cases provide plentiful challenges, and others are already addressing this topic. Two approaches, for example, are adaptation (Fensel and Groenboom, 1997) and ontological mediation (Gray et al., 1997; Visser et al., 1997).

In the rest of this section we focus on sub-topics associated with cases 1 and 2 given above.

4.1 Assessing a Knowledge Base's Fitness for Purpose

Each time a candidate knowledge base is presented to a problem solver (or a KA Tool), it needs first to be inspected to see if it can be used without modification (case 1, above). This is the assessment of a knowledge base's fitness for purpose.

In MUSKRAT, all knowledge bases are expressed in the same representation language, CKRL. CKRL (Common Knowledge Representation Language) is an information interchange language developed as part of the MLT project (Morik, Causse and Boswell, 1991). CKRL is not directly executable, but consists of declarations that can be translated into a tool's internal representation. To ensure that an unambiguous translation is possible to a wide range of representational schema, CKRL entities (concepts, instances, relations, properties, sorts, rules, etc.) are defined at the epistemic level (Brachman, 1979). Our choice of a uniform knowledge representation was motivated by considerations of knowledge sharing and reuse: a KB should be usable by several problem solvers, even if these uses were not anticipated when the KB was created. It also allows the integration of new problem solvers and KA tools into MUSKRAT at the cost of implementing a single interface to and from CKRL.

Although CKRL was originally designed as a communication medium for ML tools, it is general enough to be useful in many situations where knowledge is to be transmitted or processed in a number of ways, including, for example, describing tools as part of the knowledge level model, and representing ontologies. An additional advantage of choosing CKRL is that some of the KA tools in our prototype are also part of MLT (Machine Learning Toolbox), and therefore already express their output in this language.

Each CKRL knowledge base is allocated an epistemological role when it is used as input to a problem solver or KA tool. The roles which each tool needs to have fulfilled before it can run are documented in the knowledge level model. For example, the conceptual clustering algorithm KBG expects two main inputs, a set of examples and a domain theory. Each of these represents a particular role which a knowledge base can play when used as input to KBG. The roles which a knowledge base plays when it is used in conjunction with one tool are not constrained by the roles which it plays when used in conjunction with others. We take the view that a collection of knowledge bases is not fit for the purposes of a particular problem solver unless all the required roles are fulfilled. To fulfil an allocated role, the knowledge contained in a knowledge base must be necessary, sufficient, and of the expected type. If a knowledge base is not fit for the role, we should like to determine why not, i.e., whether the knowledge it contains is unnecessary, insufficient, or not of the expected type.

Examples

Since CKRL deals with the definition of high level entities such as concepts, relations, instances, facts, and rules, we can define the roles of each type of knowledge base by expressing their contents as a context-free grammar. For example, the following fragment of EBNF sketches possible definitions for a rule base, a set of examples, and a domain theory.

<knowledge base>  ::= <ckrl entity>+
<ckrl entity>     ::= { <concept> | <rule> | <relation> | <fact> | <instance> | ...}
<rule base>       ::= <rule>+
<set of examples> ::= <instance>+
<domain theory>   ::= <domain entity>*
<domain entity>   ::= { <concept> | <rule> | <relation> | <fact> | <instance> }

In addition to inspecting the conceptual types of the proposed knowledge bases, we should like to ensure that the given knowledge is also the right knowledge for solving the problem at hand, but without actually running the problem solver. The advisor must therefore inspect each knowledge base with respect to the task and, with the assistance of descriptions of the competencies of the tool(s) in question, reject knowledge bases which clearly do not lead to a solution of the problem. Although our ideas are currently somewhat preliminary, this is the crux of the approach which we believe we can address at various levels of sophistication (details to be presented at the workshop).
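
The type-level part of this inspection, i.e., checking a knowledge base against the role definitions of the grammar fragment given earlier in this section, can be sketched as follows. This is an illustration under simplifying assumptions: a knowledge base is reduced to the list of its entity kinds, and the names `ROLE_GRAMMAR`, `fits_role` and `possible_roles` are hypothetical. It says nothing about the content-based checks discussed above.

```python
# Hypothetical sketch: a knowledge base is reduced to a list of entity kinds.
DOMAIN_ENTITY = {"concept", "rule", "relation", "fact", "instance"}

# role -> (allowed entity kinds, minimum number of entities)
ROLE_GRAMMAR = {
    "rule base": ({"rule"}, 1),             # one or more rules
    "set of examples": ({"instance"}, 1),   # one or more instances
    "domain theory": (DOMAIN_ENTITY, 0),    # zero or more domain entities
}

def fits_role(entity_kinds, role):
    """True if the entity kinds satisfy the grammar rule for the given role."""
    allowed, minimum = ROLE_GRAMMAR[role]
    return len(entity_kinds) >= minimum and all(k in allowed for k in entity_kinds)

def possible_roles(entity_kinds):
    """All roles that a knowledge base with these entity kinds could play."""
    return sorted(role for role in ROLE_GRAMMAR if fits_role(entity_kinds, role))
```

For instance, `possible_roles(["instance", "instance"])` yields both "domain theory" and "set of examples", which reflects the point that the role of a knowledge base depends on the tool that uses it rather than on the knowledge base alone.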

4.2 Acquiring Knowledge because it is not Available

If the ``fitness for purpose'' test fails on the available knowledge bases, and we decide not to transform the available knowledge bases (as in case 3, above), then we must acquire a suitable knowledge base ab initio. Given an adequate model of knowledge acquisition and available tools, MUSKRAT can assist the user in choosing the right acquisition tool(s) for the problem at hand. A prerequisite for effective KA tool selection as part of means-ends analysis is the ability to describe KA tools and problem solving systems at the ``knowledge level'' (Newell, 1982). This entails describing not only the distinct properties of each subsystem, but also the way that the subsystems relate to each other, and to other comparable systems. For example, a taxonomy of KA tools could assist in the choice of acquisition method, particularly in an open framework like that of MUSKRAT. In fact, since we are reasoning at the knowledge level, by including such knowledge, MUSKRAT would be able to recommend the use of tools which it deemed appropriate, but which were not available in the system!

Since the selection of KA tools must also fit into the framework of means-ends analysis, we also seek to specify:

The requirements of a KA tool or problem solver are described by the set of tests given in section 4.1. This enables the system to answer the question ``Can the given problem solver (or KA tool) run with the knowledge available?''. It does not help the system plan the overall problem solving strategy, however, or to combine the contributions of different tools. For this, we need to model epistemological change. We suggest that this is possible in MUSKRAT by applying some mapping functions to the knowledge level types defined by a grammar in section 4.1. Each mapping represents the knowledge-level transformation achieved by a problem solver or KA tool. Consider the following examples:

  STALKER: rule base -> rule base
  KBG:     domain theory x set of examples -> rule base
  REPGRID: expert -> set of instances x set of properties x (instance x property -> value)

The first mapping indicates that STALKER is a program which takes a rule base as an input, and returns a rule base, i.e., it is a rule base refiner. The second mapping shows that KBG generates a rule base from a domain theory and a set of examples. Lastly, REPGRID `operates' on a domain expert as its knowledge source, and returns a set of instances, a set of properties, and a function which, given an instance and a property, returns a value. Since the mapping functions are to be used as the top-level operators in means-ends analysis, they enable MUSKRAT to plan its problem solving, and use the plan to guide tool selection.
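
To make the use of such mappings as operators concrete, the following sketch plans a sequence of tools by working backwards from a required knowledge-level type. It is only an illustration: the `plan` function and the `OPERATORS` table are hypothetical, types are reduced to labels, and we assume (this is not stated by the mappings themselves) that the instances elicited by REPGRID can play the role of a set of examples.

```python
# Hypothetical operator table: tool -> (input types, output types).
# Assumption: REPGRID's set of instances is usable as a "set of examples".
OPERATORS = {
    "STALKER": ({"rule base"}, {"rule base"}),
    "KBG": ({"domain theory", "set of examples"}, {"rule base"}),
    "REPGRID": ({"expert"}, {"set of examples", "set of properties", "value function"}),
}

def plan(goal, available, operators=OPERATORS, _pending=frozenset()):
    """Return a list of tools whose application yields `goal`, or None.

    `available` is the set of knowledge-level types already at hand.
    `_pending` holds goals currently being pursued, to avoid circular plans
    (e.g. using a rule base refiner to obtain its own input).
    """
    if goal in available:
        return []                      # no difference left to reduce
    if goal in _pending:
        return None
    for tool, (needs, gives) in operators.items():
        if goal not in gives:
            continue                   # this operator does not reduce the difference
        steps, have = [], set(available)
        for need in sorted(needs):
            sub = plan(need, have, operators, _pending | {goal})
            if sub is None:
                break                  # an input cannot be obtained; try another tool
            steps += sub
            for used in sub:
                have.update(operators[used][1])
        else:
            return steps + [tool]
    return None

steps = plan("rule base", {"expert", "domain theory"})
```

With an expert and a domain theory available, the sketch proposes REPGRID followed by KBG; with neither a rule base nor any source to start from, it reports that no plan exists.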

5. THE MUSKRAT PROTOTYPE

To illustrate our approach, a MUSKRAT prototype is currently being implemented in the domain of meal planning and preparation. This is a common domain in AI not only because it is demanding (Marling and Sterling, 1996), but also because it is easily accessible to other researchers. It includes three distinct problems: selecting dishes given a set of constraints, analysing and criticising a selected menu, and scheduling meal preparation given time constraints and limited resources. It should be noted that since the main focus of our work is on knowledge acquisition rather than problem solving, we decided to keep the problem solvers fairly simple, even if this implies that we can only solve simplified versions of our original problems. Enhanced versions of these problem solvers will later be applied to similar, though much larger, problems, for example in the domain of flexible manufacturing, where the three tasks are the design of mechanical devices under specific constraints, analysis of proposed designs, and workshop scheduling for the manufacturing of the object.

The prototype includes three different problem solvers and four KA tools which were selected to deal with the given problems. In the following section we describe the system-level architecture of the prototype, before describing the components of MUSKRAT in more detail.
 

5.1 Architecture and Choice of Technology

Figure 2: The System-Level Architecture of MUSKRAT

In line with current technology and communication trends, we decided to implement the MUSKRAT prototype as a client-server system which operates over the Internet (see Figure 2). This will not only make it easier for interested parties to try out our system, it will also provide a sound basis for further work in distributed knowledge acquisition and problem solving. The architecture consists of a LISP server and Java clients, which communicate via TCP/IP sockets. The client is relatively lightweight, currently providing a simple web browsing facility (envisaged mainly for help texts), a repertory grid tool (Boose, 1990), and automatic access to the server. We have also anticipated a visual CKRL knowledge editing tool, but the current implementation includes only a simple text editor. The LISP server contains the problem solvers, the rest of the KA tools, and the advisory system. We now describe the problem solvers, knowledge acquisition tools and the advisor in more detail.

5.2 Problems and Problem Solvers

Constraint Satisfaction   The constraint satisfier is supplied with a list of constraints, ranked by their importance. The task in this domain is to compose a menu (from a pre-defined set of dishes) which satisfies all the constraints. If all the constraints cannot be simultaneously satisfied, then a menu is composed which satisfies the most important ones. Examples of constraints include: ``the meal should include a starter, a main course and optionally a dessert'', ``select a vegetarian meal'', ``at most one dish may include seafood'', ``the total price must not exceed N'', etc.
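
The behaviour described above can be illustrated with a small brute-force sketch. It is not the prototype's constraint satisfier: the dishes, prices and function names are invented, the course structure (starter, main course, optional dessert) is built into candidate generation, and "satisfies the most important ones" is interpreted here as a lexicographic preference over the ranked constraints, which is only one possible reading.

```python
from itertools import product
from typing import NamedTuple

class Dish(NamedTuple):          # invented sample schema
    name: str
    course: str
    vegetarian: bool
    seafood: bool
    price: int

DISHES = [                       # invented sample data
    Dish("vegetable soup", "starter", True, False, 4),
    Dish("fish soup", "starter", False, True, 3),
    Dish("mushroom risotto", "main", True, False, 9),
    Dish("grilled prawns", "main", False, True, 8),
    Dish("fruit salad", "dessert", True, False, 4),
]

def candidate_menus(dishes):
    """Yield every menu with a starter, a main course and an optional dessert."""
    def of(course):
        return [d for d in dishes if d.course == course]
    for s, m, d in product(of("starter"), of("main"), [None] + of("dessert")):
        yield tuple(x for x in (s, m, d) if x is not None)

def best_menu(dishes, ranked_constraints):
    """Pick the menu that is best on the constraints, most important first."""
    return max(candidate_menus(dishes),
               key=lambda menu: tuple(c(menu) for c in ranked_constraints))

def vegetarian(menu):
    return all(d.vegetarian for d in menu)

def at_most_one_seafood(menu):
    return sum(d.seafood for d in menu) <= 1

def within_budget(limit):
    return lambda menu: sum(d.price for d in menu) <= limit

veg_first = best_menu(DISHES, [vegetarian, at_most_one_seafood, within_budget(12)])
budget_first = best_menu(DISHES, [within_budget(12), at_most_one_seafood, vegetarian])
```

With this sample data no menu satisfies all three constraints, so the ranking decides which one is dropped: ranking the vegetarian constraint first gives up the budget, while ranking the budget first gives up the vegetarian constraint.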

 The KBs required to solve this problem are:

Design Analysis   The purpose of this problem solver is to take a menu generated by the first problem solver or any other source, and to issue a list of comments, and suggestions for possible improvements. A typical output from this module could be ``This meal supplies two thirds of the recommended daily allowance of carbohydrates'', or ``This meal is unbalanced because it contains two sea food dishes; you could replace fish soup with vegetable soup''.

The KBs required by this problem solver are:

Task Scheduling   Once a menu has been selected, this problem solver can be used to generate a plan to prepare it. Each dish has a recipe, which is a fixed, partially ordered list of actions. The problem is to set the starting time of the tasks involved in the recipes of all the dishes in the menu, so as to meet time and resource constraints. A time constraint might be that two dishes must be ready and warm at the same time; a resource constraint might be that only one oven is available.
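
A simplified version of this scheduling problem can be sketched as follows. The sketch is a greedy scheduler written for illustration, not the prototype's problem solver: the task names and durations are invented, each resource is assumed to serve one task at a time, and time constraints such as two dishes being ready at the same moment are not handled.

```python
def schedule(tasks):
    """Assign start times to tasks.

    tasks: dict mapping a task name to (duration, predecessors, resource),
    where resource is a name such as "oven" or None.
    A task starts once all its predecessors have finished and, if it needs a
    resource, once that resource is free. Greedy and not necessarily optimal.
    """
    start, free_at = {}, {}
    remaining = dict(tasks)
    while remaining:
        ready = [name for name, (_, preds, _) in remaining.items()
                 if all(p in start for p in preds)]
        if not ready:
            raise ValueError("cyclic precedence constraints")
        for name in sorted(ready):
            duration, preds, resource = remaining.pop(name)
            earliest = max((start[p] + tasks[p][0] for p in preds), default=0)
            if resource is not None:
                earliest = max(earliest, free_at.get(resource, 0))
                free_at[resource] = earliest + duration
            start[name] = earliest
    return start

# Invented example: two recipes competing for a single oven (times in minutes).
recipes = {
    "prepare pie": (20, [], None),
    "bake pie": (40, ["prepare pie"], "oven"),
    "season chicken": (10, [], None),
    "roast chicken": (60, ["season chicken"], "oven"),
}
start_times = schedule(recipes)
```

Here the chicken could go into the oven after 10 minutes as far as its own recipe is concerned, but because the greedy order gives the oven to the pie first, roasting is delayed until the pie has finished baking.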

 The KBs required by this problem solver are:

Figure 3: A summary of Knowledge Base Reuse in the MUSKRAT prototype

KB reuse in the MUSKRAT prototype is conveniently summarised in Figure 3. In this figure, the ovals denote knowledge bases, the rectangles denote problem solvers, and the arrows indicate the direction of knowledge flow.

5.3 Knowledge Acquisition Tools

Unlike the above problem solvers, which are being implemented as part of this work, the KA tools described in this section were developed independently by other researchers. Our goal is to show that they can be made to work together, with minimal modifications. One significant enhancement that has to be made, however, is that they must all express their output in the same knowledge representation language, CKRL. (This is already the case for those which are part of the MLT, namely APT and KBG.)

Repertory Grid   The repertory grid is a KE technique derived from social/cognitive psychology (Kelly, 1955; Boose, 1990). It provides a systematic way of interactively eliciting elements (examples) and constructs (descriptors) from an expert. Although it is fundamentally a methodology, it can be supported by software tools such as Tacktix (Reichgelt and Shadbolt, 1991) that not only acquire this knowledge but also compute similarities and correlations between elements and between constructs. More recently, repertory grid tools have also been implemented for use on the Internet (Gaines and Shaw, 1997).

Figure 4: Screenshot of the Repertory Grid Tool

In our application, this tool is used to acquire simultaneously dish descriptors (A1) and descriptions (A2 and B2), since these KBs must be acquired directly from an expert (see Figure 4). In addition, correlations between descriptors suggest possible rules for A3, although those can be more adequately acquired by KBG (see below).

KBG   KBG (Bisson, 1992) is an ML clustering and generalisation tool. It can either take unclassified examples and cluster them according to a particular, flexible metric, or take classified examples and induce discrimination rules. In both cases, it can also use background knowledge in the form of rules to complete example descriptions. An interesting feature of KBG is that its learning examples and output rules are expressed in (restricted) first-order logic, which means in particular that all the examples need not be represented by the same descriptors.

In our example, KBG is used to infer rules for A3: given a small number of complete dish descriptions, it finds correlations between descriptors that can later be used to complete new (incomplete) descriptions.

It can also be used to infer control rules for A5. In this case, a learning example is a set of constraints and an indication of which one should be dropped. Since constraints are complex objects that cannot be represented as attribute/value pairs, KBG's first order representation is very suitable for this task.

Finally, its clustering and concept formation ability can be used to select useful predefined constraints (A4). Since such constraints are provided only for user convenience, it is useful to detect patterns that occur frequently in user-defined constraints, and add them to the set of predefined constraints. KBG can help with this pattern detection.

APT   APT (Nédellec and Causse, 1992) uses a combination of KE and ML techniques to acquire problem solving rules. It starts with a domain theory, in the form of a semantic network, and possibly an initial set of rules. When it cannot solve a problem with its rules, it asks the user for a particular solution, then uses the domain theory to generalise it. The user is constantly requested to validate the rules generated by the system, and can extend the domain theory if necessary to enable APT to infer correct generalised rules.

When a limited domain model is available, APT can be used as a KE tool to acquire rules and enhance the model. When an important set of rules is available, it can be seen as an interactive KBR tool. These capabilities, together with its rich knowledge representation (semantic network) make it suitable to acquire analysis rules (B3), as well as recipes (C2) that can be regarded as problem decomposition rules.

STALKER   STALKER is an efficient automatic knowledge base refinement tool (Carbonara and Sleeman, 1996). Given a set of Prolog-like rules, and an example incorrectly classified by these rules, it considers many possible remedies (generalising or specialising rule premises, re-ordering rules, adding new rules, etc.), tests them against known cases and implements the most successful ones. It occasionally consults an expert to validate its recommendations. In our prototype, STALKER uses examples of menus commented on by an expert to refine the ``comment'' rules (B3).

Table 1 summarises the relationships between MUSKRAT's problem solvers, KA tools and knowledge bases.
 

KB        Required By                   Acquired By
A1        Constraint Sat., Analysis     Grid
A2, B2    Constraint Sat., Analysis     Grid
A3        Constraint Sat., Analysis     KBG, Grid
A4        Constraint Sat.               KBG
A5        Constraint Sat.               KBG
B3        Analysis                      APT, STALKER
C2        Scheduling                    APT
C3        Scheduling                    (no tool)
Table 1: The Relationships Between MUSKRAT's problem solvers, KA tools and Knowledge Bases
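The relationships in Table 1 lend themselves to a simple lookup structure. The following sketch encodes the table as a Python mapping, assuming that the row for A2 and B2 applies to each of the two knowledge bases separately; the function names are hypothetical and are not part of MUSKRAT.

```python
# Table 1 as a mapping: KB -> (problem solvers requiring it, KA tools acquiring it).
TABLE_1 = {
    "A1": ({"Constraint Sat.", "Analysis"}, {"Grid"}),
    "A2": ({"Constraint Sat.", "Analysis"}, {"Grid"}),
    "B2": ({"Constraint Sat.", "Analysis"}, {"Grid"}),
    "A3": ({"Constraint Sat.", "Analysis"}, {"KBG", "Grid"}),
    "A4": ({"Constraint Sat."}, {"KBG"}),
    "A5": ({"Constraint Sat."}, {"KBG"}),
    "B3": ({"Analysis"}, {"APT", "STALKER"}),
    "C2": ({"Scheduling"}, {"APT"}),
    "C3": ({"Scheduling"}, set()),  # Table 1 lists no tool for C3
}


def tools_for(solver):
    """Map each KB required by `solver` to the KA tools that can acquire it."""
    return {kb: tools for kb, (solvers, tools) in TABLE_1.items() if solver in solvers}


def kbs_without_tool():
    """List the KBs for which Table 1 names no acquisition tool."""
    return sorted(kb for kb, (_, tools) in TABLE_1.items() if not tools)
```

For instance, `tools_for("Scheduling")` shows that C2 can be acquired with APT, while C3 has no supporting tool and would have to be supplied by other means.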

5.1 The Advisor

The advisor employs means-ends analysis to identify the knowledge bases and knowledge acquisition tools which could be used to enable a given problem solver to run. We believe that a suitable approach is the steady refinement of a knowledge-level model of problem solvers and KA Tools. We therefore devised a simple model to demonstrate the application of means-ends analysis to knowledge acquisition in our chosen domain. The model contains the following features:

As an example, consider STALKER. It is an instance-of a KB Refinement Tool, and therefore also is-a KA Tool. It is part-of MUSKRAT, requires a Rule Base as its input, and also outputs a Rule Base. In our model, all relations are binary, and can also be supplemented with cardinality restrictions. For example, when a task is split into subtasks, each subtask has at most one `parent', but can have many `children'.
Figure 5: A simple taxonomy of software components
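One way to picture such a model is as a set of (subject, relation, object) triples. The sketch below is an illustration under assumptions, not the representation used by the advisor: the STALKER facts are taken from the description above, whereas the `subtask-of` relation name, the task names, and both helper functions are hypothetical.

```python
from collections import defaultdict

# Binary relations stored as (subject, relation, object) triples.
FACTS = {
    ("STALKER", "instance-of", "KB Refinement Tool"),
    ("KB Refinement Tool", "is-a", "KA Tool"),
    ("STALKER", "part-of", "MUSKRAT"),
    ("STALKER", "requires", "Rule Base"),
    ("STALKER", "outputs", "Rule Base"),
}

# Hypothetical task decomposition in which one subtask has two parents.
TASK_FACTS = {
    ("choose dishes", "subtask-of", "plan menu"),
    ("check nutrients", "subtask-of", "plan menu"),
    ("check nutrients", "subtask-of", "analyse menu"),
}


def kinds_of(entity, facts):
    """Collect every class reachable via instance-of and then is-a links."""
    found = set()
    frontier = [o for s, r, o in facts
                if s == entity and r in ("instance-of", "is-a")]
    while frontier:
        cls = frontier.pop()
        if cls not in found:
            found.add(cls)
            frontier.extend(o for s, r, o in facts if s == cls and r == "is-a")
    return found


def violations(facts, relation, max_per_subject):
    """Return subjects related to more objects than the cardinality allows."""
    objects = defaultdict(set)
    for s, r, o in facts:
        if r == relation:
            objects[s].add(o)
    return sorted(s for s, objs in objects.items() if len(objs) > max_per_subject)
```

Here `kinds_of("STALKER", FACTS)` recovers that STALKER is both a KB Refinement Tool and a KA Tool, and `violations(TASK_FACTS, "subtask-of", 1)` flags the subtask that breaks the "at most one parent" restriction.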

The advisor can currently perform two basic actions, which we call describe and run:

Although the current model is somewhat limited in its scope and depth, we have demonstrated through the run function that means-ends analysis can be achieved by this approach. A later version of the advisor will contain more detailed knowledge about the nature of problem solvers and KA Tools, and perform more content-based checks on the knowledge bases before assuming their suitability.
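To make the idea of means-ends analysis over tool descriptions concrete, the following sketch works backwards from a required knowledge base to the sources that are already available. It is a simplified backward-chaining reading of the approach and does not reproduce the advisor's run function. The tool descriptions are assumptions: APT's inputs follow its description in this paper (a domain theory and a user), STALKER's follow the requires/outputs example above, and the names `TOOLS`, `plan`, and the source labels are hypothetical.

```python
# Tool name -> (inputs it requires, KBs it can output); assumed descriptions.
TOOLS = {
    "APT": (["domain theory", "human expert"], {"B3", "C2"}),
    "STALKER": (["B3", "commented examples"], {"B3"}),
}


def plan(goal, available, tools, seen=frozenset()):
    """Return a list of steps that yields `goal`, or None if no route is known."""
    if goal in available:
        return [f"use available {goal}"]
    if goal in seen:
        return None  # avoid looping on tools that consume what they produce
    for name, (inputs, outputs) in sorted(tools.items()):
        if goal not in outputs:
            continue
        steps = []
        for needed in inputs:
            sub_steps = plan(needed, available, tools, seen | {goal})
            if sub_steps is None:
                break  # this tool cannot be run; try the next one
            steps += sub_steps
        else:
            return steps + [f"run {name} to acquire {goal}"]
    return None


AVAILABLE = {"domain theory", "human expert"}
C2_PLAN = plan("C2", AVAILABLE, TOOLS)
C3_PLAN = plan("C3", AVAILABLE, TOOLS)
```

With a domain theory and an expert at hand, the sketch proposes running APT to obtain C2, but returns `None` for C3, since no described tool outputs it. The `seen` argument guards against the cycle that arises because a refinement tool both requires and outputs a rule base.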

6. SUMMARY AND FUTURE WORK

We have presented the outline of a framework that allows the integration of independent problem solving and knowledge acquisition tools. Integration is achieved through knowledge-level descriptions of the tools (which are used to provide advice and guidance), and a uniform representation of knowledge (which encourages sharing and reuse). The acquisition of knowledge is driven by the requirements of problem solvers rather than by the data or knowledge which is already available. We call this goal-driven knowledge acquisition. This contrasts, for example, with the approach of PROTÉGÉ-II (Puerta et al., 1992), which constructs knowledge acquisition tools, and modifies problem solving methods.

Our work is currently focusing on the following two issues.

By addressing the above issues, we will be able to develop a minimal working MUSKRAT prototype. This will not only serve to illustrate the MUSKRAT ideas, but also form the basis for further research into the integration of knowledge acquisition and problem solving. In the following, we indicate some of the possible future research directions which could be pursued.

Since the MUSKRAT prototype is being implemented as a client-server system, there are opportunities for enhancing the network architecture. For example, the KA Selector/Advisor could be implemented as a single broker agent which sends interested parties to one of a number of distributed MUSKRAT servers. This approach may lead to both improved support for distributed knowledge acquisition, which is particularly important for corporate intranets, and to easier access to problem solvers, together with their associated knowledge.

Whenever an existing knowledge base could be modified to meet a problem solver's requirements, the necessary transformations should be computationally identified and supported. A possible approach to this task is to carry out human-based empirical studies in which each subject is provided with a problem solver and a small set of knowledge bases which are unsuitable for immediate use by that problem solver. The subject's task is to transform the available knowledge into knowledge which is suitable for the problem solver, if necessary with the aid of KA Tools. Observation of the techniques which human subjects use may yield useful information as to how the task can be supported computationally.

A challenging goal for future research would be to enhance the MUSKRAT system such that the knowledge-level descriptions of problem solvers and KA tools are automatically generated (or refined) by letting the system perform its own experiments. This opens interesting perspectives in the field of autonomous multistrategy learning systems.

Acknowledgements   This work is financially supported by an EPSRC research studentship. The Machine Learning Toolbox Project (ESPRIT project 2154) provided not only inspiration, but also an implementational foundation for some of our work. We are also grateful to the anonymous referees who provided very useful comments on a previous version of this paper.

7. REFERENCES

Bisson, G., (1992), ``Learning in FOL with a Similarity Measure'', in Proceedings of the 10th National Conference on Artificial Intelligence (AAAI-92), pp. 82-87.

Boose, J. H., (1990), ``Uses of Repertory Grid-Centered Knowledge Acquisition Tools for Knowledge-Based Systems'', in Boose, J., Gaines, B., (Eds.), Foundations of Knowledge Acquisition, Knowledge-Based Systems Book Series, Volume 4, pp. 61-83, London: Academic Press.

Brachman, R. J., (1979), ``On the Epistemological Status of Semantic Networks'', in Associative Networks: Representation and Use of Knowledge by Computers, Findler, N. V., (Ed.), New York: Academic Press, pp. 3-50.

Carbonara, L., Sleeman, D., (1996), ``Improving the Efficiency of Knowledge Base Refinement'', in Proceedings of ICML 96, Bari, Italy, pp. 78-86.

Carbonell, J. G., Knoblock, C. A., and Minton, S., (1991), ``Prodigy: An integrated architecture for planning and learning'', in (VanLehn, 1988), pp. 241-278.

Craw, S., Sleeman, D., Graner, N., Rissakis, M., and Sharma, S., (1992), ``CONSULTANT: Providing advice for the Machine Learning Toolbox'', in Proceedings of the 1992 BCS Expert Systems Conference, Bramer, M. (Ed.), Cambridge University Press.

Fensel, D., Groenboom, R., (1997), ``Specifying Knowledge-Based Systems with Reusable Components'', in Proceedings of the 9th International Conference on Software Engineering & Knowledge Engineering (SEKE-97), Madrid, Spain.

Gaines, B., Shaw, M. L. G., (1997), ``Knowledge acquisition, modelling and inference through the World Wide Web'', International Journal of Human-Computer Studies, 46, pp. 729-759.

Ganascia, J-G., Thomas, J., and Laublet, P., (1993), ``Integrating models of knowledge and machine learning'', in Machine Learning: ECML-93, Brazdil, P. B. (Ed.), Springer-Verlag.

Graner, N., Sleeman, D., (1993), ``MUSKRAT: a Multistrategy Knowledge Refinement and Acquisition Toolbox'', in Proceedings of the Second International Workshop on Multistrategy Learning, Michalski, R. S., Tecuci, G., (Eds.), pp. 107-119.

Gray, P. M. D., Preece, A., Fiddian, N. J., Gray, W. A., Bench-Capon, T. J. M., Shave, M. J. R., Azarmi, N., Wiegand, M., Ashwell, M., Beer, M., Cui, Z., Diaz, B., Embury, S.M., Hui, K., Jones, A. C., Jones, D. M., Kemp, G. J. L., Lawson, E. W., Lunn, K., Marti, P., Shao, J., and Visser, P. R. S., (1997), ``KRAFT: Knowledge Fusion from Distributed Databases and Knowledge Bases'', Conference on Database and Expert System Applications (DEXA '97), Toulouse, France.

Kelly, G. A., (1955), ``The Psychology of Personal Constructs'', Norton, New York.

Kodratoff, Y., Sleeman, D., Uszynski, M., Causse, K., Craw, S., (1992), ``Building a Machine Learning Toolbox'', in Enhancing the Knowledge Engineering Process, Steels, L., Lepape, B., (Eds.), North-Holland, Elsevier Science Publishers, pp. 81-108.

Laird, J., Hucka, M., Huffman, S., and Rosenbloom, P., (1991), ``An analysis of Soar as an integrated architecture'', in SIGART Bulletin Special Section on Integrated Cognitive Architectures, 2 (4), pp. 98-103.

Marling, C. R., Sterling, L. S., (1996), ``Designing Nutritional Menus Using Case-Based and Rule-Based Reasoning'', in Artificial Intelligence in Design `96, Kluwer Academic Publishers.

Michalski, R. S., and Tecuci, G. (Eds.), (1991), Proceedings of the First International Workshop on Multistrategy Learning (MSL-91), George Mason University, Fairfax, VA.

Mitchell, T. M., Allen, J., Chalasani, P., Cheng, J., Etzioni, O., Ringuette, M., and Schlimmer, J.C., (1991), ``Theo: A framework for self-improving systems'' in (VanLehn, 1988), pp. 323-355.

Morik, K., Causse, K., and Boswell, R., (1991), ``A common knowledge representation integrating learning tools'', in (Michalski and Tecuci, 1991), pp. 81-91.

Nédellec, C., and Causse, K., (1992), ``Knowledge refinement using Knowledge Acquisition and Machine Learning Methods'', in Proceedings of EKAW-92, Springer Verlag.

Newell, A., (1982), ``The Knowledge Level'', Artificial Intelligence 18(1), pp. 87-127.

Puerta, A., Egar, J., Tu, S., Musen, M., (1992), ``A Multiple-Method Knowledge-Acquisition Shell for the Automatic Generation of Knowledge-Acquisition Tools'', Knowledge Acquisition, 4, pp. 171-196.

Reichgelt, H., and Shadbolt, N., (1992), ``ProtoKEW: A knowledge-based system for knowledge acquisition'', in Artificial Intelligence, Sleeman, D, and Bernsen, NO (Eds.), Research Directions in Cognitive Science: European Perspectives, volume 6, Lawrence Erlbaum, Hove, UK.

SIGART Bulletin, (1991), Special section on integrated cognitive architectures, 2(4), Aug. 1991.

Sleeman, D., Rissakis, M., Craw S., Graner N., Sharma S., (1995), ``Consultant-2: Pre- and Post-processing of Machine Learning Applications'', International Journal of Human-Computer Studies, 43, pp. 43-63.

Sundermeyer, K., (1991), ``Knowledge-Based Systems: Terminology and References'', Bibliographisches Institut Wissenschaftsverlag, Mannheim, Germany.

van Heijst, G., Terpstra, P., Wielinga, B., and Shadbolt, N., (1992), ``Using generalised directive models in knowledge acquisition'', in Proceedings of EKAW-92, Springer Verlag.

VanLehn, K. (Ed.), (1988), ``Architectures for Intelligence'', Proceedings of the 22nd Carnegie Mellon Symposium on Cognition, 1988, Lawrence Erlbaum, Hillsdale, NJ.

Visser, P. R. S., Jones, D. M., Bench-Capon, T. J. M., and Shave, M. J. R., (1997), ``An Analysis of Ontology Mismatches; Heterogeneity versus Interoperability'', AAAI 1997 Spring Symposium on Ontological Engineering, Stanford University, California, USA, pp. 164-172.

Wielinga, B. J., Schreiber, A. T., and Breuker, J. A., (1992), ``KADS: a modelling approach to knowledge engineering'', Knowledge Acquisition, 4(1), pp. 5-53.
 

End Notes (these correspond to footnotes in the paper version).

1. Note that we use the term KA to include KE, ML & KBR techniques. 
2. In this paper the knowledge level is characterised by its implementation independence, as advocated by the KADS methodology (Wielinga, Schreiber and Breuker, 1992). 
3. Or, if you like, a top level description of the kinds of information transformations which each of the tools can achieve. 
4. We use the phrase "Means-Ends Analysis" interchangeably with "Means-Ends Guidance". Strictly speaking, the guidance should be dependent on the analysis! 
5. In this case, it may be reasonable to suggest a separation of the knowledge base into two or more knowledge bases. 
6. Extended Backus-Naur Form. In this notation, angled brackets ('< >') denote non-terminal symbols, braces and the vertical bar ('{ ... | ... }') denote alternatives, an asterisk ('*') used as a postfix denotes zero or more occurrences of the preceding non-terminal symbol, and a plus sign ('+') denotes at least one occurrence of the same. 
7. A reference could be supplied, where possible. 
8. Personal Construct Psychology uses the term element for an instance and construct for a property. 
9. Most of the main program components are better placed at the server site, where they can be monitored and maintained. However, knowledge elicitation tools are sometimes better placed in the client program because of their interactive nature. 
10. A demo of our Repertory Grid Tool is at http://www.csd.abdn.ac.uk/~swhite/repgrid/repgrid.html 
11. We are not currently addressing the repair problem which arises in the event of mismatches. 
 
Last Modified: 07:18pm GMT, February 26, 1998