Evaluating a PSM specification language
in Adaptive Design:
The Autognostic Experiments

Eleni Stroulia (1) and Ashok K. Goel (2)

1. Department of Computing Science,
615 General Services Building,
University of Alberta,
Edmonton, AB, CANADA T6G 2H1,
2. College of Computing,
Georgia Institute of Technology,
Atlanta, GA 30332-0280, USA


In this paper, we report on our experience with the evaluation of a PSM modeling language and of an adaptive-design process based on this language, both implemented in the Autognostic system. Our objective in building Autognostic was to develop a system able, first, to accomplish a range of learning tasks and, second, to meet three performance-improvement requirements. As evaluation criteria, we defined the following: the feasibility and effectiveness of the process, the generality of the modeling language, and the realism of both the language and the process. To investigate the degree to which Autognostic's language and process meet these criteria, we have performed a series of experiments, which we describe in this paper.


Abstract specifications of processing mechanisms play an important role in supporting the knowledge-engineering process  [Clancey 1985, Clancey 1986, Steels 1990, Chandrasekaran 1989, Chandrasekaran and Johnson 1993]. Their original purpose was to provide a language in which the human expert could communicate his/her problem-solving knowledge, and the knowledge engineer could identify the ``standard'' tasks and mechanisms underlying expert performance in a variety of domains. An additional objective was to develop formal representations of these specifications so that a computer program could interpret them and, thus, mimic human expert performance. To date, a variety of different languages have been proposed for specifying PSMs (Problem-Solving Methods) and, associated with them, methods for different tasks in the knowledge-engineering process.

The design of the ``expert'' knowledge-based system is one such important task. Many research efforts have focussed on the problems of designing knowledge-based systems by configuring abstract, reusable PSMs and of verifying the configuration [Fensel and Schoenegge 1997, Benjamins et al. 1996, Wielinga and Breuker 1986, Wielinga et al. 1992]. A related design problem is to adapt an existing system into another one with a similar yet different behavior. This problem may arise naturally in several contexts. The behavioral requirements on any system are bound to evolve when it is deployed in a real-world environment over a long time. Alternatively, a system may be re-deployed in a new environment, in which it may face requirements slightly different from the ones imposed on its original implementation. In these situations, adaptive design becomes extremely important.

In our work, we have focussed on designing (i) a process for the task of adaptive design, and (ii) a PSM modeling language for capturing the specific problem-solving mechanism and domain knowledge of a system, in a manner that can effectively support this task. The adaptation process we have developed consists of four tasks: (i) monitoring the system's problem-solving behavior, (ii) assigning blame when it fails, i.e., identifying the cause of its failure, (iii) appropriately modifying the system to eliminate the cause, and (iv) evaluating the usefulness of the modification. This process is based on a Structure-Behavior-Function model of the system's problem-solving Tasks, Methods, and Knowledge (its SBF-TMK model). Models in the SBF-TMK language capture (i) the functional semantics of the system's tasks and domain operators, (ii) the compositional semantics of its problem-solving methods, which recursively synthesize the inferences drawn by the operators into the outputs of its overall task, (iii) the types of domain knowledge available to the system, and (iv) the ``causal'' interdependencies between (sub)tasks, methods, and domain knowledge. Autognostic is a ``shell'' system that implements this adaptive design process for systems modeled in terms of the SBF-TMK language.
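The four-task process above can be sketched as a failure-driven loop. The following is a toy illustration, not Autognostic's implementation; the ToySystem class and all names in it are hypothetical, standing in for a modeled system and its SBF-TMK-based reasoning:

```python
def adaptive_design_cycle(system, problem, feedback, max_cycles=10):
    """Failure-driven adaptation: monitor, assign blame, modify, verify."""
    for _ in range(max_cycles):
        solution, trace = system.solve(problem)        # (i) monitored problem solving
        if solution == feedback:                       # desired behavior produced
            return True                                # (iv) modification verified
        causes = system.assign_blame(trace, feedback)  # (ii) blame assignment
        if not causes:
            return False                               # no further cause identifiable
        system.modify(causes[0])                       # (iii) modification
    return False

class ToySystem:
    """A trivial 'problem solver' whose domain knowledge is a lookup table."""
    def __init__(self, knowledge):
        self.knowledge = dict(knowledge)
    def solve(self, problem):
        answer = self.knowledge.get(problem)
        return answer, [("lookup", problem, answer)]
    def assign_blame(self, trace, feedback):
        # Blame the knowledge element behind each incorrect inference step.
        return [(step[1], feedback) for step in trace if step[2] != feedback]
    def modify(self, cause):
        problem, desired = cause
        self.knowledge[problem] = desired              # assimilate the feedback

router_like = ToySystem({"a->b": "long-path"})
assert adaptive_design_cycle(router_like, "a->b", "short-path")
```

In Autognostic itself, of course, blame assignment and modification operate over the system's SBF-TMK model rather than a lookup table; the sketch only shows the control structure of the cycle.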

In this paper, we report on our experience with the evaluation of Autognostic's PSM modeling language and adaptive-design process. Our objective in building Autognostic was to develop a system able, first, to accomplish a range of learning tasks and, second, to meet three performance-improvement requirements. Having implemented a first version of Autognostic, we defined as evaluation criteria the following: the feasibility and effectiveness of its adaptive-design process, the generality of its modeling language, and the realism of both the language and the process.

To evaluate the generality of Autognostic's modeling language and process, we have integrated Autognostic with four systems. With each new integration, different experiments were conducted to investigate some of the other evaluation criteria. The first system to be integrated with Autognostic was Router [Goel and Callantine 1992], a path-planning system. The first experiment with Router was intended to evaluate the feasibility of accomplishing the desired learning tasks based on a system's SBF-TMK model. Subsequently, a set of experiments was designed to evaluate the effectiveness of the method, that is, the extent to which it was able to accomplish the three types of desired performance improvement: efficiency improvement, quality-of-solution improvement, and increase of the class of solvable problems.

The next system integrated with Autognostic was Kritik2 [Goel and Stroulia 1997], a design system. The experiments conducted with Kritik2 aimed at testing whether the different types of adaptations that Autognostic's learning process was capable of performing were useful across application domains. The underlying goal was to investigate the generality of Autognostic's language and process.

Next came the integration with Reflecs, a robot developed in the context of the AuRA architecture [Arkin 1989]. Experimentation with Reflecs aimed at evaluating the realism of Autognostic's knowledge and process assumptions, since robots are systems naturally in need of adaptive redesign. In fact, the Reflecs experiment, although performed in a simulation environment, was designed around an AAAI robotics-competition scenario. Finally, in the same spirit, there followed the integration with David [Prassler et al. 1996], a real robot.

The rest of this paper presents SBF-TMK models and the failure-driven process for adaptive design based on them, and then discusses the different Autognostic experiments. Due to space limitations, the discussion of the experiments cannot be very detailed; however, to the extent possible, we have tried to outline the rationale of the experiments, their design, and their results. For the reader interested in the particulars of each experiment, further references are given. In Section 2 we describe the content and the representation of the SBF-TMK models, along with their assumptions. In Section 3, we briefly describe Autognostic's reflective learning process. In Sections 4, 5, 6.1 and 6.2 we describe each of the specific experiments. Finally, Section 7 summarizes our experiments and draws some conclusions with regard to the evaluation of PSM languages in general.


SBF-TMK Models

The representation language of SBF-TMK models of problem solving derives from a language originally developed for describing how physical devices work [Goel 1989]. Its semantic primitives are based on the task-structure theory of problem solving [Chandrasekaran 1989].

Content Theory: Task Structures and Domain Meta-model

In the task-structure framework, a problem-solving mechanism is analyzed in terms of the tasks it accomplishes and the methods it employs to do so. A task (first row in Table  1) is characterized in terms of the types of information it consumes as input, the types of information it produces as output, and its functional semantics, i.e., a partial specification of the nature of the transformation it accomplishes between the two. A task can be accomplished by one (or more) methods, each of which decomposes it into a set of simpler subtasks. The system's operators, i.e., its primitive design elements, constitute the elementary building blocks for all the tasks that it is capable of accomplishing. In addition to its functional semantics, the information-transformation function of a task can be characterized by specifying this task as an instance of another prototype task. Then this task inherits the prototype's functional semantics, and can be accomplished by all methods applicable to the prototype. The concept of prototype tasks derives from the concept of generic tasks in the task-structures theory.

A method (second row in Table 1) is characterized by the subtasks into which it decomposes the task to which it is applied, and by the control it exercises over their processing. Thus, a system's internal problem-solving mechanism is specified by its task structure, i.e., the recursive decomposition of its overall task, in terms of methods and subtasks, into a set of elementary, leaf tasks, directly accomplished by the system's domain operators.
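As a concrete illustration, the task-structure elements just described might be represented as follows. This is a minimal sketch in Python; the field and function names are chosen for illustration and are not taken from Autognostic:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    inputs: list                                    # types of information consumed
    outputs: list                                   # types of information produced
    semantics: list = field(default_factory=list)   # relations between inputs and outputs
    methods: list = field(default_factory=list)     # alternative decompositions (non-leaf tasks)
    operator: object = None                         # primitive domain operator (leaf tasks)
    prototypes: list = field(default_factory=list)  # prototype tasks this task instantiates

@dataclass
class Method:
    name: str
    task: str            # the task this method decomposes
    conditions: list     # applicability conditions on the available information
    subtasks: list       # the simpler subtasks, in their order of processing

def leaf_tasks(task):
    """The leaf tasks of a task structure: those accomplished directly by an operator."""
    if task.operator is not None:
        return [task]
    return [leaf for m in task.methods for st in m.subtasks for leaf in leaf_tasks(st)]

# e.g. a two-level task structure for a hypothetical 'solve' task
retrieve = Task("retrieve", ["problem"], ["case"], operator=lambda: None)
adapt = Task("adapt", ["case"], ["solution"], operator=lambda: None)
solve = Task("solve", ["problem"], ["solution"],
             methods=[Method("case-based", "solve", [], [retrieve, adapt])])
assert [t.name for t in leaf_tasks(solve)] == ["retrieve", "adapt"]
```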

Notice that specifying a task as an instance of one (or more) prototype(s) is different from specifying a variety of methods for it. Both have a similar effect on the control of processing, in that they specify alternative processing paths to be distinguished at run-time. Methods, however, are specific to the task structure in the context of which they are defined, whereas prototypes can be shared by, and instantiated in, different task structures. In that sense, prototypes encapsulate PSMs reusable across tasks, albeit in a single domain, whereas methods encapsulate mechanisms specifically tailored to the task to which they apply, and they may be task-dependent and non-reusable. An interesting consequence of this difference is that modifications deemed useful in the task decomposition of a prototype, when it is instantiated in a particular task structure, can, in principle, be transferred to other task structures as well. However, whether or not such a transfer is beneficial to all instances of the prototype has to be evaluated in the context of each particular instantiation. Prototypes are similar to the generic tasks in the task-structures literature and the PSMs in the KADS literature, in that they are reusable, but differ from them in that they are not domain-independent; the specification of a prototype refers to specific types of domain knowledge that are required for it to be accomplished.

In an extension of the task-structure framework, SBF-TMK models also capture a meta-model of the system's domain, in terms of domain concepts (row four of Table 1), relations applicable among them (row five of Table 1), and higher-order constraints. Together, the system's task structure, in terms of tasks and methods, and its domain meta-model, in terms of concepts, relations, and constraints, constitute its SBF-TMK model. Note that there are two interdependencies between these two aspects of the system's model. First, the types of information (row three of Table 1) that flow through the system's task structure, as inputs and outputs of tasks, are characterized as instances of some type of object known to the system. Second, the functional semantics of the tasks are expressed in terms of domain relations applicable to the objects of which their inputs and outputs are instances.

Representation Language

SBF-TMK models are hierarchical, nondeterministic, state-transition models. Based on the SBF models for physical devices, which are substance-flow models, they represent a system's internal processing as information flow. They specify a task as an active component transforming an input information state into an output one. This transformation is annotated by the task's functional semantics, conditions, methods, prototype, and operator.

The tasks' input and output states are ``snapshots'' of the system's information flow. Each piece of information flowing through the system's internal processing is identified as an instance of a domain concept known to the system. The system's known domain concepts are organized in a hierarchical taxonomy. A concept (row four in Table  1) is characterized by its attributes, a definition of what constitutes identity among its instances, and the range of its possible values. A domain relation (row five in Table  1) is specified by the concepts it applies to, a pointer to its evaluation (truth table or predicate), and potentially a predicate evaluating its inverse relation. Finally, a domain constraint is specified by pointers to the relations to which it refers, and the relation it imposes between them.
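A domain relation as specified above (the concepts it applies to, its evaluation predicate, and an optional inverse predicate) might be sketched as follows. The Relation class and the example relation are illustrative assumptions, not Autognostic's actual representation:

```python
class Relation:
    """A domain relation: the concepts it applies to, plus its evaluation."""
    def __init__(self, name, concepts, predicate, inverse=None):
        self.name = name
        self.concepts = concepts    # the concepts the relation applies to
        self.predicate = predicate  # evaluates whether the relation holds
        self.inverse = inverse      # optional predicate for the inverse relation

    def holds(self, *instances):
        return self.predicate(*instances)

# e.g. a hypothetical 'shorter-than' relation between two path instances
shorter_than = Relation(
    "shorter-than", ("path", "path"),
    predicate=lambda p1, p2: len(p1) < len(p2),
    inverse=lambda p1, p2: len(p1) > len(p2),
)
assert shorter_than.holds(["a", "b"], ["a", "c", "b"])
```

A relation backed by a truth table rather than a predicate would simply supply a table lookup as its `predicate` function.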

A method in the SBF-TMK language is specified as a more detailed information flow network explaining a complex task's information transformation. The state transitions in this network are carried out by the method's subtasks. The whole network is indexed by the task to which the method is applicable. This index is annotated by a set of conditions on the information available at the time the method is invoked, specifying the situations in which the method can be applied.

Table 1: Elements of the SBF-TMK language.


To better explain the SBF-TMK models, let us explicitly present some of their underlying assumptions:

1. The domain relations used to specify the tasks' functional semantics are assumed to also specify the pieces of domain knowledge on which the inferences used to accomplish the task are based. That is, if the specification of a task t's functional semantics includes a relation R(i, o) between an input i and an output o, then it is assumed that the operator accomplishing the task (or, if it is not a leaf task, one of the operators in the task's strategy) uses the system's domain knowledge about R, i.e., op(t) uses tt(R) or op(t) uses p(R), where tt(R) and p(R) denote the relation's truth table and predicate, respectively.

2. The prototype of a task is not a domain-independent task, in the sense that generic tasks are. Instead, it is a ``generalized'' domain-specific task with (i) an input information state consisting of information types that are a superset of its instantiations' input states, (ii) an output information state consisting of information types that are a superset of its instantiations' output states, and (iii) a functional semantics which is a subset of its instantiations' functional semantics.

3. A method is specific to the single task to which it is applied. That is, the state-transition network into which a method refines the state transition of a task begins at the task's input information state and ends at the task's output information state.
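Assumption 2 above can be expressed as a simple check. This is a sketch; the function name and the attribute names (inputs, outputs, semantics) are chosen for illustration:

```python
from types import SimpleNamespace as NS

def valid_instantiation(prototype, task):
    """Assumption 2: a prototype generalizes its instantiations --
    its input and output information types are supersets of the task's,
    and its functional semantics is a subset of the task's."""
    return (set(task.inputs) <= set(prototype.inputs)
            and set(task.outputs) <= set(prototype.outputs)
            and set(prototype.semantics) <= set(task.semantics))

# e.g. a hypothetical path-planning prototype and one instantiation of it
proto = NS(inputs={"initial", "goal", "map"}, outputs={"path", "cost"},
           semantics={"connects"})
inst = NS(inputs={"initial", "goal"}, outputs={"path"},
          semantics={"connects", "shortest"})
assert valid_instantiation(proto, inst)
```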


Autognostic's Reflective Learning Process

Autognostic is a shell which, given an SBF-TMK specification of a system, can monitor the system's behavior and appropriately adapt it when it fails.

Task Monitoring and Failure Detection

A mission, i.e., a high-level task, is accomplished by a pre-order traversal of the task structure into which it is decomposed. When Autognostic visits a non-leaf task, it evaluates the applicability conditions of the task's methods in the current state of the system's environment. If multiple methods are applicable, Autognostic selects one at random and keeps the rest as alternatives, in case the selected one fails. Then, it updates its task agenda with the subtasks of the selected method. If a task is specified as an instantiation of some prototype(s), Autognostic evaluates the conditions under which these prototype tasks can be accomplished and selects one whose applicability conditions are met, while remembering the others as alternatives. Finally, when Autognostic reaches a leaf task, it proceeds to invoke the domain operator that accomplishes it. Once a task is accomplished, Autognostic proceeds to accomplish the next one in its agenda, until this agenda becomes empty.
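The traversal just described can be sketched as an agenda loop. This is a simplified illustration (failure recovery and prototype selection are omitted, and all names are hypothetical):

```python
import random
from types import SimpleNamespace as NS

def accomplish(mission, environment):
    """Pre-order traversal of a task structure, driven by an agenda of pending tasks."""
    agenda = [mission]
    while agenda:
        task = agenda.pop(0)
        if task.operator is not None:            # leaf task: invoke its domain operator
            task.operator(environment)
            continue
        applicable = [m for m in task.methods
                      if all(cond(environment) for cond in m.conditions)]
        if not applicable:
            raise RuntimeError("no applicable method for task " + task.name)
        chosen = random.choice(applicable)       # the rest remain as alternatives
        agenda = list(chosen.subtasks) + agenda  # subtasks precede the remaining agenda

# toy example: one mission, one method, two leaf subtasks
log = []
leaf = lambda n: NS(name=n, operator=lambda env: log.append(n), methods=[])
method = NS(name="go-and-grasp", conditions=[],
            subtasks=[leaf("plan-path"), leaf("follow-path")])
accomplish(NS(name="mission", operator=None, methods=[method]), {})
assert log == ["plan-path", "follow-path"]
```

Prepending the chosen method's subtasks to the agenda is what makes the traversal pre-order: a task's decomposition is fully pursued before the next sibling task is visited.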

A task's functional semantics specify how the task's output relates to its input; thus, when spawning a task with a particular input, Autognostic formulates expectations with respect to its output. As each task is completed, Autognostic evaluates the correctness of the task's output against its semantics, i.e., it verifies whether the actual output relates to the actual input as per the task's functional semantics. Failure to conform to the semantics enables Autognostic to recognize when a specific task has not delivered the information expected of it. In addition, as we will see in the Reflecs example, Autognostic is able to recognize problematic task interactions that can lead to lack of progress.

Task-Structure Repair: Blame Assignment, Modification, Verification

Autognostic is able to deal with failures it recognizes itself, as violations of a task's semantics, and with failures pointed out to it by the environment, as behavior that is unacceptable to the user in spite of being valid with respect to the tasks' functional semantics. In either case, it proceeds to assign blame for the failure to some of the system's tasks and/or domain knowledge. The blame-assignment process [Stroulia and Goel 1996] is a successively focussed examination of the tasks involved in the production of the erroneous behavior, beginning from the first task whose semantics were violated, in the former type of failure, or from the overall system task, in the latter. Autognostic is able to identify errors, first, in the specification of the system's operators, that is, operators that are under- or over-specified with respect to the overall behavior desired of the system; second, in the organization of the task structure, that is, errors in the control of processing among the tasks; and third, in the system's domain knowledge, that is, incompleteness or incorrectness of this knowledge [Stroulia 1995].

Once a (set of) potential cause(s) has been identified by Autognostic's blame-assignment process [Stroulia and Goel 1996], Autognostic decides to address the gravest one. To this end, it may modify the system's domain knowledge, or it may redesign the system's task structure, either by modifying the control among the tasks or by re-specifying its domain operators [Stroulia and Goel 1997].

Finally, once the modification is complete, Autognostic proceeds to verify the usefulness of the modification by executing and monitoring the failed mission once again. If this time the mission is successfully completed, then the modification is considered correct. Otherwise, Autognostic starts a new cycle of blame assignment, modification and verification until an effective modification is completed, or no other potential cause can be identified.


The Router Experiments

Router [Goel and Callantine 1992] was the first system integrated with Autognostic. It is a sophisticated and yet simple path planner that uses multiple reasoning strategies to accomplish its task, and these strategies, in turn, use several types of knowledge. Its knowledge about its domain includes a hierarchically organized model of its micro-world and an episodic memory of previous path-planning problems and their solutions. It has itself some simple learning capabilities, independent of Autognostic, since it extends its path memory by storing the result of each new problem-solving episode. On the other hand, its domain is quite simple, since it consists of a few types of domain concepts and a few relations between them. This makes it possible to build a detailed SBF-TMK model of both Router's task structure and its domain knowledge.


Because of its rich and yet not overly complicated task structure, Router allowed experimentation with several interesting scenarios that called for different learning tasks. A set of individual problem-solving-and-learning scenarios provided the basis for the initial feasibility testing. In each of these scenarios, Autognostic-on-Router was presented with a problem, failed to produce the desired solution, which was subsequently given as feedback, and adapted itself in a way that enabled it to produce the feedback. For each of Autognostic's learning tasks, i.e., types of modifications, a different scenario was designed. Together, all these scenarios established that a set of learning tasks could be accomplished with Autognostic's model-based learning process. In addition, these experiments demonstrated that each learning task resulted in some improvement to the system, since Router after learning could produce a desired solution that it could not produce before learning.


These individual experiments showed that Autognostic was capable of improving the system it was integrated with, in that it could enable this system to produce the desired feedback. However, they did not completely address the issue of Autognostic's effectiveness in improving a system's performance in terms of the quality of the solutions it produces, the efficiency of its problem-solving process, and the population of problems that the system can solve. The effects of performance-driven learning on the system's performance can be evaluated at several different levels of experience with problem solving. For example, even if there is relative improvement within a problem-solving-and-learning cycle, there is no guarantee that, in the long term, the performance of the problem solver will improve for the complete class of problems it addresses. Thus, there is a need for a separate evaluation of the effectiveness of a learning theory after a sequence of problem-solving-and-learning episodes.

To evaluate the effectiveness of Autognostic's learning process after a sequence of problem-solving-and-learning episodes, two different sets of experiments were conducted with Autognostic-on-Router. One was designed to investigate the long-term effects of reflection on the quality of the problem-solver's solutions, and the other was designed to investigate its effects on the efficiency of its process. A random sequence of 150 problems in Router's domain was generated, and Router, in its original condition, was presented with each one of them. Three different sequences of 40 problems, randomly selected from the original set of 150 problems, were generated. Each of these three sequences was presented to Autognostic-on-Router in the context of two experiments. The first one evaluated the effectiveness of Autognostic-on-Router's reflection process as a means of improving the quality of Router's solutions: for each of the training problems, if Autognostic-on-Router did not produce the shortest path, it was given this path as feedback. It then proceeded to adapt itself, if possible, to produce the feedback. The second evaluated the effectiveness of Autognostic-on-Router's reflection as a means of improving the efficiency of Router's reasoning; in this case, the feedback was the path that was fastest to produce, i.e., the path that required the fewest search steps. Note that, in order for Autognostic to meet a process requirement, in this case process-efficiency improvement, this requirement has to be expressed in terms of an output-quality attribute; otherwise, it cannot be captured in the feedback that Autognostic accepts. At the end of each of these 40-training-problem sequences, the modified Router was presented with the original 150-problem sequence. Its behavior after being redesigned by Autognostic was compared to its behavior before. The results of these experiments are reported in Tables 2 and 3.
These tables show how successful Autognostic was, i.e., in how many problems Router's behavior improved after learning, in how many problems it deteriorated, and what these results mean statistically. The very low value of the sign test indicates that, in all experiments, there was a statistically significant improvement.
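The sign test used here compares the number of problems that improved against the number that deteriorated, under the null hypothesis that the two outcomes are equally likely. A sketch of the two-sided binomial computation; the counts in the example are made up for illustration, not taken from Tables 2 and 3:

```python
from math import comb

def sign_test(improved, deteriorated):
    """Two-sided sign test: probability of a split at least this extreme
    under the null hypothesis of no improvement (ties are discarded)."""
    n = improved + deteriorated
    k = max(improved, deteriorated)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# e.g. 30 problems improved, 5 deteriorated, the rest unchanged
p = sign_test(30, 5)
assert p < 0.001  # a very low value: statistically significant improvement
```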

Table 2: Results on Quality-of-Solution performance improvement.

Table 3: Results on Process-Efficiency performance improvement.

In both of these sets of experiments, the class of problems that Router was able to solve after its adaptation by Autognostic increased with respect to the set of problems it could solve before. This was due to the new domain knowledge Router acquired through the assimilation of the feedback information. The results of the experiments with respect to the increase of the population of problems solved by Router are shown in Table 4.

Table 4: Results on Population-of-Solvable-Problems performance improvement.

Comments on the Router Experiment

Modeling Router in the SBF-TMK language made it obvious that, for any system, there are several different ``correct'' models that can be specified in SBF-TMK. Two models of the same system may differ along several dimensions. First, they may decompose the system differently, and therefore identify different subtasks and methods. Or they may decompose it in the same manner, but to a different level of detail. Or, alternatively, they may decompose it in exactly the same manner but specify the different elements (tasks and methods) in greater or less detail. Our experiments have shown that such differences in a system's potential models have an impact on Autognostic's ability to adapt the system in question.

For example, if the model is not precise, i.e., if the expected correct behavior of the problem-solver's tasks is poorly described (sparse or absent functional semantics), then the blame-assignment method may fail to evaluate whether the feedback solution is within the class of solutions that the system can produce, and it may also fail to suggest changes. Alternatively, if the model does not explain the problem-solver's subtasks in enough detail, i.e., if it only analyzes the process in terms of big, complex subtasks, then the blame-assignment method will fail to sufficiently localize the potential cause of the failure and, consequently, may fail to suggest operational modifications, since the elements under inspection will be large and complex. Finally, the blame-assignment task can only localize the cause of the failure to some of the identified elements; therefore, if the decomposition that the model captures does not identify the ``wrong'' element, blame assignment will fail.

Finally, the ``long'' Router experiments brought to the foreground the important issue of convergence, i.e., whether the learning process will identify a single good design or will keep on making modifications, perhaps undoing earlier ones in the process. Our inspection of the modifications actually performed during the six long experiments revealed that modifications of the system's task structure are rare events, whereas modifications of its domain knowledge are far more frequent. In addition, in all six experiments Autognostic modified the same elements of Router's task structure. This indicates that, given a particular design of a system, some elements are more flexible than others, and these are the elements that are bound to be modified when new requirements are imposed on the system's performance. Furthermore, the task-structure modifications occurred in different patterns in these six experiments; in the process-efficiency experiments they all occurred early on, whereas in the quality-of-solution experiments they were more evenly dispersed among the 40 training problems. An explanation for this phenomenon could be that different ``optimization'' criteria are more or less ``natural'' to the current system's design, and thus the redesign process towards ``optimizing'' them converges correspondingly faster or slower. This is consistent with these experiments, since at no point during the design of Router was optimality of paths (in any dimension) considered. These hypotheses, although plausible, should be explicitly investigated by new experiments.


The Kritik2 Experiments

Kritik2 is an adaptive design system [Stroulia et al. 1992, Goel and Stroulia 1997]. Although it performs a task quite different from Router's, Kritik2 too has multiple types of knowledge about its domain, including an episodic memory of cases of known devices, models which explain the functioning of these devices in terms of deep functional and causal knowledge, and semantic knowledge of the substances and the components that are available in its technological environment. Kritik2 uses a model-based method for the overall adaptive-design task it performs; however, it has several methods for modifying the known devices to accomplish new functions. Thus, Kritik2's task structure, too, is nondeterministic.

The aspect that made Kritik2 especially interesting as an experimental test-bed for Autognostic is the fact that it uses SBF models for capturing how devices work. These SBF models are based on a different ontology (components, substances, and the structural and functional relations among them) than Autognostic's SBF-TMK models (tasks, methods, and knowledge), since they describe different artifacts (physical devices instead of problem solvers), but they share essentially the same representation and organization scheme. Thus, the task of integrating Autognostic and Kritik2, in addition to evaluating Autognostic, is an interesting experiment because it enables a comparative analysis of the roles that SBF models can play in these two different domains of designed artifacts.


In the context of its integration with Kritik2, Autognostic's learning process was evaluated only with single problem-solving-and-learning episodes. Kritik2's domain is much more complex than Router's and, as a result, experimentation with long sequences of problems would require a massive infusion of knowledge into Kritik2, which is beyond the scope of this work. However, since we wanted the Autognostic-on-Kritik2 experiments to also add to the evidence for (or against) the generality of Autognostic's learning process, we carefully selected problems in which Kritik2's failure to produce the desired device would be caused by a variety of errors. Thus, the set of problems presented to Autognostic-on-Kritik2 caused Autognostic to exercise all the types of modifications it is able to perform. This also contributes some evidence that the range of learning problems addressed by Autognostic is realistic, in that Autognostic's modifications are not specific to Router's task and/or domain, but are relevant across tasks and domains.


The Robotics Experiments

Of course, these two systems constitute only two points in the space of systems in which Autognostic was intended to be of use. Simply continuing to integrate Autognostic with more systems, although it might provide additional evidence of its generality or identify its limits, was neither practically possible nor rational from an evaluation-methodology point of view. Therefore, we identified some dimensions along which it was important to evaluate Autognostic's generality. These were (a) the amount of knowledge that the system to be adapted has, (b) the performance requirements on this system, and, more importantly, (c) the degree to which adaptation was a natural part of the system's lifecycle. Based on these dimensions, we chose to evaluate Autognostic in the context of robotics. The reasons motivating this choice were the following. First, robots are systems that often face unforeseen changes in their environments and in the requirements on their behavior, because they live in real and complex worlds. Therefore, evaluating Autognostic in the context of robotics applications would test the realism of the assumptions underlying its knowledge and process. Second, most robots exhibit both deliberative and reactive behaviors; this range of behaviors imposes some constraints on the performance of a reflective learning process. Finally, from a pragmatic point of view, a robot, Reflecs, was readily available to us for experimentation. In contrast to the previous two systems integrated with Autognostic, which had been developed under the task-structures methodology [Chandrasekaran 1989], from which Autognostic's SBF-TMK models also derive, this robot had been developed in the context of the AuRA architecture [Arkin 1989]. It therefore provided a good test-bed for evaluating the usefulness of SBF-TMK modeling of an independently built system, since it did not share Autognostic's design and development principles.
Finally, to evaluate Autognostic in another point of this dimension, we conducted yet another experiment in robotics. This time, the robot was David, and its processing mechanism was explicitly developed to make use of Autognostic's monitoring capabilities. Thus, the David experiment was tailored to evaluate the usefulness of SBF-TMK modeling, in the context of a system built under the SBF-TMK methodology.



The first robot integrated with Autognostic was Reflecs, implemented in the context of the AuRA architecture  [Arkin 1989]. At the lowest level of the AuRA architecture, the robot's behavior is accomplished through the interaction of a set of perceptual and motor schemas. The perceptual schemas are linked to the robot's sensors to perceive some aspect of the environment at each point in time. Each motor schema is linked to one or more perceptual schemas, and produces a motion vector as a function of the sensory information they provide. The actual motion of the robot depends on the synthesis of the vectors produced by all the active motor schemas.
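This vector-synthesis scheme can be sketched in a few lines of Python (a minimal illustration, not the actual AuRA implementation; the two schema names below are our own):

```python
def synthesize(schemas, percepts):
    """Sum the motion vectors produced by all active motor schemas."""
    vx = vy = 0.0
    for schema in schemas:
        dx, dy = schema(percepts)
        vx += dx
        vy += dy
    return vx, vy

# Two illustrative (hypothetical) motor schemas:
def move_east(percepts):
    return (1.0, 0.0)

def drift_north(percepts):
    return (0.0, 0.5)

print(synthesize([move_east, drift_north], {}))  # (1.0, 0.5)
```

The robot's actual motion at each cycle is whatever vector falls out of this sum, which is what makes the behavior purely reactive.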

Reflecs participated in the AAAI-93 robotic competition, accomplishing the office-rearrangement task. For this task, the robot is placed in an office with several boxes scattered in it, and it has to move them into a designated area of the office. For each box it had to collect, Reflecs went to the designated location, planned its path towards the target box, reactively reached the box's location, grasped the box, and brought it back to the designated location, blindly following the same path. The experiment we discuss in this paper involves the third of these tasks, namely the task of reaching an individual box. The SBF-TMK model of the process is diagrammatically depicted in Figure  1.

Figure 1: Reflecs' task structure.

The figure depicts tasks as solid-line boxes, and prototypes as bold solid-line boxes. Flow of control is represented by bold, solid arrows. Flow of information is not explicitly represented in the diagram, although the model captures it through the specification of the tasks' inputs and outputs. Dashed boxes, linked to tasks by dashed arrows, represent methods, and they include the method's subtasks. Finally, ellipses, attached to task boxes, represent operators accomplishing the leaf tasks of the task structure. The conditions and semantics of some of the tasks are shown as annotations at the lower-right corner of the task box. The relations to which these annotations refer are part of the system's domain knowledge.

In this figure, we see that the get-to-box task is accomplished by a method which repeatedly invokes the step-in-get-to-box task. This task is accomplished by the interaction of four schemas. Reflecs uses the shaft-encoder reading and lasers reading perceptual schemas to infer its own position at each point in time and the positions of the static objects in its immediate environment. Two motor schemas, i.e., move-to-goal and static-object avoidance, are also active. The first uses the robot's current position and the position of the target box to produce a vector pushing the robot towards the box. The second uses the lasers' readings and the robot's current position to produce a vector repulsing the robot from the static objects around it. The actual motion of the robot is then decided by the synthesis of these two vectors, performed by the synthesis operator. The cycle stops when the shaft encoder shows that the robot has reached its goal, that is, when the robot is within a small radius of the target box.

An Experiment on Failure-Driven Self-Adaptation

When Reflecs (in a simulated experiment) was presented with this task, it was able to get close to the box, but then it started oscillating in small steps around it. This was due to the static-object avoidance schema, which repulsed it from the target box while the move-to-goal schema pushed it towards it.
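The failure can be reproduced in a small numeric sketch (our own simplification; the gains and the avoidance radius are hypothetical parameters): once the robot stands within the avoidance radius of the target box, the attractive and repulsive vectors oppose each other and their synthesis collapses to zero.

```python
import math

def move_to_goal(pos, goal, gain=1.0):
    # attractive unit vector, scaled by gain, pointing toward the goal
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    d = math.hypot(dx, dy)
    return (gain * dx / d, gain * dy / d)

def static_object_avoidance(pos, obj, radius=2.0, gain=1.0):
    # repulsive vector away from a static object inside the safety radius
    dx, dy = pos[0] - obj[0], pos[1] - obj[1]
    d = math.hypot(dx, dy)
    if d >= radius:
        return (0.0, 0.0)
    return (gain * dx / d, gain * dy / d)

# Close to the box -- which is itself treated as a static object -- the
# two vectors cancel, so the synthesized motion is zero and the robot
# oscillates instead of closing the final distance.
pos, box = (0.0, 0.0), (1.0, 0.0)
a = move_to_goal(pos, box)
r = static_object_avoidance(pos, box)
print((a[0] + r[0], a[1] + r[1]))  # (0.0, 0.0)
```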


Autognostic's monitoring process recognizes the lack of progress in Reflecs' oscillatory motion, and suspends the task execution to assign blame for it. The detection of the failure is made possible by the knowledge captured in the SBF-TMK model of the system's processing. First, based on the functional semantics of each task, Autognostic is able to evaluate whether the output of each step is correct, by testing whether it satisfies these semantics or not. The conditions of the step-in-get-to-box task specify that this task is invoked as long as Reflecs has not reached the box, and its semantics specify that the motion vector it produces as output should be of magnitude significantly greater than zero. These semantics fail as Reflecs gets close to the box.
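A minimal version of this semantics check might look as follows (our own condensation; the magnitude threshold and goal radius are hypothetical values):

```python
import math

EPS = 1e-3  # threshold below which a motion vector counts as "zero"

def reached_box(pos, box, radius=0.5):
    return math.hypot(box[0] - pos[0], box[1] - pos[1]) <= radius

def step_semantics_hold(pos, box, motion_vector):
    """Functional semantics of step-in-get-to-box: while the box has not
    been reached, the produced motion vector must be non-negligible."""
    if reached_box(pos, box):
        return True  # invocation condition no longer holds; nothing to check
    return math.hypot(*motion_vector) > EPS

assert step_semantics_hold((0, 0), (5, 0), (1.0, 0.0))
assert not step_semantics_hold((0, 0), (1.2, 0), (0.0, 0.0))  # failure detected
```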

The SBF-TMK model provides yet another mechanism for detecting lack of progress, based on the causal interactions of the different subtasks in the system's processing. By specifying move-to-goal as a subtask of the method accomplishing the step-in-get-to-box task, the model makes explicit that each instance of the former task in the system's trace is performed in service of the latter. Therefore, when the exact same step, i.e., the move-to-goal subtask with the same input parameters, is invoked in two consecutive cycles in service of the same step-in-get-to-box task, Autognostic recognizes that a task recurs and no progress is being made.
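This recurrence check reduces to comparing consecutive (task, inputs) entries in the execution trace; a sketch, under our own (hypothetical) trace representation:

```python
def detect_recurrence(trace):
    """Flag lack of progress when the same subtask is invoked with the
    same inputs in two consecutive cycles of the parent task."""
    for prev, curr in zip(trace, trace[1:]):
        if prev == curr:  # same (task, inputs) pair repeats
            return curr
    return None

trace = [("move-to-goal", (3.0, 4.0)),
         ("move-to-goal", (3.1, 4.0)),
         ("move-to-goal", (3.1, 4.0))]  # identical to the previous cycle
assert detect_recurrence(trace) == ("move-to-goal", (3.1, 4.0))
```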

Blame Assignment

In its effort to assign blame for the synthesized vector of zero magnitude, Autognostic establishes that the task responsible for producing it, i.e., synthesis, was not accomplished in the last cycle because the condition of its invocation was not met: its input vectors were of opposite direction and equal magnitude, and the synthesis task is not meaningful under such conditions. Thus, it infers the over-constrained applicability conditions of the synthesis task as a potential cause of the failure. As possible modifications, it suggests, first, the replacement of the synthesis task with another which could be invoked even when the current one could not, or, second, the falsification of the condition which prevents this task from being invoked.

If chosen, the first modification would require the existence of a task which could produce a vector even with opposite vectors as input. A possible implementation would be a task that, under these conditions, produces a ``noise'' vector to move the robot randomly out of the local minimum in which it is trapped. This option was actually considered by the design team of Reflecs before the competition. However, they rejected it in favor of developing a new obstacle-avoidance schema, which, as we will see later, Autognostic also decides to employ. The second modification could be implemented by ensuring that the information to which the task's condition applies, i.e., the input vectors, has different values. This could be accomplished if the tasks that produce these types of information, i.e., move-to-goal or static-object avoidance, were different. This latter modification is chosen as immediately applicable: there is an alternative instantiation for the avoidance task which is used to replace static-object avoidance in the failing task structure, and thus change one of the offending vectors input to the synthesis task.

Notice the roles that the SBF-TMK model plays in the blame-assignment step. First, by specifying the information-production-consumption relations between the system's tasks and the information flowing through its task structure, it establishes information-flow paths (from synthesis to move-to-goal and static-object avoidance) which are treated as causal-flow paths for identifying potential causes of the failure. Second, the tasks' semantics and conditions enable the blame-assignment process to localize which of these paths to follow. For example, only the paths leading to information that does not conform with the semantics of its producing task, or to information that causes the failure (or success) of a condition that should (or should not) have been met (move-to-goal producing the attractive vector), need to be explored. Third, by establishing a hierarchical organization on the system's task structure, the model enables the blame-assignment process to recursively refine its search for the cause of the failure (from step-in-get-to-box to synthesis).
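The first two of these roles can be condensed into a small backward traversal over the information-flow paths (a sketch over a hypothetical miniature of the task structure; the `violates` predicate stands in for the SBF-TMK semantics and condition checks):

```python
# Hypothetical miniature of the task structure: `producers` maps each
# piece of information to the task that produces it, and `inputs` maps
# each task to the information it consumes.
producers = {"motion": "synthesis",
             "attract": "move-to-goal",
             "repulse": "static-object-avoidance"}
inputs = {"synthesis": ["attract", "repulse"],
          "move-to-goal": [],
          "static-object-avoidance": []}

def suspect_tasks(bad_info, violates):
    """Follow information-flow paths backwards from a failing piece of
    information, collecting producers whose output fails the check."""
    suspects, frontier = [], [bad_info]
    while frontier:
        info = frontier.pop()
        task = producers[info]
        if violates(info):
            suspects.append(task)
            frontier.extend(inputs[task])
    return suspects

# In the Reflecs failure, both input vectors are implicated:
suspects = suspect_tasks("motion", lambda info: True)
assert set(suspects) == {"synthesis", "static-object-avoidance", "move-to-goal"}
```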


Of the modifications suggested above, the simplest to accomplish is the modification of the static-object avoidance task, since, as shown in Figure  1, this task is an instance of a prototype with an alternative instance, namely the avoid-known-obstacle task. This task does not avoid all static objects in the robot's environment, but only the ones categorized as obstacles in its knowledge base. Notice that it is the prototype-instance relation among tasks that enables Autognostic to redesign Reflecs' task structure in this experiment. However, had no such alternatives existed for the identified culprit tasks, Autognostic would have used the actual information state of Reflecs as an example of undesired behavior to refine the semantics of the failed tasks (beginning with synthesis). After having modified Reflecs' task structure, Autognostic resumes the execution of its process. The schema implementing the avoid-known-obstacle task does not produce a repulsive vector for the target object, since this box is not really an obstacle, and thus Reflecs reaches its goal. This success validates the effectiveness of the chosen modification.

Comments on the Reflecs Experiment

This experiment highlighted an issue that, until that time, had not been explicitly dealt with: namely, the question of whether task-structure modifications are permanent or not. In Autognostic's integration with Router and Kritik2 these modifications were permanent. This is a reasonable decision under the assumption that the system's failures are caused by the evolution of the environment's functional requirements on the system's output. In the Reflecs experiment, however, the failure is due to a conflict between the specific problem that the system has to solve and some part of its task structure. Therefore, it is not clear that the task structure should be permanently modified, since this conflict may not arise with other problems. In terms of the specific experiment discussed above, the default mechanism for obstacle avoidance works fine when the robot does not have to manipulate objects, and it is more efficient and therefore preferable. For this experiment, Autognostic's process was modified to retain both the original and the modified task structures, the latter annotated with the symptoms of the failure and the modification that produced it. The original task structure becomes the default option when the agent is asked to accomplish a specific instance of the task at hand. If, however, there is a failure similar to the one addressed previously, the agent may automatically switch to the modified task structure without going through the blame-assignment and repair processes.
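The bookkeeping for these non-permanent modifications is simple; a sketch (the structure names and the symptom label are ours, purely illustrative):

```python
# The original task structure stays the default; each repaired variant is
# stored keyed by the failure symptom that produced it, so that a matching
# failure triggers a direct switch, skipping blame assignment and repair.
variants = {}  # symptom -> modified task structure

def record_repair(symptom, modified_structure):
    variants[symptom] = modified_structure

def select_structure(default, symptom=None):
    if symptom is not None and symptom in variants:
        return variants[symptom]
    return default

record_repair("zero-motion-near-goal", "use avoid-known-obstacle")
assert select_structure("use static-object-avoidance") == "use static-object-avoidance"
assert select_structure("use static-object-avoidance",
                        "zero-motion-near-goal") == "use avoid-known-obstacle"
```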

Efficiency: Run-time Performance

Another issue, a direct result of the reactive nature of a robot's behavior, was the interaction between the reflection process and the actual problem-solving process. In Autognostic's integration with Router and Kritik2, the reflective monitoring process and the reasoning process were completely integrated. In Reflecs, Autognostic's reflection and the robot's processing are two independent processes, but they are still synchronized at each reactive cycle. When there is a problem, the reflection process interrupts the robot and takes over. This synchronization incurs a high overhead on the robot's performance.

An alternative, explored in a subsequent Reflecs experiment  [Goel et al. 1997], is to let the two processes run in parallel, and have the monitoring done externally. That is, the deliberative reflection process does not monitor the actual processing of the reactive robot, but rather the trace of its progress towards the goal. Once a failure pattern, in the form of a behavioral cycle, is recognized, the reflection process and the internal monitoring kick in. The task is then performed in ``slow motion,'' and the traces of the problem solving are examined by the deliberative reasoner in order to do blame assignment and reconfiguration. After the problem is fixed, the system returns to external monitoring to provide fast response.
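External monitoring can then be as simple as scanning the position trace for revisited locations (a sketch of the idea; the window size and tolerance are hypothetical parameters):

```python
def behavioral_cycle(positions, window=4, tol=0.1):
    """Flag a failure when the latest position revisits (within tol) an
    earlier position in the recent window, i.e., the robot oscillates
    instead of progressing. The immediate predecessor is skipped, since
    consecutive reactive steps are always nearby."""
    recent = positions[-window:]
    last = recent[-1]
    for p in recent[:-2]:
        if abs(p[0] - last[0]) <= tol and abs(p[1] - last[1]) <= tol:
            return True
    return False

# Oscillation around x = 1 is caught; steady progress is not:
assert behavioral_cycle([(0, 0), (1, 0), (0.95, 0), (1.0, 0.05)])
assert not behavioral_cycle([(0, 0), (1, 0), (2, 0), (3, 0)])
```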

To compare the advantages of internal versus external monitoring, we conducted five experiments with behavioral cycles in randomly generated worlds with fixed starting and goal positions. Twenty obstacles are generated for each experiment. The x and y coordinates of the obstacles are uniformly distributed random numbers between 0 and 64. The radii of the obstacles are uniformly distributed random numbers between 1 and 5. An obstacle is deleted if it overlaps the start or goal location. Table 5 summarizes the efficiency of processing using internal and external monitoring. The time spent on the task is the time from the first simulation iteration to the iteration when the robot reaches the goal with a puck. It is recorded using the time() function in the simulation program.
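The world-generation procedure described above can be sketched as follows (our own reconstruction from the description, not the simulation's actual code):

```python
import random

def generate_world(start, goal, n_obstacles=20, seed=None):
    """Random world as in the monitoring experiments: obstacle centers
    uniform in [0, 64] x [0, 64], radii uniform in [1, 5]; an obstacle
    is deleted if it overlaps the start or goal position."""
    rng = random.Random(seed)
    obstacles = []
    for _ in range(n_obstacles):
        x, y = rng.uniform(0, 64), rng.uniform(0, 64)
        r = rng.uniform(1, 5)
        overlaps = any((x - px) ** 2 + (y - py) ** 2 <= r * r
                       for px, py in (start, goal))
        if not overlaps:
            obstacles.append((x, y, r))
    return obstacles

world = generate_world((0.0, 0.0), (64.0, 64.0), seed=1)
assert len(world) <= 20
assert all(1 <= r <= 5 for _, _, r in world)
```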

Table 5: A Comparison between Internal vs. External Monitoring

On average, the time taken for a step using external monitoring is 0.08 seconds, while the average time for a step using internal monitoring is 3.44 seconds. The t value for a paired t-test on these measurements is 19.2, significant well beyond the 99.5% confidence level.
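The paired t statistic itself is easy to recompute; in the sketch below, the per-step times are illustrative placeholders, not the experiment's raw data:

```python
import math

def paired_t(xs, ys):
    """t = mean(d) / (sd(d) / sqrt(n)) for paired differences d = x - y."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)

internal = [3.40, 3.50, 3.30, 3.50, 3.50]  # hypothetical per-step times (s)
external = [0.08, 0.08, 0.08, 0.08, 0.08]
t = paired_t(internal, external)
assert t > 0  # internal monitoring is consistently slower
```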



The second autonomous robot integrated with Autognostic was David  [Prassler et al. 1996]. David was developed as a service robot for office-logistics tasks, such as delivering printouts and mail, and collecting and emptying waste baskets. Its development team, consisting of approximately ten people, produced a set of procedures, each one accomplishing a separate functionality for perceiving, navigating, or manipulating the environment. Each of these procedures was subsequently modeled as an elementary task in the SBF-TMK language, and Autognostic was used to integrate them into task structures appropriate for accomplishing David's various missions. The experiment we discuss in this paper involves David's waste-basket-collection task, the task structure of which is shown in Figure  2.

Figure 2: David's task structure.

To collect a waste basket from an office, David must first locate exactly where it is; then it has to pick it up, go to the collection location, and put it down. To locate the basket, David first plans a path to effectively search the office in question. This is accomplished by an operator that implements a heuristic art-gallery-planning search. This task produces a path consisting of a minimal set of observation points that suffices for observing the complete free space in an area. Having planned such a path, David repeatedly selects a point from this path, goes there, and looks around to find the basket. If the basket is found, then David docks it, thus completing the locate basket task. Otherwise, it proceeds to select another observation point from its art-gallery path. As soon as David has completed the locate basket task, it can proceed to pick up the basket, go to the collection location, and put it down there, thus completing its basket-collection task.
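The locate-basket loop can be sketched as follows (our own condensation; navigation between observation points is elided, and the function names are hypothetical):

```python
def locate_basket(observation_points, look_around, dock):
    """Visit art-gallery observation points in turn; when the basket is
    seen from one of them, dock it and report where it was found."""
    for point in observation_points:
        found_at = look_around(point)  # None if the basket is not visible
        if found_at is not None:
            dock(found_at)
            return found_at
    return None

# Hypothetical usage: the basket is visible only from the second point.
docked = []
result = locate_basket([1, 2, 3],
                       lambda p: (5.0, 2.0) if p == 2 else None,
                       docked.append)
assert result == (5.0, 2.0) and docked == [(5.0, 2.0)]
```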

An Experiment on Flexible Run-time Control

As we can see from Figure  2, there are no specific decompositions designed for the go-to collection location and go-to next search location tasks. Instead, these two tasks are specified as instances of the go-to location prototype, and are accomplished by the methods applicable to it. The method applicable to this prototype specifies that it is decomposed into three subtasks: establish current position, plan, and move. In turn, these subtasks are specified as instantiations of several alternative prototypes.

This SBF-TMK specification constitutes a compact description of a wide range of possible behaviors, each one applicable under different environmental conditions. Autognostic's monitoring process is able to flexibly control David 's behavior at run-time, so that it exhibits the behavior most appropriate for its current information state and environmental conditions. Let us discuss how.

The first time David has to go to a particular location is after it has planned an art-gallery path and has selected to visit the first observation point in this path. Until that time, it has never established its own position; therefore it cannot instantiate the odometry-based self-localization prototype task, since this requires that its odometry sensors have already been initialized. On the other hand, it knows in which area it is, and it can therefore instantiate the landmark-based prototype task, if there are landmarks at known positions in this area, or the local-positioning-system-based prototype task, if there are emitters in this area. If both conditions are met, Autognostic may invoke one of the prototypes at random. In addition, if one of the applicable prototypes fails to deliver the necessary information, i.e., David's current position, Autognostic will continue invoking the rest of them until one is successful.

Having established its current position, David has to plan a path to its destination and move towards it. The former subtask can be an instantiation of either the qualitatively plan or the latombe-like plan task. The former generates a qualitative path from one area to another, and is applicable when David does not have a specific point as its initial or destination location, but rather a general area. The latter produces a sequence of points in a metric coordinate system, and requires that both David's initial and destination locations are specific points. Thus, while David is searching for the waste basket in an office, going from one observation point to another, Autognostic instantiates the latter prototype. When, however, David has picked up the basket and needs to go to a collection area, with no particular location specified, the former prototype can be instantiated.

Finally, after having planned a path, David has to execute it. Both the potential-fields-based-motion and blind motion tasks require a quantitative path as input, whereas the fuzzy-logic-based-motion task can be instantiated with either a quantitative or a qualitative path. So the instantiation of a move prototype depends on which planning prototype has already been instantiated.
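The prototype-instantiation logic above can be condensed into a small dispatcher (a hypothetical sketch: each prototype carries an applicability test over the robot's current information state, and monitoring instantiates an applicable one, trying the alternatives if a prototype fails to deliver its output):

```python
def choose(prototypes, state):
    """Return the first prototype whose applicability condition holds."""
    for name, applicable in prototypes:
        if applicable(state):
            return name
    return None

localization = [
    ("odometry-based", lambda s: s.get("odometry_initialized", False)),
    ("landmark-based", lambda s: s.get("landmarks_known", False)),
    ("local-positioning-system-based", lambda s: s.get("emitters_present", False)),
]
planning = [
    ("latombe-like plan", lambda s: s.get("destination_is_point", False)),
    ("qualitatively plan", lambda s: True),  # areas need only a qualitative path
]

# Before the first move the odometry is uninitialized:
assert choose(localization, {"landmarks_known": True}) == "landmark-based"
# With only a collection *area* as destination, the qualitative planner applies:
assert choose(planning, {}) == "qualitatively plan"
```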

Comments on the David Experiment

Autognostic's integration with David did not actually exercise Autognostic's failure-driven adaptation process. Instead, it evaluated SBF-TMK models as a language for ``programming'' a robot's behavior, and Autognostic's reflective monitoring process as a means for flexible run-time control. This experiment showed that a very general task structure for collecting waste baskets from a suite of offices can generate a variety of different behaviors, depending on the run-time environmental conditions and the robot's knowledge state. Essentially, it showed that a process originally conceived for failure-driven adaptive design could also be effectively used to control a robot's behavior, in a manner similar to Firby's RAP architecture  [Firby 1989].


  In this paper, we reported on a sequence of experiments we conducted with Autognostic, a shell for system modeling and adaptation. Our experiments are summarized in Table  6. Taking a step back to reflect on our evaluation agenda and methodology, we would like to make the following remarks:

One of the most important goals in formulating our plan for Autognostic's evaluation was to investigate its generality. To this end, we wanted to integrate Autognostic with as many, and as diverse, systems as possible. Our choice of systems was to some degree intentional and to some degree opportunistic. We feel that these four systems are interesting examples from the systems space, and together provide sufficiently interesting coverage. They gave us some insight into the class of systems to which Autognostic's reflective learning is applicable. Essentially, as long as a system's internal processing can be viewed as a composition of well-defined elements, each of them with well-defined semantics, this system can be modeled in the SBF-TMK language. Spreading-propagation type processing cannot be modeled reasonably well in SBF-TMK, due to the intrinsically flat nature of the system's elements and the large number of interactions among them. For a sharper characterization of the boundaries of the SBF-TMK language's expressiveness, a much more sophisticated characterization of the dimensions of the systems space would be required. For complicated tasks, such as knowledge engineering and adaptive design, it is not clear that all the relevant dimensions can be identified a priori, and perhaps a methodology such as quality-function-deployment  [Akao 1990] could be employed to identify the more important dimensions with respect to the goals of the language/process under investigation. Once an SBF-TMK model of a system is constructed, this system can potentially benefit from Autognostic's redesign process. Whether this will actually happen depends on whether the system's overall original design matches its original requirements too closely, or whether it is to some degree under-specified, so as to allow for meeting potential evolutions of these requirements.

The other criterion whose importance we want to stress is realism. Very sophisticated methods can often be developed for problems formulated in a manner too narrow to be realistic. We believe that, in addition to formulating an evaluation plan early in the process of designing the method in question, it is very useful, at the end of the development phase, to search again for new evaluation scenarios, as independent of the method's design and development as possible. The Reflecs experiment was such a choice for Autognostic's evaluation.

Our experiments gave rise to issues that led to further development and also to the formulation of further experiments. This is not an unusual phenomenon, and it speaks to the need for integrating evaluation phases within the development process. Each evaluation step should investigate the degree to which the goals of the completed development cycles have been accomplished, and it should drive the next one. In Autognostic's case, the issue of efficiency and performance never arose in the experiments with Router and Kritik2, since there are no time constraints on the problem-solving behavior of either of these systems. However, robots face time constraints, and in the case of Reflecs we identified an alternative method for Autognostic's communication with the system it monitors, one that is much more efficient.

Finally, an aspect of all model-based methods that is very important to evaluate is the impact of the model quality on the method's effectiveness. To this end, it is important to define alternative models for the same artifact and investigate the method's behavior with each one of them.


Akao 1990
Akao, Y. (ed.) (1990) Quality Function Deployment, Productivity Press, Cambridge MA.

Arkin 1989
Arkin, R. (1989) Motor schema-based mobile robot navigation. International Journal of Robotics Research, 8(4):92-112.

Benjamins et al. 1996
Benjamins, R., Fensel, D., and Straatman, R. (1996) Assumptions of Problem-Solving Methods and their Role in Knowledge Engineering. Proceedings of the 12th European Conference on Artificial Intelligence, pp. 408-412. John Wiley & Sons, Ltd.

Chandrasekaran 1989
Chandrasekaran, B. (1989) Task Structures, Knowledge Acquisition and Machine Learning. Machine Learning 4:339-345.

Chandrasekaran and Johnson 1993
Chandrasekaran, B. and Johnson, T.R. (1993) Generic Tasks and Task Structures: History, Critique and new Directions. In D. Krivine and R. Simmons (eds.): Second Generation Expert Systems, pp. 232-272, Springer-Verlag.

Clancey 1985
Clancey, W.J. (1985) Heuristic Classification, Artificial Intelligence, 27:289-350.

Clancey 1986
Clancey, W.J. (1986) From GUIDON to NEOMYCIN and HERACLES in twenty short lessons, Stanford University Report STAN-CS-87-1172; also, KSL-86-11.

Fensel and Schoenegge 1997
Fensel, D., and Schoenegge, A. (1997) Using KIV to Specify and Verify Architectures of Knowledge-based Systems. Proceedings of the 12th IEEE International Conference on Automated Software Engineering (ASEC-97).

Firby 1989
Firby, J. (1989) Adaptive Execution in Complex Dynamic Domains. PhD Thesis, Tech. Report YALEU/CSD/RR #672 Yale University.

Goel 1989
Goel, A. (1989) Integration of Case-Based Reasoning and Model-Based Reasoning for Adaptive Design Problem Solving. Ph.D. Dissertation, Dept. of Computer and Information Science, The Ohio State University.

Goel and Callantine 1992
Goel, A. and Callantine, T. (1992) An Experience-Based Approach to Navigational Route Planning. Proceedings of the IEEE/RSJ International Conference on Intelligent Robotics and Systems. Volume 2, pp. 705-710. IEEE Press.

Goel and Stroulia 1997
Goel A., and Stroulia E. (1997) Functional Device Models and Model-Based Diagnosis in Adaptive Design. Artificial Intelligence for Engineering Design, Analysis and Manufacturing, 10(4):355-370, Special Issue on Functional Representation and Reasoning, Cambridge University Press.

Goel et al. 1997
Goel, A., Stroulia, E., Chen, Z., and Rowland, P. (1997) Model-Based Reconfiguration of Schema-Based Reactive Control Architectures. Abstract in the Proceedings of IJCAI'97. A longer version of this paper has been accepted for the AAAI Fall Symposium on Model-Based Autonomy at MIT in October 1997.

Prassler et al. 1996
Prassler, E., Stroulia, E., Strobel, M., and Kaempke, T. (1996) Mobile Robots in Office Logistics. In the Proceedings of the 27th International Symposium on Industrial Robots, Milan, Italy, October 6-8 1996, ISIR'96, pp. 153-159.

Steels 1990
Steels, L. (1990) Components of Expertise. AI Magazine 11:30-4. AAAI Press.

Stroulia et al. 1992
Stroulia, E., Shankar, M., Goel, A., and Penberthy, L. (1992) A Model-Based Approach to Blame Assignment in Design. In the Proceedings of the Second International Conference on AI in Design, Pittsburgh, June 22-25, AID'92, pp. 519-537, Kluwer, Dordrecht.

Stroulia 1995
Stroulia, E. (1995) Failure-Driven Learning as Model-Based Self-Redesign, PhD Thesis, Tech Report GIT-CC-95-38 Georgia Inst. of Technology.

Stroulia and Goel 1996
Stroulia, E. and Goel, A. (1996) A Model-Based Approach to Blame Assignment: Revising the Reasoning Steps of Problem Solvers. Proceedings of the Thirteenth National Conference on Artificial Intelligence, pp. 959-965, Portland, Oregon, August 4-8 1996. AAAI Press.

Stroulia and Goel 1997
Stroulia, E. and Goel, A. (1997) Redesigning a Problem-Solver's Operators to Improve Solution Quality. Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence, Nagoya, Japan, August 23-29, Morgan Kaufman Publishers.

Wielinga and Breuker 1986
Wielinga, B.J., and Breuker, J.A. (1986) Models of expertise. Proceedings of the Seventh European Conference on Artificial Intelligence, pp. 306-318, Brighton, UK, 1986.

Wielinga et al. 1992
Wielinga, B.J., Schreiber, A.Th., and Breuker, J.A. (1992) KADS: A modelling approach to knowledge engineering. Knowledge Acquisition, 4(1):5-53. Special issue ``The KADS approach to knowledge engineering''.

Table 6: Summary of Autognostic's Evaluation.

This experiment was run in simulation mode, not during the actual AAAI competition.