Improving the Quality of Component Business Systems with Knowledge Engineering

Luis Montero, Colin T. Scott
Andersen Consulting
161 North Clark St., Chicago, IL 60601 USA
Telephone: (312) 693-1052
Fax: (312) 652-1052
{Luis.Montero, Colin.T.Scott}@ac.com

ABSTRACT

A domain model is only as good as its knowledge content. A full understanding of the domain is critical to the delivery of reliable and evolvable software solutions. The development of high-quality models that can be used and re-used in business systems over an extended period requires significant up-front domain analysis that is neither typical nor well-supported by existing commercial software analysis and design techniques.

This article describes how Andersen Consulting is using knowledge engineering tools and techniques to aid the understanding and structuring of target application domains, and how the result is seamlessly embedded in business systems developed using the Component Based Solution Construction (CBSC) approach, increasing the effectiveness and efficiency of the process and the quality of the result. These tools are being integrated with commercial off-the-shelf object-oriented development tools and methodologies. We believe that this article gives a perspective on how knowledge engineering work can find its way into the industrial, commercial world, particularly in the context of software engineering.

1. INTRODUCTION

This paper describes the use of Knowledge Acquisition (KA) techniques and tools to support the analysis activities in leading edge component-based software engineering approaches to systems development. This approach has been implemented as part of the Technology Reinvestment Project (TRP) initiative of the Advanced Research Projects Agency (ARPA). This paper is also based on previous research and development activities in the area of knowledge based systems as part of the ESPRIT P5365 VITAL project, sponsored by the European Community.

The TRP initiative has applied these integrated techniques to a number of full-scale industrial software development projects with the cooperation of the Raytheon Company, whose applications validate the approach, and the CoGenTex Company, developer of supporting tools.

Our focus is on describing how knowledge acquisition techniques have been used, and can be used, in conventional business systems, not only in knowledge-based systems. These techniques have been particularly useful in the Domain Analysis activity within the Andersen Consulting component development approach.

Chapter 2 will outline this approach and provide the necessary context. Chapters 3 and 4 will present the problem in Domain Analysis and how some existing approaches have evolved to solve it. Chapters 5 and 6 focus on the use of KA techniques in this domain. Chapter 7 focuses on the transformations architecture we have developed to provide an adequate output of the model for design and implementation.

2. COMPONENT-BASED SOLUTION CONSTRUCTION

Component-Based Solution Construction (CBSC) is an approach to systems development that aims to overcome the so-called software crisis. The attractiveness of the component model for software systems originates in the successful application of the component technique to hardware systems. The idea of a "software bus" into which software components with well-defined interfaces can be "plugged" has long been a goal. A more detailed description of Andersen Consulting's approach to Component-Based Solution Construction is presented in (Montero 1996) and (Montero 1997).

The CBSC approach needs to be supported by a full life-cycle methodology and the corresponding development process. In our case, we have defined the so-called Asset-Centric Process, matured after feedback from the projects we have worked on during our participation in the TRP initiative.

The vision for the asset-centric process is an environment where analysts and developers distributed across both space and time collaborate to produce high-quality, client-focused solutions by assembling them from extensible, re-usable components. Its objectives are:

The description of the process has so far implied the existence of some general domain concepts (in this case the assets) and one or more processes that produce them. This (generalized) view gives a good starting point for examining the functionality of the process. From an abstract perspective the process is constructed by focusing on the concepts and capabilities that must be supported in order to carry out successfully the above objectives. As we will see, the capabilities will form the basis for the assets produced during the process.

If we look at such a development process as "outsiders", the final activity can be summarized in a few words as "integrate components into a solution". Now, there are several questions:

These questions are answered by three additional processes:

The following picture illustrates this "outsider" view or "Simplified Model".

Fig 1. Asset-Centric Process. Abstract Model.

This is the full process. This article will focus on the support that knowledge acquisition techniques and tools provide to the activities that make up the "Develop Domain Model" process, which we will from now on simply call "Domain Analysis".

3. THE DOMAIN ANALYSIS PROBLEM

Current analysis and design approaches to business systems in the commercial world frequently make the assumption that the requirements and the domain concepts for a given system are either well known or well defined, and that those concepts and requirements can be easily captured as a model in an object CASE tool.

For most large-scale commercial systems developments this is not the case. This problem is exacerbated by the complexity of new systems and the trend towards iterative and evolutionary development, where the "discovery" of new requirements can take place at any point in the development lifecycle. Some object-oriented analysis and design processes, such as Objectory (Jacobson 1993) and OOAD (Martin-Odell 1995), have attempted to resolve this by adding business domain analysis phases prior to the traditional modeling and design phases, in the same way as we have done. However, tool support for these new phases is not well developed, and domain analysis is not well integrated into the rest of the software development process. This is the area that our work addresses.

A major part of the problem is that the activities in these new phases are significantly different from the subsequent object-oriented analysis and design activities. They focus on acquiring and structuring domain knowledge. This acquisition and structure is essential to the success and long-term viability of the software system.

There are a number of obstacles that hinder analysts when capturing domain knowledge. First, the domain information is increasingly distributed: there are many stakeholders in a business system, including high-level managers, domain experts, users and system engineers, as well as existing legacy documents about company policy and current processes and practices. This distribution and diversity makes the capture of knowledge very complex (and yet the different perspectives are necessary to produce a quality system). In addition, the knowledge of the domain experts is usually "hard-wired" in their internal memory structure and difficult to bring to the surface. The key domain experts also tend to be busy and unavailable. Building conceptual domain models is, consequently, very difficult.

4. BACKGROUND

The problem is not new. It has long been apparent to systems developers, but most business systems in previous decades were very limited in complexity (traditional inventory, billing or payroll systems). In addition, the analysis and design effort tended to be small, while the implementation and maintenance effort tended (as a result) to be large in order to keep the system alive. The problem is clearer today, when systems are complex as a result of rich and heterogeneous domains.

On the other hand, Knowledge Based Systems (KBS) have been used in domains that are rich and heterogeneous, where the problem has surfaced more often in the past. KBS researchers and practitioners have explored different ways to solve the problem.

Our work follows an evolution of related KBS conceptual modeling approaches that try to rationalize the KBS development process while encompassing the necessary knowledge acquisition activities. Many approaches were developed, mainly in the 1980s: Heuristic Classification, Generic Tasks, Role-Limiting Methods, Components of Expertise, and more. One particularly comprehensive approach was KADS, developed by Wielinga, Breuker and other researchers at the University of Amsterdam. KADS is well known in the community, and has evolved significantly in the 1990s (KADS-2, CommonKADS).

Also in the late 1980s, the University of Amsterdam, the University of Nottingham and other institutions integrated a number of well-known KA techniques in a workbench within the ESPRIT-ACKnowledge project. The purpose was to find ways to create and populate these models with domain information. A number of tools were integrated in a toolkit called ProtoKEW, and in the complete workbench, KEW.

In the early 90s, the need for a methodology with a workbench supporting the whole life-cycle resulted in the ESPRIT-VITAL project. This project involved, among others, Andersen Consulting, the University of Nottingham and the Open University. VITAL matured these techniques in several industrial KBS projects in different domains within financial services and telecommunications. Additionally, the Generalized Directive Models (GDM) methodology (Shadbolt 1993) (Motta 1994) (Capella 1994) was enhanced from previous work in ACKnowledge, extending the generic KADS model by providing flexible Problem Solving Models (PSMs).

An important outcome of ESPRIT-VITAL was the availability of implementations of a number of knowledge engineering tools. The practical implementation and integration of a number of tools with a common repository has been developed by EPISTEMICS® under the name of PCPACK® (Epistemics 1995), based on some research originated mainly at the University of Nottingham. Andersen Consulting has participated with EPISTEMICS in the extension and development of some of the tools to cope with the needs we identified within TRP.

5. DOMAIN ANALYSIS: INPUTS AND OUTPUTS

We are presenting here an abstract view of domain analysis in terms of inputs and outputs as illustrated in the following figure. This is the structure we will follow in Chapter 6 to illustrate the use of KA Tools in the process.

Fig 2. Domain Analysis Inputs and Outputs.

5.1. Inputs

We define three types of inputs in Domain Modeling.

5.1.1. Requirements - Required Solution

Requirements usually consist of unstructured information about the required solution, captured in interviews with stakeholders, together with all the background industry, enterprise and process legacy documentation, usually in textual form.

5.1.2. Abstract Knowledge - Top-Down Structures

The top-down structures of abstract knowledge usually consist of existing organization patterns and frameworks, whether they describe the domain, the processes and practices, or the policies and the dependencies between events. The desirable input would be a set of top-down structures into which the elements of the solution fit. In an asset-centric process, specifications of existing components are also treated as abstract knowledge.

5.1.3. Concrete Knowledge - Bottom-Up Elements

Bottom-up elements of concrete knowledge normally consist of concepts, attributes, operations, rules, etc., that are elicited in a bottom-up fashion and need to fit into the top-down structures. They constitute the building blocks of the model.

5.2. Outputs

As a result of domain modeling we expect a domain model. The question now is how we can have several views of the domain model that are useful for traceability purposes across the systems life cycle, especially towards "Design and Implement Component". Our approach looks at the model from three different, slightly orthogonal, perspectives:

5.2.1. Domain Specifications

Domain specifications focus on the set of concepts (potential classes) that populate the domain, including their attributes, operations, values, instances, relations and links between them, and how they are clustered in domain components.
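As a rough illustration (the representation below is ours and purely illustrative, not the PCPACK meta-model), a domain specification can be thought of as a set of concepts, each with attributes, operations and relations, clustered into components; the "animal"/"lion" example reappears in the CDIF sample of Chapter 7.

from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Concept:
    """A domain concept: a potential class during component design."""
    name: str
    attributes: List[str] = field(default_factory=list)
    operations: List[str] = field(default_factory=list)
    relations: List[Tuple[str, str]] = field(default_factory=list)  # (relation name, target concept)

@dataclass
class DomainComponent:
    """A cluster of related domain concepts."""
    name: str
    concepts: List[Concept] = field(default_factory=list)

# The same "animal"/"lion" example used in the CDIF sample of Chapter 7
animal = DomainComponent("animal", concepts=[
    Concept("lion", attributes=["name"], operations=["walk"])])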

5.2.2. Process and Actor/Interaction Specifications

Process specifications describe the dynamic behavior of the system in terms of process workflow, events and goals. Interaction specifications describe the interaction between the different actors in the process: system, users and external systems, complementing the process workflow description.

5.2.3. Rule Specifications

Rule specifications describe the conditions and constraints that affect all of the above. In the context of component construction, there are different types of rules, including constraints, pre- and post-conditions, or state transitions.

6. DOMAIN ANALYSIS: KA TECHNIQUES AND TOOLS

This chapter will illustrate how a number of knowledge acquisition techniques support the domain analysis process. Most of these techniques are available within the PCPACK tools. We will present samples used in "Defense Design Review", one of the practical applications developed during TRP with Raytheon in the quite complex defense manufacturing domain.

6.1. Required Solution

6.1.1. Interviews - Protocol Analysis - Protocol Editor

The "Protocol Editor" is a tool to edit the transcripts of interviews, or any other textual information and capture key knowledge. The PCPACK version of this tool allows the user to graphically "mark-up" different parts of the text document. The Protocol Editor includes hypertext facilities that allow knowledge elements in one document to be linked to those in another. The tool allows a requirements engineer to:

The following picture illustrates the use of the tool

Figure 3. A Snapshot of the Protocol Editor

The PCPACK Protocol Editor can export its content to HTML, including hyperlinks to documentation and annotations, so that they can be reviewed off-line in a conventional Web browser. In a number of cases, we used the Protocol Editor in combination with LIDA, a tool developed by CoGenTex that allows the user to perform a syntactic-semantic analysis of the text, looking for nouns, verbs, etc., so that the Protocol Editor does not have to start from scratch.

We found it a useful brainstorming tool in all Domain Analysis activities, revealing domain elements that otherwise tend to be lost. There was significant synergy with the other tools: once structured by the other PCPACK tools, the elements we found helped us build an "early model" of the business problem, significantly reducing the size of the overall analysis and design effort.
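A minimal sketch of the underlying idea (this is not the PCPACK implementation or its file format; the names and the HTML rendering are illustrative): transcript fragments are marked with a knowledge-element type, and the marked-up transcript can then be exported as annotated HTML.

from dataclasses import dataclass
from typing import List

@dataclass
class Markup:
    """A marked-up fragment of an interview transcript."""
    start: int          # character offsets into the transcript
    end: int
    element_type: str   # e.g. "concept", "attribute", "process", "requirement"
    element_name: str

def to_html(transcript: str, markups: List[Markup]) -> str:
    """Export the transcript with simple <span> annotations (illustrative only)."""
    html, pos = [], 0
    for m in sorted(markups, key=lambda m: m.start):
        html.append(transcript[pos:m.start])
        html.append('<span class="%s" title="%s">%s</span>'
                    % (m.element_type, m.element_name, transcript[m.start:m.end]))
        pos = m.end
    html.append(transcript[pos:])
    return "".join(html)

text = "The review board approves the design item."
print(to_html(text, [Markup(4, 16, "concept", "review board"),
                     Markup(17, 25, "operation", "approve")]))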

6.2. Abstract Knowledge

There are two basic approaches to finding a solution model for a problem:

  1. build the model bottom-up, from the concrete knowledge elements elicited in the domain
  2. start top-down from an existing, abstract solution model and specialize it to the domain

Intuitively, the second approach looks more efficient than the first one. However, in software development the bottom-up approach is more common. The main reason is the lack of reliable existing solution models. In reality, developers leverage previous individual experience when designing solutions, without relying on formal solution descriptions.

6.2.1. GDM

The GDM approach was developed mainly by the University of Nottingham during the ACKnowledge and VITAL projects. We have incorporated it into the TRP process. GDM is a top-down approach that incorporates bottom-up concrete knowledge in a continuous process of knowledge acquisition and modeling. It adopts the main principles of KADS, while providing a continuous process to derive new, custom PSMs rather than providing a static PSM library. GDM includes the set of rules governing this process. This principle is illustrated in the simplified figure presented below, for the well-known "Heuristic Classification" model.

Figure 4. A Scheme of GDM Inference Model development

From a component perspective, PSMs are "process patterns" identified during domain analysis. PCPACK offers a GDM Tool to support the GDM approach, including the model extension rules. In line with KADS models, the process pattern shown in the figure contains inference steps and roles. From a component development perspective, the former ultimately constitute clusters of processes and the latter constitute clusters of domain concepts.
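To give a flavour of the mechanism (a much-simplified sketch; the real GDM Tool applies a richer, grammar-like set of extension rules), a directive model can be represented as a set of (input role, inference step, output role) triples, and an extension rule rewrites a coarse inference into a finer-grained sub-model, here for Heuristic Classification.

# Simplified sketch of a GDM-style model extension (not the actual GDM rule set).
heuristic_classification = {("observables", "classify", "solutions")}

def extend_classify(model):
    """Rewrite the coarse 'classify' inference into abstract/match/refine,
    introducing the intermediate knowledge roles of Heuristic Classification."""
    extended = set(model)
    if ("observables", "classify", "solutions") in extended:
        extended.remove(("observables", "classify", "solutions"))
        extended |= {("observables", "abstract", "abstract observables"),
                     ("abstract observables", "match", "abstract solutions"),
                     ("abstract solutions", "refine", "solutions")}
    return extended

for triple in sorted(extend_classify(heuristic_classification)):
    print(triple)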

GDM proved useful in the projects we accomplished during TRP, as a way to outline the type of problem we were trying to solve, and provide a top-down perspective and a first "big picture" of the system in cases where previously there was none.

6.3. Concrete Knowledge

There are a number of knowledge acquisition techniques supporting the capture of the building blocks of the system: concepts, attributes, detailed operations and rules. Most of them originate in cognitive psychology techniques of proven effectiveness. Among the most popular are Laddering, Concept/Card Sorting and Repertory Grids. PCPACK provides tools that support these techniques, both triggering new knowledge and structuring it.

6.3.1. Laddering

Laddering means arranging elements in a ladder according to a common criterion, in order to visualize them (which is easier for the expert), to confirm that the model is complete and, in rule-based systems, to generate the knowledge in the form of rules. The PCPACK Laddering Tool is a diagramming tool for four types of ladders. The first ladder deals with concepts and instances, which can be seen as potential classes and objects during component design. The default relation in the concept ladder is "is-a" (generalization/inheritance), but additional ladder diagrams for any other relation (aggregation, causal dependency, etc.) can be created and displayed in the same way.

Figure 5. A Ladder of Domain Concepts

A second ladder representation is used to show the attributes and their values.

A third ladder is used to show the process/task decomposition, in a hierarchical manner, as could be represented in a Warnier-Orr diagram.

Finally, a fourth ladder representation, suggested by Andersen Consulting, allows the requirements engineer to ladder requirements, issues, positions, arguments and assumptions, and the relations between them, as well as to define and visualize alternative decisions (as sets of positions on issues), in a friendly graphical manner. The ontology is not dissimilar from the ones proposed in the IBIS, REMAP (Ramesh 1992) or QOC approaches to design rationale (Moran 1996). This ladder is mainly used for requirements engineering.

The laddering tool was really useful in our domain and, together with the Protocol Editor, the most used one; not only because of its modeling features, but mainly as a powerful brainstorming tool for the interaction between expert and analyst. The fact that the laddering tool is very interactive, with drag-and-drop facilities and strong usability features, helped a lot. Once the repository had been populated with the concepts, attribute-values, processes and requirements via the Protocol Editor, the laddering tool was used, often with the supporting presence of the experts, to organize the elements. This process prompted the experts to add new elements (thereby filling the remaining knowledge "holes") and the engineer to raise issues. This was as important for processes (indeed, even more so) as for concepts or attributes.
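Structurally, a concept ladder is a labelled tree (or, for relations other than "is-a", a directed graph). A minimal sketch, with concept names that are purely illustrative:

# Minimal sketch of a concept ladder with the default "is-a" relation.
ladder = {
    "design item": ["mechanical part", "circuit board"],   # is-a specialisations
    "mechanical part": ["bracket"],
}

def print_ladder(root, depth=0):
    """Print the ladder as an indented hierarchy, as the Laddering Tool displays it."""
    print("  " * depth + root)
    for child in ladder.get(root, []):
        print_ladder(child, depth + 1)

print_ladder("design item")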

6.3.2. Card Sorting

The "Concept Sorting" is a well-known technique, where the expert is given "cards" corresponding to concepts and instances, and sorts them into piles (values) according to different criteria (attributes). Cognitive psychology studies have shown that this is very efficient elicitation technique and facilitates the acquisition of new concepts, attributes, and relations.

We found the PCPACK Card Sort tool useful to identify additional attributes and values, especially at the end of Domain Analysis.

6.3.3. Repertory Grid

The Repertory Grid technique (Gaines 1988; 1990), well known in the KA community, is based on the "Personal Construct Psychology" theory of Kelly (Kelly 1955), which postulates that people view the world in terms of "constructs" (attributes which discriminate along a bipolar continuum). A full description of this technique is beyond the scope of this article; we will focus on the key elements that helped us.

The technique provides metrics showing the proximity between concepts/instances and between constructs (attributes). In this way, two constructs which are too similar (always receiving the same ratings across the different concepts/instances) may be identified as redundant, or may trigger the expert to find concepts that differentiate them. The same may happen with redundant concepts, which may prompt the expert to seek additional constructs. There are sub-techniques exploiting this triggering principle within the Repertory Grid Tool, called "Break Match" and "Triad Elicitation", which help in finding new concepts. The essential storage mechanism is a grid of values, and the visualization of the "distance metrics" consists of trees at the sides of the grid, as indicated in the picture below.

Figure 6. A Snapshot of the Repertory Grid Tool. The Grid

The PCPACK Repertory Grid Tool can induce rule sets based on the grid. This "data mining" feature can be exploited in the Rule Editor and Dependencies Tool, which are presented in a later section. We used this tool during the "Complete Analysis" activity, not only for its "data mining" features, but mainly for its "triggering" techniques, which helped us discard irrelevant attributes and find new relevant concepts and attributes in the domain.
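The triggering principle itself can be sketched very simply: if two constructs receive (nearly) identical ratings across all elements, they are candidates for merging, or a prompt for the expert to supply an element that tells them apart. A toy version, with invented names and ratings:

# Toy repertory grid: rows are constructs (bipolar attributes), columns are elements
# (concepts/instances); ratings are 1..5. All names and values are invented.
grid = {
    "simple - complex":  {"bracket": 1, "housing": 4, "circuit board": 5},
    "cheap - expensive": {"bracket": 1, "housing": 4, "circuit board": 5},
    "light - heavy":     {"bracket": 2, "housing": 5, "circuit board": 1},
}

def distance(c1, c2):
    """City-block distance between two constructs across all elements."""
    return sum(abs(grid[c1][e] - grid[c2][e]) for e in grid[c1])

constructs = list(grid)
for i, a in enumerate(constructs):
    for b in constructs[i + 1:]:
        if distance(a, b) == 0:
            print("Possibly redundant constructs:", a, "/", b)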

The sections above have dealt with tools focused on "Elicitation". The following sections illustrate the tools more focused on "Modeling": editing and presentation of results (outputs).

6.4. Domain Specs

6.4.1. Relationships Diagrams

It is very important for analysis to allow the user to define associations (relationships) between concepts (classes or entities) and present them in a way that is familiar to a conventional business modeling audience, whether object-based (class models) or entity-based (entity-relationship diagrams). At Andersen Consulting's request, PCPACK included a tool to complement the Protocol Editor and Laddering Tool in this direction.

We used this tool widely to present Domain Analysis results, both because the client liked this notation and because we were able to transform its contents, including the diagram information, directly into CASE tools, simplifying the work during design. We describe this transformation feature later in this article.

6.4.2. Case Modeling

It is important to document the known cases (instances) in a domain, because it is a way of grounding the model in real data, and possibly of using this data as a source for the abstraction of more general rules, based on induction. PCPACK offers a Case Editor/Induction tool with this functionality. The induction feature was exploited in combination with the Rule Editor and Dependencies tool, which are presented in a later section.

6.5. Process and Actor Specs - Grounding Process Patterns

We have presented GDM as a useful source of abstract "process patterns", which are useful during "Solution Strategy". Once in "Requirements Engineering", we need more detailed process descriptions in the form of "workflow" diagrams. At the end of the Domain Analysis process, we need to provide an even lower level of process description, down to the level of small operations. The question is how we can achieve traceability between the process patterns and the lowest-level process description. The answer is "grounding".

"Grounding" is the process that brings together top-down structures, like process patterns, and bottom-up elements extracted from the Knowledge Acquisition process. There are three levels in this process.

  1. Customize the process pattern to the terminology of the business environment we are in; for instance, replacing "variable" by "symptom" in the "Heuristic Classification" model when "grounded" for medical diagnosis.
  2. Populate the clusters of the process pattern with concepts, lower-level processes and other fine-grained elements.
  3. Provide structure and content to the fine-grained elements.
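A small sketch of the first two levels, reusing the Heuristic Classification roles from the GDM example above (the medical terminology follows the "symptom" example in level 1; the remaining names are invented):

# Sketch of grounding a process pattern (levels 1 and 2 above); names are illustrative.
pattern = {("observables", "abstract", "abstract observables"),
           ("abstract observables", "match", "abstract solutions"),
           ("abstract solutions", "refine", "solutions")}

# Level 1: customize the pattern's terminology to the business environment
# (e.g. "observables" become "symptoms" when grounded for medical diagnosis).
terminology = {"observables": "symptoms", "solutions": "diagnoses"}
grounded = {(terminology.get(s, s), step, terminology.get(t, t))
            for (s, step, t) in pattern}

# Level 2: populate the pattern's roles (clusters) with bottom-up concepts.
clusters = {"symptoms": ["fever", "cough"], "diagnoses": ["influenza"]}

print(sorted(grounded))
print(clusters)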

6.5.1. Workflow modeling and grounding

The Control Editor is the key grounding tool in PCPACK, partially inspired by OCML, developed at the Open University during the VITAL project (Domingue 1993) (Motta 1994). In the Control Editor, the top-down information from the GDM process pattern joins the structured bottom-up information. It provides the following functionality:

  1. Edit the GDM model and the grounded GDM model (1st level)
  2. Edit the clustering of concepts into clusters
  3. Use the decomposition of processes at every level of detail, introduced in the Laddering Tool, to build workflow diagrams representing the dependencies between the processes (sequencing/concurrency, selection points, control flows) and the information elements (data flows)
  4. Introduce the actors' interactions into the resulting workflow diagrams

Figure 7. A Control Editor Workflow Diagram, expanding "Defense Design Review" process.
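The kind of information such a workflow diagram captures can be sketched roughly as follows (a purely illustrative sketch; the process steps, actors and data items are invented and are not taken from the actual Raytheon application):

# Sketch of a workflow description: steps, control-flow, data-flow and actors.
steps = ["submit design", "review design", "record decision"]

control_flow = [("submit design", "review design"),
                ("review design", "record decision")]

data_flow = {("submit design", "review design"): "design package",
             ("review design", "record decision"): "review findings"}

actors = {"submit design": "design engineer",
          "review design": "review board",
          "record decision": "system"}

for src, dst in control_flow:
    print("%s (%s) --[%s]--> %s (%s)"
          % (src, actors[src], data_flow[(src, dst)], dst, actors[dst]))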

6.6. Rule Specs

Another aspect of "grounding" is to bring together the informal "business rules" or policies identified at a high level (probably marked with the Protocol Editor), with the formal rules identified at the lower level, either induced by the Repertory Grid and the Case Editor, or as a result of analyzing the workflow diagrams (choice or selection points).

6.6.1. Dependencies and Rules

The Dependencies Tool within PCPACK, suggested by Andersen Consulting, allows the user to edit individually the formal rules generated by the Repertory Grid and Induction tools, as well as the "informal" rules marked in the Protocol Editor, and to derive a set of dependency diagrams or rule schemes illustrating the dependencies between the attributes of the concepts. In combination, the Rule Editor within PCPACK allows the user to edit and modify the final rules.

These tools facilitate the development of complete and consistent sets of rules with a specific scope, which can be very useful for the development of KBS. In our component context, they were very useful for generating the pre- and post-conditions of operations within the components.
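For instance, a dependency between attributes can be turned into a pre-/post-condition pair on a component operation, roughly as follows (the rule, operation and attribute names are hypothetical):

# Sketch: a rule expressed as a pre-/post-condition pair on a component operation.
operation_contract = {
    "operation": "approve(design_item)",
    "pre":  "design_item.status == 'reviewed'",
    "post": "design_item.status == 'approved'",
}

def check_pre(design_item):
    """Evaluate the pre-condition before the operation is invoked."""
    return design_item.get("status") == "reviewed"

print(check_pre({"status": "reviewed"}))   # True: the operation may proceed
print(check_pre({"status": "draft"}))      # False: pre-condition violated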

6.7. Documentation

The importance of documentation in analysis cannot be overstated. Every new element in the domain requires a description that can be traced during the analysis process. Moreover, the description is most useful when hyperlinks are built to the descriptions of related elements. We used PCPACK Hyperpic, a hypertext documentation tool which allows the user to include a description for each element in PCPACK, automatically create hyperlinks between the descriptions, and export them to HTML format. Hyperpic is not especially relevant from a Knowledge Acquisition perspective, but it is from a Knowledge Management perspective.

We also used an automated documentation tool called "Model Explainer", developed by CoGenTex, to convert formal models into HTML natural-language explanations for most models in PCPACK. This feature was very useful for system managers, who usually find it difficult to interpret diagrams whose notation is unfamiliar to them, but find it easier to understand verbal descriptions of the elements and the process.

Andersen Consulting developed an Asset Catalog Tool (ACT) to integrate all documents, models and assets in a common HTML-based environment so that any system developer could browse any information that was relevant. ACT used the Transformations Architecture that we are about to present in the following chapter.

7. INTEGRATION, TRANSFORMATIONS AND TRACEABILITY

All these techniques and tools would be useless without a strategy to convert their results into useful information for object/component CASE tools, towards the design and implementation of the system. From the designer's perspective, it is much better to have a design view of the analysis model in the object CASE tool than to have nothing at all. The need for this type of transformation is in line with the MIKE approach (Decker 1996), especially the "Formalization/Operationalization" and "Design" steps and the use of the DesignKARL language.

We used different object CASE tools, both commercial off-the-shelf (COTS), like Rational ROSE, Paradigm Plus and Ptech, and custom-developed, like ACT (already described) and OTV (Object Technology Visualization), developed by Andersen Consulting to visualize and animate 3-D component models based on the dynamic object behavior.

It was very important for us to define a transformations architecture to convert the PCPACK elements between these tools, accommodating the different methodologies and ontology models (meta-models) they support. Moreover, we wanted to keep traceability, so that the CASE designer could trace any design element back to its source in analysis (in PCPACK), including its documentation.

Figure 8. Process without and with Transformational Traceability.

In order to describe our transformations architecture, we first introduce an existing standard for CASE data exchange: the CASE Data Interchange Format (CDIF), an industry-accepted standard formalism for transferring models between CASE tools. Although we investigated a number of alternatives, including ontologies, CDIF stood out as a strong candidate for the TRP interchange format because of its support in commercial CASE tools. For those tools that did not support CDIF, we built at Andersen Consulting the necessary import-export mechanisms between the tool repositories and CDIF. We did the one for PCPACK ourselves.

The following is a very minimal sample of CDIF code depicting classes, subsystems, attributes, operations and the links between them, using Paradigm Plus (CDIFpp). We defined different "Subject Areas", with different ontology models (Meta-models) for the different tools (in CDIFv, the one for PCPACK, "concept" would be used instead of "class", for example).


(Class "00000002-00000002-31404010"
(Name "lion")
)

(Subsystem "00000002-00000003-31404010"
(Name "animal")
)

(Link "00000002-00000004-31404010"
)

(Subsystem.Contains.Class "00000002-00000004-31404010" "00000002-00000003-31404010" "00000002-00000002-31404010")
(Attribute "00000002-00000120-31404010"
(Name "name")
)

(Class.Attribute "00000002-00000120-31404010" "00000002-00000002-31404010" "00000002-00000120-31404010")
(Operation "00000002-00000070-31404010"
(Name "walk")
)

(Class.Operation "00000002-00000070-31404010" "00000002-00000002-31404010" "00000002-00000070-31404010")


Figure 9. A Sample of CDIF code.

To transform elements from one tool's CDIF representation to another's, we used a KBS developed in CLIPS and integrated in C++. We made use of its definition of "facts" and "rules", which can be "imported" into CLIPS for execution. The following sample illustrates a facts file that would correspond (after our first conversion) to the CDIF code in the previous figure.


(cd Class 2 lion)
(cd Subsystem 3 animal)
(cd Contains 4 animal lion)
(cd Class_Attribute 120 lion name)
(cd Class_Operation 70 lion walk)


Figure 10. A Sample of "Facts".

The following is a sample with a couple of simple translation rules. Since the identifiers are maintained by the rules, traceability is possible.


(defrule cl1
  (cd cluster ?id ?x)
  =>
  (assert (cd Subsystem ?id ?x)))

(defrule c2
  (cd concept ?id ?x)
  =>
  (assert (cd Class ?id ?x)))


Figure 11. A Sample of Translation Rules.
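The CLIPS rule base did the real work in TRP; the following Python sketch (ours, not part of the TRP tooling) only illustrates the principle of the mapping step and the preservation of element identifiers for traceability, following the fact and rule layout of Figures 10 and 11.

# Sketch of the mapping step between two tools' meta-models (cf. Figures 10 and 11):
# PCPACK element types are renamed to Paradigm Plus types, identifiers are preserved.
type_map = {"concept": "Class", "cluster": "Subsystem"}   # simplified rule set

pcpack_facts = [("concept", 2, "lion"),
                ("cluster", 3, "animal")]

target_facts = [(type_map.get(kind, kind), ident, name)
                for (kind, ident, name) in pcpack_facts]

for fact in target_facts:
    print(fact)   # identifiers 2 and 3 survive, so design elements trace back to analysis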

The process of exporting content from one tool to another is supported by a series of transformations. Each transformation may have one or several source tools and one or several target tools (usually 1-to-1). We implemented all those that made sense in our process, including backward transformations: for example, we could populate PCPACK with information from a CASE tool. This means traversing most of the possible paths from top (source) to bottom (target) in the following diagram.

Figure 12. Transformations Architecture

Finally, the following picture illustrates the result of exporting the previous models to one of the object CASE tools, in this case Rational ROSE. This model would be the first step towards the generation of code templates and structure for the business system. In this way we achieved traceability between analysis, design and implementation code.

Figure 13. The resulting Model in Rational ROSE.

8. CONCLUSION

We have presented an approach to component development which makes extensive use of knowledge acquisition tools to capture requirements and domain information. We believe that this approach shows how knowledge engineering techniques can help to solve one of the major problems in software construction, the quality of the content, and illustrates how the combination of component technology and knowledge engineering techniques can improve the current practice of software development.

It is our intention with this article to give some pragmatic perspective to the KAW practice. While it is certain that basic academic research in the domain is needed to improve the theoretical state of the art, it is very important for this community to have a presence in the industrial world and to prove that its solutions can also improve the commercial state of the art. As an example, we have also explored and used within Andersen Consulting some recent approaches in the domain of analysis patterns and design patterns, by Martin Fowler and by Erich Gamma et al. (the "Gang of Four") respectively, and have invited them. Their work is beyond the scope of this article, but it is worth mentioning that they use many concepts familiar to the KAW community and have given them a pragmatic twist that makes them very amenable to the industrial community. We see the cooperative work of Andersen Consulting and Epistemics as a significant step in the same direction. We are aware of other efforts, like the one involving Unilever (Speel 1997). We encourage the KAW community to move in the same direction. It would be sad to see so much valuable effort vanish for lack of commercial output. There is much that the KAW community can do in the commercial software marketplace.

ACKNOWLEDGEMENTS

We would like to thank several members of the Andersen Consulting CBSC initiative, particularly Jeff Mackay, Usha Saxena, Robert Xu, Carles Muntada, Tom Barfield, as well as other participants in the TRP project, notably Bernie Bussiere from Raytheon and Owen Rambow from CoGenTex. We would especially like to thank Epistemics, in particular Steve Swallow, and the different members of the VITAL project, in particular Nigel Shadbolt, Enrico Motta and John Domingue, for their support during both the VITAL and TRP projects.

REFERENCES

Francisco Capella, Luis Montero, Kieron O'Hara, Bernard Le Roux, Manuel Zacklad, Nigel Shadbolt, Philippe Laublet, Mandy Mepham, Gordon Rugg and Sadish Outtandy (1994). Detailed Definition of VITAL Knowledge Analysis Tasks and Steps, according to Methodology Framework. VITAL DD252. 1994.

Stefan Decker, Michael Erdmann, Rudi Studer (1996). A unifying View on Business Process Modeling and Knowledge Engineering. Proceedings of the KAW’96 workshop, 1996.

John Domingue, Enrico Motta, Stuart Watt (1993). The Emerging VITAL workbench. In Knowledge Acquisition for Knowledge-Based Systems, 7th European Knowledge Acquisition Workshop, EKAW'93, pp. 320-339, September 1993.

Epistemics (1995) PCPACK Portable KA Toolkit. User Manual.

Brian Gaines (1988) Knowledge acquisition systems for rapid prototyping of expert systems. INFOR, 26, 256-285.

Brian Gaines and J.H. Boose (1988) Knowledge Acquisition for Knowledge-Based Systems Series, Vol. 1. Academic Press 1988.

Brian Gaines and M. Linster (1990) Development of second generation knowledge acquisition systems. In Current Trends in Knowledge Acquisition, edited by Bob Wielinga et al. IOS Press. 1990.

G. A. Kelly (1955) The Psychology of Personal Constructs. Norton. 1955.

Ivar Jacobson (1993) Object Oriented Software Engineering. Addison Wesley. 1993.

James Martin and James J. Odell (1995) Object Oriented Methods, A Foundation. Prentice Hall. Englewood Cliffs, NJ. 1995.

Luis Montero, Colin T. Scott (1996) The role of Knowledge Acquisition in Component Based System Construction (CBSC), in Proceedings of the Eighth International Conference on Software Engineering and Knowledge Engineering, June 1996.

Luis Montero, Colin T. Scott (1997) Using Knowledge Transformation to Improve the Software Development Process. Submitted to the 9th International Conference of Software Engineering and Knowledge Engineering, Madrid, 1997.

Thomas P. Moran and John M. Carroll (Eds.) (1996) Design Rationale, Concepts, Techniques and Use. Lawrence Erlbaum Associates Inc. Mahwah, NJ, 1996

Enrico Motta, Kieron O'Hara, Nigel Shadbolt (1994) Grounding GDMs: A structured case study. International Journal of Human-Computer Studies, 40(2): 315-347, February 1994.

Balasubramaniam Ramesh (1992) Supporting Systems Development by Capturing Deliberations During Requirements Engineering. In IEEE Transactions on Software Engineering, Vol. 18, No. 6, 1992.

Nigel Shadbolt, Enrico Motta, Alain Rouge (1993) Constructing knowledge-based systems. IEEE Software, 10(6):34-39, November 1993.

Piet-Hein Speel, Manfred Aben (1996) Applying a Library of Problem Solving Methods on a Real-Life Task. Proceedings of the KAW’96 workshop, 1996.