Brian R. Gaines and Mildred L. G. Shaw
Knowledge Science Institute
University of Calgary
Alberta, Canada T2N 1N4
Abstract: WebGrid is a knowledge acquisition and inference server on the World Wide Web that uses an extended repertory grid system for knowledge acquisition, inductive inference for knowledge modeling, and an integrated knowledge-based system shell for inference. This demonstration shows WebGrid modeling a standard dataset for the NASA autolander problem, which illustrates the system's capability for open-class reasoning with incompletely specified cases.
WebGrid and its applications are described in our associated paper (Gaines and Shaw, 1996). WebGrid is a port of our KSS0/RepGrid knowledge acquisition tools to operate as a server on the World Wide Web, allowing a web client on any platform world-wide to be used for knowledge modeling and inference. The system is interesting for a number of reasons:
This article demonstrates some of these features using a dataset that has been widely analyzed in the literature.
Michie (1989) cites, as an example of the successful application of machine learning, the development of a program to advise the pilot of a space shuttle on whether to use its autolander system. He reports that an attempt to develop an algorithm through conventional programming failed after several months of effort, but the use of an inductive modeling package produced a solution very rapidly.
Figure 1 shows the dataset used for induction. It comprises 16 cases characterized by 4 binary attributes and 2 4-valued attributes, leading to a binary decision. The 16 cases are interesting because they involve large numbers of "don't care" values such that they cover all 256 possible situations. Michie notes that the first 15 cases were elicited in the first knowledge acquisition phase, and the 16th case was added specifically to give such full coverage.
Figure 1 NASA autolander dataset (Michie, 1989)
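The coverage arithmetic can be checked with a short sketch. It assumes nothing about the actual attribute names or values in Figure 1: the six attributes are simply numbered, None stands for a "don't care" entry, and the example cases are hypothetical rather than taken from the dataset.

```python
from itertools import product

# 4 binary attributes and 2 four-valued attributes, identified only by position.
DOMAINS = [range(2)] * 4 + [range(4)] * 2

# Every fully specified situation: 2**4 * 4**2 = 256 of them.
SITUATIONS = list(product(*DOMAINS))


def covers(case, situation):
    """A case covers a situation if every specified value agrees with it.

    None marks a "don't care" value and matches anything.
    """
    return all(c is None or c == s for c, s in zip(case, situation))


def coverage(case):
    """Number of fully specified situations covered by one case."""
    return sum(covers(case, s) for s in SITUATIONS)


def uncovered(cases):
    """Situations that no case in the list covers."""
    return [s for s in SITUATIONS if not any(covers(c, s) for c in cases)]


if __name__ == "__main__":
    # Hypothetical case: only the first binary attribute is specified.
    one_attribute = (0, None, None, None, None, None)
    print(len(SITUATIONS), coverage(one_attribute))
```

A case that specifies a single binary attribute covers half of the space, which is how a small number of cases with many "don't care" values can account for every situation. The uncovered function gives a direct check of whether a set of cases achieves full coverage.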
Induction of a minimal decision tree leads to a tree with 15 root nodes, and induction of rules leads to 13 rules with 38 clauses. When Induct is run on this dataset in KSSn, it models it with the EDAG shown in Figure 2, which captures the essential algorithm: use the autolander unless the visibility is yes and one of a number of exception conditions holds.
Figure 2 EDAG produced by Induct from NASA autolander dataset
A rational reconstruction of this dataset has been used to exemplify the operation of WebGrid.
Figure 3 shows the initial screen of WebGrid. The HTML form requests the usual data required to initiate grid elicitation: user name; domain and context; terms for elements and constructs; default rating scale; data types allowed; and a list of initial elements. It also allows the subsequent screens to be customized with an HTML specification of background and text colors, ruler line, and a header and trailer (not shown). The capability to include links to multimedia web data is also used to allow annotations, as text and pictures, to be attached to elements.
Figure 3 WebGrid initial screen
The knowledge engineer has entered the data noted and the names of 6 initial stereotypical cases and clicks on "Done". WebGrid generates the triadic elicitation screen shown in Figure 4 which asks the expert in what way one case differs from the other two. The obvious answer in this case is that one should use the autolander in case 1 but not use it in cases 2 and 4. The expert clicks on the radio button for the case which is different, enters its attribute and the contrasting one for the other two cases, and clicks on "Done".
Figure 4 Construct elicitation from a triad
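The triadic step can be sketched as a small data structure. This is an illustration only: how WebGrid selects a triad and how it stores constructs internally are not described here, so the function name, the dictionary layout, and the pre-filling of the triad's values are all assumptions.

```python
from itertools import combinations

# Six initial elements, named generically (the real case names are in Figure 3).
ELEMENTS = [f"case {i}" for i in range(1, 7)]

# All possible triads of the initial elements: C(6, 3) = 20.
TRIADS = list(combinations(ELEMENTS, 3))


def construct_from_triad(triad, odd_one_out, odd_pole, other_pole):
    """Record a new construct elicited from a triad (hypothetical layout).

    The element singled out by the expert takes one pole and the other two
    take the contrasting pole. All remaining elements start as None, the
    analogue of "?", until they are rated on the next screen.
    """
    if odd_one_out not in triad:
        raise ValueError("the differing element must belong to the triad")
    values = {element: None for element in ELEMENTS}
    for element in triad:
        values[element] = odd_pole if element == odd_one_out else other_pole
    return {"poles": (odd_pole, other_pole), "values": values}


if __name__ == "__main__":
    construct = construct_from_triad(
        ("case 1", "case 2", "case 4"),
        odd_one_out="case 1",
        odd_pole="use autolander",
        other_pole="do not use autolander",
    )
    print(construct["values"])
```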
Note some features of the screen of Figure 4. The horizontal rules and background color have been customized as specified at the bottom of Figure 3. More significantly, there are options at the bottom of the screen to enter categorical data, name the attribute, give it a weight in clustering and a priority in data requests, and specify whether it is an input to inductive modeling or an output to be predicted. WebGrid has generated these options because the knowledge engineer selected the radio button "+Categories" in the initial screen of Figure 3.
If the default of "Ratings" had been left selected then none of the options would have appeared and WebGrid would act as a simple repertory grid tool. Knowledge elicitation can be commenced in this simple mode and changed to offer more data types at any time. This enables the elicitation process to be kept as simple as possible and more features to be added through a graceful upgrade as the expert gains confidence in the use of the tool.
When the expert clicks on "Done" WebGrid generates the screen of Figure 5 which allows all the elements to be rated on the new construct. Note that the name of the construct, the extreme values, the rating scale, and so on, can all be changed. The system is highly non-modal, enabling errors to be corrected and improvements made at any time.
Figure 5 Rating elements on a construct
Elements are rated by using popup menus as shown in Figure 6. The menu provides a natural rating scale replacing the special rating widgets developed for KSS0/RepGrid.
Figure 6 Popup menus used for rating scale entry
The expert can rate an element or leave it as a "?" which is taken as a "don't care" value in later modeling. When the rating is complete he clicks on "Done" and WebGrid generates the main status screen shown in Figure 7 which shows the elements and constructs, allowing them to be selected if required, and offers various context-dependent options.
Figure 7 Main status screen showing elements, constructs and options
To illustrate the entering of categorical data, consider the triadic elicitation screen of Figure 8. The expert has noted that case 5 differs from the others in that the turbulence is out of range whereas for case 6 it is light. He also realizes that medium and strong turbulence may be significant for other cases and decides to enter these as categories, clicking on the "Categories" radio button in the bottom part of the screen. He gives the category the name "turbulence". The primary use of such naming is disambiguation when more than one construct with rating or categorical data has similar names for its values. In the later knowledge modeling the values will be referenced as "turbulence = light" rather than just "light".
Figure 8 Entering a category
The expert has the option to enter all the category values in the list box next to "Categories". If he does not, the values entered in response to the questions will be taken as extremal and the others as interpolated between them. Thus, in the example given, the ordering will be: "out of range", "strong", "medium", "light". Again the system is highly non-modal and more category values can be added later in the elicitation, or categories re-ordered, and WebGrid adjusts existing values so that no data re-entry is required.
When the expert clicks on "Done" WebGrid generates the data entry screen shown in Figure 9 where popup menus are again used to allow elements to be assigned values on constructs, but now from a list of categories rather than a rating scale.
Figure 9 Entering categorical data
Figure 10 shows the upper part of the main status screen when all the cases from Figure 1 have been entered. Note the detailed feedback that WebGrid has generated suggesting the addition of further elements and constructs to break matches. As shown in Figure 7, this status screen also provides the capability of deleting, editing and adding more constructs and elements, annotating elements with multimedia notes that will be displayed in the elicitation process, analysis of the data, saving it, and so on.
Figure 10 Main status screen when all the data from Figure 1 has been entered
Figure 11 shows the output returned when the "FOCUS" button is used to sort the grid to bring similar elements and similar constructs together. The results of analysis are graphed, converted to GIF format and returned to the client where they can be examined and saved if required. Note that the grid itself is a mixture of rating and categorical data, and that the construct clusters show that the use of the autolander is associated with visibility, stability and small errors.
Figure 11 FOCUS clustering of NASA autolander data
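The idea of sorting a grid so that similar rows become adjacent can be illustrated with a minimal sketch. This is not the FOCUS algorithm itself: it uses a simple percentage match between rating vectors and a greedy nearest-neighbour chain, it handles ratings only (not categorical data), and the 1 to 5 scale, function names and example ratings are assumptions made for illustration.

```python
def match(a, b, scale_min=1, scale_max=5):
    """Percentage similarity of two rating vectors on a common scale.

    Positions where either rating is None (a "?" entry) are skipped.
    Returns 0.0 if the vectors have no rated position in common.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return 0.0
    total = sum(abs(x - y) for x, y in pairs)
    return 100.0 * (1 - total / ((scale_max - scale_min) * len(pairs)))


def reorder(names, rows):
    """Greedy chain: start from the first row, then repeatedly append the
    unplaced row that best matches the row placed last."""
    order = [0]
    remaining = set(range(1, len(rows)))
    while remaining:
        last = order[-1]
        best = max(sorted(remaining), key=lambda i: match(rows[last], rows[i]))
        order.append(best)
        remaining.remove(best)
    return [names[i] for i in order]


if __name__ == "__main__":
    # Hypothetical ratings of four elements on three constructs.
    names = ["A", "B", "C", "D"]
    rows = [[1, 1, 5], [5, 5, 1], [1, 2, 5], [5, 4, 1]]
    print(reorder(names, rows))
```

The same match score, applied to pairs of elements or pairs of constructs, is the kind of measure that can flag closely matched pairs for which a further construct or element might be elicited.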
Figure 12 shows the output returned when the "PrinCom" button is used to provide a principal components analysis of the grid by rotating it in vector space to give maximum separation of elements in two dimensions. The results of analysis are graphed, converted to GIF format and returned to the client where they can be examined and saved if required. On the right of the graph it can be seen that the use of the autolander is counter-indicated when the wind is head, there is visibility, errors are large, turbulence is out of range, attitude is negative or the shuttle is unstable.
Figure 12 Principal components clustering of NASA autolander data
The cluster analyses provide the expert with feedback that enables him to check whether the model being developed appears correct, but they do not capture fine details and idiosyncratic exceptions. Inductive modeling provides a more precise account of the logical structure that accounts for the data. Figure 13 shows the rules returned when the "Induct" button is clicked.
Figure 13 Induct modeling of NASA autolander data through rules
When the knowledge engineer selects a factored EDAG in the control panel at the bottom of Figure 13 and clicks on "Induct" to run it again, the EDAG shown in Figure 14 is returned. This is precisely that of Figure 2, taking into account the slightly different vocabulary used.
Figure 14 Induct modeling of NASA autolander data as EDAG
New cases may be tested against the rules by clicking the "Test" button under the list of elements in Figure 10. This results in the data entry screen shown in Figure 15, which allows the attributes of a test case to be entered and the rules used to infer a conclusion. The WebGrid inference engine uses open-class reasoning to make correct inferences with data that has missing values. With no attribute values yet entered, the current inference is that it is open whether to use the autolander or not.
Figure 15 Test case data entry and inference
The user enters data through the popup menus, say that there is visibility, the vehicle is unstable and the wind is medium, and clicks on "Infer". There is enough data to produce a definite conclusion even though some attributes are unspecified, and WebGrid returns the screen of Figure 16 showing that it can be inferred that one should not use the autolander.
Figure 16 Test case data entry and inference
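The behavior in Figures 15 and 16 can be illustrated with a minimal sketch of inference over an incompletely specified case. It is not WebGrid's inference engine: the rule below is a toy default-with-exceptions rule loosely modeled on the description of the EDAG, not the rule set of Figure 13, and the attribute domains, function names and brute-force enumeration of completions are assumptions.

```python
from itertools import product

# Hypothetical attribute domains (a subset of the attributes, for brevity).
DOMAINS = {
    "visibility": ["yes", "no"],
    "stability": ["stable", "unstable"],
    "turbulence": ["light", "medium", "strong", "out of range"],
}


def decide(case):
    """Toy rule for a fully specified case: use the autolander unless there
    is visibility and an exception condition holds."""
    exception = (
        case["stability"] == "unstable" or case["turbulence"] == "out of range"
    )
    if case["visibility"] == "yes" and exception:
        return "do not use autolander"
    return "use autolander"


def infer(partial):
    """Conclusion for a partially specified case.

    Every completion of the missing attributes is tried. If all completions
    lead to the same conclusion it is returned; otherwise the result is
    "open", meaning the data entered so far does not settle the question.
    """
    unknown = [name for name in DOMAINS if name not in partial]
    conclusions = set()
    for values in product(*(DOMAINS[name] for name in unknown)):
        full = dict(partial)
        full.update(zip(unknown, values))
        conclusions.add(decide(full))
    return conclusions.pop() if len(conclusions) == 1 else "open"


if __name__ == "__main__":
    print(infer({}))
    print(infer({"visibility": "yes", "stability": "unstable"}))
```

With no values entered the result is open, as in Figure 15. Once visibility and instability are specified the conclusion is definite even though turbulence is still unknown, as in Figure 16. Enumerating completions is practical here only because the space of situations is small.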
The expert can continue to adjust the test case data and run inference until he is either satisfied that the system is correct or he finds a case for which the inference is incorrect. He can then correct the conclusion and click on the "Add" button to enter the data as a new case. When Induct is run again it will generate a new model that takes account of the additional case and, if possible, will then be correct on the existing cases together with the new one. Thus, knowledge acquisition can be integrated with performance.
This demonstration has shown how WebGrid provides an interactive knowledge modeling system through the Internet. The main paper (Gaines and Shaw, 1996) gives more details of related systems and of other capabilities in WebGrid.
Financial assistance for this work has been made available by the Natural Sciences and Engineering Research Council of Canada.
WebGrid can be accessed at http://tiger.cpsc.ucalgary.ca/WebGrid/
Related papers on WebGrid can be accessed through http://ksi.cpsc.ucalgary.ca/articles/
Gaines, B.R. and Shaw, M.L.G. (1996). A networked, open architecture knowledge management system. Gaines, B.R. and Musen, M.A., Eds. Proceedings of Tenth Knowledge Acquisition Workshop.
Michie, D. (1989). Problems of computer-aided concept formation. Quinlan, J.R., Ed. Applications of Expert Systems Volume 2. pp.310-333. Sydney, Addison-Wesley.