Tim Menzies
Artificial Intelligence Department
School of Computer Science and Engineering
The University of NSW
tim@menzies.com
http://www.cse.unsw.edu.au/~timm
February 20, 1998
Visual programming systems have several advantages: (1) they are very motivating for beginners; (2) a spatial representation simplifies certain limited kinds of inferencing; (3) the use of ill-structured diagrams may assist in brain-storming. However, these three benefits may not be widely applicable. Many software engineering and knowledge engineering problems are not inherently spatial. Also, most VP tools do not support ill-structured diagrams. Lastly, diagrams are not necessarily superior explanation tools. In many cases, studies claiming certain benefits with visual systems can be matched by a counter-study with the opposite results. Clearly, some variable is not being controlled for within these opposing studies. It is possible that the task of a system is more important than its presentation (visual or textual).
Many knowledge acquisition systems use some sort of visual presentation. Kremer argues convincingly that such visual languages have numerous advantages for knowledge acquisition (KA) [Kremer, 1998]. Other researchers claim numerous benefits for visual frameworks. For example:
When we use visual expressions as a means of communication, there is no need to learn computer-specific concepts beforehand, resulting in a friendly computing environment which enables immediate access to computers even for computer non-specialists who pursue application [Hirakawa & Ichikawa, 1994].
The case that pictures assist in explaining complicated knowledge seems intuitively obvious. But is it correct? Other widely held, intuitively obvious beliefs have been found to be incorrect, sometimes spectacularly so:
This article takes a critical look at the available evidence on the efficacy of visual programming (VP) systems. After an introduction to VP, we review theoretical studies and small-scale experimental studies that suggest an inherent utility in visual expressions. However, when we explore the wider experimental evidence, we find numerous contradictory results. This exploration extends my previous arguments in this area [Menzies, 1996].
As a rough rule-of-thumb, a visual programming system is a computer system whose execution can be specified without scripting, except for entering unstructured strings such as "Monash University Banking Society" or simple expressions such as "X above 7". Visual representations have been used for many years (e.g. Venn diagrams) and even centuries (e.g. maps). Executable visual representations, however, have only arisen with the advent of the computer. With falling hardware costs, it has become feasible to build and interactively manipulate intricate visual expressions on the screen.
More precisely, a non-visual language is a one-dimensional stream of characters while a VP system uses at least two dimensions to represent its constructs [Brown & Kimura, 1994]. We distinguish between a pure VP system and a visually supported system:
Many authors argue that VP systems are a better method for users to interact with a program. Green et al. [Green et al., 1991] and Moher et al. [Moher et al., 1993] summarise claims such as the above quote from [Hirakawa & Ichikawa, 1994] as the superlativist position; i.e. graphical representations are inherently superior to textual representations. Both the Green and Moher groups argue that this claim is not supported by the available experimental evidence. Further, they argue against claims that visual expressions offer a higher information accessibility; for example:
Pictures are superior to texts in a sense that they are abstract, instantly comprehensible, and universal. [Hirakawa & Ichikawa, 1994]
My own experience is that students find visual environments very motivating. Others have had the same experience:
The authors report on the first in a series of experiments designed to test the effectiveness of visual programming for instruction in subject-matter concepts. Their general approach is to have the students construct models using icons and then execute these models. In this case, they used a series of visual labs for computer architecture. The test subjects were undergraduate computer science majors. The experimental group performed the visual labs; the control group did not. The experimental group showed a positive increase in attitude toward instructional labs and a positive correlation between attitude towards labs and test performance. [Williams et al., 1993]
For another example of first-year students being motivated by a VP language, see [Glinert & Tanimoto, 1984] (p18-19). However, merely motivating the students is only half the task of an educator. Educators also need to train students in general concepts that can be applied in different circumstances. The crucial test when evaluating VP systems is whether they improve or simplify the task of comprehending some conceptual aspect of a program. If we extend the concept of VP systems to diagrammatic reasoning in general, then we can make a case that VP has some such benefits. Larkin and Simon [Larkin & Simon, 1987] distinguish between:
A common internal representation for a VP system is one that preserves physical spatial relationships. For example, Narayanan et al. [Narayanan et al., 1995] use Glasgow's array representation [Glasgow et al., 1995] to reason about device behaviors. In an array representation, physical objects are mapped into a 2-D grid. Adjacency and containment of objects can be inferred directly from such a representation. Inference engines can then be augmented with diagrammatic reasoning operators which execute over the array (e.g. boundary following, rotation).
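To make the idea of reading inferences directly off a grid concrete, here is a minimal sketch of an array-style representation. It is not the representation of [Glasgow et al., 1995] or the system of [Narayanan et al., 1995]; the grid contents, the function names, and the bounding-box definition of containment are all assumptions made for illustration.

```python
# Hypothetical occupancy grid: each cell names the object occupying it, or None.
GRID = [
    ["tank", "tank",  "tank", None,   None],
    ["tank", "valve", "tank", "pipe", "pipe"],
    ["tank", "tank",  "tank", None,   None],
]

def cells_of(grid, name):
    """Return the set of (row, col) cells occupied by `name`."""
    return {(r, c)
            for r, row in enumerate(grid)
            for c, cell in enumerate(row)
            if cell == name}

def adjacent(grid, a, b):
    """True if some cell of `a` shares an edge with some cell of `b`."""
    b_cells = cells_of(grid, b)
    for r, c in cells_of(grid, a):
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if (r + dr, c + dc) in b_cells:
                return True
    return False

def bounding_box(cells):
    """Return (min_row, min_col, max_row, max_col) of a non-empty cell set."""
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    return min(rows), min(cols), max(rows), max(cols)

def contains(grid, outer, inner):
    """Assumed definition: inner's bounding box lies strictly inside outer's."""
    outer_cells, inner_cells = cells_of(grid, outer), cells_of(grid, inner)
    if not outer_cells or not inner_cells:
        return False
    o_r0, o_c0, o_r1, o_c1 = bounding_box(outer_cells)
    i_r0, i_c0, i_r1, i_c1 = bounding_box(inner_cells)
    return o_r0 < i_r0 and o_c0 < i_c0 and i_r1 < o_r1 and i_c1 < o_c1

def rotate_cw(grid):
    """A simple diagrammatic operator: rotate the grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]
```

With this grid, `adjacent(GRID, "tank", "pipe")` and `contains(GRID, "tank", "valve")` both hold, and both still hold on `rotate_cw(GRID)`. Neither answer requires any rule beyond inspecting neighbouring cells, which is the sense in which such inferences are cheap.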
Other authors have argued that diagrams are useful for more than just spatial reasoning. Koedinger [Koedinger, 1992] argued that diagrams can support and optimise reasoning since they can model whole-part relations. Kindfield [Kindfield, 1992] studied how diagram use changes with expertise level. According to Kindfield, diagrams are like a temporary swap space which we can use to store concepts that (1) don't fit into our head right now and (2) can be swapped in rapidly; i.e. with a single glance. Goel [Goel, 1992] studied the use of ill-structured diagrams at various phases of the process of design. In a well-structured diagram (e.g. a picture of a chess board), each visual element clearly denotes one thing of one class only. In an ill-structured diagram (e.g. an impressionistic charcoal sketch), the denotation and type of each visual element is ambiguous. In the Goel study, subjects explored
One gets the feeling that all the work is being done internally and recorded after the fact, presumably because the external symbol system (MacDraw) cannot support such operations. [Goel, 1992]
Goel found that ill-structured tools generated more design variants (i.e. more drawings, more ideas, more use of old ideas) than well-structured tools. We draw two conclusions from Goel's work. Firstly, at least for preliminary design, ill-structured tools are better. Secondly, after the brain-storming process is over, well-structured tools can be used to finalise the design.
It is not clear which of the above advantages apply to general software or knowledge engineering. Many software engineering or knowledge engineering problems are not naturally two-dimensional. For example, while we write down an entity-relationship diagram on the plane of a piece of paper, the inferences we can draw from that diagram are not dependent on the physical position of (e.g.) an entity.
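The entity-relationship point can be illustrated with a small sketch. The entity names, the two layouts, and the `reachable` helper are hypothetical; the only claim is that an inference over the relationships returns the same answer however the entities are positioned on the page.

```python
# Hypothetical ER diagram: relationships are pairs of entity names.
RELATIONSHIPS = [("Customer", "Order"), ("Order", "Product")]

# Two different drawings of the same diagram (entity -> (x, y) on the page).
LAYOUT_A = {"Customer": (0, 0), "Order": (5, 0), "Product": (10, 0)}
LAYOUT_B = {"Customer": (9, 3), "Order": (1, 7), "Product": (4, 4)}

def reachable(relationships, start):
    """Entities linked to `start` by a chain of relationships."""
    seen = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for a, b in relationships:
            # Treat each relationship as traversable in both directions.
            for src, dst in ((a, b), (b, a)):
                if src == current and dst not in seen:
                    seen.add(dst)
                    frontier.append(dst)
    return seen

def infer(layout, relationships, start):
    """The layout is accepted but deliberately never consulted."""
    return reachable(relationships, start)
```

Here `infer(LAYOUT_A, RELATIONSHIPS, "Customer")` and `infer(LAYOUT_B, RELATIONSHIPS, "Customer")` return the same set, in contrast to the grid-based reasoning above, where the answer depends on position.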
In terms of the ill-structured/well-structured division, the VP tools I have seen in the KA field are all well-structured tools. That is, they are less suited to brain-storming than to producing the final product.
Jarvenpaa and Dickson (hereafter, JD) report an interesting pattern in the VP literature [Jarvenpaa & Dickson, 1988]. In their literature review on the use of graphics for supporting decision making, they find that most of the proponents of graphics have never tested their claims. Further, when those tests are performed, the results are contradictory and inconclusive. For example:
Similar contradictory results can be found in the study of control-flow and data-flow systems.
Given these conflicting results, all that we can conclude at this time is that the utility of control-flow or data-flow visual expressions is an open issue.
In other studies, the Green group explored two issues: superlativism and information accessibility (defined above). Subjects attempted some comprehension task using both visual expressions and textual expressions of a language. The Green group rejected the superlativism hypothesis when they found that tasks took longer using the graphical expressions than the textual expressions. The Green group also rejected the information accessibility hypothesis when they found that novices had more trouble reading the information in their visual expressions than experts. That is, the information in a diagram is not instantly comprehensible and universal. Rather, such information can only be accessed after a training process.
The Moher group performed a similar study to the Green group. In part, the Moher study used the same stimulus programs and question text as the Green group. Whereas the Green group used the LABVIEW data-flow system, the Moher group used Petri nets. The results of the Moher group echoed the results of the Green group. Subjects were shown three variants on a basic Petri net formalism. In no instance did these graphical languages outperform their textual counterparts.
The Moher group caution against making an alternative superlativism claim for text; i.e. text is better than graphics. Both the Moher and Green groups distinguished between sequential programming expressions such as a decision tree and circumstantial programming expressions such as a backward-chaining production rule. Both sequential and circumstantial programs can be expressed textually and graphically. The Moher group comments that:
Not only is no single representation best for all kinds of programs, no single representation is ... best for all tasks involving the same program. [Moher et al., 1993]
Sequential programs are useful for reasoning forwards to perform tasks such as prediction. Circumstantial programs are output-indexed; i.e. the thing you want to achieve is accessible separately from the method of achieving it. Hence, they are best used for hypothesis-driven tasks such as debugging.
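The sequential/circumstantial distinction can be sketched in a few lines. This example is not drawn from the Green or Moher stimulus programs; the fault names, the `RULES` table, and both functions are assumptions, and the backward chainer does not guard against cyclic rules.

```python
# Sequential form: a decision tree, read forwards from inputs to a prediction.
def predict(high_temperature, low_pressure):
    if high_temperature:
        if low_pressure:
            return "leak"
        return "overheat"
    return "normal"

# Circumstantial form: production rules indexed by their conclusion (output).
# Each conclusion maps to a list of alternative bodies (conjunctions of goals).
RULES = {
    "leak": [["high_temperature", "low_pressure"]],
    "overheat": [["high_temperature", "normal_pressure"]],
}

def prove(goal, facts, rules):
    """Backward chaining: start from the hypothesis and look for support."""
    if goal in facts:
        return True
    for body in rules.get(goal, []):
        if all(prove(subgoal, facts, rules) for subgoal in body):
            return True
    return False
```

To predict, `predict(True, True)` walks the tree from the observations down to a conclusion. To test a hypothesis, `prove("leak", {"high_temperature", "low_pressure"}, RULES)` looks the conclusion up directly in `RULES` and works back to the observations, which is the access pattern a debugging task needs.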
The core of the case for VP is something like "VP lets us explain the inner workings of a system at a glance". This section explores the issue of VP and explanation using the BALSA system.
In the BALSA animator system [Brown & Sedgewick, 1985], students can (e.g.) contrast the various sorting algorithms by watching them in action. Note that animation is more than just tracing the execution of a program. Animators aim to explain the inner workings of a program. Extra explanatory constructs may be needed on top of the programming primitives of that system. For example, when BALSA animates different sorting routines, special visualisations are offered for arrays of numbers and the relative sizes of adjacent entries.
Animators like BALSA may or may not be pure VP systems. BALSA does not allow the user to modify the specification of the animation. To do so requires extensive textual authoring by the developer. BALSA therefore does not satisfy Rule 2 of a pure VP system (defined above).
One drawback with the BALSA system is that its explanations must be hand-crafted for each task. General principles for explanation systems are widely discussed in AI. Wick and Thompson [Wick & Thompson, 1992] report that the current view of explanation is more elaborate than merely printing the rules that fired or answering the "how" and "why" queries of traditional rule-based expert systems. Explanation is now viewed as an inference procedure in its own right rather than a pretty-print of some filtered trace of the proof tree. In the current view, explanations should be customised to the user and the task at hand. For example:
Summarising the work of Wick and Thompson, Leake, and Paris, I diagnose the reason for the lack of generality in BALSA's explanation system as follows. BALSA's explanation systems were hard to maintain since BALSA lacked:
On the positive side, visual systems are more motivating for beginners than textual systems. In the case of spatial reasoning problems, a picture may indeed be worth 10,000 words [Larkin & Simon, 1987]. Given some 2-D representation of a problem (e.g. an array representation), spatial reasoning can make certain inferences very cheaply. Also, ill-structured diagramming tools are a very useful tool for brainstorming ideas.
On the negative side, beyond the above three specific claims, the general superlativist case for VP is not very strong. Many software engineering and knowledge engineering problems are not inherently spatial. Most of the VP systems I am aware of do not support Goel's ill-structured approach to brainstorming. The JD research suggests that claims of the efficacy of VP systems have been poorly documented. The Moher and Green groups argue that VP evaluations cannot be made in isolation from the task of the system being studied. Lastly, a diagram may not necessarily support information accessibility for knowledge. A good explanation device requires far more than impressive graphics (recall the BALSA case study). Like many of our current approaches for knowledge engineering [Menzies, 1997b, Menzies et al., 1997, Menzies, 1997a], VP systems need to be better evaluated.