Re: Hierarchy support in HTML [Was: Tables: what can go in a cell]

James D Mason (MASONJD@oax.a1.ornl.gov)
Fri, 17 Feb 95 17:35:11 EST

Last week I posted a comment in the thread about what should be allowed in
table cells in which I suggested that HTML might need to move towards a
more hierarchical structure. My posting brought forth a response from Dan
Connolly. I regret that I have been late in responding to him, but I was at
an ISO meeting and was short of time. His posting also deserved careful
consideration, and I wanted to discuss issues with several other people. I
hope this message is not too late to be of use.
(Warning: this is a LONG posting.)

Summary
Mature documentation systems tend to recognize the hierarchical
structure of many documents. This recognition benefits the user directly by
simplifying manipulation of structure, leading to the creation of tighter,
better-built documents. It also produces benefits through automation of
processes that track structure, such as section numbering and creation of
lists. The costs are in increased planning up front, slightly more complex
processing, and the need to support validation.

Dan raises some good questions in response to my
comments about hierarchy in HTML. I'm not sure I know enough of the things
that other people do with the WWW to speak for other environments besides
the technical ones in which I do publications development and support. I
certainly haven't had the time to think out all the impacts of hierarchical
changes (I used to edit Environmental Impact Statements, and I fear the Web
would have moved on and left us all if I did an EIS on my suggestions).

Dan ends with the observation that files should be sent through a
validator before being put up on a server. That's really a side issue, but
it's one with which I am in full agreement. The Department of Energy's
central site (http://apollo.osti.gov/home.html) has a policy of validating
anything that comes from their internal sources. We try to do likewise and
also to show some sort of a corporate style, though it's hard to do when
half the staff of a large institution seems sometimes to be trying to have
individual home pages. Many things we put up will, we hope, be machine
generated (e.g., documents from our own complex DTD translated to the HTML
DTD by our parser) and so will be valid without human intervention.
Nonetheless, I believe establishing validation servers would be a good
step. We have our own and also recommend Dan's. I think, furthermore, that
as the Web community matures, more people will begin to use validating
editors (along the model of SoftQuad's HoTMetaL) to prepare their
documents. Editors of that sort will also be almost a necessity if the
HTML DTD becomes more complex.

Now to return to the original point, hierarchy. How much
to have and how rigidly to enforce it are subjects of considerable
discussion in most SGML environments, not just among HTML folks. In fifteen
years of working with computerized technical publication systems, I have
observed that as systems mature they tend to become more hierarchical. What
differs among them is mostly a matter of how many objects they can track in
the hierarchy and how rigidly they enforce it. In the HTML world we're
already into that concern since the question at hand is "what can go in a
table cell".

Dan asks for a statement of the issue as I see it.

I see the issue of hierarchy as a reflection of different concepts of
(1) structure and (2) tagging. It isn't intrinsically an SGML issue. It
could be implemented in all sorts of documentation environments. But more
important is how it reflects ideas about writers and writing. Systems
dividing documents into units (front matter, body, back matter, then the
division of a body into chapters, sections, subsections, etc., with their
associated headings) reflect a commonly understood hierarchy within
documents. Other examples come to mind; tables are frequently
polydimensional hierarchies, and mathematical formulae can be seen as
recursive hierarchies. What I had in mind when I made my original posting,
however, was heading systems.

Headings Without Hierarchy
The model I see for the structural tags in HTML (particularly <h1>,
<h2>, etc.) is style-sheet-driven word processors and desktop-publishing
packages. These programs have typically grown from nonhierarchical sources
that did not use style sheets. When style sheets are implemented in such
programs, users can create any sorts of styles/tags they want and use them
in any order. A program like Ventura, for example, that tags only
paragraph-level objects, has no way of enforcing hierarchy: the human user
becomes as much of a validator/enforcer as whim (or, better, experience,
wisdom, and planning skill) can dictate. Some programs (e.g., Microsoft
Word) that also include outlining capability have a limited means of
enforcing a hierarchy that runs in parallel to their style sheets. That is,
Word's outliner knows only about the existence of predefined style names
for headings, not about any other styles that the user may define or the
textual units that may come between the headings, and its ability to move the
text associated with a heading and the text following it has no effect on
structures therein (it can't, for example, renumber figures or tables).
The effect is that the headings are like a structured overlay on the
rest of the text. Headings in all these systems are, like those in HTML,
simply styled paragraphs. There are no containing elements that link
headings with the text they govern, unless those containers are in the mind
of the user. In a model like that of HTML, tags are independent labels, and
the computer is only marginally involved in managing the structure of the
text. Most of the work is left to the human operator.
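To illustrate the point about independent labels, here is a small made-up
HTML instance (the headings and text are invented for the example). Nothing
in a flat content model objects to the skipped level, and no element records
which paragraphs belong to which heading:

```sgml
<!-- Flat HTML: headings are sibling labels, not containers. -->
<h1>Plant Operations</h1>
<p>Overview of the plant.</p>
<h3>Startup</h3>      <!-- skips h2; a flat model does not object -->
<p>Startup steps.</p>
<h2>Shutdown</h2>
<p>Which heading governs this paragraph is known only to the writer.</p>
```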
(There are places where both HTML and the old macro
systems do enforce containment, such as lists, which must have start and
end tags as well as tags on the items. But this sense of containment is
almost orthogonal to the heading systems. The Bell Labs troff-mm system
lacked proper containment but did reset all list counters both as levels of
nesting were popped and whenever new sections intervened.)

Headings With Hierarchy
In contrast to these systems that leave use/abuse of headings to the
user's discretion, many SGML applications have both containment and
enforcement of hierarchy. Here is a fragment from a Department of Energy
DTD:

<!ELEMENT sect1 - O (title , ((p+ , (sect2 , sect2+)?) | (sect2,
sect2+)))>

<!ELEMENT sect2 - O (title , ((p+ , (sect3 , sect3+)?) | (sect3,
sect3+)))>

(And so on, down about six levels. There's actually more
in the content models, too, but it's irrelevant to this discussion.)
This DTD requires that a section of level n have a title followed by
either sections of level n+1 or text followed by sections of level n+1.
Furthermore, there must be at least two n+1 sections. It's rigid, but it
emulates the rules of outlining that those of us who used to be writing
teachers labored to get our students to apply.
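As a sketch of what that rule accepts and rejects, consider two invented
instances. The section titles and text are made up, and the example assumes
that <title> and <p> hold plain character data and are closed explicitly;
the real DTD's declarations for those elements are not shown in this posting.

```sgml
<!-- Accepted: a title, text, then two level-2 sections. -->
<sect1><title>Plant Operations</title>
<p>Overview of the plant.</p>
<sect2><title>Startup</title>
<p>Startup steps.</p>
</sect2>
<sect2><title>Shutdown</title>
<p>Shutdown steps.</p>
</sect2>
</sect1>

<!-- Rejected: one level-2 section where the model demands two or more. -->
<sect1><title>Plant Operations</title>
<p>Overview of the plant.</p>
<sect2><title>Startup</title>
<p>Startup steps.</p>
</sect2>
</sect1>
```

A validating parser should report the second instance at the point where
<sect1> ends without a second <sect2>.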
This view of hierarchy and tagging reflects a view of a document as an
organic whole. It expects the writer to plan the document as a whole and to
strive for consistency. What constraints may be built into the DTD are more
to reinforce the writer's planning than to restrict the writer's
creativity.
Aside from enforcing what the Freshman English teacher
often wasn't persuasive enough to do, there are often good reasons for
enforcing structure through the DTD. As someone said to me at lunch
recently, there are times when someone may die if the documentation isn't
right (many of you have flown on aircraft whose onboard documentation is in
SGML-based hypertext systems--think about it). Managing structure is often
an integral part of making sure all the pieces are there and in the right
order.
Compared to HTML the fragment above may seem like a rigid model. In
practice, however, it's moderate. I'm currently working on a DTD that
doesn't just enforce containment and hierarchy: it actually enforces by
name the presence of specific sections in a specific order. In contrast to
that rigid enforcement, companion "loose" DTDs are often used for authoring
and other conditions when documents are in incomplete or unstable
conditions. Such a DTD would allow a model of simply optional n+1 sections
rather than the two-or-more enforcement.
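A loose companion to the fragment above might look something like the
following. This is only a hypothetical sketch of the idea, not a declaration
from any actual DOE DTD: text and subsections both become optional, so a
section holding nothing but a title is acceptable while a draft is unfinished.

```sgml
<!-- Hypothetical "loose" authoring variant of the fragment above. -->
<!ELEMENT sect1 - O (title, p*, sect2*)>
<!ELEMENT sect2 - O (title, p*, sect3*)>
```

Containment is still enforced (a sect3 cannot appear directly inside a
sect1); only the two-or-more rule and the required content are relaxed.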
The reason for tracking hierarchy often isn't the urge
to control. It may be to automate tasks reliably. Not only can sections be
numbered automatically, but so can figures, tables, footnotes, and other
objects. Running heads are generated automatically (that's not just a
print-medium function: I can see similar things being done with HTML's
current <title> tag). Lists can be nested with predictable results.
Hyperlinks can be inserted in a consistent manner (we do extensive linking,
with both the "name" and the "href" attribute values generated
algorithmically, when we generate HTML from our own hierarchical DTD
applications).

Dan asked for a proposed solution.

I'm not so foolish as to propose the highly constrained model above
for HTML. Solutions for HTML will need more input from more environments
than just technical publishing. Above all, I'm leery of suggesting single
solutions.
When the DoD started the CALS initiative, some of their managers (not
hands-on documentation people) decided that DoD would have *ONE* DTD for
all their documentation, whether the subject was operating manuals,
maintenance manuals, inventory records, or procurement contracts. A lot of
us with experience in SGML and publications put up a howl, but it took
getting burned by (and having to pay for the consequences of) that edict to
get DoD to recognize that they need lots of DTDs for lots of applications.
People have many different expectations of the WWW. Some
of us want to use it to disseminate research articles or corporate
operating procedures (just two things on my plate). Others may just want to
advertise their neighborhood deli. Still others will try to create
avant-garde multimedia art. Will one DTD fit all of them? I think not.
HTML as we know it today is like The Electric Pencil and other
products I remember from my days of writing on an Osborne 1. It's simple.
Just about anyone can learn to do some things. The fact that it works at
all is so remarkable that people either ignore its limitations or become
very creative in working through and around them. So long as we have just
one DTD and it remains simple, we'll have that situation in which a lot of
people can use it and a number of them will be simultaneously creative and
frustrated. But if we try to jump from The Electric Pencil directly to
WordPerfect 6.2, as it were, we're going to have all sorts of side effects.
Not only will we make mistakes, even the things we do right won't satisfy
everyone. (I looked forward to Word's automated list numbering until it
came out; now I have to do all sorts of things to defeat it. Nonetheless, I
wouldn't go back to the tools I started with in computing.)

So rather than suggest that we scrap the current nonhierarchical
application and shove something like my complex pet DTD in its place, I
suggest that we need to move in the direction of supporting multiple DTDs.

Furthermore, I think that we need to consider two
parallel approaches to multiple DTDs. One path is for there to be shared
DTDs, developed and agreed to in public like the current DTD. The other
path is for support for user-defined DTDs (DTDs specific to a particular
application, to a particular user community, etc.)
On the first path, we should try to do "most things for most people".
We should maintain a simple, nonhierarchical DTD, perhaps a slightly
expanded version of the present one, as a frozen beginner's tool.
Rather than keep packing more features into the simple DTD until it
becomes unusable and unmaintainable, we should also provide at least one
hierarchical DTD with a set of features analogous to the kinds of objects
typically packaged in the sample applications that come with word
processors (sectioning, lists, paragraphs, special functions like address
blocks and preformatted blocks). We should make the elements as structural
as possible (make what we need to provide in the direction of formatting
through attributes). Hierarchy should be loose (e.g., jumping from a
section at level n to one at n+2 should be prohibited, but there should not
be a requirement that there be at least two level-n+1 sections in a level-n
section). Perhaps we should add to it simple tables and subscripts and
superscripts (but not built-up equations). I believe that this extension
should be hierarchical because of the benefits the increased level of
structure can bring. Thus we can still have the ease of use of HTML as we
know it plus an extension for the many people who need something more, but
not a full-blown publishing system.
I recognize that the current DTD for HTML 3.0
perpetuates the current heading tags and lack of hierarchical structure.
If, however, we choose to establish two (or more) base DTDs, rather than
continuing to build "one size fits all", that part of HTML 3.0 could be
changed without negating the other work thereon. (Our own DTDs are modular:
we can change the sectioning, the lists, or the tables, for example,
without breaking the rest of the DTD. I recommend this, along with the
marked-section facility, for maximizing flexibility.)
Furthermore, if markup minimization is properly managed,
the tagging burden will not increase for the user (of course, using
structured, DTD-sensitive editors will greatly reduce the burden, too).
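To make the shape of such a shared hierarchical DTD concrete, here is a
minimal sketch. Every name in it (the %tables; switch, the %block; entity,
the <h> heading element, and sect1 through sect3) is hypothetical and comes
from no HTML draft; the block-level elements it mentions are assumed to be
declared elsewhere in the DTD. It shows loose hierarchy, a parameter entity
that keeps the block-level content modular, a marked section used as a
switch, and omissible end tags on the section containers.

```sgml
<!-- Hypothetical sketch; none of these declarations comes from an HTML draft. -->

<!-- Switch: change IGNORE to INCLUDE to admit simple tables as blocks. -->
<!ENTITY % tables "IGNORE">
<![ %tables; [
<!ENTITY % block "p | ul | ol | address | pre | table">
]]>
<!-- Default; if the marked section above was included, its earlier
     declaration of the same entity takes precedence over this one. -->
<!ENTITY % block "p | ul | ol | address | pre">

<!-- Loose hierarchy: subsections are optional, but a sect1 can hold
     only sect2 children, so a jump from level 1 to level 3 is invalid. -->
<!ELEMENT sect1 - O (h, (%block;)*, sect2*)>
<!ELEMENT sect2 - O (h, (%block;)*, sect3*)>
<!ELEMENT sect3 - O (h, (%block;)*)>
<!ELEMENT h     - - (#PCDATA)>
```

Because the section end tags are omissible ("- O"), a writer types one start
tag per section and the parser closes the previous section when the next one
opens. The sectioning can also be swapped out without touching whatever
%block; expands to.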
The second path may already be on us with the appearance
of extension browsers like SoftQuad's Panorama. I know of quite a few
people who are waiting for that product to get out of beta so that they can
deliver their own documents, done to their own DTDs, without conversion and
use HTML only for their home pages.
We should leave complex tables (vertical and horizontal spanning of
headings, inserted headings, footnotes within tables) and mathematics and
chemistry to these people. Although there are some well-known table systems
(e.g., CALS) and equation notations (e.g., ISO 12083, ISO/IEC TR 9573-11),
none of them suits all users. A DTD does not a whole publications solution
make, on the WWW any more than on paper. There will need to be means for
communicating the semantics of tags to the receiving party. In most cases
this will involve communicating style sheets for presentation (in the case
of tables and equations most people have so confused presentation and
semantics that the two are not easily separable). SoftQuad has recognized
this need in Panorama: both DTD and style sheet need to accompany a
document. Now that DSSSL is near to completion, we may have a standardized
means for interchanging presentation semantics.
(DSSSL passed its second ballot as a Draft International Standard
overwhelmingly. It now appears that there will be convergence of some of
the shared components of DSSSL and HyTime, perhaps as soon as early summer.
DSSSL-Lite, sometimes mentioned as a stylesheet tool for HTML, will benefit
from the convergence.)
What should be avoided in this development is the proliferation of
proprietary extensions to HTML by individual providers of browsers.

Another issue to be considered in evaluating DTDs as
part of a solution is how we view the thing called HTML itself. For many
people currently using the WWW, HTML is the application for original
creation of documents.
Except for some control structures like home pages, that
is not the case for others of us. The actual documents my group does are in
much more complex applications of SGML. HTML is for us an output language,
used much in the same way we use PostScript. Our HTML documents are
generated entirely by machine. We fully enforce and use hierarchy in our
source documents. In our environment, the richness of HTML, rather than the
presence or absence of hierarchy, is of greatest importance. We reflect part
of our hierarchy by converting a single document in our rich application to
a tree of linked HTML documents. Thus, for example, <title> on the
titlepage is mapped to both HTML's <title> and to an <h1> element on our
top-level document that links to chapter files, <title> inside <sect1> may
be mapped to <h1>, <title> inside <sect2> to <h2>, etc. in chapter and
subchapter documents. The containing elements, having served their function
to manage original structure, simply disappear in HTML. Implementing
hierarchy in HTML might simplify our mapping process for some elements, but
we would be unlikely to create more documents directly in the HTML
application. In other cases, we might use task-specific tagging in our
local DTD but translate to quite different structures in HTML-seen-as-
output-language, such as mapping labeled paragraphs into a two-column
table. We can use a very complex SGML application in a controlled
environment, with structured editors and staff trained in our tagging
procedures. I wouldn't expect the general user to work that way.
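The mapping described above can be pictured with a small invented example.
The section content and the "name" values (s1, s1.1, s1.2) are hypothetical;
the actual naming algorithm is not given in this posting. The first half is
a source fragment in a hierarchical application, the second half is the kind
of flat HTML a translator might emit for a chapter file.

```sgml
<!-- Source, in the local hierarchical application -->
<sect1><title>Startup</title>
<p>Startup steps.</p>
<sect2><title>Preconditions</title>
<p>Checks before startup.</p>
</sect2>
<sect2><title>Sequence</title>
<p>Order of operations.</p>
</sect2>
</sect1>

<!-- Generated HTML: containers are gone, only heading levels remain -->
<h1><a name="s1">Startup</a></h1>
<p>Startup steps.</p>
<h2><a name="s1.1">Preconditions</a></h2>
<p>Checks before startup.</p>
<h2><a name="s1.2">Sequence</a></h2>
<p>Order of operations.</p>
```

The depth of each <title> in the source decides the heading level in the
output, and the anchor names give other generated files something stable to
point their "href" values at.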
However, I notice an increasing number of writers for
whom HTML is the language of choice for original creation. It is for them
that I believe that hierarchy offers the greatest potential use.
Implementing hierarchy would, for example, enable outliner-like functions
in HTML editors. It would encourage better-built documents. It would also
simplify the writing of DSSSL-based stylesheets. Hierarchical structure
also enhances searching with tools that are able to support it (e.g.,
retrieve a chapter-level unit whose title element contains "whales"). For
these more-than-casual users, hierarchy and containment should offer more
benefits than they cause additional complications.

Having said that, I can come to Dan's other questions, whether such a
solution is cost effective and whether the impacts are manageable.

I can't speak for global cost effectiveness (I'm in publishing, not
software marketing). A large part of the explosion of the Web seems to have
been a result of cheap software. Home users, students, even practitioners
of "big science" can try it out without making a big investment in
software. I'm sure that some people will drop out if either the cost of the
software or the cost of learning becomes too great. (In the more general
SGML world we hear the same sorts of arguments. It helps to be able to
point to EMACS, James Clark's parsers, PERL, and TeX as "free" software for
building a system. Of course such a system requires a healthy investment in
learning, but that's another story.)
But there are a number of companies betting that some customers will
be willing to pay for Web tools. I think they will find buyers. Our
operating contractor in Oak Ridge has invested in both an SGML-based
information-management system and a commercial WWW user agent. We're
willing to pay for better performance and support.
Freezing a simple DTD should obviously be cost
effective. We're almost at the point where that could be done. Browsers
could maintain a compatibility mode (think how almost all word processors
can both read and write something of an ASCII sort or even old versions of
WordStar). All that a browser writer would need to add to a current product
to support it would be a means for switching modes. The benefits for new
users or users who need to do only the things that are currently possible
should be self evident.
Building a second, publicly shared DTD that all browsers would also
support might simultaneously avoid breaking the current design, satisfy the
needs of many users who aren't satisfied with its capabilities, and stave
off proprietary extensions. I favor software vendors' competing on
performance or usability. Extending the DTD outside the public review
process or trying to use extensions to shut out competitors seems utterly
foreign to the spirit of the WWW. While I recognize vendors' need to
differentiate themselves, I take the part of the user. I'm in favor of
reaching as many users with as much flexibility in documents as possible.
Thus I favor a small number of widely supported DTDs.
Adding hierarchy to a DTD would mean that the user would need to
validate documents before posting them. For the impoverished user we could
hope for a continuation of current public validation services. But there
should also be market opportunities for vendors of validating editors. (I
should remark that it is _not_ necessary for browsers to support full
validation so long as they handle errors gracefully. SoftQuad's Panorama
does not validate, though it reports some SGML errors. Validation means
much more on the side of the originator.)
The burden of supporting individualized DTDs can appropriately fall on
those who need them. The greatest expense is in developing the DTDs and
supporting entities like DSSSL specifications. A secondary expense is
creating the documents. But those who need complexity are often ready to
meet the challenge: think of the many mathematicians who still learn TeX,
in spite of the equation capabilities of popular software, because only TeX
meets their whole need. These users still face the need for having wide
distribution for open-ended browsers. Nonetheless there may be a business
model for this, and not just in SoftQuad's offering of Panorama. Adobe was
already doing something similar with Acrobat viewers. In both cases, the
vendor seems to expect to recover its expenses by selling software for
creating documents, rather than for viewing them.

This has been perhaps too long a response, but I think
these issues need to be considered as we think about the long-term future
of HTML and its descendants.

Dr. James D. Mason
(ISO/IEC JTC1/SC18/WG8 Convenor)
Oak Ridge National Laboratory
Information Management Services
Bldg. 2506, M.S. 6302, P.O. Box 2008
Oak Ridge, TN 37831-6302 U.S.A.
Telephone: +1 615 574-6973
Facsimile: +1 615 574-6983
Network: masonjd @ ornl.gov