Hierarchy support in HTML [Was: Tables: what can go in a cell]

Daniel W. Connolly (connolly@hal.com)
Wed, 8 Feb 95 12:19:07 EST

In message <01HMRR3JV3TQ8Y5MKR@oax-2mr.mr.ornl.gov>, James D Mason writes:
> Although current
>HTML is nonhierarchical, I believe that it will need to evolve towards
>hierarchical models if it is to be able to deal with scientific and technical
>information, among other things. (In other words, I'd like to see <body>
>enforced as a container and would go even further to replace <h1>, <h2>, etc.
>with container elements that consist of a title followed by data that might
>include lower-level container element.)

Could you elaborate on this? I'd like to see:

* a statement of the problem as you see it

* a proposed solution

* a demonstration that this solution is globally cost-effective
without being locally prohibitive (i.e. the sum of all the
effort of deploying this solution is less than the cost of
dealing with the problem with existing technology, and yet
no one party bears too much of the burden. For example, if
you require every information provider to do something, it
had better be minimal.)

* a discussion of graceful deployment and interoperability issues.

Granted, the situation as it stands is not optimal. I wish that
everybody who ever stuck an HTML page on a web server had validated it
first, from day 1, and that if they had found the DTD restrictive, they
would have tweaked it and discussed their tweaks with other folks.

[By the way: I do this all the time. I use CVS to manage my web, and
every time I commit a change, I use CVS hooks to validate my HTML
files. So nothing gets into my public web without being validated. I
hope to make this sort of toolset widely available someday... let me
know if you want to help.]
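
For the curious, here is a minimal sketch of what such a commit check
might look like. It is not my actual toolset. It assumes the check is
run as a CVS commitinfo program, which receives the repository
directory followed by the names of the files being committed, and
whose non-zero exit status rejects the commit. The names commit_check
and is_valid are made up for illustration; is_valid stands in for
whatever SGML validator you have on hand.

```python
import sys

def commit_check(argv, is_valid):
    """Return 0 to allow the commit, 1 to reject it.

    argv mimics the arguments handed to a commitinfo program: the
    repository directory, then the names of the files being committed.
    is_valid is a hypothetical callable: path -> bool.
    """
    _repository, *filenames = argv
    # Only HTML documents are checked; other files pass through untouched.
    failures = [name for name in filenames
                if name.lower().endswith((".html", ".htm"))
                and not is_valid(name)]
    for name in failures:
        print(f"not committing: {name} failed validation", file=sys.stderr)
    return 1 if failures else 0
```

A wrapper script would then end with something like
sys.exit(commit_check(sys.argv[1:], my_validator)), where my_validator
is again a placeholder for a real validator.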

But here we are. The question is: what steps can we take to improve
the situation without breaking things?

One nifty thing to do would be to add support in the popular servers
for validating documents. For example: some sort of "sgml cop daemon"
could walk through the tree of documents, validating them, and
changing them to be not world-readable if they don't pass.
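
A rough sketch of the walking-and-hiding part of such a daemon,
assuming a Unix file system where "not world-readable" means clearing
the read bit for others. The name police_tree and the is_valid
callable are hypothetical; the validator itself is left pluggable
rather than tied to any particular SGML parser.

```python
import os
import stat

def police_tree(root, is_valid, suffixes=(".html", ".htm")):
    """Walk root and hide every HTML document that fails is_valid.

    is_valid is a hypothetical callable: path -> bool.
    Returns the list of paths whose world-readable bit was cleared.
    """
    hidden = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.lower().endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            if is_valid(path):
                continue
            # Clear only the "read by others" bit; leave the rest alone.
            mode = stat.S_IMODE(os.stat(path).st_mode)
            os.chmod(path, mode & ~stat.S_IROTH)
            hidden.append(path)
    return hidden
```

Run periodically (from cron, say), this leaves valid documents alone
and quietly withdraws invalid ones until their owner fixes them. A
real daemon would also want to restore the bit once a document passes
again, which this sketch does not do.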

Now why would anybody do that?

Suppose they're O'Reilly or Digital, or somebody who wants to maintain
a certain "house style". They could tweak the DTD locally so that
documents must conform to the HTML 2.0 spec, plus house style rules
(every page has an <address> either at the beginning or the end of the
body... every page has at least one <link> tag...)
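
To make those two example rules concrete, here is a sketch that
checks them procedurally. Note that this is a stand-in, not the DTD
tweak itself: it uses Python's html.parser rather than an SGML parser,
and it assumes end tags are written out explicitly (an unclosed <p>
will confuse its notion of what sits directly inside <body>). The
names HouseStyleChecker and house_style_problems are invented for
illustration.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they are not pushed on the stack.
VOID = {"base", "br", "hr", "img", "input", "isindex", "link", "meta"}

class HouseStyleChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []          # currently open elements
        self.body_children = []  # elements directly inside <body>, in order
        self.link_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "link":
            self.link_count += 1
        if self.stack and self.stack[-1] == "body":
            self.body_children.append(tag)
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open element, if there is one.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

def house_style_problems(html_text):
    """Return a list of house-style violations (empty if none)."""
    checker = HouseStyleChecker()
    checker.feed(html_text)
    checker.close()
    problems = []
    kids = checker.body_children
    if not kids or "address" not in (kids[0], kids[-1]):
        problems.append("no <address> at the start or end of <body>")
    if checker.link_count < 1:
        problems.append("no <link> element")
    return problems
```

Expressing the same rules in a locally modified DTD has the advantage
that any validating SGML parser enforces them for free; the sketch
only shows what the rules amount to.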

Another idea is to enhance browsers to support a "validating mode" so
that while folks are previewing their documents, they can validate
them as SGML. I hope that sort of thing gets deployed in my lifetime,
but I can't make a good case that it's cost-effective right now.

Plug: the QuarterDeck HTML authoring add-on to Word for Windows 6.0
does exactly that: validates your document when you save it. I hope
more authoring tools support this sort of thing.