Re: Concerns about HTML+ complexity (example)

fox@pt0204.pto.ford.com (Ken Fox)
Errors-To: listmaster@www0.cern.ch
Date: Fri, 17 Jun 1994 20:49:24 +0200
Message-id: <9406171842.AA20932@pt0204.pto.ford.com>
Reply-To: fox@pt0204.pto.ford.com
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: fox@pt0204.pto.ford.com (Ken Fox)
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: Re: Concerns about HTML+ complexity (example)
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Type: text
X-Mailer: ELM [version 2.4 PL23]

I think it would be helpful to demonstrate what I'm after in HTML+.  Here's
a *very* simple HTML example:

------
<ul>
<li> first
<li> second
<li> third
</ul>
------

Looks pretty harmless, doesn't it?  Now as a user I might want to configure
things.  The browser may let me change the fonts, sizes, colors, bullet
shapes, etc. that are used to render the list.  The author doesn't really
get to do anything except control the order and the fact that it should
not be numbered.

I think that this is quite a lot of functionality.  HTML looks pretty good.
Individual browsers may not allow as much control as what I mentioned, but
this is pretty easy to implement and there are quite a number of browsers
that I can choose from.  Chances are *somebody* will do something that I
like.

There are alternatives to the HTML example though:

------
<vbox style="list">
<hbox> <img style="bullet"> <p style="item"> first </p> </hbox>
<hbox> <img style="bullet"> <p style="item"> second </p> </hbox>
<hbox> <img style="bullet"> <p style="item"> third </p> </hbox>
</vbox>
------

I know this is ugly!  Please bear with me... ;-)

It has an advantage over the first example in that it is building a list
structure out of more general elements.  There are three types of geometry
managers (or packers) in this example:  the vertical box manager (packs
boxes into a column), the horizontal box manager (packs boxes into a row),
and the paragraph box manager (packs boxes into a paragraph shape).  There
are only two visual elements:  inline image and text.  Everything uses the
attribute database via the style attribute.  This provides information the
browser (or a rendering agent) uses to figure out how to "draw" an element.
(Where "draw" might mean how wide the margin should be, how loud the volume
is, whether footnotes are popped up or scrolled to, etc.)

(Aside:  I would also add one more geometry manager:  the plain box, which
simply puts its children wherever they want to go.  This manager would
allow overlays for example.)
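
A minimal sketch of the vertical and horizontal box managers described
above, in Python.  Everything here is an assumption made for illustration:
the class names (Leaf, HBox, VBox), the fixed sizes, and the rule of 8 units
of width per character are all hypothetical.  The paragraph manager is left
out because line breaking is the genuinely hard part.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Leaf:
    """A visual element (text run or inline image) with a fixed size."""
    name: str
    width: int
    height: int

    def size(self) -> Tuple[int, int]:
        return self.width, self.height

    def place(self, x: int, y: int) -> List[Tuple[str, int, int]]:
        # A leaf simply reports where its top-left corner ends up.
        return [(self.name, x, y)]


@dataclass
class HBox:
    """Horizontal box manager: packs children left to right into a row."""
    children: list = field(default_factory=list)

    def size(self) -> Tuple[int, int]:
        sizes = [c.size() for c in self.children]
        return sum(w for w, _ in sizes), max((h for _, h in sizes), default=0)

    def place(self, x: int, y: int) -> List[Tuple[str, int, int]]:
        out = []
        for child in self.children:
            out += child.place(x, y)
            x += child.size()[0]  # advance by the child's width
        return out


@dataclass
class VBox:
    """Vertical box manager: packs children top to bottom into a column."""
    children: list = field(default_factory=list)

    def size(self) -> Tuple[int, int]:
        sizes = [c.size() for c in self.children]
        return max((w for w, _ in sizes), default=0), sum(h for _, h in sizes)

    def place(self, x: int, y: int) -> List[Tuple[str, int, int]]:
        out = []
        for child in self.children:
            out += child.place(x, y)
            y += child.size()[1]  # advance by the child's height
        return out


def list_item(text: str) -> HBox:
    # Made-up metrics: a 10x12 bullet image, 8 units of width per character.
    return HBox([Leaf("bullet", 10, 12), Leaf(text, 8 * len(text), 12)])


listing = VBox([list_item("first"), list_item("second"), list_item("third")])
for name, x, y in listing.place(0, 0):
    print(name, x, y)
```

Note that neither manager knows it is drawing a list.  The list is only a
column of rows, which is why text flow, tables, and so on could reuse the
same two classes.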

A lot of things just "work."  For example, since paragraph text can flow
around any box, and the list is in a box, text could flow around the list.
There's no need for special cases in the layout code:  everything is some
kind of box arranged with a geometry manager.  Tables become part of the
general case:  they are just <grin> collections of boxes with horizontal and
vertical rules between them.

I would be the first to admit that there must be sophisticated layout
algorithms (especially for the paragraph manager) to handle this
architecture.  However, once good sample layout algorithms are designed, this
should not be a problem, since additional rendering and layout algorithms
can be easily integrated.  The layout algorithm, much like the data transfer
algorithms in libWWW, should be fairly easy to provide in library form.

It's extremely extensible.  Each visual element (possibly each geometry
manager) is associated with a rendering agent.  This agent may be internal
or external.  The agent is identified via HTML element and/or MIME type.  A
simple browser might not render a box if it doesn't know how.  An advanced
browser might ask the user questions on how to render an unknown type ---
e.g. ask the user whether they want to download another rendering agent.
The document might even provide hints as to where they can be found (maybe
through the existing SGML mechanisms).
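
A rough sketch of how agents might be looked up, in Python.  The
AgentRegistry class, its method names, and the sample keys are hypothetical,
and real agents would draw something rather than return strings.  This
models only the "simple browser" behaviour of skipping a box it cannot
render.

```python
from typing import Callable, Dict, Optional

# An agent takes the element content and returns something displayable.
Agent = Callable[[str], str]


class AgentRegistry:
    """Maps an HTML element name or a MIME type to a rendering agent."""

    def __init__(self) -> None:
        self._agents: Dict[str, Agent] = {}

    def register(self, key: str, agent: Agent) -> None:
        self._agents[key] = agent

    def render(self, key: str, content: str) -> Optional[str]:
        agent = self._agents.get(key)
        if agent is None:
            # Simple browser: no agent known, so the box is not rendered.
            return None
        return agent(content)


registry = AgentRegistry()
registry.register("p", lambda text: text.strip())
registry.register("image/gif", lambda name: f"[image: {name}]")

print(registry.render("p", "  first  "))
print(registry.render("image/gif", "bullet"))
print(registry.render("video/mpeg", "clip"))
```

An advanced browser would replace the "return None" branch with a prompt to
the user, or with a fetch of an external agent.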

The attribute database can be customized by the user or the author.  Authors
provide style tags in the text and a default attribute database.  The
default attribute database is merged in with the user's database.  This
gives the capability of "style sheets" --- but is more general since the
attribute database can specify a lot more than just style.
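
A small sketch of the merge step, in Python.  The message does not say which
side wins on a conflict, so this assumes the user's entries override the
author's defaults.  The function name, the style names, and the attribute
values are all invented for the example.

```python
from typing import Any, Dict

AttributeDB = Dict[str, Dict[str, Any]]


def merge_attributes(author_defaults: AttributeDB,
                     user_db: AttributeDB) -> AttributeDB:
    """Merge the author's default database with the user's database.

    Assumption: on a conflict the user's value wins.  Neither input is
    modified.
    """
    merged = {style: dict(attrs) for style, attrs in author_defaults.items()}
    for style, attrs in user_db.items():
        merged.setdefault(style, {}).update(attrs)
    return merged


author = {
    "bullet": {"shape": "disc", "color": "black"},
    "item": {"font": "times", "size": 12},
}
user = {
    "item": {"size": 14},     # the user prefers larger text
    "list": {"margin": 20},   # a style the author never mentioned
}

merged = merge_attributes(author, user)
print(merged["item"])
```

Because the values are free-form, the same lookup could carry non-visual
settings (volume, footnote behaviour) as well as fonts and colors.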

Now to address the ugly part:

There should be a simple macro language to do the conversion:

<ul> { <vbox style="list"> }
</ul> { </vbox> }
<li> text'itemTokens { <hbox> <img style="bullet"> <p> text </p> </hbox> }

Ignore the syntax; I don't really care what it looks like.  This example is
Bertrand'ish.  It would be reasonable for it to be m4'ish or prolog'ish or
Tcl'ish.  For this to be easy to implement, we will need to define what the
terminal tokens (and types) are.  If documents are able to introduce
arbitrary new types (like itemTokens from above) browsers will have a much
tougher time...  I know that SGML has some capability in this area, but I
don't know how complete it actually is.  Ideally, we should be able to test
out syntax changes for document authoring without ever changing a Web
browser.  For instance, I can see where a *lot* of table proposals could be
experimented with.
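
To make the conversion concrete, here is a minimal sketch in Python that
rewrites the simple <ul> example into the box form shown earlier.  It is not
a macro language and not a general parser: it assumes one tag per line,
exactly as in the example, and the expand() function is hypothetical.

```python
def expand(html: str) -> str:
    """Rewrite <ul>/<li> lines into vbox/hbox form, one line at a time."""
    out = []
    for line in html.splitlines():
        stripped = line.strip()
        if stripped == "<ul>":
            out.append('<vbox style="list">')
        elif stripped == "</ul>":
            out.append("</vbox>")
        elif stripped.startswith("<li>"):
            # Everything after the tag is taken as the item text.
            text = stripped[len("<li>"):].strip()
            out.append(
                f'<hbox> <img style="bullet"> <p style="item"> {text} </p> </hbox>'
            )
        else:
            out.append(line)  # anything else passes through untouched
    return "\n".join(out)


source = "<ul>\n<li> first\n<li> second\n<li> third\n</ul>"
print(expand(source))
```

Since the rewrite happens before layout, a new list or table syntax would
only need a new set of rules, and the browser's box code would stay as it is.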

I know there are going to be complaints about how little "structure" there
is in this proposed direction for HTML.  My argument to this is that there
shouldn't *be* a lot of structure for a viewer.  There should only be
"viewing" structure.  If you want additional structure, use a DTD suitable
for the subject (e.g. a legal contract DTD) and then convert it to HTML for
viewing.

I apologize if these ideas have come up before.  I figured if they had, I
would have seen some (however slight) influence on HTML+.

One last thing:  I have to agree with Marc Andreessen on the thought of
standardizing existing practice before charging ahead with a new and
radically different HTML.  I guess this means that HTML+ == what Mosaic
handles.  Although, maybe the paragraph tags should be fixed anyway? :-)
Perhaps I'm arguing about HTML 2? 3?

- Ken

-- 
Ken Fox, fox@pt0204.pto.ford.com, (313)59-44794
-------------------------------------------------------------------------
Ford Motor Company, Powertrain    | "Is this some sort of trick question
CAD/CAM/CAE Process Integration   |  or what?" -- Calvin
AP Environment Section            |