on change bar support in HTML+

Lou Burnard <lou@vax.ox.ac.uk>
Sender: lou@vax.ox.ac.uk
Date: Mon, 01 Nov 1993 11:26:06 +0000
From: Lou Burnard <lou@vax.ox.ac.uk>
To: WWW-TALK@nxoc01.cern.ch
Cc: lou@vax.ox.ac.uk
Message-id: <00974E32.5ADD6860.31228@vax.ox.ac.uk>
Subject: on change bar support in HTML+
The new spec for HTML+ (which is wonderful in many ways) proposes a
CHANGED element to mark both the beginning and end of that part of a
document which might be side-lined or red-lined or otherwise marked as
changed. It also proposes two (different) elements for passages which
are marked as having been deleted or added, as in legal texts (it says)
when, for example, deleted text may actually be printed with strike-through
characters.

I am very happy with the idea of supporting these facilities in HTML+
but I have a few comments on the mechanisms proposed.

Of the three tags proposed, only CHANGED will allow you to deal easily
with a case where a change starts in one paragraph and ends in another.
(When this was touched on in earlier discussion here, I think someone
said flatly that as far as he/she was concerned, a correction that
spanned two paragraphs was two corrections, so what was the beef?)
Leaving aside, for the moment, the mechanism proposed to do this, I
think it's more than likely that passages to be marked as deleted or
added will span paragraphs in exactly the same way as passages to be
marked as CHANGED. Is one supposed to nest multiple occurrences of them
within a CHANGED ... CHANGED element-pair when that happens?

Secondly, one very obvious application for such elements might be for
simple version control. A browser could be instructed to display or
implement only changes relating to a particular version, for example, if
you added a VERSION attribute (or similar) on CHANGED. Maybe that's
already been proposed and rejected, but I missed the reasoning (one
objection is that it won't scale up very gracefully).
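
To make the version-control idea concrete, here is a minimal sketch of
how a browser might pick out only the changes belonging to one version.
Everything in it is an assumption for illustration: the VERSION
attribute is only the suggestion made above, not part of the HTML+
spec, and the function name changed_spans is hypothetical. It also
assumes that markers for a given version come in start/end pairs.

```python
import re

# Hypothetical syntax: <changed version=N> marks both ends of a change.
CHANGED = re.compile(r"<changed\s+version=(\w+)>", re.IGNORECASE)

def changed_spans(text, version):
    """Return the text between each pair of CHANGED markers for `version`."""
    marks = [m for m in CHANGED.finditer(text) if m.group(1) == version]
    spans = []
    # Markers are assumed to alternate: start, end, start, end, ...
    for opener, closer in zip(marks[0::2], marks[1::2]):
        spans.append(text[opener.end():closer.start()])
    return spans

sample = ("Intro. <changed version=2>New clause.<changed version=2> Middle. "
          "<changed version=3>Later edit.<changed version=3> End.")
print(changed_spans(sample, "2"))
print(changed_spans(sample, "3"))
```

Each version needs its own set of marker pairs, which hints at why the
approach may not scale gracefully once many versions overlap.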


Now, as to the mechanism... Let's call the thing you want to mark a ZONE
(whether it's a zone of "stuff you wish to show as having been changed
in some unspecified way" or "stuff you wish the browser to mark as
having been deleted/added"). Current proposals are either

(a) mark the start and end of the zone with the same empty tag. The tag
marks a transition point. On one side you are outside the zone, on the
other you are inside. Only by a sequential scan of the context can a
browser tell where you are. If you do a hyper-leap into the middle of
the zone you won't have the faintest idea that you're in it.

(b) a variant on the above in which you mark the start of the zone with
a ZONE_START and the end with a ZONE_END tag. Again, if you land in the
middle you're in trouble, but now at least when you find a ZONE_* tag
you know which direction to look to find out what's happening. I'm not
sure whether this is worth the extra confusion.
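
The difference between (a) and (b) can be sketched as follows. This is
only an illustration under stated assumptions: the tag spellings
<zone>, <zone_start> and <zone_end> and both function names are
hypothetical, zones are assumed not to nest, and the position is
assumed not to fall inside a tag. With (a) the answer depends on
counting every marker from the top of the document; with (b) the
nearest preceding marker is enough.

```python
import re

TOGGLE = re.compile(r"<zone>", re.IGNORECASE)

def inside_toggle(text, pos):
    """Scheme (a): count all markers before pos; an odd count means inside."""
    return len(TOGGLE.findall(text[:pos])) % 2 == 1

def inside_paired(text, pos):
    """Scheme (b): look backwards for the nearest start or end marker."""
    lowered = text.lower()
    last_start = lowered.rfind("<zone_start>", 0, pos)
    last_end = lowered.rfind("<zone_end>", 0, pos)
    # rfind returns -1 when nothing is found, so "no markers" means outside.
    return last_start > last_end

text_a = "before <zone>changed text<zone> after"
text_b = "before <zone_start>changed text<zone_end> after"
print(inside_toggle(text_a, text_a.index("changed")))
print(inside_paired(text_b, text_b.index("changed")))
```

Neither function helps a browser that arrives in mid-zone without
scanning some of the surrounding text first.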

In either case, to find the "other" one, you can use the SGML id/idref
mechanism. Each ZONE tag has an ID attribute, the value of which must be
unique within the document, and either a STARTS or an ENDS attribute,
the value of which is the same as the ID value on its 'partner'. Thus:

.... <zone id=z1 ends=z2> 
         this is inside a zone 
     <zone id=z2 starts=z1> this is outside  

The value for whichever of starts/ends is not specified is understood to
be identical with the value for ID. 
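
A rough sketch of how an application might pair up zone tags under
this rule is given below. It is not an SGML parser: it uses regular
expressions on the raw text, and the function names parse_zone_tags
and resolve_zones are made up for the example. It applies the default
described above (an unspecified starts or ends takes the tag's own ID)
and checks that every referenced ID exists.

```python
import re

ZONE_TAG = re.compile(r"<zone\s+([^>]*)>", re.IGNORECASE)
ATTR = re.compile(r"(\w+)\s*=\s*\"?([\w.-]+)\"?")

def parse_zone_tags(text):
    """Collect zone tags, filling in the default for starts/ends."""
    tags = []
    for m in ZONE_TAG.finditer(text):
        attrs = {k.lower(): v for k, v in ATTR.findall(m.group(1))}
        zid = attrs["id"]  # ID is required; a missing one raises KeyError
        tags.append({
            "id": zid,
            "starts": attrs.get("starts", zid),  # default: the tag's own ID
            "ends": attrs.get("ends", zid),      # default: the tag's own ID
        })
    return tags

def resolve_zones(tags):
    """Return the distinct (start ID, end ID) pairs, checking references."""
    ids = [t["id"] for t in tags]
    if len(set(ids)) != len(ids):
        raise ValueError("zone IDs must be unique within the document")
    zones = set()
    for t in tags:
        for ref in (t["starts"], t["ends"]):
            if ref not in ids:
                raise ValueError(f"zone {t['id']} refers to unknown ID {ref}")
        zones.add((t["starts"], t["ends"]))
    return sorted(zones)

sample = ("<zone id=z1 ends=z2> this is inside a zone "
          "<zone id=z2 starts=z1> this is outside")
print(resolve_zones(parse_zone_tags(sample)))
```

With the default applied, both tags of a pair resolve to the same
(starts, ends) value, so a zone can be identified by that pair alone.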

It should be stressed that this is all rather flaky from the SGML point
of view. An SGML parser won't help you beyond checking that there does
exist somewhere a zone with the ID you supply on a starts/ends
attribute. An SGML application won't recognize the stuff within
a zone as a meaningful element.

Lou Burnard