Re: <!-- comments -->

Joe English (joe@trystero.art.com)
Fri, 20 Jan 95 19:43:10 EST

mag@ncsa.uiuc.edu (Tom Magliery) wrote:

> Forgive me for injecting a 2.0 question into the morass... Consider the
> following:
>
> <!------------------------------------------------------------>

Ah, my favorite subject :-)

> My left brain (or is it my right brain?) does not want this to be a valid
> comment, but I can not dispute it wielding the HTML 2.0 spec. Who wins?
> Brain or spec?

The current (19941128) HTML 2.0 spec is incorrect with respect to SGML.
In section 2.6.5, "Comments", it reads:

    After the comment delimiter, all text up to the next
    occurrence of --> is ignored.

It should say "... up to the next occurrence of -- is ignored."
Comments are terminated by COM (--), not COM MDC (-->).
This would still be slightly different from the ISO 8879 definition
(which allows further comments to follow the first inside the same
declaration), but it would clear up the ambiguity.
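
To make the difference concrete, here is a rough Python sketch that
finds where the first comment ends under each reading. The function
names and the sample string are my own illustration, not anything
taken from the spec or from ISO 8879:

```python
def end_by_spec_wording(decl: str) -> int:
    """Index just past the first '-->' after the opening '<!--'
    (the reading suggested by the HTML 2.0 draft wording)."""
    start = decl.index("<!--") + 4
    return decl.index("-->", start) + 3


def end_by_sgml(decl: str) -> int:
    """Index just past the first '--' after the opening '<!--'
    (the SGML reading: COM alone closes the comment)."""
    start = decl.index("<!--") + 4
    return decl.index("--", start) + 2


sample = "<!-- a -- b -->"
print(repr(sample[:end_by_spec_wording(sample)]))  # '<!-- a -- b -->'
print(repr(sample[:end_by_sgml(sample)]))          # '<!-- a --'
```

Under the SGML reading the comment is already closed after "a", so
the " b " that follows sits outside any comment and the declaration
is in error, even though the draft wording would accept it.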

My brain doesn't want it to be a valid comment either, but
this one happens to be. There are (unless I've miscounted)
60 hyphens, which tokenize as 30 COM delimiters, which parse
as 15 (empty) comments, which is a valid comment declaration.

Add or remove one, two, or three hyphens, and it would be invalid.
Add or remove four, and it's valid again, since each empty comment
uses exactly four hyphens (one COM to open it, one to close it).
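
The counting argument can be checked mechanically. What follows is a
simplified sketch in Python, not a real SGML parser: it assumes a
comment declaration is "<!", then optionally a comment followed by
any mix of whitespace and further comments, then ">", where each
comment is "--", text, "--". The function name is hypothetical.

```python
def is_valid_comment_declaration(decl: str) -> bool:
    """Simplified check: '<!' (comment (s | comment)*)? '>'."""
    if not decl.startswith("<!"):
        return False
    i = 2
    seen_comment = False
    while i < len(decl):
        if decl.startswith("--", i):
            # COM opens a comment; it runs to the next COM.
            close = decl.find("--", i + 2)
            if close == -1:
                return False  # comment never closed
            i = close + 2
            seen_comment = True
        elif decl[i] == ">":
            # MDC must be the last character of the declaration.
            return i == len(decl) - 1
        elif seen_comment and decl[i] in " \t\r\n":
            i += 1  # whitespace is allowed between comments
        else:
            return False  # stray text outside any comment
    return False  # ran off the end without seeing '>'


valid = [n for n in range(56, 65)
         if is_valid_comment_declaration("<!" + "-" * n + ">")]
print(valid)  # [56, 60, 64]
```

Only the multiples of four survive, which matches the count above.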

Needless to say, some browsers are likely to choke on it
(though 'sgmls' says it's OK).

A sensible alternative is:

    <!-- ======================================================== -->

This is not HTML-specific; it applies to any SGML application.

See <URL:gopher://ftp.ifi.uio.no/00/pub/SGML/productions>,
productions 91, 92, 50, and 5 for most of the gory details.
(Even *that*'s incomplete; see also _The SGML Handbook_,
pp. 359-361, for the list of recognition modes, which explains
why what looks like an ambiguity in the grammar really isn't.)

I think that about covers it...

--Joe English

joe@trystero.art.com