Joe English
Thu, 6 Jul 95 14:23:59 EDT

Eric Bina wrote:
> Joe English says:
> > It would suffice to ignore the content of unrecognized
> > elements (i.e., those not in HTML 2.0) until it can be
> > determined that the <BODY> element has started.
> [...]
> If I understand this proposal, it fails for new non-content head tags.
> In the above example you assume that even though the browser doesn't know
> <STYLE>, it knows to match it to </STYLE> and ignore content. Suppose
> you have:
> <!doctype html PUBLIC "-//IETF//DTD HTML Experimental//">
> <html>
> <head>
> <title>blah</title>
> <newtag>blah...
> <!-- <newtag> unknown; how do you know how far ahead to look for
> </newtag> before giving up and inferring </HEAD><BODY>
> to make "blah..." the start of the body of the document? -->

Following the suggested heuristic, the browser would have
to assume that <NEWTAG> has content, and ignore the "blah...",
even though a parser which understood "DTD HTML Experimental"
would infer "</HEAD><BODY>" once it saw "blah...".

This is another case where authors who use new features
(in this case <!ELEMENT NEWTAG - O EMPTY>) will need
to explicitly include </HEAD> and/or <BODY> if they
want to ensure compatibility with level 2 browsers.
(I would be in favor of making <HEAD>, </HEAD>, <BODY>, and </BODY>
mandatory in HTML 2.1 for this and other reasons. They're
currently omissible only for compatibility with level 1
documents, which -- I assume -- will still be parsed according
to the 2.0 spec since they lack a <!DOCTYPE> declaration.
I would also be in favor of mandating the <!DOCTYPE...> declaration.)
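
To make the heuristic concrete, here is a minimal Python sketch of how
a level 2 browser might apply it. This is only an illustration, not any
real browser's parser: the element sets, the TOKEN pattern, and the
body_text() function are hypothetical names invented for the example,
and the element sets are a tiny assumed subset of HTML 2.0.

```python
import re

# Assumed (hypothetical) subsets of the HTML 2.0 element names.
KNOWN_HEAD = {"html", "head", "title"}
KNOWN_BODY = {"body", "p", "h1", "em"}

# Comments, other markup declarations, tags, or runs of character data.
TOKEN = re.compile(
    r"<!--.*?-->|<![^>]*>|<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>|([^<]+)",
    re.S,
)

def body_text(doc):
    """Return the text a level 2 browser would treat as body content."""
    in_body = False    # has the BODY element definitely started?
    in_title = False   # inside <TITLE>...</TITLE>
    skipping = None    # unrecognized element whose content is ignored
    out = []
    for m in TOKEN.finditer(doc):
        closing, name, text = m.groups()
        if name is not None:
            name = name.lower()
            if in_body:
                continue                      # drop markup, keep text
            if name in KNOWN_BODY or (name == "head" and closing):
                in_body, skipping = True, None
            elif name == "title":
                in_title = not closing
            elif name not in KNOWN_HEAD:
                if not closing:
                    skipping = name           # assume it has content
                elif name == skipping:
                    skipping = None           # matching end-tag seen
        elif text is not None:
            if in_body:
                out.append(text)
            elif in_title or skipping or not text.strip():
                continue                      # not body content
            else:
                in_body = True                # data implies </HEAD><BODY>
                out.append(text)
    return "".join(out).strip()

HEAD_PART = ('<!doctype html PUBLIC "-//IETF//DTD HTML Experimental//">\n'
             "<html>\n<head>\n<title>blah</title>\n")
DOC_IMPLICIT = HEAD_PART + "<newtag>blah...\n<p>real body text</p>\n"
DOC_EXPLICIT = HEAD_PART + ("<newtag>\n</head>\n<body>\nblah...\n"
                            "<p>real body text</p>\n")
```

With these inputs, body_text(DOC_IMPLICIT) loses the "blah..." text
because the unknown <newtag> is assumed to have content, while
body_text(DOC_EXPLICIT) keeps it because the explicit </head> settles
where the body starts.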

This brings up another issue: future elements with empty content
or omissible end-tags. The "ignore unrecognized tags" rule
works for most current browsers, since they don't generally
examine the element hierarchy, but it may interact badly
with stylesheet mechanisms, structured queries, or other
applications where the nesting level is important, because
an application that ignores an unknown start-tag cannot tell
where that element ends.

This suggests another guideline for future enhancements:

All new block-level and phrase-level elements must have
content, and must not allow end-tag omission.

The TABLE element conforms to this guideline, for example;
even though <TR>, <TH>, and <TD> *do* allow end-tags to
be omitted, the outer-level TABLE element does not,
so applications that don't yet grok tables can still
re-synchronize when they see the </TABLE> end-tag.
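
The re-synchronization step can be sketched in a few lines of Python.
Again this is a hedged illustration only: skip_unknown() is a
hypothetical helper, and it assumes the unknown element is not nested
inside another element of the same name.

```python
import re

def skip_unknown(doc, name):
    """Drop each <name>...</name> span, whatever tags appear inside it."""
    # Non-greedy match from the start-tag to the first matching end-tag.
    # Only the outer end-tag is relied on, so omitted inner end-tags
    # (such as </TR> or </TD>) do not matter.
    pattern = re.compile(
        r"<%s\b[^>]*>.*?</%s\s*>" % (re.escape(name), re.escape(name)),
        re.I | re.S,
    )
    return pattern.sub("", doc)

SAMPLE = "<p>before<table><tr><td>a<td>b<tr><td>c</table><p>after"
RESYNCED = skip_unknown(SAMPLE, "table")
```

If the outer end-tag were omissible too, there would be nothing for
the pattern to anchor on, which is the situation the guideline is
meant to rule out.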

This guideline would not apply to new HEAD elements,
since presumably authors using new features will include

--Joe English