Specifying error recovery heuristics (Was Re: META)

Joe English (joe@trystero.art.com)
Wed, 5 Jul 95 14:28:00 EDT

Glenn Adams <glenn@stonehand.com> wrote:
> From: Joe English <joe@trystero.art.com>
>
> Plus, I don't think the spec should say any more about
> how to deal with illegal documents than it already does.
>
> The problem is that the spec does say something already in a way
> that may result in ambiguity regarding the HEAD/BODY distinction.
> Namely, it says (under Undeclared Markup Error Handling):
>
> "markup in the form of a start-tag or end-tag, whose generic
> identifier is not declared, is mapped to nothing during tokenization"

(FWIW, I was opposed to including that in the spec too.)

That heuristic works, more or less, for some proposed
extensions like <CENTER>, <FONT> (and most of the Netscape
extensions in fact), and <FIG>.

It breaks on other proposed extensions like <TABLE> and <STYLE>.
We could specify another heuristic to handle future HEAD
elements like <STYLE>, but in the general case it's impossible
to specify today what a browser should do when it encounters
something that gets invented tomorrow.
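
To make the difference concrete, here is a rough sketch of the "map
undeclared tags to nothing" heuristic. It is only an illustration: the
DECLARED set is a hypothetical, heavily abbreviated stand-in for the
element names in the 2.0 DTD, and a regular expression is standing in
for a real SGML tokenizer.

```python
import re

# Hypothetical, abbreviated stand-in for the element names a level 2 DTD declares.
DECLARED = {"HTML", "HEAD", "TITLE", "BODY", "P", "B", "I", "A", "H1"}

# Matches a start-tag or end-tag and captures its generic identifier.
TAG = re.compile(r"</?([A-Za-z][A-Za-z0-9.-]*)[^>]*>")

def drop_undeclared(markup):
    """Map tags whose generic identifier is undeclared to nothing."""
    def keep_or_drop(match):
        name = match.group(1).upper()
        return match.group(0) if name in DECLARED else ""
    return TAG.sub(keep_or_drop, markup)

# <CENTER> degrades gracefully: the text survives, minus the centering.
print(drop_undeclared("<P><CENTER>Hello</CENTER></P>"))
# <P>Hello</P>

# <STYLE> does not: the style sheet source is left behind as character data.
print(drop_undeclared("<HEAD><STYLE>P { color: red }</STYLE></HEAD>"))
# <HEAD>P { color: red }</HEAD>
```

The tags disappear in both cases, but only in the first is the leftover
content something the reader was meant to see.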

On the other hand, recovery procedures in the 2.0 spec can be
used to evaluate whether future enhancements are compatible with
2.0 browsers. E.g., HTML 2.x documents that use <TABLE> or
<STYLE> would have to be labelled "text/html; level=3" or
something like that since those elements are not compatible with
level 2 browsers; those that use only <FIG> could be labelled as
"text/html" because that element is backwards-compatible.
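
As a sketch of that labelling rule, assuming the classification above
(the function name and the BREAKS_LEVEL_2 set are made up for
illustration, and the "level=3" parameter value is only the
placeholder suggested above, not a settled label):

```python
# Hypothetical set: elements whose content a level 2 browser would mishandle
# under the 2.0 recovery procedures.
BREAKS_LEVEL_2 = {"TABLE", "STYLE"}

def media_type_for(elements_used):
    """Pick a content-type label from the element names a document uses."""
    used = {name.upper() for name in elements_used}
    if used & BREAKS_LEVEL_2:
        # Not safe to hand to a level 2 browser; label it accordingly.
        return "text/html; level=3"
    # Everything else (e.g. <FIG>) degrades acceptably under the 2.0 rules.
    return "text/html"

print(media_type_for(["P", "FIG"]))    # text/html
print(media_type_for(["P", "table"]))  # text/html; level=3
```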

The "don't display content in the HEAD" rule would be a good
heuristic to include in the spec; unfortunately the majority
of browsers circa June 1994 (or circa July 1995, for that matter)
did not implement it, so I don't think it should be included.

--Joe English

joe@art.com