>But I agree with Dan that error recovery is for the developer to
>work out, not something that goes in the spec.
>
It might be a good idea to rethink this policy and start recommending
standard behavior on errors. A developer-oriented supplement to the HTML2
spec could show some of the common parsing problems and recommend what to do
with them. I believe a lot of the parsing problems in the current browsers
come from the fact that their developers were not SGML experts (I'm not one
either). If proper parsing behavior were described in simple terms, maybe
future browser developers wouldn't repeat the same mistakes.

The long-term problem with parser mistakes is that authors come to depend on
them as proper behavior instead of errors.

A few lexical-level issues that I know about:
1) Netscape treats > as terminating a quoted string. For example:
<IMG ALT="this is a >
I've had an author argue with me that this is more convenient and refuse to
change his pages.
2) Attributes without quotes: <IMG HREF=http://djdjd.djdjd>
3) &whatever - what to do with undefined entities
4) &eacuterunon - what if I want to define an entity named eacuterunon?
5) Using multiword strings when the DTD says SGML ID. I saw this used with
the HTML3 <TAB>.
6) Comments of the form <-----------------------------------> with a random
number of -'s.
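As a rough illustration of the kind of check a supplement could describe, here is a minimal Python sketch that flags items 2, 3, 4 and 6 above. The function name lint, the tiny entity table, and the regular expressions are all assumptions made for this example; they are not taken from the HTML2 spec or from any browser, and a real checker would use the full DTD.

```python
import re

# Hypothetical, deliberately tiny entity table; a real checker would load
# every entity the DTD declares.
KNOWN_ENTITIES = {"amp", "lt", "gt", "quot", "eacute"}

# A start tag, allowing ">" inside quoted attribute values.
TAG = re.compile(r"""<[A-Za-z][A-Za-z0-9.-]*(?:"[^"]*"|'[^']*'|[^>"'])*>""")
# One attribute: name = quoted or unquoted value.
ATTR = re.compile(r"""([A-Za-z][A-Za-z0-9.-]*)\s*=\s*("[^"]*"|'[^']*'|[^\s>"']+)""")
# Unquoted values may contain only letters, digits, periods and hyphens.
NAME_TOKEN = re.compile(r"[A-Za-z0-9.-]+")
# An entity reference: "&" followed by a name.
ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*)")
# A "<" followed by nothing but hyphens up to ">", as in item 6.
DASH_RULE = re.compile(r"<-{2,}>")


def lint(text):
    """Return a list of warnings for a few lexical errors in `text`."""
    warnings = []
    for tag in TAG.finditer(text):
        for attr in ATTR.finditer(tag.group(0)):
            name, value = attr.group(1), attr.group(2)
            # Item 2: an unquoted value containing other characters needs quotes.
            if value[0] not in "\"'" and not NAME_TOKEN.fullmatch(value):
                warnings.append(f"unquoted value needs quotes: {name}={value}")
    for ent in ENTITY.finditer(text):
        # Items 3 and 4: compare the whole name; a known prefix such as
        # "eacute" does not make "eacuterunon" a defined entity.
        if ent.group(1) not in KNOWN_ENTITIES:
            warnings.append(f"undefined entity: &{ent.group(1)}")
    for rule in DASH_RULE.finditer(text):
        # Item 6: without "<!" this is not a comment declaration at all.
        warnings.append(f"not a comment declaration: {rule.group(0)}")
    return warnings
```

For instance, lint('<IMG HREF=http://djdjd.djdjd>') returns a single warning about the unquoted HREF value, while a tag such as <IMG SRC="a>b.gif" WIDTH=10> passes cleanly because the > sits inside a quoted string.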

The objective is to educate browser developers about what is illegal, in
the hope that future browsers they write will flag the errors.
Jon Smirl, jonsm@aol.com