I always thought that the whole point of expressing HTML
in terms of SGML was that you wouldn't *have* to write
an HTML parser.
There are hundreds of RFCs that contain BNF or EBNF
grammars, yet none that I've seen describes how to
build a parser from a context-free grammar.
Also, perhaps more relevantly, look at any RFC that defines
an SNMP MIB -- pages and pages of ASN.1, which is just as
incomprehensible to the uninitiated as a DTD. Yet there's
no ASN.1 RFC either. [ Or maybe there is -- if so, please
tell me where; I've been looking for one! ]
The HTML RFC is not the right place for a definition
of SGML syntax.
> But I think it has finally sunk in: the stuff about "tokenization"
> needs to be expanded to be as detailed as a lex specification.
>
> So, barring objections from this working group, I'm going to make
> another revision to address this issue.
Add one more objection. Any definition of SGML in this
RFC is sure to take much longer than a week to get right,
and will almost certainly be incomplete.
If this issue becomes a real show-stopper with the consortium
staff, the IETF, or the RFC editor, there is another option.
There's a highly knowledgeable SGML expert who has been working
on an SGML RFC in his spare time; last I heard it was still far
from complete, but we could ask him real nicely to submit what
he's got so far as an Internet-Draft, then cite it as a
work-in-progress.
--Joe English
joe@art.com