How about (Dan) producing a (draft) informational RFC that describes
the tokenization of SGML/HTML?  A separate document would provide the
desired documentation but would not impede the standardization of HTML
2.0.  Adding more wording to a document that has already taken a long
time to reach rough consensus would delay the process further and
introduce new contention.
Dave Kristol