Re: More syntax details in HTML 2.0?

Peter Flynn (pflynn@curia.ucc.ie)
Wed, 14 Jun 95 19:01:38 EDT

I asked a few members of the web consortium staff, and I got very
strong feedback from folks with generally solid technical backgrounds
who were somewhat new to the HTML spec that the spec is incomplete: it
defers too much to the SGML spec, especially on lexical issues:
delimiter recognition, tag omission, treatment of whitespace; that
sort of thing. The bottom line is: you can't pick up the HTML spec and
write an HTML parser without becoming an SGML priest.

That is what I would expect: as HTML is written in SGML, you do need
to understand SGML in order to write a parser. By the same token, an
HTML parser _is_ an SGML parser...and there is code already available
in the public domain to do this...

I think it is quite unnecessary for the HTML spec to contain details
of SGML at the level you are suggesting. HTML is an _application_:
anyone wishing to write software within the domain of HTML will
probably need to know something at least of SGML before they start.

If I buy package X which is written in C, then I don't need to know
any C to just _use_ it, but if I want to write contributory code, then
it is presupposed that I am fluent in C.

Now that I think about it, I think this complaint has been lodged
in this forum many times, and I've turned a deaf ear to some extent.

I think you've been busy doing a lot more valuable work guiding the
spec than rewriting the SGML Handbook.

But I think it has finally sunk in: the stuff about "tokenization"
needs to be expanded to be as detailed as a lex specification.

I really don't think the HTML Spec is the place for this at all.

Get a copy of the DocBook DTD Spec. It does what we're doing: it
explains what the elements are and what they do...it doesn't try to
provide details about how to parse SGML.

So, barring objections from this working group, I'm going to make
another revision to address this issue. I expect it will take a week
to write and a week to review. So the target would be June 26 or so.

I appreciate your concern, Dan. I'm sure we've all been the target of
queries about how to "do this" or "do that" piece of work "in HTML". I
also share your concern that there are no books on the market about
how to write SGML software (not that I have seen: somebody tell me I'm
wrong)...you have to deduce it from the spec.

I reckon that makes this an objection :-) I really don't think we
want to overload the HTML Spec with this material, unless someone can
come up with convincing reasons why it's going to stymie the project
if we don't. It hasn't so far.

But I do share your view that we should point interested readers at
other places where they can find this information.

///Peter