Re: A Yacc grammar for HTML

Peter Flynn
Thu, 5 Jan 95 07:04:24 EST

> I just finished cooking up something of a yacc grammar for HTML. It's
> not perfect (and I haven't done the tedious part of hammering out the
> lexical analyzer yet) but I've had a lot of motivation lately to cook
> it up:
> * folks ask me point blank "Is there a lex/yacc description of html"?

I'm not a lex/yacc user, so can someone fill me in on why this would
be useful? It obviously is, or people wouldn't ask for it, but
I'm curious to know why.

> * folks ask me little syntax questions that can be answered by sgmls
> or the html validation service, but I can't point them to a place
> in the HTML spec that will answer their question. I can point them
> to chapter and verse in the SGML standard, but they probably don't have
> a copy of that.

If you have a list of these questions, or if we can arrange for the queries
to be archived, I'd be happy to develop an "Ask Uncle SGML" section for the
FAQ or the spec.

> * It's bothersome that the HTML spec, a freely available IETF document,
> will depend on the SGML standard, which is not readily available to most
> consumers of this spec.

No more so than an arbitrary C program, which you might develop, will
depend on the C language standard, which is not readily available etc etc.

> * HTML itself is somewhat simpler than SGML in general -- it seems silly
> that folks should have to go through the painful process of learning
> how to read the SGML standard and related literature, just to build WWW
> clients.

No more so than I would have to learn a programming language in order to
write a program. If I wanted to develop yet another compiler compiler,
I'd have to go and learn the language I wanted it to handle...

> Ah... I'm using bison. The version of yacc around here craps out at
> 600 rules.

Q. What's the difference between a buffalo and a bison?
A. You can't wash your hands in a buffalo...