Why DocType is important

Jeff Suttor (jsuttor@Library.UCLA.EDU)
Sun, 18 Dec 1994 06:51:33 +0100

<!DocType ..> is going to be very important. As we begin to serve
arbitrary SGML instances over WWW, it will help to know what each one is :)

*For Now*, we can get away with peeking at the first # chars and inserting
<!DocType HTML...> if it is not present.
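
A rough sketch of that peek-and-insert step, in Python. The peek length
(256) and the exact DOCTYPE string are my assumptions, not anything agreed
on the list; the public identifier is just one of the forms Fred lists
below. ensure_doctype is a made-up name.

```python
# Assumed default; pick whichever public identifier your setup resolves.
DEFAULT_DOCTYPE = '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">'


def ensure_doctype(text, peek=256, doctype=DEFAULT_DOCTYPE):
    """Prepend a DOCTYPE line unless one shows up in the first `peek` chars."""
    # SGML keywords are case-insensitive, so <!DocType and <!DOCTYPE both count.
    if "<!doctype" in text[:peek].lower():
        return text
    return doctype + "\n" + text
```

A document that already starts with <!DocType ...> comes back unchanged;
anything else gets the default line put in front of it.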

I recommend we all start/continue using <!DocType ...>.


Fred E Potts writes:
> Mike,
> I used the HaLsoft Validation service and just fed the URL into it,
> which was the way I figured you would want to have it parsed.
> When I changed the <!DOCTYPE to:
> it parsed okay. <gag!> Live and learn. This presents a real
> interesting problem, and it looks as though a bit of work needs to be
> done.
> It seems the following is how DOCTYPE is currently being used for 2.0:
> PUBLIC "-//IETF//DTD HTML//EN" html.dtd
> PUBLIC "-//IETF//DTD HTML 2.0//EN" html.dtd
> PUBLIC "-//IETF//DTD HTML Level 2//EN" html.dtd
> PUBLIC "-//IETF//DTD HTML 2.0 Level 2//EN" html.dtd
> Regards...
> Fred
> ----- Begin Included Message -----
> > This is what sgmls set at recommended (strict) has to say about
> > http://www.phone.net/~mwm/browserbuster.html :
> >
> > sgmls: Error at -, line 1 in declaration parameter 4:
> > Could not find external document type "HTML"
> > What am I missing here? Certainly a return like the above would cause
> > me to rework the document.
> Well, it says that it couldn't find the DTD for HTML. I would expect
> pretty much everything to fail after that. Since I did something
> unusual for the WWW, and included the <!DOCTYPE to fetch the HTML
> public DTD, I have to assume your sgml system isn't configured
> properly (or mine isn't).
> > As far as I can tell, most documents on the Web can't pass the current
> > DTD, not to mention when sgmls is set to ``recommended.'' And this
> > certainly goes for documents prepared using an HTML authoring editor.
> There are two issues involved here. One is that a web browser fetches
> only one entity. The only way for that single entity to validate under
> SGML is for it to include the HTML declaration and DTD, which you
> really don't want to include in every document you send over the wire
> (the DTD is noticeably larger than most HTML pages).
> The other is that the document type is known to be HTML, so many
> people don't even bother with the DOCTYPE statement.
> My solution is a Rexx script that checks for the DOCTYPE, adds it if
> it isn't there, and feeds the HTML declaration to sgmls before the
> document. Basically, I treat the HTML document as an entity that's
> expected to include only the instance. There might be a better way to
> do this, but it does work.
> BTW, I realized a word was left out of the original post; the first
> line should have been:
> As if it were reasonable to expect a browser to swallow all legal HTML.
> ^^^
> <mike
> ----- End Included Message -----
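
For what it's worth, the steps Mike describes (check for the DOCTYPE, add
it if missing, put the HTML declaration in front of the document) could
look something like this. This is not Mike's Rexx script, only a Python
sketch of the same idea; build_parser_input, the 256-char peek and the
default DOCTYPE string are all my assumptions. It only builds the text
you would hand to the parser and does not run sgmls itself.

```python
def build_parser_input(sgml_declaration, document,
                       doctype='<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">',
                       peek=256):
    """Return declaration + (DOCTYPE if missing) + document as one string."""
    # Treat the fetched document as the instance only: supply the prolog here.
    if "<!doctype" not in document[:peek].lower():
        document = doctype + "\n" + document
    # The SGML declaration has to come first, ahead of the DOCTYPE.
    return sgml_declaration.rstrip("\n") + "\n" + document
```

The result is what gets fed to the parser in place of the bare document,
so nothing extra has to go over the wire.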