Alex Hopmann (
Tue, 4 Jul 95 15:47:07 EDT

On July 4, 1995 Terry Allen writes:
>| Input
>| 1. <!-- select doctype above... -->
>| 2. <HEAD>
>| 3. <TITLE><!-- your title here --></TITLE>
>| 4.
>| 5. abc
>| 6. <!-- your HTML test data -->
>| 7. </BODY>
>| 8.
>| Ye ole validation service at
>| says:
>| Bing! It's legal, and abc is in BODY:
>| Parsed Output (Element Structure Information Set)
>| (HTML
>| (HEAD
>| )HEAD
>| (BODY
>| -abc\n\n
>| )BODY
>| )HTML
>| C
>| ------ end included message ------
>| What this message means to me is if an unknown HEAD tag with content is
>| encountered, the unknown tag will be ignored, and the content will force
>| an assumed beginning of the body because </HEAD> is optional. Needless
>| to say, this is bad.
>| Eric
>That does not follow. It's only when you hit something you know should
>be in BODY that you can be sure HEAD is finished (in a conforming doc).
>"abc" is something (PCDATA) already accounted for in the DTD.

Let me try to revisit the example that Eric Bina gave:

1. <!-- select doctype above... -->
2. <HEAD>
3. <TITLE><!-- your title here --></TITLE>
6. <!-- your HTML test data -->
7. </BODY>

Now let's pretend for a minute that I'm a parser that doesn't understand
<METADATA>, so I have no idea whether <METADATA> is a HEAD tag or a BODY
tag. If <METADATA> is in the BODY, I would ignore the METADATA tag (not
understanding it) and just display abc normally. But if METADATA is in HEAD,
I would just ignore the whole thing, abc included.
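
To make the two outcomes concrete, here is a rough sketch in Python. It is
not a real SGML parser, and everything in it is an assumption made for
illustration: the SAMPLE input is my own guess at a document of this shape,
and displayed_text and its treat_unknown_as_head flag are hypothetical
names. The flag stands for the guess a parser without the DTD has to make
about an unknown tag it meets while still in HEAD.

```python
import re

# Comments, simple tags (attributes skipped), or runs of text.
TOKEN = re.compile(r"<!--.*?-->|<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>|([^<]+)", re.S)
KNOWN = {"HTML", "HEAD", "TITLE", "BODY"}

# Hypothetical input: an unknown element with content, seen before any BODY tag.
SAMPLE = "<HEAD><TITLE>t</TITLE><METADATA>abc</METADATA></BODY>"

def displayed_text(doc, treat_unknown_as_head):
    """Return the text chunks a browser would show for doc."""
    state = "head"   # the parser starts out assuming it is in HEAD
    hidden = None    # element whose content is currently being dropped
    shown = []
    for match in TOKEN.finditer(doc):
        closing, name, text = match.groups()
        if name is None and text is None:
            continue                      # a comment: nothing to display
        if text is not None:
            if hidden is None and text.strip():
                state = "body"            # character data implies BODY has begun
                shown.append(text.strip())
            continue
        name = name.upper()
        if hidden is not None:
            if closing and name == hidden:
                hidden = None             # end of the dropped element
            continue
        if closing:
            if name == "HEAD":
                state = "body"
            continue
        if name == "TITLE":
            hidden = "TITLE"              # title text is not part of the page body
        elif name == "BODY":
            state = "body"
        elif name not in KNOWN:
            if state == "head" and treat_unknown_as_head:
                hidden = name             # guess: HEAD element, drop its content
            else:
                state = "body"            # guess: BODY element, ignore only the tag

    return shown

print(displayed_text(SAMPLE, treat_unknown_as_head=False))
print(displayed_text(SAMPLE, treat_unknown_as_head=True))
```

With the first guess abc ends up on the page; with the second it vanishes.
The input is the same both times, and only the guess differs.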

This, I think, is where SGML does not always work for the WWW. SGML assumes
that it always has a correct DTD, whereas we need to build applications
that can be liberal with what they receive (and, hopefully, strict with what
they send out). Disclaimer: I am not saying SGML is bad or anything of the
sort. I just think we need to keep in mind that we should use only those
constructs that will actually work when the exact DTD that was used to
create the document is not available.

Alex Hopmann
ResNova Software, Inc.