Re: FORM content in DTD (Again)

Daniel W. Connolly (connolly@hal.com)
Thu, 27 Oct 94 13:17:01 EDT

In message <199410271322.AA29245@char.vnet.net>, Stan Newton writes:
>(Retransmission to include working group)
>
>>>the special FORM
>>>tags like SELECT and INPUT are not valid inside ordinary containers
>>>like P.
>>
>>Ah! Here is where your confusion lies.
>>
>>SELECT and INPUT are allowed inside FORM by way of a nifty hack called
>>inclusion exceptions.
> ^^^^^^^^^^^^^^^^^^^^
>>This means that they are allowed inside FORM,
>>and inside any element contained in FORM.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
>Say it ain't so, Dan! This means that the HTML DTD Reference sections of the
>HTML 2.0 spec are misleading, doesn't it? The DTD Reference, which has been
>so helpful at one level, does not take these 'inclusion exceptions' into
>account.

Sorry to burst your bubble, but this is true. The DTD Reference should
have some sort of disclaimer to this effect.

>Which lines of code in the DTD itself are supposed to tell me this?

<!ELEMENT FORM - - %body.content -(FORM) +(INPUT|SELECT|TEXTAREA)>

The +(INPUT|...) is the inclusion exception. The -(FORM) is an
exclusion exception. It says you can't nest forms, even though
%body.content might suggest that you can.
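
As a rough illustration of how these exceptions propagate, here is a toy
Python model. It is only a sketch: the DECLS table is a hand-simplified
stand-in for the DTD (the BODY, P, EM and A content sets are assumptions,
not the real %body.content), and is_allowed is a hypothetical helper, not
part of any SGML tool. It assumes that an exclusion wins when an element
is both included and excluded.

```python
# Toy model of SGML inclusion/exclusion exceptions (not a real SGML parser).
# The content models are simplified assumptions, not the actual HTML 2.0 DTD.
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementDecl:
    content: frozenset                  # elements named in the content model
    inclusions: frozenset = frozenset()  # +(...) exceptions
    exclusions: frozenset = frozenset()  # -(...) exceptions


DECLS = {
    "BODY": ElementDecl(content=frozenset({"P", "FORM"})),
    "FORM": ElementDecl(
        content=frozenset({"P", "FORM"}),
        inclusions=frozenset({"INPUT", "SELECT", "TEXTAREA"}),
        exclusions=frozenset({"FORM"}),
    ),
    "P": ElementDecl(content=frozenset({"EM", "A"})),
    "EM": ElementDecl(content=frozenset({"EM", "A"})),
    "A": ElementDecl(content=frozenset({"EM", "A"}),
                     exclusions=frozenset({"A"})),
}


def is_allowed(open_elements, child):
    """Return True if `child` may start inside the innermost open element.

    `open_elements` is the non-empty list of currently open element names,
    outermost first.
    """
    # Exceptions declared on any open element apply to all its descendants.
    excluded = set().union(*(DECLS[name].exclusions for name in open_elements))
    included = set().union(*(DECLS[name].inclusions for name in open_elements))
    if child in excluded:
        return False
    return child in DECLS[open_elements[-1]].content or child in included
```

With this table, is_allowed(["BODY", "FORM", "P"], "INPUT") is true because
FORM's inclusion reaches down into P, while is_allowed(["BODY", "FORM"], "FORM")
is false even though FORM appears in the toy content model.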

> Are
>there any other exceptions besides FORM?

Very few:

<!ELEMENT A - - %A.content -(A)>
<!ELEMENT (DIR|MENU) - - (LI)+ -(%block)>

The second says that block-level elements are excluded anywhere inside
DIR and MENU. The first means you can't do something like:

<a href="#abc"> some stuff <em>and <a href="#def">more</a></em>
stuff</a>

Even though A is usually allowed inside EM, it's not if the EM is
inside another A!
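
If you just want to spot this one mistake in a file, a small stack-based
check is enough. The sketch below uses Python's standard html.parser
module; NestedAnchorChecker and find_nested_anchors are hypothetical names
made up for illustration. It is not a validating SGML parser: it knows
nothing about the DTD and only tests the -(A) rule.

```python
from html.parser import HTMLParser


class NestedAnchorChecker(HTMLParser):
    """Records the position of any <a> start tag seen while an <a> is open."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of currently open tag names (lowercased)
        self.errors = []      # (line, column) of each offending <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a" and "a" in self.open_tags:
            self.errors.append(self.getpos())
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag; ignore stray end tags.
        # Unclosed tags (omitted end tags, empty elements) are popped along
        # the way, which is crude but sufficient for this single check.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass


def find_nested_anchors(markup):
    """Return a list of (line, column) positions of nested <a> start tags."""
    checker = NestedAnchorChecker()
    checker.feed(markup)
    checker.close()
    return checker.errors
```

Fed the example above, find_nested_anchors reports one position, on line 1,
for the inner <a href="#def">. For real conformance checking, a validating
SGML parser driven by the DTD is still the right tool.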

> Or, am I really lost and everything
>works differently than I understood?

SGML is a little bit like quantum mechanics: your intuition will not
serve you well. You have to do a certain amount of unlearning and
accepting. Be glad that HTML makes minimal use of the more abstruse
features of SGML.

>Historical note:
>I am working on my first project in this HTML area and I have relied heavily
>on the draft of the HTML 2.0 specification to help me understand how this is
>all supposed to work. And I have made a lot of progress because overall I
>think it is quite good.

Wow! What a testimonial! I'm clipping that one out for my scrap book!
Thanks. That's really what this is all about.

>Please take my areas of confusion (FORM tag content and #PCDATA content) as
>feedback about assumptions that are apparently being made about what readers
>understand about SGML. I will take your suggestion and expand my study scope
>and will try the validation testing service. I appreciate your taking the
>time to clarify these points for me.

The HTML 2.0 spec is an incomplete compromise in several ways:

* It includes a certain amount of explanation about SGML,
hoping that folks who have no experience with SGML can get
something out of the document, but it fails to completely
specify the syntax of HTML independent of the SGML standard.

The real syntactic definition of HTML is given by the
SGML standard and the HTML DTD. The other parts of the
HTML spec are redundant -- they're there because SGML
technology and specifications are not generally available
to the document's audience.

* It attempts to be an "introduction" to HTML in some
ways, but I have heard it described as "yet another
impenetrable standards document."

* It includes a certain amount of explanation about WWW clients,
but it fails to completely specify what a WWW client is...

* It makes some oblique references to object addresses
(URIs and URLs) without either clearly referencing a
normative definition of them or giving a specification
for them.

* It mentions HTTP and references an old specification, but
neither completely specifies the HTTP interactions nor
references a document that does so.

Hopefully, all this will be improved over time.

Dan