Re: Toward Closure on HTML

"Daniel W. Connolly" <connolly@hal.com>
Errors-To: listmaster@www0.cern.ch
Date: Thu, 7 Apr 1994 20:41:57 --100
Message-id: <9404071829.AA21166@ulua.hal.com>
Reply-To: connolly@hal.com
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: "Daniel W. Connolly" <connolly@hal.com>
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: Re: Toward Closure on HTML 
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Type: text/plain; charset="us-ascii"
Mime-Version: 1.0
Content-Length: 2780
In message <Pine.3.85.9404071040.A8054-0100000@hmmm>, "Rob Raisch, The Internet Company" writes:
>
>Dave, that may be correct in principle, but in real life it opens a 
>rather nasty can of worms.  (Re: <p>Text -- inferring or assuming the 
>missing </p> endtag)

Inferring </p> tags would be easy. Well... at least the SGML standard says
how to do it in a way that's consistent with current practice in HTML.

It's the start tags (<p>) that cause trouble.
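
To make the end-tag half concrete, here is a rough Python sketch of the
inference. It is not an SGML parser: the tokenizer, the BLOCK_START set, and
the function name are all assumptions made up for illustration, and tags
with attributes are not handled.

```python
import re

# Hypothetical, simplified tokenizer: <name>, </name>, or a run of text.
# Tags with attributes and stray "<" characters are silently skipped.
TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>|([^<]+)")

# Assumption: start tags that implicitly end an open P element.
BLOCK_START = {"p", "ul", "ol", "dl", "h1", "h2", "pre"}


def infer_p_end_tags(source):
    """Return source with omitted </p> end tags written out explicitly."""
    out = []
    p_open = False
    for match in TOKEN.finditer(source):
        closing, name, text = match.groups()
        if text is not None:
            out.append(text)
            continue
        name = name.lower()
        is_end = closing == "/"
        # An open P ends at the next block-level start tag or at </body>.
        implied_end = (not is_end and name in BLOCK_START) or (
            is_end and name == "body"
        )
        if p_open and implied_end:
            out.append("</p>")
            p_open = False
        if name == "p":
            p_open = not is_end
        out.append(match.group(0))
    if p_open:
        # End of input also closes a P that is still open.
        out.append("</p>")
    return "".join(out)


print(infer_p_end_tags("<body>one<p>two<p>three</body>"))
```

Note that the leading text ("one") stays outside any P element. Supplying
the missing start tag is the part this sketch does not attempt.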

>When we get to a point where we support stylesheets (PLEASE!) it is of 
>extreme importance to consider <p></p> a container.  Without this, it is 
>not possible to assign stylistic attributes to a contained element.

Counter-argument: The MidasWWW browser had a really nifty stylesheet-based
hypertext widget set, and it grokked empty P elements just fine. Something
like:
	*HTML*BODY.font: ...
	*HTML*BODY*P.breakBefore: True
	*HTML*BODY*P.breakAfter: True

>Current practice suggests that <p> is not a container at all, it is a
>logical break -- or it is considered as a container with no contents.  This
>is the behavior of available browsers, as I understand them.

Agreed.

>---------------------Example---------------------
><body>
>This is text with no container. (1)
><p>
>Perhaps this is text in a <p> container. (2)
><p>
>Hmmm... no </p> associated with the previous <p>!  Do we assume that there 
>was to be one, or do we treat <p> as a break? (3)
></body>
>-------------------------------------------------
>
>The principles behind SGML -- and by its lineage, HTML -- are to mark up 
>the structure of the document.  

>In the previous example, what is the text associated with (1)?  It is
><body> text or <p> text?  (That is: is it one or the other?)

It is straightforward to construct DTD's where (1) is content of
the BODY element. The draft-iiir-html-01 version of the html DTD
did this. My recent html version 1.7.2.4 also does this.

I think it is impossible to construct a DTD where (1) is the
content of a P element without doing stuff like "The first
element of a BODY element must be a P."

>  And if we build stylesheets which allow logical
>elements within the document to have their own stylistic "hints", which do
>we apply to (1)? 

Body.

The declarations
    <!ELEMENT BODY O O (#PCDATA|P|OL|UL|DL|H1...)>
    <!ELEMENT P - O EMPTY>
are consistent with current practice.
I have considerable evidence to back that claim.
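
As an illustration of what those declarations mean for the example, here is
a small Python sketch that reads BODY content with P treated as an empty
element. The function name and the tuple representation are my own
assumptions, and it only understands a bare <p> tag.

```python
import re


def body_content(source):
    """Split BODY content into text runs and empty P markers."""
    items = []
    # The capturing group keeps each <p> tag as its own piece.
    for piece in re.split(r"(<[pP]>)", source):
        if piece.lower() == "<p>":
            # EMPTY element: no content and no end tag.
            items.append(("P", None))
        elif piece.strip():
            # All text is character data belonging directly to BODY.
            items.append(("#PCDATA", piece.strip()))
    return items


for item in body_content("text (1) <p> text (2) <p> text (3)"):
    print(item)
```

Every text run, including (1), comes out as content of BODY, and each P is
just a marker between runs.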

Parsing extant documents relative to declarations like
	<!ELEMENT BODY O O (P|UL|OL|...) -- no #PCDATA -->
	<!ELEMENT P - O (%htext)+>
results in errors.
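
One kind of error is easy to show with a rough Python check: character data
that appears in BODY before the first P start tag. This is only a sketch
under the assumption that the body contains nothing but text and bare <p>
tags; the function name is hypothetical, and a real SGML parser reports
more than this.

```python
import re


def text_before_first_p(body_source):
    """Return the text that precedes the first <p>, or None if there is none.

    Under a content model without #PCDATA in BODY, and with the P start
    tag required, any such text has no element to belong to.
    """
    head = re.split(r"<[pP]>", body_source, maxsplit=1)[0].strip()
    return head or None


example = "This is text with no container. (1)\n<p>\nPerhaps this is text."
print(text_before_first_p(example))
```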

If there is sufficient motivation to change all the documents out
there to move #PCDATA out of BODY and into a subordinate paragraph
element (which I agree is a good idea), why not give that
element a new name like PP while we're at it?


Dan