Re: Agree: empty P, container PP [Was: Hot Metal and HTML ]

Murray Maloney <murray@oclc.org>
Date: Wed, 15 Jun 94 13:13:27 EDT
Message-id: <9406151304.aa09742@dali.scocan.sco.COM>
Reply-To: html-ig@oclc.org
Originator: html-ig@oclc.org
Sender: html-ig@oclc.org
Precedence: bulk
From: Murray Maloney <murray@oclc.org>
To: Multiple recipients of list <html-ig@oclc.org>
Subject: Re: Agree: empty P, container PP [Was: Hot Metal and HTML ]
X-Comment: HTML Implementation Group

First, I agree with Dan: option 1 or the second option 2.

Second, I'd like to respond to the assertion that
"the SGML tag implication algorithm is not strong enough"
to deduce opening tags.  That's not quite accurate.
In fact, I have been toying with this all morning
and through my lunch hour.

There is a way to have start tag inference by modifying
the DTD in a subtle (and potentially ugly) way.  Before
I go on to explain it, I'll note that I have not yet
convinced myself that it would be desirable to employ
this technique.  Neither have I convinced myself that
I haven't overlooked something that will bite us if
we try something like this.  But, since this seems
like an opportune time, here goes...

If the DTD were defined such that all block elements
and headings must be followed by a <P>, then the DTD
could specify that start-tag minimization was allowed
in those instances.  So, the following:

	<H1> Title </H1>
	Some text
	<P>
	Some more text
	<UL><LI>foo<LI>bar</UL>
	Still more text

Would be read by an SGML parser as:

	<H1> Title </H1>
	<P>
	Some text </P>
	<P>
	Some more text </P>
	<UL><LI>foo</LI><LI>bar</LI></UL>
	<P>
	Still more text </P>
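
The intended effect can be sketched in a few lines of Python. This
is only a toy, not an SGML parser: it handles attribute-free tags,
infers paragraphs at the top level only, passes the content of block
elements through untouched (so it does not supply the </LI> end-tags),
and does not check that a paragraph really follows each block. The
names infer_paragraphs, BLOCKS, EMPTY and TOKEN are made up for this
illustration.

```python
import re

# Elements that, under the proposal, must be followed by a paragraph.
BLOCKS = {"H1", "H2", "H3", "H4", "H5", "H6", "HR", "ADDRESS",
          "BLOCKQUOTE", "UL", "OL", "DL", "PRE"}
EMPTY = {"HR"}  # has no end-tag, so it never opens a nesting level

# Either an attribute-free tag or a run of character data.
TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>|([^<]+)")


def infer_paragraphs(source):
    """Return the tokens of source with implied <P> and </P> made explicit."""
    out = []
    depth = 0      # number of block elements currently open
    in_p = False   # True while a top-level paragraph is open

    for match in TOKEN.finditer(source):
        slash, name, text = match.groups()

        if text is not None:
            text = text.strip()
            if not text:
                continue
            if depth == 0 and not in_p:
                out.append("<P>")        # implied start-tag
                in_p = True
            out.append(text)
            continue

        name = name.upper()
        tag = "<" + slash + name + ">"

        if name in BLOCKS:
            if slash:
                depth = max(depth - 1, 0)
            else:
                if depth == 0 and in_p:
                    out.append("</P>")   # implied end-tag
                    in_p = False
                if name not in EMPTY:
                    depth += 1
            out.append(tag)
        elif name == "P" and depth == 0:
            if in_p:
                out.append("</P>")       # a new <P> closes the open one
                in_p = False
            if not slash:
                out.append("<P>")
                in_p = True
        else:
            # Any other tag: inside a block it passes through unchanged;
            # at the top level a start-tag opens an implied paragraph.
            if depth == 0 and not in_p and not slash:
                out.append("<P>")
                in_p = True
            out.append(tag)

    if in_p:
        out.append("</P>")               # close the last paragraph
    return out
```

Fed the first example above, infer_paragraphs returns the token
sequence of the second example, minus the </LI> end-tags.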

The drawbacks to this that I can see are twofold:
	
	1) an SGML editor like HoTMetaL would require
	   an author to insert a <P> after every
	   element which required it

	2) all such elements which were followed
	   by other block elements would still 
	   have to have a phantom <P> in between


The list of affected elements, I think, is:
	<H1> <H2> <H3> <H4> <H5> <H6>
	<HR>
	<ADDRESS>
	<BLOCKQUOTE>
	<UL> <OL> <DL> 
	<PRE>

And, presumably, the obsolete elements:
	<DIR> <MENU>
	<XMP> <LISTING>

Also, %flow; would have to be modified so that instead
of allowing (%text | %block), it would allow (P, (%block)*).

Comments?

> 
> In message <9406150919.AA02162@www3.cern.ch>, Tim Berners-Lee writes:
> >
> >This plan failed, as the SGML tag implication algorithm is not
> >strong enough (-Dan).  That is, it can deduce closing
> >tags but not opening tags. So the trick will work for <LI>
> >and <DT> and <dd> because they all have opening tags, but
> >it won't work for <p>.
> >
> >This means that either
> >1. <p> is kept as a separator, maybe with <pp> as a para style container, or
> >2. We mandate that HTML parsers have a higher level of tolerance
> >   than SGML parsers, in particular they can infer opening tags; or
> >2. Text is allowed outside paragraphs as well as inside, as
> >   Dave Raggett has suggested for html+; or
> >3. The new spec is called HTML+ or HTML2 but not text/html.
> >
> >These are as I see it the four options open to us as we plot the course of
> >WWW history. 
> 
> This is an excellent characterization of the situation.
> 
> I'm willing to live with option 1 or the second option 2, but not
> option 3 or the first option 2.
> 
> I suggested option 1 long ago on www-talk
> 
> http://gummo.stanford.edu/html/hypermail/www-talk-1994q2.messages/100.html
> message-id:9404071530.AA20967@ulua.hal.com
> 
> The problem is how to introduce the PP tag... Information providers
> can't be expected to just start writing:
> 
> 	<h1>head</h1>
> 
> 	<pp>para 1
> 
> 	<pp>para 2
> 
> 	<h2>another head</h2>
> 
> today, because it won't "look right" -- no browsers will distinguish
> para 1 from para 2. I further suggested in
> 
> http://gummo.stanford.edu/html/hypermail/www-talk-1994q2.messages/109.html
> message-id:9404071850.AA21181@ulua.hal.com
> 
> that we provide, as a
> transition technique, a declaration of PP like:
> 
> 	<!ELEMENT PP - O (%htext, P?)>
> 
> and folks could write:
> 
> 	<h1>head</h1>
> 
> 	<pp>para 1<p>
> 
> 	<pp>para 2
> 
> 	<h2>another head</h2>
> 
> and get interoperability with current browsers.
> 
> 
> Dan