Proposed DTD Names, Structure [Was: HTML 2.0 editing status ]

"Daniel W. Connolly" <connolly@hal.com>


Date: Tue, 6 Sep 94 14:04:45 EDT
Message-id: <9409061804.AA01780@ulua.hal.com>
Reply-To: connolly@hal.com
Originator: html-wg@oclc.org
Sender: html-wg@oclc.org
Precedence: bulk
From: "Daniel W. Connolly" <connolly@hal.com>
To: Multiple recipients of list <html-wg@oclc.org>
Subject: Proposed DTD Names, Structure [Was: HTML 2.0 editing status ]
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
X-Comment: HTML Working Group (Private)


OK folks... we need more input on this than just Terry and myself:

Proposal 1: Eliminate the HTML.Obsolete, HTML.Proposed, and HTML.Prescriptive
	marked sections in the DTD -- leave the Obsolete stuff in,
	and take the Proposed and Prescriptive stuff out. The net effect
	on the grammar defined by the DTD would be nothing.


Proposal 2A: Keep the public identifiers as-is:
html-0.dtd:     "+//ISBN 82-7640-037::WWW//DTD HTML Level 0//EN//2.0"
html-1.dtd:     "+//ISBN 82-7640-037::WWW//DTD HTML Level 1//EN//2.0"
html.dtd:       "+//ISBN 82-7640-037::WWW//DTD HTML//EN//2.0"

Proposal 2B: Replace them with an unregistered FPI, naming IETF as the owner:
html-0.dtd:     "-//IETF//DTD HTML Level 0//EN//2.0"
html-1.dtd:     "-//IETF//DTD HTML Level 1//EN//2.0"
html.dtd:       "-//IETF//DTD HTML//EN//2.0"

Proposal 2C: Replace them with a registered FPI, naming IETF as the owner.
	This requires that the IETF register itself somehow with ISO.
html-0.dtd:     "+//IETF//DTD HTML Level 0//EN//2.0"
html-1.dtd:     "+//IETF//DTD HTML Level 1//EN//2.0"
html.dtd:       "+//IETF//DTD HTML//EN//2.0"
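
For illustration, here is a sketch of how each proposal would show up
at the top of a document that uses the highest-level DTD. It assumes
the usual PUBLIC form of the document type declaration; a given
document would carry only one of these declarations, and mapping the
identifier to a file is left to the local system.

```sgml
<!-- Proposal 2A: the current identifier, ISBN-based owner -->
<!DOCTYPE HTML PUBLIC "+//ISBN 82-7640-037::WWW//DTD HTML//EN//2.0">

<!-- Proposal 2B: unregistered ("-//" prefix), IETF as owner -->
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN//2.0">

<!-- Proposal 2C: registered ("+//" prefix), IETF as owner -->
<!DOCTYPE HTML PUBLIC "+//IETF//DTD HTML//EN//2.0">
```

Only the owner part of the identifier differs between the proposals;
the text description ("DTD HTML") and the trailing language and
version fields stay the same.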


Proposal 3: Combine html-0.dtd, html-1.dtd, html-2.dtd into one file,
	as per Terry's suggestion:
>My solution to the three-DTD problem is to fold -1 and .
>into -0, as marked sections, with the basic content models
>supplied with empty parameter entities that expand within
>those marked sections.  Then there's only one file named
>.dtd, and the IGNORE/INCLUDE operations are simple.  Example
>on request.
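
A rough sketch of what such a single-file layout might look like, for
discussion only. This is not Terry's example. The switch names
HTML.Level1 and HTML.Level2 and the hook %text-1 are made up here
(only %block-2 appears in the current DTD), and the elements assigned
to each level are illustrative, not a proposal.

```sgml
<!-- Hypothetical level switches; both default to IGNORE (Level 0). -->
<!ENTITY % HTML.Level1 "IGNORE" -- set to INCLUDE for Level 1 -->
<!ENTITY % HTML.Level2 "IGNORE" -- set to INCLUDE for Level 2 -->

<!-- Level 2 material sits inside the Level 1 section, so it is
     only seen when both switches are INCLUDE. -->
<![ %HTML.Level1 [
    <!ENTITY % text-1 "| EM | STRONG">
    <![ %HTML.Level2 [
        <!ENTITY % block-2 "| FORM">
    ]]>
]]>

<!-- Empty defaults; ignored if a section above already declared
     the same entity, since the first declaration wins. -->
<!ENTITY % text-1 "">
<!ENTITY % block-2 "">

<!-- Basic content models, written once, with the hooks in place. -->
<!ENTITY % text "#PCDATA | A %text-1;">
<!ENTITY % block "P | DL | PRE | BLOCKQUOTE %block-2;">
```

A document or wrapper that wants a higher level would declare the
switch as "INCLUDE" before this file is read, for example in the
declaration subset of its DOCTYPE, because the first declaration of
an entity is the one that counts.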



Background...

In message <199409051826.LAA06383@rock>, Terry Allen writes:
>| 
>| In message <199409031449.HAA06296@rock>, Terry Allen writes:
>| >
>| >The DTDs don't parse together without generating errors
>| >about duplication.
>
>| Could you give some details? I agree that the usage of the
>| various DTD fragments is underdocumented, but when used as intended,
>| they produce no warnings nor errors for me. Try the html validation
>| service, for example.
>
>I'm using sgmls -degruv and for html-0.dtd I get:
>
>sgmls version 1.1
>sgmls: In file included at litl, line 1:
>       Warning at ./html-0.dtd, line 114 in declaration parameter 4:
>       Duplicate specification occurred for "%block"; duplicate ignored
>sgmls: In file included at litl, line 1:
>       Warning at ./html-0.dtd, line 245 in declaration parameter 4:
>       Duplicate specification occurred for "%html.content"; duplicate ignored


So don't use the -d flag of sgmls. That flag causes sgmls to warn about
completely legal, standard idioms. It may be useful for debugging or some
such, but those warnings do not indicate any non-standard or ill-defined
behaviour.
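
The idiom in question, reduced to a sketch: when the same parameter
entity is declared twice, the first declaration binds and the later
one is ignored. The BODY declaration below is only there to show
where the entity ends up being used; it is an assumption for this
example, not the DTD's actual content model.

```sgml
<!-- First declaration: this is the one that binds. -->
<!ENTITY % block "P | DL | PRE | XMP | LISTING | BLOCKQUOTE">

<!-- Second declaration of the same name: legal, and ignored.
     This is what the -d warning reports. -->
<!ENTITY % block "P | DL | PRE | BLOCKQUOTE">

<!-- %block; still expands to the first definition here,
     so XMP and LISTING are allowed. -->
<!ELEMENT BODY O O (%block;)*>
```

This is what makes the override pattern work: a marked section can
supply an alternate definition ahead of the default, and the default
quietly drops out whenever that section is included.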

>and if you look at the DTD you see that 
>
><!ENTITY % block "P | %list | DL
>                | PRE | BLOCKQUOTE %block-2">
>
>is defined at l. 114, but above,
>
><![ %HTML.Obsolete [
>        <!ENTITY % block "P | %list | DL
>                | PRE | XMP | LISTING
>                | BLOCKQUOTE %block-2">
>]]>
>
>however, 
>
><![ %HTML.Prescriptive [
>        <!ENTITY % HTML.Obsolete "IGNORE">
>]]>
>
><!ENTITY % HTML.Obsolete "INCLUDE"
>        -- marks things that may disappear in future revisions -->
>
>so I'm rather confused, because
>
><!ENTITY % HTML.Prescriptive "IGNORE"
>        -- marks things that may become standard in future revisions -->
>
>So we end up with the Obsolete and the normal entity definitions,
>thus the warning.  Something's amiss here, or I don't have the
>right versions of the DTDs.  
>
>I am also concerned that the target audience, developers without
>much experience with SGML, will find these double negatives
>troublesome.  

OK, so the logic is contorted. We still have not decided whether
XMP and LISTING go in the HTML 2.0 DTD or not. This way, I can
conveniently test either case.

If there is consensus among the WG that these should go in, or
that they should go out, I can edit the DTD and be done with it.

The fewer marked sections the better, to me. Marked sections are
like #ifdef -- evil.

>
>| I would take this declaration to mean "gimme the current version of the
>| HTML DTD."
>
>But here we're defining 3 current versions.

Right. To rephrase: "gimme the highest level of the current version of
the HTML DTD."

>| Anyway... it's just more convenient in practice to have something
>| called html.dtd. Perhaps html.dtd should be a synonym (implemented
>| as a symlink?) for a file called html-2.dtd.
>
>It may seem silly to continue flogging this horse, but then naming is 
>always contentious.  There are lots of "HTML" DTDs floating around; 
>here we are setting everyone up for confusion if what we call the
>HTML 2.0 DTD is called html.dtd *in a set that includes html-0
>and html-1*.  I should think HTML2.0.DTD would be about right.
>
>My solution to the three-DTD problem is to fold -1 and .
>into -0, as marked sections, with the basic content models
>supplied with empty parameter entities that expand within
>those marked sections.  Then there's only one file named
>.dtd, and the IGNORE/INCLUDE operations are simple.  Example
>on request.

Go for it. For me, that would be a very expensive change. I have a
whole test suite with Makefiles and such set up around these
names. It's not a complex operation to change the names and structure,
but it's tedious and time-consuming to Q/A them again. And renaming
the CVS/RCS version control files is tedious. I don't see sufficient
motivation to destabilize things so much at this point.

The filenames are arbitrary. The public identifiers are the names
we should be bickering over. The ISBN... one I'm currently using is
owned by Erik Naggum. In Toronto, nobody seemed to like that idea.
The sentiment was that the IETF should be the registered owner of
the public text.

And perhaps we should distribute an SGML Open-style catalog
file with the DTD.
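
A minimal sketch of such a catalog, assuming the SGML Open entry
syntax (the PUBLIC keyword, then the public identifier, then a system
identifier) and, arbitrarily, the Proposal 2B identifiers. Note that
these are catalog entries, not DTD declarations. The file names are
the current ones.

```sgml
-- Hypothetical catalog, using the Proposal 2B identifiers --
PUBLIC "-//IETF//DTD HTML Level 0//EN//2.0" "html-0.dtd"
PUBLIC "-//IETF//DTD HTML Level 1//EN//2.0" "html-1.dtd"
PUBLIC "-//IETF//DTD HTML//EN//2.0"         "html.dtd"
```

With something like this in place, the file names really are
arbitrary: renaming html.dtd would mean changing one catalog line,
not every document that refers to the DTD.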

>| >  Seems to me that if we're documenting current
>| >practice (approximately) we can't very well mark anything
>| >Obsolete,
>| My view is that in fact, XMP and LISTING are obsolete in current
>| practice. Their actual definition is not expressible in SGML, and
>| they are only supported through backwards-compatibility hacks.
>
>I see that from the doc, but then to be strict about it they
>can't appear in any DTD.  Would it not do just as well to leave
>them in and deprecate their use?

Good idea.

>  Are we willing to say that
>for Level 2 these elements may not be used at all?  If not,
>let's eliminate these marked sections.  Proposed stuff should
>fall into Level 1 or 2 or be eliminated.  

I agree. I just wasn't willing to zap things without consensus.

>The use of Prescriptive might be avoided by making these changes
>between versions of the DTD.  As it stands now, these categories
>(Proposed, Prescriptive) crosscut the 0, 1, 2 DTD structure,
>making it possible to have Level 1 with or without Prescriptive,
>etc.  Let's collapse those categories into the 0, 1, 2 sequence
>of changes, if possible.  Then the only marked sections would
>be the ones including Level 1 stuff and (within that) including
>Level 2 stuff.  

I'm willing to get rid of the Proposed and Prescriptive stuff now,
I guess. Things have been stable for long enough to "write it
in stone."

>| Perhaps I could maintain a separate DTD for testing purposes, but
>| I don't think that's a good idea, and I hope you don't either.
>
>I quite see your point, but the published DTD(s) don't need these
>testing constructs if the point of the testing is to figure out
>how to arrive at the published DTD(s).  Or do you see this as a
>useful feature, to be included in the published DTD?

I guess we should decide which way to flip the switches, and
eliminate the marked sections. Things seem pretty stable by now.


Dan