Re: Proposed DTD Names, Structure [Was: HTML 2.0 editing status ]

Murray Maloney <murray@sco.COM>
Date: Tue, 6 Sep 94 14:46:04 EDT
Message-id: <9409061432.aa04141@dali.scocan.sco.COM>
Reply-To: murray@sco.COM
Originator: html-wg@oclc.org
Sender: html-wg@oclc.org
Precedence: bulk
From: Murray Maloney <murray@sco.COM>
To: Multiple recipients of list <html-wg@oclc.org>
Subject: Re: Proposed DTD Names, Structure [Was: HTML 2.0 editing status ]
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
X-Comment: HTML Working Group (Private)

Hi Gang,

I'm back from vacation and ready to rock 'n roll...

> 
> OK folks... we need more input on this than just Terry and myself:
> 
> Proposal 1: Eliminate the HTML.Obsolete, HTML.Proposed, and HTML.Prescriptive
> 	marked sections	in the DTD -- leave the Obsolete stuff in,
> 	and take the Proposed and Prescriptive stuff out. The net effect
> 	on the grammar defined by the DTD would be nothing.

I agree with getting rid of the marked sections -- evil, nasty things!
But I am still confused about keeping the Obsolete stuff in.
I guess that I really don't have strong feelings about XMP or LISTING,
so I am happy to accept the majority opinion.
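
If I read Terry's excerpt below correctly, keeping the Obsolete stuff
in with no marked section just means the DTD carries the one
declaration that currently sits inside the %HTML.Obsolete section.
This is copied from his quote, not from the DTD file itself:

<!ENTITY % block "P | %list | DL
        | PRE | XMP | LISTING
        | BLOCKQUOTE %block-2">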


> Proposal 2A: Keep the public identifiers as-is:
> html-0.dtd:     "+//ISBN 82-7640-037::WWW//DTD HTML Level 0//EN//2.0"
> html-1.dtd:     "+//ISBN 82-7640-037::WWW//DTD HTML Level 1//EN//2.0"
> html.dtd:       "+//ISBN 82-7640-037::WWW//DTD HTML//EN//2.0"
> 
> Proposal 2B: Replace them with an unregistered FPI, naming IETF as the owner:
> html-0.dtd:     "-//IETF//DTD HTML Level 0//EN//2.0"
> html-1.dtd:     "-//IETF//DTD HTML Level 1//EN//2.0"
> html.dtd:       "-//IETF//DTD HTML//EN//2.0"
> 
> Proposal 2C: Replace them with a registered FPI, naming IETF as the owner.
> 	This requires that the IETF register itself somehow with ISO.
> html-0.dtd:     "+//IETF//DTD HTML Level 0//EN//2.0"
> html-1.dtd:     "+//IETF//DTD HTML Level 1//EN//2.0"
> html.dtd:       "+//IETF//DTD HTML//EN//2.0"


I think that 2B and 2C are both preferable to 2A.
Ownership should be attached to IETF [or even WWW Org.]
For most applications, there will be no effective difference
between 2B and 2C so long as either one is recognized.
Ideally, we would have an FPI registered to IETF.
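
To make the choice concrete, here is how a document would refer to
the Level 2 DTD under 2B, followed by a catalog entry of the SGML-Open
style that Dan mentions below. This is only a sketch: the file name
on the right-hand side is an assumption about the local setup.

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN//2.0">

-- catalog entry: maps the public identifier to a local file --
PUBLIC "-//IETF//DTD HTML//EN//2.0" "html.dtd"

Under 2C the only change is the leading "-" becoming "+".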


> Proposal 3: Combine html-0.dtd, html-1.dtd, html-2.dtd into one file,
> 	as per Terry's suggestion:
> >My solution to the three-DTD problem is to fold -1 and .
> >into -0, as marked sections, with the basic content models
> >supplied with empty parameter entities that expand within
> >those marked sections.  Then there's only one file named
> >.dtd, and the IGNORE/INCLUDE operations are simple.  Example
> >on request.


Ooh!  Nasty, evil marked sections again.  In this case, the 
benefit outweighs the cost.  A single file will be far easier
to manage, understand and maintain.
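
Something along these lines is what I imagine Terry has in mind.
It is only a sketch: the HTML.Level2 switch name and the FORM in
%block-2 are my own placeholders, not taken from any draft, and
%list is assumed to be declared elsewhere in the file.

<!-- Hypothetical switch; a document sets it to "INCLUDE" in its
     DOCTYPE declaration subset to get the Level 2 content models -->
<!ENTITY % HTML.Level2 "IGNORE">

<![ %HTML.Level2 [
        <!-- Placeholder Level 2 addition to the block content model -->
        <!ENTITY % block-2 "| FORM">
]]>

<!-- Default: empty, so lower levels add nothing. The first
     declaration of an entity wins, so this one is a duplicate
     and is ignored when the section above is included -->
<!ENTITY % block-2 "">

<!ENTITY % block "P | %list | DL
        | PRE | BLOCKQUOTE %block-2">

The second declaration of %block-2 is the same kind of duplicate
that the -d flag of sgmls warns about in the background below.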


Murray



> Background...
> 
> In message <199409051826.LAA06383@rock>, Terry Allen writes:
> >| 
> >| In message <199409031449.HAA06296@rock>, Terry Allen writes:
> >| >
> >| >The DTDs don't parse together without generating errors
> >| >about duplication.
> >
> >| Could you give some details? I agree that the usage of the
> >| various DTD fragments is underdocumented, but when used as intended,
> >| they produce no warnings nor errors for me. Try the html validation
> >| service, for example.
> >
> >I'm using sgmls -degruv and for html-0.dtd I get:
> >
> >sgmls version 1.1
> >sgmls: In file included at litl, line 1:
> >       Warning at ./html-0.dtd, line 114 in declaration parameter 4:
> >       Duplicate specification occurred for "%block"; duplicate ignored
> >sgmls: In file included at litl, line 1:
> >       Warning at ./html-0.dtd, line 245 in declaration parameter 4:
> >       Duplicate specification occurred for "%html.content"; duplicate ignored
> 
> 
> So don't use the -d flag of sgmls. That flag causes sgmls to warn about
> completely legal, standard idioms. It may be useful for debugging or some
> such, but those warnings do not indicate any non-standard or ill-defined
> behaviour.
> 
> >and if you look at the DTD you see that 
> >
> ><!ENTITY % block "P | %list | DL
> >                | PRE | BLOCKQUOTE %block-2">
> >
> >is defined at l. 114, but above,
> >
> ><![ %HTML.Obsolete [
> >        <!ENTITY % block "P | %list | DL
> >                | PRE | XMP | LISTING
> >                | BLOCKQUOTE %block-2">
> >]]>
> >
> >however, 
> >
> ><![ %HTML.Prescriptive [
> >        <!ENTITY % HTML.Obsolete "IGNORE">
> >]]>
> >
> ><!ENTITY % HTML.Obsolete "INCLUDE"
> >        -- marks things that may disappear in future revisions -->
> >
> >so I'm rather confused, because
> >
> ><!ENTITY % HTML.Prescriptive "IGNORE"
> >        -- marks things that may become standard in future revisions -->
> >
> >So we end up with the Obsolete and the normal entity definitions,
> >thus the warning.  Something's amiss here, or I don't have the
> >right versions of the DTDs.  
> >
> >I am also concerned that the target audience, developers without
> >much experience with SGML, will find these double negatives
> >troublesome.  
> 
> OK, so the logic is contorted. We still have not decided whether
> XMP and LISTING go in the HTML 2.0 DTD or not. This way, I can
> conveniently test either case.
> 
> If there is consensus among the WG that these should go in, or
> that they should go out, I can edit the DTD and be done with it.
> 
> The fewer marked sections the better, to me. Marked sections are
> like #ifdef -- evil.
> 
> >
> >| I would take this declaration to mean "gimme the current version of the
> >| HTML DTD."
> >
> >But here we're defining 3 current versions.
> 
> Right. To rephrase: "gimme the highest level of the current version of
> the HTML DTD."
> 
> >| Anyway... it's just more convenient in practice to have something
> >| called html.dtd. Perhaps html.dtd should be a synonym (implemented
> >| as a symlink?) for a file called html-2.dtd.
> >
> >It may seem silly to continue flogging this horse, but then naming is 
> >always contentious.  There are lots of "HTML" DTDs floating around; 
> >here we are setting everyone up for confusion if what we call the
> >HTML 2.0 DTD is called html.dtd *in a set that includes html-0
> >and html-1*.  I should think HTML2.0.DTD would be about right.
> >
> >My solution to the three-DTD problem is to fold -1 and .
> >into -0, as marked sections, with the basic content models
> >supplied with empty parameter entities that expand within
> >those marked sections.  Then there's only one file named
> >.dtd, and the IGNORE/INCLUDE operations are simple.  Example
> >on request.
> 
> Go for it. For me, that would be a very expensive change. I have a
> whole test suite with Makefiles and such set up around these
> names. It's not a complex operation to change the names and structure,
> but it's tedious and time-consuming to Q/A them again. And renaming
> the CVS/RCS version control files is tedious. I don't see sufficient
> motivation to destabilize things at this point.
> 
> The filenames are arbitrary. The public identifiers are the names
> we should be bickering over. The ISBN... one I'm currently using is
> owned by Erik Naggum. In Toronto, nobody seemed to like that idea.
> The sentiment was that the IETF should be the registered owner of
> the public text.
> 
> And perhaps we should distribute an SGML-Open style catalog
> file with the DTD.
> 
> >| >  Seems to me that if we're documenting current
> >| >practice (approximately) we can't very well mark anything
> >| >Obsolete,
> >| My view is that in fact, XMP and LISTING are obsolete in current
> >| practice. Their actual definition is not expressible in SGML, and
> >| they are only supported through backwards-compatibility hacks.
> >
> >I see that from the doc, but then to be strict about it they
> >can't appear in any DTD.  Would it not do just as well to leave
> >them in and deprecate their use?
> 
> Good idea.
> 
> >  Are we willing to say that
> >for Level 2 these elements may not be used at all?  If not,
> >let's eliminate these marked sections.  Proposed stuff should
> >fall into Level 1 or 2 or be eliminated.  
> 
> I agree. I just wasn't willing to zap things without consensus.
> 
> >The use of Prescriptive might be avoided by making these changes
> >between versions of the DTD.  As it stands now, these categories
> >(Proposed, Prescriptive) crosscut the 0, 1, 2 DTD structure,
> >making it possible to have Level 1 with or without Prescriptive,
> >etc.  Let's collapse those categories into the 0, 1, 2 sequence
> >of changes, if possible.  Then the only marked sections would
> >be the ones including Level 1 stuff and (within that) including
> >Level 2 stuff.  
> 
> I'm willing to get rid of the Proposed and Prescriptive stuff now,
> I guess. Things have been stable for long enough to "write it
> in stone".
> 
> >| Perhaps I could maintain a separate DTD for testing purposes, but
> >| I don't think that's a good idea, and I hope you don't either.
> >
> >I quite see your point, but the published DTD(s) don't need these
> >testing constructs if the point of the testing is to figure out
> >how to arrive at the published DTD(s).  Or do you see this as a
> >useful feature, to be included in the published DTD?
> 
> I guess we should decide which way to flip the switches, and
> eliminate the marked sections. Things seem pretty stable by now.
> 
> 
> Dan