Re: HTML 2.0 editing status

"Daniel W. Connolly" <connolly@hal.com>
Date: Mon, 5 Sep 94 11:52:53 EDT
Message-id: <9409051549.AA10149@austin2.hal.com>
Reply-To: connolly@hal.com
Originator: html-wg@oclc.org
Sender: html-wg@oclc.org
Precedence: bulk
From: "Daniel W. Connolly" <connolly@hal.com>
To: Multiple recipients of list <html-wg@oclc.org>
Subject: Re: HTML 2.0 editing status 
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
X-Comment: HTML Working Group (Private)
In message <199409031449.HAA06296@rock>, Terry Allen writes:
>
>The DTDs don't parse together without generating errors
>about duplication.  This may be unavoidable given the structure
>involved, and they're really warnings rather than errors,
>but it will be unsettling to many.  It would be much nicer
>if no errors or warnings were generated; I'd almost prefer
>3 DTDs, at least for the final cut.


Could you give some details? I agree that the usage of the
various DTD fragments is underdocumented, but when used as intended,
they produce no warnings or errors for me. Try the HTML validation
service, for example.

>The DTDs are named html-0, html-1, and plain html.  The last
>ought to be html-2, shouldn't it?

The names of the files containing the DTD are arbitrary. The "full"
DTD file is called html.dtd to make it convenient to parse documents
that start with:

	<!DOCTYPE HTML ...>

where ... might be any of a number of idioms, including nothing at all, i.e.

	<!DOCTYPE HTML>

I would take this declaration to mean "gimme the current version of the
HTML DTD."

Anyway... it's just more convenient in practice to have something
called html.dtd. Perhaps html.dtd should be a synonym (implemented
as a symlink?) for a file called html-2.dtd.
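
A minimal sketch of that symlink arrangement, in case it helps to see it
spelled out. The scratch directory and the placeholder file contents are
assumptions made only so the commands can be run anywhere; the two file
names are the ones mentioned above.

```shell
# Work in a throwaway directory (assumption: any scratch location will do).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for the real DTD text; the actual DTD would live in this file.
printf '%s\n' '<!-- placeholder for the HTML 2.0 DTD -->' > html-2.dtd

# Make html.dtd a symbolic link to the versioned file.
ln -s html-2.dtd html.dtd

# Reading either name now yields the same bytes.
cmp -s html.dtd html-2.dtd && echo "html.dtd and html-2.dtd match"
```

With this layout the versioned name stays the real file, and anything
that looks for html.dtd still finds it.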

>  Seems to me that if we're documenting current
>practice (approximately) we can't very well mark anything
>Obsolete,

My view is that in fact, XMP and LISTING are obsolete in current
practice. Their actual definition is not expressible in SGML, and
they are only supported through backwards-compatibility hacks.

Yet there are lots of instances where the _usage_ of XMP and LISTING
is compatible with an SGML definition. So I left them in the DTD.
It's a coin toss.

I have a test suite that guides me through these issues. I have test
cases with <XMP> tags. If we take <XMP> out of the DTD, I will have
to move those cases to the "errors" section of the test suite. Perhaps
I should have an "obsolete" section of the test suite.

Whether HTML.Obsolete is INCLUDE or IGNORE is a coin-toss, if you ask
me. But I like to have those sections in there for those test cases
involving XMP, LISTING, etc.

Perhaps I could maintain a separate DTD for testing purposes, but
I don't think that's a good idea, and I hope you don't either.

> and that Proposed elements don't belong in these
>DTDs.

I could be convinced of that.

>  Eliminating these entities would also reduce the
>number of errors reported due to redefinitions and nesting
>of marked-section entity definitions.

I still think you're just not using the DTD files as intended.
If you're using sgmls, set your SGML_PATH to

	./%N.dtd:%N.sgml

and put all 5 files (html*.dtd, html.decl, ISOLat1.sgml) in the
current directory, and you should be all set. Validate with:

	sgmls -s html.decl foo.html

if foo.html begins with
	<!DOCTYPE HTML>

or if it doesn't, create a file, html-prologue.sgml:

	<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//2.0">

and validate with:

	sgmls -s html.decl html-prologue.sgml foo.html

Other configurations are possible... See the sgmls man page for details.
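
The two cases above can be folded into one small wrapper. This is only a
sketch: sgmls_command is a hypothetical helper, not part of sgmls, and
checking just the first line for a DOCTYPE is a simplifying assumption
(a declaration preceded by comments or blank lines would be missed). It
prints the command rather than running it, so it can be tried without
sgmls installed.

```shell
# Search templates from the message above; %N stands for the entity name.
SGML_PATH='./%N.dtd:%N.sgml'
export SGML_PATH

# Hypothetical helper: print the sgmls command line suited to one document.
sgmls_command() {
    doc=$1
    # Assumption: a DOCTYPE declaration, if present, is on the first line.
    if head -n 1 "$doc" | grep -qi '<!DOCTYPE'; then
        # The document carries its own declaration.
        echo "sgmls -s html.decl $doc"
    else
        # No declaration: put the prologue file in front of the document.
        echo "sgmls -s html.decl html-prologue.sgml $doc"
    fi
}
```

For example, "sgmls_command foo.html" prints the command to use for
foo.html, which can then be run by hand from the directory holding the
DTD files.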


>I don't understand why the DTDs are included as 
>text in the HTML (each line beginning with <BR>).

Because of the quirky way that the HTML was produced from a
FrameMaker document.

>  Why not
>link to the *actual DTDs* and avoid any chance that the 
>real DTDs and the HTMLized DTDs will differ?

Good idea.

>Finally, the doc is so chunked that it is needlessly difficult
>to navigate.  This may be a religious issue, so I don't expect a
>change,

It's a good suggestion, but (if I understand your suggestion correctly)
it's a sweeping editorial change, the kind of thing that takes a lot
of time to implement, and greatly destabilizes the document, creating
a need to completely re-review the document. Think carefully before
you advocate this.

On the other hand, if you're just talking about cutting the document
into fewer, larger HTML nodes, I suppose this can be easily accomplished
through a WebMaker configuration option.

Dan