Re: comments on the DTD in Nov 16 draft

Paul Grosso (pbg@texcel.no)
Fri, 25 Nov 94 14:31:06 EST

> From: "Daniel W. Connolly" <connolly@hal.com>
>
> In message <9411211524.AA24726@texcel.no.texcel.no>, Paul Grosso writes:
> >
> >Use of public text display version field in FPIs
> >------------------------------------------------
>
> > In other words, I would recommend that the
> >version info (be it version number and/or revision date) should be
> >part of the public text description field. This would mean that
> >the FPIs in the DTD and in various places in the spec (including
> >in the sample SGML Open entity catalog) would look more like:
> >
> > PUBLIC "-//IETF//DTD HTML//EN" html.dtd
> > PUBLIC "-//IETF//DTD HTML 2.0//EN" html.dtd
> > PUBLIC "-//IETF//DTD HTML Level 2//EN" html.dtd
> > PUBLIC "-//IETF//DTD HTML 2.0 Level 2//EN" html.dtd
>
> Now that I can see more clearly what all this means, I second
> the above proposal. I'll make the change this week, barring objections.
>

i think this will be an improvement. the Nov 22 spec still uses the
older form, but i'm assuming you and Eric will get it in sync before
the IETF meetings.

>
> >Creating a copy of ISO Added Latin 1 character entity declaration set
> >---------------------------------------------------------------------
> >
> >Why is the HTML spec creating a new character entity set that is
> >basically identical to the ISO set (as far as I can tell) and giving it
> >a different FPI?
>
> With the above clarification, could you "get into the details of
> a suggestion" please?

after discussion with Dave Raggett and James Clark, i stand down from
my earlier misgivings, provided that the idea is still to put &Agrave;
in the document rather than &#192; or whatever. that way, if i receive
an HTML document, i can process it with MY HTML doctype which i will
create by reading my copy of ISO Latin 1 SDATA entity declarations
in place of the HTML Latin 1. likewise, a document i've produced
using the ISO Latin 1 declarations will be parsable by a system that
uses the HTML declarations.
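
to make the point concrete, here is a rough sketch in Python (not SGML, and not how any real parser does it). the two tables are hypothetical one-entry excerpts standing in for the ISO and HTML entity sets, and the replacement strings and the name `expand` are my own assumptions for illustration. the same document text resolves under either table because it refers to the entity by name; a numeric reference such as &#192; never consults the table at all.

```python
import re

# Hypothetical one-entry excerpts of two entity sets (illustration only;
# the replacement strings are assumptions, not copied from either set).
ISO_LAT1 = {"Agrave": "[Agrave]"}   # SDATA-style system data
HTML_LAT1 = {"Agrave": "\u00c0"}    # the Latin 1 character itself

# Matches named entity references only; "&#192;" is deliberately not matched.
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def expand(text, entity_set):
    """Replace each known &name; with its replacement text from entity_set."""
    def lookup(match):
        name = match.group(1)
        # Unknown names are left untouched rather than guessed at.
        return entity_set.get(name, match.group(0))
    return ENTITY_REF.sub(lookup, text)

doc = "voil&Agrave;"
with_iso = expand(doc, ISO_LAT1)    # resolved by the receiver's own table
with_html = expand(doc, HTML_LAT1)  # resolved by the HTML table
```

either system can process the other's document as long as both tables define the same names.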

all i would repeat here, then, is to be sure to change the public
identifier and copyright in HTML Latin 1 and preferably don't use
%ISOlat1; as the entity name to refer to them because it might lead
to unnecessary confusion.

>
> >Non-compliant use of parameter entities
> >---------------------------------------
> >
> >There are several occurrences of non-compliant use of parameter entities
> >in the latest DTD. In brief, you cannot define a parameter entity with
> >"dangling" connectors such as "| FORM | ISINDEX".
>
>
> OK... I'll try to incorporate something like that. But I don't have
> any way to be sure I've gotten it right. Perhaps I'll mail it to you
> so you can run it through a sufficiently anal-retentive parser before
> my next release.

as i've indicated in private email, Exoterica has offered to run the
DTD through their parser to help catch any potential glitches it
notes. also note that anyone with access to Omnimark, an Exoterica
product, could do the same for you (for example, i note Jeff Suttor
mentioned Omnimark in his "adding ICADD archforms" mail).

>
> >CDATA terminated by any ETAGO
> >-----------------------------
> >
> >The comment associated with the declaration for %literal; is somewhat
> >misleading.
>
> Please suggest an alternative.
>

In place of:

<!ENTITY % literal "CDATA"
-- historical, non-conforming parsing mode where
the only markup signal is the end tag
in full
-->

i'd suggest

<!ENTITY % literal "CDATA"
-- historical, non-conforming parsing mode where
the only markup signal is the first end tag (for
any element) encountered and this terminates the
<literal> element
-->

of course, this assumes you want to describe what SGML will do.
if the comment is supposed to be describing what browsers do,
then ignore my suggestion.
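
for anyone who wants to see the SGML behaviour in action, here is a minimal sketch in Python of the rule as i understand it. it is not a parser. it assumes the reference concrete syntax, where "</" only counts as an end-tag open when a name start character (a letter) follows it; the function name `split_cdata_content` and the sample text are made up for illustration.

```python
import re

# Assumption: ETAGO is "</" and is only recognized when followed by a
# name start character (a letter), as in the reference concrete syntax.
ETAGO_IN_CONTEXT = re.compile(r"</(?=[A-Za-z])")

def split_cdata_content(text):
    """Split text at the first end-tag open.

    Returns (content, rest): content is what the CDATA element would
    contain, rest begins with the end tag that terminated it, whatever
    element that tag names.
    """
    match = ETAGO_IN_CONTEXT.search(text)
    if match is None:
        return text, ""
    return text[:match.start()], text[match.start():]

# The author meant everything up to </XMP> to be literal text, but the
# first end tag, </b>, is what ends the element.
sample = "if (a < b) print '<b>bold</b>'; </XMP>"
content, rest = split_cdata_content(sample)
```

note that the start tag <b> passes through as data; only the end-tag open is a markup signal in this mode.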

>
> > Since the use of CDATA for element content models can lead
> >to surprises [I'm glad this is deprecated], I think it's important to
> >make the comment clearer.
> >
> >The fact is that an element whose content model is CDATA is terminated
> >by ANY end tag--even an invalid one! That is, if the string "</"
> >occurs in CDATA content, the current element will be terminated.