Re: marked section in new HTML DTDs

Daniel W. Connolly (connolly@hal.com)
Sun, 25 Sep 94 11:56:14 EDT

In message <199409241753.KAA17065@rock>, Terry Allen writes:
>I've been working out the combinations of marked sections in Dan's
>set of 3 new DTDs, which leads to a discussion of Recommended
>and Deprecated features. All aboard!
>
>In html.dtd:
>
> - HTML.Recommended excludes HTML.Deprecated, so there are only
> two branches there (assuming that only HTML.Recommended
> is to be switched; otherwise you could switch them both
> off and have three branches).

The idea is that there are three branches: by default, you can validate
that your document is standard HTML; by setting HTML.Deprecated to
"IGNORE", you can validate that it uses no deprecated idioms; and by
setting HTML.Recommended to "INCLUDE", you can validate that it uses
only recommended idioms.
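
For illustration only, here is a minimal sketch of how switches like
these can be wired up with marked sections. The entity names come from
the discussion above; the exact declarations and the XMP/LISTING
example are assumptions, not a copy of html.dtd:

```sgml
<!-- Default: the permissive branch that accepts standard HTML. -->
<!ENTITY % HTML.Recommended "IGNORE">

<!-- If a driver file has already set HTML.Recommended to INCLUDE, this
     section is processed and turns the deprecated idioms off. -->
<![ %HTML.Recommended [
  <!ENTITY % HTML.Deprecated "IGNORE">
]]>

<!-- Otherwise this default applies. SGML keeps the first declaration
     of an entity, so an earlier IGNORE is not overridden here. -->
<!ENTITY % HTML.Deprecated "INCLUDE">

<!-- Deprecated idioms are declared only when the switch is INCLUDE
     (assumed example element declaration). -->
<![ %HTML.Deprecated [
  <!ELEMENT (XMP|LISTING) - - CDATA>
]]>
```

Under these assumptions, leaving both switches alone, presetting
HTML.Deprecated to "IGNORE", or presetting HTML.Recommended to
"INCLUDE" selects one of the three branches.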

I have had comments from at least two HTML editor implementors who
said they liked having the HTML.Recommended distinction. I think
it's useful.

> - HTML.Forms is set to INCLUDE, and is to be switched by use
> of html-1.dtd, whose sole purpose is to switch it off.

Yes: the only difference between level 1 and level 2 is forms.

> (It also defines an %html; parameter entity that is unused
> elsewhere but whose value, "-//IETF//DTD HTML//EN//2.0",
> matches the value of %HTML.Version in html.dtd. It appears
> that this is supposed to invoke the html.dtd, but the FPIs
> need synchronization.

They seem to be perfectly in sync to me:

connolly@austin2 ../html-spec[505] grep -n 'DTD HTML//' html-1.dtd html.dtd
html-1.dtd:29:<!ENTITY % html PUBLIC "-//IETF//DTD HTML//EN//2.0">
html.dtd:14: "-//IETF//DTD HTML//EN//2.0"
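
For reference, a driver file along the lines of html-1.dtd can be very
short. This is a sketch, assuming the file references %html; right
after declaring it; only the ENTITY line with the public identifier is
taken from the grep output above:

```sgml
<!-- Declare the switch before html.dtd is read. The first declaration
     wins, so the INCLUDE default for HTML.Forms in html.dtd is ignored. -->
<!ENTITY % HTML.Forms "IGNORE">

<!-- Pull in the level 2 DTD through its public identifier. -->
<!ENTITY % html PUBLIC "-//IETF//DTD HTML//EN//2.0">
%html;
```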

> - HTML.Highlighting is set to INCLUDE, but is not toggled anywhere
> else.

Except for:
html-0.dtd:30:<!ENTITY % HTML.Highlighting "IGNORE">

The DTD Reference makes the distinctions pretty clear. There
are no B, I, EM, STRONG, etc. elements in the level 0 reference
at:
http://www.hal.com/users/connolly/html-spec/L0index.html

> Is this entity
> needed? was it meant to be toggled in html-0.dtd?

I think so, and yes.

> You
> can't set it to IGNORE, or you lose much of the inline
> markup; in Highlighting.html it says that the ability
> to render this highlighting is a Level 1 requirement.

That's exactly the point. Level 0 has no inline markup.

> However, html-0.dtd does not appear to invoke either of
> the other DTDs, and includes IMG,

Did you get some funky version of html-0.dtd? It's right there:

38 <!ENTITY % html PUBLIC "-//IETF//DTD HTML//EN//2.0">
39 %html;

The version is $Id: html-0.dtd,v 1.7 1994/07/20 16:24:27 connolly Exp $

> which Specification.html
> says is not required in Level 0.

A bug in the prose (unless we decide otherwise ;-)

> The sole difference here
> seems to be that ALT is required in Level 0 and not in
> the other Levels. This has been questioned, with good
> reason.
>
> Eventually we want to require ALT so as to aid the text-
> impaired, but there must be scads of docs that don't
> have it now

So they're not level 0 documents. That's the point.

> (thus is properly IMPLIED in the other two
> DTDs). If the idea is that Level 0 is so mingy that it can't
> render IMG and so the user must rely on ALT, then we're
> talking about imposing new requirements on authors.

Bingo.

>If HTML.Highlighting and HTML.Forms may be toggled independently
>of each other, there are of course 4 possibilities; neither
>is toggled by HTML.Recommended or HTML.Forms, so that makes 8
>possibilities.

Well, there are technically zillions of possibilities: you can
redefine all sorts of parameter entities and end up with all
kinds of crazy stuff.

But since folks can't (yet) effectively write:

<!doctype HTML [
<!ENTITY % HTML.Forms "IGNORE">
]>
<html>...

The question is: how many possibilities get public identifiers?

They can write:
<!doctype HTML PUBLIC "-//IETF//DTD HTML Level 0//EN">
which is useful at least for local batch-validation.

Currently, there are three public identifiers: level 0, level 1, and
level 2.

I actually have files html-0P.dtd, html-1P.dtd, and html-P.dtd
(P for prescriptive...) that set HTML.Recommended to "INCLUDE"
and invoke the respective DTD. I suppose if we're going to
support these distinctions, I might as well give those their
own public identifier and distribute them as public text.
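
Such a prescriptive driver could be as small as the following sketch
of html-P.dtd. The layout is an assumption based on the description
above, not the actual file contents:

```sgml
<!-- Ask for the prescriptive branch before html.dtd sets its default. -->
<!ENTITY % HTML.Recommended "INCLUDE">

<!-- Then invoke the level 2 DTD. -->
<!ENTITY % html PUBLIC "-//IETF//DTD HTML//EN//2.0">
%html;
```

html-0P.dtd and html-1P.dtd would presumably set the same switch and
then invoke html-0.dtd and html-1.dtd instead.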

Ack! I forgot the SGML-Open-style entity catalog. Crud.

>As we are documenting current practice, I see no point in having
>HTML.Recommended and HTML.Deprecated.

They're in there to support editing of new documents. We may
decide that this is not a compelling reason, but it's more than
"no point."

>If we
>could decide what to keep in html.dtd and get rid of these two,
>we'd have two DTDs, one with Forms and one without, corresponding
>to Level 1 and Level 2, which is pretty much what's desired.
>Then html-0.dtd is just a DTD on its own, with somewhat different
>aims (okay by me).

See above. Level 0, 1, and 2 are currently defined.

>Of the Recommended items,
> - linkName is a good idea, but could be left to 2.1

That's why it's only recommended.

> - I object again to restricting A.content to %text;.
> I want to be able to wrap as a hot spot one
> or more paras, perhaps including heads, and
> so on.

So we should support ID attributes on lots of elements as link
targets. Or support HyTime dataloc links ("link to words 3-8 of
paragraph 7").

> There is no requirement that a link
> be only an inline animal.

There is in a lot of conversion software. There is in HTML+. Plus, it
keeps parsing and processing simple. "HTML is cheap technology."

> - body.content I like, but would be happy to defer to
> 2.1

Again, that's why it's only recommended.

> - head.nextid could go in now; isn't this current
> practice? or is the point that it's not
> supported?

The point is that it's not recommended to use NEXTID or to
rely on it. The right thing to do is to look at all the
link names and choose something that's distinct from them.

>Of the Deprecated items,
> - preformatted involves sunsetting XMP and LISTING;
> desirable though this may be, let's take
> it up in 2.1.
> - html.content is like body.content (getting rid of
> free-floating text); I like the restriction
> but think it goes in 2.1.

If we're going to change it in 2.1, it's only fair to deprecate
it in 2.0. We've got to warn folks and give them time to
convert their stuff.

> - literal (being CDATA for XMP, LISTING, and PRE)
> is really difficult, besides being weird.

That's why it's deprecated. By the way: PRE has nothing
to do with literal.

Dan