Notes on Oct 16 draft

Daniel W. Connolly (
Wed, 16 Nov 94 16:46:39 EST

As I read it...

Abstract: Change "ISO 8879" to "ISO8879:1986" as per ????'s suggestion.

Overview: The strict mode of the validation service currently rejects


the idea being that just like <body>, text should be in some container:


Do we want the examples in the spec to obey this convention?

2.1.2 "HTML file" ==> HTML document

3. Delete the last bullet. The element references are not part of the
specification. They're informative "fluff," not normative info.

3.3 is not very clear. I agree with what I think it's trying to say.
It should be clear that in the standard DTD, HTML.Recommended is set to
IGNORE, i.e. off, and HTML.Deprecated is set to INCLUDE, i.e. on.
That is, the recommended restrictions are not standard, but the
deprecated idioms still are.

3.4 nix "Understanding"

3.4 The default for level is 2. There is no default version.
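
To make the defaults concrete, here is a minimal sketch of how a receiver might read the level and version parameters off a Content-Type value. The function name is hypothetical, and treating level as an integer is my assumption; only the defaults (level 2, no version) come from the note above.

```python
from email.message import Message


def html_level_and_version(content_type):
    """Return (level, version) for a text/html Content-Type value.

    Hypothetical helper. A missing level defaults to 2; a missing
    version stays None because there is no default version.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    level = msg.get_param("level")      # None when the parameter is absent
    version = msg.get_param("version")  # None when the parameter is absent
    # Assumption: level values are decimal integers.
    return (int(level) if level is not None else 2, version)
```

A bare "text/html" yields level 2 and no version, while an explicit level overrides the default.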

3.4 The paragraph on the charset parameter is still goofy. It
should just be:


The charset parameter is reserved for future use.
See section 3.16 for a discussion of character sets and
encodings in HTML.

3.5 The References bullet is goofy. It should be:


Contains the text and markup of the document.

3.6 Public identifier is wrong. The prologue of an HTML document should be:


And the next paragraph should be a note:

NOTE: If the body of a text/html body part does not begin
with a document type declaration, an HTML user agent
should infer the above document type declaration.

Actually, that note should go in the HTML&MIME section.
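
The inference in that note could be sketched roughly as follows. This is only an illustration: ensure_doctype and default_doctype are hypothetical names, the caller supplies the declaration text (it is not reproduced here), and skipping leading whitespace before the check is my assumption.

```python
def ensure_doctype(body, default_doctype):
    """Return body unchanged if it already begins with a document type
    declaration; otherwise prepend the caller-supplied default.

    Hypothetical helper: default_doctype is whatever declaration the
    spec settles on, passed in by the caller.
    """
    # Assumption: leading whitespace before the declaration is ignored,
    # and the DOCTYPE keyword is matched without regard to case.
    if body.lstrip().upper().startswith("<!DOCTYPE"):
        return body
    return default_doctype + "\n" + body
```

A body that already carries a declaration passes through untouched, so applying the function twice has no further effect.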

3.6.2 is unclear/incorrect about case sensitivity. Please change to:

A name consists of a letter followed by up to 71 letters, digits,
periods, or hyphens. Element names are not case sensitive, but
entity names are. For example, <BLOCKQUOTE>,
<BlockQuote>, and <blockquote> are equivalent, whereas &amp;
is different from &AMP;.

In a start tag, the element name must immediately follow the
tag open delimiter <.
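
As a quick check on the wording, the name rule can be sketched as a regular expression. The helper names below are hypothetical, and restricting "letter" to A-Z and a-z is my assumption.

```python
import re

# A letter followed by up to 71 letters, digits, periods, or hyphens,
# so a name is at most 72 characters long.
# Assumption: "letter" means the ASCII letters A-Z and a-z.
NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9.-]{0,71}")


def is_valid_name(name):
    """True if name matches the syntax described above."""
    return NAME_RE.fullmatch(name) is not None


def same_element_name(a, b):
    """Element names are not case sensitive."""
    return a.lower() == b.lower()


def same_entity_name(a, b):
    """Entity names are case sensitive."""
    return a == b
```

Under this sketch BLOCKQUOTE and blockquote compare equal as element names, while amp and AMP differ as entity names.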

3.6.3 The paragraph "Some implementations..." should be a NOTE.

The suggestion to use &quot; to put quotes inside attribute value
literals might not be a good one, given the current state of the art
(i.e. the bugs in current browsers). I suggest:

If an attribute value includes double quote characters,
use single quotes to delimit the attribute value literal:

<img src="image.gif" alt='First "real" example'>

If an attribute value includes both single and double quote
characters, you'll have to use character or entity references:

<img src="cartoon.gif"
alt="&#34;It means 'Read the Fantastic Manual' or something.&#34;">

<img src="cartoon.gif"
alt="&quot;It means 'Read the Fantastic Manual' or something.&quot;">

NOTE: Some existing implementations do not parse entity and
numeric character references inside attribute value literals.
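
For anyone generating markup, the quoting advice above might be sketched like this. The function name is hypothetical, and the sketch handles only the choice of delimiter; it does not escape ampersands or anything else in the value.

```python
def quote_attribute_value(value):
    """Wrap value in an attribute value literal, following the advice above.

    Hypothetical helper: only the quote characters are handled here.
    """
    if '"' not in value:
        # No double quotes: ordinary double-quoted literal.
        return '"' + value + '"'
    if "'" not in value:
        # Double quotes but no single quotes: delimit with single quotes.
        return "'" + value + "'"
    # Both kinds present: fall back to a numeric character reference
    # for the double quotes.
    return '"' + value.replace('"', "&#34;") + '"'
```

The fallback uses &#34; rather than &quot;, one of the two forms shown above; either way, it depends on references being parsed inside attribute value literals, which the NOTE warns not every implementation does.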

3.8.1 Why is that note in there?

3.12 You can delete the note now. And IMG is level 0.

3.16 in the section about XMP, change "Its use is obsolete" to
"Its use is deprecated."

We should be clear: Obsolete means _gone_. Deprecated means "please
don't do that any more."

5.2.3 Hmmm... PLAINTEXT is not obsolete, as far as the current DTD
is concerned. I wouldn't mind taking it out, but it's in there
right now. It's deprecated, but not obsolete.

5.2.4 XMP and LISTING are deprecated, but not obsolete. They should
be moved back into section 3 somewhere.

6.2 Should we change the copyright for the ISOLatin1 public text?

7.1 Please include some sort of disclaimer about inclusion/exclusion
exceptions. I suppose I should write it, but I'm feeling genuinely
uninspired right now.

Also include a note to implementors that this is a subset of the
standard grammar, so it's not everything that they need to implement.

8. The definition of "document" is bogus. The SGML definition is:

4.96 document: A collection of information that is processed
as a unit. A document is classified as being of a particular
document type.

Hmm... perhaps it's better to use some variation of:

4.282 SGML document: A document that is represented as a sequence
of characters, organized physically into an entity structure
and logically into an element structure, essentially as described
in this international standard. An SGML document consists
of data characters, which represent its information content,
and markup characters, which represent the structure of the
data and other information useful for processing it.
In particular, the markup describes at least one document
type definition, and an instance of a structure conforming
to the definition.

Whew! I'll try to bake that down for the HTML spec:

HTML document: A collection of information represented as
a sequence of characters. An HTML document consists of
data characters and markup. In particular, the markup
describes a structure conforming to the HTML document type
definition.

8. Delete MIME, SGML, SGMLs, WWW, and URI, as they are covered in the
References section. Or at least replace the full citations with
suitable definitions.

There doesn't seem to be an "Author's Address" section. Isn't this
a requirement for an RFC?