Comments on June 8 draft (long)

Joe English
Wed, 14 Jun 95 14:40:58 EDT

Here are some comments and questions on the June 8 draft;
there are two issues (search for '!!!') which I think
are substantive, the rest are only editorial.

I've got a bunch of other minor editorial notes, but
haven't typed them in yet...

[ BTW, Dan -- did you get the plaintext copy I sent you?
It's not on yet. I put it at
in case anyone is looking for a copy. ]

[ BTWII: I've been toying with a LaTeX filter too,
and have a (IMHO) nicer-looking PostScript version
partially ready; let me know if you're interested. ]


1.2.2, "Feature Test Entities", p. 4.

| HTML.Recommended
| [...] This
| feature test entity enables a more prescriptive document
| type definition that eliminates those features.[...]
| HTML.Deprecated
| This
| feature test entity enables a document type definition
| that eliminates these features.[...]

The description of "HTML.Deprecated" is backwards --
this feature test entity *enables* these features.

[ Suggestion: add a note that HTML.Recommended is off
by default, and HTML.Deprecated is on. ]

2.2.1. Data Characters, p. 8:

| Note that the terminating semicolon is only necessary when the
| character following the reference would otherwise be recognized
| as markup:

Should read "would otherwise be recognized as part of the entity name".
[ In "&lt<code>...", the '<' character *is* recognized as markup,
but the semicolon is omissible ]
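
[ A rough sketch of the rule in Python, assuming the reference
concrete syntax (a name is a letter followed by letters, digits,
'.' or '-'). The helper name entity_names is made up for this note,
and the sketch ignores the record-end terminator and everything
else a real SGML parser does:

```python
import re

# Assumption: reference concrete syntax name characters only.
# The trailing ';' is optional; the name simply ends at the first
# character that cannot be part of a name.
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);?")


def entity_names(text):
    """Return the entity names referenced in text, in order."""
    return [m.group(1) for m in ENTITY_REF.finditer(text)]


print(entity_names("&lt<code>"))  # '<' cannot continue the name
print(entity_names("&lt;code"))   # ';' ends the reference explicitly
print(entity_names("&ltcode"))    # 'c' continues the name: a different entity
```

  The third case is the one where the semicolon is actually needed. ]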

| NOTE - There are SGML mechanisms, CDATA and RCDATA, to
| allow most `<', `>', and `&' characters to be entered

Insert "CDATA and RCDATA declared content,".

[ The keywords CDATA and RCDATA are *very* ambiguous
in SGML; "CDATA" is used in at least five different
ways, each with a slightly different meaning. ]

2.2.4. Attributes, p. 9

| In a start-tag, white space and attributes are allowed between
| the element name and the closing delimiter. An attribute
| typically consists of an attribute name, an equal sign, and a
| value, though some attributes may be just a value. White space
| is allowed around the equal sign.

In this paragraph, "attribute" should read "attribute _specification_".

2.3. HTML Public Text Identifiers

| This document type declaration refers to the level 1 HTML DTD in
| 8.2, "Strict HTML DTD".

The reference is wrong; should be "8.3"
(<hdref idref=dtd.s> should be <hdref idref=dtd.1> in html-sgml.sgm)

4.2.2. Base URI: BASE

| The optional <BASE> element specifies the URI of the document,
| overriding any context otherwise known to the user agent.

[ Question: Is this correct? I thought that <BASE> was only supposed
to specify the base URI for purposes of resolving partial URLs.
Is there a requirement that the document be retrievable
by the address specified in the <BASE> element?

In general, there's no such thing as "the" URI of a document,
since there can be more than one. ]
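
[ For illustration, this is how the narrower reading (BASE only
supplies the base for resolving partial URLs) behaves, sketched
with Python's urllib.parse.urljoin. The example.org address and
the resolve helper are hypothetical, not taken from the draft:

```python
from urllib.parse import urljoin

# Hypothetical value of <BASE HREF="...">; not from the draft.
BASE_HREF = "http://www.example.org/docs/guide/index.html"


def resolve(partial_url, base=BASE_HREF):
    """Resolve a partial URL against the base URI."""
    return urljoin(base, partial_url)


print(resolve("chapter2.html"))
print(resolve("../images/logo.gif"))
print(resolve("http://other.example.org/x.html"))  # absolute: base is ignored
```

  Nothing here requires that the document itself be retrievable
  at the base address. ]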

4.6.2. Ordered List: OL, p. 22

| The <UL> element [...]
Should be "OL".

Also, should add "typically rendered as a numbered list".

4.6.3. Directory List: DIR, p. 23

2nd paragraph:

| The content of a <OL> element ...
Should be "DIR".

[ Question: aren't <DIR> and <MENU> deprecated? ]

* * * * * * * * * * * *

4.7 Phrase Markup, p. 24, 2nd paragraph:

| User agents must render highlighted phrases distinctly from
| plain text. Additionally, <EM> content must be rendered as
| distinct from <STRONG> content, and <B> content must rendered as
| distinct from <I> content.

The "must"s here make "lynx -dump" and other HTML-to-plaintext
formatters nonconforming HTML user agents. Was this intended?

* * * * * * * * * * * *

4.7.1 Idiomatic Elements.

[ Question: why isn't <DFN> in the DTD? ]

Variable: VAR

| The <VAR> element indicates a place holder, typically rendered
| as italic. For example:
| Take a guess: Roses are <var>blank</var>.

[ Suggestion: replace "place holder" with "placeholder variable",
and for the example use:

Type <SAMP>rm -f <VAR>file</VAR></SAMP>
to remove <VAR>file</VAR>.

or, better yet:

Type <SAMP>html-check <VAR>file</VAR> | more</SAMP>
to check <VAR>file</VAR> for markup errors. ]

4.10. Image: IMG, p. 28-29

Under "Examples of use:", the example

| <IMG SRC="triangle.xbm" ALT="Warning:"> Be sure
| to read these instructions.

appears twice (the second time without ALT).
Was this intended?

6. Hyperlinks

[ Comments: The terms "head" and "tail" to describe
linkends were totally unfamiliar -- this is the first
I've heard these terms used in this way. I quickly
got used to the terminology, but mostly because
I already knew what, e.g., <A HREF="..."> means,
and when it was described as a "tail" that clued
me in to the meaning of the Dexter terminology.

Suggestion: keep the terms "head" and "tail", but
add a brief description of what they mean and
the traversal direction (activating the tail
anchor traverses to the head anchor?) ]

* * * * * * * * * * * *

6.4. Fragment Identifiers, p. 31, 2nd paragraph:

| The meaning of fragment identifiers depends on the media type of
| the resource containing the head anchor. For `text/html'
| resources, it refers to the <A> element with a NAME attribute
| whose value is the same as the fragment identifier. The matching
| is case sensitive. [...]

If %HTML.Recommended; is on, then NAME attributes are IDs,
so in this case the matching should *not* be case-sensitive.
This could cause problems:

<A HREF="#foo"> ... <A NAME=foo>

will *only* work if HTML.Recommended is *off*. I've seen
lots of documents that use lowercase anchor names; are they
going to have forward-compatibility problems?
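
[ A minimal sketch of the mismatch in Python, assuming that an SGML
parser folds ID values to upper case (as I understand NAMECASE
GENERAL YES to do) while the fragment identifier is compared
as-is. The function find_anchor and its arguments are made up for
this note:

```python
def find_anchor(fragment, names, names_are_ids):
    """Report whether a case-sensitive lookup of fragment succeeds.

    names_are_ids models HTML.Recommended being on: NAME values are
    then IDs, which (by assumption) the parser folds to upper case
    before the user agent ever sees them.
    """
    if names_are_ids:
        names = [name.upper() for name in names]
    return fragment in names


print(find_anchor("foo", ["foo"], names_are_ids=False))  # HTML.Recommended off
print(find_anchor("foo", ["foo"], names_are_ids=True))   # HTML.Recommended on
print(find_anchor("FOO", ["foo"], names_are_ids=True))
```

  Under those assumptions the lowercase link only resolves when
  HTML.Recommended is off. ]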

* * * * * * * * * * * *

7.1 Form Elements.

I am constitutionally unable to read this section...
Will try again later...

9. Terms.

Under "element", "end-tag", "markup", and "start-tag",
the term "descriptive markup" should be "generic markup"
according to the official SGML terminology.

(FWIW, I happen to like "descriptive" better. It seems
more, er, descriptive.)

--Joe English