Re: Looking toward the IETF meeting
Thu, 1 Dec 94 03:19:19 EST


Copies: Tim Berners-Lee (MIT), Dan Connolly (HAL)
Nico Poppelier and Herbert van Zijl (APD_ITD)

Subject: Comments on HTML 2.0 RFC

The HTML 2.0 draft document has been studied and discussed in some detail by
members of the IT Development team of Elsevier Science in Amsterdam. As you
may know, Elsevier Science is deeply committed to electronic publishing and
quite active on the Web already. The future direction and scope of HTML
development is of obvious concern to us, and we felt that a response to your
draft was required. I am pleased to attach our comments on the HTML 2.0 RFC
for your consideration.

We have limited ourselves to the original purpose of the draft, namely
consolidation of the current practice on the Web. As a consequence, the
majority of the comments are SGML-related and technical in nature, and you
will not find many functional comments. If adopted, the SGML-related comments
will make the standard easier to work with in a full SGML environment.

The current consolidation effort through HTML 2.0 is a good one. However, we
are aware of a number of alternative efforts to extend HTML and are concerned
about possible divergence in the development effort, resulting in multiple,
competing standards.

I hope that our comments, based on many years of practical SGML experience,
will result in improvement of HTML 2.0. Please do not hesitate to contact me
if our reaction requires further discussion.


Arie P. de Ruiter, Head
Information Technology Development

Elsevier Science - Comments on HTML 2 draft document
1 December 1994

(NOTE: Text between | ... | is meant to be printed in Courier font)

1. General criticism of the draft

In general we have observed, in our meeting of 11 November, that the
draft document is full of typing errors. Furthermore we noticed that
HTML 1 was designed by someone with little or no knowledge of SGML,
and that traces of this are still visible in HTML 2 (capacities,
literal and name lengths, ...).

The dtd for HTML 2 will probably parse -- at least we assume the
editor has done this -- but it is definitely a badly designed dtd:

- the dtd displays an odd mixture of form and structure;
- there are three elements, |<head>|, |<body>| and |<html>|, that allow
tag omission for both the start and the end tag;
- many deprecated HTML features are included in the dtd, using the
mechanism of marked sections in a dtd.

We do not like the many uses of entities in the dtd; we believe that
removing these will improve readability.

A great deal of confusion is caused by the various interpretations of
the paragraph tag, |<p>|, within this one draft document; see
page 13 of the draft for a good example. The question is: ``Is
|<p>| the start or the end of a paragraph?''

2. Page-by-page criticism of the draft

11: Descriptions of browsers (applications of the standard) and
exceptions do not belong in an RFC.

17: Why is namelen 72? In standard SGML, i.e. the Reference Concrete
Syntax (RCS), 8 suffices. It will probably suffice for HTML 2 as well.

18: If a feature is deprecated, just don't allow it in the dtd!

19: If browsers do not support short tags etc., why not put ``shorttag
no'' in the SGML declaration?

19: The length of an attribute value is limited to 1024 characters,
whereas in the RCS 240 is used. This will limit the length of the
|alt| attribute, but we believe this should be an element, since
it is really the caption of a figure.

22: What is the relation between links, head, isindex and anchors? How
should links be represented?

24: An anchor must have either an href or a name attribute, but we are
aware that this cannot be expressed in SGML. A remark of this kind
should be inserted in the documentation. On the whole, the explanation
of links and anchors should be made clearer.
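
As an illustration of a check that has to live outside the dtd, the
following is a minimal Python sketch, not part of the draft. It assumes
that a rough scan with the standard |html.parser| module is good enough
for the purpose; the names |AnchorChecker| and |find_bad_anchors| are
hypothetical.

```python
from html.parser import HTMLParser


class AnchorChecker(HTMLParser):
    """Collect <a> start tags that carry neither href nor name."""

    def __init__(self):
        super().__init__()
        self.bad_anchors = []  # parser position of each offending <a>

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "a":
            return
        present = {name for name, _ in attrs}
        if not present & {"href", "name"}:
            self.bad_anchors.append(self.getpos())


def find_bad_anchors(text):
    """Return the positions of anchors that violate the href-or-name rule."""
    checker = AnchorChecker()
    checker.feed(text)
    checker.close()
    return checker.bad_anchors


sample = '<a href="x.html">ok</a> <a name="top">ok</a> <a>bad</a>'
print(len(find_bad_anchors(sample)))  # number of offending anchors
```

A validating SGML parser would accept all three anchors in the sample;
only an application-level check of this kind can reject the third.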

25: We do not understand the semantics of the |methods| attribute of
|<a>|. The explanation is unclear, and there is no example. Referring
to |<link>| is no use, since that explanation is unclear as well. Are
|rel| and |rev| each other's inverse (loosely speaking)? Our
understanding is that |<link>| is used for indicating semantic
relationships between whole pages, whereas anchors are used for lower
levels, i.e. pointers to a specific position in another page. It also
looks as if |<link>| is almost the same as |<a>| with |rel| and |rev|.

26: Shouldn't there be a |<p>| immediately after the |<h1>|? This will
not parse!

26: Headings h1--h6 do not indicate heading levels, but merely
heading styles. If levels are what you want, the dtd needs to be

27: There is no point in stating ``this is discouraged''. Either allow
something in the dtd or don't!

28: The explanation of nested fonts should be rewritten. The present
explanation is unclear, since it looks as if two different examples
give the same output. It should be made absolutely clear that this
can only happen when a client does not have certain fonts. We suggest
that first an example is given of correct output, and then an example
of what a client can do when certain fonts are missing.

29: Sections 3.8 and 3.9 show considerable overlap. What is the
difference between |var|, |kbd| and |tt|?

30: |alt| should become an element (a caption for a figure).

31: We suggest that HTML 2 has only one generic list element to cover
|dir|, |menu| and |ul|, with an attribute to indicate the presentation
type. This makes maintenance of documents much easier.

33: This example shows a different interpretation of |<p>|.

36: What is |</expires>|? December 4, 1993 was not a Tuesday.
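
The weekday can be verified with a few lines of Python. This is a
minimal sketch using only the standard |datetime| module; the helper
name |weekday_name| is hypothetical.

```python
import datetime

# Fixed English names, so the result does not depend on the locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def weekday_name(year, month, day):
    """Return the English weekday name for a Gregorian calendar date."""
    # date.weekday() counts from Monday = 0 to Sunday = 6.
    return DAY_NAMES[datetime.date(year, month, day).weekday()]


print(weekday_name(1993, 12, 4))  # Saturday
```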

37: There is no |<p>| after the |<h1>|.

38-42: Forms seem to be okay. But what does the server receive when
the user of a client package enters data into a form using certain
non-standard keys of his keyboard? In general, the use of ISO Latin-1
in input boxes, button labels, check boxes, list boxes, etc. is not
explained properly.
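
To make the question concrete, here is a minimal Python sketch of one
possible behaviour. It assumes that the client URL-encodes each field
as ISO Latin-1 bytes; the draft does not say so, which is exactly the
problem. The helper names |encode_form| and |decode_form| are
hypothetical.

```python
from urllib.parse import parse_qs, urlencode


def encode_form(fields, charset="latin-1"):
    """Encode form fields the way a client might, under the assumed charset."""
    return urlencode(fields, encoding=charset)


def decode_form(body, charset="latin-1"):
    """Decode a form body on the server side, keeping the first value per field."""
    parsed = parse_qs(body, encoding=charset)
    return {key: values[0] for key, values in parsed.items()}


body = encode_form({"city": "Zürich"})
print(body)               # city=Z%FCrich
print(decode_form(body))  # {'city': 'Zürich'}
```

The round trip only works because both sides assume the same character
set. A server that decodes the same bytes under another assumption
does not get the original text back, so the standard should state
which one applies.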

40: The word ``as'' in the last line is ambiguous.

51, 2nd paragraph: Is this security???

53: Hyphenation is not explained satisfactorily.

66: Is it really necessary to use |%version_attr| in this obscure way?
Why is the HTML version a public identifier?

89: The list of terms is incomplete.