Re: More comments on the HTML 3.0 draft

Dave Raggett (dsr@hplb.hpl.hp.com)
Tue, 25 Apr 95 11:01:04 EDT

Thanks to Bert for his feedback.

> Nevertheless, here are some comments, followed by some comments on the
> latest TABLE proposal that Dave sent to the list this weekend.
..
> p.16 "The HTML element":

> The ROLE attr. should be removed. A doc. doesn't have a role
> on its own, it can only have a role in relation to another
> doc. Therefore only the source anchor of a link can specify
> the role of the target doc.

I find it hard to accept the position that a document can't have a role
on its own. One advantage of directly associating roles with documents
is that you can then use iconic representations for documents when
displaying histories, hotlists, and 2D or 3D maps of hypertext links.
If the roles are only associated with the links, such maps become
confusing to look at, compared with different icons for different
kinds of nodes. Have a look at the work on 3D modelling of the Web's
links that was presented at the Darmstadt conference recently.

> p.28 "NOWRAP":

> In non-wrapping text, you may not only want a forced line
> break (<BR>), but also an allowed line break. While we're
> waiting for Unicode, I propose we define an entity &sbsp;
> (cf. Netscape's proposed <WBR> tag).

For the next revision, I am proposing &cbsp; for the Conditional
Breaking SPace. I also propose that the soft hyphen &shy; be
treated as a conditional break, so that it can be used to break lines
even when nowrap is true.
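
As a rough illustration of the idea, here is a small sketch of how a
formatter might break non-wrapping text only at these two entities. It
rests on several assumptions that are not in the draft: &cbsp; renders as
an ordinary space when no break is taken, &shy; renders as nothing unless
a break is taken (in which case a hyphen is shown), every character is one
unit wide, and the function name `break_nowrap` is made up for the example.

```python
import re

def break_nowrap(text, max_width):
    """Greedily break NOWRAP text, but only at &cbsp; or &shy; entities."""
    # re.split with a capturing group keeps the separators in the result:
    # [segment, separator, segment, separator, ..., segment]
    parts = re.split(r"(&cbsp;|&shy;)", text)
    lines = []
    line = parts[0]
    for sep, segment in zip(parts[1::2], parts[2::2]):
        # Assumption: an untaken &cbsp; shows as a space, an untaken &shy; as nothing.
        joiner = " " if sep == "&cbsp;" else ""
        if len(line) + len(joiner) + len(segment) <= max_width:
            line += joiner + segment
        else:
            # Take the break; a soft hyphen becomes visible at the line end.
            # The hyphen itself is not counted against max_width in this sketch.
            lines.append(line + ("-" if sep == "&shy;" else ""))
            line = segment
    lines.append(line)
    return lines
```

A segment that is longer than the available width is simply left to
overflow, since nowrap text offers no other break opportunities.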

> p.50 "The IMG (Image) Element":

> Why is IMG not intended for embedding HTML? Are there any
> other restrictions?

For embedding HTML, I am recommending that HTML3 compliant user agents
support external entities, as this is the SGML mechanism for including
additional markup such that, after inclusion, the resulting document
complies with the DTD.

> p.51 "WIDTH" & "HEIGHT":

> The size is only "suggested", whereas in <FIG> (p.73) it is
> the size into which the image will be forced. Why the
> inconsistency?

The I-D doesn't say this. For IMG:

Optional suggested width for the image. By default,
this is given in pixels.

For FIG:

Specifies the desired width in pixels or en units
(according to the value of the UNITS attribute). User
agents may scale the figure image to match this width.

I should probably use "suggested" in both cases. As for the UNITS
attribute, I am proposing to drop it in favor of specifying the
units as a suffix on the width/height values, since this is the
approach adopted by both Netscape and CALS tables.
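
A minimal sketch of how a user agent might read a width or height value
that carries its unit as a suffix, assuming a bare number means pixels
(the IMG default quoted above). The function name `parse_length`, the
suffix spellings, and the `KNOWN_UNITS` set are illustrative assumptions
and are not taken from the draft.

```python
import re

# Hypothetical suffixes; the draft would have to define the real list.
KNOWN_UNITS = {"px", "en", "em", "pt"}

def parse_length(value, default_unit="px"):
    """Split a WIDTH/HEIGHT value such as "4em" into (number, unit)."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*", value)
    if match is None:
        raise ValueError(f"not a length: {value!r}")
    number, unit = match.groups()
    # A missing suffix falls back to the default unit (pixels here).
    unit = unit.lower() or default_unit
    if unit not in KNOWN_UNITS:
        raise ValueError(f"unknown unit {unit!r} in {value!r}")
    return float(number), unit
```

With this, WIDTH=100 and WIDTH="4em" go through the same code path, and no
separate UNITS attribute has to be consulted.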

> p.78 "Tables", 5th par, 8th par, figure:

> Conflict between the cell counting rules:

> The 4th rule says: "If the column count for the table is
> greater than the number of cells for a given row (after
> including cells for spanned rows),..." The part between
> parentheses conflicts with the 7th rule, that says that cells
> can overlap. If you apply rule 4 to the example, the last row
> will count four cells instead of three (and therefore has only
> one empty cell).

Changing the wording from "including" to "accounting for" removes
the conflict. Note that the example table is defined as invalid,
with an implementation-dependent rendering.

> The example exhibits another inconsistency:

> Cell 6 is pushed to the right by cell 1, while cell 7 isn't
> pushed to the right by cell 6. This suggests that the
> possibility of invalid tables (overlapping cells) can be
> completely removed by reformulating rule 7 in terms of this
> "pushing to the right" effect.

I think that would be unwise. The current definition matches the
obvious behaviour and the one exhibited by Netscape 1.1. Let's keep
it that way!

> More comments about tables are at the end of this message.

> p.111 "Horizontal rules":

> Is the CLASS attr. a *space* separated list or a *period*
> separated list? The text switches from one to the other.

Ooops!!! I was completely blind to that one!

> p.113 "Preformatted text":

> "The <P> tag should be avoided", but the DTD says that is not
> even allowed. Which is true?

The DTD is correct; I will alter the text to make this clear.

> p.118 "Footnotes":

> Footnotes have one indirection too many: first you have to
> click on a word to jump to the place where the footnote is
> stored, then you have to click again to open the footnote.

Where did you get that idea from?

I anticipated that clicking on the link would cause a footnote to
pop up without scrolling the document. You then click on the footnote
to dismiss it: one click to view, one click to hide. However, that's just
one way of supporting the idea. There are other possibilities, and the
HTML3 spec doesn't legislate one way or the other.

> p.169 in DTD, "style sheets control numbering style":

> The CONTINUE attr of <OL> is difficult to implement, since
> previous lists have already disappeared from the parser stack
> when this attr. is encountered. But I guess that's not
> sufficient argument to drop the attr.

The same argument applies to the sequence number for LI.
I am proposing that user agents track the sequence numbers to avoid
the need to walk back arbitrarily far over the parse tree. This will
speed up formatting as compared with a naive implementation.
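
The tracking idea can be sketched roughly as follows. This is only an
illustration under simplifying assumptions: nested lists are ignored, the
class and method names are hypothetical, and "continue" is taken to mean
"carry on from the last number issued by the previous ordered list".

```python
class ListNumbering:
    """Tracks ordered-list sequence numbers while the document is formatted."""

    def __init__(self):
        self.last_number = 0  # last number issued by any ordered list so far
        self.current = 0      # last number issued in the list being formatted

    def start_list(self, continue_previous=False):
        # With CONTINUE, resume after the remembered number instead of
        # walking back over the parse tree to find the previous list.
        self.current = self.last_number if continue_previous else 0

    def next_item(self):
        self.current += 1
        self.last_number = self.current
        return self.current
```

The formatter only ever keeps two integers, so the cost of honouring
CONTINUE does not depend on how far back the previous list was.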

> p.171 in DTD, "BODYTEXT":

> Why does this element exist at all, it can't even be omitted!
> Same for FIGTEXT (p.177).

These dummy elements are needed to circumvent problems with SGML
mixed content models. They allow you to place whitespace between
<BODY> and <BANNER>, or between <FIG> and <CAPTION>. The problem
arises because later on in the content model #PCDATA is permitted.

> p.173 in DTD, "SELECT":

> The inclusion exception is superfluous.
> Same for TEXTAREA.

This exclusion is necessary to forbid people putting input fields
within SELECT elements. The same applies for TEXTAREA.

> Dave Raggett proposed a new syntax for the TABLE element, which more
> closely resembles CALS tables. I've never seen a real CALS table, but
> judging from the fragments that have passed this list I think this
> resemblance is rather an argument *against* the new proposal.

> Some people seem bent on introducing into HTML all the errors of the
> CALS table model. What's next? Do we get NAMEST back as well? I've
> stated my views on this before, but here are some additional notes on
> the new proposal.

Think of it like this:

Column widths are a global property of the table, while alignment
properties are really cell-based properties. TSPEC is useful for
specifying cell-based properties in a concise way.

> BORDER attr:

> Why should the border width be given at all? The style sheet
> can do that. Instead

> BORDER NAMES "" -- any comb. of left, right, top, bottom --

> seems much more useful. The BORDER applies to every TR, TH and
> TD as well (it's "inherited"), but TR, TH and TD can also have
> BORDER attrs. of their own.

Borders are much better handled via style sheets, using cell class names.
You associate particular border styles with boundaries between different
cell classes, with the th/td distinction as the starting point.

> WIDTH attr:

> A default unit of pixels seems the wrong choice. See Jon
> Bosak's message "Widths in tables"
> <9504231936.AA05292@aristotle.sjf.novell.com.SJF.Novell.COM>
> earlier on this list.

I was following Netscape here. Bear in mind that graphics are often
carefully crafted to look good at a particular pixel size, so
pixel-based units do make a lot of sense for graphic artists.

> CELLSPACING, CELLPADDING attr:

> This is a matter for the style sheet. Besides, why do we need
> them both?

These are sufficiently generic, and supported by Netscape today.

> TBODY:
> This element is redundant.

No, it's useful for CALS-style tables, e.g.

<TABLE>
<THEAD>
<TR> ...
<TR> ...
<TBODY>
<TR> ...
<TR> ...
<TR> ...
<TR> ...
<TFOOT>
<TR> ...
</TABLE>

> TSPEC:

> The TSPEC element contains information about the way the
> elements in another branch of the SGML tree should be
> formatted. To summarize my earlier objections: there is no
> relation between the TSPEC and the elements to which it
> refers (the parser will already have thrown away the TSPECS
> when it comes to the table cells), and TSPECs are unnecessarily
> verbose.

This is a misleading argument. TSPEC provides a compact way of
specifying cell properties for selected groups of cells.

> The TSPECs introduce another nasty "feature", similar to the
> NAMEST attr. of the former COLSPEC element: the possibility to
> explicitly attach a TSPEC to one or more columns and rows
> introduces countless opportunities for mistakes and
> ambiguities. It makes it harder to parse the table and it does
> nothing whatsoever to make it easier for the writer.

My proposal provides an explicit mechanism for resolving conflicts
in an unambiguous manner. TSPEC makes life easier for the writer
by providing a compact way of specifying cell properties rather
than having to repeat those properties separately for each cell.

> (For example: does TSPEC apply to TFOOT and THEAD as well?
> what to do with <TSPEC WIDTH="4em"><TSPEC COL=1 WIDTH="5em">,
> will it be 4 or 5 em?)

In the order you've given, column 1 would be 5 ems wide and the
other columns 4 ems wide.
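
To make that concrete, here is a small sketch of one resolution rule that
gives this answer: specs are applied in document order, and a later spec
overrides an earlier one for the columns it names. That rule is an
assumption made for the example (a "most specific wins" rule would give
the same result here), and the dictionary representation and the name
`resolve_widths` are purely illustrative.

```python
def resolve_widths(tspecs, num_columns):
    """Apply TSPEC-like specs in document order; later specs override earlier ones."""
    widths = [None] * num_columns
    for spec in tspecs:
        col = spec.get("col")  # 1-based column number; None means all columns
        targets = range(num_columns) if col is None else [col - 1]
        for i in targets:
            widths[i] = spec["width"]
    return widths

# <TSPEC WIDTH="4em"><TSPEC COL=1 WIDTH="5em"> for a three-column table
specs = [{"width": "4em"}, {"col": 1, "width": "5em"}]
result = resolve_widths(specs, 3)  # ['5em', '4em', '4em']
```

Under this particular rule, reversing the two specs would leave every
column at 4em, which is why the order matters.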

> TD, TH:

> Cells can override WIDTH, HEIGHT, ALIGN, VALIGN, NOWRAP, CHAR,
> but why can't they override BORDER as well? (or CELLSPACING,
> CELLPADDING?)

BORDER is a global mechanism for suppressing borders, but a poor way
of specifying their appearance. Netscape supports border=number for
the width in pixels of a border bevel, but this is only one approach.

-- Dave Raggett <dsr@w3.org> url = http://www.hpl.hp.co.uk/people/dsr
Hewlett Packard Laboratories, Filton Road, | tel: +44 117 922 8046
Bristol BS12 6QZ, United Kingdom | fax: +44 117 922 8924