HTML table model suggestion

Paul Grosso (paul@arbortext.com)
Wed, 29 Mar 95 14:46:21 EST

To focus the table model discussion some more, I'd like to post
this message with three DTD fragments to consider:

1. the model Dave is currently suggesting in his draft HTML 3.0 DTD;
2. basically the same as #1 modified by the minimal changes that are
highly recommended by SGML Open;
3. a fragment that is generally functionally equivalent (though even
as it is, it has some added capabilities) to 1 and 2, but that is
basically a subset of the SGML Open recommended subset of the CALS
table model.

I realize that written semantics are important to the careful definition
of any DTD fragment such as this, but I am omitting most of them from this
message so as to avoid overwhelming readers. Most of the semantics
associated with fragment #3 have been pretty well agreed upon for years,
and SGML Open is in the process of writing up the definitive version
of all the details. I'm hoping that the DTD fragments and some minimal
comments suffice for now.

I would like to see HTML adopt the CALS subset. Not only does this have
the advantage of existing implementations as well as years of experience
understanding its semantics, but it has an obvious enhancement path when
HTML wants real headers and footers, multiple tgroups, individual rule
segment types, etc. I hope my presentation shows that it isn't much more
complex than Dave's model.

All three of the fragments are shown in "reduced" form: I expanded
most parameter entity references and tossed most comment declarations,
all for the purpose of getting to the heart of the model.

=======================================================
Basics of Dave Raggett's suggestion for HTML 3.0 tables
-------------------------------------------------------

<!ELEMENT TABLE - - (CAPTION?, TR*) -- mixed headers and data -->
<!ATTLIST TABLE
%needs; -- for control of text flow --
border (border) #IMPLIED -- draw borders --
colspec CDATA #IMPLIED -- column widths and alignment --
units (em|pixels|relative) em -- units for column widths --
dp CDATA #IMPLIED -- decimal point e.g. dp="," --
width NUMBER #IMPLIED -- absolute or percentage width --
align (bleedleft|left|center|right|bleedright|justify) center
noflow (noflow) #IMPLIED -- noflow around table --
nowrap (nowrap) #IMPLIED -- don't wrap words --
>

<!ENTITY % horiz.align "left|center|right|justify">
<!ENTITY % vert.align "top|middle|bottom|baseline">

<!ELEMENT TR - O (TH | TD)* -- row container -->
<!ATTLIST TR
align (%horiz.align) #IMPLIED -- horizontal alignment --
valign (%vert.align) top -- vertical alignment --
dp CDATA #IMPLIED -- decimal point e.g. dp="," --
nowrap (nowrap) #IMPLIED -- don't wrap words --
>

<!ELEMENT (TH | TD) - O %body.content>
<!ATTLIST (TH | TD)
colspan NUMBER 1 -- columns spanned --
rowspan NUMBER 1 -- rows spanned --
align (%horiz.align) #IMPLIED -- horizontal alignment --
valign (%vert.align) top -- vertical alignment --
dp CDATA #IMPLIED -- decimal point e.g. dp="," --
nowrap (nowrap) #IMPLIED -- don't wrap words --
axis CDATA #IMPLIED -- axis name, defaults to element content --
axes CDATA #IMPLIED -- comma separated list of axis names --
>

==============================================================
Raggett-like fragment with minimal, highly recommended changes
--------------------------------------------------------------
Changes are:
a. add COLSPEC to the content model of TABLE
b. delete the colspec and units attributes from TABLE
c. add COLSPEC element declaration
d. add "char" to possible "horiz.align" attribute values

<!ELEMENT TABLE - - (CAPTION?, COLSPEC*, TR*) -- mixed headers and data -->
<!ATTLIST TABLE
%needs; -- for control of text flow --
border (border) #IMPLIED -- draw borders --
dp CDATA #IMPLIED -- decimal point e.g. dp="," --
width NUMBER #IMPLIED -- absolute or percentage width --
align (bleedleft|left|center|right|bleedright|justify) center
noflow (noflow) #IMPLIED -- noflow around table --
nowrap (nowrap) #IMPLIED -- don't wrap words --
>

<!ENTITY % horiz.align "left|center|right|justify|char">
<!ENTITY % vert.align "top|middle|bottom|baseline">

<!ELEMENT COLSPEC - o EMPTY -- only exists to hold attributes -->
<!ATTLIST COLSPEC
align (%horiz.align) "left"
char CDATA #IMPLIED
-- character upon which to align ( such as . or , ) --
charoff NUTOKEN #IMPLIED
-- position of character upon which to align --
colwidth CDATA #IMPLIED
-- e.g., 1.5in, 40pt, or 20* (* to indicate "relative") -->

<!ELEMENT TR - O (TH | TD)* -- row container -->
<!ATTLIST TR
align (%horiz.align) #IMPLIED -- horizontal alignment --
valign (%vert.align) top -- vertical alignment --
dp CDATA #IMPLIED -- decimal point e.g. dp="," --
nowrap (nowrap) #IMPLIED -- don't wrap words --
>

<!ELEMENT (TH | TD) - O %body.content>
<!ATTLIST (TH | TD)
colspan NUMBER 1 -- columns spanned --
rowspan NUMBER 1 -- rows spanned --
align (%horiz.align) #IMPLIED -- horizontal alignment --
valign (%vert.align) top -- vertical alignment --
dp CDATA #IMPLIED -- decimal point e.g. dp="," --
nowrap (nowrap) #IMPLIED -- don't wrap words --
axis CDATA #IMPLIED -- axis name, defaults to element content --
axes CDATA #IMPLIED -- comma separated list of axis names --
>

================================================
Subset of CALS tables with similar functionality
------------------------------------------------

<!ELEMENT TABLE - - (CAPTION?, TGROUP*) >
<!ATTLIST TABLE
%needs;
frame (top|bottom|topbot|all|sides|none) #IMPLIED
width NUMBER #IMPLIED -- absolute or percentage width --
>

<!ENTITY % horiz.align "left|center|right|justify|char" -- CALS adds "char" -->
<!ENTITY % vert.align "top|middle|bottom" -- CALS omits baseline -->

<!ELEMENT TGROUP - O (COLSPEC*,TBODY)>
<!ATTLIST TGROUP
cols NUMBER #REQUIRED
align (%horiz.align) "left"
char CDATA ""
charoff NUTOKEN "50"
>

<!ELEMENT COLSPEC - O EMPTY>
<!ATTLIST COLSPEC
colname NMTOKEN #IMPLIED
align (%horiz.align) #IMPLIED
char CDATA #IMPLIED
-- character (e.g., decimal point) for character alignment --
charoff NUTOKEN #IMPLIED
-- offset from left of cell (in percent) for positioning
of aligned character in the case of character alignment --
colwidth CDATA #IMPLIED
>

<!ELEMENT TBODY - O (ROW*)>
<!ATTLIST TBODY
valign (%vert.align) "top"
>

<!ELEMENT ROW - O (ENTRY)*>

<!ELEMENT ENTRY - O %body.content;>
<!ATTLIST ENTRY
namest NMTOKEN #IMPLIED
nameend NMTOKEN #IMPLIED
morerows NUMBER "0"
valign (%vert.align) #IMPLIED
align (%horiz.align) #IMPLIED
char CDATA #IMPLIED
charoff NUTOKEN #IMPLIED
>

=======================================================

A few comments on the suggested CALS subset:

a. Multiple TGROUPs allow for "stacking" table subunits that
may have different numbers/widths of columns. HTML 2.1
could use (CAPTION?, TGROUP?) for a simpler model that
is still a CALS subset.

b. Aside from allowing a "wrapper" on which to provide some default
values for all rows--as well as maintaining subset compatibility
with CALS--the TBODY element prepares the way for having optional
THEAD and TFOOT row-wrappers in the future.

c. Horizontal spanning is done by using the "nameend" attribute
on an <entry> to give the "colname" (see <colspec>) of the column
in which the horizontal span should end. The optional "namest"
attribute names the column in which the entry starts. It is not
required (if omitted, the semantics are as you would expect), but
its use is recommended when something may be "spanning down" into
this row, so as to disambiguate where the entry should go.

d. Vertical spanning is done using the "morerows" attribute which
works much like "rowspan" except that morerows=rowspan-1.
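
As a rough illustration of comments c and d (not part of any of the
three fragments), the mapping from Raggett-style COLSPAN/ROWSPAN to the
CALS-subset attributes can be sketched in Python. The function name
to_cals_attrs and its start_col parameter are hypothetical, and the
sketch assumes the implied default colnames, i.e. the column numbers:

```python
def to_cals_attrs(start_col, colspan=1, rowspan=1):
    """Map colspan/rowspan onto namest/nameend/morerows.

    start_col is the 1-based column in which the cell begins. Column
    names are assumed to be the implied defaults (the column numbers).
    """
    if start_col < 1 or colspan < 1 or rowspan < 1:
        raise ValueError("start_col, colspan and rowspan must be >= 1")
    attrs = {}
    if colspan > 1:
        # nameend names the column in which the horizontal span ends.
        attrs["namest"] = str(start_col)
        attrs["nameend"] = str(start_col + colspan - 1)
    if rowspan > 1:
        # morerows counts the additional rows: morerows = rowspan - 1.
        attrs["morerows"] = str(rowspan - 1)
    return attrs
```

For example, a cell with COLSPAN=2 that starts in column 2 maps to
namest=2 and nameend=3, and a cell with ROWSPAN=2 maps to morerows=1.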

Here's the sample table from Dave's HTML 3.0 draft (except I omitted
the "ROWSPAN=2" on the cell containing "females" since it served
no purpose in this example):

<TABLE BORDER>
<CAPTION>A test table with merged cells</CAPTION>
<TR><TH ROWSPAN=2><TH COLSPAN=2>Average
<TH ROWSPAN=2>other<BR>category<TH>Misc
<TR><TH>height<TH>weight
<TR><TH ALIGN=LEFT>males<TD>1.9<TD>0.003
<TR><TH ALIGN=LEFT>females<TD>1.7<TD>0.002
</TABLE>

This would be rendered something like:

           A test table with merged cells
+--------------------------------------------------+
|          |      Average      |  other   |  Misc  |
|          |-------------------| category |--------|
|          | height | weight   |          |        |
|-----------------------------------------|--------|
| males    | 1.9    | 0.003    |          |        |
|-----------------------------------------|--------|
| females  | 1.7    | 0.002    |          |        |
+--------------------------------------------------+

Here is the same table marked up using the suggested CALS subset:

<TABLE FRAME=all>
<CAPTION>A test table with merged cells</CAPTION>
<TGROUP cols=5>
<COLSPEC ALIGN=LEFT>
<TBODY>
<ROW><ENTRY MOREROWS=1><ENTRY NAMEEND=3>Average
<ENTRY MOREROWS=1>other category<ENTRY>Misc
<ROW><ENTRY>height<ENTRY>weight
<ROW><ENTRY>males<ENTRY>1.9<ENTRY>0.003
<ROW><ENTRY>females<ENTRY>1.7<ENTRY>0.002
</TABLE>

Note: I wouldn't have had to have any COLSPECs, except I took
advantage of a COLSPEC for the first column to set the default
horizontal alignment for all cells in the first column to LEFT.

The implied colspecs are equivalent, in full-blown form, to:

<COLSPEC colname=1 colwidth="1*" ALIGN=LEFT>
<COLSPEC colname=2 colwidth="1*">
<COLSPEC colname=3 colwidth="1*">
<COLSPEC colname=4 colwidth="1*">
<COLSPEC colname=5 colwidth="1*">

and the cell whose content is "Average" is appropriately spanned
by specifying a NAMEEND attribute that refers to the implied "colname"
for column 3. Note that "colname"s could be more mnemonic, logically
based names rather than numbers; it's just that the implied default
is the column number, which makes the markup for horizontal spanning
in this simple case very obvious.
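
The defaulting described above can be sketched in Python. This is only
an illustration under stated assumptions: the function name
implied_colspecs is hypothetical, and it assumes that explicit COLSPECs
apply positionally to the first columns (as the single COLSPEC in the
example applies to column 1), with colname defaulting to the column
number and colwidth to "1*":

```python
def implied_colspecs(cols, explicit=()):
    """Return one colspec dict per column, filling in implied defaults.

    explicit is a sequence of attribute dicts, one per COLSPEC element
    actually present, assumed to apply to columns 1, 2, ... in order.
    """
    specs = []
    for number in range(1, cols + 1):
        # Implied defaults: colname is the column number, width is "1*".
        spec = {"colname": str(number), "colwidth": "1*"}
        if number <= len(explicit):
            # Explicitly given attributes override the defaults.
            spec.update(explicit[number - 1])
        specs.append(spec)
    return specs


specs = implied_colspecs(5, [{"align": "left"}])
```

Here specs has five entries; only the first carries an align value, and
the colnames run from "1" to "5".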

For another example, I take what I sent to the list earlier
(I received three answers: one that it was obviously one way,
another that it was a different way, and Dave's below, which
said the cells overlapped):

How should the following table be rendered?

<table>
<tr><td rowspan=2>1<td>2<td>3<td>4<td>5
<tr><td rowspan=2>6
<tr><td colspan=2>7<td>8
</table>

Dave's response:

+--------------------+
| 1  | 2 | 3 | 4 | 5 |
|    |---------------|   The cells labelled 6 and 7 overlap!
|    | 6 |   |   |   |
|----|...|-----------|
| 7  :   | 8 |   |   |
+--------------------+

Here is how I would mark it up in the CALS subset:

<table>
<tgroup cols=5>
<tbody>
<row><entry morerows=1>1<entry>2<entry>3<entry>4<entry>5
<row><entry morerows=1>6
<row><entry namest=3 nameend=4>7<entry>8
</table>

which would give me:

+---+---+---+---+---+
|   | 2 | 3 | 4 | 5 |
| 1 +---+---+---+---+
|   |   |   |   |   |
+---+ 6 +---+---+---+
|   |   |   7   | 8 |
+---+---+-------+---+

If my last row were instead:
<row><entry namest=1 nameend=2>7<entry>8
then I would have overlapping cells (for which the presentation is
implementation dependent).

Note that, with the use of NAMEST and NAMEEND to do horizontal spanning,
it's easier to avoid ambiguous placement (though it's certainly still
possible to specify overlapping cells).
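
To make the placement and overlap cases above concrete, here is a
minimal Python sketch of how a processor might lay entries out on a
grid. It is not taken from CALS or SGML Open documentation: the
function name place_rows is hypothetical, colnames are assumed to be
the implied column numbers, and an entry without NAMEST is assumed to
go in the first column of its row not already filled by a span from
above:

```python
def place_rows(cols, rows):
    """Place entries on a grid and report overlapping cells.

    rows is a list of rows; each entry is a dict with "text" and
    optional integer "namest", "nameend" (1-based column numbers) and
    "morerows". Returns (grid, overlaps): grid[r][c] is the text that
    occupies the slot (or None), and overlaps is a list of
    (row, col, existing_text, new_text) tuples, 1-based.
    """
    grid = [[None] * cols for _ in rows]
    overlaps = []
    for r, row in enumerate(rows):
        cursor = 0  # next candidate column, 0-based
        for entry in row:
            if "namest" in entry:
                start = entry["namest"] - 1
            else:
                start = cursor
                # Skip slots already filled by spans from earlier rows.
                while start < cols and grid[r][start] is not None:
                    start += 1
            end = entry.get("nameend", start + 1) - 1
            last_row = min(r + entry.get("morerows", 0), len(rows) - 1)
            for rr in range(r, last_row + 1):
                for c in range(start, min(end, cols - 1) + 1):
                    if grid[rr][c] is not None:
                        overlaps.append((rr + 1, c + 1, grid[rr][c], entry["text"]))
                    else:
                        grid[rr][c] = entry["text"]
            cursor = end + 1
    return grid, overlaps


first_two = [
    [{"text": "1", "morerows": 1}, {"text": "2"}, {"text": "3"},
     {"text": "4"}, {"text": "5"}],
    [{"text": "6", "morerows": 1}],
]
good = first_two + [[{"text": "7", "namest": 3, "nameend": 4}, {"text": "8"}]]
bad = first_two + [[{"text": "7", "namest": 1, "nameend": 2}, {"text": "8"}]]

grid, overlaps = place_rows(5, good)
_, bad_overlaps = place_rows(5, bad)
```

Under these assumptions, the markup with namest=3 yields no overlaps,
while the namest=1 variant reports a single overlap in row 3, column 2,
where "7" collides with the cell "6" spanning down from row 2.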

paul

Paul Grosso
VP Research, ArborText, Inc.
and
Chief Technical Officer, SGML Open

Email: paul@arbortext.com