Comments on html 3.0 draft

Dan.Oscarsson@malmo.trab.se
Mon, 24 Apr 95 03:04:06 EDT

Comments to html draft: <draft-ietf-html-specv3-00.txt>
By: Dan.Oscarsson@malmo.trab.se

HTML version 3 is beginning to look good. Below are a few comments
on it.

Style sheets
-------------
I have looked at DSSSL Lite. I recommend that DSSSL Lite is not
used. It is not easy enough to use.

Character sets (general, page 9, page 142, page 143, page 151)
---------------------------------------------------------------
That text/html is ISO 8859-1 by default is good.
But it should be made clear that when writing HTML on a system
supporting ISO 8859-1, you do not need to use the character entity
references. A lot of people believe that you do.
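
To illustrate, here is a minimal sketch in Python (not part of the draft;
the sample word and the helper name decode_html_text are assumptions made
for this example). It suggests that a letter typed directly as an
ISO 8859-1 byte and the same letter written as an entity reference end up
as the same text:

```python
import html


def decode_html_text(raw: bytes) -> str:
    """Decode ISO 8859-1 bytes, then resolve character entity references."""
    return html.unescape(raw.decode("iso-8859-1"))


# 0xF6 is the ISO 8859-1 code position for o with diaeresis.
direct = decode_html_text(b"Malm\xf6")
via_entity = decode_html_text(b"Malm&ouml;")

print(direct == via_entity)
```

So on a system that already uses ISO 8859-1, the entity form is only
needed for the markup characters themselves (such as & and <).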

Regarding URIs and URLs the matter is vague. RFC 1630 recommends that
URIs encode the non-ASCII part of ISO 8859-1, but says that the safe
characters need not be encoded if the environment allows it. In
RFC 1738 it appears that all characters outside the ASCII range
must be encoded.
The need to encode normal letters is a pest!
The WWW is hopefully going to be used (and is already being used) by
people who are not computer hackers. Such a user will not understand why
a normal letter must be encoded with a difficult and obscure encoding
that they need tables to handle.
When your information system uses ISO 8859-1 as the normal character set
(or ISO 10646) and all your text files use ISO 8859-1, it is impossible
for a normal user to understand that when they edit a html text file
with their normal text editor (it will take quite some time before
a good html editor appears), they cannot write letters in the normal way!
URIs inside a html document must be able to use the same character set
that the rest of the html document is in. If ISO 8859-1 is used, the URI
must be able to use ISO 8859-1 characters without encoding (except for
the special ones like %).
And when a normal user uses a WWW browser, the user must be able to use
ISO 8859-1 characters in the dialogs when accessing URIs, if ISO 8859-1
is the character set used there.
I know this is a difficult matter, but the non computer hacker must be able
to use what is normal for them. I know there are other character sets, but
it could be defined that if characters outside ASCII are used in URIs,
ISO 10646 coding is expected. This would allow printed URIs to be
unambiguous. Of course, until we have a protocol for sending HTTP requests
with a character width other than 8 bits, only the ISO 8859-1 subset can
be used.
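
The ambiguity can be sketched in Python as follows. This is only an
illustration under stated assumptions: the path segment is made up, the
helper name encode_segment is hypothetical, and UTF-8 is picked as one
possible ISO 10646 encoding, which the text above does not name. The same
letter gets a different %XX form depending on which character set the
encoder assumes:

```python
from urllib.parse import quote, unquote


def encode_segment(text: str, charset: str) -> str:
    """Percent-encode everything outside the unreserved ASCII characters."""
    return quote(text, safe="", encoding=charset)


name = "Malm\u00f6"  # hypothetical path segment with one non-ASCII letter

latin1_form = encode_segment(name, "iso-8859-1")
utf8_form = encode_segment(name, "utf-8")  # assumption: UTF-8 for ISO 10646

print(latin1_form)  # Malm%F6
print(utf8_form)    # Malm%C3%B6

# Decoding recovers the letter only if the reader assumes the same charset.
print(unquote(latin1_form, encoding="iso-8859-1") == name)
```

A single agreed character set for the non-ASCII part of a URI would remove
the need to guess.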

page 142: To my knowledge there are 96 printable characters in the upper
ISO 8859-1 range, of which two are non-breaking space and soft hyphen.
There should be 94 graphical characters. ISO 8859-1 has 65 code positions
reserved for control characters: 0-037 and 0177-0237. I do not know
which ISO definition you are referencing when saying that 8 are unassigned;
I have one draft that has only 3 unassigned. I suggest you just say
32 control characters in the upper range.
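
The counts above can be checked with a few lines of Python. This is a
small sketch: the variable names are mine, and the octal ranges are the
ones quoted in this message, not values taken from the draft.

```python
# Octal code position ranges as quoted in the text above.
lower_controls = range(0o000, 0o037 + 1)
upper_controls_with_del = range(0o177, 0o237 + 1)
upper_controls = range(0o200, 0o237 + 1)
upper_printable = range(0o240, 0o377 + 1)

NBSP = 0o240  # non-breaking space
SHY = 0o255   # soft hyphen

control_total = len(lower_controls) + len(upper_controls_with_del)
graphic_upper = len(upper_printable) - len({NBSP, SHY})

print(control_total, len(upper_controls), len(upper_printable), graphic_upper)
# prints: 65 32 96 94
```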

page 143: &nbsp; should be equivalent to ISO 8859-1 character 0240, though
this is not clear from the text. A browser should treat them as the same.
In the last part, change 55 control characters to 62.

page 151: &#160; is not unused. This is the non-breaking space!
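
As a quick check of the equivalence argued for on pages 143 and 151, here
is a sketch using the html module from the Python standard library. It
shows how one existing decoder behaves, not what the draft requires:

```python
import html

NBSP = chr(0o240)  # ISO 8859-1 code position 0240 octal, 160 decimal

named = html.unescape("&nbsp;")
numeric = html.unescape("&#160;")

# The named entity, the numeric reference and the raw character all agree.
print(named == numeric == NBSP)
```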

URIs and URLs
--------------
Some parts say URI and some say URL, which is somewhat confusing. Use
the same term everywhere. Also include at the beginning a short
description of what a URI is (or a URL if you intend to use that in the
text).

Case sensitivity versus insensitivity
--------------------------------------
It is somewhat unclear which names, tags, classes etc. are
case insensitive.
I suggest you include a part at the beginning defining the general
behaviour, with exceptions noted in the text where they apply.
To keep writing simple, I recommend that as much as possible be
case insensitive. For example ID, NAME, REL, CLASS, LANG and ALIGN.

LANG value
-----------
Is there an ISO standard saying that a period should be used between
language and country? Unix locale standard uses "_".

Pictures and text flow
-----------------------
It is somewhat unclear how text flows together with pictures (images, tables,
etc). It appears that text may not flow around a picture if it is centered.
Can text only flow when pictures are aligned to the left or right side?
Can I have something like this?

xx xx xx xx x xxx xx x x xxx xx xxxx xxxx
!------! xxx xx xxxx xxx xxx xxx xxx xxxx
!      !!-----! xx xx x xxx xxx  !------!
!      !!     ! xxx xxx xxx xxxx !      !
!      !!     ! xx xxx xxx !----!!      !
!      !!     ! xxx xxx xx !    !!      !
!------!!     ! xx xxxx xx !    !!      !
        !-----! xxx xx xxx !    !!      !
xx xxx xx xxx xxx xxxx xxx !----!!      !
xx xxx xxxxx xxx x xx xxxxx xxxx !      !
xx xxxx xxxx xxxx xxxx xxxx xxxx !------!
xxxx xx x xx xxxx xxx x xxxx xxxxx xxxx x

Include a passage at the beginning explaining how text flows around
pictures.

Page 11
-------
Names have up to 72 letters. What type of letters? In Swedish, letters include
characters outside ASCII!
Also, when setting up such a curious limit as 72 characters, give the
reader a good reason why! Why not 132?

Page 12
--------
Max length of attribute value is 1024, why not more? Give a good reason in the
text.

Page 52
--------
In what format are clicks on images sent to the server?

Page 70
--------
In FIG mainmenu: I suspect the HREFs for News, Products and
Worldwide Contacts are wrong.

Regards,

Dan

--
Dan Oscarsson
Telia Research AB                       Email: Dan.Oscarsson@malmo.trab.se
Box 85
201 20  Malmo, Sweden