Re: HTML 2.0 comments (First of two)

Bert Bos (bert@let.rug.nl)
Wed, 23 Nov 94 15:37:07 EST

Sandra, you're absolutely right that i18n isn't handled very well by
HTML 2.0. This will be addressed, but it's too late for HTML 2.0,
which is supposed to document `current practice'.

The next version (3.0) will have attributes on all elements to set
the character set (and an attribute to set the language as well).
The value of this attribute will probably be a string defined by the
OSF registry you mentioned.

This per-element character set is not completely satisfactory, but
it will get us closer to the next step, which is ISO 10646 in any of
its guises (probably UTF-8). (People have talked about ISO 2022, but
nobody really seems to want it.)

With respect to character entities in HTML 3.0, there are basically
three types of those:

1. entities for characters that are also in the current
   charset. These are not very useful, except to people who have
   to create documents in a hostile environment, like a text
   editor under DOS.
2. entities for characters in a different charset. We'll need
   these to refer to, e.g. Latin-1 characters in the midst of an
   HTML element with charset Latin-8.
3. entities for characters that are in no (standard) charset at
   all, i.e., that have no number. Clearly we'll need these. (An
   example is the WWWicn set of common icons, included in HTML+.)
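
The three groups can be sketched as a small classifier. This is only an
illustration under stated assumptions: it uses the entity table that
ships with a present-day Python (html.entities.name2codepoint, which maps
names to code point numbers), it treats "has no entry in that table" as
"has no number", and the function name classify and the entity name
"hypothetical.icon" are made up for the example.

```python
# Sketch: sort an entity name into group 1, 2 or 3 relative to the
# charset currently in effect. classify is a hypothetical helper.
from html.entities import name2codepoint

def classify(entity_name: str, current_charset: str) -> int:
    """Return 1, 2 or 3 following the three groups described above."""
    codepoint = name2codepoint.get(entity_name)
    if codepoint is None:
        # Assumption: no table entry means the character has no number.
        return 3
    try:
        chr(codepoint).encode(current_charset)
    except UnicodeEncodeError:
        # The character has a number, but not in the current charset.
        return 2
    # The character could have been typed directly in this charset.
    return 1

for name in ["eacute", "Omega", "hypothetical.icon"]:
    print(name, "-> group", classify(name, "latin-1"))
```

With Latin-1 as the current charset, &eacute; lands in group 1, &Omega;
in group 2, and the made-up icon entity in group 3.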

Until we can confidently specify Unicode as the character set for
HTML, we'll have to support as many entities as possible. I'd say we
should simply include the whole ISO set. Maybe HTML 4.0 or 5.0 can
drop all entities, except for group (3).

HTML 3.0 isn't ready yet. The version implemented by the Arena
prototype browser is clearly not complete. My own prototype isn't
much better. Arena-HTML lacks many important entities and
elements. But HTML 2.0 will hold out for a few months, so we have
time enough to get 3.0 right.

Bert

|I recently had a chance to read the HTML 2.0 specification, and
|have some serious concerns about its design with respect to
|internationalization (I18N) issues. I handle I18N at OSF, and
|have some suggestions for ways to change the HTML spec to
|accommodate international requirements better. My suggestions
|fall into two categories -- an overall design issue covered in
|this message, and a separate set of comments on individual
|sections in the spec (emailed separately). Please let me know
|if you have comments or questions.
|
| Best regards,
| -- Sandra
|
|---------------------------------------------------------------------
|Sandra Martin O'Donnell email: odonnell@osf.org
|Open Software Foundation phone: +1 (617) 621-8707
|11 Cambridge Center fax: +1 (617) 225-2782
|Cambridge, MA 02142 USA
|---------------------------------------------------------------------

-- 
___________________________________________________________________________
####[ Bert Bos                     ]####[ Alfa-informatica,           ]####
####[ <bert@let.rug.nl>            ]####[ Rijksuniversiteit Groningen ]####
####[ http://www.let.rug.nl/~bert/ ]####[ Postbus 716                 ]####
####[                              ]####[ NL-9700 AS GRONINGEN        ]####
####[______________________________]####[_____________________________]####