Re: ISO/IEC 10646 as Document Character Set

Glenn Adams (glenn@stonehand.com)
Sun, 30 Apr 95 11:33:10 EDT

Date: Fri, 28 Apr 95 22:17:27 EDT
From: connolly@w3.org (Dan Connolly)

Gavin Nicol writes:
> >I was under the impression we needed this (10646 as the standard
> >document character set) for 2.0 in some form to resolve the
> >question of numeric character references, or else we needed to
> >remove some language about other charsets and make 2.0 talk
> >about Latin-1 only.
>
> Again, isn't 2.0 about current practise, which in the area of
> character sets is: ISO 8859-1 is all we define behaviour for.

That's the way I see it. The 2.0 spec describes a set of features
that are widely deployed. Unicode is not widely deployed in web
browsers (and certainly wasn't in June '94...)

The internationalization issues deserve their own document, not
a "quick sneak" into the 2.0 document.

I think it's important to recognize that:

1. changing the doc charset to 10646 has no implications for
currently deployed browsers, etc.; that is, it doesn't invalidate any
current practice (other than the current hodge-podge of usage re: numeric
char refs -- which is broken in any case).

2. the 2.0 spec as an RFC will be the first official spec as a standard [as
far as I'm aware]; therefore, it is very important not to artificially
limit its applicability, as would be the case if it goes out using 8859-1
as the doc charset.
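
To illustrate point 1, here is a rough sketch (Python, purely for
illustration) of how a numeric char ref would resolve under each doc
charset. The function name and the charset labels are made up for this
example and are not taken from any spec or browser. It relies only on the
fact that the first 256 code positions of 10646 coincide with 8859-1.

```python
import re

def resolve_numeric_refs(text, doc_charset):
    """Replace &#N; references according to the given document character set.

    Hypothetical helper: the name and the charset labels ("iso-8859-1",
    anything else meaning ISO/IEC 10646) are assumptions for this sketch.
    """
    def repl(match):
        code = int(match.group(1))
        if doc_charset == "iso-8859-1":
            if code > 255:
                # No such position in Latin-1; leave the reference as written.
                return match.group(0)
            return bytes([code]).decode("latin-1")
        if code > 0x10FFFF:
            # Beyond what Python's chr() accepts; leave as written.
            return match.group(0)
        # Treat the number as an ISO/IEC 10646 code position.
        return chr(code)

    return re.sub(r"&#(\d+);", repl, text)
```

Under these assumptions, any reference in the range 0-255 yields the same
character with either doc charset, so documents written against Latin-1
keep their meaning; only references above 255 behave differently, and
those have no defined meaning under 8859-1 today.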

If people agree with these two points, then I see no reason not to
resolve this issue in the first 2.0 RFC. The paper describing the
larger issues re: I18N, i.e., how to make use of the full capabilities
of 10646, etc., is largely independent of this decision. I would vote
that we don't tie the two together (that is, publishing an i18n doc and
deciding to use 10646 as the doc charset).

Glenn