Re: ISO/IEC 10646 as Document Character Set

Glenn Adams (glenn@stonehand.com)
Wed, 3 May 95 18:37:37 EDT

Date: Wed, 3 May 95 16:47:39 EDT
From: erik@netscape.com (Erik van der Poel)

But can we Westerners really dictate what the Japanese should do with
their "corner" of the Internet??? Especially since the default is
iso-8859-1, which means that we are not impacted.

I have an alternative proposal that just may satisfy the largest
community (just maybe).

We could say that the default encoding scheme is baseline "ISO-2022".
Furthermore, we could say that, by default, the following initial state
is to be assumed:

1. ISO 2022 level 1 (ESC 2/0 4/12), i.e.,
- 8-bit code
- C0 code element
- G0 code element having GL shift status
- SPACE & DELETE
- optionally a C1 code element in CR
- G1 code element having GR shift status

2. an initial designation of 8859-1 to G0/G1, i.e.,
- ESC 2/8 4/2 (ASCII -> G0)
- ESC 2/13 4/1 (8859-1 -> G1)
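
For anyone not used to column/row notation, here is a rough Python sketch
of the byte sequences that the initial state above corresponds to. The helper
names (code, esc, DEFAULT_PREFIX) are made up for illustration and are not
part of any standard or library; the sketch only converts the positions
quoted above into bytes.

```python
ESC = 0x1B  # the ESCAPE control character, position 1/11


def code(pos: str) -> int:
    """Convert column/row notation such as '4/12' to a byte value."""
    col, row = (int(part) for part in pos.split("/"))
    if not (0 <= col <= 15 and 0 <= row <= 15):
        raise ValueError(f"bad code position: {pos}")
    return col * 16 + row


def esc(*positions: str) -> bytes:
    """Build an escape sequence from ESC plus the given code positions."""
    return bytes([ESC] + [code(p) for p in positions])


# Item 1: the announcer quoted above.
ANNOUNCE_LEVEL_1 = esc("2/0", "4/12")   # ESC SP L

# Item 2: the two initial designations quoted above.
ASCII_TO_G0 = esc("2/8", "4/2")         # ESC ( B
LATIN1_TO_G1 = esc("2/13", "4/1")       # ESC - A

# Hypothetical: what a sender would emit if it chose to make the
# assumed default state explicit at the start of a document.
DEFAULT_PREFIX = ANNOUNCE_LEVEL_1 + ASCII_TO_G0 + LATIN1_TO_G1
```

Under the proposal a sender would not have to transmit any of this; a
receiver would simply behave as if DEFAULT_PREFIX had already been seen.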

In addition, we could say that, for any embedded DOCS (designation of other
coding systems) data in which byte order is not specified, a big-endian
byte order is to be assumed.
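
A minimal sketch of that byte-order rule for 16-bit data, in Python. Two
things here are assumptions rather than part of the proposal: the function
name decode_ucs2 is hypothetical, and a leading FE FF / FF FE signature is
taken as the way byte order gets "specified". Python has no dedicated UCS-2
codec, so the UTF-16 codecs stand in for it; they behave the same for data
that contains no surrogate pairs.

```python
def decode_ucs2(data: bytes) -> str:
    """Decode 16-bit coded data, assuming big-endian when unmarked."""
    if len(data) % 2 != 0:
        raise ValueError("16-bit data must have an even number of bytes")
    if data[:2] == b"\xff\xfe":
        # Signature says little-endian: honor it and drop the signature.
        return data[2:].decode("utf-16-le")
    if data[:2] == b"\xfe\xff":
        # Signature says big-endian: drop the signature.
        data = data[2:]
    # No signature (or a big-endian one): apply the proposed default.
    return data.decode("utf-16-be")
```

For example, decode_ucs2(b"\x00A\x00B") gives "AB" because the unmarked
bytes are read most-significant byte first.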

Given this specification of a default, both ISO 8859-1 and ISO-2022-JP
systems could be conformant in the default case. In the latter case, however,
additional announcers would have to be transmitted or assumed.

Furthermore, since 10646 coded as UCS-2, UCS-4, UTF-8, UTF-16, etc. can all
be designated through DOCS, this would also allow those encodings to operate
in the default case (assuming a client could grok 2022 escapes).
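
To give a feel for what "grokking" those escapes involves, here is a hedged
Python sketch that only locates DOCS sequences in a byte stream. It relies
on the general shape of a DOCS sequence in ISO 2022: ESC 2/5 F, or
ESC 2/5 2/15 F for a coding system without standard return. The function
name find_docs is hypothetical, and no final byte is mapped to a particular
encoding here; that lookup belongs to the registry and is left out.

```python
import re

# ESC 2/5, an optional 2/15 intermediate, then one final byte.
DOCS_PATTERN = re.compile(rb"\x1b%(/?)([\x30-\x7e])")


def find_docs(data: bytes) -> list[tuple[int, bool, str]]:
    """Return (offset, has_standard_return, final_byte) for each DOCS."""
    found = []
    for match in DOCS_PATTERN.finditer(data):
        has_standard_return = match.group(1) == b""
        final_byte = match.group(2).decode("ascii")
        found.append((match.start(), has_standard_return, final_byte))
    return found
```

A real client would have to stop scanning for escapes once it has switched
into a coding system without standard return, since bytes that look like
ESC 2/5 may then be ordinary data; this sketch ignores that.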

How does this compromise sound? Brain dead or what?

Glenn