Re: ISO/IEC 10646 as Document Character Set

Albert Lunde (Albert-Lunde@nwu.edu)
Thu, 4 May 95 19:30:27 EDT

>OK... now I've got exact changes I can make to the document. I'll
>have to get and install SP so I can test the DTD from now on, but
>that's long overdue anyway.
>
>The real question is: what does this mean to information providers?
>Does it solve any of their problems?
[...]
>I don't see how putting half the solution -- ISO10646 as a document
>character set, with no deployed support and no specification for
>support of other encodings -- in the 2.0 document is better than
>leaving 2.0 as is and providing a complete specification in another
>document.

It implies an interpretation of numeric character references and an SGML
declaration that shouldn't break when used with other MIME charsets, but I
kind of agree that implementers really don't have enough detail to work
with it unless they read between the lines a lot (or read the WG archives
;).

It may lead experimentation with other charsets in an interoperable
direction, by avoiding a legacy of documents with incompatible numeric
character references. (SGML mavens? is this the main issue it impacts?)

The argument seems to be that it is easier to specify ISO10646 than it is
to leave in Latin-1 and word the document to make clear the direction of
future internationalization. I'm not _sure_ this is true; if so it may be a
consequence of trying to say _anything_ about other MIME charsets: it's
easier to interpret other charsets with a document character set of ISO10646
than it is when we use Latin-1.
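
To make the point about numeric character references concrete, here is a
rough sketch of the interpretation I have in mind. It assumes the bytes on
the wire are decoded with whatever MIME charset was declared, while &#N;
always names code position N in ISO10646, whatever the transfer encoding.
The function name resolve() and the sample documents are made up for
illustration; nothing here is taken from the draft, and it ignores named
entities and out-of-range numbers.

```python
import re

# Decimal numeric character references such as &#233;
_NCR = re.compile(r"&#([0-9]+);")

def resolve(raw: bytes, mime_charset: str) -> str:
    """Hypothetical helper: decode bytes with the MIME charset, then
    resolve each &#N; against the document character set (ISO 10646),
    not against the charset used for transfer."""
    text = raw.decode(mime_charset)
    return _NCR.sub(lambda m: chr(int(m.group(1))), text)

# The same reference, &#233;, sent under two different MIME charsets.
latin1_doc = "caf\u00e9 &#233;".encode("iso-8859-1")
koi8_doc = "\u043a\u043e\u0444\u0435 &#233;".encode("koi8-r")

print(resolve(latin1_doc, "iso-8859-1"))
print(resolve(koi8_doc, "koi8-r"))
```

In both cases &#233; comes out as the same character (e with acute accent),
even though byte value 233 means something else in KOI8-R. If the reference
were read in terms of the transfer charset instead, the two documents would
disagree, which is the kind of incompatible legacy I was worried about above.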

We might as well admit in the RFC that this (ISO10646) is a "feature", the
uses of which will be specified further later, which was introduced because
it was upward compatible with current usage.

---
    Albert Lunde                      Albert-Lunde@nwu.edu