This question did get raised (in some form) a month or two back,
and died down again with little effect. Discussion of Unicode has
been happening in fits and starts for months, but nobody
has come close to making a proposal that is _more_ comprehensive.
(I'd also suggest that the technical issues differ from MIME,
because we need a scheme that will work well with SGML.)
The most serious objection raised (how to render the Asian languages)
was addressed by proposing markup for languages (coming in HTML 3.0
at least).
The idea of using Unicode as the document character set was
motivated in part by fixing the numeric references issue, but now that
I understand it (sort of) I think it will clean up a bunch
of loose ends in SGML and improve interoperability between
various encodings.
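To make the numeric-references point concrete, here is a minimal
sketch (a hypothetical example, not part of the proposal text): with
ISO 10646 as the document character set, a numeric character
reference names a code position directly, independent of whatever
MIME transfer encoding the document bytes arrived in.

```python
import html

# Document bytes arrive in some MIME charset, e.g. ISO-8859-1.
raw = b"Tsch&#252;&#223; caf\xe9 &#1055;&#1088;&#1080;&#1074;&#1077;&#1090;"

text = raw.decode("iso-8859-1")   # step 1: undo the transfer encoding
resolved = html.unescape(text)    # step 2: resolve references against ISO 10646

# The Cyrillic word comes through even though ISO-8859-1 itself
# cannot encode a single Cyrillic character.
print(resolved)  # Tschüß café Привет
```

The point of the two-step decoding is exactly the interoperability
claim above: the reference &#1055; means the same character no matter
which 8-bit charset carried the surrounding text.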
Our problem is not how to encompass the largest possible writing
system imaginable; our problem is how to write a standard that
goes beyond ISO Latin-1 and ISO-8859-X. I think the proposed
direction of using Unicode as the document character set
combined with a wide choice of MIME encodings does this
well, and increases the scope of possible characters from
255 to over 30 thousand. If this turns out not to be sufficient,
I'm sure we can do an extension mechanism or format negotiation
to allow for use of a different document character set.
But I don't see why hypothetical objections in the absence
of a concrete counter-proposal should stand in the way of the
large improvement in internationalization we get by adopting
ISO 10646 as a document character set, at least for the
"next step".
I'm also a little surprised that more wasn't said sooner: we
arrived at the present proposal through repeated discussions over
several months. Any objections that would apply to ISO/IEC 10646
as being too restrictive (rather than too general) apply also
to the use of ERCS, which was floated by Gavin as far back
as Dec 94.
-- Albert Lunde Albert-Lunde@nwu.edu