Re: Character sets

Albert Lunde
Tue, 7 Feb 95 00:32:24 EST

> Given that we really do want a multilingual WWW, and that we do want
> HTML to be a conforming application of SGML, and that we want to
> acknowledge that yes, there is data in things other than ISO8859-1, we
> have this little problem of the SGML declaration.
> I have stated time and again that ERCS (Unicode) is the best answer,
> and as yet, we have no resolution. We also have no alternative
> proposal as yet.

I'd suggest that we should allow use of any of the character sets
referenced in the MIME RFCs (ISO-8859-X for X=1 to 9), for
interoperability with MIME, if nothing else. (In addition
to Unicode.)
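
As a rough illustration of what "allow the MIME charsets plus Unicode"
could mean for a client, here is a minimal Python sketch. It is only an
assumption of how such a check might look: the names ALLOWED_CHARSETS,
charset_of, is_allowed and decode_body are made up for this example, and
"utf-8" merely stands in for "Unicode", since no particular Unicode
encoding is specified here. The ISO-8859-1 default for a missing charset
parameter is likewise an assumption.

```python
from email.message import Message

# Assumption: ISO-8859-1 through ISO-8859-9, plus one Unicode encoding
# (UTF-8 is a stand-in chosen for this sketch).
ALLOWED_CHARSETS = {f"iso-8859-{n}" for n in range(1, 10)} | {"utf-8"}


def charset_of(content_type, default="iso-8859-1"):
    """Return the lowercased charset parameter of a Content-Type value.

    Falls back to `default` when no charset parameter is present.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return (msg.get_content_charset() or default).lower()


def is_allowed(content_type):
    """True if the labelled charset is in the allowed set."""
    return charset_of(content_type) in ALLOWED_CHARSETS


def decode_body(raw, content_type):
    """Decode raw bytes according to the labelled charset, or refuse."""
    charset = charset_of(content_type)
    if charset not in ALLOWED_CHARSETS:
        raise ValueError(f"charset not allowed: {charset}")
    return raw.decode(charset)
```

The same byte decodes differently depending on the label, which is why
the label has to travel with the document: decode_body(b"\xc0", ...)
gives a Cyrillic letter under ISO-8859-5 and an accented Latin letter
under ISO-8859-1.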

I didn't hear a consensus the last time this subject came up,
but it seemed that many of the objections raised to Unicode
as a device for multilingual documents would be addressed
by Unicode plus some explicit way to indicate changes in language:
either a tag or some low-level mechanism.

(I'm not sure which aspect of the SGML declaration you are
seeing as the problem.)

It seems like a language tag would be simple, if a bit free with
bandwidth. Would the options of Unicode or the MIME ISO charsets
plus a language tag for markup be flexible enough to satisfy
all parties?

    Albert Lunde