Re: ISO/IEC 10646 as Document Character Set

Amanda Walker (amanda@intercon.com)
Thu, 4 May 95 11:45:53 EDT

> My question is not "Can we do 2022 please?"
> It's "How do we get from the current situation to one where the charsets
> are labelled?" This is the pressing issue that I think Amanda is also
> concerned about.

Actually, no. My pressing issue is simpler than that. My pressing issue
is "I already have a multilingual WWW client running on a multilingual OS.
How do I figure out how to interpret content which uses characters outside
of the repertoire of ISO 8859-1?"

That's it. Given an arbitrary HTML document, I want to know how to interpret
it even if it doesn't come from western Europe. Now, the mechanism I prefer
is getting people to put Content-Type: headers into their HTTP transactions
that declare the character encoding for the document which follows. However,
I have also seen plenty of evidence that this strategy has not yet met with
much success (thus leading to conversations that devolve into "but you
HAVE to support SJIS!"/"but unlabelled SJIS is evil!" and so on). The
only advantage to some kind of in-band declaration, even a hacked one like
using <META>, is that people revise content much more often than they revise
or reconfigure servers. That's all I've been meaning to say. I don't think
that in-band labelling is a good thing--I just think that the status quo is
even worse :).
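
To make the two mechanisms concrete, here is a rough sketch of how a client
might pick a character encoding. It is an illustration only, not how any
particular browser does it. The function names, the <META HTTP-EQUIV> pattern,
the 1024-byte scan window, and the rule that the HTTP header wins over the
in-band label are all assumptions made for the example.

```python
import re
from email.message import Message

# Fallback when nothing is labelled (the status quo assumption).
DEFAULT_CHARSET = "iso-8859-1"

# Hypothetical pattern for an in-band label of the form
# <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=...">
META_RE = re.compile(
    rb'<meta\s+http-equiv\s*=\s*["\']?content-type["\']?'
    rb'\s+content\s*=\s*["\']([^"\'>]*)["\']',
    re.IGNORECASE,
)


def charset_from_content_type(value):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()


def guess_charset(http_content_type, body):
    """Pick an encoding label for an HTML document.

    http_content_type: the Content-Type header value, or None if absent.
    body: the raw document bytes.
    """
    # 1. Out-of-band label: the Content-Type header of the HTTP transaction.
    if http_content_type:
        charset = charset_from_content_type(http_content_type)
        if charset:
            return charset

    # 2. In-band label: a <META> element near the top of the document.
    #    (Assumption: only the first 1024 bytes are scanned.)
    match = META_RE.search(body[:1024])
    if match:
        charset = charset_from_content_type(match.group(1).decode("ascii", "ignore"))
        if charset:
            return charset

    # 3. Nothing labelled: fall back to the default.
    return DEFAULT_CHARSET
```

With this sketch, a server that sends "text/html; charset=Shift_JIS" needs no
change to the document, while an unconfigured server can still be worked
around by an author who adds the <META> line to the content itself.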

Amanda Walker
InterCon Systems Corporation