charset labelling [was: ISO/IEC 10646 as Document Character Set]

Dan Connolly (connolly@w3.org)
Wed, 3 May 95 19:07:07 EDT

Amanda Walker writes:
> > But can we Westerners really dictate what the Japanese should do
> > with their "corner" of the Internet??? Especially since the default
> > is iso-8859-1, which means that we are not impacted.

If the Japanese had developed and deployed a freely available
information system that was rich enough to support all the
applications that the web supports, then we'd be faced with this
problem, and not they. But here we are. The web supports ISO-8859-1
in a standard fashion, and other encodings on an ad-hoc
basis. Standards in this area are emerging.

> From a purely pragmatic standpoint, this is why I am interested in
> some convention (even an admitted hack, like putting in a
> specially-formatted SGML comment that I can look for) which could be
> deployed by content authors,

Hack all you want. Put out a proposal. Lobby for support. But by
the time you've done this, the servers will be revved to support
a non-hack solution, a la:

foo.msg:

Content-Type: text/html; charset=x-shift-jis

<html>
<title>...</title>
...

By the way: I prefer "foo.msg" to "foo.mim" or "foo.www" or whatever:
when you stick headers on top of an HTML file (or GIF file or ...),
what you've got is now a "message entity" from the MIME and HTTP
specs. The data format that folks tend to call "MIME format" is really
just the internet media type "message/rfc822".
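
To make the "message entity" idea concrete, here is a minimal sketch
in Python of how a server or client might read the charset label off
such a file. It assumes the file really is RFC 822 style (headers, a
blank line, then the body) and uses the standard library's email
parser; the FOO_MSG contents and the label_of helper are made up for
illustration and do not come from any spec.

```python
from email import message_from_string

# Hypothetical contents of foo.msg: headers, a blank line, then the body.
FOO_MSG = (
    "Content-Type: text/html; charset=x-shift-jis\n"
    "\n"
    "<html>\n"
    "<title>...</title>\n"
)


def label_of(entity_text):
    """Return (media type, charset) taken from the Content-Type header.

    charset is None when the header carries no charset parameter, in
    which case the caller would fall back to its default.
    """
    msg = message_from_string(entity_text)
    return msg.get_content_type(), msg.get_param("charset")


media_type, charset = label_of(FOO_MSG)

# Everything after the blank line is the entity body, untouched.
body = message_from_string(FOO_MSG).get_payload()
```

The point of the sketch is only that the label travels outside the
HTML itself, so nothing has to sniff the document body to find it.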

Hmmm... maybe this is how META should have been implemented in the
first place...

> rather than being dependent on server
> administrators revving their software.

Let's be careful we don't let business issues cloud the technical
topics here. There's a standard for the protocol, and a standard for a
data format.

There are also a bunch of widely deployed conventions about how HTTP
servers work, which are quickly turning into products. If you want to
ensure interoperability between those products, a public specification
(e.g. a standard) is the way to go.

For example, look at the CGI spec: despite the fact that it hasn't
really reached the status of a published standard, it does represent an
open discussion between the developers of the various servers, so that
server administrators can preserve their investment in scripts
across server implementations.

Dan