Re: ISO/IEC 10646 as Document Character Set

Gavin Nicol (gtn@ebt.com)
Wed, 3 May 95 18:56:48 EDT

>The HTTP spec already has some wording about charsets, but it seems
>that hardly any servers out there are appending the charset parameter
>to the content-type header.

Sad, but true.

>It's easy and tempting to say "They're broken and must be fixed".

This is the best route, or we'll face bugward compatibility problems
forever.

>But there's the installed base and the interoperability currently being
>enjoyed (yes, even in Japan).

Most companies here don't even know what the Internet is, let alone
the WWW (they know the *names*, but that's it). The interoperability
you refer to is:

"Hey. This squiggly stuff looks like EUC. Now, pull down the Options
menu, change the font... ahh, now it looks OK."

>A server administrator can't just add the charset parameter without
>thinking about the consequences. What if there are browsers out
>there that currently display Japanese just fine, but have problems
>when there is a charset parameter?

Then they are BROKEN. The most popular browsers in Japan are Mosaic
L10N, Netscape, and a few others. I believe they will all support the
charset parameter in the near future (if they don't already). They
should also support the Accept-Charset: field.
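
To make "support the charset parameter" concrete, here is a minimal
client-side sketch in Python. It is an illustration only, not code from
any of the browsers named above: the function names charset_of and
decode_body are hypothetical, and the fallback to ISO-8859-1 follows
the default mentioned in this thread.

```python
from email.message import Message

def charset_of(content_type, default="iso-8859-1"):
    """Return the charset parameter of a Content-Type value, lowercased."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns `default` when no charset parameter is present.
    return msg.get_content_charset(default)

def decode_body(body, content_type):
    """Decode raw bytes using the labelled charset instead of guessing."""
    return body.decode(charset_of(content_type))
```

With a header of "text/html; charset=EUC-JP" the body is decoded as
EUC-JP with no trip to the Options menu; an unlabelled "text/html" body
falls back to ISO-8859-1.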

>Again, it's easy for this working group to say "Those servers/clients
>are broken. Fix them."

Yes, and we should.

>But can we Westerners really dictate what the Japanese should do with
>their "corner" of the Internet??? Especially since the default is
>iso-8859-1, which means that we are not impacted.

Yes, you can, and should. Anything else is equivalent to sticking your
heads in the sand, like with the document character set issue.
I live in Japan, I have lived here for 10 years, and I probably did
the first Japanese version of Mosaic (maybe a month or 2 ahead of the
NTT guys) when I worked at NEC. I know the state of the net here, and
my primary motivation for being on the list is to get the state of the
Japanese WWW improved (though it seems education plays a large part).

The WWW here is just now starting up. The user base is small *now*,
but in 6 months, it will be much, much larger. At that time, it will
be a much, much, bigger problem to solve. If we strike *now*, and say
that servers *must* label the data correctly, and that clients
*should* send an Accept-Charset field, they will become widespread
practice.

There is a boom starting, and we have one chance *now* to get it
right. Let's not blow it.

>It might be a good idea to have clients tell servers that they are capable
>of parsing the charset parameter. This is similar to Dan's proposal
>to have clients tell servers that they can do HTML 3.0 (tables, etc).

This is already *in* HTTP. We thrashed around on this 6 months
ago. Clients *should* send an Accept-Charset (perhaps poorly named)
field if they can accept anything other than ISO-8859-1, but
servers *must* label the documents correctly.
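
A rough server-side sketch of that division of labour, in Python. The
names parse_accept_charset and label_response are hypothetical, and
treating the order of the Accept-Charset list as the order of
preference is an assumption made for this illustration, not something
taken from the spec. A client that sends no Accept-Charset is assumed
to accept only ISO-8859-1.

```python
def parse_accept_charset(header):
    """Split an Accept-Charset value into lowercase charset names, in order."""
    names = []
    for item in header.split(","):
        # Drop any ";param" suffix; this sketch only looks at the name.
        name = item.split(";", 1)[0].strip().lower()
        if name:
            names.append(name)
    return names

def label_response(text, accept_charset=""):
    """Pick a charset the client accepts and label the body with it.

    Returns (content_type, body_bytes), or None if nothing acceptable
    can represent the text.
    """
    # ISO-8859-1 is always tried last, as the assumed default.
    for cs in parse_accept_charset(accept_charset) + ["iso-8859-1"]:
        try:
            body = text.encode(cs)
        except (UnicodeEncodeError, LookupError):
            continue  # charset unknown here, or it cannot hold this text
        return "text/html; charset=" + cs, body
    return None
```

The point is that the label always travels with the data: whatever
encoding the server picks, it says so in the Content-Type it returns.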

>Please let me know what you all think. Perhaps the Japanese should be
>involved in this discussion?

Why don't you go over to the www-mling mailing list and invite them
over? Last time anything was Cc'd there, nothing happened.

Considering how popular Netscape is in Japan, your article exhibits a
fundamental misunderstanding of the market, the network scene, and the
issues involved here.