Re: ISO/IEC 10646 as Document Character Set

Gavin Nicol (gtn@ebt.com)
Thu, 4 May 95 03:04:36 EDT

>>Most companies here don't even know what the Internet is, let alone
>>the WWW (they know the *names*, but that's it).
>
>So you're saying that it's OK to destroy the interoperability being
>enjoyed by the companies that *do* know what the Internet and WWW are?

Name some. Most of the leading companies in this area would be more
than willing to follow the IETF lead, so long as they felt the IETF
actually offered a solution.

>Have you asked them how they feel about this?

In my line of work, I talk to many companies, and yes, I know their
(general) opinions.

>Netscape 1.1 (Win/Mac) automatically distinguishes between EUC, SJIS
>and ISO-2022-JP. The user simply sets a preference for autodetection
>(the first time they run Netscape), and Netscape does the rest.

Sure. Autodetection works some of the time. Why do you think there is
autodetection?
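
To illustrate why detection can only work some of the time, here is a
minimal sketch (not how Netscape or any other browser actually does
it) that simply asks which candidate encodings can decode a byte
string without error. The function name and the candidate list are
assumptions made for illustration.

```python
def plausible_encodings(data, candidates=("euc_jp", "shift_jis", "iso2022_jp")):
    """Return the candidate encodings under which `data` decodes cleanly."""
    ok = []
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # not valid in this encoding, rule it out
        ok.append(name)
    return ok


# Two bytes that are legal both as half-width katakana in Shift_JIS
# and as a single two-byte character in EUC-JP.
ambiguous = b"\xb1\xb2"
print(plausible_encodings(ambiguous))
print(ambiguous.decode("shift_jis") == ambiguous.decode("euc_jp"))
```

When more than one encoding survives, the bytes alone cannot settle
the question, and only a label from the server can.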

>If there is so little interoperability in Japan, how do you explain
>the existence of so many documents on the many servers listed in
>NTT's list of Japanese Web sites?

Those sites are primarily there for experimental purposes, or just to
show that the company "is on the net". Most of that data is specially
prepared so that it will work.

>Netscape 1.1 supports the charset header and a few charsets. Does
>Mosaic L10N support the charset header?

Not yet, but by all accounts it soon will. I should note that access
from PCs is increasing very rapidly here, because magazines etc. are
publishing articles on the WWW and because NIFTY etc. have PC
browsers readily available. As such, Mosaic L10N is becoming less
influential.

>>Yes, you can, and should.
>
>OK, so that's your opinion. Now we need to find out what the Japanese
>themselves think.

Come here and talk to them then. I talked to a fellow who did the
localisation work on LYNX. I explained most of the I18N ideas to
him. He said the labelling idea was good, and after I explained
just what was meant by using ISO 10646, he said that was a good idea
too.

>>The WWW here is just now starting up. The user base is small *now*,
>>but in 6 months, it will be much, much larger. At that time, it will
>
>I hope you're right, but shouldn't we check with the Japanese?

I talk to major publishers and software companies almost every day. I
have discussed WWW issues with many people here. Most people
understand and agree with what I say.

>>There is a boom starting, and we have one chance *now* to get it
>>right. Let's not blow it.
>
>I agree 200%.

Then do it right.

>The Accept-Charset header is quite different from what I have in mind.

If a client can send an Accept-Charset, don't you think it would also
be able to understand the charset parameter? The amount of work
involved in adding support for another field would probably be roughly
equivalent to supporting the charset parameter on the MIME type.
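
As a rough sketch of how little is involved on the client side, the
following reads the charset parameter off a Content-Type value and
uses it to decode the body. It leans on Python's standard email
package; the helper name and the ISO-8859-1 fallback for unlabelled
documents are assumptions made for illustration.

```python
from email.message import Message


def charset_of(content_type, default="iso-8859-1"):
    """Extract the charset parameter from a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lowercases the value and returns the
    # fallback when no charset parameter is present.
    return msg.get_content_charset(default)


header = "text/html; charset=ISO-2022-JP"
body = "日本語".encode("iso-2022-jp")  # stand-in for bytes off the wire

print(charset_of(header))
print(body.decode(charset_of(header)))
```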

>Yeah, one of the problems is that it is difficult for them to engage
>in discussion in English.

This is actually not true. Most Japanese can communicate quite well
via email. They can be a little intimidated by English, however.

>Can you read/write Japanese? Are you on the infotalk mailing list at
>NTT? The www-mling list seems quiet these days. But infotalk is a
>bit more active.

Yes, I can of course read and write Japanese, or I'd not be working
here.

I'll tell you what. I'll talk to the people working on Mosaic L10N,
and make sure it supports the charset parameter in the next
release.

The biggest problem is the servers not labelling things correctly. Can
server authors please make sure there is some way to do this in the
software you release?
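
One possible shape for such a mechanism, sketched under assumptions:
the server maps a file-name suffix to a charset and emits it on the
Content-Type. The suffix convention, the table, and the function name
below are all hypothetical; a real server might equally use a
per-directory setting.

```python
# Hypothetical suffix convention; not taken from any existing server.
CHARSET_BY_SUFFIX = {
    ".sjis.html": "Shift_JIS",
    ".euc.html": "EUC-JP",
    ".jis.html": "ISO-2022-JP",
}


def content_type_for(path, default_charset="ISO-8859-1"):
    """Build a labelled Content-Type value for an HTML document."""
    for suffix, charset in CHARSET_BY_SUFFIX.items():
        if path.endswith(suffix):
            return f"text/html; charset={charset}"
    # No recognised suffix: fall back to the configured default.
    return f"text/html; charset={default_charset}"


print(content_type_for("/docs/index.euc.html"))
print(content_type_for("/docs/index.html"))
```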

If we can present a solution, I think most people here will be happy
to upgrade.