Re: (Fwd) Re: ISO/IEC 10646 as Document Character Set

Gavin Nicol (gtn@ebt.com)
Thu, 4 May 95 21:21:10 EDT

>Here is an example: we have support for an ISO-8859-X language
>using a localized browser, but this browser only runs on platforms
>A and B. We have external viewers for the language that can be
>invoked by correct setup in the .mailcap and .mime.types files on platforms C
>and D. Now how is a server to distinguish between these cases?

Why is it necessary to? The data can be displayed anyway (though it
might be nice to distinguish, I cannot see the absolute need).

>Finally on Point 3. I think 10646 is conceptually nice, however given
>the fact that we accept different "representations" such as 8859-X
>being actually sent to the browser via HTTP, then this becomes the "HTML
>document" that the client sees. Servers may pre-map the 10646 to 8859-X
>if only one non-Latin1 language is used and so basically the
>10646 is really not "visible" to the outside world. Thus we are back to
>HTML markup in US-ASCII and everything else as data.

Yes, that is the whole point. By using ISO 10646 we get:

1) A formal foundation for the treatment of multilingual data.
2) A unified numeric charset model, even in the face of arbitrary,
blind, encoding transformations.
3) A single SGML declaration.
4) A base for a transition to a truly multilingual WWW.

The document character set proposal doesn't buy us a lot at the moment,
except peace of mind, and a reasonable way to handle numeric character
references. What really matters is what it enables...
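
To make point 2 concrete, here is a rough sketch (mine, not part of
any proposal text) of how a client might treat numeric character
references once ISO 10646 is the document character set. The function
names are hypothetical, only decimal references are handled, and
Python's codec tables stand in for whatever mapping a real browser
would use. The point it shows: the reference &#946; always means code
point 946, whichever encoding carried the bytes.

```python
import re

def resolve_numeric_refs(text: str) -> str:
    """Replace decimal references such as &#946; with the character
    at that code point in the document character set (ISO 10646)."""
    return re.sub(r"&#([0-9]+);", lambda m: chr(int(m.group(1))), text)

def read_document(raw: bytes, transfer_charset: str) -> str:
    """Hypothetical client-side pipeline, in two steps."""
    # Step 1: map the bytes received over HTTP onto code points of the
    # document character set. The encoding matters only at this step.
    text = raw.decode(transfer_charset)
    # Step 2: resolve references against the document character set,
    # never against the encoding the bytes arrived in.
    return resolve_numeric_refs(text)

# The same document sent in two different encodings.
as_8859_7 = "α &#946;".encode("iso8859_7")
as_utf8 = "α &#946;".encode("utf-8")

# Both arrive at the same sequence of code points.
assert read_document(as_8859_7, "iso8859_7") == read_document(as_utf8, "utf-8")
```

A server that pre-maps 10646 to 8859-X changes only what step 1 has to
undo; step 2 is unaffected, which is why the transformation can be
"blind".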