Re[2]: Language hints in UNICODE private use area

pandries@alis.ca
Thu, 19 Jan 95 19:27:41 EST



>This discussion originated from my *real* proposal which is to have
>Unicode be the core character set that every browser should
>understand. This other issue is of less import, but without
>something, the Japanese will be reluctant to accept Unicode. Having
>Unicode be the common character set does *not* mean that iso8859-1
>could not be used. The Accept-Charset: parameter (which should
>appear in http 1.1) and the charset= parameter on the text/html mime
>type will provide ways of allowing character set negotiation.

The need for language tags in html goes beyond the Unicode CJK
rendering problem. I don't deny that it would be useful to have
presentational hints in Unicode to accelerate its acceptance in Japan.
However, this is a different problem from the one we are trying to
solve here. We need language tags that are independent of the charset
(and thus outside Unicode) to do any kind of useful processing on
multilingual data besides CJK rendering (spelling, hyphenating,
indexing, translating...). These tags should be available not only in
the new Unicode documents but also in all those other documents still
coded in ISO-8859-1.
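
To make the point concrete, here is a rough sketch of what I mean by a
language tag that lives beside the charset rather than inside it. All
the names here (Document, DICTIONARIES, unknown_words) are made up for
illustration, and the tiny word lists merely stand in for a real
spelling checker; nothing here is a proposed syntax.

```python
# Hypothetical sketch: the language label travels beside the text,
# not inside the character encoding.
from typing import List, NamedTuple


class Document(NamedTuple):
    body: bytes    # raw bytes as transferred
    charset: str   # e.g. taken from a charset= parameter
    language: str  # language tag, independent of the charset


# Illustrative word lists standing in for real per-language dictionaries.
DICTIONARIES = {
    "fr": {"été", "chaud"},
    "en": {"summer", "hot"},
}


def unknown_words(doc: Document) -> List[str]:
    """Decode using the charset, then choose the dictionary by language only."""
    text = doc.body.decode(doc.charset)
    known = DICTIONARIES.get(doc.language, set())
    return [word for word in text.lower().split() if word not in known]


# The same French text, once in ISO-8859-1 and once in a Unicode encoding.
latin1_doc = Document("été chaud".encode("iso-8859-1"), "iso-8859-1", "fr")
unicode_doc = Document("été chaud".encode("utf-16"), "utf-16", "fr")
```

Both documents get the same French treatment whatever their coding,
while the same bytes labelled "en" would be checked against the English
list instead. The charset only tells us how to decode; the language tag
alone selects the processing.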

>We are not discussing the interpretation of Unicode characters, but
>rather the transfer encoding of text/html and other textual data
>sent via http.

And therefore the definition of language tags should not be restricted
to Unicode.

Patrick Andries
Alis Technologies