Re: Charsets: Problem statement/requirements?

Luke 路客 (ylu@ccwf.cc.utexas.edu)
Mon, 13 Feb 95 14:07:06 EST

On Mon, 13 Feb 1995 yergeau@alis.ca wrote:

>Luke Y. Lu <ylu@mail.utexas.edu> writes:
>>On Fri, 10 Feb 1995 yergeau@alis.ca wrote:
>>>I think this is seriously wrong. Encodings and languages are pretty much
>>>orthogonal, with a single encoding being able to represent several
>>>languages and a single language being representable in a single encoding.
> ^
> I meant multiple, of course
>
>>Well, nothing prevents you from adding <lang lc="en" enc="whatever"> to give
>
>Which means using one tag for two orthogonal purposes. Bad idea, if you
>ask me.

You're right, they're technically orthogonal.
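
A quick way to see the orthogonality is the rough sketch below. It
assumes Python's standard codecs; the sample strings and variable names
are made up purely for illustration. It shows one encoding carrying two
languages, and one language carried by three encodings.

```python
# One encoding, several languages: ISO 8859-1 covers both French and German.
samples = {"fr": "déjà vu", "de": "Grüße"}
latin1 = {lang: text.encode("iso-8859-1") for lang, text in samples.items()}

# One language, several encodings: the same Japanese text, three encodings.
japanese = "日本語"
encodings = ["euc_jp", "shift_jis", "iso2022_jp"]
encoded = {enc: japanese.encode(enc) for enc in encodings}

# The byte sequences differ, yet each decodes back to the same text,
# so the language label alone does not tell you the encoding.
round_trips = {enc: data.decode(enc) for enc, data in encoded.items()}

for enc, data in encoded.items():
    print(enc, data.hex())
```

So a language label and an encoding label answer different questions,
and neither can be derived from the other in general.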

>>hints to dumb and dumber language robots.
>
>I'd like to see a non-dumb hyphenator, by your definition. My word
>processor doesn't have one, despite its being a rather sophisticated
>word processor. In fact, I'd like to see any language processor that is
>not dumb by your definition.

Not until we have a breakthrough in language processing and/or AI. ;-)

>>I agree. But a language tag is useless if you don't know its encoding scheme,
>
>Just like a heading tag is useless if you don't know the encoding of the
>heading. That's not a reason to bundle the two together.

I bundled them together because I was thinking of changing from one
language family to another where multiple charsets are necessary, which
more often than not implies a change in encoding. Thanks for the
correction.

__Luke