I think this is seriously wrong. Encodings and languages are pretty much
orthogonal, with a single encoding being able to represent several languages and
a single language being representable in several encodings.
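
To make the orthogonality concrete, here is a minimal Python sketch. The two
sample sentences and the list of encodings are my own illustrative choices,
nothing more; the point is only that one encoding carries both languages and
one language survives several encodings.

```python
# Illustration only: sample sentences and encoding names are assumptions
# chosen for the demonstration.
FRENCH = "Où est la bibliothèque ?"
GERMAN = "Die Straße ist schön."
ENCODINGS = ["iso-8859-1", "utf-8", "utf-16"]


def encodes_in(text, encoding):
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def round_trips(text, encodings):
    """Map each encoding to whether encode followed by decode gives text back."""
    return {enc: text.encode(enc).decode(enc) == text for enc in encodings}


# One encoding, several languages: ISO-8859-1 holds both sentences.
one_encoding_many_languages = all(
    encodes_in(t, "iso-8859-1") for t in (FRENCH, GERMAN)
)

# One language, several encodings: the French sentence survives all three,
# even though the byte sequences themselves differ.
one_language_many_encodings = round_trips(FRENCH, ENCODINGS)
distinct_byte_forms = len({FRENCH.encode(enc) for enc in ENCODINGS})
```

Knowing the encoding therefore says nothing about which of the two languages
the bytes are in, and knowing the language says nothing about the bytes.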
>I think it's not necessary to differentiate
>languages using the same encoding scheme (e.g. french and german).
There are many reasons why this is highly desirable: glyph disambiguation,
translation, hyphenation, indexing...
>One
>usage to differentiate particular languages is to facilitate automatic
>translation.
That's only one usage.
>But I think if a translator can't figure out which language
>by looking at the raw bytes of a known encoding scheme, it's pretty much
>useless.
A very debatable opinion. Automatic translation is already very much wanting;
adding a hard-to-satisfy requirement will only make it more costly and less
reliable. Should we also add this requirement to every hyphenator, every
indexer, every rendering engine? I don't think so. A language tag is a
language tag, not an encoding tag.
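
As a rough sketch of why the two tags have to travel separately, consider the
word "Gift", which is an English word and also a German one (meaning
"poison"). The TaggedText class and its field names below are hypothetical,
made up for this example and not taken from any existing specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedText:
    """Hypothetical container: raw bytes plus separate charset and language tags."""
    data: bytes
    charset: str  # how to turn the bytes into characters
    lang: str     # what language the characters are in

    def text(self):
        return self.data.decode(self.charset)


raw = "Gift".encode("iso-8859-1")

english = TaggedText(raw, "iso-8859-1", "en")
german = TaggedText(raw, "iso-8859-1", "de")

# Identical bytes and identical charset, so the decoded text is the same...
same_text = english.text() == german.text()
# ...yet the two items differ, and only the language tag says how.
differ_only_by_lang = english != german and english.data == german.data
```

A translator, hyphenator or indexer handed only the bytes and the charset has
nothing here to choose between "en" and "de".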
-- Francois Yergeau <yergeau@alis.ca> Alis Technologies Inc., Montreal +1 514 738-9171