Re: charset parameter (long)

Gavin Nicol (gtn@ebt.com)
Mon, 16 Jan 95 18:04:56 EST

Dave Raggett writes:

>Another idea is to use the internal character code as an index into
>data about the character, e.g. its directionality, character set
>and code in that set, its intended language etc. To SGML the character
>is just another code, to the display routines its an index into this
>extra info.

And you have to manage such tables for all possible character sets.
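
To make the cost concrete, here is a minimal sketch of the kind of
table being described. The field names, the internal codes 0 and 1,
and the two entries are illustrative assumptions, not anything
proposed in this thread; a real table would need an entry for every
character of every supported set.

```python
from typing import NamedTuple


class CharInfo(NamedTuple):
    """Per-character data reached through the internal character code."""
    direction: str  # "L" = left-to-right, "R" = right-to-left
    charset: str    # name of the character set the character came from
    code: int       # code of the character within that set
    language: str   # intended language (hypothetical tag)


# Hypothetical table: internal code -> extra info for the display routines.
CHAR_TABLE = {
    0: CharInfo("L", "ISO-8859-1", 0x41, "en"),  # Latin capital letter A
    1: CharInfo("R", "ISO-8859-8", 0xE0, "he"),  # Hebrew letter alef
}


def lookup(internal_code: int) -> CharInfo:
    """Return the display data for an internal character code."""
    return CHAR_TABLE[internal_code]
```

The parser only ever sees the integer keys; everything else has to be
built and maintained by hand for each character set.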

>> But there are practical considerations: how does an author put one of
>> these "direction change" characters into a document? I suppose the
>> issues are already addressed in existing multilingual composition
>> interfaces, and we just need to find a reasonable representation of
>> the idioms.
>
>thats what I was thinking ...

This is easily solved. Most current multilingual systems are based on
either ISO-2022 or Unicode, and one never specifically says
anything about direction at all: it is contained within the data
itself. ISO-2022 probably doesn't require any additional "hints" at
all, and in Unicode, I can only really think of glyph disambiguation
as a reason for additional "hints".
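
As an illustration of direction being carried by the data itself, the
sketch below uses Python's standard unicodedata module to read the
bidirectional class of each character. The helper name directions()
is made up for this example; it is one possible way to show the idea,
not a description of how any particular system does it.

```python
import unicodedata


def directions(text: str) -> list[str]:
    """Return the Unicode bidirectional class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


# Latin "a" followed by Hebrew alef (U+05D0): no markup is needed to
# tell them apart, the direction comes from the characters themselves.
mixed = "a\u05d0"
classes = directions(mixed)
```

Here the Latin letter reports class "L" and the Hebrew letter reports
class "R", with no direction-change character in the text.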

I should note that these "hints" are not really required: the text is
still legible, and in many cases, people don't even notice anything
wrong (i.e. the differences are often within the boundaries of normal
handwriting variation).

ISO-2022 is not what we want though. It is a bandaid on a festering
wound.