>Another idea is to use the internal character code as an index into
>data about the character, e.g. its directionality, character set
>and code in that set, its intended language etc. To SGML the character
>is just another code; to the display routines it's an index into this
>extra info.
And you have to manage such tables for all possible character sets.
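
To make the cost concrete, here is a minimal sketch of such a table. The
internal codes 0 and 1, the CharInfo fields, and the names CHAR_TABLE and
direction_of are all hypothetical, chosen only for illustration; a real
table would need an entry for every character of every supported set.

```python
from typing import NamedTuple


class CharInfo(NamedTuple):
    direction: str  # "L" = left-to-right, "R" = right-to-left
    charset: str    # character set the character was drawn from
    code: int       # code of the character within that set
    language: str   # intended-language hint


# Hypothetical table: internal character code -> data about the character.
CHAR_TABLE = {
    0: CharInfo("L", "ISO-8859-1", 0x41, "en"),  # Latin capital letter A
    1: CharInfo("R", "ISO-8859-8", 0xE0, "he"),  # Hebrew letter alef
}


def direction_of(internal_code: int) -> str:
    """What a display routine would look up; SGML only sees the code."""
    return CHAR_TABLE[internal_code].direction
```

Every character set you add means more rows to build and keep in sync.
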
>> But there are practical considerations: how does an author put one of
>> these "direction change" characters into a document? I suppose the
>> issues are already addressed in existing multilingual composition
>> interfaces, and we just need to find a reasonable representation of
>> the idioms.
>
>that's what I was thinking ...
This is easily solved. Most current multilingual systems are based on
either ISO-2022 or Unicode, and one never specifically says
anything about direction at all: it is contained within the data
itself. ISO-2022 probably doesn't require any additional "hints" at
all, and in Unicode, I can only really think of glyph disambiguation
as a reason for additional "hints".
I should note that these "hints" are not really required: the text is
still legible, and in many cases, people don't even notice something
wrong (i.e., the differences are often within the boundaries of normal
handwriting variances).
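
As a rough illustration of direction being carried by the data itself,
the sketch below reads the bidirectional class of each character from
the Unicode character database via Python's standard unicodedata
module. The helper name bidi_classes and the sample string are my own,
not part of any standard.

```python
import unicodedata


def bidi_classes(text):
    """Return the Unicode bidirectional class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


# Latin small a, Hebrew alef, Arabic alef: no explicit direction markup.
sample = "a\u05d0\u0627"
print(bidi_classes(sample))
```

The author typed no "direction change" character; the display side can
work out direction from the character properties alone.
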
ISO-2022 is not what we want though. It is a bandaid on a festering
wound.