Re: Revised language on: ISO/IEC 10646 as Document Character Set

Larry Masinter (masinter@parc.xerox.com)
Thu, 11 May 95 11:26:50 EDT

Gavin asked:
>> How many character sets in the world contain characters not in
>> ISO 10646?

And I answered:
>How many ever there were, there's now one more, because I just made
>one up. There's no way to stop anyone else from making one up, either.

And Gavin replied:
> Yes, I am quite aware of the lack of extensibility. For the moment at
> least, I am not aware of any widely used character sets that contain
> characters not in ISO 10646.

Companies frequently make up fonts with 'private' codes in them.
Xerox has a font with the Xerox logo, the Xerox private data stamp,
etc. The mapping from code to character in private fonts could well be
handled by a 'special' character in the system character set that
wouldn't be part of the document character set.

Similarly, mathematicians and physicists frequently make up special
symbols. Assigning these codes for a 'special' system character set
would actually make sense. I'd want the special symbol to appear
larger in a <h1> than in the body, etc.

If I were to encode a document in Unicode/10646 which used such codes,
I'd probably map the special characters into the 'private use' space,
but it wouldn't really _be_ 10646.
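
To make the private-use mapping concrete, here is a rough sketch in Python.
It assumes the Basic Multilingual Plane private use range U+E000..U+F8FF;
the symbol names and the helper functions are hypothetical, made up for
illustration only, and are not part of any standard or product.

```python
import unicodedata

# BMP Private Use Area as defined by Unicode / ISO 10646.
PUA_START = 0xE000
PUA_END = 0xF8FF


def build_private_map(symbol_names):
    """Assign each (hypothetical) private symbol name a PUA code point."""
    if len(symbol_names) > PUA_END - PUA_START + 1:
        raise ValueError("too many private symbols for the BMP Private Use Area")
    return {name: chr(PUA_START + i) for i, name in enumerate(symbol_names)}


def is_private_use(ch):
    """True if the character's general category is Co (private use)."""
    return unicodedata.category(ch) == "Co"


# Hypothetical names for the private symbols mentioned above.
private_map = build_private_map(["xerox-logo", "xerox-private-data-stamp"])
text = "Confidential " + private_map["xerox-private-data-stamp"]

# Code points in the text that only mean something by private agreement.
private_chars = [hex(ord(ch)) for ch in text if is_private_use(ch)]
```

The text is still a legal sequence of 10646 code points, but the meaning of
the private ones lives only in the table that sender and receiver happen to
share, which is why such a document isn't really 10646 in any useful sense.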

Also, as far as I know, neither Klingon (widely used in the Klingon
empire) nor Tolkien runic are in Unicode or ISO 10646.

In any case, I see no reason for requiring a restriction on the
charset designation or the system character set when we assert that
the document character set is 10646; we don't need to list _all_ of
the limitations of current browser technology when we define HTML 2.0.