Re: ISO/IEC 10646 as Document Character Set

Gavin Nicol (gtn@ebt.com)
Mon, 1 May 95 13:34:22 EDT

>>it). Given this, it seems that almost anything is allowed, but certain
>>things are obviously undesirable (like assuming numeric character
>>references *will* be mapped using the system character set).
>>
>The latter assumption is not only undesirable, it is incorrect.

Perhaps, though it depends on whether you make the assumption about
the way the numeric character reference is resolved, or the way the
system resolves ISO 10646 characters to their internal
representation. In the latter case it's system dependent; kind of like
"Mosaic or Netscape" compatible.

>Translation between a document charset and a system charset is not
>the same as a translation between two distinct document charsets. The
>system charset is merely the charset used in the representation of a
>document (or an entity). This has nothing to do with the document charset
>used by the document (or entity).

Yes, I know.

>A numeric charref only has to be translated when translating between
>different document charsets. This has nothing to do with the
>representation of a document (or entity) using a system charset.

OK. Perhaps what I said was poorly phrased (though I believe I used
the SGML terminology correctly). Regardless, if I have a system
character set of ISO 8859-1 and a numeric character reference of
&#9312;, the result is undefined by the SGML standard and is
therefore system dependent.
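
To make that last case concrete, here is a minimal Python sketch. It
assumes the document character set is ISO 10646, so that the number in
a numeric character reference is a 10646 code point (Python's chr()
uses the same numbering). The function name resolve_charref and the
choice to return None are mine, purely for illustration; a real system
is free to do something else entirely, which is the point.

```python
def resolve_charref(code_point, system_charset="latin-1"):
    """Try to represent a numeric character reference in the system charset.

    Returns the encoded bytes, or None when the system character set has
    no representation for that code point. Returning None is only one of
    many possible behaviours and is an assumption of this sketch.
    """
    char = chr(code_point)  # code point taken as an ISO 10646 / Unicode number
    try:
        return char.encode(system_charset)
    except UnicodeEncodeError:
        # ISO 8859-1 only covers code points 0-255, so anything above
        # that has no representation in the system character set.
        return None


print(resolve_charref(233))   # representable in ISO 8859-1
print(resolve_charref(9312))  # circled digit one: not representable
```

The first call yields a single ISO 8859-1 byte; the second has nothing
to map to, and what happens then (drop it, substitute, complain) is up
to the system.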