Re: ISO/IEC 10646 as Document Character Set

Glenn Adams (glenn@stonehand.com)
Mon, 1 May 95 13:03:15 EDT

Date: Mon, 1 May 1995 12:21:40 -0400
From: Gavin Nicol <gtn@ebt.com>

Yes, referencing the system character set is incorrect. However, there
is an implicit conversion from the document character set to the
system character set (if they are not the same), and this conversion
is *not* defined (or at least, I cannot remember anything that defines
it). Given this, it seems that almost anything is allowed, but certain
things are obviously undesirable (like assuming numeric character
references *will* be mapped using the system character set).

The latter assumption is not only undesirable, it is incorrect.

Translation between a document charset and a system charset is not
the same as a translation between two distinct document charsets. The
system charset is merely the charset used in the representation of a
document (or an entity). This has nothing to do with the document charset
used by the document (or entity).

A numeric charref only has to be translated when translating between
different document charsets. This has nothing to do with the
representation of a document (or entity) using a system charset.
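
To make the distinction concrete, here is a rough sketch in Python. It is
only an illustration under stated assumptions: the document charset is
taken to be ISO/IEC 10646, the helper names (resolve_charrefs, read_entity,
wrong_resolve) are hypothetical, and only decimal references are handled.
The system charset is modeled as the byte encoding used to store the entity.

```python
import re

# Assumption: the document character set is ISO/IEC 10646, so a numeric
# character reference &#N; names code position N in that set.
CHARREF = re.compile(r"&#([0-9]+);")


def resolve_charrefs(text: str) -> str:
    """Resolve decimal numeric charrefs against the document charset."""
    return CHARREF.sub(lambda m: chr(int(m.group(1))), text)


def read_entity(raw: bytes, system_charset: str) -> str:
    """Decode the stored representation, then resolve charrefs.

    The system charset affects only the first step (bytes to characters).
    The charref numbers are never looked up in the system charset.
    """
    return resolve_charrefs(raw.decode(system_charset))


def wrong_resolve(number: int, system_charset: str) -> str:
    """The incorrect reading: treat the charref number as a byte value
    in the system charset. Shown only for contrast."""
    return bytes([number]).decode(system_charset)


# The same document text, stored under three different system charsets.
source = "caf\u00e9 caf&#233;"
results = {
    name: read_entity(source.encode(name), name)
    for name in ("latin-1", "utf-8", "cp437")
}
```

In this sketch every entry in results is the same string, because &#233;
is resolved against the document charset no matter how the entity was
stored. By contrast, wrong_resolve(233, "cp437") does not yield the same
character as wrong_resolve(233, "latin-1"), which is exactly the
inconsistency the incorrect assumption introduces.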

Glenn