Re: ISO/IEC 10646 as Document Character Set

Gavin Nicol (gtn@ebt.com)
Mon, 1 May 95 12:21:20 EDT

> Well, as the mapping from document character set to system character
> set is not specified in SGML, and hence, is system dependent, I think
> most current practices, even numeric character references, are legal
> with a document character set of ISO 10646.
>
>No, numeric character references *are* broken if they are used in
>reference to the system character set rather than the document character
>set. The standard is quite clear on this.

Yes, referencing the system character set is incorrect. However, when
the two differ there is an implicit conversion from the document
character set to the system character set, and SGML does *not* define
that conversion (or at least, I cannot remember anything that defines
it). Given this, almost any mapping seems to be allowed, though some
things are obviously undesirable (like assuming numeric character
references *will* be resolved against the system character set).
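
To make the distinction concrete, here is a rough sketch in Python. It is
only an illustration under assumptions of mine: the function names are
made up, the system character set is taken to be code page 437, and the
conversion step simply uses Python's codec, since SGML does not say how
that step should be done. The reference is first resolved as an ISO 10646
code point, and only then converted for the system.

```python
import re

# Matches decimal numeric character references such as &#233;
NCR = re.compile(r"&#([0-9]+);")

def resolve_in_document_charset(text):
    """Resolve &#N; as code point N of the document character set (ISO 10646)."""
    return NCR.sub(lambda m: chr(int(m.group(1))), text)

def to_system_charset(text, system_encoding):
    """System-dependent step (an assumption here): convert via a Python codec."""
    return text.encode(system_encoding, errors="replace")

def resolve_in_system_charset(text, system_encoding):
    """The undesirable reading: treat N as a byte value in the system character set.

    Only works for N < 256, which is itself a sign that this reading is wrong.
    """
    return NCR.sub(
        lambda m: bytes([int(m.group(1))]).decode(system_encoding), text
    )

doc = "caf&#233;"
correct = resolve_in_document_charset(doc)          # e-acute, U+00E9
system_bytes = to_system_charset(correct, "cp437")  # e-acute is byte 0x82 in cp437
wrong = resolve_in_system_charset(doc, "cp437")     # byte 233 in cp437 is U+0398
```

With these assumptions the first reading yields "café" on any system, while
the second yields a Greek capital theta in place of the e-acute, so the
result depends on whichever system happens to read the document.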

Page 452 of Goldfarb has a few notes regarding this.