Re: HTML Character Representation/Transmission Model

Gavin Nicol (gtn@ebt.com)
Tue, 11 Apr 95 15:11:45 EDT

> It should be noted that if the system character set is ISO-8859-1,
> then numeric character references like &#1213; can be resolved
> in a system dependent manner, because SGML does not define any
> behaviour here.
>
>I'm not sure what you're saying here. A numeric character reference
>is not resolved in relationship to the system character set but in
>reference to the document character set. These two may differ, i.e.,
>one might have a system character set (what I earlier called the client's
>storage object character set SCS') of Windows Code Page 1252 and
>a document character set of ISO/IEC 10646.
>
>This implies that, in general, a numeric character reference cannot
>be resolved in a system dependent manner. Perhaps you mean that the
>internal representation for the reference's replacement character is
>system dependent? In that case, I agree.

Yes, this is what I was trying to say. If the system character set
(internal representation) does not contain all the characters of the
document (coded) character set, then the behaviour is undefined in
SGML (or at least as far as I can tell from reading the standard. I seem
to remember a specific passage in Goldfarb which says this, but it's
not exactly something one can commit to memory and remain sane ;-)).
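
To make the distinction concrete, here is a minimal sketch in Python. It
assumes the document character set is ISO/IEC 10646 (approximated by
Python's Unicode code points) and treats a codec name such as "cp1252"
or "latin-1" as the system character set. The names resolve_reference
and representable are hypothetical, chosen only for illustration; what a
real parser does in the unrepresentable case is exactly the part left
undefined.

```python
import re

# Matches a decimal numeric character reference such as "&#233;".
NUMERIC_REF = re.compile(r"&#([0-9]+);")


def resolve_reference(ref: str) -> str:
    """Resolve a reference against the document character set.

    Assumption: the document character set is ISO/IEC 10646, so the
    number is taken directly as a code point. The system character set
    plays no part in this step.
    """
    match = NUMERIC_REF.fullmatch(ref)
    if match is None:
        raise ValueError(f"not a numeric character reference: {ref!r}")
    return chr(int(match.group(1)))


def representable(char: str, system_charset: str) -> bool:
    """Report whether the system character set can hold the character."""
    try:
        char.encode(system_charset)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    for ref in ("&#233;", "&#1213;"):
        char = resolve_reference(ref)
        for charset in ("latin-1", "cp1252"):
            status = "ok" if representable(char, charset) else "not representable"
            print(f"{ref} -> U+{ord(char):04X} in {charset}: {status}")
```

Resolution always succeeds here because it depends only on the document
character set; the second step is where a system limited to ISO-8859-1
or Code Page 1252 has to pick its own fallback.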