Re: ISO/IEC 10646 as Document Character Set

James Clark (jjc@jclark.com)
Sat, 6 May 95 11:31:38 EDT

> Date: Fri, 5 May 95 17:02:30 EDT
> From: connolly@w3.org (Dan Connolly)
>
> Section 9.5, "Character Reference" says that a numeric character
> reference should be treated just like the character it references. But
> if the number isn't in the domain of the document character set, what
> character does the reference refer to? I'd say this is a reportable
> markup error.

The note in clause 9.2 says:

    A non-SGML character can be entered as a data character within an
    SGML entity by using a character reference.

So the number can be one that was described as UNUSED in the document
character set.

It is an interesting question what restrictions there are on the
character number. 13.1.1 says:

    The described character set portions must collectively describe
    each character number in the described character set once
    and only once.

Given this, and given that the number in a character reference is a
"character number", I think one could argue that the number must be one
that was described (even if only as UNUSED) in the document character
set section of the SGML declaration.
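
To make that reading concrete, here is a minimal sketch in Python. It is
only an illustration under assumptions: the DESCSET table, the tuple
layout, and the function names are hypothetical, not taken from the
standard or from any parser. Each portion is (first number, count, base
set number), with None standing for UNUSED.

```python
from typing import List, Optional, Tuple

# (first described character number, count, base set number or None for UNUSED)
Portion = Tuple[int, int, Optional[int]]

# Illustrative description of a 7-bit document character set.
# Assumption: this table is made up for the example.
DESCSET: List[Portion] = [
    (0, 9, None),
    (9, 2, 9),
    (11, 2, None),
    (13, 1, 13),
    (14, 18, None),
    (32, 95, 32),
    (127, 1, None),
]


def described_at_most_once(portions: List[Portion]) -> bool:
    """Return True if no character number is described by two portions.

    This covers only the "and only once" half of the 13.1.1 wording;
    checking "once" for every number would need the size of the set.
    """
    seen = set()
    for start, count, _base in portions:
        for number in range(start, start + count):
            if number in seen:
                return False
            seen.add(number)
    return True


def classify(number: int, portions: List[Portion]) -> str:
    """Classify the number of a numeric character reference.

    "used"        - described and mapped to a base set character
    "UNUSED"      - described, but as UNUSED (a non-SGML character)
    "undescribed" - not described at all; under the reading argued
                    above, a reference to it would be an error
    """
    for start, count, base in portions:
        if start <= number < start + count:
            return "UNUSED" if base is None else "used"
    return "undescribed"
```

With this table, classify(65, DESCSET) falls in a used portion,
classify(11, DESCSET) falls in an UNUSED portion (allowed by the 9.2
note), and classify(200, DESCSET) is not described at all.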

James