I'm willing to expand the section on "Undeclared markup error
handling" to talk about numeric character references, but I'm not sure
it's worth it.
My concern here relates to the behavior I described earlier re: Netscape
and Mosaic, where they were truncating a numeric character reference;
i.e., they were effectively doing:
    char ch = (char) atoi(numCharBuf);
I believe this behavior should be discouraged; they should be doing
something like the following on a Latin1 platform:
    int ch = atoi(numCharBuf);
    if (ch < 0 || ch > 255 || !isprint(ch))
        ch = <your favorite substitution character code that produces a box>;
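
To make the check above concrete, here is a minimal sketch in C of one way
it could be written. The names decode_num_char_ref and SUBST_CHAR are made
up for illustration, and '?' merely stands in for whatever code produces a
box on a given platform. It uses strtol rather than atoi so that overflow
and trailing junk can be detected, and it tests the Latin-1 control ranges
explicitly instead of relying on isprint(), whose answer depends on the
current locale. Treating every control code as unprintable is an assumption
carried over from the isprint() test, not something the spec dictates.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical substitution code: '?' stands in for a glyph that
   renders as a box on the target platform. */
#define SUBST_CHAR 0x3F

/* Decode the digit string of a numeric character reference (the "233"
   in "&#233;") for a Latin-1 display.  Returns a printable Latin-1 code,
   or SUBST_CHAR if the value cannot be shown. */
int decode_num_char_ref(const char *numCharBuf)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(numCharBuf, &end, 10);

    /* Reject empty input, trailing junk, and overflow. */
    if (end == numCharBuf || *end != '\0' || errno == ERANGE)
        return SUBST_CHAR;

    /* Reject anything outside Latin-1 instead of truncating it. */
    if (value < 0 || value > 255)
        return SUBST_CHAR;

    /* Reject the control ranges 0-31 and 127-159, which have no glyph. */
    if (value < 32 || (value >= 127 && value <= 159))
        return SUBST_CHAR;

    return (int) value;
}
```

With the truncating cast, a reference such as &#321; would silently come
out as 65 ('A'), since 321 = 256 + 65; with a check like the one sketched
here it comes out as the substitution character instead.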
There are no SDATA entity references in HTML 2.0.
Ah, I see you've changed from the official ISOLat1 set to a new set that
only uses CDATA. Clever.
Glenn