Re: HTML 2.0 LAST CALL: Numeric character refs

Glenn Adams (glenn@stonehand.com)
Sat, 3 Jun 95 14:59:40 EDT

Date: Sat, 3 Jun 95 14:17:01 EDT
From: "Daniel W. Connolly" <connolly@beach.w3.org>

>Which ones? Both Mosaic and Netscape display &#54321; as "1".
>This is a *new feature,* not current practice.

Oops. After all this hooey, it turns out to be a simple mistake
on my part: I thought I had done some testing (or maybe just
read somewhere) that showed current browsers showed &#54321;.

As I said in my message of 8 May 95 (excerpted below), current
practice (among Netscape & Mosaic) is to do the following:

char ch = (char) atoi ( numCharBuf );

Thus, you got a '1' displayed in your test since:

54321 % 256 == 49 == '1'.
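
To make the arithmetic concrete, here is a minimal sketch of that truncating cast. The function name truncated_char_ref is hypothetical, and unsigned char is my substitution so the result does not depend on whether plain char is signed; it also assumes an 8-bit char and an int wide enough to hold 54321. It is not taken from either browser's source.

```c
#include <stdlib.h>

/* Hypothetical helper: mimics the truncating cast described above.
 * unsigned char is used (an assumption, not the browsers' code) so the
 * result is well defined: the value is reduced modulo 256 when char
 * is 8 bits wide. */
unsigned char truncated_char_ref(const char *num_char_buf)
{
    return (unsigned char) atoi(num_char_buf);
}
```

With this, truncated_char_ref("54321") yields 49, the ASCII code for '1', which matches what the test displayed.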

I happen to think current practice is quite broken in this regard
and should be discouraged.

Regards,
Glenn

----------

From: Glenn Adams <glenn>
Date: Mon, 8 May 95 16:15:03 -0400

Date: Mon, 8 May 1995 15:43:50 +0500
From: connolly@w3.org (Dan Connolly)

I'm willing to expand the section on "Undeclared markup error
handling" to talk about numeric character references, but I'm not sure
it's worth it.

My concern here relates to the behavior I described earlier re: Netscape
and Mosaic, where they were truncating a numeric character reference;
i.e., they were effectively doing:

char ch = (char) atoi ( numCharBuf );

I believe this behavior should be discouraged; they should be doing
something like the following on a Latin1 platform:

int ch = atoi ( numCharBuf );
if ( ch > 255 || ! isprint(ch) )
    ch = <your favorite substitution character code that produces a box>;
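
A compilable sketch of the same check might look like the following. The names checked_char_ref and SUBSTITUTE_CHAR are hypothetical, '?' merely stands in for whatever box glyph a platform provides, and strtol replaces atoi (my substitution) so that an overlong digit string cannot overflow. Note that isprint depends on the current locale: in the default "C" locale, codes above 127 are typically reported as non-printing, so a Latin1 platform would need a suitable locale set first.

```c
#include <ctype.h>
#include <stdlib.h>

/* Assumption: '?' stands in for a platform-specific box glyph. */
#define SUBSTITUTE_CHAR '?'

/* Hypothetical helper: returns the character code for a numeric
 * character reference, or SUBSTITUTE_CHAR if the value falls outside
 * 0..255 or is not printable in the current locale. */
int checked_char_ref(const char *num_char_buf)
{
    /* strtol clamps to LONG_MAX on overflow, which the range check rejects. */
    long value = strtol(num_char_buf, NULL, 10);

    /* Range check comes first so isprint only sees values it is defined for. */
    if (value < 0 || value > 255 || !isprint((int) value))
        return SUBSTITUTE_CHAR;
    return (int) value;
}
```

Under these assumptions, checked_char_ref("54321") returns the substitution character rather than silently becoming '1', while checked_char_ref("65") still returns 'A'.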