Re: format nego in HTML/10646?

Dan Connolly (connolly@w3.org)
Mon, 8 May 95 17:33:22 EDT

Next message: Roger Price: "How can we prohibit..."
Previous message: Erik van der Poel: "Re: Revised language on: ISO/IEC 10646 as Document Character Set"
Maybe in reply to: Terry Allen: "format nego in HTML/10646?"

Glenn Adams writes:
>
> My concern here relates to the behavior I described earlier re: Netscape
> and Mosaic where they were truncating a numeric character reference;
> i.e., they were effectively doing:
>
> char ch = (char) atoi ( numCharBuf );

This is conforming behaviour, unless the document is somehow
labelled as having a document character set with code positions
above 255.
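
To make the truncation concrete, here is a minimal sketch of what the quoted cast does to a reference such as 345. It assumes 8-bit chars and uses unsigned char so that the wraparound is well defined; the helper name truncate_char_ref is hypothetical and not taken from any browser source.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper mirroring the quoted cast.
 * Assumption: CHAR_BIT == 8, so the value is reduced modulo 256. */
unsigned char truncate_char_ref(const char *num_char_buf)
{
    assert(num_char_buf != NULL);
    /* Values above 255 silently wrap: 345 becomes 345 - 256 = 89. */
    return (unsigned char) atoi(num_char_buf);
}
```

Under those assumptions a reference to code position 345 silently comes out as code 89 (the letter Y in Latin-1), with nothing to tell the author that anything went wrong.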

> I believe this behavior should be discouraged; they should be doing
> something like the following on a Latin1 platform:
>
> int ch = atoi ( numCharBuf );
> if ( ch > 255 || ! isprint(ch) )
>     ch = <your favorite substitution character code that produces a box>;

We can't go back in time and change the installed base, so this
is largely a moot point. But I don't want to encourage the "it
works in mosaic" syndrome. Those browsers _should_ report an
error:

Bad numeric character reference `&#345;' at line 27.

or some such, so that the author has incentive to fix the broken document.
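
A minimal sketch of such a check in C, assuming a document character set limited to code positions 0 to 255. The function name parse_numeric_char_ref, its parameters, and the -1 error return are all hypothetical, chosen for illustration only.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: `digits` holds the text between "&#" and ";".
 * Returns the code position (0..255), or -1 after reporting an error.
 * Assumption: the document character set has no positions above 255.
 * Pass err == NULL to suppress the message. */
int parse_numeric_char_ref(const char *digits, int line, FILE *err)
{
    char *end = NULL;
    long value = -1;
    int ok;

    assert(digits != NULL);

    /* Require a leading digit so strtol cannot accept a sign or spaces. */
    ok = isdigit((unsigned char) digits[0]) != 0;
    if (ok) {
        errno = 0;
        value = strtol(digits, &end, 10);
        /* Reject trailing junk, overflow, and out-of-range values. */
        ok = (*end == '\0' && errno != ERANGE && value <= 255);
    }
    if (!ok) {
        if (err != NULL)
            fprintf(err,
                    "Bad numeric character reference `&#%s;' at line %d.\n",
                    digits, line);
        return -1;
    }
    return (int) value;
}
```

Called as parse_numeric_char_ref("345", 27, stderr), this reports the reference as bad and returns -1 rather than truncating it, while "233" is returned unchanged as 233.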

An HTML 2.0 user agent must be 8-bit clean, and that's it. Support for
"wide characters" and such is essentially out of scope for 2.0.

Let's get busy on the I18N document!

Dan
