Re: ISO/IEC 10646 as Document Character Set

Gavin Nicol (gtn@ebt.com)
Sat, 29 Apr 95 04:25:20 EDT

>Just to clarify things in my mind, would the following be allowed
>in your world? HTTP headers followed by HTML document:
>
> HTTP/1.0 200 OK
> Date: Saturday, 29-Apr-95 03:53:33 GMT
> Server: ...
> MIME-version: 1.0
> Content-Type: text/html; charset=iso-2022-jp
> Last-modified: Tuesday, 18-Apr-95 16:10:13 GMT
> Content-length: 15132
>
> <TITLE>...</TITLE>
> <BODY>
> Here is some normal text.
> Here is a 10646 numerical entity: &#23598732;.
> Here is some ISO-2022-JP text: ...
> </BODY>
>
>I.e. is the charset allowed to be iso-2022-jp (or any other non-Latin-1
>and non-10646/Unicode charset), and are you still allowed to use 10646
>numeric entities within such documents?

Quite legal.

>If this is allowed, I agree that this would be a good way to migrate
>to the Brave New World of 10646.

This is one reason for using ISO 10646 as the document character set:
all numeric character references are resolved against ISO 10646,
whatever charset the document is transmitted in (numeric character
reference unification).
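
As a rough illustration of that unification, here is a minimal Python
sketch. The function names are hypothetical, and it assumes Python's
Unicode code points can stand in for ISO 10646 values. The charset is
used only to decode the bytes; every &#N; reference is then resolved
against the same code space. References outside the range Python can
represent (such as the &#23598732; in the example above) are left
untouched rather than guessed at.

```python
import re

# Decimal numeric character references such as "&#233;".
_NUMERIC_REF = re.compile(r"&#([0-9]+);")

# Assumption: Python's Unicode range stands in for ISO 10646 here.
_MAX_CODE_POINT = 0x10FFFF


def resolve_numeric_refs(text):
    """Replace each &#N; with the character whose code point is N."""
    def repl(match):
        code = int(match.group(1))
        if code <= _MAX_CODE_POINT:
            return chr(code)
        # Out of range for this sketch: keep the reference as written.
        return match.group(0)
    return _NUMERIC_REF.sub(repl, text)


def decode_document(raw, charset):
    """Decode the body with the declared charset, then resolve references.

    The charset affects only the byte-to-character step; the numeric
    references are interpreted the same way for every charset.
    """
    return resolve_numeric_refs(raw.decode(charset))
```

For instance, decode_document(body, "iso-2022-jp") and
decode_document(body, "iso-8859-1") would both turn &#233; into the
same character, even though the surrounding bytes are decoded
differently.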

However, for systems that do *not* support ISO 10646 as the system
character set, it is legal to map such values to something else,
though this behaviour should be regarded as "undesirable". Thus,
current browsers are still legal, though they should be modified in
the future.
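
One way to picture the "map to something else" case is the sketch
below. It is only an assumption about how a system limited to a
single-byte charset might behave, not a description of any actual
browser: characters the system charset cannot represent are replaced
with "?". The function name and the choice of replacement are
hypothetical.

```python
def render_for_system(text, system_charset):
    """Map already-resolved text onto a limited system character set.

    Characters with no representation in system_charset are replaced
    with "?", which is the legal but undesirable fallback.
    """
    # errors="replace" substitutes "?" for unencodable characters.
    return text.encode(system_charset, errors="replace")
```

With a Latin-1 system, render_for_system("caf\u00e9 \u65e5", "latin-1")
keeps the e-acute but degrades the kanji to a question mark.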