Re: Revised language on: ISO/IEC 10646 -- another proposal

Albert Lunde (Albert-Lunde@nwu.edu)
Fri, 12 May 95 23:56:01 EDT

(Bert's conclusions that you quoted don't totally follow from
what I think I said, but this may be another issue.)

> Bert raises a good point.
> Supporting a full set of ISO 10646 NCRs for the various "charset" encodings
> will require many large tables:
> ISO-10646 to SJIS (and vice versa)
> ISO-10646 to JIS (and vice versa)
> ISO-10646 to EUC-JP (and vice versa)
> ISO-10646 to GB (and vice versa)
> ISO-10646 to HZ (and vice versa)
> ISO-10646 to Big5 (and vice versa)
> ISO-10646 to CNS (and vice versa)
> ISO-10646 to KSC7 (and vice versa)
> ISO-10646 to KSC8 (and vice versa)
> ISO-10646 to ISO-8859-2 (and vice versa)
> ...
> ISO-10646 to ISO-8859-10 (and vice versa)
> ISO-10646 to KOI8 (and vice versa)
> ISO-10646 to MacRoman (and vice versa)
> ISO-10646 to MacCentralEuropean (and vice versa)
> etc.
>
> This is OK, as long as supporting NCRs > 255 is NOT required and FULL
> conformance is attained by supporting ISO 8859-1 NCRs. (But if a browser
> supports NCRs > 255, then they map to ISO 10646.)
>
> Otherwise we are requiring a lot of additional resources to support a
> feature that most people on this list have been saying will be rarely used.

I'm not an SGML expert, but I think support for numeric character
references is a matter of conformance with SGML, not HTML.

I'm not sure what it would imply to say that your document character set
is ISO 8859-1 and then try to use data characters outside it: does this
mean you can avoid supporting numeric references? (Would SGML mavens
care to comment?)
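
To make the question concrete, here is a minimal sketch of the behaviour
described in the quoted text: a numeric character reference is read as an
ISO 10646 code position, and a browser that only supports values up to 255
leaves larger ones unresolved. The function name resolve_ncrs and the
supported_max parameter are hypothetical, and this is not a claim about
what SGML or HTML conformance actually requires.

```python
import re

# Matches decimal numeric character references such as &#233;
NCR_PATTERN = re.compile(r"&#([0-9]+);")
MAX_CODE_POINT = 0x10FFFF


def resolve_ncrs(text, supported_max=255):
    """Replace decimal NCRs with the character at that code position.

    Assumption: the number is a code position in the document character
    set (ISO 10646 here), independent of the "charset" used on the wire.
    References above supported_max are left untouched.
    """
    limit = min(supported_max, MAX_CODE_POINT)

    def replace(match):
        code = int(match.group(1))
        if code > limit:
            return match.group(0)  # unsupported: keep the reference as written
        return chr(code)

    return NCR_PATTERN.sub(replace, text)
```

With the default limit, "caf&#233;" resolves to "café" while "&#1046;" stays
as written; raising supported_max lets the second one resolve to the
Cyrillic letter at position 1046.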

With respect to the mappings involved, if you plan to support all
the character sets listed and _not_ use ISO-10646, you are going to need
a _lot_ of fonts somewhere, though I suppose the inverse mappings
(from ISO-10646 back to each charset) might be extra overhead.
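
As a rough illustration of the mapping cost, the sketch below routes every
conversion through ISO 10646 as a pivot, so each charset needs one table in
each direction rather than one table per pair of charsets. It assumes
Python's built-in codecs stand in for the tables; the function names are
hypothetical and the counts are simple arithmetic, not measurements.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert bytes between two charsets via ISO 10646 code points.

    Python's str holds the pivot form; decode() applies the
    charset-to-10646 table and encode() applies the inverse table.
    """
    text = data.decode(source)
    return text.encode(target)


def tables_with_pivot(charset_count: int) -> int:
    # One forward and one inverse table per charset.
    return 2 * charset_count


def tables_pairwise(charset_count: int) -> int:
    # One directed table for every ordered pair of distinct charsets.
    return charset_count * (charset_count - 1)
```

For example, transcode(sjis_bytes, "shift_jis", "euc_jp") converts Shift JIS
input to EUC-JP. For 11 charsets the pivot approach needs 22 tables, against
110 for direct pairwise conversion.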

-- 
    Albert Lunde                      Albert-Lunde@nwu.edu