Re: ISO/IEC 10646 as Document Character Set

Martin J Duerst (mduerst@ifi.unizh.ch)
Fri, 5 May 95 10:35:58 EDT

Although I think that ISO10646 is very important and the change
should be made very quickly, I can understand that 2.0 should
document the current practice and serve as a kind of "warning".

I can therefore support all compromise proposals, provided that
the wording "SGML declarations with other document
character sets." is changed. I think these words neither describe what
any of us has in mind nor make technical sense, and they only
carry the danger of creating problems that we will have
great difficulty cleaning up later.

>> http://www.w3.org/hypertext/WWW/MarkUp/html-spec/html-spec_2.html#SEC8
>> |HTML Lexical Syntax
>> |
>> | ... A minimally conforming HTML user agent must support the SGML
>> | declaration in section SGML Declaration for HTML, which specifies ISO
>> | Latin 1 (@@full name) as the document character set; it may support
>> | other SGML declarations, in particular, SGML declarations with other
>> | document character sets.

Why not write it like this (another compromise):
"in particular, SGML declarations with ISO10646 as the document
character set."

This doesn't interfere with the current state of affairs and gives
a positive example of what we mean by other SGML declarations
and document character sets. In contrast to the paragraph as it
stands now, it offers very little support to somebody who invents
a third document character set unrelated to the two we currently
see.
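
One way to see why naming ISO10646 should not interfere with current
practice: the first 256 code positions of ISO 10646 coincide with
ISO Latin 1, so a numeric character reference resolves to the same
character under either document character set. A minimal sketch in
Python, assuming decimal references only; the function name
resolve_char_ref is hypothetical and not taken from the spec:

```python
def resolve_char_ref(ref: str) -> str:
    """Resolve a decimal numeric character reference such as '&#233;'.

    Assumption: the document character set is ISO 10646, whose code
    positions are what Python's chr() takes as input.
    """
    if not (ref.startswith("&#") and ref.endswith(";")):
        raise ValueError(f"not a numeric character reference: {ref!r}")
    return chr(int(ref[2:-1]))


# Check that every ISO Latin 1 code position (0..255) names the same
# character at the same position in ISO 10646.
latin1_matches = all(
    bytes([n]).decode("latin-1") == chr(n) for n in range(256)
)
```

Under these assumptions, a document written against the Latin 1
declaration keeps its meaning when read against an ISO10646 one.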

Here are some comments on other postings:
>Right, so a UA which supported, say, Latin-1 and HP-Roman8 and SJIS as
>document character sets - perhaps even with Roman8 as the default -
>would be a conforming UA by this spec but be somewhat screwed when 2.1,
>3.0 or n.n as n tends to infinity specifies 10646 as *the* document
>character set? How would the fact that a given document uses a
>different SGML declaration be communicated to the client?

Fortunately, there can be only one document character set, so
anybody who wants to use what is described above first
has to combine the characters of the three sets into one.
That is the only thing I see that will keep document character
sets from proliferating too quickly, although I really think it
would be better to be on the safe side.

----
Dr.sc. Martin J. Du"rst ' , . p y f g c R l / =
Institut fu"r Informatik a o e U i D h T n S -
der Universita"t Zu"rich ; q j k x b m w v z
Winterthurerstrasse 190 (the Dvorak keyboard)
CH-8057 Zu"rich-Irchel Tel: +41 1 257 43 16
S w i t z e r l a n d Fax: +41 1 363 00 35 Email: mduerst@ifi.unizh.ch
----