Re: iso 8859 or escape sequences?

"Daniel W. Connolly" <connolly@hal.com>
Errors-To: listmaster@www0.cern.ch
Date: Tue, 12 Apr 1994 17:08:53 --100
Message-id: <9404121459.AA28392@ulua.hal.com>
Reply-To: connolly@hal.com
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: "Daniel W. Connolly" <connolly@hal.com>
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: Re: iso 8859 or escape sequences?
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Length: 1452

In message <9404121346.AA24247@freya.let.rug.nl>, Bert Bos writes:
>I think the question of ISO Latin-1 character entities in HTML can be
>summarized as follows:
>
>	The following are all equivalent:
>
>	1) &ouml;
>	2) &#246;
>	3) the-8-bit-code-for-o-with-umlaut-that-my-mailer-refuses

True. The equivalence between (1) and (2) is via the definition
of ouml in the version of the ISOlat1 entity set used in HTML:
	<!ENTITY ouml "&#246;"   -- small o, dieresis or umlaut mark -->

The equivalence between (2) and (3) is via the document character set
in the SGML declaration:

     BASESET   "ISO Registration Number 100//CHARSET
                ECMA-94 Right Part of Latin Alphabet Nr. 1//ESC 2/13 4/1"
     DESCSET   128 32 UNUSED
               160 95 32
               255  1 UNUSED
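
A minimal sketch of the three-way equivalence, in Python. It assumes
that html.unescape (which follows HTML5 rules, not an SGML parser's)
is a fair stand-in for entity resolution, and that the iso-8859-1
codec stands in for reading the raw 8-bit code:

```python
import html

# Three spellings of "small o with diaeresis" (character number 246).
named = html.unescape("&ouml;")            # (1) named entity
numeric = html.unescape("&#246;")          # (2) numeric character reference
raw = bytes([246]).decode("iso-8859-1")    # (3) the raw 8-bit code, read as Latin-1

all_equal = named == numeric == raw
print(all_equal, ord(named))  # True 246
```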

>I understand that HTTP is defined as 8-bit clean, but is the same true
>of HTML or HTML+? It should be, of course, but I don't think it is in
>the DTD. (I may be misreading the <!SGML declaration, though.)

The intent of the <!SGML declaration for HTML was to say "HTML is
defined in terms of the 8-bit character set ISO Latin-1." I think I
made a couple of mistakes in expressing that. For example, sgmls complains
when I use &#255; in an HTML document. I think it's responding correctly
to the
               255  1 UNUSED
line, and I think that line should be taken out. But I don't fully grok SGML
character set declarations yet, so I haven't nailed it down.
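
One way to see why &#255; is rejected, as a rough sketch in Python. It
assumes each DESCSET entry reads as (first described number, count,
base-set number or UNUSED), and it covers only the three entries
quoted above. The names DESCSET and lookup are made up for the
illustration:

```python
# Each entry: (first described number, how many, base number or None for UNUSED).
# Only the three entries quoted in this message are modelled here.
DESCSET = [
    (128, 32, None),   # 128..159 UNUSED
    (160, 95, 32),     # 160..254 map to base-set positions 32..126
    (255, 1, None),    # 255 UNUSED
]

def lookup(code):
    """Return the base-set number for a document character number,
    or None if the entry marks it UNUSED.
    Raise KeyError if none of the modelled entries covers it."""
    for start, count, base in DESCSET:
        if start <= code < start + count:
            return None if base is None else base + (code - start)
    raise KeyError(code)

print(lookup(246))  # 118
print(lookup(255))  # None
```

Under that reading, 246 is described and 255 is not, which matches
what sgmls reports.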

Dan