Re: iso 8859 or escape sequencies?
"Daniel W. Connolly" <connolly@hal.com>
Errors-To: listmaster@www0.cern.ch
Date: Tue, 12 Apr 1994 17:08:53 --100
Message-id: <9404121459.AA28392@ulua.hal.com>
Reply-To: connolly@hal.com
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: "Daniel W. Connolly" <connolly@hal.com>
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: Re: iso 8859 or escape sequencies?
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Length: 1452
In message <9404121346.AA24247@freya.let.rug.nl>, Bert Bos writes:
>I think the question of ISO Latin-1 character entities in HTML can be
>summarized as follows:
>
> The following are all equivalent:
>
> 1) &ouml;
> 2) &#246;
> 3) the-8-bit-code-for-o-with-umlaut-that-my-mailer-refuses
True. The equivalence between (1) and (2) is via the definition
of ouml in the version of the ISOlat1 entity set used in HTML:
    <!ENTITY ouml "&#246;" -- small o, dieresis or umlaut mark -->
The equivalence between (2) and (3) is via the document character set
in the SGML declaration:
    BASESET  "ISO Registration Number 100//CHARSET
              ECMA-94 Right Part of Latin Alphabet Nr. 1//ESC 2/13 4/1"
    DESCSET  128  32  UNUSED
             160  95  32
             255   1  UNUSED
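
To make the three-way equivalence concrete, here is a minimal sketch in Python. It is only an illustration: html.unescape is a modern standard-library function that follows today's HTML entity rules, not the 1994 DTD, and the variable names are mine. It happens to agree with the ISOlat1 definition for this character.

```python
import html

# (1) the named entity, resolved through the entity table
named = html.unescape("&ouml;")

# (2) the numeric character reference, i.e. character number 246
numeric = html.unescape("&#246;")

# (3) the raw 8-bit code 0xF6 (decimal 246), read as ISO 8859-1
eight_bit = bytes([0xF6]).decode("latin-1")

# All three spell the same character, o with dieresis.
assert named == numeric == eight_bit == "\u00f6"
```
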
>I understand that HTTP is defined as 8-bit clean, but is the same true
>of HTML or HTML+? It should be, of course, but I don't think it is in
>the DTD. (I may be misreading the <!SGML declaration, though.)
The intent of the <!SGML declaration for HTML was to say "HTML is
defined in terms of the 8-bit character set ISOLatin1." I think I
made a couple of mistakes in expressing that. For example, sgmls complains
when I use &#255; in an HTML document. I think it's responding correctly
to the
    255 1 UNUSED
line. I think that line should be taken out. But I don't fully grok SGML
character set declarations yet, so I haven't nailed it down fully.
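
For what it's worth, here is a rough sketch of how I read those DESCSET rows, written as Python. It assumes each row means "first document character number, how many characters, first base set number or UNUSED", and it only models the three rows quoted above. The names DESCSET, FIXED and base_number are made up for the illustration, and the FIXED table is just one possible repair (widening the 160 row to 96 characters), not something I have checked against a parser.

```python
UNUSED = None

# Rows as quoted in this message:
# (first document character number, count, first base set number or UNUSED)
DESCSET = [
    (128, 32, UNUSED),
    (160, 95, 32),
    (255, 1, UNUSED),
]

# Hypothetical repair: drop the 255 row and let the 160 row cover 96 characters.
FIXED = [
    (128, 32, UNUSED),
    (160, 96, 32),
]

def base_number(code, descset=DESCSET):
    """Map a document character number to its base set number.

    Returns None when the number is marked UNUSED or is not covered
    by any of the given rows.
    """
    for start, count, base in descset:
        if start <= code < start + count:
            return None if base is UNUSED else base + (code - start)
    return None

print(base_number(246))         # 118: &#246; is described
print(base_number(255))         # None: &#255; hits the UNUSED row
print(base_number(255, FIXED))  # 127: described once the row is widened
```

Under that reading, a parser that rejects &#255; is doing exactly what the declaration tells it to.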
Dan