Re: partial draft: "Character Set" Considered Harmful

lilley (lilley@afs.mcc.ac.uk)
Mon, 10 Apr 95 10:54:21 EDT

Gavin wrote:

> Amanda wrote: [about numerical character references]

> >The only reason they exist in the first place is that named character
> >entities could not be relied on in all browsers, and it was a "quick
> >fix."

> Incorrect. They are in SGML so that typists could enter characters
> not directly supported by their keyboards. HTML inherits them from
> SGML.

Um, why is it harder for a typist to type &pound; than to type &#163; ?
I suspect Amanda is right; the early browsers did not have an adequate
repertoire of known named character entities.
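
A minimal sketch of the equivalence at issue, assuming Python's standard
html module as a stand-in for an entity-aware parser (nothing in this
thread uses Python; it is only an illustration). Both spellings of the
pound sign should decode to the same character, code point 163:

```python
import html

# Named entity and numeric character reference for the pound sign.
named = html.unescape("&pound;")
numeric = html.unescape("&#163;")

# Both should resolve to U+00A3 (decimal 163 in ISO 8859-1).
same = named == numeric == "\u00a3"
print(same, ord(named))
```

The named form only works where the parser knows the name; the numeric
form only needs the parser to know the character set.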

Speaking of numeric references, the current HTML 3.0 DTD
(Draft: Fri 24-Mar-95 09:46:33) says:

<!ENTITY % HTMLlat1 PUBLIC
"-//IETF//ENTITIES Added Latin 1 for HTML//EN">
%HTMLlat1;

Now, looking at <http://www.hpl.hp.co.uk/people/dsr/html/latin1.html>
which is marked as needing further work, I only see named character
entities for the accented characters, i.e.

&#192; to &#214; and &#216; to &#246; and &#248; to &#255;

This seems undesirable. I think there should be named entities for all
the characters that might not be on someone's keyboard. To this end,
here is what I use for HTMLlat1:

<http://info.mcc.ac.uk/CGU/staff/lilley/test/HTML_3.0/html-lat1.txt>
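
To make the gap concrete, here is a rough sketch that lists the Latin 1
positions falling outside the three ranges above. It assumes Python's
html.entities table, whose names come from a later HTML entity set and
may not match the names in either file mentioned here:

```python
from html.entities import codepoint2name

# Positions that already have named entities, per the ranges quoted above.
covered = [*range(192, 215), *range(216, 247), *range(248, 256)]

# Remaining printable Latin 1 positions (160-255) with no name in that list.
missing = [cp for cp in range(160, 256) if cp not in covered]

for cp in missing:
    # The name shown is Python's (an assumption), not necessarily the DTD's.
    print(f"&#{cp};  &{codepoint2name[cp]};")
```

This covers 160 to 191 plus the multiplication and division signs at 215
and 247, which sit in the middle of the accented letters.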

Comments appreciated. There are 8 bit characters in there; is that OK?

--
Chris Lilley
+----------------------------------------------------------------------+
|Technical Author, Manchester and North HPC Training & Education Centre|
+----------------------------------------------------------------------+
| Computer Graphics Unit,             |  Email: Chris.Lilley@mcc.ac.uk |
| Manchester Computing Centre,        |  Voice: +44 61 275 6045        |
| Oxford Road, Manchester, UK.M13 9PL |    Fax: +44 61 275 6040        |
+-------------------------------------+ BioMOO: ChrisL                 |
|       URI: http://info.mcc.ac.uk/CGU/staff/lilley/lilley.html        | 
+----------------------------------------------------------------------+
|       "The first W in WWW will not wait."   François Yergeau         |
+----------------------------------------------------------------------+