Further on new "spec-02"

Terry Allen (terry@ora.com)
Tue, 16 May 95 15:22:21 EDT

> * Its document character set includes ISO-8859-1 and
agrees with ISO10646; that is, each code position
listed in 14.1, "The ISO-8859-1 Coded Character Set" is
included, and each code position in the document
character set is mapped to the same character as
ISO10646 designates for that code position.
NOTE - The document character set is somewhat
independent of the character encoding scheme used to
represent a document. For example, the ISO-2022-JP
character encoding scheme can be used for HTML
documents, since its repertoire is a subset of the
ISO10646 repertoire. The critical distinction is that
numeric character references agree with ISO10646
regardless of how the document is encoded.

The present doc stops after section 11, the DTD, and doesn't include
an SGML declaration (thus section 14.1 is also missing).

I don't think that NOTE is very helpful, especially as the doc charset
isn't 10646 in this version. ISO-2022-JP can be used for this version,
but not because its repertoire is a subset of 10646. For that matter,
"Its document character set includes ISO-8859-1 and agrees with ISO10646"
dances around the question of which it is. I thought we were still
with 8859-1, which means that this sentence should say,

* Its document character set is ISO-8859-1, which
is a subset of ISO10646. Each code position
listed in 14.1, "The ISO-8859-1 Coded Character Set" is
included [I assume], and each code position in the document
character set is mapped to the same character as
ISO10646 designates for that code position.
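
To make the distinction concrete, here is a rough sketch in Python (not
part of the spec, and using Python's own codecs and its html.unescape,
which follows later HTML rules, as stand-ins for a conforming parser).
It checks that each ISO-8859-1 code position names the same character
as the same-numbered ISO10646 position, and that a numeric character
reference such as &#233; resolves the same way whether the document
arrives as ISO-8859-1 or as ISO-2022-JP. The helper name resolve and
the sample strings are made up for illustration.

```python
import html

# Every ISO-8859-1 code position maps to the same-numbered ISO 10646
# (Unicode) character, which is what "is a subset of ISO10646" relies on.
assert all(bytes([n]).decode("iso-8859-1") == chr(n) for n in range(256))


def resolve(raw: bytes, encoding: str) -> str:
    """Hypothetical helper: undo the character encoding scheme first,
    then resolve numeric character references against the document
    character set (html.unescape stands in for the parser here)."""
    return html.unescape(raw.decode(encoding))


# The same reference, &#233;, carried in two different encodings.
latin1_doc = "caf&#233;".encode("iso-8859-1")
jp_doc = "\u65e5\u672c\u8a9e caf&#233;".encode("iso2022_jp")

from_latin1 = resolve(latin1_doc, "iso-8859-1")
from_jp = resolve(jp_doc, "iso2022_jp")

# Code position 233 means the same character in both cases.
assert from_latin1 == "caf\u00e9"
assert from_jp == "\u65e5\u672c\u8a9e caf\u00e9"
```

Position 233 is within ISO-8859-1, so the result is the same whether
one reads the document character set as 8859-1 or as 10646; the two
readings only diverge for references above 255.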

-- 
Terry Allen  (terry@ora.com)   O'Reilly & Associates, Inc.
Editor, Digital Media Group    101 Morris St.
			       Sebastopol, Calif., 95472
occasional column at:  http://gnn.com/meta/imedia/webworks/allen/

A Davenport Group sponsor. For information on the Davenport Group see ftp://ftp.ora.com/pub/davenport/README.html or http://www.ora.com/davenport/README.html