Re : proposed changes to charset parameter

pandries@alis.ca
Fri, 13 Jan 95 13:33:36 EST

Masinter@parc.xerox.com recently wrote :

>To address the comments made at the meeting on character sets, I
>started with the text version of the HTML draft, edited it, and am
>sending proposed changes as diffs.

(...)
>================================================================
>diff -5c html-orig.txt html-revised.txt
>*** html-orig.txt Tue Jan 10 03:48:34 1995
>--- html-revised.txt Tue Jan 10 03:49:29 1995
>***************
(...)
In section 1.1.8 Character Data in HTML

> Independent of the character encoding used,
> HTML also allows references to any of the ISO Latin-1
> alphabet, using the names in the table ISO Latin-1
> Character Representations, which is derived from ISO
> Standard 8879:1986//ENTITIES Added Latin 1//EN. For
> details, see 2.17.2.

I was just wondering whether this sentence would be clearer if we reserved the
word "encoding" for the transport encoding (a la MIME) and simply said
"character set" here, as the name of the parameter that specifies the
character encoding used ("charset") already does.
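As a rough illustration of the distinction drawn here, the sketch below builds
a MIME entity with Python's standard email package. It assumes nothing from the
draft beyond the "text/html" media type and its "charset" parameter; the sample
body and the choice of quoted-printable are hypothetical. The point is only
that the character set and the transport encoding are carried in two separate
header fields.

```python
from email.message import EmailMessage

# Hypothetical sample body containing one Latin-1 character (e with acute).
body = "<P>caf\u00e9</P>\n"

msg = EmailMessage()
# charset: which coded character set the text uses.
# cte: how the resulting bytes are encoded for transport.
msg.set_content(body, subtype="html", charset="iso-8859-1",
                cte="quoted-printable")

print("media type        :", msg.get_content_type())
print("charset parameter :", msg.get_content_charset())
print("transport encoding:", msg["Content-Transfer-Encoding"])
```

Changing the transport encoding (for example to base64) would leave the
charset parameter untouched, which is why one word for both is confusing.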

> 2.16 Character Data
>
> Level 0
>
>! The characters between HTML tags represent text. A HTML document
>! (including tags and text) is encoded using the coded character
>! set specified by the "charset" parameter of the "text/html"
>! media type. For levels defined in this specification, the
>! "charset" parameter is restricted to "US-ASCII" or "ISO-8859-1".
>! ISO-8859-1 encodes a set of characters known as Latin Alphabet
>! No. 1, or simply Latin-1. Latin-1 includes characters from most
>! Western European languages, as well as a number of control
>! characters. Latin-1 also includes a non-breaking space, a soft
>! hyphen indicator, 93 graphical characters, 8 unassigned
>! characters, and 25 control characters.
>!
Stop me if I am reading this wrongly: do you mean to say that we could not
legally have Arabic, Cyrillic, Japanese, Korean or Chinese in any Web document?
I take it that the phrase "for levels defined" means all valid levels? Even
though we here in Canada would not suffer (too much) from this restriction, I
wonder what native speakers from Russia, Japan or Korea would think. As I
understand it, this restriction is unacceptable ...
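
To make the concern concrete, here is a minimal Python sketch that checks
which of the two permitted charsets, if any, can represent a given piece of
text. The helper name permitted_charset and the sample strings are
assumptions introduced for illustration; they do not come from the draft.

```python
# The two values the proposed text allows for the "charset" parameter.
ALLOWED_CHARSETS = ("US-ASCII", "ISO-8859-1")


def permitted_charset(text):
    """Return the first allowed charset that can encode text, else None."""
    for charset in ALLOWED_CHARSETS:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue  # at least one character is outside this charset
        return charset
    return None


# Hypothetical sample strings, one per language.
samples = {
    "English": "character set",
    "French": "caract\u00e8res cod\u00e9s",
    "Russian": "\u043d\u0430\u0431\u043e\u0440",
    "Japanese": "\u6587\u5b57",
}

for language, text in samples.items():
    print(language, "->", permitted_charset(text))
```

Under these assumptions the English and French samples fit one of the two
charsets, while the Russian and Japanese samples fit neither.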

Patrick Andries
Alis Technologies Inc - open to all cultures
1+514+738-9171
e-mail : pandries@alis.ca