Re: HTML-WG digest 107

Gavin Nicol (gtn@ebt.com)
Tue, 18 Jul 95 19:38:23 EDT

>We propose to do all within HTML. (The HTTP protocol does
>allow 8-bit ASCII, etc., but the SGML DTD
>mandates latin-1 and the current HTML spec mandates 7 bits.)
>
>UTF-7 is expressed in latin-1 characters. Other than
>sequences so encoded, all latin-1 and named entities can
>remain as is. An escape sequence such as &+ begins an
>encoded sequence and - or end of line ends it. (UTF-7 uses a
>sole + as its escape but this would interfere with older
>pages.) The encoded HTML could be imported into, edited in, and
>exported from an authoring edition of the Accent
>(multilingual) WP (until a browser is implemented).
>
>We will propose additions to HTML3 for specifying other
>encodings, paragraph reading-order for bidi, etc.
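
For concreteness, the escape scheme in the quoted text can be sketched
roughly as below. This only illustrates the proposal as quoted; it is
not part of UTF-7 itself or of any HTML specification. The names ESCAPE
and decode_proposed are hypothetical, and Python's built-in "utf-7"
codec is assumed as a stand-in for decoding the run between the
delimiters.

```python
import re

# Hypothetical pattern for the quoted proposal: "&+" opens a run of
# modified-base64 characters, and "-" or end of line closes it.
ESCAPE = re.compile(r"&\+([A-Za-z0-9+/]+)(-|$)", re.MULTILINE)


def decode_proposed(text: str) -> str:
    """Expand every "&+...-" run in text; leave everything else alone."""

    def expand(match: re.Match) -> str:
        # Rebuild an ordinary UTF-7 shifted sequence ("+...-") and let
        # the standard codec decode it. A malformed run raises
        # UnicodeDecodeError; this sketch does not try to recover.
        shifted = "+" + match.group(1) + "-"
        return shifted.encode("ascii").decode("utf-7")

    return ESCAPE.sub(expand, text)


if __name__ == "__main__":
    # "ZeVnLIqe" is the modified-base64 form of U+65E5 U+672C U+8A9E.
    print(decode_proposed("Hello &+ZeVnLIqe-!"))
    # A bare "+" and ordinary entities pass through unchanged.
    print(decode_proposed("1 + 1 = 2 &amp; so on"))
```

Text outside an escaped run, including a sole "+" and named entities
such as &amp;, is passed through unchanged, which is the property the
quoted text gives as the reason for preferring "&+" over a bare "+".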

Before you propose anything, I suggest you do the following:

1) Read the HTML 2.0 draft specification
2) Read the HTML 3.0 draft specification
3) Read the Unicode standard
4) Read the UTF-7, UTF-8, and other documents describing encodings for
Unicode.
5) Read up on I18N requirements.
6) Read the HTML-WG discussions on this (starting from last December
or perhaps earlier).
7) Read the SGML Handbook, twice.

Then come back and comment. What you have written above contains
outright errors, and it implies a badly mistaken understanding of
the issues.

I would also suggest looking at:
http://www.unicode.org/
http://www.ebt.com:8080/
http://www.w3.org/hypertext/WWW/International