Re: HTML-WG digest 107

Albert Lunde (Albert-Lunde@nwu.edu)
Tue, 18 Jul 95 08:12:16 EDT

> We propose to do all within HTML. (The HTTP protocol does
> allow 8-bit ASCII, etc., but the SGML DTD
> mandates Latin-1 and the current HTML spec mandates 7-bits.)
>
> UTF-7 is expressed in Latin-1 characters. Other than
> sequences so encoded, all Latin-1 and named entities can
> remain as is. An escape sequence such as &+ begins an
> encoded sequence and - or end of line ends it. (UTF-7 uses a
> sole + as its escape but this would interfere with older
> pages.) The encoded HTML could be imported, edited and
> exported by an authoring edition of the Accent
> (multilingual) WP (until a browser is implemented).
>
> We will propose additions to HTML3 for specifying other
> encodings, paragraph reading-order for bidi, etc.
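
To make the quoted escape proposal concrete, here is a rough Python sketch of how a reader might expand "&+ ... -" runs. It is only an illustration under assumptions: the function name decode_escapes and the regular expression are hypothetical, and the sketch assumes the payload is ordinary UTF-7 modified base64, which it hands to Python's built-in utf-7 codec. It is not the proposers' implementation.

```python
import re

# Hypothetical pattern for the proposed escape: "&+" opens a run of
# modified-base64 characters, and "-" or end of line closes it.
ESCAPE = re.compile(r"&\+([A-Za-z0-9+/]+)(?:-|$)", re.MULTILINE)


def decode_escapes(text):
    """Expand each "&+...-" run; leave all other text untouched."""

    def expand(match):
        # Rebuild a standard UTF-7 shifted sequence ("+...-") and let
        # the built-in utf-7 codec decode it. A malformed payload
        # raises UnicodeDecodeError rather than being guessed at.
        shifted = "+" + match.group(1) + "-"
        return shifted.encode("ascii").decode("utf-7")

    return ESCAPE.sub(expand, text)
```

Named entities such as &eacute; pass through unchanged because the pattern only fires on "&" immediately followed by "+".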

I think you should read over the current HTML 2.0 spec and the prior
discussions of the working group on this topic.

While the HTML 2.0 spec does require support for Latin-1,
the intention is to allow the use of other document
character sets and encodings.

The DTD is expressed in terms of Latin-1, but since this is an
8-bit character set, I'm not sure what you mean by "the current HTML spec
mandates 7-bits".

I'd suggest that HTTP, not HTML, is the appropriate place to
specify encodings (since SGML has the problem that a document
cannot be parsed before its encoding is known).
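
As a rough illustration of labelling the encoding at the HTTP level, the Python sketch below reads the charset parameter from a Content-Type header value and decodes the body only after that. The function names are hypothetical, and the fallback to ISO-8859-1 when no charset is given is an assumption made for this sketch.

```python
from email.message import Message


def charset_from_content_type(header_value, default="iso-8859-1"):
    """Return the charset parameter of a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lower-cases the name and returns the
    # supplied default when the header has no charset parameter.
    return msg.get_content_charset(default)


def decode_body(body, header_value):
    """Decode raw bytes using the charset named in the header."""
    # The encoding is known before any markup is parsed, which is
    # the point of carrying it in the protocol rather than the document.
    return body.decode(charset_from_content_type(header_value))
```

With a header of "text/html; charset=ISO-2022-JP" the receiver knows how to turn bytes into characters before it looks at a single tag.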

-- 
    Albert Lunde                      Albert-Lunde@nwu.edu