Re: International Document Server Support

atotic@ncsa.uiuc.edu (Alexsander Totic)
From: atotic@ncsa.uiuc.edu (Alexsander Totic)
Message-id: <9312071905.AA23142@void.ncsa.uiuc.edu>
Subject: Re: International Document Server Support
To: dsr@hplb.hpl.hp.com (Dave_Raggett)
Date: Tue, 7 Dec 1993 13:05:42 -0600 (CST)
Cc: www-talk@nxoc01.cern.ch
In-reply-to: <9312071141.AA10450@manuel.hpl.hp.com> from "Dave_Raggett" at Dec 7, 93 11:41:38 am
X-Mailer: ELM [version 2.4 PL23]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 846       
> I am including an optional CHARSET attribute in nearly all elements in
> the next revision to the HTML+ DTD. This will allow browsers to switch
> char sets for a paragraph etc,  e.g. <P charset="ISO-2022-JP"> ....
> as described in RFC 1468 which is used for Japanese character encoding
> for email and network news.
> 
> Should I also include a LANGUAGE attribute for the ISO3316 language codes?

Are these character sets all 8-bit sets, or is there support for
multi-byte characters? I am not familiar with the way things work on X,
but on a Mac, Chinese and some other languages use 2-byte characters.

How is the parsing going to be done for character sets where all the
code values are used for characters? The current parser depends on
special characters such as '<' and '&'. In different character sets, I
do not think that we can depend on this.
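
To make the concern concrete, here is a minimal Python sketch using the
ISO-2022-JP example from the quoted message. It is only an illustration:
the function names are hypothetical and it is not the parser of any
browser. It relies on Python's standard "iso2022_jp" codec. In this
encoding the second byte of some two-byte characters has the same value
as ASCII '&' (0x26) or '<' (0x3C), so a parser scanning raw bytes would
see markup that is not there.

```python
# Illustrative only: names below are hypothetical, not from any real parser.
MARKUP_BYTES = {0x3C, 0x26}  # ASCII '<' and '&'


def naive_markup_offsets(data: bytes) -> list[int]:
    """Byte offsets a byte-oriented parser would treat as '<' or '&'."""
    return [i for i, b in enumerate(data) if b in MARKUP_BYTES]


def decoded_markup_offsets(data: bytes, charset: str) -> list[int]:
    """Character offsets of real '<' or '&' after decoding with charset."""
    text = data.decode(charset)
    return [i for i, ch in enumerate(text) if ch in "<&"]


# One real tag followed by two hiragana whose second bytes are 0x26 and 0x3C.
sample = "<P>\u3046\u305c".encode("iso2022_jp")

print(naive_markup_offsets(sample))                  # raw byte scan
print(decoded_markup_offsets(sample, "iso2022_jp"))  # scan after decoding
```

The raw byte scan reports three markup positions, while the scan after
decoding reports only the genuine '<' of the tag. Under this assumption,
a parser would need to decode (or at least track the escape sequences)
before looking for special characters.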

Aleks