Sure, this is a very important first step, as is the Accept-Charset
parameter.
>On the other hand, it suggests that to satisfy SGML mavens we at least
>need to specify a mechanism/algorithm to derive an SGML declaration for
>other character sets than ISO-Latin-1. ERCS may be a way to do
>this. (Define character classes for Unicode and project downward to
>subsets.)
>
>There are simpler mechanisms that may work for US-ASCII-like character sets
>(Define character classes on ASCII or Latin-1 and lump everything else
>together somehow) (which is how I guess implementations of multilingual WWW
>are actually working now.)
This is how things work now.
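
As a rough illustration of that simpler mechanism (a sketch only, not
taken from any actual browser; the class names and the function
char_class are hypothetical), character classes are defined on ASCII
and everything above 127 falls into a single bucket:

```python
import string

# Hypothetical class names, loosely modeled on SGML character classes.
NAME_START = set(string.ascii_letters)
NAME_CHARS = NAME_START | set(string.digits) | {".", "-"}
SEPARATORS = {" ", "\t", "\r", "\n"}


def char_class(ch):
    """Classify one character; anything outside ASCII is lumped together."""
    if ord(ch) > 127:
        return "other"  # the "lump everything else together" bucket
    if ch in NAME_START:
        return "name-start"
    if ch in NAME_CHARS:
        return "name"
    if ch in SEPARATORS:
        return "separator"
    return "data"
```

The "other" bucket is where the information loss happens: a parser
built this way cannot tell a non-ASCII letter from non-ASCII
punctuation, which is the kind of distinction ERCS is meant to supply.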
>One problem I see is that choosing this simpler method to derive the SGML
>stuff might foreclose or complicate options to use ERCS later to
>provide better SGML stuff for Unicode.
Yes. I'll elaborate a little later, but I believe we can adopt ERCS
and still maintain backward compatibility, with very few or no
changes to current browsers.
>We could also abandon an attempt to define what the charset parameter
>really means in the HTML 2.0 spec and indicate that clients should not
>choke on it (though this is really an HTTP issue). But this would make it
>rather urgent to deal with for 2.1, at least in the simple MIME-like case.
Yes. I'd really appreciate putting this off until 2.1, which would
give us perhaps 6 months more to work it all out.
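
In the meantime, "not choking" on the parameter could look something
like the sketch below. It is only an illustration: the function name
parse_content_type is hypothetical, and falling back to ISO-8859-1
when no charset is given is an assumption on my part, not something
the 2.0 draft settles.

```python
def parse_content_type(value, default_charset="ISO-8859-1"):
    """Split a Content-Type value into (media type, charset).

    Unknown or malformed parameters are ignored rather than rejected.
    The default charset is an assumption made for this sketch.
    """
    media_type, _, rest = value.partition(";")
    charset = default_charset
    for param in rest.split(";"):
        name, sep, val = param.partition("=")
        if sep and name.strip().lower() == "charset":
            # Accept both charset=EUC-JP and charset="EUC-JP".
            charset = val.strip().strip('"') or default_charset
    return media_type.strip().lower(), charset
```

A client written this way accepts "text/html; charset=ISO-2022-JP"
without complaint and can then decide separately whether it is able
to display that encoding.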
I may come across as very pro-Unicode, but the core issue I am
interested in is multilingual support. This includes things like EUC,
SJIS, Unicode, ISO 2022, etc., and many other issues for browsers as
well.