You know, I've come to the conclusion that 'international character
sets' are relatively easy to handle by requiring an additional
"charset" parameter to the "text/html" MIME type. E.g.,
"text/html; charset=unicode-1-1-utf-7" would be a way of saying 'an
HTML document using Unicode' (as per RFC 1642), while "text/html;
charset=iso-2022-kr" would identify an HTML document that uses the
Hangul encoding scheme for Korean as per RFC 1557. The default charset
depends on the transport mechanism; for HTTP, the default might well
be "text/html; charset=iso-8859-1".
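
To make the parameter concrete, here is a minimal sketch in Python of
how a receiver might pull the charset out of a content type and fall
back to a transport default. The function name parse_content_type and
its default_charset argument are made up for illustration, and the
parsing is deliberately naive (it does not handle quoted parameter
values that contain ";").

```python
def parse_content_type(value, default_charset="iso-8859-1"):
    """Split a content type string into (media_type, charset).

    default_charset is the transport's assumed default; iso-8859-1 is
    used here only because it is the default suggested for HTTP.
    """
    parts = [p.strip() for p in value.split(";")]
    media_type = parts[0].lower()
    charset = default_charset
    for param in parts[1:]:
        name, sep, val = param.partition("=")
        # Only a well-formed "charset=..." parameter overrides the default.
        if sep and name.strip().lower() == "charset":
            charset = val.strip().strip('"').lower()
    return media_type, charset


if __name__ == "__main__":
    for example in ("text/html; charset=iso-2022-kr", "text/html"):
        print(example, "->", parse_content_type(example))
```

A value with no charset parameter simply yields the default, which is
the behaviour described above for HTTP.
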
I'm considering proposing in the HTTP working group that we add an
"Accept-charset:" header with which clients tell servers which
charsets (other than US-ASCII and ISO-8859-1) they are willing to
accept. Of course, servers must then identify the charset of any
text/* document that isn't in the default charset; however, this is
no longer an HTML issue.
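
As a rough sketch of how a server might act on such a header, here is
some Python. Everything in it is an assumption for illustration: the
header is only a proposal, so the comma-separated syntax, the helper
names (parse_accept_charset, choose_charset, content_type_for), and
the rule of preferring the first acceptable variant are all mine and
not part of any specification.

```python
# Charsets a client is assumed to accept without listing them.
IMPLICIT_CHARSETS = ("us-ascii", "iso-8859-1")


def parse_accept_charset(header_value):
    """Parse an assumed comma-separated list of charset names."""
    return [n.strip().lower() for n in header_value.split(",") if n.strip()]


def choose_charset(available, accept_charset=""):
    """Return the first available charset the client accepts, or None.

    'available' lists the charsets the server can supply for the
    document, in the server's order of preference.
    """
    acceptable = set(IMPLICIT_CHARSETS) | set(parse_accept_charset(accept_charset))
    for charset in available:
        if charset.lower() in acceptable:
            return charset.lower()
    return None


def content_type_for(media_type, charset, default="iso-8859-1"):
    """Label the charset explicitly whenever it is not the default."""
    if charset == default:
        return media_type
    return f"{media_type}; charset={charset}"


if __name__ == "__main__":
    chosen = choose_charset(["iso-2022-kr", "iso-8859-1"],
                            "iso-2022-kr, unicode-1-1-utf-7")
    print(content_type_for("text/html", chosen))
```

A client that sends no header at all would only ever be offered the
US-ASCII or ISO-8859-1 variant, if the server has one.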