>Since Unicode is not really being used today in HTML, couldn't we
>stipulate that Unicode HTML use UTF8 encoding?
No. I, for one, would like to use UCS-2. Anyway, it would still violate
the processing model for SGML.
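
To make the UTF-8 versus UCS-2 difference concrete, here is a small sketch in Python. It rests on one assumption: Python has no codec named "ucs-2", so I use "utf-16-be", which produces the same bytes as big-endian UCS-2 as long as every character is in the Basic Multilingual Plane. The sample text is made up for illustration.

```python
# Sample text: one Latin letter plus one CJK ideograph (U+65E5), both in the BMP.
text = "A\u65e5"

# UTF-8 is variable width: 1 byte for "A", 3 bytes for the ideograph.
utf8 = text.encode("utf-8")

# Assumption: "utf-16-be" stands in for big-endian UCS-2, which is valid
# only because the text contains no characters outside the BMP.
ucs2 = text.encode("utf-16-be")

print(utf8.hex())  # 41e697a5
print(ucs2.hex())  # 004165e5
```

Both byte strings decode to the same characters; only the serialization differs.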
>Do we care about docs of mixed encodings? With many Mac word
>processors, I can create documents in mixed-encodings. The
>Mosaic-L10N folks have been doing a lot of work with ISO-2022-xx
>encodings. X-windows has compound-text which is similar to 2022.
>How do I put these types of data on the Web?
>
>One answer is that these docs must be converted to some form of Unicode
>(ucs, utf8).
This is the simplest answer I can think of, hence my earlier
proposal. Having the browser support all these encodings is almost
impossible.
>Another answer is to support encoding tags.
We cannot do this within an HTML document without complicating things
immensely. The document should be converted to a single coded
character set before the parser proper ever even sees it.
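
A rough sketch of that conversion step, in Python. The function name to_unicode and the list of (bytes, encoding) segments are hypothetical; how a mixed-encoding document gets split into labelled segments in the first place is not addressed here. The codec names are ones Python's standard library provides.

```python
def to_unicode(segments):
    """Decode each (raw_bytes, encoding_name) segment and join the results.

    The output is a single Unicode string, so whatever parser runs next
    only ever sees one coded character set.
    """
    return "".join(raw.decode(encoding) for raw, encoding in segments)


# Hypothetical mixed-encoding input: ASCII markup, an ISO-2022-JP run,
# and a Mac Roman run.
segments = [
    (b"<P>", "ascii"),
    ("\u65e5\u672c".encode("iso2022_jp"), "iso2022_jp"),
    ("caf\u00e9".encode("mac_roman"), "mac_roman"),
    (b"</P>", "ascii"),
]

document = to_unicode(segments)
```

The parser is then handed document and never needs to know which encodings the original used.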
>If we do convert mixed-encoding text to Unicode, then we will need to
>use the LANG tag to disambiguate unified CJK characters for rendering
>in the "proper" fonts.
Or use an encoding containing "hints", which could possibly include
glyph image specification hints as well...