Actually, I think it is quite reasonable to expect authors etc. to use
their native coded character set and encoding (EUC, SJIS, ...) and then
expect the server to perform the conversion. With caching, I do not
think performance will be an issue. Besides, the conversion itself
would not be expensive: basically, read in the document in its native
form, read in the conversion table, and for each code in the document
perform a table lookup.
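
For concreteness, here is a minimal sketch of that loop (Python, purely
illustrative; the table file name and its hex-pair-per-line format are
my assumptions, not anything specified above):

    # Table-driven code conversion, sketched for illustration only.
    # Assumed table format: one "native-code unicode-code" hex pair
    # per line, e.g. "8140 3000".

    def load_table(path):
        """Read a conversion table mapping native codes to Unicode."""
        table = {}
        with open(path) as f:
            for line in f:
                native, uni = line.split()
                table[int(native, 16)] = int(uni, 16)
        return table

    def convert(codes, table):
        """For each code in the document, perform a table lookup."""
        return [table[c] for c in codes]

    # A server would load the table once and convert each document as
    # it is read in, caching the result thereafter.
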
For a truly multilingual SGML system, something like this will take
place somewhere in or before the entity manager. This simply
acknowledges that fact, and makes conversion the responsibility of the
information providers rather than of the consumers (who potentially
face a far wider range of conversions).

>What (briefly) is the problem with requiring Web browsers to
>understand and display Unicode? Is it simply the availability of
>Unicode font sets, or are there some deeper architectural issues
>here?
There are many small issues, but in my experience (and Amanda and
others will verify this), implementing a Unicode-based application is
*far* easier than trying to support even a small number of coded
character sets and encodings.
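
To make the contrast concrete, here is a rough sketch of the pattern a
Unicode-based application can follow (again Python and illustrative
only; the codec names are just examples):

    # Convert at the boundary; process one representation internally.
    def read_document(path, native_encoding):
        """Decode once, on input; e.g. "euc_jp" or "shift_jis"."""
        with open(path, "rb") as f:
            return f.read().decode(native_encoding)

    # Application logic is then written once, against Unicode alone,
    # with no per-encoding special cases.
    def count_characters(text):
        return len(text)

Supporting N encodings directly means N variants of every string
operation; converting at the boundary means writing each operation
once.
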
I hope to expand/revise my paper in the next month or so to cover some
of these issues in detail.