I thought I grasped that just fine.
Encoding is one thing, glyphs are another. I need glyphs to render.
If I can encode Hindi in the document charset 10646 using iso-2022-jp,
which I have been led to believe I can do, how does this format
negotiation work? Or will any encoding of any 10646 content
using iso-2022-jp be limited somehow to the Japanese portion
(if there is such a concept) of 10646?
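
To make the question concrete, here is a minimal sketch in Python (chosen only for illustration). It assumes that numeric character references are resolved against the 10646 document character set, which is one way Hindi could travel inside an iso-2022-jp byte stream. The sample string and variable names are hypothetical.

```python
import html

text = "हिन्दी"  # the word "Hindi" in Devanagari (sample data)

# ISO-2022-JP has no Devanagari repertoire, so a direct encode fails.
try:
    text.encode("iso2022_jp")
    direct_ok = True
except UnicodeEncodeError:
    direct_ok = False

# Assumption: characters outside the encoding are written as numeric
# character references that point at 10646 code positions.
encoded = text.encode("iso2022_jp", errors="xmlcharrefreplace")

# A reader that resolves the references against 10646 recovers the text.
decoded = html.unescape(encoded.decode("iso2022_jp"))

print(direct_ok)
print(encoded)
print(decoded == text)
```

Under that assumption the encoding itself is not limited to a Japanese portion of 10646, but the charset label then says nothing about which glyphs the document needs.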
Charsets smaller than Unicode have, mostly, natural relations to
languages and to fonts. For those charsets one could infer from the
charset parameter what fonts might be needed.
Unicode is a different story. If the
document charset of HTML is to be Unicode, then anyone can hand
me a valid, conforming HTML doc that has characters in it I won't
be able to render unless I have a full set of glyphs for all
65,500+ characters. Most of us won't. How do we manage that
practically? How do I determine, without parsing the doc, what
range of 10646 it uses? Or do I have to live with not being
able to do that? (I'm just exploring this issue, not taking a
side.)
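
As a rough illustration of what "parsing the doc" to find the ranges would involve, here is a small Python sketch. The block table is a tiny hand-picked subset of 10646 ranges and the function name is hypothetical. A real implementation would need the full block list.

```python
# Hypothetical, partial table of 10646 ranges: (first, last, name).
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0900, 0x097F, "Devanagari"),
    (0x3040, 0x309F, "Hiragana"),
    (0x30A0, 0x30FF, "Katakana"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
]

def blocks_used(text):
    """Return the set of block names that the characters of text fall in."""
    used = set()
    for ch in text:
        cp = ord(ch)
        for first, last, name in BLOCKS:
            if first <= cp <= last:
                used.add(name)
                break
        else:
            # Code position not covered by the partial table above.
            used.add("Other")
    return used

print(sorted(blocks_used("Hi हिन्दी 日本語")))
```

The scan has to visit every character, so it only answers the question after the whole document has been read, which is the cost the question above is trying to avoid.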
--
Terry Allen (terry@ora.com)
O'Reilly & Associates, Inc.
Editor, Digital Media Group
101 Morris St.
Sebastopol, Calif., 95472
occasional column at: http://gnn.com/meta/imedia/webworks/allen/

A Davenport Group sponsor. For information on the Davenport Group see
ftp://ftp.ora.com/pub/davenport/README.html
or http://www.ora.com/davenport/README.html