[This conversation is getting oddly neo-Platonic for an IETF working group :)]
One of the difficulties in this whole discussion is that there are multiple 
levels of abstraction, and different people find different boundaries between 
them to be significant.  I, for example, find much of Dan's discussion of the 
theoretical underpinnings of coded character sets to be precise but largely 
irrelevant to the issues at hand.  On the other hand, there are people who no 
doubt view my focus on multilingual capability and a single universal 
character encoding (namely IS 10646) to be peculiar, since it seems obvious 
that the actual encoding(s) used is of no theoretical import--it's a purely 
pragmatic issue, and hence not interesting :).
I'll quickly admit that on the particular issue of coded character sets I am 
being purely pragmatic.  I am not particularly concerned (in this 
context) with the essential nature or philosophy of text, or even of 
electronic representations of text.  I am, rather, concerned with a small set 
of pressing pragmatic issues.  Principal among them is simply being able to 
determine unambiguously what characters are being represented in an HTML 
document so that I can display them.  This is mostly a labelling issue, 
although numeric character references are a problem--one that can be 
pragmatically solved by restricting HTML to a single (large) document 
character set.
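To illustrate, here is a minimal sketch in Python (the function name is 
mine, purely hypothetical).  With IS 10646 fixed as the one document 
character set, a numeric reference is just an index into it, independent 
of whatever transport encoding the document arrived in:

    import re

    # "&#233;" denotes position 233 in the *document* character set.
    # With IS 10646 as that set, 233 is always LATIN SMALL LETTER E
    # WITH ACUTE, no matter how the document was transported.
    def resolve_numeric_refs(text):
        return re.sub(r"&#([0-9]+);",
                      lambda m: chr(int(m.group(1))),
                      text)

    # resolve_numeric_refs("caf&#233;") == "caf\u00e9"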
The status quo in this regard is broken.  As anyone who has tried to implement 
Japanese support in their browser can confirm, there is a lot of content out 
there whose interpretation cannot be determined unambiguously by software.  
This is bad.
To give a concrete example, the Macintosh on which I am typing this message 
can handle multilingual text just fine.  At the moment, it has fonts & input 
methods installed for European, Russian, Hebrew, Arabic, and Japanese.  There 
are HTML documents in existence that contain content in one or more of these.  
All I want right now is some reliable way of matching those documents up with 
the right encodings and fonts.  So 
far, what we do is cheat.  ISO 2022 is easy to automatically detect even in 
mislabeled text, and is reasonably popular, so we've started with Japanese.  
There's only so far we can go with clever inferences, though.
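For the curious, the "cheat" boils down to something like this Python 
sketch (names mine, not our actual shipping code).  ISO-2022-JP is 7-bit 
and announces each character set with an escape sequence, so even 
mislabeled Japanese text gives itself away:

    # ISO 2022 designation sequences used by ISO-2022-JP.  Plain ASCII
    # text essentially never contains ESC, so finding one of these is
    # strong evidence of Japanese content even when the label is wrong.
    JIS_ESCAPES = (
        b"\x1b$@",   # JIS C 6226-1978 (kanji)
        b"\x1b$B",   # JIS X 0208-1983 (kanji)
        b"\x1b(J",   # JIS X 0201-1976 Roman
    )

    def looks_like_iso_2022_jp(data):
        return any(esc in data for esc in JIS_ESCAPES)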
I don't mind translating between the transport representation and IS 10646, so 
that the SGML layer only sees a sequence of IS 10646 code points.  That's 
simple (see the sketch below).  What I do mind is endless discussion about the 
distinctions between 
characters, glyphs, codes, and the essential nature of reality, even though in 
other contexts I may care greatly about such issues.  They simply do not 
address the issue at hand (which Gavin's proposal does, as I see it).
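For concreteness, here is a sketch of that translation step, again in 
Python (its codec machinery stands in for whatever table-driven 
conversion a browser would actually use):

    # Decode the transport representation (ISO-2022-JP here) into a
    # sequence of IS 10646 code points; the SGML layer sees only the
    # code points, never the transport bytes.
    def to_code_points(raw, transport_encoding="iso-2022-jp"):
        return [ord(c) for c in raw.decode(transport_encoding)]

    # to_code_points(b"\x1b$B$3$s$K$A$O\x1b(B")
    #   -> [12371, 12435, 12395, 12385, 12399]   # "konnichiwa"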
I'm not trying to squelch anyone; I just think we're getting a bit far afield.
Amanda Walker
InterCon Systems Corporation