Why is it necessary to? The data can be displayed anyway (though it
might be nice to distinguish, I cannot see the absolute need).
>Finally on Point 3. I think 10646 is conceptually nice, however given
>the fact that we accept different "representations" such as 8859-X
>being actually sent to the browser via HTTP, then this becomes the "HTML
>document" that the client sees. Servers may pre-map the 10646 to 8859-X
>if only one non-Latin1 language is used and so basically the
>10646 is really not "visible" to the outside world. Thus we are back to
>HTML markup in US-ASCII and everything else as data.
Yes, that is the whole point. By using ISO 10646 we get:
1) A formal foundation for the treatment of multilingual data.
2) A unified numeric charset model, even in the face of arbitrary,
blind, encoding transformations.
3) A single SGML declaration.
4) A base for a transition to a truly multilingual WWW.
The document character set proposal doesn't buy us a lot at the moment,
except peace of mind, and a reasonable way to handle numeric character
references. What really matters is what it enables...
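
To make point 2 concrete, here is a minimal sketch in Python of how a client might treat numeric character references once ISO 10646 is the document character set. The function names (resolve_numeric_refs, decode_document) and the sample text are my own illustrations, not part of any proposal, and the choice of ISO-8859-5 and UTF-8 as transfer encodings is only an assumption for the demo. The idea is that the bytes on the wire are first mapped into ISO 10646, and only then are references like &#233; resolved, so the number always means the same character.

```python
import re


def resolve_numeric_refs(text: str) -> str:
    """Replace each decimal reference &#N; with the character at
    ISO 10646 code position N (hypothetical helper, decimal form only)."""
    return re.sub(r"&#([0-9]+);", lambda m: chr(int(m.group(1))), text)


def decode_document(raw: bytes, transfer_charset: str) -> str:
    """Map the transferred bytes into the document character set,
    then resolve numeric references against that set."""
    # Step 1: undo whatever encoding the server chose for transfer.
    text = raw.decode(transfer_charset)
    # Step 2: references are interpreted in ISO 10646, never in the
    # transfer encoding, so they survive blind re-encoding.
    return resolve_numeric_refs(text)


# One document: e-acute written as a reference, Cyrillic zhe as plain data.
source = "caf&#233; \u0436"

# The same document sent in two different transfer encodings.
as_8859_5 = source.encode("iso-8859-5")
as_utf8 = source.encode("utf-8")

result_a = decode_document(as_8859_5, "iso-8859-5")
result_b = decode_document(as_utf8, "utf-8")
```

The byte streams differ, yet both decode to the same text. Note that e-acute does not exist in ISO-8859-5 at all; it gets through only because the reference is a number in the document character set rather than a byte value in the transfer encoding.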