Re: partial draft: "Character Set" Considered Harmful

Gavin Nicol (gtn@ebt.com)
Mon, 10 Apr 95 11:22:34 EDT

>> Your proposal will make it impossible to use numeric character
>> references without facing the risk of them having different meanings
>> in different browsers,
>
>Only as a result of defects in browsers. Browsers whose implementations
>are consistent with the model will interpret text representations
>consistently. Arguments (other than argument by assertion, as above)
>to the contrary are welcome.

Say I have a document in ISO-8859-1, and another in UNICODE-1-1-UTF-8,
and they both contain &#1213;. How should this be interpreted?
How about if I have a document in ISO-8859-1, another in
UNICODE-1-1-UTF-8, and another in X-JIS0201, and they all contain
&#170;? If I want to change the encoding, what needs to happen? What
needs to happen if I change the coded character set as well as the
encoding? How should we deal with documents using ISO-2022-JP?
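
To make the question concrete, here is a rough sketch in Python of the
two ways a numeric character reference could be resolved. The function
names are hypothetical, and Python's shift_jis codec is used only as a
stand-in for JIS X 0201 in the 0xA1-0xDF range; neither is part of any
proposal. It covers only the simple cases (single-byte sets and UTF-8),
not ISO-2022-JP.

```python
# Hypothetical helpers, for illustration only.

# Assumed mapping from charset= labels to Python codecs. "shift_jis" is a
# stand-in for JIS X 0201, whose katakana range it shares.
SINGLE_BYTE_CODECS = {
    "ISO-8859-1": "latin-1",
    "X-JIS0201": "shift_jis",
}


def resolve_fixed(n):
    """Resolve reference number n against one fixed document character
    set (ISO 10646), whatever encoding the document arrived in."""
    return chr(n)


def resolve_by_charset(n, charset):
    """Resolve reference number n against the character set implied by
    the charset= parameter. Returns None when n has no meaning there."""
    if charset == "UNICODE-1-1-UTF-8":
        # The implied character set is Unicode, so n is a code position.
        return chr(n)
    if n > 0xFF:
        # A single-byte character set has no code this large.
        return None
    try:
        return bytes([n]).decode(SINGLE_BYTE_CODECS[charset])
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    for n in (170, 1213):
        print(n, "fixed:", repr(resolve_fixed(n)))
        for cs in ("ISO-8859-1", "UNICODE-1-1-UTF-8", "X-JIS0201"):
            print(n, cs, repr(resolve_by_charset(n, cs)))
```

With the charset-implied reading, the same reference number can name
different characters in different documents, or nothing at all, and
transcoding a document would mean rewriting every reference in it. With
a fixed document character set the references survive transcoding
unchanged.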

Please explain to us all what will be necessary to build a system that
supports more than 10 coded character sets and encodings using a
document character set implied from the charset= parameter.

>If you don't have time to back it up, don't waste our time with the
>conjecture.

I? Waste *your* time...

So far, the two things you have against my proposal are:

>* your proposal requires all HTML documents to have a document
>character set of ISO 10646. I believe this is gross overspecification.
>
>* you use the same lack of precision in your discussion of characters
>and their encodings as all the other ISO documents that got us
>in the current mess.

The former is opinion; the latter is something I would also agree
with. On the latter, I would just like to say that I did note that it
was a very early draft, and stated that only the concepts should be
considered, not the wording.

Have a nice day.