Re: SGML/MIME charsets, ERCS/Unicode [was: New DTD (final version?) ]

Gavin Nicol (gtn@ebt.com)
Fri, 10 Feb 95 12:13:20 EST

(This message is probably counterproductive but...)

>Finally, regarding character set issues.... they don't belong here.
>HTML should be defined independently of the document character set to
>whatever extent is possible under SGML.

That is what ERCS is all about.

>Under no circumstances will this group ever require that Web clients
>and/or browsers use a specific character set other than ISO-8859-1 --
>making it easy to use other character sets is desirable, but defining
>a lingua franca is absolutely out of the question for this working
>group.

Why? Why choose ISO-8859-1? Why ignore the fact that we live in a
multi-lingual world and bury our heads in the sands of ignorance and
denial?

>The same goes for HTTP -- it should be possible to transmit documents
>in any character set using HTTP. Under no circumstances will the
>http-wg ever require that Web clients and/or browsers use a specific
>character set other than ISO-8859-1.

Why not? Because *you* say so? Do you have the authority, or the
audacity, to make an arbitrary decision which could potentially cripple
the interoperability of the WWW for years to come? To make a decision
that could affect hundreds of thousands of people?

At the very *least* you could offer a nice clean solution.

>The reason ISO-8859-1 is required is because at least one character set
>must be required, and ISO-8859-1 was the most appropriate 8-bit,
>ASCII-inclusive set when the web was invented.

Oh. ASCII is god's gift to mankind, I assume? You should read the
scripture regarding Babel...

>If you want to talk about lingua franca's and
>what-the-parser-should-do and the future of the web, etc., it should
>be done on www-talk. Setting standards for internal browser and
>server implementations is not a job for the IETF.

Oh? And exactly what do you think you are doing? Saying "well folks,
here's a nifty idea, but hey, you don't really have to do this. I
mean, this is all just an idea after all."

>If people are looking for something to fight for regarding Unicode,
>let me suggest that they first get the three (4?) variations of Unicode
>registered with IANA such that I can include their official names in
>the HTTP/1.0 specification. It's damn difficult to provide for
>character set negotiation when there is no single standard for the
>character set name.

In my list there are:
ISO-10646-UCS-2
ISO-10646-UCS-4
ISO-10646-UTF-1
UNICODE-1-1
UNICODE-1-1-UTF-7

Which are really 2 different character sets, and 5 different
encodings.
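To make the character set versus encoding distinction concrete, here is a
minimal sketch in Python. It takes one character and serializes it several
ways. The codec names are Python's, not the IANA names listed above, and they
are only stand-ins: "utf-16-be" matches UCS-2 only for characters in the
Basic Multilingual Plane, "utf-32-be" is assumed here as an equivalent of
big-endian UCS-4, and UTF-1 is left out because Python's standard library
has no codec for it.

```python
# One character (one code point) serialized under several encodings.
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE, code point U+00E9

# Label -> Python codec name (stand-ins, see the note above).
encodings = {
    "ISO-8859-1": "latin-1",
    "UCS-2 (big-endian)": "utf-16-be",   # same bytes as UCS-2 for BMP characters
    "UCS-4 (big-endian)": "utf-32-be",
    "UTF-7": "utf-7",
}

# Encode the same character with each codec.
encoded = {label: ch.encode(codec) for label, codec in encodings.items()}

for label, data in encoded.items():
    # Show the byte sequence in hex next to the encoding label.
    print(f"{label:20} {data.hex()}")
```

The code point is the same in every row; only the byte sequence on the wire
changes, which is why a negotiation label has to name the encoding as well.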

May I suggest that you go back to your isolationist world, study a
little about character sets, encodings, multilingual issues, SGML, and
then decide whether to come back and play in the global sandbox, or
whether you should just bury your head deeper in the sand alluded to
earlier.

Your attitude is inexcusable, irresponsible, and verges very close to
incorrigible.

---
Gavin "Easily angered by bigots at 2am" Nicol
NOT speaking for EBT!