Re: SGML/MIME charsets, ERCS/Unicode [was: New DTD (final version?) ]

Roy T. Fielding (fielding@monge.ICS.UCI.EDU)
Wed, 15 Feb 95 22:51:48 EST

Albert writes:

> The followups to prior discussion on the two lists didn't lead me to expect
> a response as strongly worded as Roy Fielding's comments. On a second
> reading, I noted that Roy said:
> "Under no circumstances will the http-wg ever require that Web clients
> and/or browsers use a specific character set other than ISO-8859-1."
> The word "require" may be important here. Gavin can speak for himself; I'm
> not so much trying to _require_ the use of Unicode as to ask questions to
> explore what's needed to _allow_ the use of Unicode for multi-lingual
> documents. I am looking for simple, "non-violent" changes to the spec.

Yes, "require" was the operative word, as was the notion of a "specific
character set". Both HTTP (within transported entities) and HTML should
be capable of supporting any character set (provided that the receiver
can know what it is getting and is capable of recognizing the markup within
that character set). We should endeavor to make the products of this WG
charset- and language-friendly. At the same time, we need to keep in mind
that discussion on this WG mailing list should be limited to things that
the WG is capable of resolving.
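
As an illustration of the "receiver can know what it is getting" condition,
here is a minimal sketch in Python. It assumes the sender labels the entity
with a charset parameter on its Content-Type header and that an unlabeled
entity is treated as ISO-8859-1. The function names charset_of and
decode_entity are hypothetical and are not taken from any specification.

```python
from email.message import Message


def charset_of(content_type: str, default: str = "iso-8859-1") -> str:
    """Return the charset named in a Content-Type value, or the default.

    The default of ISO-8859-1 is an assumption made for this sketch.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lowercases the parameter value and returns
    # the fallback unchanged when no charset parameter is present.
    return msg.get_content_charset(default)


def decode_entity(body: bytes, content_type: str) -> str:
    """Decode an entity body using the charset the sender declared."""
    return body.decode(charset_of(content_type))
```

A receiver that cannot decode the declared charset would fail at the
decode step, which is the case the parenthetical above rules out.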

My words were not directed at you (or anyone else in particular), but
rather at the group in general and the type of discussion that is
appropriate. We have many problems to solve, and many tasks to complete,
so the discussion must be directed toward those tasks. Gary did a
much better job of saying that in his message.

If there is a specific aspect of the 2.0 specification that needs to be
fixed to allow for current practice, then it should be identified and
the specific fix proposed. The only thing that is important regarding
2.0 is confirming these two statements:

a) The specification accurately describes current (June 1994) HTML as
a legitimate application of SGML.

b) No current (June 1994) practice which can be described in SGML
is made illegal by the specification.

For 2.1, we can ask additional questions along the lines of "what small
changes can be made to broaden the application of HTML along the lines
already being implemented?"

For 3.0, the changes can be more fundamental. In particular, we can assume
that a 3.0 application can parse full SGML (even if it just throws away
any element it doesn't understand). Thus, solutions involving <? thingies>
and <!whatsits> can be discussed without the fear of parser meltdown.

BUT, all of these discussions need to be grounded in something specific
in the specifications, because the purpose of the WG is to produce the
specifications, not invent a new system. Advocacy, where it is needed,
should take place on the newsgroups and general WWW mailing lists.
Technical solutions to the world's problems should be discussed, designed,
and implemented external to this list (even though most of the people
doing that work will be present on this list). Only once a solution has
been found, and presented as a written document, does it become something
concrete enough to sustain a rational discussion on an IETF WG mailing list.

One final note: I do not set the rules for this (or any) WG.

I am not even the Chair of this WG (Eric and TimBL are), so to a certain
extent I am speaking out of place. However, I have enough experience
with these things to recognize when a discussion has gone off the deep end
and is not making any progress -- worse, such discussions often hinder
progress because they consume the time of people like me who are supposed
to be completing the specifications.

This is not an unusual event. I've made the same mistake several times,
and sometimes the Chair of that WG has had to remind me of the priorities.

...Roy T. Fielding
Department of Information & Computer Science tel:+1(714)824-4049
University of California, Irvine, CA 92717-3425 fax:+1(714)824-4056