Re: New draft: charset, conformance cleanup

Francois Yergeau (yergeau@alis.ca)
Tue, 4 Apr 95 17:32:41 EDT

>Date: Tue, 4 Apr 95 12:19:01 EDT
>From: Gavin Nicol <gtn@ebt.com>
>
>Well, I live in Japan, so I think I have an idea of what goes on
>here. It seems to me that for the most part, people are hacking away,
>getting things to run, *with no knowledge of SGML at all*.

Isn't that a good reason to tell people to stay as close to SGML as
possible, instead of telling them that what they're doing is illegal,
so that they have a choice between not doing it (don't hold your
breath) or doing whatever they like?

>How does Mosaic-L10N resolve
>numeric character references for example? Try
>
> &#63;&#64;&#65;&#66;
>
>in data containing EUC (JIS X NNNN document character set) and in plain
>ASCII data. You get the same thing.

Seems like correct behaviour to me. In EUC-JP, those are ASCII
characters and nothing else, so you *should* get ASCII when using
these.
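
A minimal Python sketch of the point above, assuming each numeric
character reference is resolved as a single code position in the
document character set. The function name resolve_ncrs is hypothetical,
and Python's "euc_jp" codec stands in for the browser's own tables; this
is not how Mosaic-L10N is actually implemented.

```python
import re

# Matches decimal numeric character references such as &#63;
NCR = re.compile(r"&#(\d+);")

def resolve_ncrs(text, charset):
    """Replace each &#N; with the character at code position N in `charset`.

    Hypothetical helper: only single-byte code positions (0-255) are
    handled; anything else is left untouched.
    """
    def repl(match):
        n = int(match.group(1))
        if n > 255:
            return match.group(0)
        try:
            return bytes([n]).decode(charset)
        except UnicodeDecodeError:
            # Not a complete character on its own in this charset.
            return match.group(0)
    return NCR.sub(repl, text)

sample = "&#63;&#64;&#65;&#66;"
ascii_result = resolve_ncrs(sample, "ascii")
euc_result = resolve_ncrs(sample, "euc_jp")
print(ascii_result, euc_result)
```

Both calls give the same four characters, since code positions 63 to 66
are the ASCII characters in either character set.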

> [About Mosaic-L10N and other non-Latin-1 browsers]
>
>As I noted above, they do not exhibit correct SGML behaviour,

See above.

>unless
>one also says that in the process of altering the SGML declaration,
>they are also converting numeric character references.

As I see it, you don't need to convert the NCRs if you adjust the SGML
decl to match the actual document character set. You would need to
convert them only if you translated the document itself, say from
IBM850 to Latin-1, to match the SGML declaration.
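
A rough Python sketch of that second case, assuming a document whose
NCRs were written against IBM850 is being transcoded to Latin-1. The
helper name convert_ncrs is hypothetical, and Python's "cp850" and
"latin-1" codecs stand in for the two character sets.

```python
import re

# Matches decimal numeric character references such as &#130;
NCR = re.compile(r"&#(\d+);")

def convert_ncrs(text, src, dst):
    """Renumber single-byte NCRs from charset `src` to charset `dst`.

    Hypothetical helper: references that have no single-byte
    equivalent in `dst` are left unchanged.
    """
    def repl(match):
        n = int(match.group(1))
        if n > 255:
            return match.group(0)
        try:
            char = bytes([n]).decode(src)
            target = char.encode(dst)
        except (UnicodeDecodeError, UnicodeEncodeError):
            return match.group(0)
        if len(target) != 1:
            return match.group(0)
        return "&#%d;" % target[0]
    return NCR.sub(repl, text)

# e-acute is code position 130 in IBM850 and 233 in Latin-1.
converted = convert_ncrs("caf&#130;", "cp850", "latin-1")
print(converted)
```

If the SGML declaration is adjusted to say IBM850 instead, the
reference can stay as &#130; and no such pass is needed.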

>been fighting the same battle since last year!). However, we need to
>be *very* careful with the wording so that we do not *commit* to any
>solution at the moment, and that we do not give open license to
>implementors until we *do* have a solution.

In the absence of a complete solution, telling them not to do it
amounts to giving open license. The avowed purpose of this document
(HTML 2.0 draft) is to describe "current" (mid-94) usage, cast in SGML
language. As I see it, what is done is equivalent to a minimal
modification of the SGML decl, just enough to match the charset, so
that is what should appear in the 2.0 draft.

-- 
François Yergeau <yergeau@alis.ca>