Re: ISO charsets; Unicode

Chris Lilley, Computer Graphics Unit (lilley@v5.cgu.mcc.ac.uk)
Mon, 26 Sep 1994 18:23:59 +0100

In message <9409261441.AA12654@midway.uchicago.edu> Richard L. Goerwitz said:

> The project head would be happy to plug it into the Web,
> but again the Web only knows ASCII.

Not so: the Web knows only ISO 8859-1. ASCII is a subset of that, so if you send
it ASCII it will work, but the two are not the same thing.
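
To make the subset relationship concrete, here is a minimal sketch in Python. The
helper name is_ascii is made up for illustration; the codec names "ascii" and
"iso-8859-1" are the standard Python ones. It shows that every ASCII byte reads
the same under ISO 8859-1, while a byte such as 0xE1 is valid ISO 8859-1 but not
ASCII.

```python
# Every ASCII byte (0x00-0x7F) decodes to the same character under ISO 8859-1.
ascii_bytes = bytes(range(128))
same_under_both = ascii_bytes.decode("ascii") == ascii_bytes.decode("iso-8859-1")

# The reverse does not hold: 0xE1 is "a acute" in ISO 8859-1 but is not ASCII.
a_acute = b"\xe1"
latin1_text = a_acute.decode("iso-8859-1")


def is_ascii(data: bytes) -> bool:
    """Hypothetical helper: True if every byte in data is a 7-bit ASCII byte."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False
```

So is_ascii(b"hello") holds and is_ascii(a_acute) does not, even though both
byte strings are perfectly good ISO 8859-1.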

I agree with much of the posting, but:

> Then, of course, there's the giant database project called ARTFL, which
> essentially attempts to make the entire French literary corpus available
> online. It's already here, and tied to the Web. But they have no
> standard specs for how to allow users to input things as simple as an
> acute accent over an "a".

I suggest you check this. ISO 8859-1 covers most western European languages and
should certainly handle French. "A acute" is doable already, and has been since
the Web started. See, for example:

<http://info.mcc.ac.uk/CGU/staff/lilley/charset.html>
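
As a rough illustration of "doable", the sketch below resolves HTML character
entities such as &aacute; and encodes the result as ISO 8859-1. The function name
to_latin1_bytes is invented for this example, and html.unescape is the Python
standard-library call, which follows current HTML entity rules rather than
whatever a given browser implements, so treat it as an approximation.

```python
import html


def to_latin1_bytes(markup: str) -> bytes:
    """Hypothetical helper: resolve character entities, then encode as ISO 8859-1."""
    return html.unescape(markup).encode("iso-8859-1")


lower = to_latin1_bytes("&aacute;")            # lower-case a with acute accent
upper = to_latin1_bytes("&Aacute;")            # upper-case A with acute accent
phrase = to_latin1_bytes("d&eacute;j&agrave; vu")  # a short French phrase
```

Each accented letter ends up as a single byte in the 0xA0-0xFF range, which is
the part of ISO 8859-1 that ASCII lacks.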

> They have an extremely competent staff to
> work on such problems - but I wonder: Should this _be_ a problem?

Not in this particular instance, no. In the general case (Aramaic etc.), yes, it
is currently a problem. There has been some discussion of this on the list
before: I seem to remember that we learned that SGML does not have the expressive
power to say that this here paragraph is in ISO 8859-9 or Shift-JIS or whatever.
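
To show why such a label matters, here is a small sketch (Python again, using
its standard codec names "iso-8859-9", "iso-8859-1" and "shift_jis"; the sample
strings are my own choice). The same bytes come out as different text depending
on which charset the reader assumes, and nothing in the bytes themselves says
which one was meant.

```python
# One byte, two readings: 0xFD is dotless i in ISO 8859-9, y acute in ISO 8859-1.
single = b"\xfd"
as_turkish = single.decode("iso-8859-9")
as_western = single.decode("iso-8859-1")

# Two Japanese characters encoded as Shift-JIS occupy four bytes.
japanese = "\u65e5\u672c"  # the word for "Japan"
sjis_bytes = japanese.encode("shift_jis")

# Read with the right charset the text survives; read as ISO 8859-1 it turns
# into four unrelated Latin-1 characters, with no error to warn the reader.
round_trip = sjis_bytes.decode("shift_jis")
misread = sjis_bytes.decode("iso-8859-1")
```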

--
Chris