Re: HTML-WG digest 107 - harmful encodings

Daniel W. Connolly (connolly@beach.w3.org)
Tue, 18 Jul 95 22:40:37 EDT

In message <9507182042.AA19662@sqrex.sq.com>, lee@sq.com writes:
>
>How would you handle beta coding for Ancient (Polytonic) Greek?
>
>This is essentially a little language for combining accents, where
>
> *`\a
>is an upper-case alpha with rough breathing and a grave accent.
>This is commonly called an encoding scheme, although it is certainly not an
>encoding vector. Iota subscript is handled, as are little dots under letters
>representing reconstructions or uncertain readings.

[lee: I hope you don't mind my copying the list. I hope this example
will illustrate the mechanism to some other folks, too.]

I'm not sure I understand this encoding. But it seems that there is a
set of characters, and a way to map sequences of octets into sequences
of those characters; i.e. this is a character encoding scheme. So I'd
write:

Content-Type: text/html; charset=x-ancient-greek

<title>example</title>
<p>abc*`\a</p>

The content of the P element above has 4 characters, not 7: the last
four octets, *`\a, together encode a single character.
That is:

x-ancient-greek : SEQ(Octet) -> SEQ(greek-char)

and
x-ancient-greek("abc*`\a") = 'abc?'

where ? is the funky character described above.
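
To make the octets-to-characters mapping concrete, here is a rough
sketch in Python. It is only an illustration: the function name, the
set of combining marks, and the rule that "*" introduces an upper-case
letter are assumptions drawn from the single example in this thread,
not from the actual beta-code rules. Since the sketch assigns no code
points, each character is represented by the octet run that spells it.

```python
# Toy sketch only. Assumed from the one example in this thread:
#   *  = upper-case prefix, ` = rough breathing, \ = grave accent.
MARKS = b"`\\"

def x_ancient_greek(octets: bytes) -> list:
    """Map a sequence of octets to a sequence of characters.

    Each item in the result stands for ONE character, kept here as the
    octet run that spells it.
    """
    chars = []
    i = 0
    while i < len(octets):
        start = i
        if octets[i:i + 1] == b"*":      # upper-case prefix
            i += 1
            while i < len(octets) and octets[i] in MARKS:
                i += 1                    # skip over combining marks
        i += 1                            # the base letter itself
        chars.append(octets[start:i])
    return chars

decoded = x_ancient_greek(b"abc*`\\a")
print(len(b"abc*`\\a"))   # 7 octets in
print(len(decoded))       # 4 characters out
```

The point is just that the length of the element content is counted
after the charset mapping has been applied, not before.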

Dan