Re: language hints in code stream vs. SGML markup

Peter Flynn (pflynn@curia.ucc.ie)
Fri, 20 Jan 95 15:28:59 EST

At 1:30 PM 1/20/95, Larry Masinter wrote:
>Language doesn't shift on a character by character basis. It does
>shift on a section-by-section basis if it shifts at all. It doesn't
>make sense to support <german>multi</><french>ling</><english>ual</>
>words on a character-by-character basis.

Albert added:
>It does shift on a word by word and phrase by phrase basis, in multilingual
>docs, i.e. English, Hebrew, and Greek in one paragraph.
>
>We might also want to consider inter-relations between language changes,
>font changes and character code changes. If we can find a common mechanism
>for all it might be nice.

Read the recommendations of the TEI. £50/copy (2 vols, 1300pp total)
Mail lou@vax.ox.ac.uk for details (or Mike Sperberg-McQueen at UICVM,
whose email address I can never remember).

One of my requirements for the Thesaurus Linguarum Hiberniae will be
to monitor implementation of SGML encoding of linguistic analyses.
These may very well need to tag letters within a word, to mark (for
example) <foreign lang="greek">tele</><foreign lang="latin">vision</>.
Eventually I want to find a way to map as much of the TEI markup as
possible into whatever successor to HTML3 we have by then, for display
on the Web, much as I have tried to do on an experimental basis at
http://www.ucc.ie/curia/texts/oengus.html
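
To make the "television" case concrete, here is a rough Python sketch of
how sub-word language tagging might be pulled apart into (language, text)
runs. It is only an illustration under stated assumptions: it knows about
nothing except the <foreign lang="..."> element and the short </> end tag
used in the example above, it does not handle nesting, and it is in no
sense a real SGML parser or anything the TEI prescribes. The names
language_spans, plain_text and the default_lang parameter are invented
for this sketch.

```python
import re

# Assumption: only <foreign lang="...">text</> (or </foreign>) is recognised.
# This regex is a toy stand-in for an SGML parser and ignores nesting.
SPAN = re.compile(r'<foreign\s+lang="([^"]+)">(.*?)</(?:foreign)?>', re.S)


def language_spans(marked_up, default_lang="english"):
    """Split marked-up text into a list of (language, text) runs.

    Text outside any <foreign> element is labelled with default_lang,
    which is a hypothetical parameter introduced for this sketch.
    """
    spans = []
    pos = 0
    for match in SPAN.finditer(marked_up):
        if match.start() > pos:
            # Untagged text between two tagged runs.
            spans.append((default_lang, marked_up[pos:match.start()]))
        spans.append((match.group(1), match.group(2)))
        pos = match.end()
    if pos < len(marked_up):
        # Trailing untagged text after the last tagged run.
        spans.append((default_lang, marked_up[pos:]))
    return spans


def plain_text(spans):
    """Rejoin the runs into the undecorated word or sentence."""
    return "".join(text for _, text in spans)


word = '<foreign lang="greek">tele</><foreign lang="latin">vision</>'
runs = language_spans(word)
print(runs)
print(plain_text(runs))
```

The point of keeping the runs as a flat list is that a display mapping
(say, to some HTML successor) can walk them in order and decide per run
what to do with the language label, without caring whether the boundary
falls between words or inside one.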

Unicode for Ogham, anyone? :-)

///Peter