Re: The future of meta-indices/libraries

Joseph Wang <joe@MIT.EDU>
Errors-To: listmaster@www0.cern.ch
Date: Wed, 16 Mar 1994 04:27:30 --100
Message-id: <9403160323.AA10948@theodore-sturgeon.MIT.EDU>
Reply-To: joe@MIT.EDU
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: Joseph Wang <joe@MIT.EDU>
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: Re: The future of meta-indices/libraries
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Length: 2079

(This message was sent to www-talk)

I second Rob Raisch's call for a consortium to fix the resource
indexing problem.

I'd be willing to contribute whatever the GNA meta-library has to an
effort at solving the resource discovery problem.  It seems that there
are about a dozen different meta-libraries out there, each with its
strengths and weaknesses.

The strengths of the GNA meta-library are:

1. It has a very sophisticated search engine that can do keyword searches
   on authors, titles, and keywords.
2. Its entries are indexed in a hierarchical topic structure rather than
   by keyword only.  This makes it possible to do searches of
   resources in a general area.
3. (And IMHO most important) it has a concept called "coverage code" or 
   resource breadth.  For example, if you want to do a search of resources
   on "Korea," chances are that you would prefer to scan an entire WWW site
   devoted to Korea rather than a single article on the subject.  With
   Archie-like searches, there is no way you can tell if the resource you
   have found is an entire library or if it is merely a paragraph in an
   article.
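
To make points 2 and 3 concrete, here is a rough sketch of a topic
search that returns the broadest resources first.  This is not the
actual GNA code: the Entry record, the search_topic function, and the
three-level coverage scale are all made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical coverage scale; the real GNA coverage codes are not
# spelled out in this message.
COVERAGE_RANK = {"paragraph": 0, "article": 1, "site": 2}


@dataclass
class Entry:
    title: str
    topic: tuple   # hierarchical topic path, e.g. ("Asia", "Korea")
    coverage: str  # one of the keys of COVERAGE_RANK


def search_topic(entries, topic_prefix):
    """Return entries filed at or below topic_prefix, broadest first."""
    prefix = tuple(topic_prefix)
    hits = [e for e in entries if e.topic[:len(prefix)] == prefix]
    return sorted(hits, key=lambda e: COVERAGE_RANK[e.coverage], reverse=True)


entries = [
    Entry("Note on Seoul weather", ("Asia", "Korea"), "paragraph"),
    Entry("Korea WWW site", ("Asia", "Korea"), "site"),
    Entry("Article on Korean history", ("Asia", "Korea", "History"), "article"),
    Entry("Japan WWW site", ("Asia", "Japan"), "site"),
]

results = search_topic(entries, ("Asia", "Korea"))
for e in results:
    print(e.coverage, "-", e.title)
```

Matching on a prefix of the topic path is what lets a search on a
general area pick up everything filed beneath it, and sorting on the
coverage code puts whole sites ahead of single articles.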

The disadvantages of the meta-library approach are:

1. The sophisticated search engine (i.e. the postgres database) is extremely
   fragile and breaks down about once a week.
2. Because meta-library entries contain coverage code and resource breadth
   information, they must be entered by hand and are therefore EXTREMELY
   out of date (i.e. most entries are from October).

What would be interesting would be to try to combine the ALIWEB and
meta-library approaches.  The ALIWEB index format could be amended to
include a hierarchical topic index and a coverage code.  It would then
be easy to write a PERL robot that would take the indexes and enter
them into the meta-library search engine.
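
As a rough sketch of what such a robot would have to do (in Python
rather than Perl, just for illustration), assume the index file is a
series of blank-line-separated "Field: value" records in the style
ALIWEB uses.  The Topic-Path and Coverage-Code fields are the proposed
additions and do not exist in the current format; their names, the
parse_records and to_entry functions, and the sample data are all
hypothetical.

```python
def parse_records(text):
    """Split blank-line-separated 'Field: value' records into dicts."""
    records, current, last = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current record.
            if current:
                records.append(current)
            current, last = {}, None
        elif line[0].isspace() and last:
            # An indented line continues the previous field.
            current[last] += " " + line.strip()
        elif ":" in line:
            name, value = line.split(":", 1)
            last = name.strip()
            current[last] = value.strip()
    if current:
        records.append(current)
    return records


def to_entry(record):
    """Map one parsed record onto a (hypothetical) meta-library entry."""
    keywords = record.get("Keywords", "").split(",")
    topic = record.get("Topic-Path", "").split("/")
    return {
        "title": record.get("Title", ""),
        "uri": record.get("URI", ""),
        "keywords": [k.strip() for k in keywords if k.strip()],
        "topic": tuple(p.strip() for p in topic if p.strip()),
        "coverage": record.get("Coverage-Code", "unknown"),
    }


SAMPLE = """\
Template-Type: DOCUMENT
Title: Korea WWW site
URI: /korea/index.html
Keywords: Korea, history, culture
Topic-Path: Asia/Korea
Coverage-Code: site

Template-Type: DOCUMENT
Title: Article on Korean history
URI: /korea/history.html
Keywords: Korea, history
Topic-Path: Asia/Korea/History
Coverage-Code: article
"""

entries = [to_entry(r) for r in parse_records(SAMPLE)]
for entry in entries:
    print(entry["coverage"], "/".join(entry["topic"]), entry["title"])
```

Fetching the index files from each site and loading the resulting
entries into the database are left out; the point is only that the two
extra fields would ride along with the existing ones at very little
cost.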

Any thoughts?

Incidentally, I can get accounts on the meta-library machine for
anyone who wants to contribute to the global indexing initiative, and
would be happy to donate the meta-library database system and contents
for such an initiative.