Re: The future of meta-indices/libraries?

john@math.nwu.edu (John Franks)
Errors-To: listmaster@www0.cern.ch
Date: Tue, 15 Mar 1994 22:37:58 --100
Message-id: <9403152134.AA13443@hopf.math.nwu.edu>
Reply-To: john@math.nwu.edu
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: john@math.nwu.edu (John Franks)
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: Re: The future of meta-indices/libraries?
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Length: 2108      
Content-Type: text/plain; charset=US-ASCII
Mime-Version: 1.0
X-Mailer: ELM [version 2.4 PL23]
According to Martijn Koster:
> 
> > In my opinion, one of the most important design criteria should be to
> > eliminate the need for indexers (of whom there will likely be many) to
> > walk the entire server tree.  This can be annoying and in the worst
> > cases disruptive.
> 
> I couldn't agree more. This is why I don't welcome the Robot trend,
> and hope to help keep an eye on them by gathering information on the
> Robot page (http://web.nexor.co.uk/mak/doc/robots/robots.html)
> 
> > A second important criterion would be giving the maintainer control
> > over what is indexed.
> 
> > I would argue for a very simple document ....
> 
> ALIWEB does that.
> 

There are many good things about ALIWEB.  However, my impression from
reading the documents referenced above is that the templates must be
human generated.  I am firmly convinced that any scheme which is not
almost completely automated is doomed to fail.  Many maintainers will
simply not create the templates and the ones who do will not keep them
up to date.  I have no doubt that a human writing an ALIWEB form will
do a better job than software, but the unfortunate fact is that most
maintainers will simply not make the effort (often they cannot).

> 
> I'm not sure it is going to be sensible
> to index all titles on a server and search those, even though it sounds
> attractive. You do need to retain the context of the titles.
> 

I think this should be the default.  Of course, the maintainer should
be given as much flexibility as possible in eliminating titles from
the index.  Retaining the context is certainly desirable, but the time
for doing this is when the document is created, not when it is indexed.

The bottom-line choice is between an index of 50 servers with
carefully hand-crafted templates and an index of 5000 servers with
machine-generated templates which are less well constructed but up to
date.  I would certainly opt for the latter.  I would also do everything
possible to encourage maintainers to massage their templates to improve
them.
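
To make the idea of machine-generated entries concrete, here is a
minimal sketch of what a maintainer-side script might look like.  It
walks the local document tree once (so no remote indexer has to),
pulls the title out of each HTML file, and lets the maintainer exclude
path prefixes.  The function names, the `excluded` parameter, and the
"Title:/Path:" record layout are all assumptions made for
illustration; this is not the ALIWEB template format.

```python
import os
import re

# Matches the first <title>...</title> pair, case-insensitively and across lines.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(path):
    """Return the whitespace-normalized title of an HTML file, or None."""
    with open(path, encoding="latin-1") as f:
        match = TITLE_RE.search(f.read())
    if match is None:
        return None
    return " ".join(match.group(1).split())


def build_index(root, excluded=()):
    """Walk root locally and return sorted (relative path, title) pairs.

    excluded is a hypothetical list of path prefixes the maintainer
    wants left out of the index.
    """
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith((".html", ".htm")):
                continue
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            if any(rel.startswith(prefix) for prefix in excluded):
                continue
            title = extract_title(full)
            if title:
                entries.append((rel, title))
    return sorted(entries)


def format_index(entries):
    """Render entries as simple text records (an illustrative layout only)."""
    return "".join(f"Title: {title}\nPath: {path}\n\n" for path, title in entries)
```

Run from cron, something like this would keep the index file current
without any effort from the maintainer, who could still edit the
output by hand or extend the exclusion list.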

John Franks 	Dept of Math. Northwestern University
		john@math.nwu.edu