Re: The future of meta-indices/libraries?

Martijn Koster <>
Date: Tue, 15 Mar 1994 21:34:06 --100
Message-id: <>
Precedence: bulk
From: Martijn Koster <>
To: Multiple recipients of list <>
Subject: Re: The future of meta-indices/libraries? 
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Length: 2717

> I think the WWW community should have addressed this long ago.  This
> is the main area in which we are well behind the gopher community.

I think this is one example of what the lack of a Working Group costs us.
It is really easy to discuss problems and come up with solutions,
but even if solutions are proven to work there is no mechanism
for standardising them. As a result the same problems keep arising,
and people keep coming up with the same solutions.

In this case the problem has been addressed by ALIWEB. Have a look at

> In my opinion, one of the most important design criteria should be to
> eliminate the need for indexers (of whom there will likely be many) to
> walk the entire server tree.  This can be annoying and in the worst
> cases disruptive.

I couldn't agree more. This is why I don't welcome the Robot trend,
and hope to help keep an eye on them by gathering information on the
Robot page (

> A second important criterion would be giving the maintainer control
> over what is indexed.

> I would argue for a very simple document ....

ALIWEB does that.

> As a server writer I would implement this by having my server create
> this document on the fly when it is first requested and then cache
> it for later use until it expires.  Subsequent requests would get
> the cached version until its expiration after which a new version 
> would be created and cached.  The maintainer would set the expiration
> period and could mark any part (or all) of his tree as not to be 
> indexed.  The cached file would be extremely useful for features local
> to the server also.  For example, a search of all titles on the server
> or WAIS searches which return a menu of *titles* of hits (this is done
> now by WWWWais, for example, but it must search each document corresponding
> to a hit to extract its title)

I am not sure what you mean here, and I doubt it is going to be sensible
to index all titles on a server and search those, even though it sounds
attractive. You do need to retain the context of the titles.
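
To make the quoted caching scheme concrete, here is a minimal sketch of an
index that is built on first request and reused until a maintainer-set
expiry. It is only an illustration under assumptions: the names CachedIndex
and build_index, the tab-separated "path, title" line format, and the
prefix-based exclusion list are all hypothetical, not taken from any
existing server or from ALIWEB.

```python
import time
from typing import Callable, Dict, List, Optional


class CachedIndex:
    """Builds the index on first request and reuses it until it expires."""

    def __init__(self, build: Callable[[], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._build = build          # function that generates the index text
        self._ttl = ttl_seconds      # expiration period set by the maintainer
        self._clock = clock          # injectable clock, handy for testing
        self._cached: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Regenerate only when nothing is cached yet or the copy has expired.
        if self._cached is None or now >= self._expires_at:
            self._cached = self._build()
            self._expires_at = now + self._ttl
        return self._cached


def build_index(titles: Dict[str, str], excluded: List[str]) -> str:
    """Return one 'path<TAB>title' line per document (hypothetical format),
    skipping any path under a subtree the maintainer marked as excluded."""
    lines = [
        f"{path}\t{title}"
        for path, title in sorted(titles.items())
        if not any(path.startswith(prefix) for prefix in excluded)
    ]
    return "\n".join(lines)
```

An indexer fetching the result of get() reads one small document instead of
walking the whole tree, and anything under an excluded prefix never appears
in it.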

You mention marking part of a tree not to be indexed. Although it is
not quite what you mean, you may find it interesting to learn about a
proposal on the Robots page to introduce a voluntary mechanism for
excluding parts of trees from robots. I agree robots are the wrong solution
to the resource discovery problem, but they are going to be around, and
it makes sense to reduce problems they cause.
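
As a rough illustration of how a well-behaved robot might honour such a
voluntary exclusion, the sketch below assumes the rules are written in the
/robots.txt style that the exclusion proposal later settled on, and uses
Python's standard urllib.robotparser to check them. The rule set, the agent
name "ExampleBot" and the example.org URLs are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical exclusion rules a maintainer might publish for all robots.
RULES = """\
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
"""


def may_fetch(rules: str, agent: str, url: str) -> bool:
    """Return True if the given rules allow this robot to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())   # parse rules from text, no network access
    return parser.can_fetch(agent, url)
```

A robot would call may_fetch() before each request and skip any URL for
which it returns False. Nothing enforces this; the mechanism only works
for robots that choose to cooperate.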

-- Martijn
X-400: C=GB; A= ; P=Nexor; O=Nexor; S=koster; I=M
X-500: c=GB@o=NEXOR Ltd@cn=Martijn Koster