Re: Indexing the List of Lists
"Rob Raisch, The Internet Company" <raisch@internet.com>
Errors-To: listmaster@www0.cern.ch
Date: Mon, 21 Mar 1994 21:25:18 --100
Message-id: <Pine.3.85.9403211153.A24484-0100000@hmmm>
Reply-To: raisch@internet.com
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: "Rob Raisch, The Internet Company" <raisch@internet.com>
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: Re: Indexing the List of Lists
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII
Mime-Version: 1.0
Content-Length: 2348
Alan Emtage writes regarding Lists of Lists:
>... I don't believe that manual maintenance of this kind of data
>is feasible any longer.... the Internet is now too big for this kind of
>thing.
Alan, in the general case I believe you are right. But in the specific
case of the information required to identify a "resource" on the net, I
don't think so. Let me 'splain...
The issues of indexing Internet content are vast, and we seem to have a
number of pilot projects which attempt to address them. But is content
indexing all the user REALLY needs? I suggest not.
In my experience, when I am looking for information on agriculture, I am
not looking for 'grain.tar.Z' or for 'Name=Thoughts on Triticale and the
Wheat Borer Beetle.' Rather, I am looking for 'things having to do with
agriculture.'
Indexing content is a very large problem and one I'll freely admit most
likely needs to be completely automated. But, identifying collections of
value -- what I refer to as a 'resource' -- is something which can only
be managed and maintained by the agency in authority over that resource.
On The Electronic Newsstand, Out Magazine represents a 'resource' -- a
collection of value on a given topic, but the articles in the Out archive
are not easily identifiable by their names and pose large problems of
cataloging. The Electronic Newsstand is a resource as well, since it
represents a collection of magazines and their content, but the FAQ about
the Enews is not a resource.
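
To make the distinction concrete, here is a minimal sketch of what a
hand-maintained resource record and a topic lookup might look like. The
field names, the topic labels, the maintainer placeholders, and the
"Example Agriculture Collection" entry are all hypothetical, invented
for illustration; nothing here describes an existing catalog or format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    """A collection of value on a topic, described by its own administrator."""
    name: str
    topics: List[str]
    maintainer: str                 # the agency in authority over the resource
    part_of: Optional[str] = None   # name of an enclosing resource, if any


# Hypothetical catalog. Individual files (an FAQ, 'grain.tar.Z') are
# deliberately absent: they are content to be indexed, not resources.
CATALOG = [
    Resource("The Electronic Newsstand", ["magazines"], "(placeholder)"),
    Resource("Example Agriculture Collection", ["agriculture"], "(placeholder)"),
]


def find_resources(topic: str, catalog: Optional[List[Resource]] = None) -> List[str]:
    """Return the names of resources registered under a topic (case-insensitive)."""
    entries = CATALOG if catalog is None else catalog
    wanted = topic.lower()
    return [r.name for r in entries if wanted in (t.lower() for t in r.topics)]
```

A search for 'agriculture' returns the collection as a whole, while a
search for a file name such as 'grain.tar.Z' returns nothing, since
individual files are left to automated content indexing.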
While individual files and documents number (perhaps) in the millions,
resources of this kind are still in the very low thousands. Now is the
time to put infrastructure in place to handle the load.
Of course, no project works unless there is a reason for the resource
administrator to provide the meta-information necessary. I believe that
there is sufficient motivation to do so, since an effort of this nature
answers the very question we all attempt to answer by putting our
information up for view: How can I get people to use what I provide?
I strongly suggest that a first step in any effort of this kind must be
the definition of exactly what we are trying to collect information on,
because indexing and the generation of a 'table of contents' are very
different problems indeed.
-- </rr> Rob Raisch, The Internet Company