Re: The future of meta-indices/libraries

waterbug@epims1.gsfc.nasa.gov (Steve Waterbury)
Errors-To: listmaster@www0.cern.ch
Date: Thu, 17 Mar 1994 20:33:36 --100
Message-id: <9403171928.AA13819@epims1>
Reply-To: waterbug@epims1.gsfc.nasa.gov
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: waterbug@epims1.gsfc.nasa.gov (Steve Waterbury)
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: Re: The future of meta-indices/libraries
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Length: 3073


Stan Letovsky writes:

> How could topic indexes be maintained?  There are sociological and
> technological components to the answer. The sociological answer is
> that every topic has a curator (group), and an associated network
> host....

This sounds like a good concept, although the actual 
implementation will no doubt be determined by the interplay of 
the sociological and technological components!

> The astute reader should now ask, How is this different from simply
> having keywords associated with documents ...?
> The answer is that the crucial flaw with that scheme is that
> there is no mechanism for coordination of keyword assignments. You say
> tomato and I say tomatoe ....  A topic-curator system would allow
> this problem to be partitioned among a responsible community in a
> nonburdensome manner.

"Curators" of sorts already exist in some areas:  standards groups.  
IEC TC3 is creating an international standard dictionary of "data 
elements" used in the description of electronics, for example.  

The important point is that the terms need to be "standardized".  The 
next important point is that context is often critical to a term's 
meaning -- which means that at _least_ an elementary form of 
"semantic model" will be needed -- something like an 
"entity-relationship" model, which gives the terms their proper 
context and lets the relationships between topic-servers in 
different domains be understood well enough to enable cross-domain 
queries (okay, kind of wild, but you know it will happen ...).  
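
To make the point about context concrete, here is a minimal sketch 
in Python of a term registry in which a term only has meaning 
together with its domain.  It is far simpler than a real 
entity-relationship model, and the class name, the domains, the 
terms and the concept identifiers are all invented for illustration; 
none of them comes from IEC TC3 or any other standard.

```python
from collections import defaultdict


class TermRegistry:
    """Maps (domain, term) pairs to shared concept identifiers."""

    def __init__(self):
        self.concept_of = {}               # (domain, term) -> concept id
        self.terms_for = defaultdict(set)  # concept id -> {(domain, term)}

    def define(self, domain, term, concept):
        """Register a term within a domain as naming the given concept."""
        key = (domain, term.lower())
        previous = self.concept_of.get(key)
        if previous is not None:
            # Redefinition: drop the stale reverse mapping first.
            self.terms_for[previous].discard(key)
        self.concept_of[key] = concept
        self.terms_for[concept].add(key)

    def translate(self, domain, term, target_domain):
        """Return the terms in target_domain naming the same concept, sorted."""
        concept = self.concept_of.get((domain, term.lower()))
        if concept is None:
            return []
        return sorted(t for d, t in self.terms_for[concept] if d == target_domain)


if __name__ == "__main__":
    registry = TermRegistry()
    # Invented entries: the same word means different things in two domains.
    registry.define("electronics", "lead", "component-terminal")
    registry.define("materials", "lead", "element-pb")
    registry.define("mechanical", "pin", "component-terminal")
    print(registry.translate("electronics", "lead", "mechanical"))
    print(registry.translate("materials", "lead", "mechanical"))
```

A cross-domain query would then be phrased against the shared 
concept rather than against any one domain's wording.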

Anyway, thanks for sharing that vision, Stan.  Great minds rant 
alike!  

Incidentally, I'm still busily implementing my own pet version:  
using non-HTML SGML tags to identify the data in a document that 
needs to be indexed.  Special "agents" would be told which sites to 
visit, and would pull the info out of documents carrying the tags 
they are looking for.  That info would be brought back to a local 
database and stored, along with the documents' URLs/URNs, as 
indexed attribute data, so that queries could be run locally and 
the relevant docs summoned from wherever they live.  

If anyone is curious, I have put a real Failure Analysis Report 
into the format I have in mind:

http://epims1.gsfc.nasa.gov/fa/fa_82713.html

Check the HTML source for the SGML meta-data tags that would be 
pulled out (with their instance data) by such an indexing agent.  
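
A minimal sketch of what such an indexing agent might do with one 
document, written in Python with only the standard library.  The tag 
names (PARTNUMBER, MANUFACTURER, FAILUREMODE), the table layout, the 
example.org URL and the sample text are all assumptions made up for 
illustration; they are not taken from the report above, and fetching 
the documents from remote sites is left out.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical meta-data tag names; a real document may use different ones.
# HTMLParser reports tag names in lowercase.
INDEXED_TAGS = {"partnumber", "manufacturer", "failuremode"}


class TagHarvester(HTMLParser):
    """Collects the text found inside each tag listed in INDEXED_TAGS."""

    def __init__(self):
        super().__init__()
        self.current = None  # indexed tag currently open, if any
        self.buffer = []     # text pieces seen inside that tag
        self.found = []      # (tag, value) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag in INDEXED_TAGS:
            self.current = tag
            self.buffer = []

    def handle_data(self, data):
        if self.current is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self.current:
            # Collapse runs of whitespace so values compare cleanly.
            value = " ".join("".join(self.buffer).split())
            self.found.append((tag, value))
            self.current = None


def open_index():
    """Create the local database (in memory here) with one attribute table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE attrs (url TEXT, tag TEXT, value TEXT)")
    return db


def index_document(db, url, text):
    """Store every indexed (tag, value) pair from text alongside its URL."""
    harvester = TagHarvester()
    harvester.feed(text)
    harvester.close()
    db.executemany(
        "INSERT INTO attrs (url, tag, value) VALUES (?, ?, ?)",
        [(url, tag, value) for tag, value in harvester.found],
    )
    return len(harvester.found)


def find_documents(db, tag, value):
    """Local query: URLs of documents whose tag holds exactly this value."""
    rows = db.execute(
        "SELECT DISTINCT url FROM attrs WHERE tag = ? AND value = ? ORDER BY url",
        (tag, value),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    # Invented sample document, not the real report.
    sample = (
        "<html><body><h1>Sample report</h1>"
        "<p>Part <PARTNUMBER>XYZ-123</PARTNUMBER> from "
        "<MANUFACTURER>Acme Devices</MANUFACTURER> showed "
        "<FAILUREMODE>open circuit</FAILUREMODE>.</p></body></html>"
    )
    db = open_index()
    index_document(db, "http://example.org/fa/sample.html", sample)
    print(find_documents(db, "failuremode", "open circuit"))
```

Ordinary HTML tags are ignored, so the same document still displays 
normally in a browser while the agent sees only the tagged values.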

This scheme is probably best adapted to engineering/scientific data, 
but might be useful for other forms also. 

BTW, if anyone from Stanford or Lockheed is listening, I would be 
very interested in your thoughts, and whether you have any agent 
software available or adaptable to this.   


Steve Waterbury
WWW Virtual Library:  Engineering.                                               
                                           oo _\o
                                            \/\ \
                                              /
____________________________________________ oo ____________      
"Sometimes you're the windshield; sometimes you're the bug."