new ? resource discovery idea

Johnson M J <zgee9119@qmw.ac.uk>
Via: uk.ac.qmw.omega; Wed, 1 Dec 1993 14:13:14 +0000
Date: Wed, 1 Dec 1993 14:09:33 +0000 (GMT)
From: Johnson M J <zgee9119@qmw.ac.uk>
Subject: new ? resource discovery idea
To: www-talk@nxoc01.cern.ch
Message-id: <Pine.3.03.9312011433.E15755-c100000@shark>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
I am working on a Resource Discovery tool for the internet as
the final year project of my degree.  The aim of my work is to
develop a system that unites browsing and searching, and
integrates human and automatic indexing/organising effort. 
 
The basic approach is to organise URLs (Universal Resource
Locators) into sets by subject.  A particular URL may be in many
subjects.  Searches can be performed using the usual set
operations, and the selected resources are then browsed, i.e. the
same strategy as WAIS.  Subjects that have URLs in common are
considered to be related, hence the subject space can be viewed
as a graph.  The nodes are the subjects, and the links are
common resources.  This opens up the possibility of browsing:
users can just browse from general entry points, a la WWW and
gopher.  Browsing techniques can be used to widen search results
if the required resource is not found.  The dual view should
allow existing systems (WAIS, WWW, gopher and, hopefully,
TopNode) to map easily to the subject space.
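
To make the set/graph dual view above concrete, here is a minimal
Python sketch.  It is only an illustration under assumptions: the
subject names, the example.org URLs and the function names
(search_all, search_any, subject_graph, neighbours) are all
hypothetical and not part of any existing tool.

```python
from itertools import combinations

# Hypothetical subject space: each subject is a plain set of URLs.
subjects = {
    "physics": {"http://example.org/hep", "http://example.org/cern"},
    "computing": {"http://example.org/cern", "http://example.org/www"},
    "music": {"http://example.org/midi"},
}

def search_all(space, *names):
    """URLs filed under every named subject (set intersection)."""
    return set.intersection(*(space[n] for n in names))

def search_any(space, *names):
    """URLs filed under at least one named subject (set union)."""
    return set.union(*(space[n] for n in names))

def subject_graph(space):
    """Graph view: link two subjects when they share at least one URL.

    Returns {(subject_a, subject_b): shared_urls}, with each pair of
    subject names in alphabetical order.
    """
    links = {}
    for a, b in combinations(sorted(space), 2):
        shared = space[a] & space[b]
        if shared:
            links[(a, b)] = shared
    return links

def neighbours(space, name):
    """Subjects linked to `name`, e.g. for widening a failed search."""
    return {b if a == name else a
            for (a, b) in subject_graph(space) if name in (a, b)}
```

With this data, search_all(subjects, "physics", "computing") gives
the single shared URL, and neighbours(subjects, "physics") gives
{"computing"}, which is the browsing step out from a search result.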
 
 
Human resource organisation effort comes from individuals
organising the resources and resource sources that they find
useful.  Users of NCSA Mosaic use a simple list (the Hotlist)
to remember useful resources.  I intend to build a simple
extension to XMosaic to replace the hotlist, based on the
set/graph subject space.  Users can consider the tool as set or
graph based as they wish, and create their own subjects.  By
joining individual subject spaces to a global subject space, the
resources are discovered and organised by the users, without any
extra effort.  Once a global subject space is populated with
resources, it can be used by individuals to discover resources
by searching and browsing it.  Information about how a global
subject space is used, e.g. how often a user follows a particular
link or the most/least common paths across the graph to find a
particular resource, may be used for automatic reorganisation of
the subject space.  Automatic indexing can be performed by using
a tool such as Essence (Resource Discovery at the University of
Colorado, Schwartz), which uses file semantics to generate
compact keyword lists. 
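
The usage information mentioned above (how often a link between
two subjects is followed) could be collected roughly as in the
Python sketch below.  The class name UsageLog, its methods and
the idea of using each link's share of all traversals as a weight
are assumptions made for illustration; how the counts would
actually drive reorganisation is left open.

```python
from collections import Counter

class UsageLog:
    """Counts how often each link between two subjects is followed."""

    def __init__(self):
        self.traversals = Counter()

    def record_path(self, path):
        """Record one user's walk through the subject graph.

        `path` is the list of subjects visited, in order.  Links are
        treated as undirected, so (a, b) and (b, a) count together.
        """
        for a, b in zip(path, path[1:]):
            self.traversals[tuple(sorted((a, b)))] += 1

    def link_weights(self):
        """Each link's share of all recorded traversals (assumed weighting)."""
        total = sum(self.traversals.values())
        if total == 0:
            return {}
        return {link: n / total for link, n in self.traversals.items()}
```

Links with a very low share might be candidates for removal, and
heavily used paths candidates for a direct link, but that policy
is not specified here.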
 
Currently I am building a subject space tool.  Next I will
investigate the properties of the subject space: various ways
to weight the links, how several boolean subject spaces (the
individual users) can be merged into fuzzy subject spaces (the
global), searching using fuzzy information retrieval techniques,
and GUIs to the subject space.  I aim to build the extension to
XMosaic and a subject space server (probably speaking HTTP).
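
One possible reading of merging boolean subject spaces into a
fuzzy one is sketched below in Python.  The merge rule is an
assumption chosen for illustration: a URL's degree of membership
in a subject is the fraction of users who filed it there.  The
fuzzy AND/OR use the standard min/max operators.  The users, URLs
(u1, u2, u3) and function names are hypothetical.

```python
from collections import defaultdict

def merge_spaces(user_spaces):
    """Merge boolean per-user subject spaces into one fuzzy space.

    Assumed rule: membership = (users who put the URL in the subject)
    divided by (total number of users).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for space in user_spaces:
        for subject, urls in space.items():
            for url in urls:
                counts[subject][url] += 1
    n = len(user_spaces)
    return {s: {u: c / n for u, c in urls.items()}
            for s, urls in counts.items()}

def fuzzy_and(space, a, b):
    """Fuzzy intersection of two subjects: minimum membership."""
    sa, sb = space.get(a, {}), space.get(b, {})
    return {u: min(sa[u], sb[u]) for u in sa.keys() & sb.keys()}

def fuzzy_or(space, a, b):
    """Fuzzy union of two subjects: maximum membership."""
    sa, sb = space.get(a, {}), space.get(b, {})
    return {u: max(sa.get(u, 0.0), sb.get(u, 0.0))
            for u in sa.keys() | sb.keys()}

# Two hypothetical users with ordinary (boolean) subject spaces.
alice = {"physics": {"u1", "u2"}, "computing": {"u2"}}
bob = {"physics": {"u1"}, "computing": {"u2", "u3"}}
merged = merge_spaces([alice, bob])
```

Here merged["physics"] is {"u1": 1.0, "u2": 0.5}: both users agree
on u1, only one filed u2 under physics.  Search results can then
be ranked by membership rather than being simply in or out.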
 
Do you have any comments?  Do you think my ideas are sensible?
Do you know of anything that could help me?  Any and all
correspondence will be gratefully received.
 
 
Thank you
 
Mark Johnson
Queen Mary and Westfield College
University of London
UK