Errors-To: listmaster@www0.cern.ch
Date: Thu, 17 Mar 1994 00:17:52 --100
Message-id: <9403170716.AA03168@jupiter.qub.ac.uk>
Reply-To: gavin@jupiter.qub.ac.uk
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: gavin@jupiter.qub.ac.uk (Gavin Bell)
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: Re: The future... etc
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Length: 4612

Hello,
        I'm newish to the list, but this area is exactly what I'm
starting to research.  I agree with Rob Raisch's point that unless you
know that the information you seek is out there, you will at best have
difficulty finding it.  My supervisor and I were discussing this
yesterday.  The latter point, that a skilled researcher does not just
do a simple keyword search for information but a refined search which
can be iterative, is also important.
A study was done in the Computer Science department here a few years
ago showing that an initial keyword search found approximately 25% of
the books actually present, a later refined search found approximately
40%, and finally a librarian asked to do a comprehensive search of the
subject area found, I think, 60%.  The number of books had been counted
by hand beforehand, so even a highly trained researcher using a
familiar system was unable to find all the resources.  There was a
similar study in the 70s by a guy called Miller; sorry, I can't
remember the reference.  I include this as an example of the problems
we face even if we use a multiple-keyword indexing approach.  Of course
this relies on the ability of the author to describe his material in a
suitable manner.  At the end of this message I'll put in my keywords
for this text; I'm sure there will be disagreement on them.

        It has already been pointed out that multiple subject-based
catalogues are the best way to move forward on this.  I agree, but
would like to make a few comments.  I work at an interface of
knowledge, having done a degree in Psychology & Computer Science, so
finding information relevant to me involves looking in a multitude of
sources, often returning the same information.  Many other people work
at similar interfaces; in fact nobody is really interested in just one
subject domain, so we will need to be aware that people will be
searching from any domain for information.
An example:
        A philosophy/arts student wants information on medical
genetics, specifically sex selection.  He/she knows little about the
specifics of the subject domain, so will not know the correct keywords
to search under.  What is the best way for him/her to start?  A search
for (ethics & sex) or (genetics & ethics)?  I have actually had to do
this search for a friend, so this is a real-world example.  I tried it
on the BIDS system, which is an ISI citation index.  Eventually I got
back articles via a search for ethics + gene as a stem.  This example
shows that people from different backgrounds need to search in
different domains.  What is needed in this case is a means of guiding
the search: a system, like NetNews, where you can educate the search
engine, maybe by a "like that / not like that" mechanism.  Maybe that
is a bad example, but I hope you get what I mean.
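
To make that "like that / not like that" idea concrete, here is a
minimal sketch in Python (all document names, keywords and weighting
factors below are invented): score documents by weighted keyword
overlap, then boost the keywords of results the user likes and damp
those of results he/she rejects.

    # Hypothetical keyword index: document name -> set of keywords.
    documents = {
        "doc1": {"ethics", "genetics", "sex", "selection"},
        "doc2": {"philosophy", "ethics"},
        "doc3": {"genetics", "medicine"},
    }

    def search(query, weights):
        # Score each document by the summed weight of matching keywords.
        scored = []
        for name, keywords in documents.items():
            score = sum(weights.get(k, 1.0) for k in query & keywords)
            if score > 0:
                scored.append((score, name))
        return sorted(scored, reverse=True)

    def feedback(weights, doc, liked):
        # "Like that" boosts a result's keywords; "not like that" damps them.
        for k in documents[doc]:
            weights[k] = weights.get(k, 1.0) * (1.5 if liked else 0.5)

    weights = {}
    print(search({"ethics", "genetics"}, weights))
    feedback(weights, "doc2", liked=False)   # not like that
    print(search({"ethics", "genetics"}, weights))

After the negative feedback, doc2's score drops below doc3's, so the
ranking drifts towards the medical-genetics material without the
student ever knowing the "correct" keywords.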

        To visualise the information held at each site, in addition to
subject-based indices, why not use an ISMAP construct as an alternate
home page?  You could have a normal plain HTML home page with a link to
a graph of the information content of the server.  This graph could be
created automatically from the information index held at each site.  I
envisage each server having its own index; Martijn's .idx file seems a
good idea, and all we need is a common format for the data.  There is
information on directed-graph creation at Trinity College Dublin under
http://www.dsg.cs.tcd.ie:1969/afc_draft.html
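
I don't know the exact layout of Martijn's .idx files, so purely as a
sketch, the common format could be a few labelled fields per document,
something like this (field names and values hypothetical):

    Title:       Installing HyperWorld
    URI:         /hyperworld/docs/install.html
    Keywords:    hypermedia, installation, WWW
    Description: How to install the HyperWorld browser.

A server-side script could walk the document tree and emit one such
record per file, and the graph page would be generated from the
records.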

        What I am thinking of is a system whereby you can graphically
browse the net on a subject basis.  This would allow you to visualise
the information on a non-geographical level.  The client software would
let you choose what you wanted presented, instead of confronting you
with a thousand different subjects.  You could eventually create your
own personal view of the net, rather than having whatever is there
presented to you.  Graphical means on their own, I realise, are not
sufficient, but graphical means to visualise and textual means to
specify seems a good compromise.
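
As a rough illustration in Python of how a client might turn such
index records into a subject graph (the entries below are invented),
link any two documents that share a keyword, then filter the view to
the subjects the user has chosen:

    # Hypothetical index records: (document, keywords).
    entries = [
        ("install.html", {"hypermedia", "installation"}),
        ("about.html",   {"hypermedia", "overview"}),
        ("ethics.html",  {"genetics", "ethics"}),
    ]

    # Link any two documents sharing a keyword; the adjacency list is
    # what a graphical browser (or an ISMAP generator) would lay out.
    graph = {name: set() for name, _ in entries}
    for i, (a, ka) in enumerate(entries):
        for b, kb in entries[i + 1:]:
            if ka & kb:
                graph[a].add(b)
                graph[b].add(a)

    # A personal view: show only documents on the chosen subjects.
    chosen = {"hypermedia"}
    view = [name for name, keywords in entries if keywords & chosen]
    print(graph, view)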

        A final point: two-way linking, locally at least, would be a
good idea.  It took me 10 minutes to find that href above, as it is at
the end of a link.  A link back to the previous page on the server
would be good, say
<A HREF="http://www.univ.ac.uk/hyperworld/docs/install.html">link</A>
<RETURN="http://www.univ.ac.uk/hyperworld/docs/about.html">
I'm not sure how you would add this to the existing specification; keys
and double-clicking on links will not work, but 'tis a suggestion
anyway.
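
One way it might fit the existing specification, if I read the HTML+
draft right, is through the REL/REV attributes already proposed for
<A> and <LINK>, so a reverse link could be declared without any new
tag (the relationship names here are only examples):

    <A HREF="http://www.univ.ac.uk/hyperworld/docs/install.html"
       REL="next">link</A>
    <LINK REV="precedes"
          HREF="http://www.univ.ac.uk/hyperworld/docs/about.html">

A browser that understood REV could then offer the jump back to the
previous page automatically.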

Gavin

Keywords: WWW navigation searching information retrieval visualisation
hypermedia