WAIS indexing with URLs
kevinh@pulua.hcc.hawaii.edu (Kevin 'Kev' Hughes)
Date: Fri, 26 Nov 93 04:02:43 HST
From: kevinh@pulua.hcc.hawaii.edu (Kevin 'Kev' Hughes)
Message-id: <9311261402.AA15781@pulua.hcc.Hawaii.Edu>
To: www-talk@nxoc01.cern.ch
Subject: WAIS indexing with URLs
Working from Marc's early documentation, I've put together the following
script, which uses the URL support in freeWAIS 2.0.2. However, using that
feature seems to break the -nocontents flag, so I've commented out the lines
that index images, code, and similar files. (Indexing gets painfully slow if
you have a lot of images.)
IMHO, it doesn't work all that well. I suppose I like
seeing the full URL more, but I still want it to refer to a real URL,
not this WAIS docid stuff.
For now, the index is going through
http://www.ncsa.uiuc.edu:8001/www.hcc.hawaii.edu:2010/index
-- Kevin
----
#! /bin/csh
# Index every directory under $rootdir with waisindex, using the URL type.
set rootdir = /www
set index = /usr/local/etc/http/index
set indexprog = /usr/local/etc/http/waisindex
set url = http://www.hcc.hawaii.edu
cd $rootdir
set num = 0
# du prints one directory per line; tail -r (BSD) reverses the order.
foreach pathname (`du $rootdir | cut -f2 | tail -r`)
    echo "Current pathname is: $pathname"
    # The first pass starts the index with -export; later passes append (-a).
    if ($num == 0) then
        set exportflag = "-export"
    else
        set exportflag = "-a"
    endif
    $indexprog -d $index $exportflag -t URL $rootdir $url $pathname/*.html
    $indexprog -d $index -a -t URL $rootdir $url $pathname/*.txt
    # Disabled: -nocontents seems to break with -t URL, so these are too slow.
    # $indexprog -d $index -a -t URL $rootdir $url $pathname/*.ps
    # $indexprog -d $index -a -t URL $rootdir $url $pathname/*.gif
    # $indexprog -d $index -a -t URL $rootdir $url $pathname/*.au
    # $indexprog -d $index -a -t URL $rootdir $url $pathname/*.hqx
    # $indexprog -d $index -a -t URL $rootdir $url $pathname/*.xbm
    # $indexprog -d $index -a -t URL $rootdir $url $pathname/*.mpg
    # $indexprog -d $index -a -t URL $rootdir $url $pathname/*.c
    @ num++
end
echo "$num directories were indexed."
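
As a rough illustration of what the two arguments after "-t URL" in the
script above appear to be for: the reading assumed here is that the first
($rootdir) is trimmed from the front of each file path and the second ($url)
is put in its place. The sketch below only mimics that assumed mapping for
one path. It is written in plain sh rather than csh so it can be run on its
own, path_to_url is a hypothetical helper that is not part of freeWAIS, and
the file /www/docs/intro.html is a made-up example.

```shell
#!/bin/sh
# Hypothetical helper (not a freeWAIS command): mimics the assumed
# "trim rootdir, prepend url" mapping for a single file path.
path_to_url() {
    rootdir=$1
    url=$2
    file=$3
    case "$file" in
        "$rootdir"/*)
            # Strip the leading rootdir and keep the rest of the path.
            printf '%s%s\n' "$url" "${file#"$rootdir"}"
            ;;
        *)
            # The path lies outside rootdir, so no mapping is defined.
            return 1
            ;;
    esac
}

# Made-up example path under the script's rootdir.
path_to_url /www http://www.hcc.hawaii.edu /www/docs/intro.html
```

If the assumption holds, a hit on that file would be reported under the
http://www.hcc.hawaii.edu prefix instead of under its local /www path.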