WAIS indexing with URLs

kevinh@pulua.hcc.hawaii.edu (Kevin 'Kev' Hughes)
Date: Fri, 26 Nov 93 04:02:43 HST
From: kevinh@pulua.hcc.hawaii.edu (Kevin 'Kev' Hughes)
Message-id: <9311261402.AA15781@pulua.hcc.Hawaii.Edu>
To: www-talk@nxoc01.cern.ch
Subject: WAIS indexing with URLs

	Working from Marc's early documentation, I've put together the
following script using the URL support in freeWAIS 2.0.2. However, using
that feature seems to break the -nocontents flag, so I've commented out
the lines that index image and code files. (Indexing gets painfully slow
if you have a lot of images.)
	IMHO, it doesn't work all that well. I suppose I like seeing the
full URL more, but I still want it to refer to a real URL, not this WAIS
docid stuff.
	For now, the index is available through

	http://www.ncsa.uiuc.edu:8001/www.hcc.hawaii.edu:2010/index

	-- Kevin

----

#! /bin/csh

set rootdir = /www
set index = /usr/local/etc/http/index
set indexprog = /usr/local/etc/http/waisindex
set url = http://www.hcc.hawaii.edu

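cd $rootdir
# Pass unmatched globs through unchanged instead of aborting with "No match".
set nonomatch
set num = 0
# du lists every directory under $rootdir; tail -r (BSD tail) reverses the
# list so that $rootdir itself comes first.
foreach pathname (`du $rootdir | cut -f2 | tail -r`)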
	echo "Current pathname is: $pathname"
	# The first directory is indexed with -export; all later ones append (-a).
	if ($num == 0) then
		set exportflag = "-export"
	else
		set exportflag = "-a"
	endif
	$indexprog -d $index $exportflag -t URL $rootdir $url $pathname/*.html
	$indexprog -d $index -a -t URL $rootdir $url $pathname/*.txt
	# Image/code types are disabled: -nocontents seems to break with -t URL,
	# and indexing them without it is painfully slow.
#	$indexprog -d $index -a -t URL $rootdir $url $pathname/*.ps
#	$indexprog -d $index -a -t URL $rootdir $url $pathname/*.gif
#	$indexprog -d $index -a -t URL $rootdir $url $pathname/*.au
#	$indexprog -d $index -a -t URL $rootdir $url $pathname/*.hqx
#	$indexprog -d $index -a -t URL $rootdir $url $pathname/*.xbm
#	$indexprog -d $index -a -t URL $rootdir $url $pathname/*.mpg
#	$indexprog -d $index -a -t URL $rootdir $url $pathname/*.c
	@ num++
end
echo "$num directories were indexed."
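
----

	As a rough illustration of what the "-t URL $rootdir $url" arguments
in the script above appear intended to do, here is a small sketch. It
assumes the URL type trims the first argument from the front of each file
path and prepends the second; that behaviour is an assumption here, not
something the script itself confirms. It is written in plain sh rather
than csh, and the function name path_to_url and the sample path are made
up for the example.

```sh
#!/bin/sh
# Sketch only: mimics the assumed path-to-URL mapping, not waisindex itself.
rootdir=/www
url=http://www.hcc.hawaii.edu

# path_to_url (hypothetical helper): strip the leading $rootdir from a
# file path and prepend $url.
path_to_url() {
    printf '%s%s\n' "$url" "${1#"$rootdir"}"
}

# Example call with a made-up file under $rootdir.
path_to_url /www/info/index.html
```

	If the assumption holds, each indexed file would show up under a URL
built this way, which makes it easy to check by hand whether $rootdir and
$url are set consistently before running the full index.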