Re: CGI server indexing with WAIS

robm@ncsa.uiuc.edu (Rob McCool)
Message-id: <9312171103.AA17773@void.ncsa.uiuc.edu>
From: robm@ncsa.uiuc.edu (Rob McCool)
Date: Fri, 17 Dec 1993 05:03:33 -0600
In-Reply-To: Tony Sanders <sanders@BSDI.COM>
       "Re: CGI server indexing with WAIS" (Dec 16,  6:14pm)
X-Mailer: Mail User's Shell (7.2.5 10/14/92)
To: www-talk@www0.cern.ch
Subject: Re: CGI server indexing with WAIS
Content-Length: 4061
/*
 * Re: CGI server indexing with WAIS  by Tony Sanders (sanders@BSDI.COM)
 *    written on Dec 16,  6:14pm.
 *
 * > Hi, gang, I've taken Tony Sanders' PERL script which uses freeWAIS to search
 * > an index of a server and ported it to CGI. What this means is that if you
 * > index your HTTP server with wais, you can use this script to search it.
 * Neat, please mail me a copy (of your wais.pl).

Attached. I'm not a serious perl user so my techniques may not be the best,
but it appears to work.
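
The part doing the real work is the loop that picks fields out of what
waisq prints. Here is a minimal standalone sketch of that step, assuming the
same field names the attached script matches. The sub name parse_records and
the sample record are invented for illustration; they are not captured from
a real waisq run.

```perl
use strict;
use warnings;

# Hypothetical helper: collect one hash per hit from waisq-style output.
# As in the attached script, the :date line is treated as the end of a record.
sub parse_records {
    my @lines = @_;
    my (@hits, %doc);
    foreach my $line (@lines) {
        $doc{score}    = $1 if $line =~ /:score\s+(\d+)/;
        $doc{lines}    = $1 if $line =~ /:number-of-lines\s+(\d+)/;
        $doc{bytes}    = $1 if $line =~ /:number-of-bytes\s+(\d+)/;
        $doc{headline} = $1 if $line =~ /:headline "(.*)"/;
        if ($line =~ /:date "(\d+)"/) {
            push @hits, { %doc };   # store a copy, then start a fresh record
            %doc = ();
        }
    }
    return @hits;
}

# Invented sample input, for illustration only.
my @sample = (
    ':score 1000',
    ':number-of-lines 42',
    ':number-of-bytes 2048',
    ':headline "/docs/cgi/env.html"',
    ':date "931216"',
);

foreach my $hit (parse_records(@sample)) {
    print "$hit->{headline}: score $hit->{score}, ",
          "$hit->{lines} lines, $hit->{bytes} bytes\n";
}
```

Returning a list of records instead of printing inside the loop is a design
choice for the sketch; the attached script prints each hit as soon as it
sees the :date line.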

 * I'll pick up 1.0 soon and take a peek at it.  Hopefully sometime early
 * next year I'll have more free time to hack on Plexus and make it
 * CGI compliant.  Great work BTW.  I'm glad we did this early on
 * so folks will have plug-and-play servers for many things.

I'm hoping it takes off. I wish I had some time to go and update the
documentation; I'm getting truckloads of confused questions because the CGI
spec assumes a fair bit of previous knowledge.
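
For what it's worth, the part that seems to trip people up is small. A
minimal sketch of a CGI script follows, assuming the server sets the usual
REQUEST_METHOD and QUERY_STRING environment variables. The sub name cgi_page
is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: build a complete CGI response as one string.
# A CGI script writes a header block, a blank line, then the document.
sub cgi_page {
    my ($method, $query) = @_;
    # Escape HTML metacharacters so the values are displayed literally.
    foreach ($method, $query) { s/&/&amp;/g; s/</&lt;/g; s/>/&gt;/g; }
    my $page = "Content-type: text/html\n\n";   # header, then blank line
    $page .= "<TITLE>CGI test</TITLE>\n";
    $page .= "<H1>CGI test</H1>\n";
    $page .= "Request method: $method<P>\n";
    $page .= "Query string: $query\n";
    return $page;
}

# The server passes request details to the script in environment variables.
my $method = defined $ENV{'REQUEST_METHOD'} ? $ENV{'REQUEST_METHOD'} : '(unset)';
my $query  = defined $ENV{'QUERY_STRING'}   ? $ENV{'QUERY_STRING'}   : '';
print cgi_page($method, $query);
```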

 * One of my first projects when I get back is to make a perl package
 * of support functions for people to use to write CGI compliant scripts
 * in perl (you probably have the same thing for C).  Basically this
 * just means packaging up a lot of the functions I already have and
 * making some minor changes.
 */

I just got a reference to a perl cgi library from someone, but I haven't had
a chance to look at it yet. It's at http://www.bio.cam.ac.uk/cgi-src/cgi-lib.pl
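
Since I haven't read that library, the following is not its interface. It is
only a minimal sketch of the kind of support routine such a package might
contain: splitting an ISINDEX-style query on "+" and then undoing the %XX
escapes in each word. The name decode_query is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from cgi-lib.pl: turn a query string
# such as "environment+AND+cgi" into a list of decoded words.
sub decode_query {
    my ($query) = @_;
    # Split on "+" first, so an escaped "+" (%2B) stays inside its word.
    my @words = split(/\+/, $query);
    foreach my $word (@words) {
        # Replace each %XX escape with the character it stands for.
        $word =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    }
    return @words;
}

# Usage: decode QUERY_STRING by hand and print the words one per line.
my $raw = defined $ENV{'QUERY_STRING'} ? $ENV{'QUERY_STRING'} : '';
print "$_\n" foreach decode_query($raw);
```

For ISINDEX queries the server is supposed to do this decoding itself and
hand the words to the script as command-line arguments, which is why the
attached script reads @ARGV instead of QUERY_STRING.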

--Rob

#!/usr/local/bin/perl
#
# wais.pl -- WAIS search interface
#
# $Id$
#
# Tony Sanders <sanders@bsdi.com>, Nov 1993
#
# Example configuration (in local.conf):
#     map topdir wais.pl &do_wais($top, $path, $query, "database", "title")
#

$waisq = "/usr/local/bin/waisq";
$waisd = "/u/Web/wais-sources";
$src = "www";
$title = "NCSA httpd documentation";

sub send_index {
    print "Content-type: text/html\n\n";
    
    print "<HEAD>\n<TITLE>Index of ", $title, "</TITLE>\n</HEAD>\n";
    print "<BODY>\n<H1>", $title, "</H1>\n";

    print "This is an index of the information on this server. Please\n";
    print "type a query in the search dialog.\n<P>";
    print "You may use compound searches, such as: <CODE>environment AND cgi</CODE>\n";
    print "<ISINDEX>\n";
    print "</BODY>\n";
}

sub do_wais {
#    local($top, $path, $query, $src, $title) = @_;

    # No query words on the command line: send the search form instead.
    unless (@ARGV) { &send_index; return; }
    local(@query) = @ARGV;
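    local($pquery) = join(" ", @query);
    # Escape HTML metacharacters so the query is displayed literally.
    $pquery =~ s/&/&amp;/g;
    $pquery =~ s/</&lt;/g;
    $pquery =~ s/>/&gt;/g;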

    print "Content-type: text/html\n\n";

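    # Fork a child whose STDOUT is piped to WAISQ; the child execs waisq.
    local($pid) = open(WAISQ, "-|");
    die "cannot fork: $!" unless defined $pid;
    unless ($pid) {
        exec($waisq, "-c", $waisd, "-f", "-", "-S", "$src.src", "-g", @query);
        die "cannot exec $waisq: $!";
    }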

    print "<HEAD>\n<TITLE>Search of ", $title, "</TITLE>\n</HEAD>\n";
    print "<BODY>\n<H1>", $title, "</H1>\n";

    print "Index \`$src\' contains the following\n";
    print "items relevant to \`$pquery\':<P>\n";
    print "<DL>\n";

    local($hits, $score, $headline, $lines, $bytes, $type, $date);
    while (<WAISQ>) {
        /:score\s+(\d+)/ && ($score = $1);
        /:number-of-lines\s+(\d+)/ && ($lines = $1);
        /:number-of-bytes\s+(\d+)/ && ($bytes = $1);
        /:type "(.*)"/ && ($type = $1);
        /:headline "(.*)"/ && ($headline = $1);         # XXX
        /:date "(\d+)"/ && ($date = $1, $hits++, &docdone);
    }
    close(WAISQ);
    print "</DL>\n";

    if ($hits == 0) {
        print "Nothing found.\n";
    }
    print "</BODY>\n";
}

sub docdone {
    if ($headline =~ /Search produced no result/) {
        print "<HR>";
        print $headline, "<P>\n<PRE>";
# the following was &'safeopen
        open(WAISCAT, "$waisd/$src.cat") || die "$src.cat: $!";
        while (<WAISCAT>) {
            # NOTE: $top is never set in this CGI port, so it expands to "".
            s#(Catalog for database:)\s+.*#$1 <A HREF="/$top/$src.src">$src.src</A>#;
            s#Headline:\s+(.*)#Headline: <A HREF="$1">$1</A>#;
            print;
        }
        close(WAISCAT);
        print "\n</PRE>\n";
    } else {
        print "<DT><A HREF=\"$headline\">$headline</A>\n";
        print "<DD>Score: $score, Lines: $lines, Bytes: $bytes\n";
    }
    $score = $headline = $lines = $bytes = $type = $date = '';
}

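# Run the search; report any fatal error on STDERR instead of dropping it.
eval { &do_wais; };
print STDERR $@ if $@;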