<meta information>
rst@ai.mit.edu (Robert S. Thau)
Errors-To: listmaster@www0.cern.ch
Date: Wed, 1 Jun 1994 17:28:16 +0200
Message-id: <9406011526.AA04962@volterra>
Reply-To: rst@ai.mit.edu
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: rst@ai.mit.edu (Robert S. Thau)
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: <meta information>
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Date: Wed, 1 Jun 1994 15:55:50 +0200
From: kurlanda@informatik.uni-frankfurt.de
Hi all,
I'm writing an index tool for indexing the whole Uni-Ffm-WWW-space. That
means indexing several servers. Additionally, I want to generate meta-info
about the pages to feed into my site.idx file for ALIWEB. The maintainers
of the servers should add <meta> tags with information about the files,
whether the file should be announced in ALIWEB or not, and so on. I remember
a thread some weeks ago about the <META> tag and some proposals.
So, are there any standards in the meantime? Has anybody developed
something in this direction that I can use too?
regards
--
Jens Kurlanda (Raum:014b) J.W.Goethe Universitaet Frankfurt
At least one such tool is available; see
http://www.ai.mit.edu/tools/site-index.html
This documents a Perl script I wrote which indexes a site; it can
optionally build multiple indices with the disposition of a particular
document controlled by a <meta> tag with NAME="distribution". On the
downside, it only works if you're running the NCSA server (it parses the
server config files to find the documents it has to index).
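
To make the <meta> mechanism concrete, here is a minimal sketch of how an
indexer might read the distribution of a document. It is written in Python
rather than Perl and is not taken from the script itself. The use of a
CONTENT attribute, the example value "local", and the fallback value
"global" are all assumptions made for illustration; the names MetaCollector
and distribution_of are hypothetical.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect NAME/CONTENT pairs from the <meta> tags of one document."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        if tag != "meta":
            return
        fields = dict(attrs)
        name = fields.get("name")
        if name:
            # Assumption: the value of interest is carried in CONTENT.
            self.meta[name.lower()] = fields.get("content") or ""


def distribution_of(html_text, default="global"):
    """Return the document's distribution, or `default` if none is given.

    The default of "global" is an assumption of this sketch.
    """
    collector = MetaCollector()
    collector.feed(html_text)
    collector.close()
    return collector.meta.get("distribution", default).lower()


if __name__ == "__main__":
    page = '<html><head><META NAME="distribution" CONTENT="local"></head></html>'
    print(distribution_of(page))
```

An indexer could then route each document into one index or another
depending on the returned value.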
(If you've got another server, and want to try to adapt the script, you
should only have to change the function '&handle_ncsa_toplev', which reads
the config files and then makes one or more calls to the '&index_directory'
routine. That routine takes a directory name and partial URL as arguments,
and does a file walk in that directory; it shouldn't have to be changed to
work with other servers.)
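
The server-independent half of that design, a routine that takes a directory
name and a partial URL and walks the files beneath it, might look roughly
like the following Python sketch. It borrows the name index_directory from
the description above but is not a translation of the Perl routine; the
restriction to .html/.htm files and the way URLs are joined are assumptions
of this example.

```python
import os
from urllib.parse import quote


def index_directory(directory, url_prefix):
    """Walk `directory` and yield (file_path, url) for each HTML file.

    `url_prefix` is the partial URL that corresponds to `directory`.
    """
    for root, dirs, files in os.walk(directory):
        dirs.sort()  # visit subdirectories in a stable order
        for filename in sorted(files):
            # Assumption: only files with these suffixes get indexed.
            if not filename.lower().endswith((".html", ".htm")):
                continue
            path = os.path.join(root, filename)
            rel = os.path.relpath(path, directory)
            url = url_prefix.rstrip("/") + "/" + quote(rel.replace(os.sep, "/"))
            yield path, url


if __name__ == "__main__":
    # Hypothetical document root and URL prefix, for illustration only.
    for path, url in index_directory(".", "http://www.example.edu/docs"):
        print(path, "->", url)
```

Only the code that decides which (directory, partial URL) pairs to pass in
needs to know about a particular server's configuration format.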
rst