<meta information>

rst@ai.mit.edu (Robert S. Thau)
Errors-To: listmaster@www0.cern.ch
Date: Wed, 1 Jun 1994 17:28:16 +0200
Message-id: <9406011526.AA04962@volterra>
Reply-To: rst@ai.mit.edu
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: rst@ai.mit.edu (Robert S. Thau)
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: <meta information>
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
   Date: Wed, 1 Jun 1994 15:55:50 +0200
   From: kurlanda@informatik.uni-frankfurt.de

   Hi all,

   I'm writing an index tool for indexing the whole Uni-Ffm WWW space. That
   means indexing several servers. Additionally, I want to generate meta-info
   about the pages to feed into my site.idx file for ALIWEB. The maintainers
   of the servers should add <meta> tags with information about the files,
   whether the file should be announced in ALIWEB or not, and so on. I remember
   a thread some weeks ago about the <META> tag and some proposals.

   So, are there any standards in the meantime? Has anybody developed
   something in this direction that I can use too?

   regards   

   -- 
   Jens Kurlanda    (Raum: 014b)    J.W.Goethe Universitaet Frankfurt

At least one such tool is available... see

  http://www.ai.mit.edu/tools/site-index.html 

This documents a Perl script I wrote which indexes a site; it can
optionally build multiple indices with the disposition of a particular
document controlled by a <meta> tag with NAME="distribution".  On the
downside, it only works if you're running the NCSA server (it parses the
server config files to find the documents it has to index).
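
As a rough illustration of the <meta> mechanism described above, here is a
minimal Python sketch (not the Perl script itself) that reads a
NAME="distribution" tag from a page and sorts pages into one index per
value.  The function names, the fallback value "local", and the example
value "global" are assumptions made for this sketch; the values the real
script accepts are documented at the URL above.

```python
from html.parser import HTMLParser


class MetaDistributionParser(HTMLParser):
    """Records the CONTENT of the first <meta name="distribution"> tag."""

    def __init__(self):
        super().__init__()
        self.distribution = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta" or self.distribution is not None:
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "distribution":
            self.distribution = attr_map.get("content", "").strip().lower()


def distribution_of(html_text, default="local"):
    """Return the page's distribution value, or `default` if none is set.

    The default of "local" is a hypothetical choice for this sketch.
    """
    parser = MetaDistributionParser()
    parser.feed(html_text)
    parser.close()
    return parser.distribution or default


def group_by_distribution(pages):
    """Map each distribution value to the list of URLs that carry it.

    `pages` is a dict of URL -> HTML text.
    """
    indices = {}
    for url, html_text in pages.items():
        indices.setdefault(distribution_of(html_text), []).append(url)
    return indices
```

Each key of the returned dict would then correspond to one index file, so
a page's <meta> tag alone decides which index it ends up in.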

(If you've got another server and want to try to adapt the script, you
should only have to change the function '&handle_ncsa_toplev', which reads
the config files and then makes one or more calls to the '&index_directory'
routine.  That routine takes a directory name and a partial URL as
arguments and does a file walk in that directory; it shouldn't have to be
changed to work with other servers.)
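
To make the shape of that routine concrete, here is a small Python sketch
of a server-independent file walk that takes a directory name and a
partial URL.  It borrows the name index_directory from the Perl routine
but is not a port of it; the suffix filter, the return format, and the
example paths are assumptions made for illustration.

```python
import os
from urllib.parse import quote


def index_directory(directory, partial_url, suffixes=(".html", ".htm")):
    """Walk `directory` and pair each HTML file with its URL.

    `partial_url` is the URL prefix that the server maps onto `directory`.
    Returns a list of (url, filesystem_path) tuples.
    """
    entries = []
    base = partial_url.rstrip("/")
    for root, dirs, files in os.walk(directory):
        dirs.sort()  # visit subdirectories in a stable order
        for name in sorted(files):
            if not name.lower().endswith(suffixes):
                continue
            path = os.path.join(root, name)
            # Path relative to the walked directory, with URL-style slashes.
            rel = os.path.relpath(path, directory).replace(os.sep, "/")
            entries.append((base + "/" + quote(rel), path))
    return entries
```

A server-specific top-level function would then only need to work out
which (directory, partial URL) pairs the server exports and call, say,
index_directory("/usr/local/www/docs", "/docs") for each of them.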

rst