Re: Future of meta-indices: site indexing proposal and Perl script

"Roy T. Fielding" <fielding@simplon.ICS.UCI.EDU>
Errors-To: listmaster@www0.cern.ch
Date: Tue, 22 Mar 1994 11:27:58 --100
Message-id: <9403220224.aa28075@paris.ics.uci.edu>
Reply-To: fielding@simplon.ICS.UCI.EDU
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: "Roy T. Fielding" <fielding@simplon.ICS.UCI.EDU>
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: Re: Future of meta-indices: site indexing proposal and Perl script 
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Length: 3474

Robert S. Thau writes:

> What I'm doing at least provisionally to drive the final version of my Perl
> script (site-index.pl v0.1, pointer to code below), is to put the source
> for the IAFA descriptions and keywords fields inside the document itself,
> by (ab?)use of the <META ...> tag which was discussed on this list some
> time ago to solve a different problem.  For example, the document at
> http://www.ai.mit.edu/events/events-list.html contains, near the top, the
> following:
> 
>   <meta name="iafa-description"
>   value="MIT AI lab events, including seminars, conferences, and tours">
>   <meta name="iafa-keywords"
>   value="MIT, Artificial Intelligence, seminar, conference">

I would have to label this as partial abuse.  Yes, this is metainformation
and thus reasonably belongs in a META element.  However, you are using
name attributes which only make sense to your own tool, thereby defeating
the general usefulness of that metainformation.

How about:

    <meta name="Summary"
    value="MIT AI lab events, including seminars, conferences, and tours">
    <meta name="Keywords"
    value="MIT, Artificial Intelligence, seminar, conference">

Also, don't forget that the purpose of META is to let a server that is
able (and willing) to parse metainfo send the headers

    Summary: MIT AI lab events, including seminars, conferences, and tours
    Keywords: MIT, Artificial Intelligence, seminar, conference

as part of the HTTP response object headers.  Thus, use of the META
element should be limited to things for which headers are desirable.
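
To make the META-to-header mapping concrete, here is a minimal sketch in
Python (not Perl, and not taken from any existing server).  It assumes the
name="..." value="..." attribute form used in the examples above; the
names MetaHeaderParser and meta_to_headers are hypothetical, chosen only
for illustration.

```python
from html.parser import HTMLParser


class MetaHeaderParser(HTMLParser):
    """Collect (name, value) pairs from <meta name="..." value="..."> elements.

    Hypothetical helper: it assumes the name/value attribute form shown in
    this message, not any particular server's behaviour.
    """

    def __init__(self):
        super().__init__()
        self.headers = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        value = attributes.get("value")
        if name and value is not None:
            # Collapse whitespace so the value fits on a single header line.
            self.headers.append((name, " ".join(value.split())))


def meta_to_headers(html_text):
    """Return 'Name: value' header lines for each META element found."""
    parser = MetaHeaderParser()
    parser.feed(html_text)
    parser.close()
    return ["%s: %s" % (name, value) for name, value in parser.headers]


if __name__ == "__main__":
    sample = (
        '<meta name="Summary"\n'
        ' value="MIT AI lab events, including seminars, conferences, and tours">\n'
        '<meta name="Keywords"\n'
        ' value="MIT, Artificial Intelligence, seminar, conference">\n'
    )
    for line in meta_to_headers(sample):
        print(line)
```

Run on the two elements above, this prints the Summary and Keywords
header lines as shown.  A real server would also have to decide which
names it is willing to pass through as headers.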

> (There's one other kind of meta-information my indexer uses --- if it sees
> <meta name="iafa-type" value="service">, it indexes the page in question
> with a SERVICE template, as opposed to a DOCUMENT template.  This is useful
> for cover pages of search engines and the like).

Now that is something which is not of general usefulness.  IMHO, it should
be implemented as just an SGML comment and not a META element.  E.g.

<!-- IAFA-TYPE service -->

(or, at the very least, define a different metainfo name which serves the
same purpose but corresponds to something generally useful).
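
An indexer could pick up such a comment without much trouble.  The
following is only a sketch in Python, under the assumption that the
directive is written exactly as "IAFA-TYPE <word>" inside a comment and
that DOCUMENT is the default template; DirectiveParser and template_type
are hypothetical names, not part of site-index.pl.

```python
import re
from html.parser import HTMLParser

# Assumed directive syntax: <!-- IAFA-TYPE service -->
DIRECTIVE = re.compile(r"^\s*IAFA-TYPE\s+(\S+)\s*$", re.IGNORECASE)


class DirectiveParser(HTMLParser):
    """Remember the first IAFA-TYPE directive found in a comment."""

    def __init__(self):
        super().__init__()
        self.template_type = None

    def handle_comment(self, data):
        match = DIRECTIVE.match(data)
        if match and self.template_type is None:
            self.template_type = match.group(1).upper()


def template_type(html_text, default="DOCUMENT"):
    """Return the template type named in the page, or the default."""
    parser = DirectiveParser()
    parser.feed(html_text)
    parser.close()
    return parser.template_type or default
```

Because the directive lives in a comment, servers and browsers that know
nothing about the indexer simply ignore it.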

> This use of <META ...> solves another problem as well, that of determining
> which documents make the index.  Files with the <meta name="iafa-...">
> fields get indexed; the rest don't. ...

That also reflects a tool-specific comment rather than metainfo.

> ... So, once these tags are in the
> documents, the rest of IAFA template preparation (finding the files,
> getting the titles out and the URIs right) can be completely automated
> (which is effectively what my site-index.pl script does).

Sounds like a great script -- thanks for making it available.

> However, the <META ...> tags do raise another problem, that of whether this
> use of <META ...> is appropriate, and if it is, making sure that the uses
> which different tools may eventually make of meta-information don't
> conflict.  A central registry of meta-information names would be a good
> idea, if people are going to start using it.

Most examples of appropriate metainfo names can already be found in the
USENET message format (RFC 1036) and RFC 822.  However, you are probably
right that we should have some sort of specification for what the names mean.


...Roy Fielding   ICS Grad Student, University of California, Irvine  USA
                   (fielding@ics.uci.edu)
    <A HREF="http://www.ics.uci.edu/dir/grad/Software/fielding">About Roy</A>