Re: Initializing HTTP headers from HTML documents
"Roy T. Fielding" <fielding@simplon.ICS.UCI.EDU>
To: www-talk@www0.cern.ch
Subject: Re: Initializing HTTP headers from HTML documents
In-reply-to: Your message of "Wed, 05 Jan 1994 12:59:01 EST."
<9401051759.AA07042@hotsand.dacsand>
Date: Thu, 06 Jan 1994 00:59:09 -0800
From: "Roy T. Fielding" <fielding@simplon.ICS.UCI.EDU>
Message-id: <9401060059.aa16159@paris.ics.uci.edu>
Content-Length: 3675
>> From: Dave_Raggett <dsr@hplb.hpl.hp.com>
>>
>> I would like to propose a scheme that HTTP servers can use to
>> initialize HTTP headers by reading information held at the start of
>> HTML/HTML+ documents. This is intended for fields like Expires: which
>> are best determined by the document author.
>>
>> The generic META element takes the following attributes:
>>
>> NAME the name of an HTTP header such as Expires
>> VALUE the value to be passed with the associated header
>>
>>
>> e.g. <META NAME="Expires" VALUE="Tue, 04 Jan 1994 14:13:25 GMT">
>>
>> This element is only permitted as part of the document's HEAD
>> along with TITLE, ISINDEX and LINK.
>>
>> Any comments?
>>
>> Dave Raggett
I like this scheme since it makes it much easier on the server than
our previous discussions about OWNER and DATE elements. It also allows
for site-specific additions to the headers without requiring special
changes to the server for each added header.
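To make the server-side step concrete, here is a minimal sketch in Python of how a server might pull header values out of META elements. It is only an illustration under stated assumptions: the names MetaHeaderParser and headers_from_html are hypothetical, the NAME and VALUE attributes are taken from the proposal quoted above, and anything before the end of HEAD or the start of BODY is treated as head content.

```python
from html.parser import HTMLParser


class MetaHeaderParser(HTMLParser):
    """Collect (NAME, VALUE) pairs from META elements in the document HEAD."""

    def __init__(self):
        super().__init__()
        self.headers = []     # (name, value) pairs in document order
        self.in_head = True   # assumption: head content until </HEAD> or <BODY>

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "body":
            self.in_head = False
        elif tag == "meta" and self.in_head:
            attributes = dict(attrs)
            name = attributes.get("name")
            value = attributes.get("value")
            if name and value is not None:
                self.headers.append((name, value))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def headers_from_html(document):
    """Return the header pairs a server could emit for this document."""
    parser = MetaHeaderParser()
    parser.feed(document)
    parser.close()
    return parser.headers
```

Because the parser copies any NAME it finds, a site-specific header needs no server change, which is the property I like about the scheme. META elements outside the HEAD are ignored in this sketch.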
My only concern is that it allows the author to free-format information
which should normally appear in a fixed (specified) format. However,
I think clients should be robust enough to just throw away bad headers.
I do wonder, however, what would happen if an author included
<META NAME="Content-Type" VALUE="text/bogus;"> as an odd joke, but I
can't think of any intentional spoofing that could adversely affect a client
other than just not being able to display the object.
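As a rough sketch of what "throw away bad headers" could look like, the Python below drops any header whose value fails a per-header check. The table VALIDATORS and the function names are hypothetical, and only Expires is checked here (against the RFC 822 style date format), so a value like "text/bogus;" for Content-Type would pass through untouched in this sketch.

```python
from email.utils import parsedate_to_datetime


def is_http_date(value):
    """Return True if value parses as an RFC 822 style date."""
    try:
        parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return False
    return True


# Hypothetical table: lowercased header name -> validity check.
VALIDATORS = {"expires": is_http_date}


def discard_bad_headers(headers):
    """Keep headers that have no check or that pass their check."""
    kept = []
    for name, value in headers:
        check = VALIDATORS.get(name.lower())
        if check is None or check(value):
            kept.append((name, value))
    return kept
```

Headers without an entry in the table are kept as they are, so site-specific additions still work.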
> From: John Ellson <ellson@hotsand.att.com>
>
> I agree that authors would like to expire their documents at some
> point in the future, even if those documents are being served
> from a cache site. But I don't see how a timestamp is sufficiently
> expressive of the reasons an author might have to expire a document.
> The reasons might not even be known at the time that the document
> is written.
I don't understand why the reason is important. The Expires header should
indicate the date beyond which that information object may no longer be
"true" (or applicable or useful or whatever). In any case, the only thing
the cache needs to know is when to get rid of it. Since any boolean
expression would only make sense if evaluated on the server side
(i.e. not by the cache manager), I don't see any reason why the cache
manager would want to see the boolean expression in the header.
If the author does not know the expiration date, then the header should either
not appear at all or be assigned separately by the server (using whatever
tables/logic that such a site would consider desirable). This is, I believe,
how such things are handled in netnews (which is how the HTTP spec defines
the purpose of the Expires: header).
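To show how little the cache needs, here is a minimal Python sketch of the check a cache manager might run. The function name is_expired is hypothetical, a well-formed date is assumed, and treating a missing header as "not expired" is an assumption of this sketch; a real cache would apply its own policy there.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_expired(expires_value, now=None):
    """Return True once the current time is past the Expires date."""
    if expires_value is None:
        # No Expires header: left to other cache policy in this sketch.
        return False
    now = now or datetime.now(timezone.utc)
    expires = parsedate_to_datetime(expires_value)
    if expires.tzinfo is None:
        # Assumption: a date without a usable zone is taken as GMT.
        expires = expires.replace(tzinfo=timezone.utc)
    return now > expires
```

The cache compares a single timestamp with the clock. Any reasoning about why the object expires stays on the server, which only has to produce the date.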
> Could VALUE be a boolean expression so that we could do something like:
>
> <META NAME="Expires" VALUE=EXISTSP(<http://original.host/original.file>)>
>
> or:
>
> <META NAME="Expires" VALUE=GT(DATE, "Tue, 04 Jan 1994 14:13:25 GMT")>
>
> There would need to be a way of preventing caches from caching the
> result of EXISTSP.
>
> Also, clients need to do something reasonable if the boolean
> expression cannot be evaluated. Perhaps they could display the page
> anyway but with a warning message saying that "the expiry status
> of this document is unknown."
The vast majority of information objects will have no Expires header.
I think it would be more appropriate if the client simply displayed
the expiration date (if any) in a place for secondary feedback information,
such as the information line at the bottom of Mosaic for X windows.
....Roy Fielding ICS Grad Student, University of California, Irvine USA
(fielding@ics.uci.edu)