Re: Holding connections open: an immodest proposal

Roy T. Fielding (fielding@avron.ICS.UCI.EDU)
Thu, 15 Sep 1994 01:47:35 +0200

Phillip writes:

> It is important to distinguish two cases :-
>
> 1) Loading all data segments associated with an object (eg html + inline images)
>
> 2) Continuous mode connection for realtime response.
>
>
> 1 is solved best through use of MIME multipart type. The browser does a request
> and gets back the complete object as a single document, inline images and all.
> This is currently being added to the library but slowly :-(

This is a terrible way to solve that problem. Being able to serve and
handle MIME multipart type objects is a good thing, but it should not be
used to pre-package all inline images.

> There are two ways of doing this :
>
> 1) The server sends back everything as a unit
> 2) The client requests the inline images separately.
>
> The Server is actually in the best position to know whether an image
> is specific to one html or shared by many.

I disagree. No one knows this, not even the original author. The presence
of hierarchical caches makes any such supposition impossible.

> Thus let the user decide whether
> to run the mime packer on a file or not. If the images are zipped up all
> in a single fred.mime then they will always be sent together. This can also
> be done on the fly if a .mime is requested of a file only stored as .html,
> this is a server special though.
>
> The second method requires a slight change to the specs. Where we have at the
> moment
>
> GET /path/fred.html http/1.0
>
> I want to have
>
> GET /path/ http/1.0
> Relative-URI: fred.html
> Relative-URL: jim.html

I don't like the idea of using headers to indicate something that is
clearly a different method. I particularly don't like those two because
they fail to indicate their purpose. I would much prefer an "MGET" method
which can operate on a list of URIs or include that list as request content,
e.g.

MGET <uri>.mget HTTP/1.1

would tell the server to enclose all of the objects listed in the <uri>.mget
object into a multipart response (note that <uri>.mget may be automatically
generated by the server), and

MGET <uri> HTTP/1.1
Content-type: application/x-www-uri-list
Content-length: xxxx

<uri>

<uri>
Optional-Request-Header: blech

<uri>

would do the same except the list would be taken from the request content.
All URIs in the list can be considered "relative" to the URI used as the
first parameter of the MGET method (which means they may also be absolute
URLs). The request headers for each URI would default to those included
with the MGET, unless they are overridden/supplemented by ones immediately
following the URI.
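
To make the list form concrete, here is a minimal Python sketch of how a
server might read such a request body. It is only an illustration under
stated assumptions: the function name parse_uri_list is hypothetical, entries
are assumed to be separated by blank lines as in the example above, and
header names are lower-cased so that a per-URI header overrides the MGET
default regardless of case.

```python
import re
from urllib.parse import urljoin


def parse_uri_list(base_uri, mget_headers, body):
    """Return a list of (absolute_uri, headers) pairs, one per listed URI.

    Hypothetical helper: the body layout (a URI line, optional header
    lines, then a blank line) is assumed from the example in the message.
    """
    # Header names are case-insensitive, so normalize them to lower case.
    defaults = {name.lower(): value for name, value in mget_headers.items()}
    text = body.replace("\r\n", "\n").strip()
    requests = []
    # Entries are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text):
        lines = [line.strip() for line in block.split("\n") if line.strip()]
        if not lines:
            continue
        # urljoin resolves a relative reference against the MGET URI and
        # returns an absolute URL unchanged.
        uri = urljoin(base_uri, lines[0])
        # Start from the MGET headers, then apply per-URI overrides.
        headers = dict(defaults)
        for line in lines[1:]:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        requests.append((uri, headers))
    return requests
```

Each returned pair could then be served like an ordinary GET and the results
enclosed in one multipart response.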

The hitch is that caching proxies would have to parse the request, handle
those URIs that can be fulfilled locally, and issue their own MGET request
(or equivalent for other protocols) for those they can't fulfill. Although
this would be complicated, it is certainly doable, and I doubt that any other
scheme would be less complicated once caches are taken into account.
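
The proxy step could look roughly like the following Python sketch. It is
an assumption-laden illustration, not a specification: plan_proxy_mget is a
hypothetical name, the cache is modelled as a plain dict keyed by absolute
URI, and per-URI request headers are left out for brevity.

```python
def plan_proxy_mget(target, uris, cache):
    """Split an MGET list into cache hits and one upstream MGET request.

    Hypothetical helper: `cache` maps absolute URIs to stored bodies.
    Returns (hits, request_bytes), where request_bytes is None when
    every URI can be answered from the cache.
    """
    hits = {uri: cache[uri] for uri in uris if uri in cache}
    misses = [uri for uri in uris if uri not in cache]
    if not misses:
        return hits, None
    # One URI per entry, entries separated by a blank line.
    body = ("\r\n\r\n".join(misses) + "\r\n").encode("ascii")
    head = (
        f"MGET {target} HTTP/1.1\r\n"
        "Content-type: application/x-www-uri-list\r\n"
        f"Content-length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return hits, head + body
```

The proxy would then merge the cached objects with the upstream multipart
response before answering the client.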

As for security issues, I am increasingly convinced that they belong in
an entirely separate protocol -- one that may be initiated by an HTTP
request, but only confirmed and maintained via a non-HTTP session.

>...
> A second method of doing MGET is to permit wildcarding in a URL. For example
> it would be nice to be able to specify a hierarchy of directories as is
> possible under VMS.
>
> [hallam...]
> /hallam///
>
> To me it looks like the only way of doing this extension in a compatible
> manner is to use a triple slash. Weenie UNIX servers then would return the
> root directory only. Extended servers would send back the tree. We saw this in
> Hyper-G yesterday and it was very nice. Yes I know that the UNIX rules for
> filename relativity may break but there is no reason why WWW URLs should
> be slaves to UNIX. Since few people are using triple slashes at the moment I
> suggest that we have an opportunity for extension without backwards
> compatibility problems.

ABSOLUTELY NOT!

The "/" character in URL paths has a very significant meaning -- using it
like this would break every attempt we have made at standardizing the URL
syntax. If you want to make special exceptions for WWW, just use the '*'
character on its own. That character is not allowed unencoded in a URL,
and thus would not likely conflict with normal use anyway.

...Roy Fielding ICS Grad Student, University of California, Irvine USA
(fielding@ics.uci.edu)
About Roy: http://www.ics.uci.edu/dir/grad/Software/fielding