Encapsulating HTTP parts in MIME

Marc VanHeyningen <mvanheyn@cs.indiana.edu>
From: Marc VanHeyningen <mvanheyn@cs.indiana.edu>
To: www-talk@nxoc01.cern.ch
Subject: Encapsulating HTTP parts in MIME
Date: Mon, 28 Jun 1993 09:26:12 -0500
Message-id: <29789.741277572@moose.cs.indiana.edu>
Sender: mvanheyn@cs.indiana.edu

Following is an updated and (I think) better version of my earlier
suggestion.  I have also posted it to the newsgroup; I'd prefer that
people who can access it discuss it there rather than on this mailing
list, since it is crossposted so that the MIME community can also
contribute suggestions (and since it's not really possible to
cross-post on mailing lists).

HyperText Transfer Protocol (HTTP) is a transaction-oriented protocol
based on request-response.  I believe there are some advantages to
being able to consider these components (the request and the response)
as MIME content-types, so they may be forwarded, gatewayed,
encapsulated, authenticated, encrypted, and the like via standard MIME
techniques.  Since both the request and the response in HTTP/1.0 are
already very close to MIME messages, I believe that "message" is the
appropriate primary content-type.

There are two applications which most readily come to mind for this
kind of definition:

- Allowing the full power of HTTP/1.0 to be utilized via a mailbot,
  for users who cannot use HTTP directly due to limitations of dialup
  links like UUCP, firewalls, etc.
- Allowing an HTTP request or response to have some or all of the
  security services made possible for MIME objects by the PEM-MIME
  inter-operation standard (still in draft right now).

Significant concerns I have in mind:

- As much as possible, the new system should work smoothly with both
  HTTP/1.0 and HTTP/0.9.
- MIME gateways should be able to handle these messages (e.g. change
  the Content-Transfer-Encoding as necessary for various transports)
  smoothly.

A tentative method:

Define a request as content-type "message/http-request".

Required parameters:
  method:	the method of the request (e.g. "GET")
  object:	the object of the request (e.g. "/foo/bar/baz.html")
  version:	which version of HTTP (e.g. "HTTP/1.0", or is it "HTRQ/1.0" ?)

Optional parameters:
  address(?):	the address of the HTTP server (e.g. "info.cern.ch:80")
  id:		a unique identifier to be returned with the response
		(e.g. "12345@foo.bar")

The request itself (minus the first line, which is given in the
required parameters) follows.  In HTTP/0.9 these parameters contain
the whole request, so the content will always be empty; in HTTP/1.0
the content is a MIME message (possibly with some additional
RFC822-style headers).
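
As a rough illustration of the request mapping described above, here
is a minimal sketch in Python.  The function name
encapsulate_request, its arguments, and the use of "HTTP/0.9" as the
version label for requests that carry no version token are my
assumptions for illustration, not part of the proposal.

```python
def encapsulate_request(raw_request, address=None, msg_id=None):
    """Sketch: wrap a raw HTTP request as a message/http-request entity."""
    # Split the request line from whatever follows it.
    first_line, _, rest = raw_request.partition("\n")
    fields = first_line.strip().split()
    if len(fields) == 2:
        # HTTP/0.9 style: method and object only, so the content is empty.
        # The "HTTP/0.9" label is an assumption made for this sketch.
        method, obj = fields
        version = "HTTP/0.9"
        rest = ""
    elif len(fields) == 3:
        method, obj, version = fields
    else:
        raise ValueError("unrecognized request line: %r" % first_line)

    # Required parameters first, then the optional ones that were supplied.
    params = [("method", method), ("object", obj), ("version", version)]
    if address is not None:
        params.append(("address", address))
    if msg_id is not None:
        params.append(("id", msg_id))

    content_type = "message/http-request" + "".join(
        '; %s="%s"' % (name, value) for name, value in params)
    # The blank line separates the MIME header from the encapsulated content.
    return "Content-Type: " + content_type + "\n\n" + rest
```

The rest of the request (the RFC822-style headers and anything after
them) is passed through untouched, so a gateway only has to look at
the parameters to recover the original first line.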
 
Define a response as content-type "message/http-response".

Required parameters:
  version:	the version of the response (e.g. "HTTP/1.0")

Optional parameters:
  status:	the numeric result (mandatory in HTTP/1.0 only, e.g. "200")
  reason:	the textual result (also HTTP/1.0 only, e.g. "OK")
  in-reply-to:	the unique identifier from the "id" parameter of the
		http-request for which this is the response

The body contains the remainder of the response; in HTTP/1.0, it will
consist of a MIME message (possibly with some additional
822-compliant headers).  In HTTP/0.9 it will not contain any
headers; therefore, a blank line will precede the text (the blank line
is the boundary between the lines of headers [of which there are none]
and the body of the encapsulated message).
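
The response mapping, including the blank-line convention for
HTTP/0.9, might be sketched in Python as follows.  The function name
encapsulate_response is hypothetical, and telling the two versions
apart by checking for a leading "HTTP/" is a simplifying assumption of
this sketch.

```python
def encapsulate_response(raw_response, in_reply_to=None):
    """Sketch: wrap a raw HTTP response as a message/http-response entity."""
    if raw_response.startswith("HTTP/"):
        # HTTP/1.0: move the status line into the parameters.
        status_line, _, rest = raw_response.partition("\n")
        parts = status_line.strip().split(None, 2)
        if len(parts) < 2:
            raise ValueError("unrecognized status line: %r" % status_line)
        reason = parts[2] if len(parts) > 2 else ""
        params = [("version", parts[0]), ("status", parts[1]),
                  ("reason", reason)]
        body = rest
    else:
        # HTTP/0.9: no status line and no headers.  A blank line stands in
        # for the empty header section of the encapsulated message.
        params = [("version", "HTTP/0.9")]
        body = "\n" + raw_response

    if in_reply_to is not None:
        params.append(("in-reply-to", in_reply_to))

    content_type = "message/http-response" + "".join(
        '; %s="%s"' % (name, value) for name, value in params)
    # First blank line ends the MIME header; the body follows.
    return "Content-Type: " + content_type + "\n\n" + body
```

In the HTTP/0.9 case the output therefore has two consecutive blank
lines before the text: one ending the outer MIME header and one
ending the (empty) header section of the encapsulated message.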

WHY TO DO IT THIS WAY
---------------------

Since message/rfc822 is the primary subtype of message, it seems
reasonable that user agents and (more importantly) gateways should
generally treat unknown subtypes of message as message/rfc822.

Hence, the first line of information from both the request and the
response has been removed from the body and placed in the parameters;
the first line is not syntactically a legal 822 header, and thus would
likely confuse gateways.  Gateways need to be able to understand this
subtype properly, so that they can do things like change the encoding
as necessary.  (HTTP runs over direct sockets, so "binary" is
normally available as an encoding, but this will not be the case
when using a mailbot.)  Am I correct in supposing that this
method will make gateway handling relatively painless?

Putting these things into parameters doesn't feel very philosophically
clean, though.  I suppose I could instead define a type like
"multipart/http-request" with two parts, the first being the "header"
(the leading line containing method, object and version) and the
second being the 822-compliant message.  I think the parameter method
is simpler to understand, implement, and use (though I don't know
which is philosophically cleaner).

I'm not really sure how to deal with the case of HTTP/0.9 vs. HTTP/1.0
responses, since the latter is a MIME message while the former is not.
Putting the blank line as "end of headers" in 0.9 is the best I can
think of to still allow some generality and give gateways a chance to
do the right thing regarding encoding; I'd love to hear better
suggestions.

Arguably, the id and in-reply-to parameters are redundant with the
similar headers in the message itself.  However, an MUA normally does
not make those headers available to the originating and receiving
programs, while it normally does make parameters available.

Obviously, it is best to respond with the same version as the request;
it is acceptable to respond with an earlier version than the request,
and it is forbidden to respond with a later version than the request.
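
The version rule above could be applied by a server or mailbot along
these lines.  This is only a sketch: the names parse_version and
choose_response_version are hypothetical, and it assumes version
labels of the form "HTTP/major.minor".

```python
def parse_version(label):
    """Turn a label such as "HTTP/1.0" into a comparable tuple (1, 0)."""
    _, _, number = label.partition("/")
    major, _, minor = number.partition(".")
    return int(major), int(minor)


def choose_response_version(request_version, server_versions):
    """Pick the latest supported version that is not later than the request's."""
    limit = parse_version(request_version)
    # Never answer with a later version than the one the request used.
    allowed = [v for v in server_versions if parse_version(v) <= limit]
    if not allowed:
        raise ValueError("no supported version at or below " + request_version)
    # Prefer the same version as the request; otherwise the closest earlier one.
    return max(allowed, key=parse_version)
```

For example, a server that speaks both versions would answer an
HTTP/0.9 request with HTTP/0.9, never with HTTP/1.0.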

SOME EXAMPLES
-------------

A simple dialogue with a mailbot to do an HTTP/1.0 poll:

  To: mailbot
  From: joe_user@somewhere.com
  MIME-Version: 1.0
  Content-Type: message/http-request; address="info.cern.ch:80"; 
		method="GET"; object="/hypertext/WWW/TheProject.html";
		version="HTTP/1.0"; id="1234@somewhere.com"

  From: joe_user@somewhere.com
  Accept: text/plain, text/html, application/postscript, image/gif
  User-Agent: 
  [ this is a blank line, and it is significant ]

and the mailbot would respond with something like this:

  From: mailbot
  To: joe_user@somewhere.com
  MIME-Version: 1.0
  Content-Type: message/http-response; version="HTTP/1.0";
		in-reply-to="1234@somewhere.com"; status=200;
		reason="Document follows"

  MIME-Version: 1.0
  Content-Type: text/html
		
  <HEADER>
  <TITLE>The World Wide Web project</TITLE>
  <NEXTID N="59">
  </HEADER>
  <BODY>
  <H1>World Wide Web</H1>The WorldWideWeb (W3) is a wide-area<A
  [ ... ]
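
To show how a receiving program might read these parameters, here is
a small sketch using Python's standard email package on a shortened
copy of the request example (the Accept line is cut down to one type
and the empty User-Agent line is left out).  As far as I know the
parser treats any message/* type as containing a nested message, the
way it would message/rfc822; take that as an assumption about this
particular library rather than something the proposal guarantees.

```python
from email import message_from_string

# A shortened copy of the request example, as the mailbot would receive it.
text = (
    "To: mailbot\n"
    "From: joe_user@somewhere.com\n"
    "MIME-Version: 1.0\n"
    'Content-Type: message/http-request; address="info.cern.ch:80";\n'
    '\t\tmethod="GET"; object="/hypertext/WWW/TheProject.html";\n'
    '\t\tversion="HTTP/1.0"; id="1234@somewhere.com"\n'
    "\n"
    "From: joe_user@somewhere.com\n"
    "Accept: text/plain\n"
    "\n"
)

msg = message_from_string(text)

# Rebuild the first line of the HTTP request from the parameters.
request_line = " ".join(
    [msg.get_param("method"), msg.get_param("object"), msg.get_param("version")])

# The encapsulated request headers are parsed as a nested message.
inner = msg.get_payload(0)

print(msg.get_content_type())
print(request_line)
print(inner["Accept"])
```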

Naturally, these objects could be encapsulated inside other objects,
though the only use for that which I can think of would be making the
request or response the secured object of a multipart/pem message.

Any comments, reactions, flames, or other thoughts?
--
Marc VanHeyningen  mvanheyn@cs.indiana.edu  MIME, RIPEM & HTTP spoken here