Holding connections open: an immodest proposal

HALLAM-BAKER Phillip (hallam@dxal18.cern.ch)
Wed, 14 Sep 1994 11:54:39 +0200

This is a more than modest proposal; it is very long. I have stolen everyone's
ideas that I could and not given any credit for them. This has been churning
round for ages, but I think that we are now at the point where we can progress.

For more details see the now somewhat outdated:

http://alephinfo.cern.ch/AL2F01$DKA100/hallam/INFOGEN/WWW/SRC/http_development.html

The protocol synthesiser referred to is not in use at the moment. There
is, however, a synth for various MIME-related objects. The DOOM synth may be
resurrected once the MIME handling issues are sorted out.

http://ptsun00.cern.ch/home2/hallam/WWW/Mime/Specification/libraries.html

It is important to distinguish two cases :-

1) Loading all data segments associated with an object (e.g. HTML + inline images)

2) Continuous mode connection for real-time response.

Case 1 is best solved through use of the MIME multipart type. The browser does
a request and gets back the complete object as a single document, inline images
and all. This is currently being added to the library, but slowly :-(
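As a rough sketch of the shape of such a reply (the boundary string, types and
lengths are made up purely for illustration):

HTTP/1.0 200 Document follows
Content-Type: multipart/mixed; boundary="xyz"

--xyz
Content-Type: text/html
Content-Length: 1500

...the html text...
--xyz
Content-Type: image/gif
Content-Length: 4200

...the inline image data...
--xyz--

The browser unpacks the parts and has the page and its inline images after a
single round trip.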

There are two ways of doing this :

1) The server sends back everything as a unit
2) The client requests the inline images separately.

The server is actually in the best position to know whether an image
is specific to one HTML file or shared by many. Thus let the user decide whether
to run the MIME packer on a file or not. If the images are zipped up all
in a single fred.mime then they will always be sent together. This can also
be done on the fly if a .mime is requested of a file only stored as .html,
though this is a server special.
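As an illustration of the on-the-fly case (fred.mime here is just the packed
file described above):

GET /hallam/fred.mime http/1.0

An extended server holding only fred.html and its images could build the
multipart reply on the fly; a plain server would either return a pre-packed
fred.mime or report that no such file exists.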

The second method requires a slight change to the specs. Where we have at the
moment

GET /path/fred.html http/1.0

I want to have

GET /path/ http/1.0
Relative-URI: fred.html
Relative-URI: jim.html

This allows multiple requests in one GET. Note that only a single response
is returned, a multipart MIME message. I already use the Relative-URI tag in
Shen security. The user sends the line

NULL / http/1.0
DEK-INFO: DES, 02361371238
Secret-Header: uuencoded-des-encrypted-header

Where the decode of the secret header has the relative URL in it. NB, this
scheme allows the choice of giving the URL on the command line, which is very
useful when proxying through non-Shen-compliant servers.

A second method of doing MGET is to permit wildcarding in a URL. For example,
it would be nice to be able to specify a hierarchy of directories, as is
possible under VMS.

[hallam...]
/hallam///

To me it looks like the only way of doing this extension in a compatible
manner is to use a triple slash. Weenie UNIX servers would then return the
root directory only. Extended servers would send back the tree. We saw this in
Hyper-G yesterday and it was very nice. Yes, I know that the UNIX rules for
filename relativity may break, but there is no reason why WWW URLs should
be slaves to UNIX. Since few people are using triple slashes at the moment I
suggest that we have an opportunity for extension without backwards
compatibility problems.

I would like to have a page /hallam///*.html
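On the wire that might look like the following, under the triple-slash
convention suggested above (the syntax is illustrative only):

GET /hallam///*.html http/1.0

An extended server would send back every matching .html below /hallam as a
single multipart message; a weenie server would fall back to returning the
single directory.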

I would also like to have a command:-

copy http://ptsun00.cern.ch/hallam/WWW///* http://dxal18.cern.ch/hallam/WWW

[We now know why the UNIX commands are two letters - they provide so little
functionality it is only worth typing half the number of characters. Copy
should work over the whole network in a transparent manner just as it does in
DECNET]

This implementation is a minimal one requiring no substantial changes to the
architecture of the likes of Mosaic. To go to continuous connection is rather
more radical, since the browser should be capable of receiving async messages.

Case 2 is really a second protocol even though it may be a superset of HTTP;
i.e. we expect to use all the same specs, except that a content length is
mandatory for every block sent. This allows for conferencing and MUD
connections and is in practice a replacement for telnet.
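A sketch of a single block in such a stream (header names and the length are
illustrative; the point is that every block carries its own length, so the
connection never has to be closed to mark the end of a message):

Content-Type: text/html
Content-Length: 45

<p>Player enters the room from the north.</p>

Further blocks follow on the same connection whenever either end has something
to send.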

Even here it is not strictly necessary to allow multiple GETs. A POST method
with duplex transmission of MIME multipart messages would suffice. I suspect
that a different method (DUPLEX) is justified though.
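Purely as an illustration of the shape such an exchange might take (the DUPLEX
name is the one suggested above; every header and value is made up):

DUPLEX /conference/www-talk HTTP/2.0
Accept: text/html
Content-Type: multipart/mixed; boundary="blk"

After this both ends send body parts, each with its own content length, for as
long as the session lasts.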

The original idea of HTTP was that you did a single send and a single receive
to obtain the object you want. NNTP and FTP negotiation is pretty futile and
the continuous connection stuff is a real pain. We certainly do not need to do
an FTP-style second connection simply to provide MGET.

I sketched out a suggestion for extending the HTTP protocol to multiple
transactions for the first conference. To summarise :-

1) Ideally we want another protocol to do transactions BUT we also have to
support TCP/IP. So put TTCP etc. in the pending tray.

2) Semantics for Accept headers should default to the first set specified.
The user agent can't really change much during the transaction, nor
are many of the other headers relevant, so the simplest case is just
to default everything else to the last set of headers sent.

3) For multipart MIME, the content length is not required for the outermost
body but must be there for each of the enclosed objects.

4) Need transaction methods (a sketch of a transaction using them follows
this list):-
START (Implicit at begin of transaction)
COMMIT
ROLLBACK

NULL - for sending headers etc. This method is needed by the security
extensions in any case.

5) A transaction is rolled back if the connection fails.
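A sketch of how a complete transaction might run over one connection, using
the methods listed in point 4 (the URLs, headers and browser name are
illustrative; START is implicit, so it never appears on the wire):

GET /hallam/fred.html HTTP/2.0
Accept: text/html, image/gif
User-Agent: IllustrativeBrowser/0.1

GET /hallam/jim.html HTTP/2.0

COMMIT / HTTP/2.0

The second GET omits its headers and simply inherits the ones already sent,
each reply is a MIME body with its own content length, and if the connection
drops before the COMMIT the whole transaction is rolled back.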

It is important to distinguish the original point of HTTP's single-shot nature
from the need for continuous connections to provide functionality. FTP and
SMTP do not need to keep the connection open; the fact that they do slows down
the process significantly. In the same way, SMTP should be single-shot.

Single-shot telnet, on the other hand, is not a good idea. For a realtime
system there must be a means of sending unsolicited data. Here we are expanding
HTTP into a hyperterminal protocol. Let's face it, telnet is a pretty ugly
protocol; it doesn't even support my VT100 in a satisfactory manner, let alone
my X terminal. I want to connect to my computer in a seamless editing session
with high quality fonts and device independence. A terminal sending an HTML+
stream can provide high quality text, maths and images. Plus you can use the
security system to avoid telnet's habit of sending plaintext passwords!

But we do not want to go back to the system whereby, to get a news article,
you have to send four commands, get back four responses and keep a channel
open, blocking other users from using the system and using resources
needlessly. FTP's throw-you-out-whenever-possible behaviour is pretty stupid.

One point:
BURN PRAGMAS !
PRAGMAS ? - JUST SAY NO.

Pragma: keep-alive is NOT acceptable. You are modifying the protocol completely
and not even using a tag. METHOD /url/ HTTP/2.0 is the only acceptable solution.
Continuous-connection HTTP is a different protocol and MUST be announced as
such. It is a major revision to the protocol because it does not guarantee
backwards compatibility. The requirements for HTTP/2.0 on sending tags will be
stricter. Although a 2.0 server will be interoperable with a 1.0 one, a valid
1.0 message will not necessarily be a valid 2.0 one.
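As a sketch of what announcing the new protocol might look like (only the
version number matters here; the rest is illustrative):

GET /hallam/fred.html HTTP/2.0
Accept: text/html

A server that only speaks 1.0 can presumably still reply with an HTTP/1.0
response and drop the connection, while a 2.0 server knows it may hold the
connection open for further methods.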

We can create tags as and when needed. Either a tag has meaning or it does not.
Originally Ari started the pragma because it was to do with the proxy server,
which at the time was seen as being somehow peripheral to the grand scheme.
Now we know that proxy servers are very important and in fact a central concept.
Pragma: NoCache should become Prohibit: Cache as soon as possible.


Summary:
Yes, we want both connectionless and continuous-connection HTTP. Possibly the
latter has a different name, but the spec is pretty much the same.

--
Phillip M. Hallam-Baker

Not Speaking for anyone else.