Re: File upload in HTML forms

HALLAM-BAKER Phillip (hallam@dxal18.cern.ch)
Fri, 30 Sep 1994 22:55:47 +0100

In article <89B0@cernvm.cern.ch> you write:

|>In message <2E8AD45E@MSMAIL.INDY.TCE.COM>, Fisher Mark writes:
|>>
|>>2. Define "Content-Type: multipart/file; name=X" where X is the suggested
|>>filename for that part. Agents (mail user agents, WWW browsers, etc.) are
|>>free to rename the file as they deem fit or to reject filenames if they
|>>violate filesystem access permissions (like "Content-Type: multipart/file;
|>>name=/etc/passwd");
|>>
|>>3. Allow the "Content-Type:" header to occur when a "Packet:" occurs;
|>
|>This is mixing metaphors. For one thing, you can't call it
|>multipart/* without using the multipart syntax, including
|>boundaries. And you can't stick any content-transfer-encoding except
|>7bit inside multipart/*. The MIME spec frowns on complex encodings
|>(e.g. base64 wrapped up in base64).
|>
|>Did you see my proposal for aggregate/mixed etc? This allows
|>the features you're after.

I agree that the aggregate/mixed stuff looks interesting, but not about the
MIME stuff. In HTTP we junk parts of IETF specs that are simply kludges
to allow the stuff to operate in environments that are not 8 bit clean.

HTTP is 8 bit clean. It extends the use of Content-Length: tags and
8 bit encoding inside MIME multiparts. In the same way the security
extensions, Shen, which are effectively PEM, do not encode the body in
base64; it goes 8 bit clean.
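
To make the 8 bit clean point concrete, here is a minimal sketch in C of a
body part framed by Content-Length: and carried as raw bytes, with no base64
and no boundary scanning. The names write_part, read_part and find are
hypothetical helpers invented for this illustration, not anything from
libwww, and the header matching is deliberately simplified.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write headers, a blank line, then the raw bytes.
 * Returns the total bytes written, or 0 if out is too small. */
size_t write_part(char *out, size_t cap, const char *type,
                  const unsigned char *body, size_t len)
{
    int n = snprintf(out, cap,
                     "Content-Type: %s\r\nContent-Length: %zu\r\n\r\n",
                     type, len);
    if (n < 0 || (size_t)n >= cap || len > cap - (size_t)n)
        return 0;
    memcpy(out + n, body, len);   /* body is copied untouched, 8 bit clean */
    return (size_t)n + len;
}

/* Bounded search, because the buffer may contain NUL bytes. */
static const char *find(const char *hay, size_t n, const char *needle)
{
    size_t m = strlen(needle);
    for (size_t i = 0; i + m <= n; i++)
        if (memcmp(hay + i, needle, m) == 0)
            return hay + i;
    return NULL;
}

/* Hypothetical helper: locate the body using Content-Length alone.
 * Simplified: case-sensitive header name, no overflow check on the value.
 * Returns a pointer to the body and sets *len, or NULL on malformed input. */
const unsigned char *read_part(const char *in, size_t n, size_t *len)
{
    const char *end = find(in, n, "\r\n\r\n");
    if (end == NULL)
        return NULL;
    const char *cl = find(in, (size_t)(end - in), "Content-Length: ");
    if (cl == NULL)
        return NULL;
    cl += strlen("Content-Length: ");
    size_t v = 0;
    while (cl < end && *cl >= '0' && *cl <= '9')
        v = v * 10 + (size_t)(*cl++ - '0');
    const char *body = end + 4;
    if (v > n - (size_t)(body - in))
        return NULL;              /* declared length exceeds the data */
    *len = v;
    return (const unsigned char *)body;
}
```

Because the reader trusts the declared length, the body may contain NUL,
CR-LF or even something that looks like a boundary without confusing it.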

|>>4. Force a blank line:
|>> printf("%c%c", CR, LF);
|>>between these headers and the packet contents; and
|>
|>Blech. Parsing of packets is supposed to be quick and dirty.
|>Don't muck it up with header/body parsing.
|>
|>>The resulting data would look like this for two files, each with one line of
|>>text terminated by a CR-LF pair:
|>>
|>> Content-Transfer-Encoding: packet
|>> Content-Type: multipart/file; name="capital.txt"
|>> Packet: 28
|>>
|>> ABCDEFGHIJKLMNOPQRSTUVWXYZ
|>> Content-Transfer-Encoding: packet
|>> Content-Type: multipart/file; name="lower.txt"
|>> Packet: 28
|>
|>Your Packet: header is exactly the Content-Length: header in sheep's
|>clothing! You're back to the old problem of having to know how big the
|>body is before you write the headers of the body part.

Not if you send multipart/partial. Then you send out chunks as you can.
It is not pretty but it does at least avoid having to create a new packet
scheme. I think that that scheme belongs in an optimised Binary-HTTP
(B-HTTP). Optimising packet transport without sorting out the RFC-822
nonsense makes little sense. I would prefer to define an optimisation
for the complete set of IETF protocols, a binary mode and as part of that
derive a packet form of the multipart schemes.
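
As a rough illustration of sending chunks as you can, the sketch below
formats the header for one fragment. It assumes the type meant above is
MIME's message/partial, whose parameters are id, number and total, with
total required only on the final fragment. The function name
partial_header and the id value in the usage note are made up for this
example.

```c
#include <stdio.h>

/* Hypothetical helper: format the Content-Type line for one fragment.
 * total == 0 means the sender does not yet know how many fragments there
 * will be; MIME requires total only on the final fragment.
 * Returns the header length, or -1 if it does not fit in out. */
int partial_header(char *out, size_t cap, const char *id,
                   unsigned number, unsigned total)
{
    int n;
    if (total != 0)
        n = snprintf(out, cap,
                     "Content-Type: message/partial; id=\"%s\"; number=%u; total=%u\r\n",
                     id, number, total);
    else
        n = snprintf(out, cap,
                     "Content-Type: message/partial; id=\"%s\"; number=%u\r\n",
                     id, number);
    return (n >= 0 && (size_t)n < cap) ? n : -1;
}
```

A sender would call this with total set to 0 for each chunk as it becomes
available, for instance partial_header(h, sizeof h, "conf-1@example.org", 1, 0),
and fill in total only on the last one, so the overall length never has to
be known up front.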

But this is way in the future, at least three months. First we need to get
some other RFCs done: an HTTP2.0 describing current practice, and a trio
consisting of architecture description, security issues and multi method
extensions (MM-HTTP). B-HTTP would be an orthogonal global optimisation,
possibly ASN.1 ish in flavour.

The reason I do not want a mixed binary/rfc-822 scheme is that the code
gets much more complex. I have a scheme for piecewise inserting synthetic MIME
parsers into libwww and leveraging binary transport via a rewrite of the
synthesizer. The result would be an RFC-822 header and binary encapsulation
browser.

I would expect the main scenario to be using P-HTTP (plain HTTP) to establish
the connection for a conference (the main case of not knowing the length in
advance). The accept would mention that it accepted B-HTTP encoding to
a particular spec. The other side would grok it and issue a switch
protocol tag and henceforth the rest of the multi-method session would be
B-HTTP. Other tricks to play with this would be to use HTTP authentication
to establish a connection then pass the secure connection (plus shared
session secret) to another protocol such as X-11. This would be a neat
method of exporting HTTP armouring to other protocols.

Hopefully this would cover the comments made by Simon Spero re efficiency.

I would also like to see other WWW interfaces hardened in the same way,
i.e. BGI should be congruent to CGI and conversion from one to the other
should merely be a jacket routine. This is very important when you want
a service to run in the client and not force the user to go via a server.

--
Phillip M. Hallam-Baker

Not Speaking for anyone else.