Re: Internet draft for 'file upload' feature proposal

Daniel W. Connolly (connolly@hal.com)
Fri, 11 Nov 94 12:22:08 EST

In message <94Nov10.172647pst.2760@golden.parc.xerox.com>, Larry Masinter writes:
>Dan said:
>>In HTML 2.0? In a future revision?
>
>Given the fallback strategy outlined for server implementors, it
>*might* be practical to include this in HTML 2.0.

Hmmm... I like Stu's suggestion of having the HTML 2.0 spec refer
to this draft, but not include it.

I'd like HTML 2.0 to have the characteristic that if a document is
HTML 2.0 valid, then it works today on most browsers -- specifically,
all commercially supported browsers. There are some corner cases where
this is not true (comment syntax, attribute value literal syntax...)
but it is a widely held perspective, and it is largely true.
Including TYPE=FILE in 2.0 would be counter to that goal.

>I've been concerned that the HTML 2.0 specification is going to
>proposed standard using "application/x-www-form-urlencoded" as the
>default ENCTYPE for form data; it doesn't seem consistent to propose a
>standard that uses a "x-" type.

The way I see it, x-www-form-urlencoded is not the name of a standard
(or proposed standard) internet media type. But it _is_ the default
value for the ENCTYPE attribute, since it's the default behaviour of
browsers. Hmmm... perhaps it would be better to have ENCTYPE default
to #IMPLIED, but I don't think so. Hmmm... in fact, is ENCTYPE supported
at all in current browsers? What happens if you write ENCTYPE="text/plain"?
I suspect current browsers just ignore it.
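
To make the default concrete, here is a minimal sketch in Python of what
the x-www-form-urlencoded encoding amounts to. The helper name encode_form
and the sample field names are made up for illustration. urlencode comes
from the Python standard library and only approximates what any given
browser actually sends.

```python
from urllib.parse import urlencode

def encode_form(fields):
    """Encode (name, value) pairs in the conventional form-submission style:
    spaces become '+', reserved characters become %XX escapes, and the
    pairs are joined with '&'."""
    return urlencode(fields)

# Hypothetical field names, for illustration only.
print(encode_form([("name", "Dan Connolly"), ("q", "a&b")]))
# prints: name=Dan+Connolly&q=a%26b
```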

By the way... did anyone do any testing to investigate the behaviour
of Mosaic and friends when ACTION isn't specified? Does the alleged
default processing really happen?

>The proposal says:
>>File Transmission from WWW Browsers to Servers
>
>Dan said:
>> Does it bother anybody else that the term "WWW Browser" would be used
>> in specifications when it has no well-defined meaning?
>
>I think what I want to name is "the entity that interprets HTML forms
>and allows clients to interact with it". This is not a "HTTP client".
>Perhaps you could call it a "HTML client", but then, HTML isn't in
>itself a client/server protocol, but merely an element used by several of
>them. As I said in response to Liam Quin's message, I could call this
>a "WWW client", or a "HTML interpreter".

"User agent" is the phrase used in other RFCs. Now: are we talking
about a WWW user agent, an HTTP user agent, or an HTML user agent?

Perhaps it's time to coin the term "WWW user agent" and give it a
definition. It seems like a useful term for this file upload proposal,
as well as a critical term for the HTML spec.

Eventually, somebody should write the much-alluded-to "browser spec"
and formally define the beasty. But for now, something like:

===

Since the application that processes HTML forms may support a number
of protocols, we use the term _WWW user agent_ to refer to the part of
this distributed hypermedia system that allows users to visit nodes in
the system and processes the users' navigation requests. For example,
NCSA Mosaic, lynx, Chimera, and Netscape are popular WWW user agents.

===

>The draft says:
>>
>> Currently, a World-Wide Web server can get information from users
>> with HTML forms. These forms have proven useful in a wide variety
>> of applications in which input from the user is necessary. But this
>> capability is still greatly limited because HTML forms don't provide
>                                              ^^^^^^^^^^
>> a way for the user to submit files to the server.
>
>Dan replied:
>
>> I'd feel better if you said "WWW browsers don't provide a way... ," as
>> there's nothing in HTML itself that prevents folks from doing file
>> upload.
>
>But this is not true. The lack is actually in HTML itself: there is no
>way to write HTML that will cause the HTML interpreter to ask the user
>for a file of data. There's a way to cause it to ask the user for some
>text, for the user to select between multiple alternatives, etc.
>
>The wording should more precisely say: "there is no way to write a
>HTML form that will cause the HTML interpreter to ask the user for a
>file."
>
>It is in fact a 'lack' in HTML, and not merely a lack in the browsers.

Adding TYPE=FILE to the INPUT element is a sufficient, but not
necessary, means to enable file upload on the web.

I could just as easily propose that all WWW user agents add a
menu item called "Upload Files." When the user selects "Upload Files,"
they get a file selection box; they select some files, click OK,
and the user agent does a POST to the address of the current document.

Hence no changes to HTML are _necessary_ to support file upload.

File upload _could_ be implemented similarly to the way annotations
were done. It could work like the "reply" or "followup" feature of
mailers and newsreaders.

I am convinced that INPUT TYPE=FILE is a good idea, but you cannot say it's
the only way to get the job done.

>> The
>> "content-transfer-encoding" for each part should be "binary".
>
>> Implicitly or explicitly? (Explicitly, I take it.) It might be clearer
>> to say "each part should be given a content-transfer-encoding of binary."
>
>I would actually prefer to make the default for
>content-transfer-encoding depend on the context. If the ACTION is a
>mailto:, it should correspond to the MIME default, while if the ACTION
>is a http: URL, it could be binary.

I don't think the special casing is worth the trouble. In fact,
I suspect that if you allow folks to leave out the explicit c-t-e
in HTTP transfers, they'll do the same thing for mailto:, and things
will be broken.
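
As a rough illustration of "explicit c-t-e on every part," here is a
minimal Python sketch that builds a multipart body and always writes the
Content-Transfer-Encoding header, whatever the ACTION scheme is. The
function name build_multipart, the boundary string, and the field names are
all hypothetical. The Content-Disposition layout follows the convention
later written down in RFC 1867 and is an assumption here, not something
taken from the draft under discussion.

```python
def build_multipart(parts, boundary):
    """Build a multipart body from (name, filename, content_type, payload)
    tuples. filename may be None for ordinary fields; payload is bytes.
    Every part carries an explicit Content-Transfer-Encoding header."""
    out = bytearray()
    for name, filename, ctype, payload in parts:
        disposition = f'form-data; name="{name}"'
        if filename is not None:
            disposition += f'; filename="{filename}"'
        out += f"--{boundary}\r\n".encode("ascii")
        out += f"Content-Disposition: {disposition}\r\n".encode("ascii")
        out += f"Content-Type: {ctype}\r\n".encode("ascii")
        # Stated explicitly rather than left to a context-dependent default.
        out += b"Content-Transfer-Encoding: binary\r\n"
        out += b"\r\n"
        out += payload + b"\r\n"
    out += f"--{boundary}--\r\n".encode("ascii")  # closing delimiter
    return bytes(out)

# Hypothetical fields and boundary, for illustration only.
body = build_multipart(
    [("note", None, "text/plain", b"hi"),
     ("upload", "a.bin", "application/octet-stream", b"\x00\xff")],
    "XyZ",
)
```

A real sender would also have to pick a boundary that does not occur in any
payload; the sketch leaves that check out.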

>(re: backward compatibility issue)
>> Is this part just an elaborate suggestion, or a proposed specification?
>> In other words, are we all supposed to implement x-please-send-files
>> the same way, or are folks supposed to work out their own solution along
>> these lines?
>
>I'm not sure that I know what status it is. Those who want to
>participate in this should all implement x-please-send-files the same
>way. I thought the representation of the data in the body was
>reasonably well specified, but I can work to make it more precise if
>you can point out a way in which it is ambiguous.

Looking at it again, I see that it is complete. It could use an example
to prevent folks from misinterpreting what you wrote. For example:

> * The entire original application/x-www-form-urlencoded form data

I might take that to mean the whole application/x-www-form-urlencoded
body part, headers and all. I don't think that's what was intended,
but a little redundancy to make things clear might help.

Dan