Comments on File Upload

Francois Yergeau (yergeau@alis.ca)
Wed, 2 Aug 95 17:09:33 EDT

In the midst of working on the HTML i18n draft, I reread the
file-upload draft (draft-ietf-html-fileupload-02.txt) carefully with
regard to i18n issues, and have a few comments to make.

First, this stuff is a Good Thing for i18n.

Nevertheless, there are a couple of things that, IMHO, need tweaking:

>3.2 Action on submit
>
> When the user completes the form, and selects the SUBMIT element,
> the browser should send the form data and the content of the
> selected files. The encoding type application/x-www-form-urlencoded
> is inefficient for sending large quantities of binary data

Nit: add 'or text containing non-US-ASCII characters' at the end of
that last sentence.

>3.3 use of multipart/form-data
>
> [...]
> Each part has an
> optional Content-Type (which defaults to text/plain).

Change to:

Each part has an optional Content-Type, which defaults to
"text/plain; charset=ISO-8859-1", except when the ACTION is a
"mailto:" URL, where the default is "text/plain; charset=US-ASCII".

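To make the proposed defaulting rule concrete, here is a minimal
sketch in Python. The function name and constants are hypothetical
(they appear in neither draft); the only thing taken from the wording
above is the rule itself: US-ASCII for "mailto:" actions, ISO-8859-1
otherwise.

```python
from urllib.parse import urlsplit

# Defaults taken from the proposed wording; the names are illustrative only.
HTTP_DEFAULT = "text/plain; charset=ISO-8859-1"
MAIL_DEFAULT = "text/plain; charset=US-ASCII"


def default_content_type(action: str) -> str:
    """Return the Content-Type assumed for a part that declares none."""
    # urlsplit() lower-cases the scheme, so "MAILTO:" also matches.
    if urlsplit(action).scheme == "mailto":
        return MAIL_DEFAULT
    return HTTP_DEFAULT
```

A receiver would apply this only to parts that carry no Content-Type
header of their own.
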
> The "content-transfer-encoding" header should be supplied for all
> fields whose values do not conform to the default 7BIT encoding.
> (All characters 7-bit US-ASCII data with lines no longer than 1000
> characters.) Otherwise, file data and longer field values may be
> transferred using a content-transfer-encoding appropriate to the
> protocol of the ACTION in the form. For HTTP applications,
> content-transfer-encoding of "binary" may be use. If the ACTION is
> a "mailto:" URL, then the user agent may encode the data
> appropriately to the mail transport mechanism. [See section 5 of
> RFC 1521 for more details.]

Please let's not impose old SMTP requirements on the Web. Change that
to:

Each part's data should be transferred using a
content-transfer-encoding appropriate to the protocol of the ACTION
in the form. For HTTP applications, content-transfer-encodings of
"7bit", "8bit" or "binary" may be used, all meaning the same thing
(i.e. no encoding), and thus requiring no
"Content-Transfer-Encoding" header; such a header should be
provided, however, if some other transfer-encoding is applied. If
the ACTION is a "mailto:" URL, then the user agent should encode the
data appropriately for the mail transport mechanism (see section 5 of
RFC 1521 for more details) and provide a suitable
"Content-Transfer-Encoding" header as necessary.

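The proposed rule can be sketched as follows, again in Python. This is
an illustration under stated assumptions, not a definitive
implementation: the function names are hypothetical, base64 is picked
arbitrarily as the mail-safe encoding (quoted-printable would do as
well), and the 7bit check is simplified to "no NUL, no byte above 127,
no line longer than the 1000 characters quoted from the draft".

```python
import base64
from urllib.parse import urlsplit

MAX_LINE = 1000  # line-length limit as quoted from the draft


def is_7bit(data: bytes) -> bool:
    """Simplified 7bit test: US-ASCII only, no NUL, short lines."""
    if any(b > 127 or b == 0 for b in data):
        return False
    return all(len(line) <= MAX_LINE for line in data.split(b"\r\n"))


def encode_part(action: str, data: bytes):
    """Return (Content-Transfer-Encoding value or None, body bytes).

    None means no Content-Transfer-Encoding header is needed.
    """
    if urlsplit(action).scheme == "mailto":
        if is_7bit(data):
            return None, data  # already mail-safe
        # Assumption: base64 chosen as the mail-safe encoding.
        return "base64", base64.encodebytes(data)
    # HTTP and the like: send the bytes untouched, no header required.
    return None, data
```

For an HTTP ACTION the data always passes through unchanged; only the
"mailto:" branch ever produces a header value.
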
>7. Registration of multipart/form-data
>
> [...]
> The name of the field
> is restricted to be a set of US-ASCII graphic characters;
>
> [...]
>
> Note that mime headers are generally required to consist only of
> 7-bit data in the US-ASCII character set. This specification thus
> requires that the field names used consist of 7-bit ascii US
> characters.

Unnecessarily restrictive; field names are CDATA in HTML. Strike out
the first sentence and change the second paragraph to something like:

Note that MIME headers are generally required to consist only of
7-bit data in the US-ASCII character set. Hence field names should
be encoded according to the prescriptions of RFC 1522 if they
contain characters outside that set.

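For illustration, an RFC 1522 encoded-word for a non-ASCII field name
can be produced with Python's standard email.header module. The helper
name and the ISO-8859-1 default are assumptions made for this sketch;
it shows only the encoded-word form, not how such a word would be
placed in a part's headers.

```python
from email.header import Header, decode_header


def encode_field_name(name: str, charset: str = "iso-8859-1") -> str:
    """Return name unchanged if pure US-ASCII, else as an encoded-word."""
    try:
        name.encode("ascii")
        return name
    except UnicodeEncodeError:
        # Header.encode() emits the =?charset?encoding?text?= form.
        return Header(name, charset).encode()


def decode_field_name(value: str) -> str:
    """Reverse encode_field_name() on the receiving side."""
    parts = []
    for text, charset in decode_header(value):
        if isinstance(text, bytes):
            text = text.decode(charset or "ascii")
        parts.append(text)
    return "".join(parts)
```

Plain ASCII names pass through untouched, so existing forms are
unaffected.
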
> The "content-transfer-encoding" header should be supplied for all
> fields whose values do not conform to the default 7BIT encoding
> (all characters 7-bit US-ASCII data with lines no longer than 1000
> characters.)
>
> Otherwise, file data and longer field values may be
> transferred using a content-transfer-encoding appropriate to the
> protocol of the ACTION in the form. For HTTP applications,
> content-transfer-encoding of "binary" may be use. If the ACTION is
> a "mailto:" URL, then the user agent may encode the data
> appropriately to the mail transport mechanism. [See section 5 of
> RFC 1521 for more details.]

Same comment as for 3.3 above.

-- 
François Yergeau <yergeau@alis.ca>
Alis Technologies Inc., Montréal
Tél: +1 (514) 738-9171
Fax: +1 (514) 342-0318