Re: CERN httpd and libwww 2.14 released

robm@ncsa.uiuc.edu (Rob McCool)
Message-id: <9311172112.AA04016@void.ncsa.uiuc.edu>
From: robm@ncsa.uiuc.edu (Rob McCool)
Date: Wed, 17 Nov 1993 15:12:03 -0600
In-Reply-To: luotonen@ptsun00.cern.ch (Ari Luotonen)
       "Re: CERN httpd and libwww 2.14 released" (Nov 17,  9:21pm)
X-Mailer: Mail User's Shell (7.2.5 10/14/92)
To: luotonen@ptsun00.cern.ch (Ari Luotonen), phillips@cs.ubc.ca
Subject: Re: CERN httpd and libwww 2.14 released
Cc: www-talk@nxoc01.cern.ch
/*
 * Re: CERN httpd and libwww 2.14 released  by Ari Luotonen (luotonen@ptsun00.cern.ch)
 *    written on Nov 17,  9:21pm.
 *
 * 
 * > For simple scripts, escaping and parsing is what you want and it
 * > does make sense.  For complex scripts, you really want the server
 * > to touch the URL as little as possible.  It doesn't know what your
 * > escaping scheme is and it shouldn't know because your script will
 * > have to be responsible for escaping URLs in the HTML output.
 * > Moreover, your escaping scheme may not be the "standard" one
 * > (for http: scheme URLs, there's no requirement you use % hex escapes).
 * > I often do escaping which keeps the URLs smaller and more human
 * > decodable (e.g., translating ' ' -> '_').
 * 
 * ??? I thought HTTP doc specifies how to escape illegal characters
 * in URL?
 * 
 * Tony's private message, however, pointed out that in future my current
 * _parsing_ scheme can lose information.  I still would like casual
 * script programmers to have a very clear and easy interface that is not
 * burdened with the ability of coping with every possible future feature.
 * 
 * So what I was thinking of just this afternoon was that the _script_
 * should be able to request the server to call it in the way it (script)
 * wants, and not vice versa.  This could be done e.g. by filename extensions.
 * This way the script could, at its own will, be called with raw URL,
 * or with pre-parsed URL (which is very nice for 90% of the cases).
 * Script could also ask URL to be passed to it from stdin instead of
 * command line.

File name extensions may work. My approach has been to provide C code and an
externally callable program to unescape the URLs.
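
As an illustration only (this is not the code shipped with NCSA httpd), a
minimal sketch of what such an unescaping routine might look like follows. It
assumes the standard % hex escape scheme; the names unescape_url and hex_value
are hypothetical, chosen just for this example.

```c
/* Return the numeric value of one hex digit, or -1 if c is not one. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Decode %XX escapes in place and return s.
 * Malformed escapes (e.g. "%zz" or a trailing "%") are copied through
 * unchanged. Note that "%00" decodes to a NUL byte, which ends the
 * C string early; a real server would have to decide how to treat it.
 */
char *unescape_url(char *s)
{
    char *src = s;
    char *dst = s;

    while (*src) {
        int hi, lo;

        /* Short-circuit evaluation keeps us from reading past the end. */
        if (src[0] == '%'
            && (hi = hex_value((unsigned char)src[1])) >= 0
            && (lo = hex_value((unsigned char)src[2])) >= 0) {
            *dst++ = (char)(hi * 16 + lo);
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return s;
}
```

Decoding in place works because the output is never longer than the input. A
script that uses its own escaping scheme (such as ' ' -> '_') would simply not
call a routine like this.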

 * By the way, I must admit I was very busy with AA when the original
 * /htbin discussion went on.  Was there a strong opposition to having
 * scripts just anywhere and not only in bindir directory;  and was there
 * any reason for having the URL start with /htbin/ (I know that can
 * be configured for NCSA server, but it is still constant once it's
 * defined)?

It used to be; now it isn't. See below.

As for the background of server scripts, there was never a discussion.
Nor was there any particular reason to pick /htbin. The script interface was
something I put together because I saw the gateway capabilities in Plexus
and thought it would be great to have them in our daemon. So I designed an
interface, implemented it, and released it. It was never intended as a
standard; it was just a feature.

 * I've been thinking of having an exec rule that would, when URL
 * matches a given template, execute a given script.  Just for
 * an example current htbin field in our rule file:
 * 
 * 	htbin /x/y/z
 * 
 * (which gives the physical bindirectory to CERN daemon) could be
 * expressed as:
 * 
 * 	exec  /htbin/*  /x/y/z/*
 * 
 * This would free us from /htbin/ being translated specially in URL,
 * and would introduce more flexibility and power to scripts.
 */

The ScriptAlias directive does that in 1.0a5. It's exactly like what you
describe above.
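
To make the idea concrete, here is a rough sketch of the prefix mapping that
an exec-style rule implies: a URL under /htbin/ is rewritten onto a physical
script directory. This is an assumption-laden illustration, not the actual
NCSA or CERN implementation; the function name map_script_url and its
parameters are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: if url starts with url_prefix, write
 * dir_prefix followed by the rest of url into out.
 * Returns 1 on a match that fits in out, 0 otherwise.
 * It does no checking for ".." components; a real server must.
 */
int map_script_url(const char *url, const char *url_prefix,
                   const char *dir_prefix, char *out, size_t out_len)
{
    size_t n = strlen(url_prefix);
    int written;

    if (strncmp(url, url_prefix, n) != 0)
        return 0;                       /* rule does not apply */

    written = snprintf(out, out_len, "%s%s", dir_prefix, url + n);

    /* Treat an encoding error or a truncated result as a failure. */
    return written >= 0 && (size_t)written < out_len;
}
```

With the rule from the quoted example, calling it with the URL /htbin/finger,
the URL prefix /htbin/ and the directory prefix /x/y/z/ would yield
/x/y/z/finger, while a URL outside /htbin/ is left for other rules to handle.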

--Rob