Re: your mail

robm@ncsa.uiuc.edu (Rob McCool)
Message-id: <9312301921.AA27026@void.ncsa.uiuc.edu>
From: robm@ncsa.uiuc.edu (Rob McCool)
Date: Thu, 30 Dec 1993 13:21:57 -0600
In-Reply-To: rst@ai.mit.edu (Robert S. Thau)
       "Re: your mail" (Dec 29,  1:54pm)
X-Mailer: Mail User's Shell (7.2.5 10/14/92)
To: rst@ai.mit.edu (Robert S. Thau),
        Charles Henrich <henrich@rs560.cl.msu.edu>
Subject: Re: your mail
Cc: www-talk@www0.cern.ch
Content-Length: 3504
/*
 * Re: your mail  by Robert S. Thau (rst@ai.mit.edu)
 *    written on Dec 29,  1:54pm.
 *
 * First off, with regard to aesthetics, de gustibus non disputandum est.  My
 * personal 'aesthetic' objection to the semicolon syntax is that it keeps me
 * from changing directories to scripts-with-path_info and back without making
 * the change in status visible in the URLs and making me change all the
 * references.  (I don't think this is a totally wild idea --- I've been
 * chewing over turning the 'people' directory on my server into a script
 * which redirects to ~.../public_html areas if they exist for the user in
 * question, and makes up a default home page if they don't).  

I agree. We need to keep the script/document distinction arbitrary.

 * Still, so long as I can turn an ordinary *file* into a script and back
 * without having to find and change everything that cites it (which can be a
 * real pain in the butt) or doing an Alias or Redirect in srm.conf (which
 * could get ugly if they started to add up), this isn't a *major* issue.  If
 * the new syntax is an optional alternative, I have no real objection (though
 * somebody else might --- two ways of specifying PATH_INFO does add a little
 * complication to the server).  I'm frankly more hung up on the notion of
 * incompatible changes to something which has been announced as a standard,
 * over what I see as quite minor efficiency concerns.

But it is a major source of confusion. If we change the first / of the
path info to a ; but still support the old method, what have we gained?
A prudent server would have to do the stats anyway. It could search for
a ; first, which would avoid the stats in some cases, but not all, and to
me that is the crucial point in deciding whether this change is worth
pursuing.
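
To make the "some cases, but not all" point concrete, here is a minimal
sketch of a server that accepts both syntaxes. It is an illustration only,
not code from any httpd: the function names, the stand-in search, and the
rule that "script;a/b" yields a path info of "/a/b" are all assumptions.

```python
def split_path_info(url_path, stat_search):
    """Return (script, path_info, used_stats) for a request path.

    stat_search is a hypothetical callable standing in for the
    stat()-based walk that finds where the script name ends.
    """
    if ";" in url_path:
        # New syntax: the ';' marks the boundary, so no stats are needed.
        script, _, rest = url_path.partition(";")
        return script, "/" + rest, False
    # Old syntax (or a plain document): the boundary is unknown,
    # so the server still has to fall back to the stat()-based search.
    script, path_info = stat_search(url_path)
    return script, path_info, True


def fake_stat_search(url_path):
    """Stand-in for the real search; it knows one hypothetical script."""
    script = "/cgi-bin/fortune"
    if url_path == script or url_path.startswith(script + "/"):
        return script, url_path[len(script):]
    return url_path, ""
```

Only requests that actually contain a ; skip the search. Old-style script
URLs and ordinary documents both still go through it.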

 * This efficiency argument is apparently the nub of the dispute --- I just
 * don't find it easy to see how these few extra stat() calls, which needn't
 * occur unless PATH_INFO is present, can possibly amount to a potentially
 * serious problem, in the context of all the other things the server does
 * when processing a request.

They do need to occur whether or not PATH_INFO is present. However, you're
right: looking for .htaccess files in subdirectories is a bigger waste of
time.

As an aside, I find it curious that Charles was bringing up efficiency as an
argument for his changes when a month or two ago I was arguing with him
about why he should run his server standalone instead of from inetd.

 * To try to put this in context, I've appended a system-call trace of my
 * (hacked) httpd processing the request 'GET /cgi-bin/fortune'.  The trace
 * was collected from a server running as 'ServerType inetd', so to keep
 * things fair I've deleted all the initialization, opening of the logs, and
 * so forth, and picked up where it actually starts to process the request.
 * For convenience, I've pointed out the PATH_INFO search in the middle of it.
 * It amounts to one stat() --- it would have been five with the stock httpd
 * (Rob goes top down, I go bottom up); 

Interesting... maybe I should go bottom up; it would probably reduce the
average number of stats required.
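
A rough way to compare the two search orders is to count probes against a
simulated filesystem. The sketch below is not the code of either server: the
is_file predicate stands in for stat(), and the /srv/httpd layout is a
made-up example.

```python
def _parts(path):
    return [p for p in path.split("/") if p]


def _rest(parts, i):
    # Whatever follows the script becomes the path info.
    return "/" + "/".join(parts[i:]) if i < len(parts) else ""


def split_top_down(path, is_file):
    """Probe from the root toward the leaf; stop at the first regular file."""
    parts = _parts(path)
    probes = 0
    for i in range(1, len(parts) + 1):
        candidate = "/" + "/".join(parts[:i])
        probes += 1  # one simulated stat()
        if is_file(candidate):
            return candidate, _rest(parts, i), probes
    return None, "", probes


def split_bottom_up(path, is_file):
    """Probe the full path first, then strip trailing components."""
    parts = _parts(path)
    probes = 0
    for i in range(len(parts), 0, -1):
        candidate = "/" + "/".join(parts[:i])
        probes += 1  # one simulated stat()
        if is_file(candidate):
            return candidate, _rest(parts, i), probes
    return None, "", probes


# Hypothetical layout: a single script, four components deep.
FILES = {"/srv/httpd/cgi-bin/fortune"}


def is_file(path):
    return path in FILES
```

In this layout a request with no path info costs one probe bottom up and
four top down. Bottom up pays one extra probe per path info component, so
which order wins on average depends on how deep scripts sit and how long
the path info tends to be.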
 
 * Against this background, I
 * find it difficult to see how another stat() or two, or even ten, done only
 * for URLs which happen to invoke a script in the first place, could make
 * enough of a difference to matter.

It's mostly because they're abysmally slow under AFS.

 */

--Rob