More CGI Comments

From: rst@ai.mit.edu (Robert S. Thau)
Date: Sun, 9 Jan 94 13:45:33 EST
Message-id: <9401091845.AA09251@volterra>
To: rhb@hotsand.att.com
Cc: www-talk@library.ucsf.edu
In-reply-to: rhb@hotsand.att.com's message of Sat, 8 Jan 94 20:27:36 EST <9401090127.AA28418@hotsand.dacsand>
Subject: More CGI Comments
Content-Length: 3974
   From: rhb@hotsand.att.com
   Date: Sat, 8 Jan 94 20:27:36 EST

   All my scripts have a #! notation at the top.  I would think looking
   for any files of this type would indicate scripts (though this may
   be unmanageable/inefficient).

Good idea.  There are two complications.  First off, to make this work
properly, you need to check for the binary executable magic numbers, in
addition to '#!', so the server can run C programs as 'scripts'.  (Such
programs exist --- imagemap, for one).
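
A minimal sketch of such a check, in Python for illustration only.  The
function name and the list of magic numbers are assumptions: '#!' comes from
the discussion above, while the ELF signature merely stands in for whatever
binary executable formats a given platform actually uses.

```python
# Assumed, illustrative magic numbers.  A real server would list the
# executable formats of the platform it runs on.
EXECUTABLE_MAGICS = (
    b"#!",        # interpreter script, as suggested in the quoted message
    b"\x7fELF",   # ELF binary (an assumption; the message names no format)
)


def looks_like_script(path):
    """Return True if the file at `path` begins with a known executable magic."""
    longest = max(len(magic) for magic in EXECUTABLE_MAGICS)
    # Read only as many leading bytes as the longest magic number needs.
    with open(path, "rb") as handle:
        head = handle.read(longest)
    return any(head.startswith(magic) for magic in EXECUTABLE_MAGICS)
```

A server would call something like looks_like_script() before deciding
whether to run a file or send its contents.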

The other, more serious, complication is that if some binary data file
happens to begin with a magic number, the server would refuse to serve it
up (instead trying, and failing, to run it).  If I'm not mistaken, most
common binary data formats (Sun .au, AIFF, JPEG (JFIF), MPEG, tar,
compressed data...) define their own series of magic numbers which are
unlikely to conflict, but relying on this would be dicey.
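
To make the conflict question concrete, here is a rough sketch that compares
the leading bytes of a few of these formats against a list of executable
markers.  The tables are illustrative and incomplete, the ELF entry is an
assumption, and a file with no header at all (plain text, raw samples) can
begin with anything, which is exactly why relying on this is dicey.

```python
# Leading bytes of some common data formats.
DATA_MAGICS = {
    "Sun .au": b".snd",
    "AIFF": b"FORM",
    "JPEG (JFIF)": b"\xff\xd8\xff",
    "gzip": b"\x1f\x8b",
    "compress (.Z)": b"\x1f\x9d",
}

# Assumed executable markers; the ELF signature is a stand-in for the
# platform's real binary formats.
EXECUTABLE_MAGICS = (b"#!", b"\x7fELF")


def conflicts(data_magics, exec_magics):
    """Return the names of data formats that could be mistaken for executables.

    Two magic numbers clash when one is a prefix of the other.
    """
    clashes = []
    for name, magic in data_magics.items():
        for exec_magic in exec_magics:
            shared = min(len(magic), len(exec_magic))
            if magic[:shared] == exec_magic[:shared]:
                clashes.append(name)
                break
    return clashes
```

For the tables above, conflicts(DATA_MAGICS, EXECUTABLE_MAGICS) returns an
empty list, but that says nothing about formats the table leaves out.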

I don't see that efficiency is necessarily an issue --- if the thing in
question turns out to be a script, then the open and read for the magic
number check is minor next to the cost of actually running the script (this
assumes, of course, that open() is cheaper than exec()).  Conversely, if it
turns out to be an ordinary file, then the server would have to open and
read it anyway.

   In any case, it's not clear to me that
   looking through old versions of scripts that may exist in a directory
   is particularly dangerous (especially if, as you say, they are typically
   saved with a common suffix). 

It's every bit as dangerous (or not) as letting people browse the current
versions, unless the last edit closed *all* the security holes which the
old versions may have had.  Maybe it did, but in my experience, this is
not the way to bet.

   Even if we assume that we segregate scripts into separate directories
   for users, we can't let all the users use the same bin directory for
   scripts (one possible solution is assuming by default in the httpd
   server a public_html/cgi-bin directory for add-on users...?).

   Rich

At sites where ordinary users can and do put up scripts, something better
than a single cgi-bin is obviously necessary.  In conversation on this list
over the past couple of weeks, people have discussed all sorts of
alternatives...

  *) Let users create (and designate) their own bin directories,
     with a ~/cgi-bin default, .htaccess file, or some similar mechanism.

  *) Let them mix files and scripts in certain directories ---
     the server tells the files from the scripts by

     - magic number tests (as you suggest, including '#!' as a magic number)

     - naming conventions (as in the server running here)

     - -x bits set on the individual files (praised by some for simplicity,
       opposed by others because of the traps it lays for the bumble-fingered)

     - explicit designation as scripts in some sort of external meta-database,
       such as the GN server's .cache files.
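
For comparison, the naming-convention and -x bit tests from the list above
might look roughly like this in Python.  The ".cgi" suffix and both function
names are hypothetical; the message does not say which convention any
particular server uses.

```python
import os
import stat

SCRIPT_SUFFIX = ".cgi"  # hypothetical convention, for illustration only


def is_script_by_name(path, suffix=SCRIPT_SUFFIX):
    """Naming-convention test: the file name must end with the agreed suffix."""
    return os.path.basename(path).endswith(suffix)


def is_script_by_xbit(path):
    """-x bit test: a regular file with any execute permission bit set."""
    mode = os.stat(path).st_mode
    any_exec = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    return stat.S_ISREG(mode) and bool(mode & any_exec)
```

Note that a backup copy named "form.cgi.orig" fails the name test, so under
that scheme the server would treat it as an ordinary file, while a stray
chmod is enough to flip the answer of the -x bit test.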

To lay my own cards on the table, I think it's important for access control
that scripts have some sort of mark on them which they cannot easily lose.
If that can be achieved without segregating scripts from ordinary files by
directory, so much the better.

IMHO, a suffix naming convention is all right from this perspective, while
I have my doubts about -x bits (because, in my experience, accidental
chmods are easier to make than accidental renames, and harder to detect).

If the binary-data-format problem can be resolved, then the magic number
idea is better yet.  It helps a lot with the "'*.orig'-and-then-what-else?"
problem of trying to anticipate where outdated versions of script code
might appear.

However, I could live with any of these alternatives.  What would be nice
at this point would be to develop a consensus around one of them, and get
it into the servers, so that users could actually have the capability
(where, of course, the server administrator feels it proper to grant it).

rst