Re: CGI, semicolons, and so on...

robm@ncsa.uiuc.edu (Rob McCool)
Message-id: <9312301935.AA27158@void.ncsa.uiuc.edu>
From: robm@ncsa.uiuc.edu (Rob McCool)
Date: Thu, 30 Dec 1993 13:35:39 -0600
In-Reply-To: john@math.nwu.edu (John Franks)
       "Re: CGI, semicolons, and so on..." (Dec 29,  2:24pm)
X-Mailer: Mail User's Shell (7.2.5 10/14/92)
To: john@math.nwu.edu (John Franks), rst@ai.mit.edu (Robert S. Thau)
Subject: Re: CGI, semicolons, and so on...
Cc: www-talk@nxoc01.cern.ch
Content-Length: 5130
/*
 * Re: CGI, semicolons, and so on...  by John Franks (john@math.nwu.edu)
 *    written on Dec 29,  2:24pm.
 *
 * Look, this discussion has wandered far away from the point I wanted to
 * make initially.  I am not unhappy with the functionality or the
 * flexibility of the current PATH_INFO syntax.  I am unhappy with the
 * design of that syntax because it is inelegant, cumbersome to
 * implement, and conducive to misunderstanding by people who are only
 * marginally familiar with the protocol.

I will agree that it is cumbersome to implement... and perhaps that it is
confusing. 
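
To illustrate where the implementation cost comes from: when the script name
and the extra path information share one slash-separated path, a server may
have to probe the filesystem to find where the script name ends. The following
is a minimal sketch in Python, not NCSA httpd's actual code; the function name
split_script_path and the doc_root parameter are made up for illustration.

```python
import os


def split_script_path(doc_root, url_path):
    """Split url_path into (script_file, path_info) by probing the filesystem.

    Walks the URL path one component at a time. The first prefix that names
    a regular file is taken as the script; whatever follows becomes the
    PATH_INFO string. Returns (None, None) if no prefix names a file.
    """
    parts = [p for p in url_path.split("/") if p]
    current = doc_root
    for i, part in enumerate(parts):
        # Refuse relative components so the walk cannot leave doc_root.
        if part in (".", ".."):
            return None, None
        current = os.path.join(current, part)
        if os.path.isfile(current):        # one filesystem check per component
            rest = parts[i + 1:]
            path_info = "/" + "/".join(rest) if rest else ""
            return current, path_info
        if not os.path.isdir(current):     # neither file nor directory: give up
            break
    return None, None
```

Under these assumptions, a request for /cgi-bin/search/docs/intro would resolve
to the file cgi-bin/search under doc_root with PATH_INFO set to /docs/intro,
at the price of a filesystem check for each leading component.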

 * Let me mention the syntax used by the Minnesota gopher server for a
 * similar function.  It is fairly similar to URL syntax.  The path part
 * of the URL looks like
 * 
 * 	exec:args:/path/script
 * 
 * I am NOT advocating that this be adopted as part of CGI.  I offer it
 * only as an existence proof that it is not too difficult to design a
 * syntax whose server implementation will have several important
 * properties:
 * 
 * 1) It is simple and clear.  I don't think I even need to explain it;
 *    its meaning should be self evident.  THE FACT THAT THIS PART OF THE
 *    URL IS OPAQUE DOES NOT MEAN IT SHOULD BE OBSCURE.

OBSCURE HOW!?!?!? If I set up or write a script, I damn well know what that
URL means.

 * 2) It is easy and efficient to parse.  It is not necessary to stat any
 *    files or directories in order to parse it. 
 * 
 * 3) It is not necessary to maintain a configuration file which is read
 *    and processed each time a server starts up.  
 * 
 * 4) It uses no magic directory names like "cgi-bin" and no magic 
 *    extensions like ".doit".  You can name files, scripts and directories
 *    whatever you like and mix files and scripts in any directory.

Please. Magic directory names and magic extensions are the domain of the
server; it's up to the server authors to decide how they're going to
distinguish between documents and scripts. I happen to view our decision to
make the distinction between script execution and document retrieval
transparent as a feature; I'm really sorry that you don't view it that way.
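
For comparison, a selector in the gopher form quoted above
(exec:args:/path/script) can be taken apart with plain string operations and
no filesystem access. A minimal Python sketch follows; parse_selector is a
hypothetical name, and the sketch assumes that args never contains a colon and
that anything without the exec: prefix is an ordinary document. Neither
assumption comes from the gopher server itself.

```python
def parse_selector(selector):
    """Return (execute, args, path) for a gopher-style selector string.

    Assumed layout (from the quoted example): exec:args:/path/script
    Selectors without the "exec:" prefix are treated as plain documents.
    """
    if not selector.startswith("exec:"):
        return False, "", selector
    # Split on the first two colons only, so the path may contain colons.
    fields = selector.split(":", 2)
    if len(fields) != 3:
        raise ValueError("expected exec:args:/path, got %r" % selector)
    _, args, path = fields
    return True, args, path
```

Here the decision to execute is carried by the exec: prefix rather than by a
directory name, a file extension, or a configuration file.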

 * I believe that the WWW community can do at least as well as the gopher
 * designers.  That's all I wanted to say.  I concede that it is late in
 * the game to be requesting changes.  If it is too late, so be it.
 * Maybe I am just old-fashioned in my belief that programs (and
 * protocols) should try to be simple, clear, concise and if at all
 * possible elegant.

Well, ``simple, clear, concise, and elegant'' are highly subjective terms,
and what you view as simple and clear may be what I view as limiting.

You want to see something like what the gopher people had? Look at NCSA
httpd 1.0a1, from October. We've been growing it since then, trying to add the
capabilities we felt the script interface needed (the path info stuff). I
propagated those changes to the CGI interface because they're very useful.

 * Robert Thau points out that it is possible to modify the NCSA server
 * so that some of these properties are achieved within the current
 * protocol.  Fine.  But IMHO, if we had had a better protocol to begin
 * with the original implementation would have been better and his
 * modifications would never have been necessary.

Well, I felt that the PATH_INFO modifications were too powerful to be
overlooked. NO ONE COMPLAINED AT THE TIME EVEN THOUGH THEY WERE ASKED TO, so
I'm really sorry you don't like the way it turned out, but I don't see a
problem.

 * Interestingly, the gopher syntax does not address the issue raised by
 * Charles Henrich who, quite reasonably, suggests that putting PATH_INFO
 * in the environment should be independent of indicating that a file is
 * executable.  I would agree.

So would I.

 * It seems to me now that there are (at least) three pieces of
 * information which need to be contained in the path part of a URL:
 * 
 * 1) Name and path of the file/script
 * 2) string to be put in PATH_INFO environment variable (if any)
 * 3) Is this file/script to be executed or treated as text

How about query information?
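
For illustration, the query string can be treated as a fourth piece alongside
the three listed above. The sketch below (Python; cgi_variables is a
hypothetical helper, and the script's URL prefix is assumed to be known
already) splits a request URL into the SCRIPT_NAME, PATH_INFO and QUERY_STRING
values a script would see.

```python
from urllib.parse import urlsplit


def cgi_variables(script_name, request_url):
    """Derive SCRIPT_NAME, PATH_INFO and QUERY_STRING from a request URL.

    script_name is the URL prefix that names the script; how the server
    learned it (config file, extension, ...) is outside this sketch.
    """
    parts = urlsplit(request_url)
    path = parts.path
    if path != script_name and not path.startswith(script_name + "/"):
        raise ValueError("%r is not under %r" % (path, script_name))
    return {
        "SCRIPT_NAME": script_name,
        "PATH_INFO": path[len(script_name):],  # may be empty
        "QUERY_STRING": parts.query,           # text after "?", may be empty
    }
```

The query string is split off at the "?" before the path is examined, so it
does not interfere with the other three pieces.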

 * I would suggest as an important design criterion that these three
 * pieces of information should be orthogonal, i.e. the value of one
 * should not restrict the possible values of the others.  For example,
 * as Charles Henrich pointed out, my suggestion to make the existence of
 * PATH_INFO data be the indicator that the file should be executed is
 * not a good idea.  Likewise, I would contend that it is not a good idea
 * to have the fact that a file is intended to be executed restrict
 * either its possible name or possible path. It is certainly not
 * technically difficult to have such a design.  The question is, is it
 * too late.
 * 
 */

Now I'm really confused... how do you intend to determine whether something is
to be executed if you don't have a config file directive or a filename
extension, and have thankfully abandoned the idea of putting a ; at the end?
I happen to think our design is not THAT BAD as it stands, and I'm really
sorry you don't feel that way.

--Rob