Re: CGI, semicolons, and so on...

From: john@math.nwu.edu (John Franks)
Message-id: <9312292024.AA00540@hopf.math.nwu.edu>
Subject: Re: CGI, semicolons, and so on...
To: rst@ai.mit.edu (Robert S. Thau)
Date: Wed, 29 Dec 1993 14:24:12 -0600 (CST)
Cc: john@math.nwu.edu, www-talk@nxoc01.cern.ch
In-reply-to: <9312291846.AA03803@volterra> from "Robert S. Thau" at Dec 29, 93 01:46:09 pm
X-Mailer: ELM [version 2.4 PL23]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 3291

Look, this discussion has wandered far away from the point I wanted to
make initially.  I am not unhappy with the functionality or the
flexibility of the current PATH_INFO syntax.  I am unhappy with the
design of that syntax because it is inelegant, cumbersome to
implement, and conducive to misunderstanding by people who are only
marginally familiar with the protocol.

Let me mention the syntax used by the Minnesota gopher server for a
similar function.  It is fairly similar to URL syntax.  The path part
of the URL looks like

	exec:args:/path/script

I am NOT advocating that this be adopted as part of CGI.  I offer it
only as an existence proof that it is not too difficult to design a
syntax whose server implementation will have several important
properties:

1) It is simple and clear.  I don't think I even need to explain it;
   its meaning should be self-evident.  THE FACT THAT THIS PART OF THE
   URL IS OPAQUE DOES NOT MEAN IT SHOULD BE OBSCURE.

2) It is easy and efficient to parse.  It is not necessary to stat any
   files or directories in order to parse it. 

3) It is not necessary to maintain a configuration file which is read
   and processed each time a server starts up.  

4) It uses no magic directory names like "cgi-bin" and no magic 
   extensions like ".doit".  You can name files, scripts and directories
   whatever you like and mix files and scripts in any directory.
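
To illustrate property 2, here is a minimal Python sketch of parsing a
selector of the form exec:args:/path/script with string operations
only, never touching the filesystem.  The names ExecSelector and
parse_exec_selector are hypothetical, and the splitting rule (args
contain no ':'; everything after the second ':' is the script path) is
an assumption made for this sketch, not a description of how the
Minnesota gopher server actually parses selectors.

```python
from typing import NamedTuple, Optional


class ExecSelector(NamedTuple):
    """Hypothetical container for the two parts of an exec selector."""
    args: str          # argument string passed to the script (may be empty)
    script_path: str   # path of the script to run


def parse_exec_selector(selector: str) -> Optional[ExecSelector]:
    """Split 'exec:args:/path/script' without any stat() calls.

    Returns None when the selector is not of the exec form, so the
    caller can treat it as an ordinary file path.
    """
    prefix, sep, rest = selector.partition(":")
    if sep != ":" or prefix != "exec":
        return None
    # Assumption: args contain no ':', so the first ':' in the
    # remainder separates the args from the script path.
    args, sep, script_path = rest.partition(":")
    if sep != ":" or not script_path:
        return None
    return ExecSelector(args=args, script_path=script_path)
```

Under these assumptions, parse_exec_selector("exec:-l:/bin/list")
yields args "-l" and script path "/bin/list", while a plain path such
as "/docs/readme" yields None.  No directory names or file extensions
are consulted at any point.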

I believe that the WWW community can do at least as well as the gopher
designers.  That's all I wanted to say.  I concede that it is late in
the game to be requesting changes.  If it is too late, so be it.
Maybe I am just old-fashioned in my belief that programs (and
protocols) should try to be simple, clear, concise and, if at all
possible, elegant.

Robert Thau points out that it is possible to modify the NCSA server
so that some of these properties are achieved within the current
protocol.  Fine.  But IMHO, if we had had a better protocol to begin
with, the original implementation would have been better and his
modifications would never have been necessary.

Interestingly, the gopher syntax does not address the issue raised by
Charles Henrich who, quite reasonably, suggests that putting PATH_INFO
in the environment should be independent of indicating that a file is
executable.  I would agree.

It seems to me now that there are (at least) three pieces of
information which need to be contained in the path part of a URL:

1) Name and path of the file/script
2) String to be put in the PATH_INFO environment variable (if any)
3) Whether this file/script is to be executed or treated as text

I would suggest as an important design criterion that these three
pieces of information should be orthogonal, i.e. the value of one
should not restrict the possible values of the others.  For example,
as Charles Henrich pointed out, my suggestion to make the existence of
PATH_INFO data be the indicator that the file should be executed is
not a good idea.  Likewise, I would contend that it is not a good idea
to have the fact that a file is intended to be executed restrict
either its possible name or possible path.  It is certainly not
technically difficult to have such a design.  The question is: is it
too late?
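
As a rough illustration of the orthogonality criterion, the three
pieces of information can be modelled as three independent fields of a
record.  This Python sketch is not a proposed URL syntax; the names
ResolvedRequest and all_combinations, and the sample paths, are
hypothetical.  It only shows that when the fields are independent,
every combination of values is a legal request.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class ResolvedRequest:
    """Hypothetical result of parsing the path part of a URL."""
    path: str                  # 1) name and path of the file/script
    path_info: Optional[str]   # 2) PATH_INFO value, or None if absent
    execute: bool              # 3) True: run it; False: serve as text


def all_combinations(
    paths: Sequence[str],
    path_infos: Sequence[Optional[str]],
    execute_flags: Sequence[bool],
) -> List[ResolvedRequest]:
    """Build every combination; none is rejected, since no field
    constrains another."""
    return [
        ResolvedRequest(path, info, flag)
        for path, info, flag in product(paths, path_infos, execute_flags)
    ]


# Illustrative values only: two paths, PATH_INFO absent or present,
# and both settings of the execute flag.
combos = all_combinations(
    ["/docs/report", "/any/dir/script"],
    [None, "/extra/data"],
    [False, True],
)
```

With two choices for each of the three fields there are eight
combinations, including a script outside any special directory and a
file that carries PATH_INFO but is served as text.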


John Franks 	Dept of Math. Northwestern University
		john@math.nwu.edu