CGI suggestion

marca@ncsa.uiuc.edu (Marc Andreessen)


Date: Tue, 28 Dec 93 08:31:47 -0600
From: marca@ncsa.uiuc.edu (Marc Andreessen)
Message-id: <9312281431.AA08977@wintermute.ncsa.uiuc.edu>
To: john@math.nwu.edu (John Franks)
Cc: www-talk@nxoc01.cern.ch
Subject: CGI suggestion
In-reply-to: <9312271649.AA03002@hopf.math.nwu.edu>
References: <9312271649.AA03002@hopf.math.nwu.edu>
Content-Length: 3949

John Franks writes:
> Now that I am seriously looking at implementing the CGI interface,
> I find one part problematic.  This is the way that "state information"
> or arguments to a script get encoded in a URL as a sort of pseudo-path
> at the end.
> 
> Here are my objections:
> 
> 1. It is not possible to fully parse the URL without knowledge of the
> server's file hierarchy.  For example, without knowing something about
> the file structure of the server I can't tell whether 

Who is "I" in this context?  If I == the server, then the server's
file hierarchy is in fact known.  If I == some user, then it doesn't
matter one way or the other, does it (since the URL should be
considered opaque anyway)?  I'm probably missing something...

Cheers,
Marc


> 
> http://host.edu/foo1/foo2/foo3
> 
> means script /foo1/foo2 with parameter foo3 or script /foo1 with
> parameter /foo2/foo3.  I am not sure that there won't at some point be
> a need to get this information.  Maybe not, but in any case this syntax is
> cumbersome to implement.
> 
> 2. Assuming in the example above that the parameter is foo3 (or /foo3 ?)
> then the URL actually refers to two files: root/foo1/foo2 and, say,
> root/u/Web/foo3.  Inexperienced users will find this confusing and 
> expect to find an actual file root/foo1/foo2/foo3.
> 
> 3. This syntax overloads the '/' token so it has very different meanings
> depending on context and does this in a situation where the context 
> isn't readily visible.  In my experience this is conducive to errors.
> 
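
The ambiguity in objection 1 can be sketched as follows. This is only an illustration, not how any particular server does it: the function name and the `known_scripts` set are hypothetical stand-ins for whatever the server learns by inspecting its own file hierarchy.

```python
def split_script_and_path_info(url_path, known_scripts):
    """Split a URL path into (script, path_info).

    known_scripts is a hypothetical set of script paths; a real server
    would consult its file system instead.
    """
    parts = url_path.strip("/").split("/")
    # Try successively longer prefixes until one names a script.
    for i in range(1, len(parts) + 1):
        candidate = "/" + "/".join(parts[:i])
        if candidate in known_scripts:
            rest = "/".join(parts[i:])
            return candidate, ("/" + rest if rest else "")
    # No prefix names a script, so nothing can be split off.
    return None, url_path


url = "/foo1/foo2/foo3"
# The same URL splits differently depending on what exists on the server.
print(split_script_and_path_info(url, {"/foo1/foo2"}))
print(split_script_and_path_info(url, {"/foo1"}))
```

With the first set the script is /foo1/foo2 and the parameter is /foo3; with the second the script is /foo1 and the parameter is /foo2/foo3. The URL alone does not say which.
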
> 
> SUGGESTION:
> 
> I would like to make it a CGI *requirement* that the PATH_INFO data
> at the end of a URL contain an '=' and that this '=' be before the 
> occurrence of any '/' in this data.
> 
> Here is what the example above might be like:
> 
> 	/foo1/foo2/path=foo3
> 
> Other legal and useful URLs might end like
> 
> 	/foo1/foo2/param1=value1&param2=value2
> 
> 	/foo1/foo2/path=foo3/foo4&path2=foo5
> 
> URLs like this existing one from the Xerox PARC map server would be
> perfectly legal.
> 
> 	http://pubweb.parc.xerox.com/map/color=1/ht=30/lat=38.8/lon=-96
> 
> But I would encourage map/color=1&ht=30 etc. instead of using '/' as
> the separator.  The main reason is that code to parse the '&' version
> should be common since it is necessary for forms.
> 
> If the server knows that an '=' will occur at the beginning of the
> PATH_INFO data, (and that any ='s in the actual path are URL encoded)
> then this information can be used to parse the URL without knowledge
> of the server filesystem.  Also it is quite clear that expressions like
> foo1/foo2/path=foo3 refer to two files not one.
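
A minimal sketch of the parsing rule suggested above, assuming the '&'-separated form and assuming that any '=' in the script path itself is URL encoded. The function name is hypothetical; `parse_qsl` is the standard forms-style parser from Python's `urllib.parse`, and it also decodes %XX escapes and '+' in the data.

```python
from urllib.parse import parse_qsl


def split_at_first_equals(url_path):
    """Split a URL path into (script, [(name, value), ...]).

    The data segment is taken to start right after the last '/' that
    precedes the first '='. No file system lookup is involved.
    """
    eq = url_path.find("=")
    if eq == -1:
        # No '=' at all: under the proposal there is no PATH_INFO data.
        return url_path, []
    slash = url_path.rfind("/", 0, eq)
    if slash == -1:
        # No '/' before the '=': treat the whole string as data.
        return "", parse_qsl(url_path, keep_blank_values=True)
    script = url_path[:slash]
    data = url_path[slash + 1:]
    return script, parse_qsl(data, keep_blank_values=True)


print(split_at_first_equals("/foo1/foo2/path=foo3"))
print(split_at_first_equals("/foo1/foo2/path=foo3/foo4&path2=foo5"))
```

Both calls report /foo1/foo2 as the script. The second yields the pairs path=foo3/foo4 and path2=foo5, since a '/' after the first '=' is just part of a value.
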
> 
> The only significant change in the current CGI implementations that
> this would require is the PATH_TRANSLATED environment variable.  I
> would suggest that this be replaced by a variable containing a
> directory name and then the script could create the translated path.
> For example if the URL ended in
> 
> 	/foo1/foo2/file1=foo3&file2=foo4/foo5
> 
> then the script could read the environment variable to get the directory,
> say, "/u/Web" and could reconstruct the file names /u/Web/foo3 and
> /u/Web/foo4/foo5.  Notice that this allows more than one file name
> to be passed to the script which is not currently possible.
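
How a script might rebuild the file names under this scheme can be sketched as below. The environment variable name `SCRIPT_DATA_DIR` is purely hypothetical (the proposal does not name the variable), and the function name is likewise an assumption.

```python
import os
import posixpath
from urllib.parse import parse_qsl


def translate_paths(path_info, base_dir):
    """Map each name=value pair in path_info to a file under base_dir."""
    # Note: this sketch does not guard against '..' components in values.
    return {
        name: posixpath.join(base_dir, value.lstrip("/"))
        for name, value in parse_qsl(path_info)
    }


# SCRIPT_DATA_DIR is a hypothetical variable; fall back to the
# directory used in the example above when it is not set.
base_dir = os.environ.get("SCRIPT_DATA_DIR", "/u/Web")
print(translate_paths("file1=foo3&file2=foo4/foo5", base_dir))
```

With the directory set to /u/Web this gives /u/Web/foo3 for file1 and /u/Web/foo4/foo5 for file2, so one request can name more than one file.
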
> 
> One final minor suggestion.  If the PATH_INFO data actually starts
> with '=' as the first character, I would have the server strip this
> character before putting the information in the environment variable.
> This would be convenient for very simple scripts that shouldn't have
> to do any parsing.  Thus a URL ending in
> 
> 	/foo1/foo2/=foo3/foo4 
> 
> would have PATH_INFO set to "foo3/foo4".  You could also keep the 
> PATH_TRANSLATED environment variable for this kind of URL and then
> almost no changes would be necessary in current scripts.
> 
> What do you think?
> 
> 
> John Franks 	Dept of Math. Northwestern University
> 		john@math.nwu.edu