Re: BGI-spec 1.4

Guido.van.Rossum@cwi.nl
Tue, 5 Jul 1994 06:13:42 +0200

> The design is somewhat inspired by the Plan 9 file system, and to a lesser
> extent, the extension system used for the System V.4 name resolution library.

Sure... Now tell me how you are planning to implement this. Suppose
I'm using SVR4 style dynamic linking of libraries, and I want to mount
an extension at /cwi/people/ -- where do I put my .so file, and how do
I tell the server that it's there? (As an aside, can I tell the
server to load a new version of the .so file without bringing it
down?) Note that I'm not being cynical -- I'd just like to know.

> A request for "/pictures/simon.gif" would be handled by picture_handler, as
> would a request for "/pictures/simon.jpeg". However, a request for
> "/pictures/office-scene" would invoke the videopix_handler.
> However, asking for "/picture" would invoke the file_handler.

Is this done on a pathname component basis, or on string comparison?
If I had a directory named /pictures-huge/, would it be served by the
picture_handler or by file_handler? (I hope the latter, but somehow
your example doesn't make this clear -- especially since it
explicitly shows the reverse case.)
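
To make the question concrete, here is a minimal sketch of the
component-wise matching hoped for above. The name mount_matches() is
hypothetical and not part of the BGI spec; it only assumes that mount
points and request paths are plain NUL-terminated strings.

```c
#include <string.h>

/* Hypothetical helper (not from the BGI spec): returns 1 if `path`
 * lies at or below `mount` when compared one pathname component at a
 * time, and 0 otherwise. */
int mount_matches(const char *mount, const char *path)
{
    size_t n = strlen(mount);

    /* Ignore a trailing slash on the mount point ("/pictures/"). */
    if (n > 1 && mount[n - 1] == '/')
        n--;

    /* A mount at the root covers every absolute path. */
    if (n == 1 && mount[0] == '/')
        return path[0] == '/';

    /* The mount point must be a prefix of the path... */
    if (strncmp(mount, path, n) != 0)
        return 0;

    /* ...and the prefix must end exactly on a component boundary. */
    return path[n] == '\0' || path[n] == '/';
}
```

With this rule a mount at /pictures covers /pictures/simon.gif but
neither /pictures-huge/x nor /picture; a plain string-prefix comparison
would wrongly accept /pictures-huge/x.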

> The value returned should either be 0, indicating that a problem
> occurred

Who's responsible for logging an error in this case? I'd like to be
able to pass an error string on to the client that's unfortunate
enough to be hit by this, even if the init routine can log its own
error in the error_log file.

> int <module>_umount(char* mount_point, void* cookie)
>
> This function should remove the handler from the indicated mount point,
> and free up any memory allocated for the cookie.

Surely the cookie contains the mount point, so the mount_point
argument is redundant. Also maybe rename to <module>_unmount (no need
to copy UNIX naming craziness) or <module>_cleanup (the mount metaphor
isn't too strong I'd say, and you don't call the _init function
<module>_mount).

> uri: The uri passed for this request. All hex escapes will be replaced
> by the corresponding characters before this routine is called.

I think you will have to leave the hex escapes in. E.g. if a '?'
occurs in a pathname, it should be encoded, but a '?' meaning a search
key should be unencoded. Similarly, Mosaic forms with METHOD=GET use
the form <path>?<name>=<value>&<name>=<value>&... with '=' and '&'
hex-escaped in name and value. You don't want to lose this
distinction!
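
A minimal sketch of why the order matters: split on the raw '?' first
and decode only afterwards. The names hex_val(), unescape() and
split_uri() are hypothetical, not part of the BGI spec, and the sketch
assumes the handler is allowed to modify the uri string in place.

```c
#include <ctype.h>
#include <string.h>

/* Value of a hex digit, or -1 if c is not one. */
static int hex_val(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/* Decode %XX escapes in place; malformed escapes are copied as-is. */
void unescape(char *s)
{
    char *out = s;

    while (*s) {
        int hi, lo;
        if (s[0] == '%'
            && (hi = hex_val((unsigned char)s[1])) >= 0
            && (lo = hex_val((unsigned char)s[2])) >= 0) {
            *out++ = (char)(hi * 16 + lo);
            s += 3;
        } else {
            *out++ = *s++;
        }
    }
    *out = '\0';
}

/* Split uri at the first unescaped '?', then decode the path part only.
 * Returns a pointer to the still-escaped search part, or NULL if the
 * uri has none. */
char *split_uri(char *uri)
{
    char *query = strchr(uri, '?');

    if (query != NULL)
        *query++ = '\0';
    unescape(uri);
    return query;
}
```

The search part is deliberately left escaped, so that it can be split
on '&' and '=' before each name and value is decoded separately.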

> version: The version string passed in the request. If no version was passed,
> this string will be set to null.

Just to be sure, this would be "HTTP/1.0" currently, or NULL for HTTP
0.9 GET requests, right?

> buf: This argument is a container for the socket to use for this request
> together with a buffer containing information already read from the
> client.
>
> typedef struct _sockbuf {
> char* buffer; // pointer to start of I/O buffer
> int buf_size; // total size of this buffer
> char* end_of_data; // pointer to character after end of valid data
> // in this buffer.
> char* current_ptr; // pointer to first available character in buffer
> int sock; // the socket
> }

What can I expect to be in the buffer? A random amount of data after
the first line of the request? Can I overwrite the data in the
buffer? (I suppose so, otherwise a pointer and a count would be
sufficient.)

> Result code:
>
> If no errors occur, the handler function should return 0 or 200. If an error
> occurs, the handler should return either 0, or a valid HTTP error code. If
> a status code other than 200 is returned, the server will generate an
> appropriate error message.

I'm sorry, this is totally ambiguous. Does a return value of 0 mean
success or failure? If a handler encounters an error after it has
started writing data to the socket, what should it do? (Since this is
a high performance protocol, that could easily happen!)

> All handler functions must be re-entrant.

Are you planning to use multiple threads, or to call handlers from
signal handlers? Do you provide synchronization primitives (e.g. to
serialize access to the stuff in the *cookie buffer)?

> int http_error(int socket, int code, char* version)
> Generate an error message corresponding to error 'code'

"Generate"... what exactly does this do? Write a complete HTTP error
response? Can I write some data to the socket afterwards?

> 1) It might be better to have separate handlers for each method, rather than
> having the single handler with its operation argument. This would allow
> different handlers to manage GET and POST requests. However, this would
> complicate the interface, since most handlers would only support a single
> method.
>
> Currently, my favourite solution is to go with a single function per
> mountpoint, but to then implement a BGI module that dispatches to other
> BGI modules based on the method.

I agree with the single function approach.

> 2) Adding more functions to the support library will make implementing
> gateways easier. I'm open to suggestions.

Some ideas... Decode % escapes in a string; (shallowly) parse the
next RFC-822 header (something like return a pointer to the name, with
the colon zapped, plus pointers to the start and end of the header
text -- possibly spanning continuation lines); skip to the end of
RFC-822 headers.
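
A rough sketch of the shallow header parser suggested above, under the
assumption that the headers sit in one writable, NUL-terminated buffer.
The struct header type and the next_header() name are hypothetical,
not part of the BGI spec.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for one parsed header. */
struct header {
    char *name;   /* header name, NUL-terminated (the colon is zapped) */
    char *value;  /* first character of the header text */
    char *end;    /* one past the last character of the header text */
};

/* Parse the header starting at p. Returns a pointer to the line after
 * it, or NULL if p is at the blank line ending the headers (or the
 * line has no colon). The header text may span continuation lines. */
char *next_header(char *p, struct header *h)
{
    char *colon, *eol;

    if (*p == '\0' || *p == '\r' || *p == '\n')
        return NULL;

    /* The colon must appear on the first line of the header. */
    eol = p + strcspn(p, "\n");
    colon = memchr(p, ':', (size_t)(eol - p));
    if (colon == NULL)
        return NULL;

    *colon = '\0';
    h->name = p;
    h->value = colon + 1;
    while (*h->value == ' ' || *h->value == '\t')
        h->value++;

    /* Extend over continuation lines (those starting with SP or TAB). */
    eol = h->value;
    for (;;) {
        eol += strcspn(eol, "\n");
        if (*eol == '\0' || (eol[1] != ' ' && eol[1] != '\t'))
            break;
        eol++;
    }

    h->end = eol;
    if (h->end > h->value && h->end[-1] == '\r')
        h->end--;                 /* drop the CR of a CRLF terminator */
    return *eol ? eol + 1 : eol;
}
```

Skipping to the end of the headers is then just a loop that calls
next_header() until it returns NULL.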

(Actually, at supposedly little cost, can't you use stdin instead of a
raw socket? Use fdopen(sock, "r") to open a FILE and then you can
just use fgets() to read the next line if you really want to parse
headers. Having part of the data in the buffer structure would make
this awkward. Surely fdopen() and fgets() won't be a big performance
hog?)
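
A minimal sketch of the fdopen()/fgets() idea, assuming a POSIX system
and that nothing has been read from the descriptor yet. The function
name count_header_lines() is hypothetical; it just reads lines up to
the blank line that ends the headers.

```c
#define _POSIX_C_SOURCE 200809L   /* for fdopen() */
#include <stdio.h>

/* Hypothetical helper: wrap fd in a stdio stream and count the header
 * lines up to (not including) the terminating blank line. Returns -1
 * if the stream cannot be opened. Note that fclose() also closes fd. */
int count_header_lines(int fd)
{
    FILE *in = fdopen(fd, "r");
    char line[1024];
    int n = 0;

    if (in == NULL)
        return -1;

    while (fgets(line, sizeof line, in) != NULL) {
        /* A bare LF or CRLF marks the end of the headers. */
        if (line[0] == '\n' || (line[0] == '\r' && line[1] == '\n'))
            break;
        n++;
    }

    fclose(in);
    return n;
}
```

Two caveats: stdio reads ahead, so any body data following the headers
may already sit in the FILE buffer rather than on the socket, and
anything the server has already pulled into its own buffer structure
would not be seen by fgets() at all.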

--Guido van Rossum, CWI, Amsterdam <Guido.van.Rossum@cwi.nl>
URL: <http://www.cwi.nl/cwi/people/Guido.van.Rossum.html>