Binary Gateway Interface - An API for extensible HTTP servers
Simon E Spero <ses@tipper.oit.unc.edu>
Errors-To: listmaster@www0.cern.ch
Date: Wed, 22 Jun 1994 21:35:11 +0200
Message-id: <9406221932.AA01950@tipper.oit.unc.edu>
Reply-To: ses@tipper.oit.unc.edu
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: Simon E Spero <ses@tipper.oit.unc.edu>
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: Binary Gateway Interface - An API for extensible HTTP servers
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Type: multipart/mixed; boundary="----- =_aaaaaaaaaa0"
Mime-Version: 1.0
------- =_aaaaaaaaaa0
Content-Type: text/plain; charset="us-ascii"
Folks-
Here's a draft of a paper describing the Binary interface used in the
High Performance HTTP. Any comments gratefully received.
Simon
p.s.
official pronunciation is "Boogie" :-)
------- =_aaaaaaaaaa0
Content-Type: text/plain; charset="us-ascii"
Content-Description: Binary Gateway Interface
DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT
Binary Gateway Interface -
An API for dynamically extensible HTTP servers
June 22nd 1994
Simon Spero
University of North Carolina at Chapel Hill
ses@unc.edu
Abstract:
Many HTTP servers currently support an interface protocol allowing
them to pass requests on to external scripts. This protocol is known as
CGI. This mechanism is extremely flexible, but is unsuited to
high performance applications. In this paper we discuss an alternative
approach to server extensibility and propose an alternative interface
protocol based on dynamically linked functions. We compare the two
approaches and indicate some of the advantages and disadvantages of
each.
Introduction.
-------------
The Common Gateway Interface (CGI)[McCool 93] is a standard way of
allowing the manager of an information server to add extra functionality to a
server without needing to modify the http server itself. This functionality
is achieved by starting an external gateway process, and passing messages to
and from that process. CGI is not specific to the HTTP protocol.
CGI communicates with the gateway process through a number of different
mechanisms. Information about the request is passed through about 20
environment variables. Information about queries is also passed via the
command line. For requests that contain information in addition to the HTTP
header, the additional data will be made available on standard input.
The gateway script responds by sending the result to standard output.
Normally the server parses the output before passing it on to the client.
For efficiency, if a script name begins with the magic string "nph-", the
output is not parsed, and may be sent directly to the client.
This system is extremely flexible; however the design is not suitable for
use in high performance servers. There are several reasons for this. The
first problem is the processing overhead caused by the creation of an
extra process to handle each request.
Secondly, the server is required to process any and all HTTP headers,
and to generate an environment variable for each of them before
passing the request on to the gateway. Most of these headers will not
be needed by the gateway module.
Thirdly, unless the "nph-" escape hatch is used, the server must read and
parse the results of the gatewayed operation before sending them on to the
client.
A Binary Gateway Interface
--------------------------
An alternative way of extending the functionality of a server is to make
use of the dynamic linking facilities available under most modern operating
systems. If a standard set of function calls for handling requests is
defined, then extended operations can be handled as cheaply as standard ones.
Design Goals
------------
The design presented in the following section is intended to meet several
design goals.
1) Fast. Extensions should be able to run as fast as
built in functions.
2) Lazy. Headers should not be parsed or evaluated unless
absolutely necessary.
3) Portable. Gateways developed for one operating system should
be usable on another system without requiring
extensive modifications.
4) Simple. The gateway author should not spend more time
working on the interface code than she does on
the actual gateway.
BGI design
-----------
The design is somewhat inspired by the Plan 9 file system, and to a lesser
extent, the extension system used for the System V.4 name resolution library.
The BGI model is based on the model of a hierarchical name space. Specialised
handlers can be mounted at any point in the name space; these handlers will
be responsible for handling any requests that lie beneath their mount points,
unless a more specific handler is mounted below them.
Servers do not need to use this model internally; however BGI handlers do
need to be told where they are mounted so that they can determine how much
prefix to remove from a URL.
Example: Suppose we have a namespace with the following handlers
mounted at the indicated points.
Mount point                 Handler
--------------------------------------------------
/                           file_handler
/image-maps                 map_handler
/pictures                   picture_handler
/pictures/office-scene      videopix_handler
/cgibin                     cgi_handler
/search-me                  wais_handler
A request for "/pictures/simon.gif" would be handled by picture_handler, as
would a request for "/pictures/simon.jpeg". However, a request for
"/pictures/office-scene" would invoke the videopix_handler.
Asking for "/picture" would invoke the file_handler, since "/picture" does
not lie beneath the "/pictures" mount point.
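
The resolution rule in this example can be sketched as a longest-prefix
search. This is only an illustration: the mount table layout and the names
struct mount, is_under and resolve are hypothetical, not part of the proposed
interface, and a server is free to resolve handlers in any other way.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mount-table entry; the draft does not define this structure. */
struct mount {
    const char *point;    /* mount point, e.g. "/pictures" */
    const char *handler;  /* handler name, for illustration only */
};

/* The example table from the text. */
static const struct mount mounts[] = {
    { "/",                      "file_handler"     },
    { "/image-maps",            "map_handler"      },
    { "/pictures",              "picture_handler"  },
    { "/pictures/office-scene", "videopix_handler" },
    { "/cgibin",                "cgi_handler"      },
    { "/search-me",             "wais_handler"     },
};

/* Returns nonzero if 'url' lies at or beneath 'point'. A match must end on
   a path-component boundary, so "/picture" is not beneath "/pictures". */
static int is_under(const char *point, const char *url)
{
    size_t n = strlen(point);

    if (strncmp(point, url, n) != 0)
        return 0;
    if (n > 0 && point[n - 1] == '/')     /* the root mount "/" */
        return 1;
    return url[n] == '\0' || url[n] == '/';
}

/* The longest matching mount point wins. Returns NULL if nothing matches. */
const struct mount *resolve(const char *url)
{
    const struct mount *best = NULL;
    size_t best_len = 0;
    size_t i;

    assert(url != NULL);
    for (i = 0; i < sizeof mounts / sizeof mounts[0]; i++) {
        size_t n = strlen(mounts[i].point);

        if (is_under(mounts[i].point, url) && (best == NULL || n > best_len)) {
            best = &mounts[i];
            best_len = n;
        }
    }
    return best;
}
```

With this table, resolve("/pictures/simon.gif") selects picture_handler and
resolve("/picture") falls through to file_handler, matching the cases above.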
BGI handlers are compiled object code modules containing three functions
which are used to mount and unmount handlers, and to handle incoming requests.
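
As a rough sketch of how a server might locate these three functions, the
fragment below uses the POSIX dlopen/dlsym calls. The draft does not say how
modules are loaded, so everything here is an assumption: the names
struct bgi_module, bgi_symbol_name and bgi_load are hypothetical, and the
function pointer types simply mirror the prototypes given under
"Handler Methods".

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

typedef int (*bgi_init_fn)(char *mount_point);
typedef int (*bgi_umount_fn)(char *mount_point, int cookie);
typedef int (*bgi_handler_fn)(int operation, int cookie, int socket,
                              char *url, char *header_buf, int buf_size);

/* Hypothetical bookkeeping record for one loaded module. */
struct bgi_module {
    void           *lib;      /* handle returned by dlopen */
    bgi_init_fn     init;
    bgi_umount_fn   umount;
    bgi_handler_fn  handler;
};

/* Build "<module>_<suffix>" in 'out'. Returns 0, or -1 if it does not fit. */
int bgi_symbol_name(char *out, size_t size, const char *module,
                    const char *suffix)
{
    int n = snprintf(out, size, "%s_%s", module, suffix);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

static void *lookup(void *lib, const char *module, const char *suffix)
{
    char name[128];

    if (bgi_symbol_name(name, sizeof name, module, suffix) != 0)
        return NULL;
    return dlsym(lib, name);
}

/* Load the shared object at 'path' and resolve the three entry points.
   Returns 0 on success, -1 if the object or any symbol is missing. */
int bgi_load(struct bgi_module *m, const char *path, const char *module)
{
    assert(m != NULL && path != NULL && module != NULL);

    m->lib = dlopen(path, RTLD_NOW);
    if (m->lib == NULL)
        return -1;

    /* POSIX allows the void * from dlsym to be converted to a function
       pointer, although ISO C alone does not guarantee it. */
    m->init    = (bgi_init_fn)lookup(m->lib, module, "init");
    m->umount  = (bgi_umount_fn)lookup(m->lib, module, "umount");
    m->handler = (bgi_handler_fn)lookup(m->lib, module, "handler");

    if (m->init == NULL || m->umount == NULL || m->handler == NULL) {
        dlclose(m->lib);
        m->lib = NULL;
        return -1;
    }
    return 0;
}
```

On some systems the program must be linked with -ldl for dlopen to resolve.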
Handler Methods
---------------
Init
int <module>_init(char* mount_point)
This function is used to initialise a handler for attachment to a point in
the namespace. The value returned should either be 0, indicating that a problem
occurred, or a cookie which will be passed to the handler function.
Unmount
int <module>_umount(char* mount_point, int cookie)
This function should remove the handler from the indicated mount point.
Handler
int <module>_handler(int operation, int cookie, int socket, char* url,
char* header_buf, int buf_size)
This function handles all requests on this mount point.
Arguments:
'operation' indicates the HTTP method that is being invoked. The only
currently defined values are OP_GET=1, and OP_POST=2. If other values are
received, the function should signal an error as indicated below.
'cookie' is the token returned by the initialisation function.
'socket' is the file descriptor for the current connection.
'url' is the URL that is being processed. This url should have any leading
protocol specifiers removed before the handler is called.
'header_buf' contains a pointer to any data that may already have been read
from the connection before the handler was called.
'buf_size' indicates the amount of valid data in header_buf.
Result code:
If no errors occur, the handler function should return 0; if an error does
occur, the handler should either return -1, indicating that the server should
just close the connection, or a valid HTTP result code, indicating that the
server should generate an error message before closing the connection.
Notes:
All handler functions should be re-entrant.
Handler functions should not close the connection themselves.
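
To make the three methods concrete, here is a minimal sketch of a module
named "hello". Several details are assumptions rather than part of the
draft: the mount point length is used as the cookie (so the handler knows
how much prefix to strip without keeping shared state), the unmount function
returns 0 on success, unsupported methods are answered with 501, and the
handler writes a complete HTTP/1.0 response itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Method codes as given in the draft. */
#define OP_GET  1
#define OP_POST 2

/* Assumption: the cookie is the mount point length. It is nonzero for any
   valid mount point, so 0 remains free to signal a problem. */
int hello_init(char *mount_point)
{
    if (mount_point == NULL || mount_point[0] != '/')
        return 0;
    return (int)strlen(mount_point);
}

/* Assumption: 0 means success; the draft leaves the return value open. */
int hello_umount(char *mount_point, int cookie)
{
    (void)mount_point;
    (void)cookie;
    return 0;                           /* nothing to release */
}

/* Write the whole buffer, retrying after short writes. */
static int write_all(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = write(fd, buf, len);

        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Uses only its arguments and local variables, so it is re-entrant. */
int hello_handler(int operation, int cookie, int socket, char *url,
                  char *header_buf, int buf_size)
{
    char reply[512];
    int len;

    (void)header_buf;                   /* lazy: headers are never parsed */
    (void)buf_size;

    if (operation != OP_GET)
        return 501;                     /* server generates the error message */
    if (url == NULL || cookie <= 0 || strlen(url) < (size_t)cookie)
        return 400;

    len = snprintf(reply, sizeof reply,
                   "HTTP/1.0 200 OK\r\n"
                   "Content-Type: text/plain\r\n\r\n"
                   "Hello from %s\n", url + cookie);
    if (len < 0 || (size_t)len >= sizeof reply)
        return 500;
    if (write_all(socket, reply, (size_t)len) != 0)
        return -1;                      /* server should just close */
    return 0;                           /* the connection is left open */
}
```

Mounted at "/hello", a GET for "/hello/world" would produce a plain text
body of "Hello from /world".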
Library functions
-----------------
Server implementors should make the following functions available to gateway
implementors.
---
int handle_url(int operation, int socket, char* url, char* buf, int size)
Used to handle redirections, so that a handler can simply compute an alternate
url and then have that resolved.
---
int http_error(int socket, int code)
Generate an error message corresponding to error 'code'
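
The sketch below shows one way a handler might use these two calls. It is
illustrative only: alias_handler is a hypothetical module that re-resolves
requests under "/new", the cookie is assumed to hold the mount point length,
and handle_url is assumed to follow the same result code convention as a
handler function. The two library functions are replaced by stand-ins that
merely record their arguments, so that the fragment compiles on its own; a
real server would supply them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define OP_GET 1

/* Stand-ins for the server-provided library functions. */
static char last_url[256];      /* URL most recently passed to handle_url */
static int  last_error;         /* code most recently passed to http_error */

int handle_url(int operation, int socket, char *url, char *buf, int size)
{
    (void)operation; (void)socket; (void)buf; (void)size;
    assert(url != NULL);
    snprintf(last_url, sizeof last_url, "%s", url);
    return 0;
}

int http_error(int socket, int code)
{
    (void)socket;
    last_error = code;
    return 0;
}

/* Hypothetical alias module: strip the mount point from the URL, prepend
   "/new", and hand the result back to the server for resolution. */
int alias_handler(int operation, int cookie, int socket, char *url,
                  char *header_buf, int buf_size)
{
    char target[256];
    int len;

    if (url == NULL || cookie <= 0 || strlen(url) < (size_t)cookie) {
        http_error(socket, 400);
        return -1;      /* error already reported: server just closes */
    }

    len = snprintf(target, sizeof target, "/new%s", url + cookie);
    if (len < 0 || (size_t)len >= sizeof target)
        return 500;     /* let the server generate the message */

    /* Pass on the data already read, so the next handler sees the headers. */
    return handle_url(operation, socket, target, header_buf, buf_size);
}
```

Mounted at "/old", a request for "/old/page.html" is passed to handle_url as
"/new/page.html".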
---
MORE NEEDED HERE
Comparisons
-----------
BGI offers a much faster alternative to CGI for extending servers; however
there are several disadvantages. The most obvious problem is that BGI itself
uses compiled modules, whereas CGI programs can be written in interpreted
languages. Since a CGI emulation module can be implemented under BGI, this
problem can be circumvented.
Also, since BGI doesn't automatically handle all header processing,
if extensive header processing is needed, this must be handled by the
application. Adding functions to support header manipulation to the support
library would certainly help this.
Open Issues
------------
1) It might be better to have separate handlers for each method, rather than
having the single handler with its operation argument. This would allow
different handlers to manage GET and POST requests. However, this would
complicate the interface, since most handlers would only support a single
method.
2) Adding more functions to the support library will make implementing
gateways easier. I'm open to suggestions.
References:
[McCool 93] Introduction to CGI, http://hoohoo.ncsa.uiuc.edu/cgi/
------- =_aaaaaaaaaa0--