Re: How can one use '#' in a URL?
"William M. Perry" <wmperry@mango.ucs.indiana.edu>
Message-id: <9306181430.AA24713@dxmint.cern.ch>
To: p.lister@cranfield.ac.uk
Cc: www-talk@nxoc01.cern.ch
Subject: Re: How can one use '#' in a URL?
In-reply-to: Your message of "Fri, 18 Jun 1993 15:06:33 -0000."
<9306181406.AA21659@xdm039>
Date: Fri, 18 Jun 1993 09:28:43 -0500
From: "William M. Perry" <wmperry@mango.ucs.indiana.edu>
>I've thought for some time that a nice feature in WWW servers would be
>the ability to look inside a remote tar (or tar.Z) file (or an ar
>library or whatever), i.e. remotely list the contents or just extract
>one file. I've been thinking how I could build this into a server like plexus.
>
>So that a normal URL which names a tar file will simply retrieve the
>file (as most users want, I assume) there needs to be some way of
>referring to the subunit of the file. The obvious way is to use #, e.g.
>
>http://www/foo/bar.tar.Z#blech.c
>
> - this would indicate to the server that it is to unpack blech.c from
>/foo/bar.tar.Z and send it back as a normal file. However, I have only
>seen the # documented as something which browsers can use for internal
>hyperlinks, not as part of a URL that can be sent to a http server.
>Is this a problem?
I don't think that is correct. Any URL can make use of the # directive.
It just tells a browser to go to a specific part of a document instead of
to the beginning. I don't know of any browsers that actually 'generate'
these internally.
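
To illustrate the point about # being a browser-side directive, here is a
minimal sketch in present-day Python of how a client might split the
fragment off before contacting the server. It is an illustration only, not
code from any browser mentioned in this thread; split_for_request is a
hypothetical helper name, and urldefrag comes from the standard urllib.parse
module.

```python
from urllib.parse import urldefrag


def split_for_request(url):
    """Return (url_to_request, fragment_kept_by_the_client).

    Hypothetical helper: the first element is what a client would ask
    the server for; the second is the part after '#', which the client
    keeps for itself to locate a position inside the document.
    """
    base, fragment = urldefrag(url)
    return base, fragment


# The example URL from the quoted message.
print(split_for_request("http://www/foo/bar.tar.Z#blech.c"))
```

Under this model the server would only ever see http://www/foo/bar.tar.Z,
so it would have no way of knowing that blech.c was wanted.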
>The question mark would seem better suited to search the list of contents,
>e.g. http://www/foo/bar.tar.Z?*.c
>would return
>
>blech.c erk.c glug.c
>
>A related question is whether I can have slashes in the text following
>the # or ?. e.g. for a tar file with relative filenames
>
>http://www/foo/bar.tar.Z#./blech.c
>
>would be common. Since slashes are meaningful to an http: URL, and the
>browser can interpret them, would this cause confusion?
The only problem with using ./ in the path would be that the browser
is likely to interpret it as a relative pathname. For example,
http://www/foo/bar.tar.Z#./blech.c would more than likely be translated
into http://www/foo/bar.tar.Z#/foo/blech.c by a browser.
It would be fairly easy to tell the replacing function not to replace
any . or .. after a # sign, though. (I haven't tried this with Mosaic or
Lynx yet, but I know that my emacs browser would replace the ./ with nothing,
so the URL above would become http://www/foo/bar.tar.Z#blech.c)
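
A rough sketch of a replacing function that leaves everything after the #
alone might look like the following, in present-day Python. This is an
assumption about one way to do it, not the code of any browser named above:
normalize_keep_fragment is a hypothetical name, and posixpath.normpath stands
in for whatever . and .. handling a real browser uses.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def normalize_keep_fragment(url):
    """Collapse . and .. in the path only; never touch the fragment.

    Hypothetical helper: the URL is split into its components first, so
    the text after '#' is carried through byte for byte.
    """
    parts = urlsplit(url)
    path = parts.path
    if path:
        had_trailing_slash = path.endswith("/")
        # normpath resolves "." and ".." segments in the path component.
        path = posixpath.normpath(path)
        # normpath drops a trailing slash; restore it if it was there.
        if had_trailing_slash and not path.endswith("/"):
            path += "/"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


# The ./ in the path is collapsed, the ./ after the # is preserved.
print(normalize_keep_fragment("http://www/foo/./bar.tar.Z#./blech.c"))
```

With this approach a tar member name such as ./blech.c would survive
unchanged after the #, while relative segments in the path itself are
still resolved.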
-Bill Perry