Re: Request: URN support in browsers needed!

Larry Masinter (masinter@parc.xerox.com)
Fri, 21 Oct 1994 23:15:20 +0100

I said:
>In general, any unknown scheme should be passed to the proxy server
>unscathed!

and Dan asked:
>Are there exceptions for fragment identifiers, and search terms, and
>relative paths, or not?...

pointing out an example where an unknown scheme is followed by a
#section1 fragment identifier.

I think the answers are:

* fragment identifiers are *not* part of the URL (despite what RFC1630
says; this is an area that has been fixed in the current working
draft, soon to be a proposed standard RFC).

* search terms are part of some URL schemes and not others. Those that
have them may use a different syntax than others. Gopher URLs can
contain ? without having the ? treated as the introduction to a
search term, for example. It would be wrong to have URN => URL
remove any '?search' suffix, translate the sub-part, and then
reapply the ?search.

* relative paths are not complete URLs, and thus, the consideration
of what gets passed to the proxy isn't applicable. Rather, a
relative path gets combined with the base URL to produce a new
complete anchor specification; this anchor spec then can be
treated with the general algorithm.
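
As a rough illustration of the first two points, here is a minimal Python sketch of how a browser might build the request line for the proxy. The function name proxy_request_line is hypothetical; the only assumptions are that the fragment is split off at the first '#' and that everything else, including any '?', is passed through untouched.

```python
def proxy_request_line(href: str) -> tuple[str, str]:
    """Return (request line for the proxy, fragment kept by the browser)."""
    # The fragment identifier is not part of the URL, so it stays local.
    url, _, fragment = href.partition("#")
    # No attempt is made to interpret '?': its meaning depends on the scheme.
    return f"GET {url} HTTP/1.0", fragment


if __name__ == "__main__":
    line, fragment = proxy_request_line("urn:234lkj23/4lk2j#section1")
    print(line)      # request sent to the proxy
    print(fragment)  # applied by the browser after the document arrives
```

A relative path would first be combined with the base URL, and only the resulting complete URL would go through a function like this.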


Thus, in:

<p>See <a href="urn:234lkj23/4lk2j#section1">section 1 of "The
Birds and The Bees"</a> for more info.
</body>

the data
GET urn:234lkj23/4lk2j HTTP/1.0

should be sent to the proxy. Suppose the resulting document contains

<html>
<head><isindex><title>An Example</title></head>
<body>
<p> See <a href="../overview.html">the overview</a> for
more info.

The relative path "../overview.html" is then relative to the
'base' location of the result, and not necessarily to its original URN
specification.

> Must the process of resolving urn:234lkj23/4lk2j yield some "local"
> address (ala the HTTP URI: or Location: header or the HTML <BASE> tag)
> for use with relative HREF's?

I believe the processing of relative paths can be described
independently of the scheme of the base URL to which it is being
applied. If the user selects a relative path HREF, it will be combined
with whatever the base reference for the document is. If you want to
use 'urn:' without translation to some more "local" address, then
either the URNs must be spelled with nested a/b/c/d syntax so that the
relative links can be computed against them, or else you must avoid
using them in combination with relative paths, and, for example,
replace all of the relative paths in the document with explicit fully
specified URIs.
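
To make the scheme-independent reading concrete, here is a minimal Python sketch that combines a relative path with a base purely by manipulating '/'-separated segments. The function name resolve_relative and the sample bases are hypothetical, and the sketch deliberately ignores queries, fragments, and protection of the host part; it is not a full relative-URL algorithm.

```python
def resolve_relative(base: str, relative: str) -> str:
    """Combine a relative path with a base, without looking at the scheme."""
    scheme, _, path = base.partition(":")
    # Drop the last segment of the base; the rest is its "directory".
    segments = path.split("/")[:-1]
    for part in relative.split("/"):
        if part == "..":
            if segments:
                segments.pop()
        elif part != ".":
            segments.append(part)
    return scheme + ":" + "/".join(segments)


if __name__ == "__main__":
    # Works the same way for a nested urn: base and an http: base.
    print(resolve_relative("urn:a/b/c/d", "../overview.html"))
    print(resolve_relative("http://host/docs/ch1/page.html", "../overview.html"))
```

A URN without nested a/b/c/d structure gives this kind of procedure nothing to work with, which is the case where explicit fully specified URIs would be needed instead.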

I think the same reasoning goes for <ISINDEX> queries, i.e., they only
work for those schemes that support ? as the query method with the
particular syntax defined for it, against the base URI. In this case,
it is even more clear that a 'urn' scheme would want to supply a 'base'
against which subsequent queries and relative paths should be defined.

Dan also properly pointed out the irony of my complaint about
'multiple proxy servers' in the light of the experimental urn proxy.

The simplest way to handle the cases of search and relative pointers
would be to have the urn proxy do redirects rather than return the
resource.
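
A minimal Python sketch of what such a redirecting proxy might return. The lookup table URN_TABLE, the example.org address, and the function name redirect_response are all hypothetical; how a real resolver would map URNs to URLs is not specified here.

```python
# Hypothetical URN-to-URL table; a real resolver would consult some service.
URN_TABLE = {
    "urn:234lkj23/4lk2j": "http://example.org/docs/birds/bees.html",
}


def redirect_response(urn: str) -> str:
    """Build a raw HTTP/1.0 response that redirects instead of returning the resource."""
    location = URN_TABLE.get(urn)
    if location is None:
        return "HTTP/1.0 404 Not Found\r\n\r\n"
    # The browser follows Location: and uses that URL as the base for
    # any relative paths or searches in the document it then fetches.
    return f"HTTP/1.0 302 Moved Temporarily\r\nLocation: {location}\r\n\r\n"


if __name__ == "__main__":
    print(redirect_response("urn:234lkj23/4lk2j"))
```

With a redirect, "../overview.html" in the returned document would be combined with the Location: URL rather than with the urn: itself.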