Re: URL decisions in Seattle, & changes

"Daniel W. Connolly" <>
Date: Thu, 31 Mar 1994 18:03:25 --100
Message-id: <>
Precedence: bulk
From: "Daniel W. Connolly" <>
To: Multiple recipients of list <>
Subject: Re: URL decisions in Seattle, & changes 
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Length: 2018
In message <>, Larry Masinter writes:
>>> In future schemes, will '/' and '%2F' mean the same thing or different
>>> things? I gather that the answer is "it depends." This rules out the
>>> idea of having one algorithm for reducing a URI to canonical form.  So
>>> the question of whether
>Well, in fact, the 'canonical' form for any URL must necessarily be
>protocol specific.

I still disagree. It is possible to specify a canonical form for URLs
independent of scheme. The quoting scheme described by Tim and myself
(and implemented in HTParse.c and tested in my test suite...) does
just this.
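
To illustrate what a scheme-independent canonical form could look like, here is a minimal sketch. It is not the HTParse.c code; the function name `canonicalize` and the `SAFE` character set are assumptions made for the example. The idea is that only the %xx escapes are normalized, with no knowledge of the scheme, and that escaped reserved characters such as "/" and "?" are never unescaped.

```python
import re

# Assumption: characters for which the escaped and literal forms are
# treated as the same. Reserved characters ("/", "?", "%") are left out
# on purpose, so "%2F" never collapses into "/".
SAFE = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789-_."
)

_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def canonicalize(url: str) -> str:
    """Hypothetical canonical form: normalize %xx escapes only."""

    def fix(match: re.Match) -> str:
        ch = chr(int(match.group(1), 16))
        if ch in SAFE:
            return ch  # needless escape: use the literal character
        return "%" + match.group(1).upper()  # keep escape, upper-case hex

    # A single pass over the string; the output is not rescanned, so
    # "%2541" stays "%2541" rather than being decoded twice.
    return _ESCAPE.sub(fix, url)
```

With this sketch, `canonicalize("http://host/a%2fb")` and `canonicalize("http://host/a/b")` stay distinct, while `%41` and `A` become the same string, and none of that depends on the scheme.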

> This is true for the default port (e.g., that
>http://host:80/ is the same as http://host/ but gopher: has 70 as a
>default port, etc.)

Given the definition of equality I proposed, http://host:80/ is
different from http://host/. The fact that they resolve to the same
thing is not part of the URL spec.
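
The distinction can be sketched as two separate comparisons. The names `url_equal`, `same_endpoint`, and `DEFAULT_PORTS` are hypothetical, and plain string comparison stands in for the proposed definition of equality, which is an assumption of this example. Only the second function needs per-scheme knowledge.

```python
from urllib.parse import urlsplit

# Scheme-specific knowledge that lives outside the URL syntax itself.
DEFAULT_PORTS = {"http": 80, "gopher": 70}


def url_equal(a: str, b: str) -> bool:
    """URL equality (assumed here): the strings themselves must match."""
    return a == b


def same_endpoint(a: str, b: str) -> bool:
    """Resolution-level equivalence: fills in the scheme's default port."""

    def key(url: str):
        parts = urlsplit(url)
        port = parts.port or DEFAULT_PORTS.get(parts.scheme)
        return (parts.scheme, parts.hostname, port, parts.path)

    return key(a) == key(b)
```

Here `url_equal("http://host:80/", "http://host/")` is false while `same_endpoint` of the same pair is true. The second result depends on a table of default ports, which is exactly the kind of information that sits outside the URL spec.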

> that the same host might have multiple DNS names,
>or that some FTP servers allow case insensitive file names, any number
>of actual equivalences, symbolic links, etc.

None of these things should be part of the URL spec. But things
that are used in practice today, i.e. the significance of ?, /,
and %xx, should be.
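
A small sketch of what "the significance of ?, /, and %xx" might mean in code, independent of scheme. The function name `split_path` and the order of operations (split on "?" first, then on "/", and only then decode %xx) are assumptions for illustration.

```python
from urllib.parse import unquote


def split_path(rest: str):
    """Split the part of a URL after the host into decoded path
    segments and an optional search part (hypothetical helper)."""
    # "?" separates the path from the search part.
    path, sep, search = rest.partition("?")
    # "/" separates segments; %xx is decoded only after splitting,
    # so an escaped slash stays inside its segment.
    segments = [unquote(segment) for segment in path.split("/")]
    return segments, (search if sep else None)
```

Under these assumptions `split_path("a%2Fb/c?x=1")` yields two segments, `a/b` and `c`, plus the search part `x=1`, whereas `split_path("a/b/c")` yields three segments and no search part. That is the sense in which "/" and "%2F" differ.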

>In the grand scheme of things, if you treat "/" and "%2F" as
>different, then at most you'll treat a few things as 'different' that
>are really the 'same', but in fact, this will be an insignificant
>amount compared to the other kinds of duplications.

In the grand scheme of things, the question is whether there's any
common structure to the "parameter package" of a URL. It sounds like
the decision is that there is not, even though this contradicts current

So the grammar for URLs is just:


with terminals:
	IALPHA =~ /[a-zA-Z][a-zA-Z0-9_-]*/;
	CHARS =~ /[^ <>]*/;
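
For anyone who wants to try the two terminals, here is a sketch of them as Python regular expressions. The constant names are taken from the terminals above; the helper names `is_ialpha` and `is_chars` are hypothetical. The letter class is written `a-zA-Z`, since a range that ends in lowercase `z` would also admit several punctuation characters, and the hyphen is placed last in the class so it is read literally.

```python
import re

# A letter followed by letters, digits, "-" or "_".
IALPHA = re.compile(r"[a-zA-Z][a-zA-Z0-9_-]*")

# Anything at all except space and the angle brackets.
CHARS = re.compile(r"[^ <>]*")


def is_ialpha(token: str) -> bool:
    """True if the whole token matches the IALPHA terminal."""
    return IALPHA.fullmatch(token) is not None


def is_chars(token: str) -> bool:
    """True if the whole token matches the CHARS terminal."""
    return CHARS.fullmatch(token) is not None
```

Note how little CHARS constrains: `is_chars("//host/a%2Fb?x")` and `is_chars("%zz")` are both true, so nothing in these terminals gives "/", "?", or "%xx" any special status.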

I'm interested to know if the most widely deployed URL implementations
(www, Mosaic, ...) are going to change to conform to this.