Re: Testing URIs for equality

"Daniel W. Connolly" <connolly@hal.com>
Errors-To: listmaster@www0.cern.ch
Date: Wed, 16 Mar 1994 23:09:40 --100
Message-id: <9403162156.AA10729@ulua.hal.com>
Reply-To: connolly@hal.com
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: "Daniel W. Connolly" <connolly@hal.com>
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: Re: Testing URIs for equality 
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Length: 1300
In message <2D876CE8@MSMAIL.INDY.TCE.COM>, Fisher Mark writes:
>
>Two things:
>1. I really hate to break existing code and/or documents; but
>2. Special cases in code or data are the software engineer's nightmare.
>
>It seems much simpler to allow everything to be escaped.  In that case, 
>multiple escaping would be prohibited, so that:
>     %25%32%30
>is turned into:
>     %20
>but not then turned into " " (single blank).  The problem with this, of 
>course, is -- how many servers and clients would this break?

I think I need more context to completely understand your example.
But if what you're saying is that
	URI_compare("%25%32%30", "%2520") should return TRUE,
then I'm with you: the first argument just has to be reduced
to canonical form (ordinary characters like "2" and "0" unescaped,
"%" itself left escaped) before parsing or comparing.

If, on the other hand, you're saying that
	URI_compare("%25%32%30", "%2520")
should accomplish this by undoing _all_ the escapes and doing:
	strcmp("%20", "%20")
then we've got a problem. It doesn't show up in this case, but
what about

	URI_compare("foo%23xxx", "foo#xxx");

By this suggestion, we end up with
	strcmp("foo#xxx", "foo#xxx")
which reports a match, even though the first URI means
	the file "foo#xxx"
and the second means
	the "xxx" fragment of file "foo"
which are different things altogether.
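
To make the distinction concrete, here is a rough sketch of the first
reading: canonicalize each URI in a single pass, then compare. The helper
names (hex_value, is_safe, uri_canonical) are made up for illustration,
and the rule that only ASCII letters and digits get unescaped is my own
simplifying assumption, not something any spec settles here. Anything
else, "%" and "#" included, stays escaped.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Assumption: only ASCII letters and digits are safe to unescape. */
static int is_safe(int c)
{
    return c < 128 && isalnum(c);
}

/*
 * Hypothetical helper: return a malloc'd canonical copy of uri.
 * Escapes of safe characters are decoded; every other escape is kept,
 * with its hex digits upper-cased. The input is scanned exactly once,
 * so a decoded "%" can never start a second round of unescaping.
 */
static char *uri_canonical(const char *uri)
{
    static const char hex[] = "0123456789ABCDEF";
    char *out, *p;

    assert(uri != NULL);
    out = malloc(strlen(uri) + 1);   /* output never grows past input */
    if (out == NULL)
        return NULL;

    p = out;
    while (*uri != '\0') {
        int hi, lo;
        if (uri[0] == '%' &&
            (hi = hex_value(uri[1])) >= 0 &&
            (lo = hex_value(uri[2])) >= 0) {
            int c = hi * 16 + lo;
            if (is_safe(c)) {
                *p++ = (char)c;      /* e.g. "%32" becomes "2" */
            } else {
                *p++ = '%';          /* e.g. "%23" stays "%23" */
                *p++ = hex[hi];
                *p++ = hex[lo];
            }
            uri += 3;
        } else {
            *p++ = *uri++;           /* literal character, copied as-is */
        }
    }
    *p = '\0';
    return out;
}

/* Returns 1 if the canonical forms match, 0 otherwise. */
int URI_compare(const char *a, const char *b)
{
    char *ca = uri_canonical(a);
    char *cb = uri_canonical(b);
    int same = ca != NULL && cb != NULL && strcmp(ca, cb) == 0;

    free(ca);
    free(cb);
    return same;
}
```

With this sketch, URI_compare("%25%32%30", "%2520") gives 1, because
both sides canonicalize to "%2520", while
URI_compare("foo%23xxx", "foo#xxx") gives 0, because the "%23" is never
turned into a bare "#".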
	
Dan