Re: Cache woes.

"Daniel W. Connolly" <connolly@hal.com>
Errors-To: listmaster@www0.cern.ch
Date: Fri, 20 May 1994 17:14:01 +0200
Message-id: <9405201511.AA11806@ulua.hal.com>
Reply-To: connolly@hal.com
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: "Daniel W. Connolly" <connolly@hal.com>
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: Re: Cache woes. 
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Type: text/plain; charset="us-ascii"
Mime-Version: 1.0
In message <4385.9405201434@daniell.brunel.ac.uk>, Paul.Wain@brunel.ac.uk writes:
>
>Hrm, okay, so in the future it should be okay. The cache in question was
>a CERN one, no idea what version, that was translating '=' to %3D which
>as Dan said is erm "NOT safe". That was the only thing I could see
>causing problems (BTW from what I am told - not had a chance to check -
>the BNF says that = should be escaped. Is that right?)

In some circumstances. Let me explain:

There are two purposes for the %XX construct:

	(1) distinguish data from markup, e.g.
	distinguish '/' as a pathname-constituent character
	(as it might be on a mac) from '/' as a pathname-separator
	character (as it is in POSIX and the URI syntax).

	(2) allow transmission of URIs through transports that
	are only reliable for a subset of the 256 octets.

	I believe (2) was originally actually a hack to represent
	spaces in HREF attribute values, a la:
		HREF=ftp://machost/dir/file%20with%20spaces
	This is clearly bogus, since unquoted SGML attribute
	values have a much more limited syntax, and the simple
	way to represent the above is:
		HREF="ftp://machost/dir/file with spaces"
	What about URLs with " in them? SGML syntax includes:
		HREF="ftp://machost/dir/file with &#34; in it"
	So it is actually possible to represent an arbitrary
	sequence of characters in an SGML attribute value.

	But the (2) issue is still motivated by mail transport...
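
The attribute-value point can be illustrated with a small sketch.
This is Python, using only the standard html module, and it assumes
HTML-style character references behave like the SGML ones described
above. Note that html.escape emits &quot; rather than &#34;; both
decode to the same character.

```python
import html

# A URL containing a space and a double quote, written with no %XX escapes.
url = 'ftp://machost/dir/file with " in it'

# Escape the characters that are markup inside a quoted attribute value.
attr_value = html.escape(url, quote=True)
anchor = '<A HREF="' + attr_value + '">'

# Both the named and the numeric reference decode back to the original.
decoded_named = html.unescape(attr_value)
decoded_numeric = html.unescape('ftp://machost/dir/file with &#34; in it')
```

The quote is protected by the attribute syntax itself, so no %XX
escape is needed just to get the URL into the document.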

So ~ and %7E mean exactly the same thing. As long as the transport
is one in which ~ characters make it through OK, there's nothing
wrong with writing http://www.hal.com/~connolly/index.html, except
for the fact that some stupid implementation might copy that
into a mail message without changing the ~ to a %7E, and then
an ASCII/EBCDIC translation would munge the ~ char.
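
A minimal sketch of that equivalence in Python. The helper name
escape_for_transport and its "fragile" parameter are hypothetical,
made up for illustration; only urllib.parse.unquote is a real
library call. It assumes the fragile characters are plain ASCII.

```python
from urllib.parse import unquote

def escape_for_transport(uri, fragile="~"):
    """Rewrite characters the transport may mangle as %XX escapes.

    Hypothetical helper: 'fragile' lists the ASCII characters that are
    assumed unsafe for the transport in question (e.g. mail gateways).
    """
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in fragile else ch
        for ch in uri
    )

plain = "http://www.hal.com/~connolly/index.html"
mailed = escape_for_transport(plain)

# The escaped and unescaped forms name the same resource.
same = unquote(mailed) == plain
```

Since ~ has no markup role in the URI syntax, escaping it or not
changes only how well the URI survives the trip, not what it means.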

On the other hand, / and %2F do NOT mean the same thing. Nor
do = and %3D. %3D means "= as a data character", whereas plain = 
means, for example:
	ftp://host/dir/file;type=image
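
One way to respect the distinction is to split on the markup
characters first and decode %XX only afterwards. The sketch below is
Python; split_params is a hypothetical helper, not part of any
library, and it assumes parameters are written as ;name=value after
the last path segment.

```python
from urllib.parse import unquote

def split_params(segment):
    """Split 'file;name=value' on markup characters, then decode the data."""
    name_part, *raw_params = segment.split(";")
    params = {}
    for raw in raw_params:
        # A plain '=' here is markup: it separates name from value.
        name, _, value = raw.partition("=")
        params[unquote(name)] = unquote(value)
    return unquote(name_part), params

# Plain '=' is markup; %3D is an '=' that belongs to the data.
markup_case = split_params("file;type=image")
data_case = split_params("a%3Db;type=image")

# Likewise, plain '/' separates segments while %2F stays inside one.
two_segments = "dir/file".split("/")
one_segment = [unquote(s) for s in "dir%2Ffile".split("/")]
```

Decoding before splitting would turn a%3Db;type=image into
a=b;type=image, and the data '=' could no longer be told apart from
the separator. That is why a cache that rewrites = as %3D changes
the meaning of the URL.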


I hope that makes things clear.

By the way... these issues are the province of the URI working group,
whose discussion forum is the uri mailing list. Contact uri-request@bunyip.com.

Dan