Re: HTML Validation Form [Was: Nesting of HTML elements ]
Mon, 31 Oct 1994 17:57:38 +0100

[Regarding why tilde is not allowed in a URL]

MV> Actually the spec has varied a bit over time, but both RFC 1630
MV> describing WWW URLs and the latest standards-track URL definition from
MV> the URI WG declare that the tilde may not appear in URLs unencoded.

CL> Why?
CL> Given that HTTP is 8-bit clean, the only reason I can think of is
CL> people passing URLs around in mail, through something awful like an
CL> ASCII-EBCDIC gateway. Is that it?

You're correct; the reason for the placement of tilde on the 'unsafe' list
is ostensibly to allow safe passage through gateways. I've asked a couple
of times on this list for specific instances of situations where tilde will
get corrupted, but have never received a reply. I'm taking that to mean
that the situation is extremely rare, if it is known to occur at all. As a
result, I'm continuing to use tilde, even though it violates the

In any case, the appropriate section from the URL draft (version 8) is:

Characters can be unsafe for a number of reasons. The space
character is unsafe because significant spaces may disappear and
insignificant spaces may be introduced when URLs are transcribed or
typeset or subjected to the treatment of word-processing programs.
The characters "<" and ">" are unsafe because they are used as the
delimiters around URLs in free text; the quote mark (""") is used
to delimit URLs in some systems. The character "#" is unsafe and
should always be encoded because it is used in World Wide Web and
in other systems to delimit a URL from a fragment/anchor identifier
that might follow it. The character "%" is unsafe because it is
used for encodings of other characters. Other characters are
unsafe because gateways and other transport agents are known to
sometimes modify such characters. These characters are "{", "}",
"|", "\", "^", "~", "[", "]", and "`".

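To make the encoding requirement concrete, here is a minimal Python sketch
of percent-encoding the characters that the passage above lists as unsafe.
The names UNSAFE and encode_unsafe and the sample path are my own, chosen
for illustration only; nothing here comes from the draft beyond the
character list itself.

```python
# Characters the quoted draft text calls unsafe: space, < > " # %,
# plus the set that gateways are said to modify: { } | \ ^ ~ [ ] `
UNSAFE = set(' <>"#%{}|\\^~[]`')


def encode_unsafe(text: str) -> str:
    """Replace each unsafe character with %XX (hex of its ASCII code).

    Hypothetical helper for illustration; it expects raw, not yet
    encoded, text.
    """
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in UNSAFE else ch
        for ch in text
    )


print(encode_unsafe("/~user/index.html"))  # /%7Euser/index.html
```

Because "%" is itself on the list, a helper like this should be applied
only once to raw text; running it over an already encoded string would
turn "%7E" into "%257E".
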
-- | Harvey Mudd College |

"The universe is made of stories, not atoms." - Muriel Rukeyser