Re: Globalizing URIs

Larry Masinter (masinter@parc.xerox.com)
Thu, 3 Aug 95 03:42:12 EDT

Let me try again:

> <P>URL's often point to files on a file system, which increasingly,
> may <EM>not</EM> have a name that uses printable ASCII characters. For
> example, on a Japanese system, a file might have the name
> "insatsu.html", in which the "insatsu" might be represented in
> romaji, katakana, hiragana, or kanji. In such cases, the octets that
> fall outside the range of printable ASCII would be encoded as per the
> specification, resulting in something looking like the following on
> EUC-based systems:
> <PRE>
> http://www.jacme.co.jp/%B0%F5%BA%FE.html
> </PRE>

How the http server for www.jacme.co.jp decides to translate strings
into files in its local file system is COMPLETELY up to the
implementation of the http server. www.jacme.co.jp could be running
some object-oriented database operating system which doesn't have
files at all. It could be running a file system where every file and
directory was 'named' with a bitmap image rather than a string of
characters.

The URL standard makes no claims about the mapping of URLs to anything
at all in the local file system of the local operating system. It
defines how URLs are written, and how URLs are translated into
sequences of octets that are sent in the protocol for the particular
scheme chosen.
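
As a rough illustration of that last point, here is a minimal Python sketch of how a string of octets ends up in the written form of a URL. It assumes the kanji spelling of "insatsu" is 印刷 and that the octets come from an EUC-JP encoding. The helper name percent_encode_octets is made up for this example. Nothing here says what the server does with the octets once it receives them.

```python
from urllib.parse import quote, unquote_to_bytes

def percent_encode_octets(octets: bytes) -> str:
    """Write every octet outside the unreserved ASCII set as %XX."""
    return quote(octets, safe="")

# Assumption: "insatsu" written in kanji, encoded as EUC-JP by whoever built the link.
name_octets = "印刷.html".encode("euc_jp")
url = "http://www.jacme.co.jp/" + percent_encode_octets(name_octets)
print(url)

# The receiving side only gets octets back; the URL itself does not say
# which character set, if any, they were drawn from.
path_octets = unquote_to_bytes(url.rsplit("/", 1)[1])
print(path_octets)
```

The round trip recovers the same octets, but recovering the characters requires knowing the encoding by some other means.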

If you want to build an HTTP server that accepts strings of the form

[character-set:encoding]:name-string

then feel free; however, it would have to be written

http://www.jacme.co.jp/[EUC]%B0%F5%BA%FE.html

This convention requires no changes to the HTTP or URL standards.
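
A server that chose to adopt such a convention might interpret the path roughly as in the Python sketch below. Everything in it is hypothetical: the function name decode_tagged_path, the CHARSET_ALIASES table, and the decision to map the label "EUC" to Python's euc_jp codec are assumptions for illustration, not part of any standard.

```python
import re
from urllib.parse import unquote_to_bytes

# Hypothetical mapping from labels used in URLs to Python codec names.
CHARSET_ALIASES = {"EUC": "euc_jp"}

# A leading "/[label]" followed by the rest of the path, matched on raw octets.
_TAG = re.compile(rb"^/\[([A-Za-z0-9_-]+)\](.*)$", re.DOTALL)

def decode_tagged_path(path: str):
    """Return (label, name); label is None when the path carries no tag."""
    raw = unquote_to_bytes(path)  # undo the %XX escapes first
    match = _TAG.match(raw)
    if match is None:
        # No tag: this sketch treats the octets as ASCII and nothing more.
        return None, raw.decode("ascii", errors="replace")
    label = match.group(1).decode("ascii")
    codec = CHARSET_ALIASES.get(label.upper(), label)
    return label, match.group(2).decode(codec)

print(decode_tagged_path("/[EUC]%B0%F5%BA%FE.html"))
print(decode_tagged_path("/index.html"))
```

The tag is just more characters in the path as far as HTTP and the URL syntax are concerned; only this particular server gives it meaning.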