Re: Project Gutenberg's Roget's Thesaurus

Tim Berners-Lee <timbl@www3.cern.ch>
Date: Thu, 15 Apr 93 09:41:14 +0100
From: Tim Berners-Lee <timbl@www3.cern.ch>
Message-id: <9304150841.AA06587@www3.cern.ch>
To: marca@ncsa.uiuc.edu (Marc Andreessen)
Subject: Re: Project Gutenberg's Roget's Thesaurus 
Cc: www-talk@nxoc01.cern.ch
Reply-To: timbl@nxoc01.cern.ch

| Date: Tue, 13 Apr 93 12:42:01 -0500
| From: marca@ncsa.uiuc.edu (Marc Andreessen)

 

| Guido.van.Rossum@cwi.nl writes:
[...]
| > I see a problem coming here: how does an unreplicated document (say my
| > own home page) make a reference to such a replicated document?  If I
| > have a reference to the closest replica, a user far away who follows
| > such a link will get pointed to the replica closest to *me*, not
| > closest to her.
| > 

| > Some possible solutions:
| > 

| > - a translation scheme whereby clients "know" (e.g. from a local
| > configuration file that may be updated automatically as mirror sites
| > are added) that information at host X is identical to info at host Y

Yes, this is possible with the 2.0 library.  The line mode client
doesn't have the command line option yet, but the client library
can use a rule file, just like a server.
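
A rough sketch of the kind of client-side translation described above,
written in Python rather than in the library's own rule-file syntax
(which is not shown here).  The rule table, the mirror host name
mirror.example.org and the function name translate are all made up for
illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical local rules: "information at host X is identical to info
# at host Y".  A real client would load these from a configuration file.
MIRROR_RULES = {
    "info.cern.ch": "mirror.example.org",
}

def translate(url, rules=MIRROR_RULES):
    """Rewrite the host part of a URL if a local mirror rule exists for it."""
    parts = urlsplit(url)
    mirror = rules.get(parts.hostname)
    if mirror is None:
        return url  # no rule: leave the reference untouched
    # Keep an explicit port if the original URL had one.
    netloc = mirror if parts.port is None else f"{mirror}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path,
                       parts.query, parts.fragment))
```

The document itself still carries the original host name; only the
client's local rules decide which replica is actually contacted.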

| > - a magic string in hostnames that is translated dependent on the
| > geographical position of the client (e.g.
| > http://info-cern.closestmirror/...)


This has been discussed: in fact, having a host name which translates
into many IP addresses.  All the apparatus is already there in DNS,
and most people say it won't break anything.  We just need code which,
if DNS returns more than one IP address, pings them all to find the
closest and remembers which one it is.
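
A minimal sketch of that missing code, in Python.  It is an
illustration only: it times a TCP connect instead of sending a real
ping (ICMP usually needs special privileges), it handles IPv4 only, and
the names connect_time and closest_address are invented here.

```python
import socket
import time

_closest = {}  # host name -> address chosen last time

def connect_time(ip, port=80, timeout=2.0):
    """Time a TCP connect as a stand-in for ping."""
    start = time.monotonic()
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return float("inf")  # unreachable addresses sort last

def closest_address(host, resolve=None, measure=connect_time):
    """Return the fastest-responding address for host, remembering the choice."""
    if host in _closest:
        return _closest[host]  # remembered from an earlier call
    if resolve is None:
        # gethostbyname_ex returns (hostname, aliases, ip_addresses).
        resolve = lambda h: socket.gethostbyname_ex(h)[2]
    addresses = resolve(host)
    # Only probe when DNS really returned more than one address.
    best = min(addresses, key=measure) if len(addresses) > 1 else addresses[0]
    _closest[host] = best
    return best
```

The resolve and measure arguments are there so the selection logic can
be tried out without touching the network.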

| > - upon first contact with a server, it might respond with "please
| > try the following mirror site which is closer to you" (this could be
| > put in HTTP2 I suppose).


This is already in the HTTP2 spec.  The reply field can be a "forward"
reply containing a pointer to the real object.  That is, HTTP2 servers
can be used as name servers.  There is no code for this in the library yet.
See http://info.cern.ch/hypertext/WWW/Protocols/HTTP/HTRESP.html#z9
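
On the client side, handling a "forward" reply might look roughly like
the Python sketch below.  The reply shape used here, a pair of
("forward", new_url) or ("document", body), is an assumption made for
the example and is not the reply format from the spec; request is a
hypothetical callable standing in for the real network code.

```python
def fetch_following_forwards(url, request, max_hops=5):
    """Fetch url, following "forward" replies until a document comes back.

    request is a hypothetical callable returning ("forward", new_url)
    or ("document", body).
    """
    seen = set()
    for _ in range(max_hops + 1):
        if url in seen:
            raise RuntimeError(f"forwarding loop at {url}")
        seen.add(url)
        kind, value = request(url)
        if kind == "document":
            return url, value  # final address and the object itself
        url = value  # forward: retry at the address the server pointed to
    raise RuntimeError("too many forwards")
```

Returning the final address along with the body lets the client
remember where the object really lives.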

| > This is a real problem with embedding location information in URLs...
| 

| Sounds like it's time to move URN's (or whatever persistent Internet
| resource identifiers are being called these days) out of the theory
| stage and into practice.... anyone know the status of the URN
| work that was/is apparently going on somewhere in the IETF?

I thought it was going to get closer, but basically it was agreed that
the format urn:publisher/id would work with a finite number of
publishers.  I think you might see Peter Deutsch and Chris Weider
putting some code together, maybe.

Remember that if you have to contact a name server every time,
you slow things down anyway.  Basically, a little common sense
in the client would guess that a related document was available from
the same server as last time.
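
As a sketch of that kind of guess, in Python: the client remembers
which server it used for a publisher last time and asks the name
server only when it has no guess.  The class name, the lookup callable
and the example.org host names are hypothetical.

```python
class ServerGuesser:
    """Remember which server last held a publisher's documents."""

    def __init__(self, lookup):
        self.lookup = lookup    # hypothetical, slow name-server query
        self.last_server = {}   # publisher -> server used last time

    def server_for(self, urn):
        # Assumes the urn:publisher/id form mentioned above.
        publisher = urn.split(":", 1)[1].split("/", 1)[0]
        if publisher not in self.last_server:
            self.last_server[publisher] = self.lookup(publisher)
        return self.last_server[publisher]
```

A real client would also need to fall back to the name server when the
guessed server turns out not to have the document.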

Tim