Re: proxy and Next browser

Tim Berners-Lee <timbl@ptpc00.cern.ch>
Errors-To: secret@www0.cern.ch
Date: Wed, 9 Feb 1994 12:20:49 --100
Message-id: <9402091118.AA03829@ptpc00.cern.ch>
Reply-To: www-talk@www0.cern.ch
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: Tim Berners-Lee <timbl@ptpc00.cern.ch>
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: Re: proxy and Next browser
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Length: 7427


I'm distributing this reply to a private mail from Kevin Altis to the
list because lower down there are some nitty gritty points about
assumptions about access rights in HTTP which affect gateways, and
I will change the spec to reflect that. So ignore this message
unless you're into that sort of thing. :-)

> we're pretty close to announcing the new proxy support that Ari, Lou, and I
> have been working on. Much of the credit should go to you though since I
> got the idea last year based on your earlier GATEWAY code. Any comments you
> have, especially any gaping holes in the method would be greatly
> appreciated;

I understand from Ari that it works just like the WWW_xxxx_GATEWAY
method, except the environment variable is different.  Is that
right?  What are the differences?

>            the only thing that comes to mind right now are protocols
> (Z39.50 ???, maybe DCE stuff?) that can't be handled by HTTP transactions
> today which would mean that the proxy can't handle that kind of request.
> Should HTTP/2.0 cover those cases?

Yes -- in fact, we should have a well-defined method of defining
how any arbitrary method maps onto HTTP.  The idea was that HTTP should
be self-extending: the Allowed: header coming back would give a list
of operations, and the client could then query the server to get
a description of new operations in some (NEGOTIATED) language.
See the SHOWMETHOD method.
That negotiation is a key.  Suppose you have a Z39.50 gateway to
a server doing a boolean search of some special variety. The
"Z2345678SEARCH" is mentioned in the "ALLOWED:" header
and the client checks it out with the server.  Current person-oriented
clients can only handle HTML+ and get back a form. Fine -- if
it contains fields and instructions, and links to explanation of what the
parameters mean.  Future smart clients can get back a semantic
description of the operation (pre- and post-conditions) as well as
a more formal description of the parameters.  I see this extensibility
as being the direction for not only gateways but also arbitrary OO
systems out there.


So we just need to say how any arbitrary parameter set would be sent
by HTTP.  Obviously a MIME/multipart would be a way when the values are big
objects, but it is ugly(-ier even) when the parameters are small --
like integers.  A halfway house would be to allow both like

Param:  <parametername> <type> <value>

which can include

Param: <parametername> SEE <URL>

where URL could be a cid:<content-type> referring to an enclosure.
Make sense?  Use SGML instead?  ASN.1? Mapping should be
bidirectional. Shouldn't matter whether DCE or HTTP is underneath.
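
As a rough illustration of the two Param: forms above, here is a minimal
Python sketch of a parser.  Nothing here is in the spec: the Param tuple,
the parse_param function, and the example names "maxhits" and "query" are
all hypothetical, and the sketch assumes fields are separated by
whitespace and that no type is ever literally called SEE.

```python
from typing import NamedTuple, Optional


class Param(NamedTuple):
    name: str
    type: Optional[str]   # None when the value is given by reference
    value: Optional[str]  # inline value, or None for the SEE form
    ref: Optional[str]    # URL from the SEE form, else None


def parse_param(line: str) -> Param:
    """Parse one hypothetical 'Param:' header line in either proposed form."""
    header, _, rest = line.partition(":")
    if header.strip().lower() != "param":
        raise ValueError("not a Param header: %r" % line)
    # Assumption: three whitespace-separated fields; the last may hold spaces.
    fields = rest.split(None, 2)
    if len(fields) != 3:
        raise ValueError("expected three fields: %r" % line)
    name, second, third = fields
    if second == "SEE":
        # By-reference form: Param: <parametername> SEE <URL>
        return Param(name, None, None, third.strip())
    # Inline form: Param: <parametername> <type> <value>
    return Param(name, second, third.strip(), None)
```

With this sketch, "Param: maxhits integer 20" yields an inline value and
"Param: query SEE cid:text/plain" yields a reference to an enclosure.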


> I think this will make a huge difference in the usage of the Internet and
> the Web in particular.

Fortunately, it won't clobber the backbones -- in fact it will reduce
the long distance traffic.  HTTP is now a significant share of
the traffic, so we think twice before introducing features which
may up the traffic overnight.  Caching may even cut the figures right down.
We will need good monitors on the caches to be able to estimate the
effective traffic which would be generated were it not for them.
That will be interesting to plot vs total line capacity!

> Suddenly Mac and PC based users behind firewalls
> that have never been able to use the Internet will be able to reach out. Of
> course, Unix and VMS users will be helped as well, but at least they've
> usually had some grungy ways of getting out to the rest of the world. On
> the other hand, this solution should go over real well with administrators
> since the level of logging and restrictions on the type of things the user
> can do are much greater than SOCKS or other halfway solutions without
> really censoring a user (maybe PUTs).

Yes.

> It was also intentional on my part to do the proxy this way to elevate the
> importance of HTTP on the Internet (HTTP is the vehicle for proxying) and
> make it possible to put much of the smarts in the caching server rather
> than the client. For example, if we have to support gopher+, it can
> probably be done on the proxy server, not on every client.

This decision really ends up being made separately for each case.
Some people want a totally capable client.  They have a good net
connection, lots of local cache, and no sysadmin support.
They want fast response and software which works out of the box.
For them, the totally-equipped client. For others, the firewall
and a good sysadmin and maybe a slow connection from the firewall
out make use of the proxy a must.  So from the code point of
view, I am happy that the same code can be used in server and
client.  We should stick with this model with any new protocol
additions, I think.

A cool thing might be to use DNS to find the nearest proxy. Just
as I hope new clients will look for a local www.dom.ain server for
a default home page, maybe someone needing a wais gateway could
look for www-wais.dom.ain and then www-wais.ain in an attempt to
find one lying around. This would make things work better out of the box
and reduce configuration bugs.
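
The lookup order described above could be sketched roughly as follows in
Python.  This is only an illustration: candidate_hosts and find_gateway
are hypothetical names, and the sketch assumes that "found" simply means
the name resolves, which says nothing about whether a gateway is actually
running there.

```python
import socket


def candidate_hosts(prefix, domain):
    """Yield prefix.<domain>, then prefix.<parent of domain>, and so on."""
    labels = domain.strip(".").split(".")
    for i in range(len(labels)):
        yield prefix + "." + ".".join(labels[i:])


def find_gateway(prefix, domain, resolve=socket.gethostbyname):
    """Return the first candidate name that resolves, or None."""
    for host in candidate_hosts(prefix, domain):
        try:
            resolve(host)
        except OSError:
            # socket.gaierror is a subclass of OSError: name not found.
            continue
        return host
    return None
```

For a client in dom.ain, candidate_hosts("www-wais", "dom.ain") gives
www-wais.dom.ain followed by www-wais.ain, matching the order above.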

> You can also
> make a lightweight client that only speaks HTTP if you want, but since
> clients will probably speak native ftp, wais, etc. within an organization
> rather than always going through a proxy that might not be important.
> Clients don't have to speak TCP/IP to a proxy either, which opens up some
> interesting options.

We ran HTTP over DECnet through a proxy.  Then the folks who asked
about it ended up getting TCP/IP on their VAXes!  We didn't want to
support it.  But the Novell guys might take that on. (Or maybe we should
not encourage it too much if we believe IP on everything is the best
for the world)

> I might be able to get Web clients onto handhelds yet.

How come my phone has no wires but my notebook pc does? Crazy.
I'd be happy with portable phone technology -- don't really
need cellular. Much cheaper too. Don't understand why I can't buy it...

> Doing HEAD, expires, etc. is suddenly going to get important. Might put
> some pressure on the URNs issue as well.

Yes.  Also, the Public: is important.  We must get the default understanding
completely clear.  At the moment in HTTP it seems as though Public: is
just informational, as in fact if anyone really wants to test access then
they can just try it.  With caching, the Public: allows the caching server
to return it directly.  If we specify the current assumption that
if nothing is specified then the document is public, this is
NOT fail-safe.  Would it be better to make that assumption ONLY if no
Authorization: header was sent?

I.e.:
    If Public: is present, it is definitive.
    Else if authorization was needed, then assume NO public access
         else if Allowed: is present, assume public access is same
              else assume public access is GET only.
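
For anyone writing a caching server, the rule above might look something
like this in Python.  It is a sketch under stated assumptions, not spec
text: public_methods is a hypothetical name, headers is assumed to be a
dict keyed by the exact header names "Public" and "Allowed", method lists
are assumed to be separated by commas or spaces, and "authorization was
needed" is approximated by whether the request carried an Authorization:
header.

```python
def public_methods(headers, authorization_sent):
    """Return the set of methods a cache may assume are open to the public."""

    def methods(value):
        # Assumption: method names separated by commas and/or whitespace.
        return {m.upper() for m in value.replace(",", " ").split()}

    if "Public" in headers:
        # Public: present, so it is definitive.
        return methods(headers["Public"])
    if authorization_sent:
        # Authorization was involved: assume NO public access.
        return set()
    if "Allowed" in headers:
        # Assume public access is the same as the allowed methods.
        return methods(headers["Allowed"])
    # Nothing specified: assume public access is GET only.
    return {"GET"}
```

A cache would then serve a stored copy to an unauthenticated client only
if "GET" is in the returned set.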

I'll put that in the spec -- if anyone has any troubles with this
say now.

> Ari mentioned you're working on a Next browser again. I would like to see
> that sometime. I'm going to be running Nextstep 3.2 on my Pro/GX as soon as
> the software arrives.

It is a "spare time" activity, and right now the thing crashes. A number
of people have expressed interest in helping, but not come up to
speed yet.  The state is embarrassing right now, but soon I will
be able to give source out but only to people who can commit to
a particular aspect of improvement.

> ka

Tim BL
