Re: some of the stuff on ftp.cs.toronto.edu:/pub/emv/ is in WWW format

jfg@bernd.cern.ch (Jean-Francois Groff)
Date: Tue, 12 Nov 91 16:36:46 -2300
From: jfg@bernd.cern.ch (Jean-Francois Groff)
Message-id: <9111131536.AA05532@bernd.cern.ch>
To: Edward Vielmetti <emv@ox.com>
Cc: www-interest@nxoc01.cern.ch
Subject: Re: some of the stuff on ftp.cs.toronto.edu:/pub/emv/ is in WWW format
References: <m0kgVyd-000Bt4C@cato.aa.ox.com>
>>>>> On Mon, 11 Nov 91 02:22:35 -0500, Edward Vielmetti <emv@ox.com> said:

Ed> I'm slowly but surely converting the files on ftp.cs.toronto.edu:/pub/emv
Ed> to be in the WWW format.  Right now the stuff in news-archives.README is
Ed> referred to that way, and some of the rest of the things in news-archives too.

I just tried to read your news-archives.README with the line-mode
browser through the traditional file: access. First-minute comments :

- Currently, any file retrieved through the file: access, local or
remote, is considered a plain text file unless its name ends with
`.html'. As a consequence, the anchors that you have inserted in
news-archives.README are not interpreted by the browser, so they
cannot be jumped to, except by cutting the reference and pasting it to
another www command line. Moreover, the text is just echoed in its
original format, which sadly happens to be double-spaced (CR-LF ?).
The easy fix is to append `.html' to the name of any file that
contains HTML tags, but I understand that it will bother people who
look at your files without www. The upcoming format negotiation could
help with this, especially in the case of a dedicated www server that
could pass and possibly negotiate the document type. For anonymous
ftp, the browser should run simple heuristics to try and guess the
type of the file from its name.extension. We'll think about it.
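
As a rough illustration of such a heuristic, here is a minimal Python
sketch. Only the `.html' rule reflects what the browser does today; the
table name EXTENSION_TYPES, the function guess_type and the other
extensions are assumptions made up for the example.

```python
import os

# Hypothetical extension table. Only `.html' corresponds to current
# browser behaviour; the other entries are illustrative guesses.
EXTENSION_TYPES = {
    ".html": "html",
    ".txt": "plain",
    ".ps": "postscript",
}


def guess_type(name):
    """Guess a document type from the extension of a file name.

    Anything unrecognised falls back to plain text, which matches the
    current treatment of files fetched through the file: access.
    """
    _, ext = os.path.splitext(name)
    return EXTENSION_TYPES.get(ext.lower(), "plain")
```

With this table, `news-archives.README' is still treated as plain
text, while `news-archives.README.html' is treated as HTML.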

Ed> This aftp: tag is new.  I'm not completely happy with the use of
Ed> the file: tag to refer to remote files, since it can lead to
Ed> situations where references are ambiguous depending on whether
Ed> you're dealing with a file on the local system or that same file
Ed> accessed via anonymous FTP on the local system.  Adding an aftp:
Ed> tag should help that.

- We agree that the current syntax can be ambiguous, but we want to
keep references to local and remote files in the same format, because
the very notion of a `remote' file should disappear with wide-area
hypertext (remember the new WAN cliche: the network IS the computer).
A less philosophical reason for that is to avoid referring to a
particular retrieval protocol : the reference to the file should be
the same regardless of whether it is retrieved through anonymous ftp
or through the Andrew file system, for instance. Of course, we would
like to introduce X.500 naming in the (more or less) long term.

Ed> It's useful (even necessary) to include the anonymous@ bit; there are some
Ed> sites (lib.stat.cmu.edu and research.att.com) with two parallel
Ed> "anonymous FTP" trees that have different user names to get to them;
Ed> a reference to
Ed>     <a href=aftp://netlib@research.att.com:/> </a>
Ed> is quite different than
Ed>     <a href=aftp://anonymous@research.att.com:/> </a>

So we want to keep `file:' for both local and remote files, but we must
take into account your other suggestion : allowing for a different
user name. I suggest the following :

        * allow an optional `user@' part before a host name.
        * if the user is not specified, make it the current user name
          if the host is the local machine, and `anonymous' otherwise.
          (this avoids the ambiguity that you mentioned)

Examples :
        file://ftp.cs.toronto.edu/pub/emv/news-archives.README.html
        file://netlib@research.att.com/
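
The defaulting rule above could be sketched in Python roughly as
follows. This is only an illustration of the proposal, not browser
code: the function name parse_file_reference and its local_hosts and
current_user parameters are hypothetical, and port handling is left
out.

```python
import getpass
import socket


def parse_file_reference(ref, local_hosts=None, current_user=None):
    """Split a `file://[user@]host/path' reference into (user, host, path).

    If no user is given, default to the current user for a local host
    and to `anonymous' for any other host.
    """
    prefix = "file://"
    if not ref.startswith(prefix):
        raise ValueError("not a file: reference: %r" % ref)
    rest = ref[len(prefix):]
    hostpart, slash, path = rest.partition("/")
    path = slash + path  # keep the leading `/' on the path
    user, at, host = hostpart.rpartition("@")
    if local_hosts is None:
        # Assumption: these names are what counts as "the local machine".
        local_hosts = {"localhost", socket.gethostname()}
    if not at:
        if host in local_hosts:
            user = current_user or getpass.getuser()
        else:
            user = "anonymous"
    return user, host, path
```

Applied to the two examples, the first yields the user `anonymous'
(the host is remote and no user is given) and the second yields
`netlib'.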

Ed> The format //user@host:/filename/ is quite similar to that used by
Ed> ange-ftp, so these references are immediately quite usable by
Ed> existing code.

- Currently, a colon after the host name is used to specify an alternate
TCP port number, but a good browser should ignore it if no number is
present. In this way, www can be compatible with ange-ftp syntax.
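
A tolerant host parser along these lines might look like the Python
sketch below. The name split_host_port is hypothetical, and rejecting a
non-numeric port is an assumption of the example; the text above only
covers the case where no number is present.

```python
def split_host_port(hostpart, default_port=None):
    """Split `host[:port]' into (host, port).

    A trailing colon with no number (ange-ftp style `host:') is
    ignored, so the default port is returned.
    """
    host, _, port = hostpart.partition(":")
    if port.isdigit():
        return host, int(port)
    if port:
        # Assumption: anything after the colon that is not a number
        # is treated as an error rather than silently dropped.
        raise ValueError("bad port in %r" % hostpart)
    return host, default_port
```

So `research.att.com:' and `research.att.com' are treated alike, while
`wais.domain.org:210' selects port 210.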

- Your examples make me think of another feature we should add for the
browsers to support them : the ability to display a directory as a
list of references, with maybe the README file (if any) prepended as
introductory text. Currently, on your reference to
        file://pit-manager.mit.edu/pub/usenet/
the browser would try to `get' the directory through ftp and fail. So
I'll add this to the wish-list for the `file:' access method :

        * if the address ends with a `/', try `ls' instead of `get'.
        * try to get an appropriate README file. Try these in order :
          README.html, *README*.html, README, *README*, *readme*
        * Display that file if found, then build a list of references
          for all the files contained in the directory.
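
The wish-list items above can be sketched in Python as follows. It
assumes the directory listing has already been obtained as a list of
names; wants_listing, pick_readme and directory_page are hypothetical
names, and picking the alphabetically first match when several files
fit a pattern is an assumption of the example.

```python
import fnmatch

# Patterns from the wish-list, in priority order.
README_PATTERNS = ["README.html", "*README*.html", "README",
                   "*README*", "*readme*"]


def wants_listing(address):
    """An address ending with `/' calls for `ls' instead of `get'."""
    return address.endswith("/")


def pick_readme(names):
    """Return the best README candidate among the names, or None."""
    for pattern in README_PATTERNS:
        # fnmatchcase keeps the match case-sensitive on every platform.
        matches = sorted(n for n in names if fnmatch.fnmatchcase(n, pattern))
        if matches:
            return matches[0]
    return None


def directory_page(address, names):
    """Build the lines of a directory page: README first, then references."""
    readme = pick_readme(names)
    lines = []
    if readme is not None:
        # Placeholder: a real browser would fetch and display the file.
        lines.append("(contents of %s would be displayed here)" % readme)
    for name in sorted(names):
        lines.append("<a href=%s%s> %s </a>" % (address, name, name))
    return readme, lines
```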

Note that if you supply both a README.html and a traditional README,
you won't have to apologize about `all those funky angle brackets' !

- From your news-archives.README :

  blah blah blah. Check out
  <a href=aftp://anonymous@pit-manager.mit.edu:/pub/usenet/> </a>
  for lots more information.

With the line-mode browser, this will look fine :

  blah blah blah. Check out [1] for lots more information.

But with any mouse-driven browser (NeXT, X-Windows, emacs, Mac), the
anchor should sit on a piece of text that will serve as a button. With
your current example, your reader would only see :

  blah blah blah. Check out for lots more information.

with possibly a tiny highlighted space between `out' and `for'. Some
human-readable description of what the anchor points to will do fine.
For instance :

  blah blah blah. Check out the
  <a href=file://pit-manager.mit.edu/pub/usenet/> MIT usenet archives
  </a> for lots more information.

would yield

  Check out the MIT usenet archives[1] for lots more information.

or a highlighted `MIT usenet archives' on a mouse-driven browser.
Before that in your README, it would be nice to have an anchor
associated with the `List of periodic informational postings' and with
the archive that you mention. Same for the `news.answers' group (the
`news:' access is implemented in the new architecture. Use this simple
syntax : `<a href=news:news.answers> news.answers </a>'.)
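
To make the line-mode rendering shown in the examples above concrete,
here is a simplified Python sketch that replaces each anchor by its
text followed by a numbered marker. It is not the browser's actual
code: the regular expression only handles unquoted href values as used
in this message, all whitespace is collapsed, and the names ANCHOR and
render_line_mode are made up for the example.

```python
import re

# Matches <a ... href=VALUE ...> text </a> with an unquoted href value.
ANCHOR = re.compile(r"<a\s+[^>]*href=([^\s>]+)[^>]*>(.*?)</a>",
                    re.IGNORECASE | re.DOTALL)


def render_line_mode(html):
    """Return (text, refs): anchors become `text[n]', refs lists targets."""
    refs = []

    def replace(match):
        refs.append(match.group(1))
        label = " ".join(match.group(2).split())
        return "%s[%d]" % (label, len(refs))

    text = ANCHOR.sub(replace, html)
    # Simplification: collapse every run of whitespace to one space.
    return " ".join(text.split()), refs


text, refs = render_line_mode(
    "blah blah blah. Check out the\n"
    "<a href=file://pit-manager.mit.edu/pub/usenet/> MIT usenet archives\n"
    "</a> for lots more information.")
print(text)
```

An anchor with no text still gets its `[1]' marker here, which is why
the empty anchor looks fine in line mode but leaves nothing to click on
in a mouse-driven browser.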

- As an aside, the `name=' part of the anchor tag is not necessary in
your context : it is needed only if someone wants to make a link TO
that particular anchor rather than to the whole document.

Ed> I'm also using
Ed>     <a href=wais://wais.domain.org:210/database?>
Ed> in anticipation of that tag being supported, it should be a matter of
Ed> a simple sed or perl script to convert those tags to their current
Ed> preferred format.

- Agreed. OK for the `wais:' access.
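
The same kind of scripted conversion could also bring existing `aftp:'
references in line with the `file:' syntax suggested earlier. The
Python sketch below is one possible way to do it, under assumptions:
the name aftp_to_file is hypothetical, the `anonymous@' part is dropped
because it would be the default for a remote host, and the colon after
the host name is removed.

```python
import re

# aftp://[user@]host[:][/path], with unquoted values as in this message.
AFTP = re.compile(r"aftp://(?:([^@/\s>]+)@)?([^:/\s>]+):?(/[^\s>]*)?")


def aftp_to_file(text):
    """Rewrite every aftp: reference in text as a file: reference."""

    def replace(match):
        user, host = match.group(1), match.group(2)
        path = match.group(3) or "/"
        # Assumption: `anonymous' is implied for remote hosts, so omit it.
        prefix = "" if user in (None, "anonymous") else user + "@"
        return "file://%s%s%s" % (prefix, host, path)

    return AFTP.sub(replace, text)
```

For instance, `aftp://netlib@research.att.com:/' would become
`file://netlib@research.att.com/'.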

Thank you for all your suggestions. Please continue to provide
feedback as you write more html. We're looking forward to reading your
data seamlessly and to paving the way for other ftp site managers.

--- Jean-Francois