Re: Mail addresses as URLs

Date: Tue, 11 May 1993 13:57:13 -0400 (EDT)
From: John C Klensin <KLENSIN@infoods.mit.edu>
Subject: Re: Mail addresses as URLs
In-reply-to: <9305111604.AA03736@www3.cern.ch>
To: timbl@nxoc01.cern.ch
Cc: www-talk@nxoc01.cern.ch, uri@bunyip.com
Message-id: <737143033.452103.KLENSIN@INFOODS.UNU.EDU>
X-Envelope-To: timbl@NXOC01.CERN.CH, www-talk@NXOC01.CERN.CH
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII
Content-Transfer-Encoding: 7BIT
Mail-System-Version: <MultiNet-MM(330)+TOPSLIB(156)+PMDF(4.2)@INFOODS.UNU.EDU>
>well defined concept of a mailbox which
>may or may not contain documents and may or may not have
>restricted access. Perhaps "mailbox:" would be better than "mailto:".
>Moving a document into someone's mail box is just like moving a
>file into a directory: it is an operation with pre- and  
>postconditions.

Maybe.  I'm not sure how far the distinction is important, but you can't
externally specify delivery to my actual mailbox, nor can you determine
how many of those I have and how your message will be routed by filters
I impose on incoming mail.  All you get with a "mailbox name" is an
abstraction with which you can deliver something to my host's mail user
agent; after that, what happens to it is a local responsibility/issue
that is traditionally not subject to Internet protocols.

>"telephoneto" like "telnet" doesn't fit into that model, and
>so is less usefully integrated.
   Actually, if you are trying to deliver an audio entity, or have
text-to-speech capability, they are pretty similar in theory (the
practice is different because we tend to attach different tools to
mailboxes, but you can't predict this from the protocols).  Whether a
telephone number addresses an individual or a group can't be determined
in the general case.  Whether the message is recorded or discarded after
first reading/hearing is a local matter, as are a series of possible
forwarding and diversion operations.

>Is there not a subsyntax for the usermailbox@domain bit, with the
>comment personal Name removed?  

  First, just to clarify for those who aren't used to the syntax (or who
have seen it for years and not paid attention)...
     (this is a comment)   is a comment.  It is, from a protocol
standpoint, noise that can be discarded without loss of important
information.
     The personal name ("phrase" in RFC822-speak) isn't a comment, it is
part of the address.  Mail systems have some obligation to preserve it
(a rule often broken) and not map it into a comment or vice versa
(thereby destroying or inventing information--also often broken).  In
environments at the periphery of the network, where mailboxes cost
money, personal names are often used to differentiate among users of the
same mailbox and post-delivery dispatching arranged on that basis.
    But, yes, one could eliminate the personal-name part as not machine-
addressable.  But there are still a number of variant forms.
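
A minimal sketch of eliminating the personal-name part, assuming Python's
standard email.utils.parseaddr is an acceptable parser for the address forms
shown here.  The helper name addr_spec is hypothetical, and the sample
addresses are only illustrations.

```python
from email.utils import parseaddr

def addr_spec(address: str) -> str:
    """Return only the mailbox@domain part of an RFC 822 style address.

    Hypothetical helper: the personal name (the "phrase") is dropped,
    leaving the machine-addressable string.
    """
    _personal_name, spec = parseaddr(address)
    return spec

print(addr_spec("Joe <someuser@somehost>"))   # personal name removed
print(addr_spec("someuser@somehost"))         # already a bare mailbox
```

This handles only the simplest variant forms; it says nothing about routes,
groups, or the other syntax RFC 822 allows.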

>It would have the example of being
>a little closer to something which one can test for equality.
   Not much.   One of the intermittent interesting questions in email-
land is "how do I look down a list of addresses and eliminate
duplicates".  The general answer is "can't".  There are two problems,
both of which relate to the mailbox-abstraction issues discussed above.
The first is that there are religious debates about whether
   Joe <someuser@somehost>     and
   Mary <someuser@somehost>    are the same
The machine-processable mailbox strings are the same, but "Joe" and
"Mary" tend to think they are different.
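
To make the first problem concrete, here is a rough sketch of duplicate
elimination keyed only on the machine-processable mailbox string.  The
function name dedupe_by_addr_spec is hypothetical, and lowercasing the host
part is an assumption of the sketch (host names compare without regard to
case; local parts may not).

```python
from email.utils import parseaddr

def dedupe_by_addr_spec(addresses):
    """Keep the first address seen for each mailbox string (hypothetical helper)."""
    seen = set()
    kept = []
    for address in addresses:
        _personal_name, spec = parseaddr(address)
        local, _, domain = spec.rpartition("@")
        # Assumption: only the host part is folded to lower case.
        key = (local, domain.lower())
        if key not in seen:
            seen.add(key)
            kept.append(address)
    return kept

result = dedupe_by_addr_spec([
    "Joe <someuser@somehost>",
    "Mary <someuser@somehost>",
    "Joe <someuser@SOMEHOST>",
])
print(result)
```

Only the "Joe" entry survives, which is exactly the outcome "Mary" would
object to.  The comparison also sees nothing but the strings themselves, so
two different strings that reach the same person are kept as distinct.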

The second is that there is really nothing you can test for equality
anyway.  I've just done a quick count and probably forgotten a few, but
I've got at least eight addresses (different apparent mailboxes on
different apparent hosts) that cause mail to appear in the vicinity of
klensin@infoods.unu.edu, plus one that causes _most_ of the mail sent to
it to appear here, but which keeps the rest.  It takes far more
information than the mail system "owns" and will tell you to deduce the
relationships.

>Would it be possible to remove all RFC822 quoting and apply
>URL escaping as a reversible and well-defined transformation
>which would prevent the horrible results of layered escaping?
  At the risk of being cynical, it depends on the standards of quality
to which you can hold implementors of URL systems and, to the degree to
which humans have to see or type these things, those people.  If the
quality is comparable to what we have experienced with email, it is
hopeless--the quoting conventions in RFC822 are not implemented
correctly by a painful number of systems (and they've had a decade to
get it right).
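
For illustration only, a sketch of what the proposed transformation might
look like, assuming the local part has already had its RFC822 quoting removed
and that percent-encoding as done by Python's urllib.parse is the intended
"URL escaping".  The function names are hypothetical and this is not a
defined URL scheme.

```python
from urllib.parse import quote, unquote

def mailbox_to_url_form(local_part: str, domain: str) -> str:
    """Percent-encode the unquoted local part; leave the domain as given."""
    # safe="" means every reserved character, including "@", is escaped.
    return quote(local_part, safe="") + "@" + domain

def url_form_to_mailbox(escaped: str):
    """Reverse the transformation, returning (local_part, domain)."""
    # The escaped local part contains no literal "@", so the last "@"
    # is always the separator.
    local, _, domain = escaped.rpartition("@")
    return unquote(local), domain

escaped = mailbox_to_url_form("john smith", "example.com")
print(escaped)
print(url_form_to_mailbox(escaped))
```

The round trip is exact in this sketch, but that only shows the
transformation can be reversible, not that implementations will get it right.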

  --john