Re: keyword tag, REL, REV, TYPE, INDEX....

Craig Hubley (craig@passport.ca)
Thu, 4 May 95 00:23:02 EDT

> I want to agree with Ian Graham, <igraham@utirc.utoronto.ca>,

Me too. I sure hope I actually do. :-)

I'd like to raise the issue of putting instructions to browsers, instructions
to add-on software, and communications about the purpose and implications of
links (from authors to readers) in the same namespace. This seems scary. If
we don't have a fully robust REL=URI solution waiting, maybe we need separate
attributes. This also solves the problem of "next" in English vs. an icon for
"next" in Japanese. What worries me is that this may prevent a URI solution
later. I believe in the long run that links must become first class types
but that we need to experiment with usage for a while before defining these.

> Allowing complex relationships sounds sensible. Requiring that an
> implementation do something with them does not.

Even requiring an implementation to do something with simple navigation
instructions (TOC, INDEX, BACK, FORTH) is questionable... I have less of
a problem with BACK (which has a clear chronological definition) and TOC
(which assumes that an HTML document is part of an author-defined corpus
of same) than I do with INDEX or FORTH/NEXT. Legions of object-oriented
iterator theorists have found hundreds of different definitions of NEXT.

> I've been imagining that we say that an attribute value starting with
> an X is user-defined, for example, as for mail headers.
>
> Then you could do
> REL = "Xsequel: join(Glossary,Author) where docid = $URL.id"
> in your own documents if you wanted, and even have s/w that supported it.

As I discussed with Lee in person today, I believe that there should be some
clear way to distinguish relationships that are hardwired in browsers from
those that may optionally be interpreted by unknown user/author defined
software, from those that are intended only to tell the user WHY the link is
there (which the URL of the destination does NOT do very well on its own).

Someone suggested that we support links which are simply comments, but I
think that we need a separate link type and link type name. Then we have
support for generic browser programming (especially for navigation, e.g.
toc, next, index), application-specific browser programming, and end user
understanding of the (application-specific) reason why one might follow a
link. The current anchor name gives a clue to this but it is not standard
across links of the same type, so it is of little use for generating maps,
understanding the extent of documents, caching ahead to match user browsing
habits, or helping communities of practice to develop their own rhetoric (see
below).

I think that the 'hardwired' keywords, actually a means of programming the
browser from within the page, ought to be minimal or zero (I am not sure that
we can convince browser authors to hardwire them immediately), and that they
should be clearly delineated by an underscore prefix (e.g. "_next") so that
an author is aware that they are invoking a keyword. If this is what REL is
to become then we need a separate attribute to tell the user what is going on,
i.e. why the link is there.
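
A minimal sketch of how software might tell these kinds of REL value apart,
assuming the underscore prefix proposed here for hardwired keywords and the
"X" prefix Lee suggested for user-defined values. The function name and the
category labels are hypothetical, chosen only for illustration.

```python
def classify_rel(rel):
    """Sort a REL value into one of the categories discussed above.

    Hypothetical helper: the category names are illustrative only.
    """
    if not rel:
        return "none"          # no REL given at all
    if rel.startswith("_"):
        return "hardwired"     # e.g. "_next": a keyword the browser acts on
    if rel.startswith("X"):
        return "user-defined"  # e.g. "X-test-weird-new-link-type"
    return "unrecognized"      # anything else is left for the software to ignore


print(classify_rel("_next"))                       # hardwired
print(classify_rel("X-test-weird-new-link-type"))  # user-defined
```

An author who types "_next" has visibly asked for browser behavior, while
anything without the prefix can never trigger it by accident.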

Earlier we discussed combining these into REL="_SCO/next", etc. which is like
saying that link types should ultimately become a URI. A somewhat lower
overhead solution is to let REL="_next" while WHY="next best match", or let
REL="_next" while WHY="image:kanji_for_next". With this new attribute, the
REL is only an instruction to the browser, the WHY is a textual annotation.
The difference between WHY and anchor text is that WHYs are standardized for a
specific application need. Later, when the usage in different communities of
practice and different applications is clear, we can standardize these too
and then we might ultimately get back to REL="link://Search/NextBestMatch", or
some other scheme that combines human- and machine-readable information. That
is not much of a problem. Lee's suggestion of REL="X-test-weird-new-link-type"
would also be acceptable. I think by HTML4 the link has to be a first class
URI. In the meantime I think WHY might bridge the gap:

WHY can be in the native language of the reader, or it might be an icon, or
even someday it might include a reference to some other document, while REL
can invoke a fixed set of English keywords or provide parameters to software
that can be automatically invoked on the document (as opposed to software
that the user might invoke deliberately in order to generate a web map from
the WHYs). If WHY is not defined it would be defaulted to REL (killing the
required underscore) or to null, so if REL="_next" then by default WHY="next",
if REL=(none) then WHY=(none).
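
The defaulting rule above can be sketched as follows. This is only an
illustration: the function name is hypothetical, and None stands in for an
attribute that the author did not write.

```python
def effective_why(rel=None, why=None):
    """Return the WHY a browser would show, applying the default described above."""
    if why is not None:
        return why              # an explicit WHY always wins
    if rel is None:
        return None             # REL=(none) gives WHY=(none)
    # Default WHY to REL, dropping the leading underscore if there is one.
    return rel[1:] if rel.startswith("_") else rel


print(effective_why(rel="_next"))                          # next
print(effective_why(rel="_next", why="next best match"))   # next best match
print(effective_why())                                     # None
```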

However the author could also choose to set WHY="legal/precedent" to help out
any user who had set his browser to "See WHY". It's self-explanatory. The WHY
can provide a place for different sets of meaningful link types to develop
without confusing the issue of instructions to browsers or other software.
(e.g. legal: WHY="precedent" ="jurisdiction" ="narrows" ="appeals" ="interprets"
academic: WHY="cites" ="refutes" ="was reviewed by" ="was first published in"
netnews: REL=_back WHY="in reply to", REL=_toc WHY="back to newsgroup"
bookdtd: REL=_back WHY="last page", REL=_next WHY="next page", REL=_toc etc.)

Note also that in the vast majority of these cases, REL can be left blank.
I'd expect to see WHY become much more popular than REL, because it demands
far less of the browser and author (neither has to understand any specific
semantics of interpretation). Browsers need only add one preference item
and use it to display the WHY attribute as part of the anchor text (or not).
They may choose to display it also as link information when a user hovers,
where they presently display only the URL.

So, this HTML fragment:

<A HREF="http://..." WHY="legal/precedent">

with WHY OFF (as now) appears as:
...In *Jones vs. Jones* the Court had assumed...
with WHY ON (default?) appears as:
...In *(legal/precedent) Jones vs. Jones* the Court had assumed...

In either case, if the user hovers over the link they see "to legal/precedent:
[the destination URL]". The WHY is always reported just as written.
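
A rough sketch of that display rule, assuming a single "See WHY" preference
and using asterisks to stand for the highlighted anchor text, as in the
example above. Both function names are hypothetical.

```python
def render_anchor(text, why=None, show_why=False):
    """Return the anchor as displayed, with the WHY prepended when it is ON."""
    if show_why and why:
        return "*(%s) %s*" % (why, text)
    return "*%s*" % text


def hover_text(url, why=None):
    """Return the text shown when the user hovers over the link."""
    return "to %s: %s" % (why, url) if why else url


print(render_anchor("Jones vs. Jones", "legal/precedent"))
# *Jones vs. Jones*
print(render_anchor("Jones vs. Jones", "legal/precedent", show_why=True))
# *(legal/precedent) Jones vs. Jones*
print(hover_text("http://...", "legal/precedent"))
# to legal/precedent: http://...
```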

If REL took a URI argument, and if (in the above) REL="link://legal/precedent"
then a "legal/" browser that has special functionality implemented for when
REL="link://legal/..." could invoke it. In this case we might obsolete WHY
or at least give it stronger defaults. But I don't think we're ready for this.
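
To make the idea concrete, here is a sketch of how a "legal/" browser might
pick out REL values it has special functionality for. The "link://" form is
only the scheme floated in this message, not an existing standard, and the
function names and handler table are hypothetical.

```python
LINK_SCHEME = "link://"


def split_link_uri(rel):
    """Return (community, link_type) for 'link://community/type', else None."""
    if not rel.startswith(LINK_SCHEME):
        return None
    community, _, link_type = rel[len(LINK_SCHEME):].partition("/")
    return community, link_type


def dispatch(rel, handlers):
    """Invoke the handler registered for the REL's community, if there is one."""
    parsed = split_link_uri(rel)
    if parsed is None:
        return None             # not a link URI: nothing special to do
    community, link_type = parsed
    handler = handlers.get(community)
    return handler(link_type) if handler else None


# A browser that only knows about the "legal" community (illustrative).
handlers = {"legal": lambda link_type: "legal handler: " + link_type}
print(dispatch("link://legal/precedent", handlers))  # legal handler: precedent
print(dispatch("_next", handlers))                   # None
```

A browser with no handler for a community simply does nothing special, which
fits the point made earlier that an implementation should not be required to
act on complex relationships.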

So, REL vs. WHY? The former is a keyword intended for a program, the
latter is an instruction or navigational aid intended by an author for a
user. Anyone who thinks this is the same thing as the anchor text, speak
up now. I believe we need another attribute, if links are to have hardwired
semantics. Somewhere, somehow, the author needs to tell the user what *kind*
of thing will happen when s/he clicks it, and why that was supposed to happen.

> and maybe REL or REV could be used to specify a relationship implemented by
> a Java script that applied to multiple pages.

Ye gods. Absolutely only if REL=[a link URI representing some instructions].

> > posted by Steven Fought. The problem of course is -- which object
> > model (foundation class....) do you use?

The CORBA object model, as implemented in DEC LinkWorks, is the most robust
for general purpose application objects. Microsoft OLE 2 (shudder) is the
most 'popular' (if you consider software shipped to people with no choice
to be popular).

You suggest that browsers would ultimately exchange objects with desktop apps.
Once again, that would require expansion of URIs to refer to local applications.
CCI is the right place to add this; right now it requires a port number...!

> This is not to mock or criticise, but to say, just try one and see what
> happens. I don't think anyone can give an answer to which model to use yet.

I'll stick by my previous assertion that, in hypertext systems, abstract link
models have been a failure and simple keywords have been a success. There are
very clear precedents for specific keywords, but very few if any are totally
generic (in fact only those specific to browser navigation seem to be so).

Most (like "precedent", "in reply to", "cites", "is a heresy according to")
only make sense in context, but will *always* make sense to a reader who
knows that context.

> > 4. INDEXING features. I've always liked the idea of having document
> > keywords, and the META tag seems appropriate here. This does not

The WHY is sort of a META. It might be replaced by a META attribute on
links. If REL became more robust then this would remain useful.

> I think other people are working on this problem from a different angle,
> and we should perhaps wait to hear from them, the Digital Library work and
> the URC stuff.

I agree. Also, I believe that the way Netscape and Mosaic turn NetNews
postings into HTML, and the way Hypermail turns mail into HTML, and the
ISO Book DTD definitions of 'footnote' etc., provide adequate precedent
for a long list of recommended standard link types. More than enough
if we are also going to standardize navigation (toc, next, back, index)
so the browser can put it in a common toolbar or whatever.
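
As a sketch of how a browser might gather such navigation links for a common
toolbar, the following uses Python's standard html.parser module. It assumes
the underscore-prefixed keywords proposed earlier in this message; the class
name, function name and file names are hypothetical.

```python
from html.parser import HTMLParser

NAV_KEYWORDS = ("_toc", "_next", "_back", "_index")


class NavCollector(HTMLParser):
    """Collect the first HREF seen for each navigation keyword."""

    def __init__(self):
        super().__init__()
        self.toolbar = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        attributes = dict(attrs)
        rel = attributes.get("rel")
        href = attributes.get("href")
        if rel in NAV_KEYWORDS and href is not None:
            self.toolbar.setdefault(rel, href)


def collect_toolbar(html):
    collector = NavCollector()
    collector.feed(html)
    collector.close()
    return collector.toolbar


page = ('<A HREF="contents.html" REL="_toc">Contents</A> '
        '<A HREF="page3.html" REL=_next>More</A> '
        '<A HREF="other.html" WHY="cites">Smith</A>')
print(collect_toolbar(page))
# {'_toc': 'contents.html', '_next': 'page3.html'}
```

Links that carry only a WHY are ignored here, since they tell the reader
something rather than instruct the browser.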

I am starting to like this, as it means that authors can produce HTML
with standard navigation, without ever having to learn cgi-bin scripts.

Gradually, as other forms of hypertext rhetoric become 'standard', we
can add them to the supported link type list... so that there will be
even less need to write cgi-bin scripts... and ultimately define such
a robust means of retrieving link-type-specific behavior on-the-fly
(rather as DTDs or stylesheets will be retrieved) that only the most
frightening applications will require custom cgi-bin scripting.

-- 
Craig Hubley                Business that runs on knowledge
Craig Hubley & Associates   needs software that runs on the net
mailto:craig@hubley.com     416-778-6136    416-778-1965 FAX
Seventy Eaton Avenue, Toronto, Ontario, Canada M4J 2Z5