Re: Is this use of BASE kosher?

Daniel W. Connolly (connolly@beach.w3.org)
Thu, 3 Aug 95 09:23:05 EDT

In message <95Aug3.002811pdt.2762@golden.parc.xerox.com>, Larry Masinter writes:
>I don't think we can treat HREF="#foo" as an optional optimization.

It's not an "optional optimization." It's a degenerate case of the
general mechanism. It's not an exception in any way.

Suppose the base URI is "http://foo/a/b/c.html" and I see a reference
to "../b/c.html". Do you expect the browser to fetch a new copy, or
use the one it's already got? The spec doesn't say. I suppose
you're arguing that it should.
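
To make that example concrete, here is a minimal sketch using Python's
urllib.parse.urljoin. That is an assumption of this illustration: urljoin
follows the later RFC 3986 resolution rules, not anything in the HTML 2.0
spec. It shows only that the two references name the same URI; whether
the browser refetches is still up to the browser.

```python
from urllib.parse import urljoin

# Base URI and relative reference taken from the example above.
base = "http://foo/a/b/c.html"
reference = "../b/c.html"

# Resolve the relative reference against the base URI.
resolved = urljoin(base, reference)

# The resolved URI is character-for-character the base URI itself.
same_uri = (resolved == base)

print(resolved)
print(same_uri)
```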

>Consider a URL whose content updates continuously, returning
>completely different HTML text each time you retrieve it. Within a
>document with such a URL as base, HREF="#place" should still refer to
>the _current_ instance, even if following a link to HREF=".#place"
>might retrieve a new instance.

I'm not sure what you mean by "refer to the _current_ instance." An
<a> element refers to an anchor (by its address or URI), not to any
particular entity (or representation or body -- octet sequence is my
meaning here).

Hmmm... the spec is a little goofy on this. It currently says:

|As a degenerate case, a URI of the form `#fragment' refers to an
|anchor in the same document.

It should say "... in the same resource."

A link is a relationship between two anchors. An anchor is identified
by its address. So if the base URI is http://foo/, and the markup is
<a href="#name">, then the head URI of the link is
http://foo/#name. Done.
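
As a rough sketch of that resolution step, again assuming Python's
urllib.parse (RFC 3986 rules) as a stand-in for whatever a user agent
actually does: head_uri is a hypothetical helper name, not something
from the spec. With this particular base, "#name" and ".#name" come out
as the same URI, so any difference in fetching behaviour is not visible
in the address itself.

```python
from urllib.parse import urljoin, urldefrag

def head_uri(base_uri, href):
    """Hypothetical helper: resolve an HREF value against the base URI."""
    return urljoin(base_uri, href)

base = "http://foo/"

# The degenerate case: a fragment-only reference.
head = head_uri(base, "#name")

# Split the result into the resource address and the fragment identifier.
resource, fragment = urldefrag(head)

# A reference with an explicit "." path resolves to the same URI here.
head_dot = head_uri(base, ".#name")

print(head)
print(resource, fragment)
print(head_dot == head)
```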

I think what folks are uncomfortable with is that the HTML 2.0 spec
doesn't say exactly how a user agent resolves a URI into a
representation of the resource, i.e. the spec doesn't govern history
mechanisms, caches, and all that.

But I think that's merely a reflection of reality: not all browsers
handle caching/history the same way, and information providers must
not count on any particular behaviour in this arena.

I suppose the spec could be more explicit on this point, by way of
a statement such as "how a URI is resolved to an entity representing
the resource is unspecified." Of course another option is to hammer
out some standard semantics in this area. But I think that's the
subject of a future document.

Reading over the "Hyperlinks" section of the HTML 2.0 spec[1], the
only thing that makes me uncomfortable is that the term "base URI" is
not properly introduced. There is some verbiage that makes it seem
that the base URI is a property of the document, when really it's a
property of the state of the user agent:

http://www.w3.org/hypertext/WWW/MarkUp/html-spec/html-spec_7.html#SEC65
|Accessing Resources
|
|To access the head anchor of a hyperlink, the user agent determines
|its URI from the URI given in the tail anchor, using the base URI of
|the document containing the tail anchor if necessary.

RFC1808 is similarly casual about introducing the term "base URI."
It also uses the term "base document," just to confuse things!

OK... it looks like I need to revise the hyperlinks section slightly.

[I think this is a consequence of turning the HTML language spec into
an HTML user agent spec in kind of a hurry. Oh well...]

I need a new draft for the IESG anyway, to address the NAME->NAMES
fix and Paul Burchard's forms fix. I'll try to have it out in the
next two or three working days.

Dan