Re: The spec evolves...

Dan Connolly <connolly@pixel.convex.com>
Message-id: <9212070617.AA12781@pixel.convex.com>
To: marca@ncsa.uiuc.edu (Marc Andreessen)
Cc: Guido.van.Rossum@cwi.nl, www-talk@nxoc01.cern.ch
Subject: Re: The spec evolves... 
In-reply-to: Your message of "Sun, 06 Dec 92 22:41:10 PST."
             <9212070641.AA05810@wintermute.ncsa.uiuc.edu> 
Date: Mon, 07 Dec 92 00:17:37 CST
From: Dan Connolly <connolly@pixel.convex.com>

Look out folks -- we're getting into religious issues here.
I think Marc's made a lot of good points, but be warned:
I've spent a lot of time thinking about this stuff, and
I might state my opinions a little forcefully. :-)

>Dan Connolly writes:
>> Very true. I think the A tag is _highly_ overloaded. One click on an
>> anchor might take you anywhere from the next sentence to somewhere
>> in New Zealand. 
>
>This is part of the beauty of HTML and the Web, and should not be
>abandoned lightly -- complete user-oriented transparency lifts the
>concept of information up from its physical grounding
>(network/machine/directory/file) and removes the need to think of it
>as anything *but* information.  Who cares where it comes from, so long
>as it's there?

Good point. I didn't mean that we should make the physical distance
to the link destination known to the user. But I think users would
benefit from knowledge about the logical distance -- i.e. is
it part of the same node, part of the same document, or in some
other work completely? Is it more specific or more general
than this node?

[By the way Guido: if the information is used by the server to locate
the information, rather than by the client to label the reference,
you should put it in the HREF somewhere.]

>> Meanwhile, I think it's time to redesign HTML. 
>
>I emphatically disagree.  With all due respect (and a lot is due) to
>your efforts with formalizing HTML, it's high time to shoot the
>engineers and stabilize the product.

My communication skills are really failing me lately. This is exactly
what I meant to say: I'm happy with the HTML DTD: it describes
the way HTML is used, fairly completely and exactly. But HTML
leaves a lot to be desired that cannot be fixed in an upwards
compatible way.

>  Widespread success of the
>current implementation will be the single best reason for further
>redesign, which can take place well down the road in the form of HTML
>version 2, after lots of real-life experiences with the current system
>can be catalogued and analyzed -- something currently lacking.  In the
>meantime, HTML and the Web need to work on becoming entrenched and
>widely and generally used, or God help us, we're all gonna be using
>Gopher five years from now.

I see nothing wrong with gopher. It's just NFS without the kernel
hacks, and with fulltext searching wedged in. Gopher+ is a
mess. No two ways about it.

But HTTP is nearly identical to Gopher. In some ways, gopher is
cleaner than W3: the gopher "path" is opaque. Clients never
parse it (except some weird clients that use the file extension).
An HTTP client parses the path, so there's a syntax imposed
on it -- have you looked at the massive hacks in the W3
browser to support VMS paths?

I think we need to seriously rethink relative addresses.

And a Gopher reference (the information the client has _before_
it traverses a link) includes the type of the information.
A W3 reference does not, and so the client must assume HTML.
(unless it's an FTP address, in which case it sneaks a peek
at the file extension. Yuk! Or unless it's a Gopher address,
where the data format interpretation is hacked into the
routine that opens the connection. Yuk!)
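
A rough sketch of that difference, in Python. The sample menu
line, the host names, and the helper names are made up for
illustration; only the tab-separated menu layout and the type
characters are Gopher's own. It is not taken from any real client.

```python
# Illustrative sketch only: sample data and helper names are assumptions.
GOPHER_TYPES = {
    "0": "text file",
    "1": "menu",
    "7": "search",
    "9": "binary",
    "g": "GIF image",
}

def parse_gopher_item(line):
    """Split one Gopher menu line into fields; the selector stays opaque."""
    kind, rest = line[0], line[1:]
    display, selector, host, port = rest.rstrip("\r\n").split("\t")[:4]
    return {
        "type": kind,
        "display": display,
        "selector": selector,  # never parsed, only sent back to the server
        "host": host,
        "port": int(port),
    }

def guess_from_extension(address):
    """What a client without type information is reduced to doing."""
    ext = address.rsplit(".", 1)[-1].lower()
    return {"gif": "GIF image", "txt": "text file"}.get(ext, "assume HTML")

# The Gopher client knows the type before it traverses the link.
item = parse_gopher_item("gA picture\t/pics/stuff.gif\tgopher.host\t70\r\n")
known_type = GOPHER_TYPES[item["type"]]

# The W3 client has only the address to go on.
guessed_type = guess_from_extension("wais://wais.host/stuff.gif")
fallback_type = guess_from_extension("http://info.host/doc")
```

The Gopher side never looks inside the selector; the type comes
from the reference itself. The other side has to guess, and falls
back to HTML when the guess fails.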

This is the problem: suppose I put a pointer to
<A HREF="wais://wais.host/stuff.gif">a GIF image </A>
in an HTML document and serve it up. Clients have
no way of knowing any better than to grab the data
and barf it on the screen.

And adding <A Content-Type="image/gif" HREF="wais:...>
won't help: the content-type will be ignored by most
browsers.

Hmm... perhaps there's a way out after all.
I could, on the other hand, put

<See HREF="wais://wais.host/stuff.gif" Content-Type="image/gif">
a new kind of link</See>

in the document, and only browsers that know about SEE
elements would even attempt to get the data. And they'd
know better than to treat it as text.
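
Here is a minimal sketch of how a browser that knows about SEE
might behave, using the HTMLParser class from Python's standard
library. SEE is only the element proposed above, and LinkCollector
and choose_viewer are hypothetical names; no existing browser
works this way.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect (address, content type) pairs from A and SEE elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "see":
            # The proposed element carries its own content type.
            self.links.append((attrs.get("href"), attrs.get("content-type")))
        elif tag == "a" and "href" in attrs:
            # Old-style anchor: nothing to go on, so assume HTML.
            self.links.append((attrs["href"], "text/html"))

def choose_viewer(content_type):
    """Pick a presentation from the major type (hypothetical policy)."""
    if content_type is None:
        return "unknown"
    major = content_type.split("/")[0]
    return {"text": "text window", "image": "image viewer"}.get(major, "save to disk")

doc = ('<See HREF="wais://wais.host/stuff.gif" Content-Type="image/gif">\n'
       'a new kind of link</See>')

collector = LinkCollector()
collector.feed(doc)
collector.close()

address, content_type = collector.links[0]
viewer = choose_viewer(content_type)
```

A browser that has never heard of SEE has no branch for it and
never fetches the data, which is the whole point of using a new
element instead of a new attribute on A.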

So perhaps it's not the HTML data format that's doomed,
but the <A> element. I guess the lesson is: you can't
teach an old element new tricks.


About python...

>These object-oriented toolkits and interpreters and interface builders
>and so on are all wonderful, but keep in mind that

> (1) sustained use
>of interpreters impacts performance; 

Counterpoint: when the design is complete, performance-critical code
can always be written in C and added as a module. In the meantime, the
benefits of rapid-prototyping outweigh the performance penalties.

>(2) sustained use of any of them
>impacts long-term viability of systems based on them, particularly
>when it comes time to start embedding HTML browsing in other tools;
>and 

I'm not sure I understand what you mean here.
I don't mean to base the W3 architecture on Python -- only some
implementations.

>(3) look at the proliferation of different systems already in use
>and removing all hope of abstracting more than a very small amount of
>common code (Viola, tk/tcl, Midas, VUIT, NeXT interface builder,
>etc.).

Viola and tk/tcl: These try to do what's already been done in
the Xaw and Motif toolkits, and they don't do it as well. (I suppose
this is your point...)

Midas: This is a specially designed language highly suited to
its purpose. Only the highest level of code in the Midaswww
browser is written in Midas. All the rest is reusable. Tony
did a heck of a job.

VUIT: how did this get in there?

NeXT: I'd drop X/Xt/Xaw for NextStep in a second if it
was an option. NextStep isn't free, so it hasn't proliferated
like X. That's pretty much the end of the story. If I could
limit my user base to NeXT boxes, I'd like to!

>  Doesn't it make more sense to just use portable C (or,
>possibly, C++) and allow others to benefit from and build upon your
>labors without forcing yet another toolkit/language/interpreter on
>them, and more often than not forcing them to reinvent the wheel?

The key is reusable software. You've hit that on the head.
I think python can be a good platform, but I'm having
trouble supporting my point. Maybe I need to think some more...

Certainly I expect core algorithms to be coded in portable
C. That was the purpose of libHTML. But I hate dealing
with dynamic memory allocation. When you're building big
applications in C, you spend all your time getting this
right.

I think the right combination of C and an object-oriented
high level language is the way to go. Folks _love_ Tk
and tcl. But tcl lacks an object system, recursive list
data structures, symbols, and other essentials.

I like languages that are tailored to special applications.
In my mind, the less code you have to use to solve a problem,
the better.

Dan