comments on HTML+ discussion document

marca@ncsa.uiuc.edu (Marc Andreessen)
Date: Mon, 1 Nov 93 22:15:36 -0800
From: marca@ncsa.uiuc.edu (Marc Andreessen)
Message-id: <9311020615.AA02673@wintermute.ncsa.uiuc.edu>
To: Jim Davis <davis@dri.cornell.edu>
Cc: www-talk@nxoc01.cern.ch
Subject: comments on HTML+ discussion document
In-reply-to: <199311012035.AA02830@willow.tc.cornell.edu>
References: <199311012035.AA02830@willow.tc.cornell.edu>
Jim Davis writes:
> My main comment is that HTML+ is too big.  It is beginning to
> resemble PL/I (am I the only one old enough to remember this
> language) or Common Lisp.  Do we really need all these features, and
> more important, will the browser implementors actually put them all
> in?  The thing about WWW is that it's very very easy to make a
> simple server but it's getting to be harder and harder to make a full
> client.

Funny that Jim makes these comments today -- I was going to send out a
note tonight to make much the same point.  We've had some time to mull
over HTML+ now, and to look at the progress of WWW software
development, and it is becoming apparent that HTML+ -- even after
months of revisions -- does noticeably suffer from "second system
syndrome": it just tries to do too much.  I think we need to prune
some things out to make it really manageable.

Justifications:

If we do this, and the things we prune turn out to be very necessary
in the long term, then they will be back in HTML++.  If it turns out
they're not -- i.e., if lack of tables does not turn out to actually
be a showstopper for anyone who wants to use HTML+ as a delivery
format -- then this will be apparent also and save us from having them
weigh down this markup language.

If we do *not* prune some of these things, we are indeed making it
prohibitive to build browsers.  It may very well be the end of next
year before Mosaic across platforms fully implements HTML+ as it
stands, and we've got a comparatively huge development effort.  Is
that level of complexity the foundation we want to build WWW on in the
long term?

More and more I think the most important thing for us to do at this
point is to enable as many *applications* as possible.  Fill-out forms
clearly do this, for example.  Conversely, I think we have to let the
finer points of document formatting take a back seat -- the problem
domain is much bigger, and we do have to face the fact that anyone who
is going to use WWW is going to do so for its special capabilities
(distributed hypermedia, fill-out forms, etc.) and not its document
formatting capabilities (for which PostScript, PDF, TeX, etc. are
always going to be better suited, no matter how hard we work in the
context of an effort like WWW).

Like Jim, I don't want to sound negative -- I do think HTML+ is a big
step forward -- but I think we need to tackle these issues now in
order to make sure it's successful and relevant.

> 1) Re 5.2 Hypertext links.  Why not drop TYPE, SIZE, and METHODS?  As
> you say, there's no guarantee they will be accurate.  We're just
> asking for trouble by putting them in the language.  Yes, it would be
> nice if the browser could tell me ahead of time what is at the other
> end of the link but given a choice between inaccurate info and no info
> I'll prefer no info.  If we put these in the language, there will be
> times when it is wrong and it screws someone; What's more, eventually
> some clever person will demand that we build some mechanism for
> guaranteeing that they stay up to date.

Agreed.  I'm also ready to say let's scrap TITLE and PRINT -- I don't
think they add enough real functionality to a generic browser to
justify their existence.

> 2) 5.11 Conditional Text.  I appreciate the problem, having
> run into it myself, but this is not the right answer.  The general
> problem is that different rendering is required for a printed (dead)
> document and for an online (live) one.  HTML+ seems to have several
> different solutions for the problem.  There's the conditional text
> and also the PRINT attribute.  I would prefer to keep *both* out of
> the language until a clean proposal comes in that unifies all.

Agreed.

> 3) Re 56.4 notes and admonishments
> 
> I agree with Bert Bos, the ROLE in NOTE should be a type, and not
> printed.

Yup -- it's easy to put the "<b>NOTE:</b>" in by hand; it makes little
sense to have it in the language.

> 4) Re 8.1 Active areas:
> 
> you have the origin in the upper left corner (good) but the last draft
> of HTTP I saw has the origin (for SPACEJUMP) in the lower left.  We
> are asking for trouble here.

Yup -- let's standardize all of these methods on the convention we
already use (ISMAP -- upper left).
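
To make the origin mismatch concrete, here is a minimal sketch of what
a client or server would have to do if the two conventions coexisted.
The function names (flip_origin, ismap_query) are hypothetical, and the
sketch assumes integer pixel coordinates and a known image height;
neither spec says any of this.

```python
def flip_origin(x, y, height):
    """Convert a pixel coordinate between upper-left and lower-left origins.

    The same formula works in both directions. `height` is the image
    height in pixels (an assumed input, not something either spec provides).
    """
    return x, (height - 1) - y


def ismap_query(x, y):
    """Format a click position in the "?x,y" form already used for ISMAP."""
    return "?%d,%d" % (x, y)


# A click on the top row of a 100-pixel-tall image, upper-left origin.
click = (10, 0)
flipped = flip_origin(click[0], click[1], 100)

print(ismap_query(*click))    # upper-left convention
print(ismap_query(*flipped))  # same click, lower-left convention
```

The same click yields two different query strings, and the receiving
end cannot tell which convention was used unless it also knows the
image height and which side did the flipping.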

> 5) 13 Indexing
> 
> I don't understand why this is even in the HTML spec.  I can certainly
> appreciate that document authors might want to define index hits when
> writing documents, but this index info has no semantics to the
> browser, so why should it be in there?

Agreed.

> 6) 14.2 HEAD and BODY.
> 
> Are these mandatory?  or optional?  

I've never been clear on the point of those -- there doesn't seem to
be any point in time where they'd be useful.  I must be missing
something...

More comments from reading over the spec today:

intro page: Use of "light weight" seems incorrect :-).

            Should standardize throughout on the term "hypermedia",
            and not use "hypertext".

page 1: Should be "Uniform Resource Locator", "Uniform Resource Name".

        Strike "for the nearest available copy" -- not relevant, and
        not the only (or main) purpose of URNs.

        "The latter being designed" should be "The latter was
        designed".

        Strike "It is hoped that HTML+ will be useful for information
        exchange via email and network news as well as HTTP" -- there
        is already no reason HTML should be considered HTTP-specific.

page 2: Kill "freely accessible" -- not relevant.

        "World Web" should be "World Wide Web".

        "HTML+ documents consists of" should be "HTML+ documents
        consist of".

page 3: This whole "HTML+ provides a means for authors to specify such
        paths either explicitly via declarations at the beginning of
        the node or implicitly according to the context in which a
        given node is reached.  Another possibility is for servers to
        send such information independently, e.g. as MIME message
        headers" is overkill.  Why should there be two distinct
        methods to accomplish the same thing?  In any case, the latter
        sentence should be tossed out of the HTML+ spec as irrelevant.
        
        "You can also provide a search field that is always present
        (and can't be scrolled away)" is a browser issue and not in
        the scope of a markup language spec.

page 4: Strike "(preferably centered)".

        Strike "flush left".

        Strike "WYSIWYG editors should automatically generate
        identifiers.  In this case, they should provide a point and
        click mechanism for defining links so that authors don't need
        to deal explicitly with identifier names."  Also strike
        sentence after that -- shouldn't be a part of the markup spec
        and in fact exposes a weakness that we don't want to deal with
        yet (like I've said before: there will never be a marker where
        you really need one, regardless of how many you stick in
        there).  Also strike following paragraph.

page 5: "you may wish to switch off word wrap with WRAP=OFF" seems to
        come out of left field.  In any case, do we really want that?
        Isn't that what PRE/LIT is for?

        Footnote: What is "H" (as opposed to "H1 to H6")?

page 6: Is &quot; really necessary?

        Where did the "Q" tag come from?

page 7: What is "<!ENTITY ...> %ISOcyr1;" stuff?  Is it intended to be
        in a DTD or in an HTML+ document?  If the latter, we don't
        normally use "%" as a prefix for anything.  I suggest we just
        toss it out altogether, in any case -- if we need to do
        languages, let's use MIME.

page 8: Kill <U> -- underlining is too convenient, obvious, and
        popular a representation method for hyperlinks; many browsers
        will have to ignore it anyway.

        Why superscript and subscript?  If we don't do math (see
        below), what applications require them?  They would be quite a
        bit of work to implement, and many browsers wouldn't be able
        to.

page 10: "may overwrite previous lines of text"???  What???

        Footnote is kinda confusing.

page 11: Omit SEETHRU paragraph.

        "bigpic.giff" should be "bigpic.gif"

        Clear up ambiguity introduced by saying that some servers can
        handle drags.

        Kill final paragraph in section (about "asking HTTP servers to
        include images with the HTML+ document as a MIME multipart
        message") -- it's far from clear that this will be implemented
        in any near term.

page 12: Axe conditional text section.

        Kill <L> -- it's really kinda pointless, because of <BR>.

        Another mention of WRAP=OFF that doesn't seem to need to be
        there.

page 13: Kill suggestion in footnote 1.

page 18: Clarify to leave ISMAP as "?x,y" since it's already in place.

page 19: Kill second paragraph of section 8.3 altogether.

        Kill tables -- they add a lot of complexity to the
        requirements for a conforming browser while not providing
        enough capability to make their absence a showstopper for any
        potential application.  Both inlined images and PRE sections
        will be "good enough" substitutes in most cases.

page 22: Let's use a separate METHOD attribute to keep things
        nonconfusing.

        Kill <htmlplus forms=off> paragraph.

        Do we *really* (I mean *really*) need the MH stuff?

page 23: TYPE's default should be clarified to be a single-line text
        entry area.

        Instances of TYPE=IMAGEMAP should be TYPE=IMAGE throughout.

        Instances of MAX should be MAXLENGTH.

page 28: Kill math stuff altogether.  It belongs elsewhere -- we can't
        do enough of it to be competitive with existing, used systems
        and so we should punt -- at least for now.  Again, this is one
        of those things that can reappear in HTML++ if it turns out
        it's really needed -- but let's keep it out of the base
        spec.

page 31: Kill htmlplus tag.

        ISINDEX spec should specify encoding method beyond just
        space-to-plus.
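
A rough sketch of why space-to-plus alone is not enough: a literal "+"
in the query becomes indistinguishable from an encoded space.  The
percent-escaping shown here (via Python's urllib.parse) is only one
possible encoding, used as an assumption for illustration; it is not
what the spec currently says.

```python
from urllib.parse import quote_plus, unquote_plus

query = "HTML+ tables & forms"

# Space-to-plus only: the literal "+" and "&" pass through unchanged.
naive = query.replace(" ", "+")

# Space-to-plus plus percent-escaping of reserved characters.
encoded = quote_plus(query)

print(naive)                  # literal "+" now looks like a space
print(unquote_plus(naive))    # decoding loses the original "+"
print(encoded)
print(unquote_plus(encoded))  # round-trips back to the original query
```

With the naive form the server decodes "HTML+" as "HTML " and cannot
recover what the user typed; with the escaped form the round trip is
exact.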

page 32: I still think NEXTID should be nuked from the spec -- let it
        be used by particular editors WITHIN A COMMENT if necessary,
        but it has nothing to do with the markup language.

page 47: Appendix II is way overkill.  Any text-based browsers
        wouldn't be able to do anything with almost all of the
        symbols, and graphics-based browsers typically would need a
        *lot* of complex special-case rendering code.  Let's let the
        normal (albeit slow) progress of system character sets solve
        this problem correctly.

Cheers,
Marc