Re: Deploying new versions [Was: Versioning HTML at the server]

Daniel W. Connolly (connolly@hal.com)
Fri, 28 Oct 94 13:29:32 EDT

[sorry if this is a dup. Having address trouble...]

In message <94102812242094@v5.cgu.mcc.ac.uk>, Chris Lilley, Computer Graphics
Unit writes:
>> Tables are more like forms. The NCSA 2.5 browser should explicitly
>> Accept: text/html-ncsa-2.5 or some such, and there should be an easy
>> way for information providers to communicate to their server software
>> the fact that a given document has tables in it, like using a .thtml
>> extension. Granted, .thtml is a short-term hack that doesn't scale,
>> but it's better than breaking existing clients.
>
>Does this mean that Arena should accept: text/html-ncsa-2.5 too,
>because it also does tables? What should Mosaic for Mac send? I believe
>it does them too, and it is probably not at version 2.5 either.

Well, if those implementors are willing to get together and agree
on a common spec, then they can call it
text/html-with-tables

But unless and until they do, a document that was authored by
previewing with Mosaic 2.5 and uses its extended features should be
labelled accordingly, since there's no reason to believe that it works
with any other browser.

>What happens when mosaic 2.6 comes out?

Same thing. I expect there will be lots of experiments. Each
experimental extension to HTML should be deployed with a corresponding
interoperability strategy. For NCSA Mosaic 2.5 tables, you ship the
table markup to NCSA Mosaic 2.5 clients, and for everybody else, you
filter them to <PRE> or provide a link to a gif rendering of the table
or whatever floats your boat.
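
Here is a rough sketch of that strategy in Python, for illustration
only. The function names are made up, the text/html-ncsa-2.5 label is
the "or some such" name from above rather than a registered type, and
the Accept handling is a bare comma/semicolon split, not a full HTTP
parser.

```python
def accepted_types(accept_header):
    """Return the set of bare media types named in an Accept header value."""
    types = set()
    for item in accept_header.split(","):
        # Drop any ";param=value" suffix and normalise case.
        media_type = item.split(";")[0].strip().lower()
        if media_type:
            types.add(media_type)
    return types


def choose_variant(accept_header, table_markup, pre_fallback):
    """Ship table markup only to clients that explicitly ask for it."""
    if "text/html-ncsa-2.5" in accepted_types(accept_header):
        return "text/html-ncsa-2.5", table_markup
    # Everybody else gets the filtered <PRE> rendering as ordinary HTML.
    return "text/html", pre_fallback
```

The point is that the fallback is the default: a client that says
nothing specific never sees the experimental markup.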

>Very soon, this would mean that servers would have to look out for a
>large list of different accept headers, all of which would mean that
>HTML 3 style tables were accepted.

When this happens, somebody should write a spec for something called
text/html-3-tables, and all the experiments get collapsed.

I wouldn't mind if they all started with the same table spec to begin
with, but I don't get the impression from the NCSA 2.5 documentation
that they implement exactly the HTML+/3 table model. Perhaps the
HTML+/3 table specification should be augmented with "minimal
conformance" info -- for example, any browser that supports TABLE,
ROW, and COLUMN is minimally conforming, even though it may not handle
borders, colors, etc.

> This is just another way of keeping
>a browser list. Server writers and operators should not have to
>maintain an up-to-date list of all the browsers in existence and all
>the different versions thereof and a table mapping each of these to what
>features of HTML 3 are supported.
>
>Dan, the more I think about this the more it seems a poorly worked out
>solution. I know you used it mainly as a lead-in to your HTTP/2.0
>discussion, but still...

I agree that it's ugly. That's why I started this discussion. I invite
you to propose an alternative.

It's so ugly and the tools are so primitive that it's not
working. Information providers will not spend 5 extra man-hours a week
to make sure that lynx clients interoperate well if their audience
uses Mosaic.

The tools must be improved -- we need to make it the responsibility of
the folks that develop the HTML extensions to deploy extra features
without enticing information providers to break existing clients!

>As HTML 3 is destined to be deployed and tested in stages, surely there
>must be some way to specify in a browser independent way what is
>accepted?

I invite you to jump in and write the spec for this "browser
independent way [to specify] what is accepted." In my experience,
everybody does it a little differently until we get enough experience
to see what's good and what's not, and then we write a standard spec.

>Sure, if someone wants to serve radically different experimental
>extensions to HTML these might be tagged as an entirely different
>format.
>
>But it seems absurd to penalise browsers that are helping us along the
>standards track from HTML 2 to HTML 3 by supporting some of the HTML 3
>features.

Not as absurd as breaking the installed base of software and spreading
Fear, Uncertainty, and Doubt about the stability of W3 technology.

>> Eventually, server software should be enhanced to efficiently open
>> the file and find some magic cookie (like a <!DOCTYPE declaration...)
>
>OK, I have been changing things on the servers I run so that all my
>HTML documents (even those served up by CGI scripts!!) begin thus,
>cribbed off the HaL syntax checker ;-)
>
><!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
><html><head> etc
>
>First, is this correct?

As of today, it's correct. I don't expect it to change. But the spec's
not published, and I haven't gotten a report of what changes were
discussed in Chicago.

> Should there be a 2 or 2.0 in there at the end
>and if so, does it go like this
>
><!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN//2.0">

The "...//EN" version effectively selects "the current" or "the
latest" upward compatible version of the HTML DTD. If you use the
"..//2.0" identifier, then you're specifying that even when new
versions of the HTML DTD come out, you want your document parsed
and validated by the 2.0 version.
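
As a sketch of how server software might pick that identifier out of
a file, here is a small Python helper. It is hypothetical: the name
sniff_doctype, the 1024-character limit, and the assumption that a
version, when present, is the fifth "//"-separated field all come from
the examples in this message, not from any published spec.

```python
import re

# Matches <!DOCTYPE name PUBLIC "public identifier"> near the top of a file.
DOCTYPE_RE = re.compile(r'<!DOCTYPE\s+\S+\s+PUBLIC\s+"([^"]*)"', re.IGNORECASE)


def sniff_doctype(text, limit=1024):
    """Return (public_id, version) from a leading DOCTYPE, or None if absent.

    version is None when the identifier ends at "//EN", i.e. when the
    document asks for "the current" DTD rather than a pinned one.
    """
    match = DOCTYPE_RE.search(text[:limit])
    if match is None:
        return None
    public_id = match.group(1)
    fields = public_id.split("//")
    version = fields[4] if len(fields) > 4 else None
    return public_id, version
```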

>Second, if this is correct, PLEASE someone spell out the magic words
>that should be placed at the top of HTML 3 documents.

It depends on the degree of upward compatibility. If all HTML 2.0
documents are also valid HTML 3.0 documents, then they can just
call it:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN//3.0">
and
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Level 3//EN//3.0">

(i.e. both those names refer to the same DTD)

But from what I've seen in the specs (and I don't know if I've seen
the latest one or not), it's not strictly upward compatible. So
it should be called something like:

<!DOCTYPE HTML-PLUS PUBLIC "-//IETF//DTD HTML PLUS//EN//3.0">

>None of the ones at CERN have any declaration, they just start off
><title> which is fine if you know what DTD you are using and it lets
>you omit tags.
>
>Thirdly, if I create some HTML 3 documents and put them on my server,
>which I intend to do, if I get my server to spit out
>
>Content-type: text/html; version=3.0
>
>will this break anything? Will it offend anyone? Will it help anyone?

I believe that most existing clients will go "'text/html;
version=3.0'??? Huh? What's that?" and offer to save it to a file.
This is The Right Thing To Do, but as an information provider, it's
probably not what you want.

Anyway, there is no published HTML version 3.0 document, so this would
be somewhat bogus.

Also, there are two parameters: level identifies a set of features,
and version identifies the document that specifies them. So if you're
using features outside of level 2, you should emit

Content-Type: text/html; level=3; version=???

Actually, it's becoming less and less clear to me that "level" is
a useful distinction. Perhaps

Accept: text/html; forms=2.0; tables=3.0; math=3.1

is what browsers should advertise. I dunno.
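
To make the idea concrete, here is a minimal Python sketch of how a
server might read such a header. The function name is invented, and
the forms/tables/math parameters are only the ones floated above; the
parsing is a plain split on commas and semicolons and ignores quoting
and q-values.

```python
def parse_accept_features(accept_header):
    """Map each media type in an Accept header value to its parameters."""
    result = {}
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        params = {}
        for param in parts[1:]:
            # Skip empty pieces left by a trailing ";".
            if "=" in param:
                name, value = param.split("=", 1)
                params[name.strip().lower()] = value.strip()
        result[media_type] = params
    return result
```

With this, the header above becomes one text/html entry whose
parameters the server can compare against what a document needs.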

Hmmm... but the issues are somewhat simpler for the returned
Content-Type. It is the responsibility of the HTTP server to only
send content types that the client accepts. Once the server
has determined that the client can accept a given document, you
might as well just write:

Content-Type: text/html

and let the client determine the specific version etc. from the
data itself -- perhaps from the <!doctype ...>.

Hmmm... nope... that won't work: existing clients say they Accept:
*/*, so they appear to accept anything. I wish that instead of */*,
clients had advertised that they accept application/octet-stream, and
servers would know that any file can be returned as
application/octet-stream. Hmmm... we _could_ just retrofit those
semantics on the existing situation, i.e. when a client says Accept:
*/*, we define that to mean that it accepts application/octet-stream.

So, for example, if an HTTP server determines that a document with
tables is within the browser's capabilities, because the browser said:

Accept: text/html; tables=3.0;

then the server can just return

Content-Type: text/html

On the other hand, if the HTTP server decides to send the document
to the client because it said:

Accept: */*

then the server should return

Content-Type: application/octet-stream

which will cause the client to save it to a file.
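
Those two cases can be sketched in Python as follows. This is an
illustration under stated assumptions, not a proposal for real server
code: the function name and the needs_tables flag are hypothetical,
the "tables" parameter is the one suggested above, and the check only
looks at whether the parameter is present, not at its version number.

```python
def response_content_type(accept_header, needs_tables):
    """Pick a Content-Type for a document, or None if nothing fits."""
    offered = {}
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        params = {}
        for param in parts[1:]:
            if "=" in param:
                name, value = param.split("=", 1)
                params[name.strip().lower()] = value.strip()
        offered[parts[0].lower()] = params

    html = offered.get("text/html")
    # The client named text/html and covers what the document uses.
    if html is not None and (not needs_tables or "tables" in html):
        return "text/html"
    # Retrofit: treat */* as "accepts application/octet-stream".
    if "*/*" in offered:
        return "application/octet-stream"
    return None
```

A None result means the server has nothing it can honestly send in
that form and should fall back to a filtered version of the document.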

Whew! Boy am I rambling!

>> I don't have enough experience to design
>> an optimal solution right now, but that's no excuse for folks to go
>> breaking existing clients. (I'll say it again: don't break existing
>> clients!)
>
>OK, the HTML 3.0 samples at cern (w3.org) are breaking existing
>clients, by your definition. They are tagged as text/html and have no
>DOCTYPE in them. What should be done about it?

The HTTP server at their site should _not_ send documents that exceed
the capabilities of the clients, except as application/octet-stream.

The Arena browser should explicitly advertise some capabilities, and
the HTTP server with the experimental data should be configured to
treat the experimental data as text/html; tables=3.0 or whatever.

Dan
