Re: Frames and WWW

Phillip M. Hallam-Baker (hallam@dxal18.cern.ch)
Thu, 17 Nov 1994 21:42:29 +0100

In article <94E3@cernvm.cern.ch> you write:

>>
|>>Not correct by whose definition? Are you referring to W3O or W3C draft specs or
|>>an RFC from the IETF? Only the latter is definitive with respect to the Web
|>>and even then only if it doth not contain egregious lossage.
|>
|>Is the above in either the W3O, the W3C or the IETF? Does anyone use
|>this? As you so nicely point out, standards mean nothing if they are
|>not used. It just so happens that the TEI people have some very large
|>archives, and some very large SGML documents, and they needed a way to
|>retrieve only parts of the documents, and so they invented the 2
|>methods I outlined, among others. You seem to suffer from the awful
|>"not invented here" malady. Why not make use of something that is
|>already accepted by quite a few people in academia? I will repeat:
|>Phil proposed this:
|>
|> http:///bongo.cern.ch/fred.html#H1:2/H2:4/H3:3/P:4/10,15
|>
|>while I proposed long ago the use of the TEI invented naming schemes.
|>
|> http:///bongo.cern.ch/fred.html/section=2/subsection=4/subsubsection=3/P=4
|> http:///bongo.cern.ch/fred.html/2/4/3/4

Except that this is a URL that identifies an object, not a position within an
object. Not the same thing at all.
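
A minimal sketch of that distinction, using Python's standard `urllib.parse`.
The two URLs are assumed well-formed variants of the ones quoted above (two
slashes after `http:` rather than three); nothing here comes from the TEI
material itself. In the `#` form the addressing string lands in the fragment,
which the client resolves inside the retrieved object. In the `/` form it
becomes part of the path, so it names a different object.

```python
from urllib.parse import urlsplit

# Assumed well-formed variants of the quoted URLs (two slashes after "http:").
fragment_form = urlsplit("http://bongo.cern.ch/fred.html#H1:2/H2:4/H3:3/P:4/10,15")
path_form = urlsplit("http://bongo.cern.ch/fred.html/2/4/3/4")

# Fragment form: the object is fred.html; the position is carried separately.
print("object:  ", fragment_form.path)
print("position:", fragment_form.fragment)

# Path form: the whole string is the object name; no position is carried.
print("object:  ", path_form.path)
print("position:", repr(path_form.fragment))
```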

|>(Note that the second uses the child number of the element, whereas
|>the first is using the occurrence of the element name within the child
|>list.)

Except that this TEI scheme bears no relationship to HTML and you do not
define one. There is no such thing as an HTML `section' or subsection unless
you impose one by defining a separate semantics for the tags. This
is all well and good, but you then have positioning only for the semantic
interpretation, not for the delivery format.

This is not "Not Invented Here" but "Other Suggestion Non-Starter". I somehow
don't think that TEI proposed the scheme in the context you intend to
apply it. Again, references please.

|>>What relation do these `sections' have to HTML elements. Is H1 a section?
|>>Is H2 a subsection? What is a H3???
|>
|>Well, now we come to the crux of the matter. HTML was very poorly
|>designed because it ignored the inherent structure of documents, so in
|>fact we don't have many containers... if we want to address
|>something using the TEI stuff, it will be very "flat"
|>(fred.html/P=14).

By poorly designed I suppose you mean it didn't happen to conform to
the SGML community's views on document design. Guess the ratio of HTML
documents to other SGML documents. Guess the likely ratio in a few months'
time.

|>>Is this an SGML standard or a Web standard? Who has commented on it? Dave
|>>Raggett? Tim B-L?
|>
|>This is a *humanities* standard. The people found that using SGML was
|>of great benefit, because they have *huge* data repositories, and so I
|>guess one could also say it's an SGML standard. Now, while I have
|>every respect for Tim B-L, I'm not sure he knows all that much about
|>document processing on a large scale. Dave Raggett certainly knows SGML
|>well. Ask him for comments.

Ah the humanities people, well known for their ability to create technical
standards.

|>I should note that no-one now controls the WWW. The genie is out of
|>the bottle.
|>
|>>If it's an SGML standard don't imagine that it has any relationship to
|>>HTML.
|>
|>Well, HTML *is* SGML (which of course you know), but it is a
|>particularly poor form of it. As I noted, these are not SGML specific
|>(see below).

Given the SGML spec, HTML is probably the best you can do from a very poorly
designed system. If SGML were properly designed it would not have required over
a year to get the basic HTML DTD correct.

|>>It is possible to create containers by associating sections of text with the
|>>preceding headers and nesting Hn+1 elements within Hn elements. This may be
|>>hard to express in SGML lossage but that is SGML for you.
|>
|>This has got to be the funniest thing I have read all week! Probably
|>all year! SGML's primary purpose is to define the structure of a document
|>explicitly by defining containers and content model.

A circumlocutory way of saying that SGML fails at its principal design purpose.
I quite agree. Many of the claims made for SGML are hubris. There are some
communities within the SGML camp, such as the ICADD people, who are there
because it is the closest thing to a structural markup around. What I
see as the danger, though, is people looking at SGML, noting the lossage and
concluding that structural markup is a bad idea. This has happened before: in
filesystems, where people have looked at the MVS/VM lossage and concluded that
structured filesystems are a bad idea, and in programming languages, where
Wirth set the field back years with the botched Pascal spec, which took strong
typing to the point where arrays of different lengths were intrinsically
different types.

|>One cannot define
|>containers by associating Hn with the following text for 2 reasons:
|>
|>1) Many people use Hn for font effects
|>2) One cannot find the boundaries
|>
|>So I guess your idea works on tag occurrence within a document, in
|>which case, making it look like a path is a mistake because you imply
|>a hierarchy where there is none.

It's easy enough to define the boundaries. If people engage in Mosaic
tag-abuse they end up losers, so what? Their documents light up the bad HTML
flag on more modern browsers and eventually the users get educated.

The tree structure may be deduced using a simple set of rules, since at the top
level within the BODY container the only valid elements are <Hn>, <P>, <UL>,
<DL>, <OL>, <PRE> and <IMG>. The <Hn> elements are the only ones which define
structure within the tree and all the others may be regarded simply as different
types of paragraph.
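
A minimal sketch of those rules in Python, under stated assumptions: the
document has already been reduced to a flat list of (tag, text) pairs for the
top-level BODY elements; a heading <Hn> closes every open section at level n
or deeper and opens a new one; everything else is attached as a paragraph to
the innermost open section. The names `Section`, `build_tree` and `resolve`
are hypothetical, and the trailing character range of the proposed path
(the "10,15" part) is left out.

```python
import re


class Section:
    """A container deduced from an <Hn> heading; level 0 is the BODY itself."""

    def __init__(self, level, title):
        self.level = level
        self.title = title
        self.children = []  # nested Section objects and (tag, text) leaves


def build_tree(elements):
    """Turn a flat list of (tag, text) pairs into a tree of sections."""
    root = Section(0, "BODY")
    stack = [root]
    for tag, text in elements:
        match = re.fullmatch(r"H([1-6])", tag)
        if match:
            level = int(match.group(1))
            # A heading closes every open section at the same or deeper level.
            while stack[-1].level >= level:
                stack.pop()
            node = Section(level, text)
            stack[-1].children.append(node)
            stack.append(node)
        else:
            # Any non-heading element is just a kind of paragraph.
            stack[-1].children.append((tag, text))
    return root


def resolve(root, path):
    """Follow steps like 'H1:2/H2:1/P:2': the k-th child with that tag name."""
    node = root
    for step in path.split("/"):
        if not isinstance(node, Section):
            raise LookupError(f"cannot descend into a leaf at step {step!r}")
        name, _, index = step.partition(":")
        wanted = int(index)
        seen = 0
        for child in node.children:
            child_name = f"H{child.level}" if isinstance(child, Section) else child[0]
            if child_name == name:
                seen += 1
                if seen == wanted:
                    node = child
                    break
        else:
            raise LookupError(f"no match for step {step!r}")
    return node


elements = [
    ("H1", "Intro"), ("P", "a"),
    ("H2", "Background"), ("P", "b"), ("P", "c"),
    ("H1", "Method"), ("P", "d"),
    ("H2", "Setup"), ("P", "e"), ("P", "f"),
]
tree = build_tree(elements)
print(resolve(tree, "H1:2/H2:1/P:2"))
```

Documents that skip heading levels or use <Hn> for font effects still yield a
tree under these rules, just not necessarily the one the author had in mind.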

|>Tim B-L didn't define the containers (even *LaTeX* has them...) and
|>now you blame SGML for not being used correctly? When your programs
|>crash do you swear at the language design?

I blame SGML for having the most incomprehensible structure definition
grammar since sendmail and still not allowing the structure to be effectively
represented.

|>----
|>For people who are not interested in some SGML evangelism, you can
|>skip the rest.
|>
|>As I said earlier, unless you are working in the abstractions that the
|>author uses, one is missing the boat entirely. One of the nice things
|>about SGML is that it gives one the flexibility to define the
|>abstractions (if one so wishes), or to simply use the abstractions of
|>others if they happen to fit. Better, it allows you to define the
|>information structure explicitly, and then check that the document
|>actually fits the model you defined.

In other words not all documents are divided into chapters or sections.
This is why an untagged indexing scheme has to operate on the tags used
for the markup.

|>Now SGML *is* coming to the WWW. There are many large corporations and
|>academic sites that *require* SGML, and the structure it contains. In
|>fact, you told me that *you* had plans to do an SGML aware browser...

Quite true. This is one reason why I am more aware than most of the scale
of SGML lossage.

|>If you think you could possibly stop the many large corporations
|>around the world who want this, you are very much mistaken. As you so
|>rightly pointed out, the bottom line is in the RFC's, the IETF, and
|>most of all, in the hands of users (who vote with their feet shall we
|>say). The genie is out of the bottle. One cannot cork it again. To try
|>to do so is arrogance.

The main reason why it won't fit in the bottle being, of course, that the life
support system it's attached to won't fit into the neck. As we know, the users
vote with their feet: hello Microsoft Word 6.0.

--
Phillip M. Hallam-Baker

Not Speaking for anyone else.