Re: Toward Closure on HTML

lilley@v5.cgu.mcc.ac.uk (Chris Lilley, Computer Graphics Unit)

Errors-To: listmaster@www0.cern.ch
Date: Wed, 6 Apr 1994 19:53:32 --100
Message-id: <94040618511810@cguv5.cgu.mcc.ac.uk>
Reply-To: lilley@v5.cgu.mcc.ac.uk
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: lilley@v5.cgu.mcc.ac.uk (Chris Lilley, Computer Graphics Unit)
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: Re: Toward Closure on HTML 
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Length: 5952

letovsky-stan@CS.YALE.EDU wrote:

> "Daniel W. Connolly" <connolly@hal.com> wrote

>>NEWLINES, PARAGRAPH BREAKS, AND <P>

>>Folks have asked why the <p> tag is necessary at all -- why can't we just
>>use a blank line like troff and TeX?

If this were a small project just starting out, that would be a valid suggestion
(but a poor one; a paragraph should have a tag just like anything else).

Given however that the Web is in daily use by millions, such a suggestion is 
well off the mark.

Yes, there is a problem in that <p> has no </p> and is at the end of a
paragraph. OK, it's broken, but the browsers handle it.

They also handle a form which fits in with the way all the other tags work:

<p>this is a paragraph.</p>

and this form is likely to be in HTML+.
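
As an illustration only of why the container form suits tools, here is a
minimal Python sketch using the standard library's html.parser module. The
names ParagraphCollector and collect_paragraphs are invented for this
example, and it makes no claim about how any actual browser works.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collect the text of each <p>...</p> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []   # finished paragraphs, in document order
        self._current = None   # text pieces of the open paragraph, or None

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names, so <P> and <p> both arrive as "p".
        if tag == "p":
            self._current = []

    def handle_endtag(self, tag):
        # The closing tag marks the end of the paragraph unambiguously.
        if tag == "p" and self._current is not None:
            self.paragraphs.append("".join(self._current).strip())
            self._current = None

    def handle_data(self, data):
        # Text outside any <p>...</p> pair is ignored by this sketch.
        if self._current is not None:
            self._current.append(data)


def collect_paragraphs(html_text):
    """Return the text of every <p>...</p> element in html_text."""
    collector = ParagraphCollector()
    collector.feed(html_text)
    collector.close()
    return collector.paragraphs
```

With an explicit end tag there is nothing to guess: the paragraph is whatever
sits between <p> and </p>, exactly as a title is whatever sits between <h1>
and </h1>.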

>>First, it's too late to do that: there are too many documents with blank
>>lines that don't indicate a paragraph break.

Absolutely.

>HTML is not yet at the point where it should be regarded
>as cast in stone. An incompatible would-be successor HTML+
>is already on the horizon. Now is the time to consider
>such changes.

This is confusing several issues. Yes, HTML should be frozen as a standard 
rather than continually being twiddled with. Hence HTML+; what was learned from 
HTML has been fed into it. 

But altering HTML at this late stage to be even less SGML compliant would be 
such a bad move I am amazed that anyone has suggested it.

Incompatibility is a non-issue; I suspect a browser can tell the difference 
between text/html and text/htmlplus.
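
A rough sketch of that dispatch, in Python. Only the two media types come
from the discussion; the function name pick_markup and the labels it returns
are assumptions made up for the example.

```python
def pick_markup(content_type):
    """Choose a parser label from a Content-Type header value.

    The labels "html", "htmlplus" and "unknown" are invented for this
    sketch; only text/html and text/htmlplus come from the discussion.
    """
    # Drop any parameters (for example "; charset=...") and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/html":
        return "html"
    if media_type == "text/htmlplus":
        return "htmlplus"
    return "unknown"
```

For instance, pick_markup("text/htmlplus") selects the HTML+ path while
pick_markup("text/html") keeps the existing one, so documents of both kinds
can coexist.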

Now is indeed the time to consider such changes; the defect has been noted and
corrected; the HTML+ DTD specifies <p> ... </p>. Simple, consistent, SGML
compliant. End of problem.

>>Second, not everybody wants it that way: I'd like to be free to stick blank
>>lines in lists and such without introducing paragraph breaks.

Indeed.

> This is a non-issue. LaTeX has a perfectly reasonable approach to
> ignoring extra paragraph breaks in list contexts; use that.

Why?

And why just lists?

And why LaTeX, of all the awful things to pick as an example? If you are happy
writing in LaTeX, do that - and use the Leeds converter to make it into HTML.
But do not suggest that HTML be altered in weird and un-SGML-like ways to fit in
with what you happen to already find familiar. Not everyone uses LaTeX; and as 
the Web grows outside the research/academic community, the percentage of LaTeX 
users will fall drastically.

>>Third, the mechanism for expressing this in SGML, SHORTREF, introduces
>>significant complexity to parsing HTML. It opens up a can of worms including
>><em/foo/ and other tricky parsing idioms.

I think this is just saying that the suggested "empty line means a closing
paragraph tag really" method is just not SGML and would need unsightly hacks to
express it in a DTD. I agree.

>In other words, you would rather have a language that is convenient
>to parse than one that is convenient to use. Big mistake. 

You think that having a blank line mean something in one place and not mean it
in other places (lists, where else?) is 'convenient to use'? Come to that, you
think that anything connected with LaTeX is convenient to use? An even bigger
mistake.

Count the number of people using LaTeX in 'the real world' compared to the 
number using GUI wordprocessors.

>The 
><p> ... </p> construct is a big step in the wrong direction: it makes
>a simple construct like a paragraph, which was already well handled
>by a text-editor, into something onerous

No, it makes it something which is consistent with the way all the other tags 
work and is therefore easier to use and understand if you are typing in html by 
hand - which is not the only way to do it. How 'onerous' is typing </P> ??

You will be suggesting next that titles are well handled by a text editor:

This is a title
---------------

so that should be in HTML too? ;-)

>no one but a parser-writer
>would view <p> ... </p> as an elegant way to say "this is a
>paragraph". Similarly for <li>, etc.

So you want to do away with <li> too? How does LaTeX do that then?
If people understand that <h1>Title</h1> is a title, understanding that 
<p>paragraph</p> is a paragraph does not seem too great a leap.

I think this thread has conflated several distinct points:

1) Current usage

Current use of the <p> tag as a separator is anomalous. This has been sorted out
in HTML+.

2) Ease of parsing

Having a consistent, SGML compliant syntax makes documents easy to parse and 
generate automatically; altering the spec to allow non-SGML-like forms would be 
a backward step.

3) Ease of use

People find it easiest to use what they already know. People used to LaTeX
naturally find the forms of that mark-up more natural. People using other things
will find those things more natural. The solution to this is not, however, to
turn HTML into LaTeX - which would only suit one subgroup of users - but to
author using a system you are familiar with and convert or export as HTML.
Solutions already exist for LaTeX, FrameMaker, Microsoft Word and WordPerfect.
There is a project to develop a Motif GUI HTML editor. Ease of use is addressed
by better HTML-producing tools, not by convenience hacks to HTML.
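
To make the "author in what you know, then convert" point concrete, here is a
minimal Python sketch of such a tool for the blank-line convention. It is an
illustration under stated assumptions, not any of the converters mentioned
above: the function name text_to_paragraphs is made up, and it handles
nothing but paragraphs.

```python
import html
import re


def text_to_paragraphs(text):
    """Turn blank-line separated plain text into <p>...</p> markup."""
    # One or more blank (or whitespace-only) lines separate the blocks.
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        # Join the lines of a block with single spaces, then escape <, > and &.
        body = " ".join(line.strip() for line in block.splitlines())
        if body:
            paragraphs.append("<p>" + html.escape(body, quote=False) + "</p>")
    return "\n".join(paragraphs)
```

The author keeps typing blank lines in a text editor; the tool, not the HTML
specification, takes care of emitting the SGML-compliant form.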

Chris Lilley
+-----------------------------------------------------------------------------+
| Technical Author, ITTI Computer Graphics and Visualisation Training Project |
+-----------------------------------------------------------------------------+
| Computer Graphics Unit,        |  Internet: C.C.Lilley@mcc.ac.uk            |
| Manchester Computing Centre,   |     Janet: C.C.Lilley@uk.ac.mcc            |
| Oxford Road,                   |     Voice: +44 61 275 6045                 |
| Manchester, UK.  M13 9PL       |       Fax: +44 61 275 6040                 |
| <A HREF="http://info.mcc.ac.uk/CGU/staff/lilley/lilley.html">click here</A> | 
+-----------------------------------------------------------------------------+