Re: SPaces and Tabs in HTML documents

Tim Berners-Lee <timbl@www3.cern.ch>
Date: Mon, 14 Jun 93 18:23:12 +0200
From: Tim Berners-Lee <timbl@www3.cern.ch>
Message-id: <9306141623.AA02170@www3.cern.ch>
To: marca@ncsa.uiuc.edu (Marc Andreessen)
Subject: Re: SPaces and Tabs in HTML documents
Cc: Damian Cugley <Damian.Cugley@prg.ox.ac.uk>, www-talk@nxoc01.cern.ch
Reply-To: timbl@nxoc01.cern.ch

>Date: Mon, 14 Jun 93 04:46:28 -0500
>From: marca@ncsa.uiuc.edu (Marc Andreessen)

>Tim Berners-Lee writes:
>> ...The general understanding (before Mosaic) was that
>> 	- Multiple spaces should be respected as such (for example
>> 	  some people like them around punctuation) and should not
>> 	  be used for prettying up the source.
...
>The general understanding before Mosaic??  The general understanding
>we were under was that redundant spaces, newlines, tabs, etc. were
>not significant, and that's the way Mosaic treats 'em.  The concept
>that multiple spaces should be respected as such is new to me -- where
>in the online info was this stated?

There was obviously a general MISunderstanding!  My fault, should
have got it clearer at the time. This was when Dan Connolly was
our "SGML cop". As I remember, his conclusion was that multiple
spaces were weird things to put in, but if someone wanted to put
them in, then they should be given more space as a result.

>> I would like to specify that multiple spaces be interpreted as such.
>> Would this be a big problem for anyone?

It seems to be a problem for Mosaic and for Dave.

>Isn't it a violation of the SGML philosophy that we've all spent so
>much blood, sweat, and tears trying to adhere to?

I don't think so... SGML allows you to define the significance of the
data.  I know the IBM mainframe SGML implementation here took two
spaces as two spaces.

>If we're going to start down this path, I'd like to see a line
>containing nothing but whitespace to be considered an implicit
>paragraph separator (<p>), as in LaTeX.

I see your argument.  Newlines led to a horrible discussion about
what was expected in HTML.

>> >    <p> Also, the two browsers have different ideas about whether
>> >	the ADDRESS tag marks a new paragraph or not -- www puts the
>> >	address flush right in a new paragraph, Mosaic simply switches
>> >	to a different typeface.
...
>It was a choice, made at least partially because documents can be
>wider than the available window space, causing a horizontal scrollbar
>to show up and info on the right side of the document to be hidden
>until the window is scrolled.  Didn't seem reasonable that information
>should be shoved over where one might not even be able to see it.

They can be wider... but Mosaic is normally very clever at
sizing them to the screen size.  It's only preformatted bits which
force a wide document.  It would be useful if the default
preformatted font was chosen such that 80 characters fitted within
the default window width.  As it is, one normally starts Mosaic
and then has to stretch it the moment a plain ASCII document comes up.

Dan's remark on www-talk on 8 Jan 93 was:
"Is anyone distressed by the situation where some browsers compress
multiple spaces [in typeset paragraphs] into one, and some do not?
I'm not.  I'd say "Don't do that" to the fool who put multiple spaces
in his source."

Well, I'm not too fussy about this, but an agreement is essential.
I can change the behaviour of the parser in the library if we
all agree.

PROPOSED CHANGE then:

All white space to be shrunk to one space outside PRE
sections.   Anyone object?
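
To make the proposal concrete, here is a minimal sketch of the rule in
Python.  It is not the library parser: the function name
collapse_whitespace and the regular expressions are hypothetical, and it
assumes PRE sections are not nested and are closed properly.

```python
import re

# Hypothetical illustration only; not the library's actual parser.
# Matches a whole PRE section, including its tags (case-insensitive).
_PRE_SECTION = re.compile(r"(<pre\b[^>]*>.*?</pre\s*>)",
                          re.IGNORECASE | re.DOTALL)
# A run of spaces, tabs, carriage returns or newlines.
_WHITESPACE_RUN = re.compile(r"[ \t\r\n]+")


def collapse_whitespace(html: str) -> str:
    """Shrink every whitespace run to one space, except inside PRE."""
    # re.split with one capturing group returns the pieces between
    # matches at even indexes and the matched PRE sections at odd ones.
    parts = _PRE_SECTION.split(html)
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            out.append(part)  # PRE section: keep verbatim
        else:
            out.append(_WHITESPACE_RUN.sub(" ", part))
    return "".join(out)


print(collapse_whitespace("Hello,   world.\n\n<pre>a    b</pre>"))
# prints: Hello, world. <pre>a    b</pre>
```

Note that under this rule a blank line outside PRE also becomes a single
space, so it would not act as an implicit paragraph separator.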

Tim