Re: HTML todo list

Thomas A. Fine <fine@cis.ohio-state.edu>
Date: Tue, 12 Jan 93 17:51:29 -0500
From: Thomas A. Fine <fine@cis.ohio-state.edu>
Message-id: <9301122251.AA13895@soccer.cis.ohio-state.edu>
To: connolly@pixel.convex.com, @cis.ohio-state.edu@cis.ohio-state.edu
Subject: Re: HTML todo list 
Cc: timbl@nxoc01.cern.ch, www-talk@nxoc01.cern.ch
X-Mailer: Perl Mail System v1.1
>>I don't think we should do any shortref magic.  The simplest thing
>>(the way it works now) is that the two examples above are identical.
>>I say this is fine.
>
>But it's a royal pain to implement! Doing full SGML newline processing
>by the standard is quite involved (see the article by Erik Naggum
>in comp.text.sgml about SGML and Records that I referenced in
>an earlier message). For example, you can't just get rid of all
>newlines immediately before or after tags, like it says in the
>web: Only those right after a start tag (of a non-empty element),
>right before an end tag,
>or the ones on a line containing only comments and processing instructions.
>Newlines around <P> tags, for example, _are_ reported.
>
>If we don't stick the SHORTREF magic in the DTD to force the
>parser to report all newlines, we'll end up with countless hacks
>at newline processing, none of which matches the standard, and
>it'll be luck if any of them matches each other.

Not necessarily.  Carefully define which new-lines have to be ignored.
This may yield something complex.  But a formatter is still free to ignore
more new-lines than that in several different places, which reduces the
problem.  In other words, it is up to the formatting program to decide
how to interpret them.  If it decides to throw out a few more new-lines
at the beginning or end of various data elements, life becomes much
easier.

You might have countless hacks, but since formatters are allowed to
format things differently, does it matter?

Take this example:

  Here's some text<P>And some more text
  <P>
  And some final text.

SGML may say this:

  \nHere's some text
  (P
  )P
  And some more text\n
  (P
  )P
  \nAnd some final text.

But the formatter is still free to toss those new-lines at the beginning
and end of each paragraph (and in fact it had better if you don't want
a space at the beginning of your paragraphs).
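
As a rough sketch of what I mean (not any real parser's interface), here
is how a formatter might do it.  The event list format is an assumption:
"(P" and ")P" stand for the tag events shown above, and anything else is
character data.  The name format_paragraphs is made up for illustration.

```python
def format_paragraphs(events):
    """Group character data into paragraphs and trim boundary new-lines.

    `events` is a hypothetical list of strings: "(P" and ")P" mark the
    start and end of an (empty) P element; every other string is
    character data that may contain real "\n" characters.
    """
    paragraphs = []
    current = []
    for ev in events:
        if ev == "(P":
            # A P tag ends the paragraph collected so far.
            paragraphs.append("".join(current))
            current = []
        elif ev == ")P":
            # Nothing to do for the matching end event.
            continue
        else:
            current.append(ev)
    paragraphs.append("".join(current))

    # Toss new-lines at the beginning and end of each paragraph, and
    # treat any remaining interior new-line as an ordinary space.
    return [p.strip("\n").replace("\n", " ") for p in paragraphs]


# The parser output from the example above, as a Python list.
EVENTS = [
    "\nHere's some text",
    "(P",
    ")P",
    "And some more text\n",
    "(P",
    ")P",
    "\nAnd some final text.",
]

for paragraph in format_paragraphs(EVENTS):
    print(repr(paragraph))
```

Whether the parser reported those boundary new-lines or not, the
formatter ends up with the same three paragraphs.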

	 tom