Re: SGML newline processing

Michael Leventhal <mleventh@us.oracle.com>
Message-id: <9301082234.AA10337@hqsun4.us.oracle.com>
Date: Fri, 8 Jan 93 14:34:22 PST
From: Michael Leventhal <mleventh@us.oracle.com>
To: connolly@pixel.convex.com
Subject: Re: SGML newline processing
Cc: www-talk@nxoc01.cern.ch

>However, we're only interested in parsing instances of
>a particular DTD.
>
>If we make no SHORTREF declarations in this DTD, we can
>dispense with shortref processing in our parser.

You've already decided to buy in to SGML.  You must see,
correctly, I think, the immediate and long-term
advantages of adhering to the international standard.
"Conforming", the minimum set of features, is a
non-arbitrary baseline which you can expect all
HTML-capable parsers to meet.

One should be able to get a PhD for a good
SHORTREF dissertation.  I think I've gotten the gist
of some of the discussion, though: SHORTREF must be
supported in the SGML declaration in order to deal with
some problems which arise through the handling of
some special codes on some systems.  For example, &#RS;
&#RE; is equivalent to &#RS; on UNIX systems.
It is quite possible that making your parser conforming
will save your neck someday.

>Hmmm... this is interesting. First a question: why
>the newline element in the first place? why not just
>make the shortref expand to a newline character in
>the first place?

I don't think you could do that, since SHORTREF is an
implicit tagging scheme.  Even if you could, it wouldn't
help, since it would leave the ESIS structure unchanged,
with the same filtering of newlines as content is passed
to the process.  My scheme changes the ESIS before content
is passed to the process.

Michael