Re: SGML newline processing

Dan Connolly <connolly@pixel.convex.com>
Message-id: <9301082052.AA17146@pixel.convex.com>
To: Michael Leventhal <mleventh@us.oracle.com>
Cc: www-talk@nxoc01.cern.ch
Subject: Re: SGML newline processing 
In-reply-to: Your message of "Fri, 08 Jan 93 12:35:54 PST."
             <9301082035.AA08237@hqsun4.us.oracle.com> 
Date: Fri, 08 Jan 93 14:52:09 CST
From: Dan Connolly <connolly@pixel.convex.com>

>SHORTTAG is an optional feature, but SHORTREF is not, since
>it is required in the SGML declaration.  I think, according
>to the standard, a system which does not support SHORTREF
>is not compliant and therefore not even minimum SGML.

Hmm... I've got the standard in my lap, and while it usually
takes me at least half an hour to be sure I've read all the
relevant sections, it appears to agree with your statement.

However, we're only interested in parsing instances of
a particular DTD.

If we make no SHORTREF declarations in this DTD, we can
dispense with shortref processing in our parser.

>My solution only requires SHORTREF.  I code:
>
><!ELEMENT	newline	- o		EMPTY>
><!ENTITY	nltag	STARTTAG	"newline">
><!SHORTREF	nlmap	"&#RS;"		nltag>
><!USEMAP	nlmap			(verbatim)>
>
>The use of OMITTAG in the newline element is not
>necessary.  This code causes the parser to recognize
>record starts as newline tags within verbatim tags.
>My processor converts the newline tags back to record
>starts.

Hmmm... this is interesting. First a question: why have
the newline element at all? Why not just make the
shortref expand to a newline character directly?

If we put declarations like this in the HTML DTD, it
would then be legal to treat

<XMP>
foo
</XMP>

differently from <XMP>foo</XMP>
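
To make the difference concrete, here is a rough sketch in Python
(not SGML, and nothing like a real parser). It assumes every line
feed counts as one record start and that the element is not nested.
The helper names expand_record_starts and restore_newlines are made
up for illustration; the newline tag is the one from the quoted
declarations.

```python
import re

# Assumption: each line feed in the input stands for one record start (RS).
RS = "\n"


def expand_record_starts(document: str, element: str = "XMP") -> str:
    """Replace each record start inside <element>...</element> with a
    <newline> tag, roughly imitating what the shortref map would do."""
    pattern = re.compile(
        rf"(<{re.escape(element)}>)(.*?)(</{re.escape(element)}>)",
        re.DOTALL | re.IGNORECASE,
    )

    def substitute(match: re.Match) -> str:
        opening, content, closing = match.groups()
        return opening + content.replace(RS, "<newline>") + closing

    return pattern.sub(substitute, document)


def restore_newlines(document: str) -> str:
    """What the downstream processor does: turn <newline> tags back
    into record starts."""
    return document.replace("<newline>", RS)


multi_line = expand_record_starts("<XMP>\nfoo\n</XMP>")
single_line = expand_record_starts("<XMP>foo</XMP>")
```

With the mapping applied, multi_line carries two newline tags and
single_line carries none, so an application can tell the two forms
apart.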

Thanks for the idea... it just might work!

Dan