Re: special entities: < > & and "

Steve Heaney <Steve.Heaney@delft.sgp.slb.com>
X-Delivered: at request of secret on dxcern.cern.ch
Date: Wed, 6 Oct 1993 19:07:35 +0100
From: Steve Heaney <Steve.Heaney@delft.sgp.slb.com>
Message-id: <199310061807.AA26247@mordred.delft.sgp.slb.com>
To: www-talk@nxoc01.cern.ch
Subject: Re: special entities: < > & and "

Kevin,

The requirement that <, > and other characters be replaced by entity references 
(in certain situations) comes from SGML. It follows from the way that an SGML 
parser processes the text of an SGML file.

There are several "data types" which an element's content can have, including:

#PCDATA - parsed character data.  The parser must scan it to determine 
          whether it contains any more markup.

CDATA   - character data.  Markup characters are not recognised as markup 
          (apart from the end tag that closes the element).

RCDATA  - replaceable character data.  As CDATA except entity references 
          and character references are recognised.

EMPTY   - element does not have any content.

Most of the elements in the HTML DTD will be declared to have content of 
type #PCDATA.  NCSA Mosaic may not have a problem with "reserved" characters 
such as <, >, " and & in these elements, but you can bet that an SGML 
parser will choke on them.
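
To make the replacement concrete, here is a minimal sketch in Python of 
escaping the four reserved characters before text goes into a #PCDATA 
element.  The helper name escape_reserved and the ENTITY_REFS table are 
my own illustration (assumptions, not part of any DTD or tool); the entity 
names are the usual &lt; &gt; &amp; and &quot;.

```python
# Hypothetical helper for illustration only; the name and table are assumptions.
ENTITY_REFS = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
}


def escape_reserved(text):
    """Replace each reserved character with its entity reference.

    Every input character is mapped exactly once, so the "&" that begins
    a freshly produced reference is never escaped a second time.
    """
    return "".join(ENTITY_REFS.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(escape_reserved('if a < b & b > c then say "ok"'))
```

Text that has already been escaped should not be passed through again, 
since the & of each existing reference would itself be turned into &amp;.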

Here starteth the sermon ...

    That's what comes of using a browser to validate your markup :-)

Here endeth the sermon. Amen.

Steve.

------------------------------------------------------------------------
Steven Heaney

Schlumberger Geco-Prakla
Internet: heaney@delft.sgp.slb.com
------------------------------------------------------------------------