Mr. Van Hoff is quite right. Common practice is pretty much
irreconcilable with the SGML standard in the example he cites.
For a while, this was in the SGML declaration for HTML:
                  SPACE       32
                  TAB SEPCHAR  9
                  LF  SEPCHAR 10
                  FF  SEPCHAR 12
                  CR  SEPCHAR 13
        -- The above is an accurate description of the usage of FUNCTION --
        -- characters in HTML implementations; that is, there is no      --
        -- Record Start or Record End character, and no occurrences of   --
        -- character 10 or 13 are "ignored" by the parser.               --
        -- But because few SGML implementations support this concrete    --
        -- syntax, we include the one below.                             --
        --        RE          13
                  RS          10
                  SPACE       32
                  TAB SEPCHAR  9 --
While this declaration "fixes" the problem, it makes life very difficult for sgmls users.
The current draft says this:
3.2.1  Conventional Representation of Newlines and Record Delimiter Characters
   SGML specifies that a text entity is a sequence of records, each
   beginning with a record start character and ending with a record
   end character (character numbers 10 and 13 respectively).
   MIME specifies that a body of type text/* is a sequence of lines,
   each terminated by CRLF, that is octets 13, 10.
   NOTE: In practice, HTML documents are frequently represented and
   transmitted using an end of line convention that depends on the
   conventions of the source of the document; frequently, that
   representation consists of CR only, LF only, or CR LF
   combination. Hence the decoding of the octets will often result in
   a text entity with some missing record start and record end
   characters.
   Since there is no ambiguity, HTML user agents are encouraged to
   infer the missing record start and end characters.
   An HTML user agent should treat end of line in any of its
   variations as a word space in all contexts except
   preformatted text. Within preformatted text, an HTML user agent
   should expect to treat any of the three common representations of
   end-of-line as starting a new line.
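
A minimal sketch of the end-of-line handling the draft describes, in
Python. The helper names eol_to_space and pre_lines are hypothetical and
appear nowhere in the draft; the only assumption is that CR LF, a lone CR,
and a lone LF each count as exactly one end of line.

```python
import re

# CR LF is listed first so the pair is consumed as ONE end of line,
# rather than as a CR followed by a separate LF.
_EOL = re.compile(r"\r\n|\r|\n")

def eol_to_space(text):
    """Outside preformatted text: each end of line becomes a word space."""
    return _EOL.sub(" ", text)

def pre_lines(text):
    """Inside preformatted text: each end of line starts a new line."""
    return _EOL.split(text)

sample = "one\r\ntwo\rthree\nfour"
print(eol_to_space(sample))
print(pre_lines(sample))
```

All three conventions give the same result here, which is the sense in
which the draft says there is no ambiguity.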
This doesn't address the case below.
Frankly, I'm not sure what to do about this.
Suggestions?
Arthur van Hoff writes:
 > > On Mon, 10 Apr 1995, Arthur van Hoff wrote:
 > > 
 > > > element. If that is true, what is the correct interpretation of RE
 > > > inside a PRE content? For example:
 > > > 
 > > > <pre>
 > > > This is <b>
 > > > bold
 > > > </b> text.
 > > > </pre>
 > > > 
 > > > Should this be interpreted as:
 > > > 
 > > > <pre>
 > > > This is <b>bold</b> text.
 > > > </pre>
 > > 
 > > No, that is not correct. <pre> means preformatted ... that is, use a
 > > fixed pitch font and break the lines where the user did.
 > > Section 10.2 of the March 29 HTML 2 draft is quite clear.  Your example
 > > should show as:
 > > 
 > > This is
 > > BOLD
 > > text.
 > 
 > I was afraid you would say that. Section 7.6.1 "Record Boundaries" of 
 > the SGML specification states:
 > 
 > 	The first RE in an element is ignored if no RS, data, or
 > 	proper subelement preceded it.
 > 
 > 	The last RE in an element is ignored if no RS, data, or
 > 	proper subelement follows it.
 > 
 > Does this mean that HTML cannot be parsed by a strict SGML parser?
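
To make the quoted rule concrete, here is a rough Python sketch of it
applied to a single element's content. It is a deliberate simplification
and not how a real SGML parser works: it assumes RS characters have already
been discarded and that the content is plain character data, so "nothing
preceded it" means position 0 and "nothing follows it" means the final
position. The name strip_boundary_res is hypothetical.

```python
RE = "\r"  # record end: character 13 in the reference concrete syntax

def strip_boundary_res(content):
    """Drop one leading and one trailing RE from an element's data.

    Assumes RS was already removed and content has no nested markup.
    """
    if content.startswith(RE):
        content = content[1:]   # first RE, with nothing before it
    if content.endswith(RE):
        content = content[:-1]  # last RE, with nothing after it
    return content

# Content of the <b> element in the example: RE, "bold", RE.
print(repr(strip_boundary_res("\rbold\r")))
```

Under these assumptions the B element's content reduces to just "bold",
with no line breaks on either side, which is the reading the question
above is worried about. An RE in the middle of the data is left alone.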