Re: Interpretation of RE

Albert Lunde (Albert-Lunde@nwu.edu)
Tue, 11 Apr 95 13:46:38 EDT

>Hi Dave,
>
>> On Mon, 10 Apr 1995, Arthur van Hoff wrote:
>>
>> > element. If that is true, what is the correct interpretation of RE
>> > inside PRE content? For example:
>> >
>> > <pre>
>> > This is <b>
>> > bold
>> > </b> text.
>> > </pre>
>> >
>> > Should this be interpreted as:
>> >
>> > <pre>
>> > This is <b>bold</b> text.
>> > </pre>
>>
>> No, that is not correct. <pre> means preformatted ... that is use a
>> fixed pitch font and break the lines where the user did.
>> Section 10.2 of the March 29 HTML 2 draft is quite clear. Your example
>> should show as:
>>
>> This is
>> BOLD
>> text.
>
>I was afraid you would say that. Section 7.6.1 "Record Boundaries" of
>the SGML specification states:
>
> The first RE in an element is ignored if no RS, data, or
> proper subelement preceded it.
>
> The last RE in an element is ignored if no RS, data, or
> proper subelement follows it.
>
>Does this mean that HTML cannot be parsed by a strict SGML parser?
>
>Have fun,
>
> Arthur van Hoff (avh@eng.sun.com)
> http://java.sun.com/people/avh/
> Sun Microsystems Inc, M/S UPAL02-301,
> 100 Hamilton Avenue, Palo Alto CA 94301, USA
> Tel: +1 415 473 7242, Fax: +1 415 473 7104

I think this point was addressed in the old CERN "Level 1" HTML spec. See,
for example:

http://www.w3.org/hypertext/WWW/MarkUp/Text.html

which says in part:

>Line Breaks
>
>A line break character is considered markup (and ignored) if it is the
>first or last piece of content in an element. This allows you to write
>either
>
><PRE>some example text</pre>
>
>or
>
><pre>
>some example text
></pre>
>
>and these will be processed identically.
>
>Also, a line that's not empty but contains no content will be ignored
>altogether. For example, the element
>

I can't find identical language in the HTML 2.0 spec draft, but if the
above agrees with what is required by HTML, I don't think we've got a
problem, since the old spec is the historical basis for a number of
implementations, and the HTML 2.0 and HTML 3.0 specs _say_ they comply with
SGML.

Also, <pre> was introduced to replace other tags that were not SGML
compliant, so I suspect someone has looked at these issues in the past...
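
As a rough illustration of the line-break rule quoted above from the CERN
text, here is a minimal Python sketch. It is not a real SGML parser: the
function name strip_boundary_line_breaks is hypothetical, it works on the
raw content string of a single element, and it treats a plain newline
(LF, CR or CRLF) as the record end, which is an assumption on my part.

```python
def strip_boundary_line_breaks(content: str) -> str:
    """Drop at most one line break at the start and one at the end.

    Sketch of the "first or last piece of content" rule only; a real
    SGML parser also has to deal with RS, subelements, and so on.
    """
    newlines = ("\r\n", "\n", "\r")  # try CRLF first so it is removed whole

    # A line break immediately after the start tag is treated as markup.
    for newline in newlines:
        if content.startswith(newline):
            content = content[len(newline):]
            break

    # A line break immediately before the end tag is treated as markup.
    for newline in newlines:
        if content.endswith(newline):
            content = content[:-len(newline)]
            break

    return content


# The two <pre> forms from the quoted spec text reduce to the same content.
inline_form = strip_boundary_line_breaks("some example text")
multiline_form = strip_boundary_line_breaks("\nsome example text\n")
print(inline_form == multiline_form)
```

Under the same assumptions, the content of the <b> element in Arthur's
example ("\nbold\n") would come out as just "bold", which is the reading
he asked about.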

---
    Albert Lunde                      Albert-Lunde@nwu.edu