Re: Parsing &#60; and &lt;

Luke (ylu@ccwf.cc.utexas.edu)
Thu, 20 Apr 95 17:17:03 EDT

On Thu, 20 Apr 1995, lilley wrote:
>Luke Y. Lu writes:
>
>> Is there any difference between &#60; and &lt; etc. in current html 2.0
>> spec? I can't seem to find an authoritative answer. I noticed an "anomaly"
>> in netscape 1.1b3 today: it "eats" the following html fragment:
>>
>> &#60;stuff&#62;
>>
>> i.e., it seems to unescape &#NN first then feed it to the parser...
>>
>> &lt;stuff&gt;
>>
>> is displayed as <stuff> though.
>
>Four characters are used for markup and thus might be misinterpreted in
>some contexts: < > " & So you need a way to insert these special characters
>such that they are not treated as markup. Named entities are that way.
>
>The numerical references, though, are just the same as typing in the actual
>characters. You would only use &#60; &#62; if, for example, your keyboard
>did not have the < > characters (which would be a royal pain for hand
>editing HTML!)

Please tell me which part of the spec states the _exact_ difference between
a named reference and a numeric reference. Section 13 of the current spec
(http://www.ics.uci.edu/pub/ietf/html/draft-ietf-html-spec-03.txt),
Character Entity Sets, esp. 13.1 Numeric and Special Graphic Entity Set, is
not clear on this question...

>For other characters which are not used as markup, there is no difference
>in practice between the numeric reference and the named entity.
>
>If you have access to a Unix system, you might like to try checking this
>sort of question against the DTD, which is quicker than running up every
>browser in your collection. Check out:
>
> <http://www.halsoft.com/html-tk/>

My question has nothing to do with the HTML DTD (grep the DTD and you'll find
no &#60; or &lt;). It seems to be more like an SGML question. Which
behavior is conforming? Or is it undefined? Tell me, SGML gurus...
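
To make the two behaviors concrete, here is a minimal sketch of the
comparison in Python. It only shows what one particular parser (the
standard library's html.parser) does with the three forms; it is not a
statement of what the draft spec or SGML requires. The names Collector and
parse are hypothetical helpers made up for this illustration.

```python
from html.parser import HTMLParser


class Collector(HTMLParser):
    """Hypothetical helper: records start tags and character data separately."""

    def __init__(self):
        # convert_charrefs=True makes the parser decode both numeric and
        # named references and hand the result to handle_data().
        super().__init__(convert_charrefs=True)
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)


def parse(fragment):
    """Return (list of start tags seen, concatenated character data)."""
    parser = Collector()
    parser.feed(fragment)
    parser.close()
    return parser.tags, "".join(parser.text)


if __name__ == "__main__":
    for fragment in ("&#60;stuff&#62;", "&lt;stuff&gt;", "<stuff>"):
        tags, text = parse(fragment)
        print(repr(fragment), "tags:", tags, "data:", repr(text))
```

With this parser, both reference forms come out as character data and only
the literal <stuff> is reported as a tag. A browser that unescapes &#NN
first and then re-parses the result would instead treat the first fragment
like the third, which is the "eating" described above.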

__Luke

--
Luke Y. Lu
mailto:ylu@mail.utexas.edu/
http://www.utexas.edu/~lyl/