Re: Character Data

Murray Maloney <murray@sco.COM>
Date: Thu, 22 Sep 94 15:34:34 EDT
Message-id: <9409221414.aa01254@dali.scocan.sco.COM>
Reply-To: murray@sco.COM
To: Multiple recipients of list <html-wg@oclc.org>
X-Comment: HTML Working Group (Private)
> 
> In message <9409221044.aa00721@dali.scocan.sco.COM>, Murray Maloney writes:
> ><P>
> >Because certain special characters are subject to interpretation 
> >and special processing, information providers and 
> >browser implementors should follow 
> ><A HREF="#spclchars"> these guidelines </A>
> 
> This paragraph is misleading. In HTML 2.0, there are no characters
> that are "subject to interpretation and special processing."
> There's just ISO8859-1 -- a bunch of character glyphs, two or
> three control characters, and the rest are not used.

As the later text states, at least space and hyphen may be
interpreted or processed differently in different contexts
or by specific processing engines -- hyphenation and justification
(H&J), for example.  I think it would be imprudent not to make
this possibility clear from the outset.
> 
> ><P>
> >Certain characters may not be accessible from your
> >keyboard, or some part of your system (e.g. translation software)
> >may not be equipped to deal with 8-bit character codes.
> 
> This is correct. And it is the _only_ reason for the ISO Added
> Latin 1 entity names in HTML (well... you could also say
> that they serve a mnemonic purpose).

What's your point?  Are you suggesting that the need to include
a literal &, > or < is not a valid reason?
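
To make the point concrete, here is a small sketch using Python's
standard html.escape (my choice of illustration, nothing from the
draft).  Characters that are significant to markup generally have
to be written as entity references when they are meant as data,
whatever the keyboard or the 8-bit situation:

```python
import html

# Text that contains all three markup-significant characters.
sample = "if a < b & b > c"

# quote=False leaves quotation marks alone and escapes only &, < and >.
escaped = html.escape(sample, quote=False)
print(escaped)   # if a &lt; b &amp; b &gt; c

# Decoding the entity references gives back the original text.
assert html.unescape(escaped) == sample
```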
> 
> ><H4> Line Feed  (LF - 10 dec) </H4>
> ><UL>
> ><LI> Interpreted as a word space in all contexts except &lt;PRE&gt;.
> ><LI> Within &lt;PRE&gt;, the line feed should be interpreted
> >as a shift to the start of a new line;
> >that is, <CODE> col := 0; row := row+1 </CODE>
> ></UL>
> ><H4> Carriage Return (CR - 13 dec) </H4>
> ><UL>
> ><LI> Interpreted as a word space in all contexts except &lt;PRE&gt;.
> ><LI> Within &lt;PRE&gt;, the carriage return should be interpreted
> >as a shift to the start of the line;
> >that is, <CODE> col := 0; </CODE>
> ></UL>
> ></UL>
> 
> What if a line is terminated by CRLF in PRE content? Does
> that count as 1 linebreak or 2?

Good question.
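
Here is a minimal sketch of the two rules exactly as quoted above.
The helper name cursor_after is hypothetical, and I am assuming
that every other character simply advances one column (tabs are
ignored here).  Read literally, CR followed by LF moves down only
one row, so CRLF would count as one line break; whether browsers
actually behave that way is what the tests would have to show.

```python
def cursor_after(text):
    """Return (row, col) of the cursor after applying the quoted PRE rules."""
    row, col = 0, 0
    for ch in text:
        if ch == "\n":        # LF: col := 0; row := row + 1
            col = 0
            row += 1
        elif ch == "\r":      # CR: col := 0
            col = 0
        else:                 # assumption: any other character takes one column
            col += 1
    return row, col


print(cursor_after("ab\ncd"))     # (1, 2)  LF alone: one line break
print(cursor_after("ab\r\ncd"))   # (1, 2)  CRLF: still one line break
print(cursor_after("ab\rcd"))     # (0, 2)  CR alone: same row, overprinting
```

Under these rules a bare CR never starts a new line, so a file
that ends its lines with CR only would overprint itself.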
> 
> Corprew: could you run some tests?
> 
> I think this could be clarified.
> 
> Dan