Re: Entities
Murray Maloney <murray@sco.COM>
Date: Thu, 22 Sep 94 15:34:40 EDT
Message-id: <9409221429.aa01312@dali.scocan.sco.COM>
Reply-To: murray@sco.COM
Originator: html-wg@oclc.org
Sender: html-wg@oclc.org
Precedence: bulk
From: Murray Maloney <murray@sco.COM>
To: Multiple recipients of list <html-wg@oclc.org>
Subject: Re: Entities
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
X-Comment: HTML Working Group (Private)
>
> In message <9409221024.aa00606@dali.scocan.sco.COM>, Murray Maloney writes:
> >
> >P.S. I have noticed one curious thing... When I use &#11; I get a space.
> >ASCII 11 is supposed to be a Vertical Tab (VT), so I find it a bit odd.
>
> Is &#11; the _only_ control character that behaves this way, or is it
> the case that perhaps all characters <= 32 act like space in some
> browser? Or perhaps the explanation comes from the fact that some
> implementation of the C library function isspace(c) returns true for
> 11.
No, it's just &#11; -- at least in Mosaic 2.4.
Strangely, not
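
On the isspace() guess, here is a minimal sketch (in Python rather than C) of
which codes up to 32 would count as whitespace. It assumes that Python's
string.whitespace constant, documented as space, tab, linefeed, return,
formfeed, and vertical tab, is a fair stand-in for isspace() in the default
"C" locale. The helper name acts_like_space is made up for illustration, and
none of this says what Mosaic actually does internally.

```python
import string

# Code points that Python's string.whitespace treats as whitespace.
# Assumption: this matches isspace() in the default "C" locale.
WHITESPACE_CODES = sorted(ord(ch) for ch in string.whitespace)


def acts_like_space(code: int) -> bool:
    """Return True if the given character code is in the whitespace set."""
    return code in WHITESPACE_CODES


if __name__ == "__main__":
    # List every code from 0 to 32 that falls in the whitespace set.
    for code in range(0, 33):
        if acts_like_space(code):
            print(code, repr(chr(code)))
```

If a browser leaned on a test like this, 12 (FF) would be expected to act
like 11 (VT), which is not what I see for every code.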
>
> With regard to the spec, does it matter? 11 is one of the "UNUSED"
> characters. The behaviour of a browser when encountering such
> a character is not specified, correct?
>
> Should we call them "SHUNNED", "UNUSED", "UNDEFINED",
> or is some other term appropriate?
I would have thought SHUNNED was appropriate.
Perhaps Terry, Lee or Yuri could help here.
>
> >The full set of characters in 8859/1 is available through
> >numeric character reference except for nbsp.
>
> In what way is nbsp not available? I haven't tested it, but
> character 160 in the X fonts is in fact a space character,
> so I expect that it works (out of happy coincidence, if nothing
> else) on X/Mosaic. And if the Mac and PC browsers are doing
> their ISOlatin1 conversion correctly, &#160; should work
> there too.
Perhaps it is just the system I am on.
When I try &#160; I get nothing.
No space, nothing. I thought that you and I had discussed
this a while ago and had agreed that nbsp and shy did
not work. If y'all tell me that it works on most browsers/systems
then I'll change it.
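
For reference, a small sketch of what the numeric references for nbsp and shy
ought to expand to. It uses html.unescape from the Python standard library,
which follows present-day HTML rules, so treat it as an assumption about the
intended mapping rather than a statement about any 1994 browser. The resolve
helper is a name invented for this example.

```python
import html


def resolve(reference: str) -> str:
    """Expand a character reference using Python's html.unescape."""
    return html.unescape(reference)


# Numeric references for no-break space (160) and soft hyphen (173).
NBSP = resolve("&#160;")
SHY = resolve("&#173;")

if __name__ == "__main__":
    # Show the code points and their single-byte ISO 8859-1 encodings.
    print(ord(NBSP), ord(SHY))
    print(NBSP.encode("latin-1"), SHY.encode("latin-1"))
```

Both characters encode to a single byte in ISO 8859-1, so a browser that shows
nothing for them is most likely hitting a font or rendering gap rather than a
hole in the character set.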
>
> Corprew: you seem to have ready access to these things. Wanna
> check this out for us?
>
> >None of the control characters are supported except for
> >09 (HT), 10 (LF), and 11 (VT).
> >
> >That means that 00-08, 12-31, 127-160, and 215 are outstanding issues.
>
> Isn't 13 (CR) supported?
Oops! I should have read what I wrote.
It is taken as word space in all contexts except <PRE>.
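
To make the "word space except in <PRE>" behaviour concrete, here is a rough
sketch. The set of characters folded into a word space (HT, LF, CR, space) is
my assumption from this thread, and render_text is a hypothetical helper, not
anything taken from a browser.

```python
import re

# Characters treated as word space outside <PRE> in this sketch:
# HT (9), LF (10), CR (13) and space (32). This set is an assumption
# drawn from the thread, not from any browser's source.
WORD_SPACE = re.compile(r"[\t\n\r ]+")


def render_text(text: str, preformatted: bool = False) -> str:
    """Collapse runs of word-space characters unless preformatted."""
    if preformatted:
        # Inside <PRE> the characters are kept exactly as written.
        return text
    return WORD_SPACE.sub(" ", text)


if __name__ == "__main__":
    print(repr(render_text("one\r\ntwo")))
    print(repr(render_text("one\r\ntwo", preformatted=True)))
```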
>
> >The multiply sign currently at #172 is not legitimately part of 8859-1.
> >However, the division sign at #247 is part of 8859-1.
>
> Very strange. Oh well...
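
For anyone who wants to check these positions, a short sketch that decodes a
single byte as ISO 8859-1 and looks up its character name with Python's
unicodedata module. It relies on the first 256 Unicode code points coinciding
with ISO 8859-1. The function name latin1_name is invented for this example.

```python
import unicodedata


def latin1_name(code: int) -> str:
    """Return the Unicode name of the ISO 8859-1 character at this code."""
    # Decode one byte as Latin-1, then look up the character's name.
    return unicodedata.name(bytes([code]).decode("latin-1"))


if __name__ == "__main__":
    for code in (172, 215, 247):
        print(code, latin1_name(code))
```

By this lookup the multiplication sign sits at 215 and the division sign at
247, while 172 is the not sign.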
>
> >> 27: an escape character for ISO2022 escape sequences?
> >> (the multi-lingual document issue again...)
> >
> >We have not declared support for ISO2022 in HTML 2.0, have we?
>
> Not at all. But I thought it might be wise to give it some
> special "reserved for future use" status. For example,
> Spyglass has announced a supported Japanese version of
> Mosaic. I'm curious to know how they represent Japanese
> characters in HTML.
Sort of like how I want to reserve the 8879 entity name space?
> >> 127-159: is there any defined use for these?
> >
> >Yes, ISO-6429 defines the codes from 128-159. Seven are undefined.
> >The remainder have potential uses in browsers, retrieval engines,
> >HTTP, and editors.
>
> Except for perhaps 173 shy, I'd say let's leave these SHUNNED.
> Anyway... it's a 2.1 issue if anything.
Right.
>
> >> 11(0B): --UNUSED--
> >
> > Hmmm! Not what I discovered.
>
>
> Could you elaborate on this? What was the observed behaviour,
> and with what browser?
With Mosaic 2.4 I got a space in plain text and in <PRE>
>
> >
> >From 160-191, the names listed are not usable as character entity names.
> >These characters can only be used as coded characters or numeric
> >character references.
>
> Correct, for 2.0. For 2.1, we may want to open up the issue
> again. I didn't mean to cloud things.
>
>
> Dan