Re: Entities

"Daniel W. Connolly" <connolly@hal.com>
Date: Wed, 21 Sep 94 16:23:24 EDT
Message-id: <9409212024.AA02099@ulua.hal.com>
Reply-To: connolly@hal.com
Originator: html-wg@oclc.org
Sender: html-wg@oclc.org
Precedence: bulk
From: "Daniel W. Connolly" <connolly@hal.com>
To: Multiple recipients of list <html-wg@oclc.org>
Subject: Re: Entities 
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
X-Comment: HTML Working Group (Private)
In message <9409211753.AA10388@sqrex.sq.com>, lee@sq.com writes:
>The enclosed list of entities comprises the ones that are in the draft plus
>the ones that define the other ISO 8859-1 characters.
>Since it's stated that the character set is ISO 8859-1, shouldn't the full
>set of entities be available?
>
>The additions are at the end.
>
>Mosaic 2.4 doesn't seem to support these, so you could argue that this is not
>current practice.  It does seem odd to support these chars with &#ddd; and
>not with named references, though.

Only a little. I suspect this will be one of the issues addressed in
HTML 2.1.
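
To make the difference between &#ddd; and named references concrete,
here is a minimal sketch in Python. The names NAMED and expand are
made up for illustration, and the mapping holds only a few of the
name/number pairs from the table further down. It is not how any
particular browser does it.

```python
import re

# Hypothetical subset of named entities, taken from the table in this message.
NAMED = {"nbsp": 160, "copy": 169, "eacute": 233, "yuml": 255}

# Matches either a decimal numeric reference (&#ddd;) or a named one (&name;).
_REF = re.compile(r"&(#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def expand(text: str) -> str:
    """Replace references that resolve to a code in 0..255; leave others alone."""

    def repl(match: re.Match) -> str:
        body = match.group(1)
        if body.startswith("#"):
            code = int(body[1:])
            # Numeric form: works for any code in range, no name needed.
            return chr(code) if code <= 255 else match.group(0)
        # Named form: works only if the name is in the mapping.
        code = NAMED.get(body)
        return chr(code) if code is not None else match.group(0)

    return _REF.sub(repl, text)
```

With this sketch, "caf&#233;" and "caf&eacute;" expand to the same
string, while a name missing from NAMED is passed through unchanged.
That is the asymmetry Lee describes: the number always works, the
name only if somebody defined it.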

Here's a table that summarizes the HTML character set
as I understand it. Note that characters 27, 127-159, 172,
215, and 247 (decimal) still represent outstanding issues:

	27: an escape character for ISO2022 escape sequences?
		(the multi-lingual document issue again...)

	127-159: is there any defined use for these?

	172: in the X fonts, it's a "logical not" character.
		Is this part of the ISO8859-1 standard?
		What's the SGML entity name (&lnot; ?)

	215: in X fonts, it's a "times" character...
	247: in X fonts, it's a "divide" character...

There's also the question of whether PRE lines end in CR, LF, CRLF,
or any of the above.
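
If the answer turns out to be "any of the above", a reader could
treat all three conventions as one line break. A rough sketch in
Python follows; split_pre_lines is a hypothetical helper, and this
shows one possible policy, not a settled rule.

```python
import re


def split_pre_lines(text: str) -> list[str]:
    """Split PRE content into lines, accepting CRLF, lone CR, or lone LF.

    CRLF is listed first in the pattern so that it counts as a single
    line break rather than two.
    """
    return re.split(r"\r\n|\r|\n", text)
```

For example, split_pre_lines("a\r\nb\rc\nd") gives four lines under
this policy.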

  Number  Entity Glyph Description
   0(00):              --UNUSED--
   1(01):              --UNUSED--
   2(02):              --UNUSED--
   3(03):              --UNUSED--
   4(04):              --UNUSED--
   5(05):              --UNUSED--
   6(06):              --UNUSED--
   7(07):              --UNUSED--
   8(08):              --UNUSED--
   9(09):              TAB (just like space, except in pre)
  10(0A):              LF  (just like space, except in pre)
  11(0B):              --UNUSED--
  12(0C):              --UNUSED--
  13(0D):              CR  (just like space, except in pre)
  14(0E):              --UNUSED--
  15(0F):              --UNUSED--
  16(10):              --UNUSED--
  17(11):              --UNUSED--
  18(12):              --UNUSED--
  19(13):              --UNUSED--
  20(14):              --UNUSED--
  21(15):              --UNUSED--
  22(16):              --UNUSED--
  23(17):              --UNUSED--
  24(18):              --UNUSED--
  25(19):              --UNUSED--
  26(1A):              --UNUSED--
  27(1B):              ESC ???
  28(1C):              --UNUSED--
  29(1D):              --UNUSED--
  30(1E):              --UNUSED--
  31(1F):              --UNUSED--
  32(20):               ala ISO646-IRV (ASCII)
  33(21):            !  ala ISO646-IRV (ASCII)
  34(22):            "  ala ISO646-IRV (ASCII)
  35(23):            #  ala ISO646-IRV (ASCII)
  36(24):            $  ala ISO646-IRV (ASCII)
  37(25):            %  ala ISO646-IRV (ASCII)
  38(26):            &  ala ISO646-IRV (ASCII)
  39(27):            '  ala ISO646-IRV (ASCII)
  40(28):            (  ala ISO646-IRV (ASCII)
  41(29):            )  ala ISO646-IRV (ASCII)
  42(2A):            *  ala ISO646-IRV (ASCII)
  43(2B):            +  ala ISO646-IRV (ASCII)
  44(2C):            ,  ala ISO646-IRV (ASCII)
  45(2D):            -  ala ISO646-IRV (ASCII)
  46(2E):            .  ala ISO646-IRV (ASCII)
  47(2F):            /  ala ISO646-IRV (ASCII)
  48(30):            0  ala ISO646-IRV (ASCII)
  49(31):            1  ala ISO646-IRV (ASCII)
  50(32):            2  ala ISO646-IRV (ASCII)
  51(33):            3  ala ISO646-IRV (ASCII)
  52(34):            4  ala ISO646-IRV (ASCII)
  53(35):            5  ala ISO646-IRV (ASCII)
  54(36):            6  ala ISO646-IRV (ASCII)
  55(37):            7  ala ISO646-IRV (ASCII)
  56(38):            8  ala ISO646-IRV (ASCII)
  57(39):            9  ala ISO646-IRV (ASCII)
  58(3A):            :  ala ISO646-IRV (ASCII)
  59(3B):            ;  ala ISO646-IRV (ASCII)
  60(3C):            <  ala ISO646-IRV (ASCII)
  61(3D):            =  ala ISO646-IRV (ASCII)
  62(3E):            >  ala ISO646-IRV (ASCII)
  63(3F):            ?  ala ISO646-IRV (ASCII)
  64(40):            @  ala ISO646-IRV (ASCII)
  65(41):            A  ala ISO646-IRV (ASCII)
  66(42):            B  ala ISO646-IRV (ASCII)
  67(43):            C  ala ISO646-IRV (ASCII)
  68(44):            D  ala ISO646-IRV (ASCII)
  69(45):            E  ala ISO646-IRV (ASCII)
  70(46):            F  ala ISO646-IRV (ASCII)
  71(47):            G  ala ISO646-IRV (ASCII)
  72(48):            H  ala ISO646-IRV (ASCII)
  73(49):            I  ala ISO646-IRV (ASCII)
  74(4A):            J  ala ISO646-IRV (ASCII)
  75(4B):            K  ala ISO646-IRV (ASCII)
  76(4C):            L  ala ISO646-IRV (ASCII)
  77(4D):            M  ala ISO646-IRV (ASCII)
  78(4E):            N  ala ISO646-IRV (ASCII)
  79(4F):            O  ala ISO646-IRV (ASCII)
  80(50):            P  ala ISO646-IRV (ASCII)
  81(51):            Q  ala ISO646-IRV (ASCII)
  82(52):            R  ala ISO646-IRV (ASCII)
  83(53):            S  ala ISO646-IRV (ASCII)
  84(54):            T  ala ISO646-IRV (ASCII)
  85(55):            U  ala ISO646-IRV (ASCII)
  86(56):            V  ala ISO646-IRV (ASCII)
  87(57):            W  ala ISO646-IRV (ASCII)
  88(58):            X  ala ISO646-IRV (ASCII)
  89(59):            Y  ala ISO646-IRV (ASCII)
  90(5A):            Z  ala ISO646-IRV (ASCII)
  91(5B):            [  ala ISO646-IRV (ASCII)
  92(5C):            \  ala ISO646-IRV (ASCII)
  93(5D):            ]  ala ISO646-IRV (ASCII)
  94(5E):            ^  ala ISO646-IRV (ASCII)
  95(5F):            _  ala ISO646-IRV (ASCII)
  96(60):            `  ala ISO646-IRV (ASCII)
  97(61):            a  ala ISO646-IRV (ASCII)
  98(62):            b  ala ISO646-IRV (ASCII)
  99(63):            c  ala ISO646-IRV (ASCII)
 100(64):            d  ala ISO646-IRV (ASCII)
 101(65):            e  ala ISO646-IRV (ASCII)
 102(66):            f  ala ISO646-IRV (ASCII)
 103(67):            g  ala ISO646-IRV (ASCII)
 104(68):            h  ala ISO646-IRV (ASCII)
 105(69):            i  ala ISO646-IRV (ASCII)
 106(6A):            j  ala ISO646-IRV (ASCII)
 107(6B):            k  ala ISO646-IRV (ASCII)
 108(6C):            l  ala ISO646-IRV (ASCII)
 109(6D):            m  ala ISO646-IRV (ASCII)
 110(6E):            n  ala ISO646-IRV (ASCII)
 111(6F):            o  ala ISO646-IRV (ASCII)
 112(70):            p  ala ISO646-IRV (ASCII)
 113(71):            q  ala ISO646-IRV (ASCII)
 114(72):            r  ala ISO646-IRV (ASCII)
 115(73):            s  ala ISO646-IRV (ASCII)
 116(74):            t  ala ISO646-IRV (ASCII)
 117(75):            u  ala ISO646-IRV (ASCII)
 118(76):            v  ala ISO646-IRV (ASCII)
 119(77):            w  ala ISO646-IRV (ASCII)
 120(78):            x  ala ISO646-IRV (ASCII)
 121(79):            y  ala ISO646-IRV (ASCII)
 122(7A):            z  ala ISO646-IRV (ASCII)
 123(7B):            {  ala ISO646-IRV (ASCII)
 124(7C):            |  ala ISO646-IRV (ASCII)
 125(7D):            }  ala ISO646-IRV (ASCII)
 126(7E):            ~  ala ISO646-IRV (ASCII)
 127(7F):             ???
 128(80):             ???
 129(81):             ???
 130(82):             ???
 131(83):             ???
 132(84):             ???
 133(85):             ???
 134(86):             ???
 135(87):             ???
 136(88):             ???
 137(89):             ???
 138(8A):             ???
 139(8B):             ???
 140(8C):             ???
 141(8D):             ???
 142(8E):             ???
 143(8F):             ???
 144(90):             ???
 145(91):             ???
 146(92):             ???
 147(93):             ???
 148(94):             ???
 149(95):             ???
 150(96):             ???
 151(97):             ???
 152(98):             ???
 153(99):             ???
 154(9A):             ???
 155(9B):             ???
 156(9C):             ???
 157(9D):             ???
 158(9E):             ???
 159(9F):             ???
 160(A0):       nbsp  Non breaking space
 161(A1):      iexcl  = inverted exclamation mark 
 162(A2):       cent  = cent sign 
 163(A3):      pound  = pound sign 
 164(A4):     curren  = general currency sign 
 165(A5):        yen  = /yen =yen sign 
 166(A6):     brvbar  = broken (vertical) bar 
 167(A7):       sect  = section sign 
 168(A8):        uml  =umlaut mark
 169(A9):       copy  = copyright sign 
 170(AA):       ordf  = ordinal indicator, feminine 
 171(AB):      laquo  = angle quotation mark, left 
 172(AC):             ???
 173(AD):        shy  Soft Hyphen
 174(AE):        reg  = /circledR =registered sign 
 175(AF):       macr  =macron
 176(B0):       ring  =ring
 177(B1):     plusmn  = /pm B: =plus-or-minus sign 
 178(B2):       sup2  = superscript two 
 179(B3):       sup3  = superscript three 
 180(B4):      acute  =acute accent
 181(B5):      micro  = micro sign 
 182(B6):       para  = pilcrow (paragraph sign) 
 183(B7):     middot  = /centerdot B: =middle dot 
 184(B8):      cedil  =cedilla
 185(B9):       sup1  = superscript one 
 186(BA):       ordm  = ordinal indicator, masculine 
 187(BB):      raquo  = angle quotation mark, right 
 188(BC):     frac14  = fraction one-quarter 
 189(BD):       half  = fraction one-half 
 190(BE):     frac34  = fraction three-quarters 
 191(BF):     iquest  = inverted question mark 
 192(C0):     Agrave  capital A, grave accent 
 193(C1):     Aacute  capital A, acute accent 
 194(C2):      Acirc  capital A, circumflex accent 
 195(C3):     Atilde  capital A, tilde 
 196(C4):       Auml  capital A, dieresis or umlaut mark 
 197(C5):      Aring  capital A, ring 
 198(C6):      AElig  capital AE diphthong (ligature) 
 199(C7):     Ccedil  capital C, cedilla 
 200(C8):     Egrave  capital E, grave accent 
 201(C9):     Eacute  capital E, acute accent 
 202(CA):      Ecirc  capital E, circumflex accent 
 203(CB):       Euml  capital E, dieresis or umlaut mark 
 204(CC):     Igrave  capital I, grave accent 
 205(CD):     Iacute  capital I, acute accent 
 206(CE):      Icirc  capital I, circumflex accent 
 207(CF):       Iuml  capital I, dieresis or umlaut mark 
 208(D0):        ETH  capital Eth, Icelandic 
 209(D1):     Ntilde  capital N, tilde 
 210(D2):     Ograve  capital O, grave accent 
 211(D3):     Oacute  capital O, acute accent 
 212(D4):      Ocirc  capital O, circumflex accent 
 213(D5):     Otilde  capital O, tilde 
 214(D6):       Ouml  capital O, dieresis or umlaut mark 
 215(D7):             ???
 216(D8):     Oslash  capital O, slash 
 217(D9):     Ugrave  capital U, grave accent 
 218(DA):     Uacute  capital U, acute accent 
 219(DB):      Ucirc  capital U, circumflex accent 
 220(DC):       Uuml  capital U, dieresis or umlaut mark 
 221(DD):     Yacute  capital Y, acute accent 
 222(DE):      THORN  capital THORN, Icelandic 
 223(DF):      szlig  small sharp s, German (sz ligature) 
 224(E0):     agrave  small a, grave accent 
 225(E1):     aacute  small a, acute accent 
 226(E2):      acirc  small a, circumflex accent 
 227(E3):     atilde  small a, tilde 
 228(E4):       auml  small a, dieresis or umlaut mark 
 229(E5):      aring  small a, ring 
 230(E6):      aelig  small ae diphthong (ligature) 
 231(E7):     ccedil  small c, cedilla 
 232(E8):     egrave  small e, grave accent 
 233(E9):     eacute  small e, acute accent 
 234(EA):      ecirc  small e, circumflex accent 
 235(EB):       euml  small e, dieresis or umlaut mark 
 236(EC):     igrave  small i, grave accent 
 237(ED):     iacute  small i, acute accent 
 238(EE):      icirc  small i, circumflex accent 
 239(EF):       iuml  small i, dieresis or umlaut mark 
 240(F0):        eth  small eth, Icelandic 
 241(F1):     ntilde  small n, tilde 
 242(F2):     ograve  small o, grave accent 
 243(F3):     oacute  small o, acute accent 
 244(F4):      ocirc  small o, circumflex accent 
 245(F5):     otilde  small o, tilde 
 246(F6):       ouml  small o, dieresis or umlaut mark 
 247(F7):             ???
 248(F8):     oslash  small o, slash 
 249(F9):     ugrave  small u, grave accent 
 250(FA):     uacute  small u, acute accent 
 251(FB):      ucirc  small u, circumflex accent 
 252(FC):       uuml  small u, dieresis or umlaut mark 
 253(FD):     yacute  small y, acute accent 
 254(FE):      thorn  small thorn, Icelandic 
 255(FF):       yuml  small y, dieresis or umlaut mark
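
The table above boils down to a handful of ranges. As a sketch only
(classify and the category labels are my own names, not anything
from the draft), it could be written in Python like this:

```python
# Codes the message lists as outstanding issues: ESC, 127-159, 172, 215, 247.
OPEN_ISSUES = {27, 172, 215, 247} | set(range(127, 160))

# TAB, LF, CR: treated like space, except in PRE.
WHITESPACE = {9, 10, 13}


def classify(code: int) -> str:
    """Return the table category for a character number in 0..255."""
    if not 0 <= code <= 255:
        raise ValueError(f"code {code} is outside 0..255")
    if code in OPEN_ISSUES:
        return "open issue"
    if code in WHITESPACE:
        return "whitespace"
    if code < 32:
        return "unused"
    if code < 127:
        return "ASCII"
    # What remains is 160..255 minus the three open codes.
    return "Latin-1"
```

So classify(9) is "whitespace", classify(65) is "ASCII", and
classify(215) is "open issue" until somebody settles what goes there.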