Re: Entities

Daniel W. Connolly
Wed, 21 Sep 94 16:23:24 EDT

In message <>, writes:
>The enclosed list of entities comprises the ones that are in the draft plus
>the ones that define the other ISO 8859-1 characters.
>Since it's stated that the character set is ISO 8859-1, shouldn't the full
>set of entities be available?
>The additions are at the end.
>Mosaic 2.4 doesn't seem to support these, so you could argue that this is not
>current practice. It does seem odd to support these chars with &#ddd; and
>not with named references, though.

Only a little. I suspect this will be one of the issues addressed in
HTML 2.1.
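
To make the point about the two reference forms concrete, here is a minimal Python sketch. It relies on the standard library's `html.unescape`, which follows later HTML entity tables; it is an illustration only and says nothing about what the draft or Mosaic 2.4 actually accepts.

```python
import html

# The same ISO 8859-1 character written two ways:
# a numeric reference (&#ddd;) and a named reference.
numeric = html.unescape("&#233;")
named = html.unescape("&eacute;")

# Both forms resolve to the character at decimal position 233.
print(numeric == named)
print(ord(numeric))
```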

Here's a table that summarizes the HTML character set
as I understand it. Note that characters 27, 127-159, 172,
215, and 247 (decimal) still represent outstanding issues:

27: an escape character for ISO2022 escape sequences?
(the multi-lingual document issue again...)

127-159: is there any defined use for these?

172: in the X fonts, it's a "logical not" character.
Is this part of the ISO8859-1 standard?
What's the SGML entity name (&lnot; ?)

215: in X fonts, it's a "times" character...
247: in X fonts, it's a "divide" character...
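
For anyone checking the three printable code points above against later published tables, the following Python sketch may help. It uses only the standard library (`unicodedata` and `html.entities.name2codepoint`); the helper name `later_assignment` is made up for this example. It shows how later standards name these positions, not how the draft discussed here treats them.

```python
import unicodedata
from html.entities import name2codepoint


def later_assignment(code: int):
    """Return (character name, entity names) for an ISO 8859-1 code point,
    according to the tables shipped with Python's standard library."""
    ch = bytes([code]).decode("latin-1")
    entity_names = sorted(n for n, cp in name2codepoint.items() if cp == code)
    return unicodedata.name(ch), entity_names


for code in (172, 215, 247):
    print(code, *later_assignment(code))
```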

There's also the question of whether PRE lines end in CR, LF, CRLF,
or any of the above.
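
One way to read "any of the above" is that a parser treats CR, LF, and CRLF each as a single line break inside PRE. The Python sketch below assumes that reading; the function name `normalize_pre_line_ends` is hypothetical, and nothing here is taken from the draft itself.

```python
import re

# CRLF must be tried first so that it counts as one break, not two.
_LINE_END = re.compile(r"\r\n|\r|\n")


def normalize_pre_line_ends(text: str) -> str:
    """Map CR, LF, and CRLF to a single LF each (an assumed policy)."""
    return _LINE_END.sub("\n", text)


sample = "first\r\nsecond\rthird\nfourth"
print(normalize_pre_line_ends(sample).split("\n"))
```

Under this policy a stray LF followed by CR counts as two breaks, since only the CR-then-LF order is treated as a pair.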

Number Entity Glyph Description
0(00): --UNUSED--
1(01): --UNUSED--
2(02): --UNUSED--
3(03): --UNUSED--
4(04): --UNUSED--
5(05): --UNUSED--
6(06): --UNUSED--
7(07): --UNUSED--
8(08): --UNUSED--
9(09): TAB (just like space, except in pre)
10(0A): LF (just like space, except in pre)
11(0B): --UNUSED--
12(0C): --UNUSED--
13(0D): CR (just like space, except in pre)
14(0E): --UNUSED--
15(0F): --UNUSED--
16(10): --UNUSED--
17(11): --UNUSED--
18(12): --UNUSED--
19(13): --UNUSED--
20(14): --UNUSED--
21(15): --UNUSED--
22(16): --UNUSED--
23(17): --UNUSED--
24(18): --UNUSED--
25(19): --UNUSED--
26(1A): --UNUSED--
27(1B): ESC ???
28(1C): --UNUSED--
29(1D): --UNUSED--
30(1E): --UNUSED--
31(1F): --UNUSED--
32(20): ala ISO646-IRV (ASCII)
33(21): ! ala ISO646-IRV (ASCII)
34(22): " ala ISO646-IRV (ASCII)
35(23): # ala ISO646-IRV (ASCII)
36(24): $ ala ISO646-IRV (ASCII)
37(25): % ala ISO646-IRV (ASCII)
38(26): & ala ISO646-IRV (ASCII)
39(27): ' ala ISO646-IRV (ASCII)
40(28): ( ala ISO646-IRV (ASCII)
41(29): ) ala ISO646-IRV (ASCII)
42(2A): * ala ISO646-IRV (ASCII)
43(2B): + ala ISO646-IRV (ASCII)
44(2C): , ala ISO646-IRV (ASCII)
45(2D): - ala ISO646-IRV (ASCII)
46(2E): . ala ISO646-IRV (ASCII)
47(2F): / ala ISO646-IRV (ASCII)
48(30): 0 ala ISO646-IRV (ASCII)
49(31): 1 ala ISO646-IRV (ASCII)
50(32): 2 ala ISO646-IRV (ASCII)
51(33): 3 ala ISO646-IRV (ASCII)
52(34): 4 ala ISO646-IRV (ASCII)
53(35): 5 ala ISO646-IRV (ASCII)
54(36): 6 ala ISO646-IRV (ASCII)
55(37): 7 ala ISO646-IRV (ASCII)
56(38): 8 ala ISO646-IRV (ASCII)
57(39): 9 ala ISO646-IRV (ASCII)
58(3A): : ala ISO646-IRV (ASCII)
59(3B): ; ala ISO646-IRV (ASCII)
60(3C): < ala ISO646-IRV (ASCII)
61(3D): = ala ISO646-IRV (ASCII)
62(3E): > ala ISO646-IRV (ASCII)
63(3F): ? ala ISO646-IRV (ASCII)
64(40): @ ala ISO646-IRV (ASCII)
65(41): A ala ISO646-IRV (ASCII)
66(42): B ala ISO646-IRV (ASCII)
67(43): C ala ISO646-IRV (ASCII)
68(44): D ala ISO646-IRV (ASCII)
69(45): E ala ISO646-IRV (ASCII)
70(46): F ala ISO646-IRV (ASCII)
71(47): G ala ISO646-IRV (ASCII)
72(48): H ala ISO646-IRV (ASCII)
73(49): I ala ISO646-IRV (ASCII)
74(4A): J ala ISO646-IRV (ASCII)
75(4B): K ala ISO646-IRV (ASCII)
76(4C): L ala ISO646-IRV (ASCII)
77(4D): M ala ISO646-IRV (ASCII)
78(4E): N ala ISO646-IRV (ASCII)
79(4F): O ala ISO646-IRV (ASCII)
80(50): P ala ISO646-IRV (ASCII)
81(51): Q ala ISO646-IRV (ASCII)
82(52): R ala ISO646-IRV (ASCII)
83(53): S ala ISO646-IRV (ASCII)
84(54): T ala ISO646-IRV (ASCII)
85(55): U ala ISO646-IRV (ASCII)
86(56): V ala ISO646-IRV (ASCII)
87(57): W ala ISO646-IRV (ASCII)
88(58): X ala ISO646-IRV (ASCII)
89(59): Y ala ISO646-IRV (ASCII)
90(5A): Z ala ISO646-IRV (ASCII)
91(5B): [ ala ISO646-IRV (ASCII)
92(5C): \ ala ISO646-IRV (ASCII)
93(5D): ] ala ISO646-IRV (ASCII)
94(5E): ^ ala ISO646-IRV (ASCII)
95(5F): _ ala ISO646-IRV (ASCII)
96(60): ` ala ISO646-IRV (ASCII)
97(61): a ala ISO646-IRV (ASCII)
98(62): b ala ISO646-IRV (ASCII)
99(63): c ala ISO646-IRV (ASCII)
100(64): d ala ISO646-IRV (ASCII)
101(65): e ala ISO646-IRV (ASCII)
102(66): f ala ISO646-IRV (ASCII)
103(67): g ala ISO646-IRV (ASCII)
104(68): h ala ISO646-IRV (ASCII)
105(69): i ala ISO646-IRV (ASCII)
106(6A): j ala ISO646-IRV (ASCII)
107(6B): k ala ISO646-IRV (ASCII)
108(6C): l ala ISO646-IRV (ASCII)
109(6D): m ala ISO646-IRV (ASCII)
110(6E): n ala ISO646-IRV (ASCII)
111(6F): o ala ISO646-IRV (ASCII)
112(70): p ala ISO646-IRV (ASCII)
113(71): q ala ISO646-IRV (ASCII)
114(72): r ala ISO646-IRV (ASCII)
115(73): s ala ISO646-IRV (ASCII)
116(74): t ala ISO646-IRV (ASCII)
117(75): u ala ISO646-IRV (ASCII)
118(76): v ala ISO646-IRV (ASCII)
119(77): w ala ISO646-IRV (ASCII)
120(78): x ala ISO646-IRV (ASCII)
121(79): y ala ISO646-IRV (ASCII)
122(7A): z ala ISO646-IRV (ASCII)
123(7B): { ala ISO646-IRV (ASCII)
124(7C): | ala ISO646-IRV (ASCII)
125(7D): } ala ISO646-IRV (ASCII)
126(7E): ~ ala ISO646-IRV (ASCII)
127(7F): ???
128(80): ???
129(81): ???
130(82): ???
131(83): ???
132(84): ???
133(85): ???
134(86): ???
135(87): ???
136(88): ???
137(89): ???
138(8A): ???
139(8B): ???
140(8C): ???
141(8D): ???
142(8E): ???
143(8F): ???
144(90): ???
145(91): ???
146(92): ???
147(93): ???
148(94): ???
149(95): ???
150(96): ???
151(97): ???
152(98): ???
153(99): ???
154(9A): ???
155(9B): ???
156(9C): ???
157(9D): ???
158(9E): ???
159(9F): ???
160(A0): nbsp non-breaking space
161(A1): iexcl inverted exclamation mark
162(A2): cent cent sign
163(A3): pound pound sign
164(A4): curren general currency sign
165(A5): yen yen sign
166(A6): brvbar broken (vertical) bar
167(A7): sect section sign
168(A8): uml umlaut mark
169(A9): copy copyright sign
170(AA): ordf ordinal indicator, feminine
171(AB): laquo angle quotation mark, left
172(AC): ???
173(AD): shy soft hyphen
174(AE): reg registered sign
175(AF): macr macron
176(B0): ring ring
177(B1): plusmn plus-or-minus sign
178(B2): sup2 superscript two
179(B3): sup3 superscript three
180(B4): acute acute accent
181(B5): micro micro sign
182(B6): para pilcrow (paragraph sign)
183(B7): middot middle dot
184(B8): cedil cedilla
185(B9): sup1 superscript one
186(BA): ordm ordinal indicator, masculine
187(BB): raquo angle quotation mark, right
188(BC): frac14 fraction one-quarter
189(BD): half fraction one-half
190(BE): frac34 fraction three-quarters
191(BF): iquest inverted question mark
192(C0): Agrave capital A, grave accent
193(C1): Aacute capital A, acute accent
194(C2): Acirc capital A, circumflex accent
195(C3): Atilde capital A, tilde
196(C4): Auml capital A, dieresis or umlaut mark
197(C5): Aring capital A, ring
198(C6): AElig capital AE diphthong (ligature)
199(C7): Ccedil capital C, cedilla
200(C8): Egrave capital E, grave accent
201(C9): Eacute capital E, acute accent
202(CA): Ecirc capital E, circumflex accent
203(CB): Euml capital E, dieresis or umlaut mark
204(CC): Igrave capital I, grave accent
205(CD): Iacute capital I, acute accent
206(CE): Icirc capital I, circumflex accent
207(CF): Iuml capital I, dieresis or umlaut mark
208(D0): ETH capital Eth, Icelandic
209(D1): Ntilde capital N, tilde
210(D2): Ograve capital O, grave accent
211(D3): Oacute capital O, acute accent
212(D4): Ocirc capital O, circumflex accent
213(D5): Otilde capital O, tilde
214(D6): Ouml capital O, dieresis or umlaut mark
215(D7): ???
216(D8): Oslash capital O, slash
217(D9): Ugrave capital U, grave accent
218(DA): Uacute capital U, acute accent
219(DB): Ucirc capital U, circumflex accent
220(DC): Uuml capital U, dieresis or umlaut mark
221(DD): Yacute capital Y, acute accent
222(DE): THORN capital THORN, Icelandic
223(DF): szlig small sharp s, German (sz ligature)
224(E0): agrave small a, grave accent
225(E1): aacute small a, acute accent
226(E2): acirc small a, circumflex accent
227(E3): atilde small a, tilde
228(E4): auml small a, dieresis or umlaut mark
229(E5): aring small a, ring
230(E6): aelig small ae diphthong (ligature)
231(E7): ccedil small c, cedilla
232(E8): egrave small e, grave accent
233(E9): eacute small e, acute accent
234(EA): ecirc small e, circumflex accent
235(EB): euml small e, dieresis or umlaut mark
236(EC): igrave small i, grave accent
237(ED): iacute small i, acute accent
238(EE): icirc small i, circumflex accent
239(EF): iuml small i, dieresis or umlaut mark
240(F0): eth small eth, Icelandic
241(F1): ntilde small n, tilde
242(F2): ograve small o, grave accent
243(F3): oacute small o, acute accent
244(F4): ocirc small o, circumflex accent
245(F5): otilde small o, tilde
246(F6): ouml small o, dieresis or umlaut mark
247(F7): ???
248(F8): oslash small o, slash
249(F9): ugrave small u, grave accent
250(FA): uacute small u, acute accent
251(FB): ucirc small u, circumflex accent
252(FC): uuml small u, dieresis or umlaut mark
253(FD): yacute small y, acute accent
254(FE): thorn small thorn, Icelandic
255(FF): yuml small y, dieresis or umlaut mark
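
The categories in the table above can be sketched as a small Python function. The category labels and the name `classify` are invented for this illustration; the ranges are read directly off the table, with 27, 127-159, 172, 215, and 247 reported as unresolved.

```python
from collections import Counter


def classify(code: int) -> str:
    """Classify a code point 0..255 according to the table above."""
    if not 0 <= code <= 255:
        raise ValueError("code point outside 0..255")
    if code in (9, 10, 13):
        return "whitespace"      # TAB, LF, CR
    if code == 27 or 127 <= code <= 159 or code in (172, 215, 247):
        return "unresolved"      # marked ??? in the table
    if code < 32:
        return "unused"          # remaining control positions
    if code <= 126:
        return "ascii"           # ISO646-IRV printable range
    return "latin1-entity"       # 160..255 with a named entity


# Tally how many positions fall into each category.
print(Counter(classify(c) for c in range(256)))
```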