Re: Perceived Consensus: Murray's entity stuff goes in

Murray Maloney (murray@sco.COM)
Tue, 11 Oct 1994 10:18:41 -0400 (EDT)

> In message <9410101117.aa18290@dali.scocan.sco.COM>, Murray Maloney writes:
> >The "Additional Entity Sets (Proposed)" was presented
> >as a way of committing HTML to using the 8879 entity sets.
> >I think that we would all agree that we should not adopt
> >other entity-naming schemes and break one-to-one compatibility
> >with the rest of the SGML docu-verse. N'est-ce pas?
> Agreed: if we need names for characters, and there's an ISO entity
> name for the character, we'll use it.

Right. The statement that I submitted says:

Additional Entity Sets (Proposed)

In future editions of HTML, additional entity sets
may be officially supported. Implementors are encouraged
to adopt the naming conventions described in Annex D
of the SGML standard (ISO 8879) for the following entity sets:

> >I wonder, then, if there might be some more appropriate part of
> >the RFC -- perhaps where the relationship between HTML and SGML
> >is described -- to commit HTML to SGML common practice(s) including
> >the use of these well-known entity sets.
> I'm willing to commit to supporting mnemonic entities for characters
> that are already in the HTML character set (ISO8859-1) like &shy;,
> &nbsp;, &iexcl;, &laquo;, and such.

By which I think you mean that if a character is already supported,
by virtue of being part of ISO8859-1, then we
could commit to providing "character entity" support in addition
to the "numeric character references". This is more specific,
for ISO8859-1, than I was expecting from the spec. But it is
certainly an acceptable "stake in the ground" from my perspective.
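
To make the distinction concrete, here is a minimal sketch (not part of
the proposed spec text) of how a "character entity" and a "numeric
character reference" can resolve to the same ISO8859-1 character. The
table LATIN1_ENTITIES and the function expand_references are
hypothetical names used only for illustration; the four entries are the
names mentioned above with their Latin1 code positions.

```python
import re

# Hypothetical table: ISO 8879 names for a few characters that are
# already in ISO8859-1, mapped to their Latin1 code positions.
LATIN1_ENTITIES = {
    "nbsp": 160,   # no-break space
    "iexcl": 161,  # inverted exclamation mark
    "laquo": 171,  # left-pointing double angle quotation mark
    "shy": 173,    # soft hyphen
}

# Matches either a numeric reference (&#171;) or a named one (&laquo;).
_REF = re.compile(r"&(#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def expand_references(text):
    """Replace references that resolve inside ISO8859-1; leave others alone."""
    def replace(match):
        body = match.group(1)
        if body.startswith("#"):
            code = int(body[1:])
        else:
            code = LATIN1_ENTITIES.get(body)
        # Unknown names and code positions outside the 8bit range
        # are passed through untouched.
        if code is None or code > 255:
            return match.group(0)
        return chr(code)

    return _REF.sub(replace, text)
```

Under these assumptions, expand_references("&laquo;") and
expand_references("&#171;") give the same one-character result, while a
name outside the table, such as &darr;, is left as written.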

> I don't think it's wise to suggest that HTML will include all the ISO
> entity sets -- ISOnum (frac58, darr, sung), ISOgrk1,2,3 (agr-OHgr),
> ISOtech (becaus, bernou, exist, forall) ISOcyr1,2 (Dcy, etc.) -- until
> we understand what impact this will have on browsers and web
> communications in general.

I don't think that the wording presented above makes that suggestion
-- I may be wrong, but I don't think it does.

> I'd hate to see some developer on a machine with every font in the
> world cook up some design and say "See, it's easy!" and force
> everybody else to bloat their browser installations with zillions of
> fonts.

I expect that such a feature would still not be considered part of
the spec until it was deployed in at least three browsers -- or
whatever the "official acid test" is these days.

> It's likely that if we introduce all these entity sets, browser
> implementors will just find clunky translations to ASCII, and
> "fidelity of communications on the web," one of the stated goals of
> this standard, will suffer.

As I've said, I would expect the feature to remain unofficial until
it was widely deployed. However, I think that it would be perfectly
acceptable for a browser to substitute ASCII renderings of characters
if the alternative were no support for a character. That is, I find
that "a=PI*r^2" is far more helpful than "the area is equal to pi
times the radius squared".
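
A rough sketch of the kind of ASCII substitution I have in mind, for a
browser that has no glyph for a character. The table ASCII_FALLBACK and
the function render_ascii are hypothetical, and the renderings in the
table are my own illustrative choices, not anything defined by ISO 8879
or by the HTML draft.

```python
import re

# Hypothetical fallback table: entity name -> ASCII rendering.
# The renderings are illustrative assumptions only.
ASCII_FALLBACK = {
    "pi": "PI",       # Greek small letter pi (ISOgrk3)
    "frac58": "5/8",  # fraction five-eighths (ISOnum)
    "darr": "v",      # downward arrow (ISOnum)
}

_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def render_ascii(text, fallback=ASCII_FALLBACK):
    """Substitute an ASCII rendering for each entity in the table.

    Entities that are not in the table are left exactly as written,
    so nothing is silently dropped.
    """
    return _ENTITY.sub(
        lambda m: fallback.get(m.group(1), m.group(0)), text
    )
```

With this table, render_ascii("a=&pi;*r^2") produces "a=PI*r^2", which
is the rendering I would settle for when the alternative is no support
at all.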

> For 2.1, we need to tackle the multilingual WWW. For 2.0, I don't want
> to suggest that HTML is _anything_ beyond ISO8859-1, the 8bit Latin1
> character set.

The ISO8859-1 8bit Latin1 character set is specified as the code set
-- not the set of characters to be rendered. So long as we keep
that distinction, I agree completely. In fact, for 2.0 nothing
else needs to be said or done. But for 2.1 and beyond...

> Dan

Murray C. Maloney                         Internet:
Technical Publications Writer/Architect   Uucp: ...uunet!sco!murray
SCO Canada, Inc.                          My Phone: (416) 960-4031
130 Bloor Street West, 10th Floor         Fax: (416) 922-2704
Toronto, Ontario, Canada M5S 1N5          SCO Phone: (416) 922-1937
Sponsor member of Davenport Group (
Member of IETF HTML Working Group (