Re: ISO techniques for Mathematics tagging

Paul Grosso (paul@arbortext.com)
Fri, 21 Apr 95 07:43:57 EDT

> From: "Mike Cowlishaw" <mfc@VNET.IBM.COM>
> Subject: ISO techniques for Mathematics tagging
>
> At Danvers I was asked to post the reference to the mathematics tags
> published by ISO. It is:
>
> ISO/TR 9573:1988, Information Processing -- SGML support
> facilities -- Techniques for Using SGML. (Section 8, pp83-97)
>
> As I recall, the author of this was Anders Berglund, of the ISO
> Secretariat in Geneva. The tags were also used in CALS, I believe.
>
> Mike Cowlishaw
> IBM Fellow
>

This isn't the complete story. ISO TR 9573 was the first time
a math tag set was described in an ISO publication. (This is
a Technical Report which isn't quite the same as an International
Standard.) The AAP's Electronic Manuscript Project had also published
quite early a math tag set which was picked up by several vendors
and user communities. Over the past several years, an attempt
was made to update the AAP math and harmonize it with the TR 9573 math.
The result of that effort is the maths tagging set found in the
international standard ISO 12083:1993 (Eric van Herwijnen, editor).

- - - - More detail on the above (definitely optional reading) - - - -

There have been three major efforts at defining SGML markup for tagging
mathematics: that of the Association of American Publishers (AAP) in their
Electronic Manuscript Project, starting around 1986 (and resulting in 1988
in the ANSI/NISO standard Z39.59-1988, though the mathematics part was not
actually made part of the official standard); that recorded in ISO
Technical Report 9573:1988 (there are several Parts, and one is on Maths
and Chemistry); and that which started as an AAP Math Update committee
to revise the AAP work and which, in fact, resulted in the ISO standard
titled "Electronic manuscript preparation and markup", ISO 12083:1993, which
has just now become available. Since 12083 provides what is basically an
updated form of the original AAP work, I would take the view that there
are now really two extant schemes worth investigating, both of which are
available as ISO publications: ISO TR 9573:1988 Part 7 and ISO 12083
(though note that 9573 undergoes constant revision, and the parts change
numbers, so I cannot be certain of the "Part 7" part, but it is the part
about mathematics and chemistry).

Each of the above two references has a DTD fragment plus some
explanatory text. Neither discusses design philosophy much, but quite
a bit of discussion did precede them, especially in the case of 12083. Our
committee meetings ran the gamut from those who wanted to represent all
possible distinct mathematical semantics in the markup to those who
wanted the markup to represent the geometrical and typographic layout
in a semantic-free way. In fact, most of the mathematicians present
did not believe one could capture all (present and potential future)
semantics and therefore preferred a scheme that would represent the
obvious semantics and would also allow, via some more generic geometrical
constructs, the representation of a wide range of other possibilities.
The 12083 DTD was an attempt to accommodate this desire.

The CALS MIL-M-28001 specification includes an older snapshot of the
TR 9573 math tagging. In practice, it has seen very little use. The
current version of MIL-M-28001 has the following about math tagging:

When creating a tagged instance for a specific document or contract,
mathematical elements must be handled in one of four ways:

a. Simple in-line formulas and equations can be formatted as regular
text using special characters, superscripts, and subscripts.

b. Complex formulas and equations can be generated with an
illustration program and included or referenced as graphics
in the tagged instance.

c. Complex formulas and equations can be formatted using a specialized
mathematical formatting tool where the resulting composed formulas or
equations are included as graphics in the tagged instance. It is
recommended in this case that the source for these formulas and
equations be included or referenced in the tagged instance using some
element declaration designed to accommodate this notation.

d. SGML tagging can be employed for marking up simple and complex
mathematics. However, since the current state of the Output
Specification does not support the association of complex mathematical
formatting specifications to arbitrary SGML declaration sets, this
option must be evaluated in terms of the practical availability of
composition systems that will be able to format such SGML-tagged
mathematics, and it should not be assumed that such systems are or
will be necessarily widely available.

At the top of the appendix section that lists the CALS version of the
TR 9573 math tagging, the specification states:

The following lists a declaration set defining a logical, SGML structure
for describing mathematical formulae. Other SGML declaration sets such
as ISO 12083, Electronic Manuscript Preparation and Markup, are also
available for tagging mathematics.

- - - - - - - - - - - - - - - - - - - - - - - - -

Paul Grosso
VP Research, ArborText, Inc.
and
Chief Technical Officer, SGML Open

Email: paul@arbortext.com