Goals of HTML Math [Was: Shortref [was: Re: Super and Subscripts] ]

Daniel W. Connolly (connolly@hal.com)
Fri, 20 Jan 95 16:19:46 EST

In message <9501201947.AA13430@texcel.no.texcel.no>, Paul Grosso writes:
>> From: "Daniel W. Connolly" <connolly@hal.com>
>> I don't think this argument about automated HTML editing can hold
>> water until the technology for it is widely deployed.

>I would
>hope the ArborText SGML editor will provide a WYSIWYG interface for whatever
>"HTML mathematics" turns out to be, it can now input an SGML file with
>shortrefs.] I just want to be sure we don't design a tagging structure
>that's so complex that the only way to make it acceptable is with shortrefs,
>because I believe many users will find shortrefs unacceptably difficult to use

Hmmm... perhaps I spoke too soon. Maybe I'm in the dark about the
technology that's already out there...

What "tag sets" are currently supported in direct-manipulation (aka
WYSIWYG) interfaces?

Hmmm... I'm getting a really bad feeling that we're embarking on
"design by committee."

And I'm sure we've all had bad experiences with open-ended designs,
i.e. design efforts where there are no clear goals or requirements.

When can we declare victory on the design of tables, figures, and math
for HTML? (no, I haven't forgotten that we still need to close on the
2.0 document... I'm just bored to tears over it!)

Here are some possible design criteria/requirements:

* HTML tables aren't done until we can represent (aka translate) CALS
tables, FrameMaker MIF tables, Microsoft Word RTF tables, and
LaTeX tables (no user-definitions) as HTML tables. Perhaps 100%
expressive capability is not a good idea, but somebody should map out
the issues and get some experience converting some large bodies of
existing data; I would very much like to see some sort of "white
paper" a la the paper on LaTeX2HTML that discusses the issues involved.

* HTML math isn't done until we can reasonably represent ISO20???
equations, FrameMaker equations, and MS Word equations. We will never
be able to represent Turing-complete languages like TeX, but
FrameMaker and Microsoft support a fixed set of features that seem to
satisfy a large market segment.

* The HTML tables/math design isn't complete until someone does at
least a proof-of-concept translation to PostScript for printing.

* The HTML tables/math design isn't complete until someone does at
least a proof-of-concept implementation on X, Mac, and Windows
platforms, to be sure we've explored all the issues like availability
of fonts and font metrics.
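
To make the first criterion above concrete, here is a minimal sketch of
the sort of translation experiment I mean, in Python. It assumes a
deliberately simplified CALS fragment (only tgroup/thead/tbody/row/entry);
the sample data and the function name cals_to_html are made up for
illustration, and spans, colspecs, and alignment are ignored on purpose.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, simplified CALS fragment: no spans, colspecs, or alignment.
CALS_SAMPLE = """
<table>
  <tgroup cols="2">
    <thead><row><entry>Format</entry><entry>Vendor</entry></row></thead>
    <tbody>
      <row><entry>MIF</entry><entry>Frame</entry></row>
      <row><entry>RTF</entry><entry>Microsoft</entry></row>
    </tbody>
  </tgroup>
</table>
"""


def cals_to_html(cals_xml: str) -> str:
    """Translate a simplified CALS table into an HTML table string.

    Anything beyond plain rows and entries is silently dropped, which is
    the kind of loss a real conversion study would have to catalogue.
    """
    root = ET.fromstring(cals_xml.strip())
    lines = ["<table>"]
    # Header rows become <th> cells, body rows become <td> cells.
    for section, cell_tag in (("thead", "th"), ("tbody", "td")):
        for row in root.iterfind(f"./tgroup/{section}/row"):
            cells = "".join(
                f"<{cell_tag}>{escape((entry.text or '').strip())}</{cell_tag}>"
                for entry in row.iterfind("entry")
            )
            lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


print(cals_to_html(CALS_SAMPLE))
```

The interesting part of such an exercise is not the code but the list of
source features that have no place to go on the HTML side.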

Hmmm... this brings up a nagging suspicion that I've had all along: I
suspect there is no "85%" solution to the problem of typesetting
mathematics. I doubt the technical journal community is willing to let
browsers do mathematical typesetting. They're much more likely to be
satisfied with adding support for dvi or hyperdvi inline images than
any sort of mathematical typesetting on the client side.

What other communities/markets are really pulling for math in HTML? I
doubt the marketing/support providers need it very much. I doubt Joe
HomePage needs it. The academic types all use PostScript derived from
TeX. (And they're working on hyperdvi.)

I can see the need for tables/figures/generalized layout. It's used in
newsletters, marketing materials, technical papers, magazines,
newspapers, and just about all forms of written communications. The
only question is how to balance generalized layout versus resizable
windows and fonts.

And I hope that all the implementors aren't tying tag names to
presentational features too tightly: I hope folks are seeing that
generic markup + style sheets is an emerging paradigm. Currently,
authors are discovering that there's no way to "tickle" certain
features of their displays that they're used to seeing: centering,
blinking, colors, special fonts, certain forms of alignment, etc.

I expect this process will eventually terminate, and folks will be
able to get whatever look they want from HTML (or from DSSSL-Lite,
whichever gets there first!).

But then folks will discover that maintaining documents that look like:

<li> <blink><bold>WARNING:</bold></blink> <h3>Do not put
the blowdrier in the bathtub!</h3>

is tedious, messy, error-prone, and a general waste of time. If they're
given the opportunity to write:

<warning>Do not put the blowdrier in the bathtub!</warning>

using a stylesheet and DTD of their own design, I think they'll go for
it.
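
As a rough illustration of generic markup plus a stylesheet, the Python
sketch below keeps the presentation in one table and applies it to the
generic element. The STYLESHEET dictionary and apply_style function are
hypothetical names invented here; this is not DSSSL-Lite syntax, and the
regular expression only handles simple, non-nested elements.

```python
import re

# Hypothetical "stylesheet": maps a generic element name to the
# presentational markup placed before and after its content.
STYLESHEET = {
    "warning": ("<blink><bold>WARNING:</bold></blink> <h3>", "</h3>"),
}


def apply_style(doc: str, stylesheet: dict) -> str:
    """Replace each styled generic element with its presentational form."""

    def render(match):
        name, content = match.group(1), match.group(2)
        if name not in stylesheet:
            # Elements without a style rule pass through unchanged.
            return match.group(0)
        before, after = stylesheet[name]
        return f"{before}{content}{after}"

    # Matches <name>content</name> for simple, non-nested elements only.
    return re.sub(r"<(\w+)>(.*?)</\1>", render, doc, flags=re.S)


source = "<warning>Do not put the blowdrier in the bathtub!</warning>"
print(apply_style(source, STYLESHEET))
```

Changing the look of every warning then means editing one entry in the
table rather than every document that contains one.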

And I don't think down-conversion on the server side is a lasting
solution. I think that the cost of that translation, plus the cost of
the lost information, is greater than the cost of enhancing browsers to
support the generalized markup + stylesheets technique.

But then again... maybe not. Maybe the constraints of authoring and
document management are irreconcilable with the constraints of
distribution and rendering, and there will always be a conversion step.
But I sure hope not.