Re: SGML parsing, moving between formats

Dave_Raggett <dsr@hplb.hpl.hp.com>
Errors-To: listmaster@www0.cern.ch
Date: Wed, 16 Feb 1994 15:03:23 --100
Message-id: <9402161357.AA17436@manuel.hpl.hp.com>
Reply-To: dsr@hplb.hpl.hp.com
Originator: www-talk@info.cern.ch
Sender: www-talk@www0.cern.ch
Precedence: bulk
From: Dave_Raggett <dsr@hplb.hpl.hp.com>
To: Multiple recipients of list <www-talk@www0.cern.ch>
Subject: Re: SGML parsing, moving between formats
X-Listprocessor-Version: 6.0c -- ListProcessor by Anastasios Kotsikonas
Content-Length: 1582

At 1:19 PM 2/15/94 +0000, Daniel W. Connolly wrote:

> Yes... and it seems to me (at first glance... I'll have to look more
> closely...) that we've lost the ability to translate HTML to Microsoft
> Word or FrameMaker without any loss of information.
>
> Let's get formal why don't we: I do not mean that we should be able to
> take any RTF file and convert it to HTMLPLUS, or MIF for that matter.
> But I think it's crucial that there exist invertible mappings
>
>        h : HTML -> RTF
> and
>        g : HTML -> MIF
> and
>        h : HTML -> TeXinfo
>
> so that I can take a given HTML document, convert it to RTF, and
> convert it back and get exactly what I started with (the same ESIS,
> that is... perhaps SGML comments and a few meaningless RE's would get
> lost).

I don't understand why Dan thinks that mapping from HTML+ to the other
formats will be impossible. It seems straightforward enough to me.
The main decision is what style to assume for each logical element.

Reversibility *is* a big problem with HTML, as current filters from
FrameMaker or LaTeX to HTML tend to translate tables and math into
inline images. This problem goes away when we switch to HTML+, as
you can then translate reversibly into HTML+ tables and math. That's
why adoption of these features is so important. I will be demoing them
at the forthcoming WWW Conference with a new X11 HTML+ browser.
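
To make the reversibility point concrete, here is a minimal sketch in
Python. It is a toy model, not a real filter: the document is just a
list of (element, content) pairs, and the style names, the
STYLE_FOR_ELEMENT table, and the function names are all made up for
illustration. It shows that a one-style-per-element mapping can be
inverted, while a filter that turns tables and math into inline images
cannot.

```python
# Hypothetical style table: one distinct target style per logical element.
# The style names are illustrative and not taken from any real RTF/MIF filter.
STYLE_FOR_ELEMENT = {
    "h1": "Heading 1",
    "p": "Body Text",
    "table": "Table",
    "math": "Equation",
}
# The inverse lookup exists only because no two elements share a style.
ELEMENT_FOR_STYLE = {style: elem for elem, style in STYLE_FOR_ELEMENT.items()}


def to_styled(doc):
    """Forward mapping: (element, content) -> (style, content)."""
    return [(STYLE_FOR_ELEMENT[elem], content) for elem, content in doc]


def from_styled(styled):
    """Inverse mapping: (style, content) -> (element, content)."""
    return [(ELEMENT_FOR_STYLE[style], content) for style, content in styled]


def lossy_filter(doc):
    """Imitates a filter that renders tables and math as inline images."""
    out = []
    for elem, content in doc:
        if elem in ("table", "math"):
            # The structure is discarded; only an image reference survives.
            out.append(("img", "figure-%d.gif" % len(out)))
        else:
            out.append((elem, content))
    return out


doc = [
    ("h1", "Results"),
    ("p", "Timings are below."),
    ("table", [["run", "ms"], ["1", "42"]]),
    ("math", "x^2 + y^2"),
]

# Converting and converting back yields the original document.
round_trip_ok = from_styled(to_styled(doc)) == doc
# After the lossy filter no table element is left to convert back.
lossy_kept_table = any(elem == "table" for elem, _ in lossy_filter(doc))
```

Here round_trip_ok comes out true and lossy_kept_table false, since
the image reference carries no table or math structure to recover.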

You would still lose any style settings in the process, but even this
can be dealt with by associating the HTML+ document with a style sheet,
as suggested by O'Reilly & Associates.

Dave Raggett