Re: HTML-PS converter.

Jared_Rhine@hmc.edu
Thu, 15 Dec 1994 16:52:32 +0100

JD == Jim Davis <davis@DRI.cornell.edu>

?> I would like to ask for help: where can I find software for
?> converting HTML to PostScript?

JD> There is no HTML to PS software.

Incorrect. I include a recent package release announcement below.

-- begin excerpt from package announcement --

From: jan@betelgeuse.tdb.uu.se (Jan Kärrman)
Newsgroups: comp.infosystems.www.providers,comp.infosystems.www.users
Subject: ANNOUNCE: html2ps, an HTML-to-PostScript converter
Date: 12 Dec 1994 08:38:52 GMT

Version 0.1 beta of html2ps, an HTML-to-PostScript converter, is now
available as "ftp://ftp.tdb.uu.se/pub/sources/html2ps/html2ps_0.1beta.tar".
You can also fetch the individual (four) files from the same directory.
Use binary mode when retrieving the files. The converter is written in
Perl. Perl is available from any comp.sources.misc archive.

First of all: html2ps cannot handle in-line images.

Are you still reading!? Well then, here are some features extracted from
the README file:

-------------------------------------------------------------------------
* Most HTML tags are handled.

* Scaling of the text to any size is possible (the line and page breaks
will of course be adjusted to fit the page).

* It is possible to change the sizes and styles for all the 6 header
levels individually.

* The font size used for preformatted text may be changed.

* The size of the page can be adjusted. The defaults are adapted to the
A4 paper size.

* The margin sizes may be changed.

* Different fonts can be selected. You can easily add new fonts; an
example is given in the Perl script.

* Printing in landscape mode is supported.

* Anchor texts are underlined by default; this can be turned off.

* No syntax check of the HTML code is done by the converter, but it is
possible to call an external HTML checker, specified via the command
line options. The default syntax checker is weblint.

* Page numbers can be inserted.

* A heading tag will cause a page break if the text is close to the end
of a page.

* Highlighting tags are interpreted additively. For example, the HTML
code "<B><I>some text</I></B>" would produce bold italic text.
This can be turned off so that only the innermost tag is interpreted
(here, the italics).

* You can force a page break by including the comment <!--NewPage-->
in the HTML document, at the point you want the page break. (This
action is not defined in the HTML specification. I would like to
have a special character (e.g. &page;), ignored by screen browsers,
but used to force a page break when printing a document.)

* The generated PostScript code is very compact: it will be smaller
than the size of the HTML file plus the size of a PostScript header
(presently about 8 kilobytes).
-----------------------------------------------------------------------
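The additive highlighting rule in the list above can be illustrated with a
small sketch. This is not code from html2ps (which is a Perl script); it is
a Python illustration that assumes a hypothetical two-entry tag table
(STYLE_TAGS) and hypothetical names StyleTracker and styled_runs. It keeps a
stack of open highlighting tags and reports either all of them (additive) or
only the innermost one.

```python
from html.parser import HTMLParser

# Hypothetical mapping from highlighting tags to style names; the real
# tag table used by html2ps is not shown in the announcement.
STYLE_TAGS = {"b": "bold", "i": "italic"}


class StyleTracker(HTMLParser):
    """Record each text run together with the styles in effect for it."""

    def __init__(self, additive=True):
        super().__init__()
        self.additive = additive
        self.stack = []  # currently open highlighting tags, outermost first
        self.runs = []   # (text, set of style names) pairs

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag in STYLE_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Close the most recently opened matching tag, if there is one.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i] == tag:
                del self.stack[i]
                break

    def handle_data(self, data):
        if self.additive:
            active = self.stack          # every open tag contributes
        else:
            active = self.stack[-1:]     # only the innermost open tag
        self.runs.append((data, {STYLE_TAGS[t] for t in active}))


def styled_runs(html, additive=True):
    """Return the (text, styles) runs found in an HTML fragment."""
    tracker = StyleTracker(additive)
    tracker.feed(html)
    tracker.close()
    return tracker.runs
```

With the example from the list, styled_runs('<B><I>some text</I></B>')
reports the run as both bold and italic, while passing additive=False
reports only italic.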

The converter is not as kind as certain browsers towards incorrect
HTML code. For example, Mosaic writes "&lt" as "<", even though the
trailing semicolon is missing.
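
To make the "&lt" point concrete, here is a minimal sketch of strict entity
decoding in Python. It is an assumption about how a strict converter might
behave, not a description of html2ps internals: the ENTITIES table and the
decode_strict function are hypothetical, and a reference is decoded only
when its trailing semicolon is present.

```python
import re

# A few named entities, for illustration only.
ENTITIES = {"lt": "<", "gt": ">", "amp": "&", "quot": '"'}

# A reference counts only when the terminating semicolon is present.
STRICT_REF = re.compile(r"&([A-Za-z]+);")


def decode_strict(text):
    """Decode well-formed entity references and leave everything else alone."""

    def replace(match):
        # Unknown entity names are kept exactly as written.
        return ENTITIES.get(match.group(1), match.group(0))

    return STRICT_REF.sub(replace, text)
```

Under this rule "a &lt; b" is decoded, whereas "a &lt b" passes through
unchanged, which is the kind of difference from Mosaic described above.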

The Perl script has been tested with Perl 4 and 5 on different Suns
running Solaris 2.3 and SunOS 4.1.3. I would like to hear your
experiences with installing html2ps on other platforms.

I am also very interested in getting suggestions for improvements and
bug reports.

Jan Karrman
Dept. of Scientific Computing
Uppsala University, Sweden
jan@tdb.uu.se

-- end excerpt from package announcement --

-- 
Jared_Rhine@hmc.edu | Harvey Mudd College | http://www.hmc.edu/~jared/home.html

"One cannot mark the point without marking the path."