HTML to PDF



Mystinar
29th October 2007, 06:28 PM
Are there any programs for Linux that can be used to convert HTML files to PDF files? If not, is there some combination of converter programs that can be used to do so (like an html-to-xml and then xml-to-pdf)?

grassmonk
29th October 2007, 08:10 PM
htmldoc will do HTML to PDF. It's in the repos.
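
For scripted use, a minimal sketch of calling htmldoc from Python might look like the following. It assumes htmldoc is installed and on the PATH; the helper names and the file names in the usage note are made up for illustration. `--webpage` treats the input as an unstructured web page and `-f` names the output file.

```python
import subprocess

def htmldoc_command(html_path, pdf_path):
    """Build the htmldoc command line (hypothetical helper name)."""
    # --webpage: treat the input as a plain web page rather than a structured book
    # -f: write the output to the given file
    return ["htmldoc", "--webpage", "-f", pdf_path, html_path]

def convert_with_htmldoc(html_path, pdf_path):
    """Run htmldoc; raises CalledProcessError if it exits with an error."""
    subprocess.run(htmldoc_command(html_path, pdf_path), check=True)
```

Calling `convert_with_htmldoc("page.html", "page.pdf")` would then write the PDF next to the page, provided htmldoc accepts the input.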

Mystinar
29th October 2007, 08:38 PM
Okay, thanks.

FriedChips
29th October 2007, 09:29 PM
From Firefox, choose Print, then choose Print to File.

William Haller
30th October 2007, 04:06 PM
We've experimented with quite a few. All of them provide a basic level of functionality.

When dealing with images, the PDFs can get messy. We had the best luck writing a short Perl script to convert the HTML directly to LaTeX and then letting LaTeX produce high-quality PDFs from the result. Images can be difficult in LaTeX as well if you are trying to produce an inline effect.

Since our web pages followed a similar format, we could substitute specific LaTeX markup included in HTML comments in the web page when problem pictures were found, and pick up the originals rather than the thumbnails used for the web page at the same time. For our product brochures, this approach can generally take our web pages directly and produce a good-looking PDF that is of much higher quality than we got out of the htmldoc-type programs. Only an occasional page that has overlapping inline images needs some manual adjustment after the fact.
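
The comment-substitution idea could be sketched roughly as below. This is in Python rather than the Perl we used, and it is not our actual script: the `<!-- latex: ... -->` comment convention, the function name, and the file paths are all assumptions made up for the example. It replaces a marked comment plus the image tag that follows it with the LaTeX stored in the comment.

```python
import re

# Hypothetical convention: a comment of the form <!-- latex: ... --> placed
# directly before an <img> tag carries the LaTeX to use in place of that image.
OVERRIDE = re.compile(
    r"<!--\s*latex:((?:(?!-->).)*)-->\s*<img\b[^>]*>",
    re.DOTALL | re.IGNORECASE,
)

def apply_latex_overrides(html):
    """Swap each marked comment plus its <img> tag for the comment's LaTeX."""
    # A function is used as the replacement so backslashes in the LaTeX
    # are inserted literally instead of being read as regex escapes.
    return OVERRIDE.sub(lambda m: m.group(1).strip(), html)
```

An override comment could then point at the original picture instead of the thumbnail, for example `<!-- latex: \includegraphics{orig/widget.jpg} -->` placed in front of `<img src="thumb/widget.jpg">`.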

We compare last-modification times to see whether the source HTML file or any of its pictures has changed since the PDF was saved, and then either create a new PDF or serve the last one created when it is requested.
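
A minimal sketch of that modification-time check, again in Python and not our production code, might look like this. The function names are made up, and `rebuild` stands in for whatever converter is actually used.

```python
from pathlib import Path

def pdf_is_stale(pdf_path, source_paths):
    """Return True if the PDF is missing or older than any source file."""
    pdf = Path(pdf_path)
    if not pdf.exists():
        return True
    pdf_mtime = pdf.stat().st_mtime
    # Sources are the HTML file plus any pictures it pulls in.
    return any(Path(src).stat().st_mtime > pdf_mtime for src in source_paths)

def get_pdf(pdf_path, source_paths, rebuild):
    """Regenerate the PDF only when needed, then return its path."""
    if pdf_is_stale(pdf_path, source_paths):
        # rebuild is a caller-supplied function: rebuild(source_paths, pdf_path)
        rebuild(source_paths, pdf_path)
    return pdf_path
```

The cached PDF is served untouched until one of the sources gets a newer timestamp, so an unchanged page costs only a few stat calls.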

pobbz
30th October 2007, 04:16 PM
Just to add one: cups-pdf. It essentially adds a PDF printer to your CUPS setup.