
Convert HTML to PDF with HTML2PDF Web Service

Recently I launched my new product, HTML2PDF Web Service, a web service for converting HTML to PDF.

In this post I’d like to talk about HTML2PDF Web Service: why you should choose it, how to use it and what technologies were used to create it.

Why Choose HTML2PDF Web Service?

Programmatically generating PDF documents is a painful and time-consuming problem that makes neither your developers nor your designers happy. With HTML2PDF Web Service you can design your invoices or reports in HTML, style them with CSS and convert the resulting page into a PDF document. Using HTML2PDF Web Service saves your developers and designers time, which is better spent improving your product.

Say your web application or mobile app (or any application for that matter) needs to generate invoices or reports in PDF format. Unless you can install special HTML-to-PDF conversion software, you’re probably stuck with one of the libraries available for your language that can programmatically generate PDF documents. To do this you would probably design your document in something like MS Word, LibreOffice Writer or perhaps HTML. After this design has been approved you can start programming your PDF module: setting up coordinates, font sizes and so on. And then all of a sudden you notice your library has limited support for doing actual document layouts and for presenting tabular data that can span multiple lines. Now you need to write your own routines for splitting text over multiple lines, keeping track of coordinates and making sure nothing overlaps. If, like me, you’ve already been there, you know it’s quite the nightmare.

So being able to design in HTML, style with CSS (heck, even use a bit of JavaScript) and convert the resulting page to PDF would speed up this process a lot. Am I starting to pique your interest?

How to use HTML2PDF Web Service

Simply create your soon-to-be PDF documents in HTML, style them with CSS and, if you like, use JavaScript as well. The final document is best previewed in a WebKit-based browser such as Google Chrome, since that’s the technology HTML2PDF Web Service uses in the background to render the HTML and convert it to PDF.

Here are some examples of how to call the web service. Converting HTML to PDF is easy with the HTML2PDF Web Service: you can either pass the URL of the page you want to convert or send the HTML code itself with the request.

cURL

$ curl -H "X-API-Key: F8802062-4D31-11E3-8F59-BFD4058B6BFF" \
       -H "X-API-Username: MyUsername" \
       -d '{"content":"<html><head><title>My page</title></head><body><h1>Hello World!</h1><p>I am an HTML page converted to PDF!</p></body></html>"}' \
       https://html2pdfwebservice.com/api/convert > page.pdf

Perl

#!/usr/bin/env perl
use strict;
use warnings;
use Mojo::UserAgent;

my $ua = Mojo::UserAgent->new;
my $tx = $ua->post(
    'https://html2pdfwebservice.com/api/convert' => {
        'X-API-Username' => 'MyUsername',
        'X-API-Key'      => 'F8802062-4D31-11E3-8F59-BFD4058B6BFF'
    } => json => {url => 'http://domain.com/invoice.html'}
);
# result() dies on connection errors; is_success checks for a 2xx status
my $res = $tx->result;
if ($res->is_success) {
    my $pdf_data = $res->body;
}

Ruby

require 'net/https'
require 'uri'

uri           = URI.parse('https://html2pdfwebservice.com/api/convert')
https         = Net::HTTP.new(uri.host, uri.port)
https.use_ssl = true
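# Last resort in case the SSL certificate isn't accepted: this disables
# certificate verification, so prefer updating your CA certificates instead
https.verify_mode = OpenSSL::SSL::VERIFY_NONE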

req = Net::HTTP::Post.new(uri.path)
req['X-API-Username'] = 'MyUsername'
req['X-API-Key']      = 'F8802062-4D31-11E3-8F59-BFD4058B6BFF'
req.body              = '{"url": "http://domain.com/invoice.html"}'

res = https.request(req)
if res.code == '200'
    pdf_data = res.body
    # - or write to file -
    # File.open('invoice.pdf', 'wb') { |file| file.write(res.body) }
end

PHP

$settings = array(
    'url' => 'http://domain.com/invoice.html',
);

$curl = curl_init();
curl_setopt($curl, CURLOPT_POST, 1);
curl_setopt($curl, CURLOPT_POSTFIELDS, json_encode($settings));
curl_setopt($curl, CURLOPT_HTTPHEADER, array(
    'X-API-Username: MyUsername',
    'X-API-Key: F8802062-4D31-11E3-8F59-BFD4058B6BFF'
));

curl_setopt($curl, CURLOPT_URL, 'https://html2pdfwebservice.com/api/convert');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
// Helps to debug in case of issues
// curl_setopt($curl, CURLOPT_VERBOSE, 1);

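// Last resort in case the SSL certificate isn't accepted because of outdated
// certificates on your server: this disables certificate verification, so
// prefer updating your CA certificates instead
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);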

$res = curl_exec($curl);
// Save PDF to disk
file_put_contents('document.pdf', $res);
curl_close($curl);
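
The examples above send a JSON body with either a `url` key or a `content` key. As a minimal sketch of wrapping that choice in one place, the Ruby helper below builds the request for either case and keeps certificate verification switched on. The helper names (`build_convert_request`, `convert_to_pdf`) are made up for this illustration, and treating any status other than 200 as a failure is an assumption carried over from the Ruby example, since how the service reports errors isn't described here.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Endpoint and header names are taken from the examples above.
API_ENDPOINT = URI('https://html2pdfwebservice.com/api/convert')

# Build a POST request for either a page URL or raw HTML.
# Exactly one of `url:` or `content:` must be given (hypothetical helper).
def build_convert_request(username:, api_key:, url: nil, content: nil)
  if url.nil? == content.nil?
    raise ArgumentError, 'pass exactly one of url: or content:'
  end

  payload = url ? { 'url' => url } : { 'content' => content }

  req = Net::HTTP::Post.new(API_ENDPOINT.path)
  req['X-API-Username'] = username
  req['X-API-Key']      = api_key
  req.body              = JSON.generate(payload)
  req
end

# Send the request and return the PDF bytes.
# Assumption: anything other than HTTP 200 means the conversion failed.
def convert_to_pdf(**options)
  req = build_convert_request(**options)
  res = Net::HTTP.start(API_ENDPOINT.host, API_ENDPOINT.port, use_ssl: true) do |http|
    http.request(req)
  end
  raise "conversion failed with HTTP status #{res.code}" unless res.code == '200'

  res.body
end

# Usage (needs a valid account, so it is left commented out):
# pdf = convert_to_pdf(username: 'MyUsername', api_key: 'YOUR-API-KEY',
#                      url: 'http://domain.com/invoice.html')
# File.binwrite('invoice.pdf', pdf)
```

Generating the body with `JSON.generate` rather than writing the JSON by hand means quotes inside the HTML are escaped for you, which matters as soon as the `content` you send contains attributes.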

Technologies used to develop HTML2PDF Web Service

The most interesting part of developing HTML2PDF Web Service was choosing which technology to use for converting HTML to PDF. After doing research on the subject and testing several solutions I eventually went with a WebKit-based solution. Because the service renders with WebKit, end users can reliably preview their documents in any WebKit-based browser.

The HTML to PDF conversion server was developed using Go. Go is a fun language to program with, does concurrency in a really nice way and can produce a native executable for Linux, OS X, Windows and some other platforms. Thanks to Go the conversion server is fast, snappy and low on memory and CPU usage. Being able to create a binary executable allows me to sell the conversion server as a standalone product as well.

To get access to the web service there’s also a web application, which is written in Perl. Mojolicious has been my web framework of choice for quite some time now, so HTML2PDF Web Service was written with it. DBIx::Class is used for database interaction and Validation::Class is used to validate all user-supplied data.

The databases used are PostgreSQL and Redis. The former stores user accounts, subscriptions and more. The latter keeps track of token usage per user.

Sign up now for a free trial

If you’re still reading after all this, please do sign up for a free trial. The trial gives full access to all the features of the web service, so if you like it, please consider buying a subscription.

In case of any questions, please do contact me either through the comments on this page or by e-mail at support@html2pdfwebservice.com.

Help beta test an HTML2PDF Web Service

In an earlier post I asked if anyone would be interested in helping me test a web service for converting HTML to PDF. Today I’m opening up the beta to anyone who’s interested.

Please visit https://html2pdfwebservice.com/ and sign up for a 7-day trial account. No credit card required! Trial length can be extended upon request.

Converting HTML to PDF is easy with the HTML2PDF Web Service. Here are some examples:

cURL

$ curl -H "X-API-Key: F8802062-4D31-11E3-8F59-BFD4058B6BFF" \
       -H "X-API-Username: MyUsername" \
       -d '{"content":"<html><head><title>My page</title></head><body><h1>Hello World!</h1><p>I am an HTML page converted to PDF!</p></body></html>"}' \
       https://html2pdfwebservice.com/api/convert > page.pdf

Perl

#!/usr/bin/env perl
use strict;
use warnings;
use Mojo::UserAgent;

my $ua = Mojo::UserAgent->new;
my $tx = $ua->post(
    'https://html2pdfwebservice.com/api/convert' => {
        'X-API-Username' => 'MyUsername',
        'X-API-Key'      => 'F8802062-4D31-11E3-8F59-BFD4058B6BFF'
    } => json => {url => 'http://domain.com/invoice.html'}
);
# result() dies on connection errors; is_success checks for a 2xx status
my $res = $tx->result;
if ($res->is_success) {
    my $pdf_data = $res->body;
}

Ruby

require 'net/https'
require 'uri'

uri           = URI.parse('https://html2pdfwebservice.com/api/convert')
https         = Net::HTTP.new(uri.host, uri.port)
https.use_ssl = true
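# Last resort in case the SSL certificate isn't accepted: this disables
# certificate verification, so prefer updating your CA certificates instead
https.verify_mode = OpenSSL::SSL::VERIFY_NONE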

req = Net::HTTP::Post.new(uri.path)
req['X-API-Username'] = 'MyUsername'
req['X-API-Key']      = 'F8802062-4D31-11E3-8F59-BFD4058B6BFF'
req.body              = '{"url": "http://domain.com/invoice.html"}'

res = https.request(req)
if res.code == '200'
    pdf_data = res.body
    # - or write to file -
    # File.open('invoice.pdf', 'wb') { |file| file.write(res.body) }
end

PS: Prices are subject to change. During the beta you can’t use your own credit card for payments since we’re still running in sandbox mode. All data will be wiped after the beta ends. Expected launch date will be some time in January 2014.

Google Tag Manager

Today I learned of the existence of Google Tag Manager. With Google Tag Manager it becomes easier to place tags of any kind, such as Google Analytics or any other tracking tag, on your website and even on specific pages. All you’ve got to do is include the Google Tag Manager JavaScript, set up your tags and you’re good to go.

I’ve added it to this website as well and it’s now managing the placement of Google Analytics for me.

It can of course be used for more than only Google Analytics. You could place tags to measure how many people have reached your sales page and have actually made a purchase.

You can always add these codes yourself, but the added benefit of using Google Tag Manager is that it’s dynamic. No need to edit your website’s templates. Just set up a tag, define on which page(s) it must be shown and you’re good to go. It’s a nice and simple concept. Free as well.

I’m now a freelance developer

To my surprise I hadn’t even announced here that I have recently started freelancing! As of August the 1st I operate under the name Kras IT and am available as a freelance developer. Since it’s now the 1st of September, I’ve already been in business for a month, which has been exciting. I was with my previous employer for almost 7 years, and after a lot of thinking and planning I decided to take the leap!

As a freelance developer you can hire me for all your PHP and Perl work. I mainly do web development but I do a lot of backend work as well. Aside from that I also enjoy configuring Linux servers. I personally think being able to configure and optimize both your app and the server(s) the app is running on can be a great asset, as it gives a lot more insight into the overall workings of your software.

Aside from doing freelance work I also plan on doing product development. I’ve got a few ideas in the pipeline of which one I expect to launch within 5 months. I’m likely to blog about this in the near future as well as about freelancing and running your own business.

For a full list of skills you can take a look at my website at Kras IT. Currently the website is still Dutch only, but my LinkedIn and résumé are in English. Do you have any questions, or are you in need of a freelance developer? Feel free to contact me!

Create a PDF document out of an HTML page

Perl has several modules on CPAN for creating and manipulating PDF files. Just a single search on PDF results in over 500 modules that have something to do with PDF files.

The most useful (or rather essential for PDF processing) are PDF::API2 and CAM::PDF. The former lends itself best to creating PDFs and the latter to manipulating existing PDFs and extracting data (such as plain text) from them.

Though these modules make handling PDFs easier, handling PDFs still isn’t much fun. I needed a way to generate PDFs out of work orders (or job tickets) and didn’t feel much like creating the layout and formatting paragraphs by hand with PDF::API2, so I started to look further.

I ended up trying out PDF::FromHTML. With PDF::FromHTML you can create a simple HTML layout and let the module create a PDF out of it. You can do some basic configuration such as changing fonts and font sizes (check out its documentation for more). It also provides a nifty command line tool called html2pdf.pl for converting an HTML page to a PDF.

The resulting PDFs from PDF::FromHTML weren’t as pretty as I had wanted, but good enough for the problem I needed solving. But after I started using these work order PDFs in practice I found I needed more formatting freedom when writing the problem description. So I decided to add Markdown support through Text::Markdown.

Using Markdown I had added a list of tasks to a work order, with the items in bold text and the descriptions underneath them in normal text. Sadly the PDFs created by PDF::FromHTML didn’t cope very well with nested HTML elements. A bold paragraph would somehow cause the next paragraph to become bold as well. I think that’s a bug in PDF::FromHTML and I’m sure it can be fixed, and shame on me for not looking into it.

So instead of seeing if I could fix the bug I did a quick search on the internet and stumbled upon xhtml2pdf, which is provided by python-pisa/xhtml2pdf. Pisa is a Python library for converting HTML pages to PDFs. It’s far more sophisticated than PDF::FromHTML as it supports more (all?) HTML tags and even CSS2 (plus some CSS3 stuff) for styling.

Currently my webapp uses xhtml2pdf if it’s available, or else falls back to PDF::FromHTML.

Some other interesting Perl PDF modules worth looking into some day are PDF::Boxer and PDF::TextBlock. And while writing this post I also found out that PhantomJS, a headless WebKit, also has a way of saving a page to PDF. So even though handling PDFs still isn’t a lot of fun, with all these modules and software available it has become a lot easier.
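
How the webapp performs that check isn’t shown here, so what follows is only a minimal sketch of the idea. It is written in Ruby rather than Perl, and it assumes that "available" simply means an executable named `xhtml2pdf` can be found on the `PATH`. The function names and the returned symbols are placeholders invented for this illustration.

```ruby
# Return the full path of `name` if it is an executable file in one of the
# directories listed in `path`, or nil otherwise (hypothetical helper).
def find_executable(name, path = ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?

    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Prefer xhtml2pdf when it is installed, otherwise use the fallback.
# The symbols are placeholders for however the application dispatches to
# the real converter.
def choose_converter(path = ENV.fetch('PATH', ''))
  find_executable('xhtml2pdf', path) ? :xhtml2pdf : :pdf_fromhtml
end
```

Doing the lookup at request time rather than once at startup means the app picks up xhtml2pdf as soon as it is installed, at the cost of a few file system checks per conversion.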

Want to use a web service to convert HTML to PDF? Then take a look at HTML2PDF Web Service.