This project forked from brad-jones/chrome-print

License: MIT License

Google Chrome Print

A set of docker containers that automate Google Chrome using XVFB, Xdotool, Visgrep and other pieces of tech to convert HTML documents into PDF documents.

THIS IS AN EXPERIMENT, DO NOT USE IN PRODUCTION

Why?

I have another project, Gears\Pdf, that uses phantomjs to convert HTML to PDF. This is what I use in production, but layout issues are hard to debug because phantomjs does not use the same rendering engine as Chrome; it uses a much older version of WebKit.

I have also used wkhtmltopdf with varying results. While it does use a patched version of WebKit that supports a few extra things, like custom fonts, it still suffers from the same problem.

I can't develop the HTML/CSS using a standard browser. I can get close using a normal browser, but I then need to spend time generating many PDFs to fix and tweak layout bugs.

Then one day I read an article about controlling GUI apps with XVFB and XDOTOOL. I was also just starting to play with docker. My thinking was that if we could use the latest version of Chrome to generate the PDF, then I could use my workstation instance of Chrome in my document development workflow, just as I would for any other web page.

How?

First there is the main RoboFile.php, which is used in conjunction with another docker project of mine called conductor.

This provides the "glue" between all the docker containers.

  • storage: The first container is the storage container. This provides several mount points that get shared to all other containers.

  • nginx: This runs an instance of nginx which serves files from /var/www/html, which is shared via the storage container.

  • php-fpm: We run the PHP FastCGI Process Manager in this container. It is configured to communicate with nginx via a unix socket that is shared via the storage container: /var/run/php-fpm.sock

  • xvfb: The container that houses Google Chrome running inside a virtual frame buffer set up by xvfb-run. The xvfb-pool container will spawn new instances of this container as needed.

  • xvfb-pool: This is just a PHP script that enters a never-ending loop. It manages the xvfb pool files located in /var/run/xvfb-pool, again shared via the storage container. The PHP REST API looks in here for instances of the xvfb container that are booted and ready for use. The pool manager script automatically creates new xvfb containers and removes expired ones, so a REST request never has to wait for a container to boot. This makes the REST requests as fast as possible.
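The pool manager's job, as described above, can be sketched roughly as follows. This is a minimal Python illustration, not the project's actual PHP script: the `Pool` class, `target_size`, `max_age`, and the `xvfb-N` naming are all assumptions made for the example, and an in-memory dict stands in for the pool files in /var/run/xvfb-pool.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    """In-memory stand-in for the pool files kept in /var/run/xvfb-pool."""
    target_size: int   # hypothetical: how many ready instances to keep around
    max_age: float     # hypothetical: seconds before an instance counts as expired
    instances: dict = field(default_factory=dict)  # instance name -> creation time
    counter: int = 0

    def reconcile(self, now: float) -> None:
        """One pass of the manager loop: drop expired instances, then top up."""
        expired = [name for name, born in self.instances.items()
                   if now - born >= self.max_age]
        for name in expired:
            # A real manager would stop and remove the container here.
            del self.instances[name]
        while len(self.instances) < self.target_size:
            # A real manager would start a new xvfb container here.
            self.counter += 1
            self.instances[f"xvfb-{self.counter}"] = now

    def acquire(self):
        """What the REST API side might do: claim a ready instance, or None."""
        if not self.instances:
            return None
        name = next(iter(self.instances))  # oldest entry (dicts keep insertion order)
        del self.instances[name]
        return name
```

A never-ending loop would simply call `reconcile` with the current time and then sleep for a short interval. Keeping the reconcile step separate from the loop makes it possible to exercise the logic without waiting on a real clock.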

Most of the actual logic for controlling Google Chrome is contained in the PHP REST API. There is a class, XdoTool.php, that is basically a wrapper for xdotool, a command that sends keyboard and mouse events to an X server.
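To give a feel for what such a wrapper does, here is a small Python sketch. It is not a port of XdoTool.php: the method names, the `:99` display number, and the injectable `runner` are assumptions made for illustration. The only xdotool subcommands used are `key`, `type`, `mousemove` and `click`.

```python
import os
import subprocess


class XdoTool:
    """Hypothetical thin wrapper that builds and runs xdotool command lines."""

    def __init__(self, display=":99", runner=subprocess.run):
        self.display = display  # assumed X display of the Xvfb server
        self.runner = runner    # injectable so the sketch can run without an X server

    def _run(self, *args):
        argv = ["xdotool", *map(str, args)]
        # xdotool picks its target X server from the DISPLAY environment variable.
        env = {**os.environ, "DISPLAY": self.display}
        self.runner(argv, env=env, check=True)
        return argv

    def key(self, combo):
        """Send a key combination, e.g. 'ctrl+p' to open the print dialog."""
        return self._run("key", combo)

    def type_text(self, text):
        """Type a string as if entered on the keyboard."""
        return self._run("type", text)

    def click_at(self, x, y, button=1):
        """Move the pointer to (x, y) and click; xdotool allows chaining commands."""
        return self._run("mousemove", x, y, "click", button)
```

Printing a page would then be a sequence of calls such as `key("ctrl+p")` followed by clicks and typed text, with each step relying on the previous one having taken effect on screen.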

The Reality

The reality is that this is simply too slow and error prone. If anything, this project taught me more about docker containers than anything else.

I had it running beautifully on my workstation, then tried to deploy it to an Amazon EC2 instance and it just refused to work. I'm sure that if I had upgraded the EC2 instance with some more RAM, and maybe an extra core or two, it would have worked, but it wasn't my instance to upgrade.

The issues stem from the fact that, because we are dealing with a GUI application, the timing of key strokes and mouse clicks is absolutely critical. If a button has not yet been drawn to the frame buffer, we can't click it, and the failure then just snowballs.

I tried to mitigate this where possible by using visgrep to look for the expected element on screen before acting on it.
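The general shape of that mitigation is a poll-with-timeout loop: keep checking whether the element is visible and only click once it is. The Python sketch below shows the idea under stated assumptions. `wait_until` and its parameters are hypothetical names, and the `is_ready` callable stands in for whatever check is used (for example, taking a screenshot and running visgrep against it); it is not the project's actual code.

```python
import time


def wait_until(is_ready, timeout=5.0, interval=0.1,
               clock=time.monotonic, sleep=time.sleep):
    """Poll is_ready() until it returns a truthy value or the timeout elapses.

    Returns whatever is_ready() returned (e.g. the coordinates of a match),
    or raises TimeoutError if nothing showed up in time.
    """
    deadline = clock() + timeout
    while True:
        result = is_ready()
        if result:
            return result
        if clock() >= deadline:
            raise TimeoutError("element was not drawn in time")
        sleep(interval)
```

Failing loudly on timeout, rather than clicking blindly, stops one missed element from snowballing into a series of misdirected clicks. The cost is that every step adds polling latency, which contributes to the overall slowness.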

Not to mention that it took on the order of 5 seconds to actually print a PDF, compared to less than 1 second when using phantomjs.

If selenium ever adds an option to print pages, then this might be worth looking at again.
