
kartik1998 / pdf-images


The library aims to simplify PDF conversion by providing wrappers over Poppler (pdfimages) and ImageMagick to convert PDFs to images.

Home Page: https://www.npmjs.com/package/pdf-images

License: MIT License

TypeScript 100.00%
imagemagick poppler pdfimages pdf image pdf2image

pdf-images's Introduction

pdf-images

Simplify PDF conversion with built-in methods that use Poppler and ImageMagick to convert PDFs to images.


Note

Linux: ensure you have ImageMagick and pdfimages installed
macOS: ensure you have ImageMagick and Poppler installed
Windows: not supported
Poppler is very fast and returns results in milliseconds, but its accuracy is lower than ImageMagick's. If your PDF contains images, for example images of cards, then using Poppler is a good idea. If you have text-based PDFs, say ones converted from md files, then I would suggest using ImageMagick.

Usage: Poppler

const { Poppler } = require('pdf-images');
const result = Poppler.convert('/pdf/path/sample_pdf.pdf', 'output/directory/path', 'outputName'); // you can also add a 4th argument that specifies the output image extension, like jpg or jpeg
  • A successful result will look something like:
{
  pdfPath: '/pdf/path/sample_pdf.pdf',
  outputImagesDirectory: '/output/directory/outputName/',
  images: [
    '/output/directory/outputName/outputName-001.png',
    '/output/directory/outputName/outputName-002.png'
  ],
  success: true
}
  • An error response will look something like:
{
  pdfPath: '/pdf/path/sample_pdf.pdf',
  error: <Err object>
}
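
The README shows failures arriving as a returned object rather than as a thrown exception, so callers have to branch on the result themselves. A minimal sketch of such a check, assuming only the two result shapes shown above; `imagesOrThrow` is a hypothetical helper and not part of pdf-images:

```javascript
// Hypothetical helper (not part of pdf-images): returns the list of image
// paths from a result object, or throws if the result describes a failure.
function imagesOrThrow(result) {
  if (result && result.success === true && Array.isArray(result.images)) {
    return result.images;
  }
  // Prefer the error reported by the library; otherwise build a generic one.
  const cause = result && result.error;
  if (cause instanceof Error) {
    throw cause;
  }
  throw new Error(`Conversion failed for ${result && result.pdfPath}`);
}
```

It could be used as `const images = imagesOrThrow(Poppler.convert(pdfPath, outputDir, outputName));`, which keeps the error handling in one place.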

Usage: ImageMagick

Async API for conversion

  • By default, images have the png extension
  • You can also pass a string of args to run with the ImageMagick shell command. Check out resultWithArgs below
const { ImageMagick } = require('pdf-images');
const result = ImageMagick.convertAsync('/pdf/path/sample_pdf.pdf', 'output/directory/path', 'outputName');
const resultWithArgs = ImageMagick.convertAsync('/pdf/path/sample_pdf.pdf', 'output/directory/path', 'outputName', '-alpha background');
const resultWithDifferentExtension = ImageMagick.convertAsync(
  '/pdf/path/sample_pdf.pdf',
  'output/directory/path',
  'outputName',
  null,
  'jpeg',
);
  • A successful result will look something like:
{
  pdfPath: '/pdf/path/sample_pdf.pdf',
  outputImagesDirectory: '/output/directory/outputName/',
  commandExecuted: 'convert -quiet -alpha background -density 200 -quality 100 /pdf/path/sample_pdf.pdf /output/directory/outputName/outputName.jpeg',
  images: [
    '/output/directory/outputName/outputName-001.jpeg',
    '/output/directory/outputName/outputName-002.jpeg'
  ],
  success: true
}
const { ImageMagick } = require('pdf-images');
const result = ImageMagick.convert('/pdf/path/sample_pdf.pdf', 'output/directory/path', 'outputName'); // you can also add a 4th argument that specifies the output image extension, like jpg or jpeg
  • A successful result will look something like:
{
  pdfPath: '/pdf/path/sample_pdf.pdf',
  outputImagesDirectory: '/output/directory/outputName/',
  images: [
    '/output/directory/outputName/outputName-001.png',
    '/output/directory/outputName/outputName-002.png'
  ],
  success: true
}
  • An error response will look something like:
{
  pdfPath: '/pdf/path/sample_pdf.pdf',
  error: <Err object>
}
  • To set the density and quality of imagemagick use:
ImageMagick.setQuality(100);
ImageMagick.setDensity(200);
  • Default ImageMagick quality is 100 and density is 200


pdf-images's Issues

Add a generic parameter.

Hello,

Unfortunately, the jpg extension (issue #10) does not solve the transparency issue. It does remove transparency, but for one example I am working with, it turns the white background of the PDF black in the JPG.

I found this link https://www.imagemagick.org/discourse-server/viewtopic.php?t=31014 that gives a solution for PDFs. Indeed, the below command works better:
convert -density 300x300 -units pixelsperinch Test.pdf -background white -alpha background -alpha off -antialias -compress none +adjoin Test.png

During my tests I found that this one is OK as well:
convert -density 300 -quality 100 Test.pdf -alpha background -alpha off Test.png

What I would like you to do, if you agree, is to add a generic parameter where we can pass options like -alpha background -alpha off.
Obviously, it is up to the user to supply correct options here.

Florin

Add new parameter to convert function

I noticed that PNGs generated by convert have a transparent background when converting PDFs with a white background. This does not happen when using jpg, like this:
convert -quiet -density 200 Test.pdf -quality 200 Test.jpg

I would like you to add one more parameter with the file extension.

public static convert(pdfPath: string, outputImgDir: string, outputImgName: string, outputImgExtension: string): any {
.........
outputImgPath + '/' + outputImgName + outputImgExtension,

I attached a test file and the command lines to convert them:

convert -quiet -density 200 Test.pdf -quality 200 Test.jpg
convert -quiet -density 200 Test.pdf -quality 200 Test.png

Test.pdf


Invalid Parameter pdf-images

I installed the modules on localhost, but an error occurs when I try to process a PDF.
Here is the code:

exports.convertPdfToPng = async(req,res,next)=>{

    try{
        const result = ImageMagick.convert(pdfPath, baseDir, '/mainpng')
        if(result.success === true){
            return next();
        }
    }catch(err) {
        res.status(403).json({
            status:'failed conversion pdf->jpeg',
            message:err,
        })
    }
}

On the online server it works without problems and saves the image in the respective folder, but I also need to use this functionality locally, where the following error occurs.

Invalid Parameter - -quiet
Invalid Parameter - -density
Invalid Parameter - -JS

I think there is a difference between the environments: on the online server I probably have an ImageMagick toolkit (gm resources) that I don't have locally.
The error comes from image-magick.js, but I can't figure out what is causing it.

 static convert(pdfPath, outputImgDir, outputImgName) {
        const outputImgPath = path_1.default.join(outputImgDir, outputImgName);
        if (!fs_1.default.existsSync(outputImgPath)) {
            fs_1.default.mkdirSync(outputImgPath);
        }
        const infoObject = { pdfPath };
        try {
            execFileSync('convert', [
                '-quiet',
                '-density',
                this.density,
                pdfPath,
                '-quality',
                this.quality,
                outputImgPath + '/' + outputImgName + '.jpg',
            ]);
            infoObject.outputImagesDirectory = outputImgPath;
            infoObject.images = fs_1.default.readdirSync(outputImgPath).map((img) => outputImgPath + '/' + img);
            infoObject.success = true;
        }
        catch (err) {
            infoObject.error = err;
        }
        return infoObject;
    }
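
One possible explanation, offered as a guess rather than a diagnosis: on Windows, the name `convert` normally resolves to a built-in disk utility rather than ImageMagick, and that utility rejects unknown flags with "Invalid Parameter" messages like the ones above. The README also lists Windows as not supported. A minimal sketch of a pre-flight check, assuming ImageMagick's `convert -version` banner contains the word "ImageMagick"; both function names are hypothetical and not part of pdf-images:

```javascript
const { execFileSync } = require('child_process');

// Hypothetical helper: true when the text looks like the banner that
// ImageMagick prints for `convert -version`.
function looksLikeImageMagick(versionOutput) {
  return /ImageMagick/i.test(String(versionOutput));
}

// Hypothetical helper: runs `convert -version` and reports whether the
// `convert` found on PATH is ImageMagick. `run` is injectable for testing.
function convertIsImageMagick(run = execFileSync) {
  try {
    return looksLikeImageMagick(run('convert', ['-version'], { encoding: 'utf8' }));
  } catch (err) {
    // The binary is missing, or a different `convert` rejected the flag.
    return false;
  }
}
```

Calling `convertIsImageMagick()` before `ImageMagick.convert(...)` would at least distinguish "wrong binary on PATH" from a genuine conversion failure.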

Pdf paths with spaces

Hello,

The solution implemented with issue #16 is not OK.
I give you an example from my project.

I am running this: ImageMagick.convertAsync(sourceFilePath, conversionTargetFolder, outPrefix, null, "png")
where
sourceFilePath = "./conversion/1705769513380/source/65abf8486435e55c55736ea8/Test PDF with spaces.pdf"
conversionTargetFolder = "./conversion/1705769513380/target/"
outPrefix = "Test PDF with spaces"

This is the error:

Error: Command failed: convert -quiet -density 200 -quality 100 ./conversion/1705769513380/source/65abf8486435e55c55736ea8/Test\ PDF\ with\ spaces.pdf conversion/1705769513380/target/Test PDF with spaces/Test PDF with spaces.png
convert-im6.q16: unable to open image conversion/1705769513380/target/Test': No such file or directory @ error/blob.c/OpenBlob/2924. convert-im6.q16: no decode delegate for this image format ' @ error/constitute.c/ReadImage/575.
convert-im6.q16: unable to open image PDF': No such file or directory @ error/blob.c/OpenBlob/2924. convert-im6.q16: unable to open image PDF': No such file or directory @ error/blob.c/OpenBlob/2924.
convert-im6.q16: no decode delegate for this image format ' @ error/constitute.c/ReadImage/575. convert-im6.q16: unable to open image with': No such file or directory @ error/blob.c/OpenBlob/2924.
convert-im6.q16: unable to open image with': No such file or directory @ error/blob.c/OpenBlob/2924. convert-im6.q16: no decode delegate for this image format ' @ error/constitute.c/ReadImage/575.
convert-im6.q16: unable to open image spaces/Test': No such file or directory @ error/blob.c/OpenBlob/2924. convert-im6.q16: unable to open image spaces/Test': No such file or directory @ error/blob.c/OpenBlob/2924.
convert-im6.q16: no decode delegate for this image format ' @ error/constitute.c/ReadImage/575. convert-im6.q16: unable to open image PDF': No such file or directory @ error/blob.c/OpenBlob/2924.
convert-im6.q16: unable to open image PDF': No such file or directory @ error/blob.c/OpenBlob/2924. convert-im6.q16: no decode delegate for this image format ' @ error/constitute.c/ReadImage/575.
convert-im6.q16: unable to open image with': No such file or directory @ error/blob.c/OpenBlob/2924. convert-im6.q16: unable to open image with': No such file or directory @ error/blob.c/OpenBlob/2924.
convert-im6.q16: no decode delegate for this image format `' @ error/constitute.c/ReadImage/575.

at ChildProcess.exithandler (node:child_process:422:12)
at ChildProcess.emit (node:events:514:28)
at maybeClose (node:internal/child_process:1105:16)
at ChildProcess._handle.onexit (node:internal/child_process:305:5) {
code: 1,
killed: false,
signal: null,
cmd: 'convert -quiet -density 200 -quality 100 ./conversion/1705769513380/source/65abf8486435e55c55736ea8/Test\ PDF\ with\ spaces.pdf conversion/1705769513380/target/Test PDF with spaces/Test PDF with spaces.png'
}

In my opinion you should not use addBackslashForSpaces, instead you need to enclose the source and target between double quotes, like this:

const commandToBeExecuted = `convert -quiet ${args || ''} -density ${ImageMagick.density} -quality ${ ImageMagick.quality } "${pdfPath}" "${outputImgPath + '/' + outputImgName + '.' + imgExtension}"`
  .replace(/\s+/g, ' ')
  .trim();

That should produce something like this:
cmd: 'convert -quiet -density 200 -quality 100 "./conversion/1705769513380/source/65abf8486435e55c55736ea8/Test PDF with spaces.pdf" "conversion/1705769513380/target/Test PDF with spaces/Test PDF with spaces.png"'

Florin
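
As a complement to the quoting suggestion, here is a sketch of a way to sidestep shell quoting altogether: pass the arguments as an array to Node's `execFile`, which starts the program without a shell, so a path with spaces stays a single argument. The synchronous `convert` shown in an earlier issue already calls `execFileSync` this way. `buildConvertArgs` and `runConvert` are hypothetical names, the argument order simply mirrors the command in the error message, and taking the extra options as an array (rather than the library's string) is an assumption made for this sketch:

```javascript
const { execFile } = require('child_process');

// Hypothetical helper: builds the argument list for `convert` as an array,
// so each path stays one argument regardless of spaces or quotes in it.
function buildConvertArgs({ pdfPath, outputPath, density = 200, quality = 100, extraArgs = [] }) {
  return [
    '-quiet',
    ...extraArgs,
    '-density', String(density),
    '-quality', String(quality),
    pdfPath,
    outputPath,
  ];
}

// Hypothetical wrapper: execFile launches `convert` directly, without a shell,
// so no escaping of the paths is needed.
function runConvert(options, callback) {
  return execFile('convert', buildConvertArgs(options), callback);
}
```

With this shape, neither backslash escaping nor double quotes are required, and paths that themselves contain a double quote are handled as well.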

Windows support

Is it possible to add support for Windows as well?
