jtanadi / conveyor

Microservice to convert PDFs to image files and upload them to AWS S3

Home Page: http://raa-conveyor.herokuapp.com/

JavaScript 4.17% TypeScript 95.83%

conveyor's People

Contributors

jtanadi

conveyor's Issues

Improve queue

Queue API is a little awkward. Consider using an event-based approach.
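One way the event-based approach could look is sketched below, built on Node's EventEmitter. The `Task` shape, the `TaskQueue` class, and the event names are all hypothetical, since the issue does not show the current queue API. The failure event is named "failed" rather than "error" because an EventEmitter throws when "error" is emitted with no listener attached.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical task shape; the real queue's task type is not shown in the issue.
interface Task {
  id: string;
  run: () => void;
}

// A queue that reports what it is doing through events, so callers
// subscribe to outcomes instead of polling the queue.
class TaskQueue extends EventEmitter {
  private tasks: Task[] = [];

  enqueue(task: Task): void {
    this.tasks.push(task);
    this.emit("queued", task.id);
  }

  // Runs every queued task in FIFO order, emitting one event per outcome.
  drain(): void {
    let task = this.tasks.shift();
    while (task) {
      this.emit("start", task.id);
      try {
        task.run();
        this.emit("done", task.id);
      } catch (err) {
        this.emit("failed", task.id, err);
      }
      task = this.tasks.shift();
    }
    this.emit("drained");
  }
}
```

A caller would attach listeners such as `queue.on("done", id => ...)` before calling `drain()`. Real conversion tasks are asynchronous, so an actual implementation would likely await each task rather than run it synchronously as this sketch does.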

Fix error handling

  • Send error messages to client instead of throwing internally and only sending 500: Server Error
  • How to tell client about pingback error?
    • Can conveyor ensure pingback address is valid before processing files?
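A minimal sketch of checking the pingback address up front, assuming it arrives as a string in the request. `validatePingback` is a hypothetical helper, not part of conveyor, and it only checks that the address is a well-formed http(s) URL. It cannot tell whether the address is reachable.

```typescript
// Hypothetical helper: returns an error message for a bad pingback
// address, or null when the address is a well-formed http(s) URL.
function validatePingback(address: unknown): string | null {
  if (typeof address !== "string" || address.trim() === "") {
    return "pingback address is required";
  }
  let url: URL;
  try {
    // The URL constructor throws a TypeError on malformed input.
    url = new URL(address);
  } catch {
    return `pingback address is not a valid URL: ${address}`;
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return `pingback address must use http or https: ${address}`;
  }
  return null;
}
```

A request handler could call this before queuing anything and, when the result is not null, reply with a 400 status and the returned message instead of a bare 500.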

Some conversion processes take a long time

Based on our current task queue, we are only processing one PDF at a time, so files queued later seem like they're taking a long time.

Can we run spawn() concurrently?
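A sketch of one possible answer: run several tasks at once but cap how many are in flight. `runWithLimit` is a hypothetical helper. It assumes each conversion is wrapped in a function returning a promise (for example, one that resolves when the spawned child process exits), which is not shown here.

```typescript
// Runs tasks with at most `limit` in flight at once. Results are
// returned in the same order as the input tasks.
async function runWithLimit<T>(
  tasks: Array<() => Promise<T>>,
  limit: number
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;

  // Each worker repeatedly claims the next unclaimed task index.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

A sensible limit depends on the host. Ghostscript conversions are CPU-heavy, so a limit near the number of available cores is a reasonable starting guess. If any task rejects, the returned promise rejects as well, so callers that want per-file error reporting would catch inside each task.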

Check when messages are sent

For some reason, "Queuing task" seems to be the last message sent before files are processed and uploaded (i.e. scotty shows that message the longest). Check when these messages are sent, and that they are sent in the right order and at the right time.

Generalize API

Possibly:

/api/pdf2img/?out=png
/api/pdf2img/?out=jpg
etc.

This way, we can add other endpoints in the future, if we need to convert other file types:

/api/doc2pdf/
/api/xls2pdf/
etc.
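The routing idea above could be sketched as a small registry plus a parser. The `converters` table and `parseConversionUrl` are hypothetical, and the formats listed for each endpoint are illustrative only.

```typescript
// Hypothetical registry: endpoint name -> output formats it accepts.
// All entries are illustrative, taken from the examples in the issue.
const converters = new Map<string, string[]>([
  ["pdf2img", ["png", "jpg"]],
  ["doc2pdf", ["pdf"]],
  ["xls2pdf", ["pdf"]],
]);

interface ConversionRequest {
  converter: string;
  out: string;
}

// Parses paths such as "/api/pdf2img/?out=png". Returns null when the
// endpoint or the requested output format is unknown.
function parseConversionUrl(pathAndQuery: string): ConversionRequest | null {
  // The base is only there so a relative path can be parsed.
  const url = new URL(pathAndQuery, "http://localhost");
  const match = /^\/api\/([a-z0-9]+)\/?$/.exec(url.pathname);
  if (!match) return null;

  const converter = match[1];
  const formats = converters.get(converter);
  if (!formats) return null;

  // Fall back to the converter's first format when ?out= is omitted.
  const out = url.searchParams.get("out") ?? formats[0];
  return formats.includes(out) ? { converter, out } : null;
}
```

With this layout, adding a new conversion means adding one registry entry and its handler, rather than a new hand-written route.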

PNG optimization is a little bit slow

We're currently not setting an optimization level for optipng, and it may be testing different levels to find a best fit. Perhaps it'll work faster when we set a (low) level?
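A small sketch of pinning the level explicitly. optipng takes its optimization level through the `-o` option, with levels from 0 to 7, and lower levels try fewer compression settings. `optipngArgs` is a hypothetical helper, and the default of 1 is only a starting guess that would need measuring against real files.

```typescript
// Builds the argument list for an optipng run at a fixed level.
// optipng accepts -o0 through -o7; lower levels do less work.
function optipngArgs(file: string, level: number = 1): string[] {
  if (!Number.isInteger(level) || level < 0 || level > 7) {
    throw new RangeError(`optipng level must be an integer 0-7, got ${level}`);
  }
  return [`-o${level}`, file];
}
```

The result can be passed straight to a process launcher, for example `spawn("optipng", optipngArgs("page-1.png"))`, where the file name is a placeholder.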

AWS Lambda?

Look into turning conveyor into an AWS Lambda function.

Send progress updates to pingback address

  • Currently only sending pingback when process is completed, but it might be good to send updates to pingback address at major milestones
  • Milestones: start, progress: converting file, progress: uploading, end, error

Ideas for pingback message structure, somewhat following node's events (req.on("end"), req.on("error"), etc.):

// For anything in progress, like sending logs
{
  status: "progress",
  message: "Converting PDF with GhostScript"
}

// When process is finally completed, `message`
// will contain relevant data
{
  status: "end",
  message: {
    // contents here (roomID, s3Dir, etc.)
  }
}

// On error
{
  status: "error",
  message: "Network error"
}
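The three message shapes above map naturally onto a TypeScript discriminated union. This is a sketch only: the `EndPayload` fields are an assumption based on the "roomID, s3Dir, etc." comment, and `describe` is a hypothetical consumer.

```typescript
// The final payload is only described as "roomID, s3Dir, etc." in the
// issue, so the exact fields and their types here are assumptions.
interface EndPayload {
  roomID: string;
  s3Dir: string;
}

type PingbackMessage =
  | { status: "progress"; message: string }
  | { status: "end"; message: EndPayload }
  | { status: "error"; message: string };

// Switching on `status` lets the compiler work out which `message`
// type applies in each branch.
function describe(msg: PingbackMessage): string {
  switch (msg.status) {
    case "progress":
      return `in progress: ${msg.message}`;
    case "end":
      return `finished: ${msg.message.s3Dir}`;
    case "error":
      return `failed: ${msg.message}`;
  }
}
```

Because `status` is the discriminant, a receiver cannot read `message.s3Dir` on a progress or error message without a compile error.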

Some images should be scaled down

Currently gs is outputting at 150dpi resolution:

  • Probably excessive for our use case
  • Full-scale panels come out very large (e.g. files that are 10–20+ MB)

Is there a way to find out PDF size before converting and resizing as necessary?
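If the page size can be read first, the resolution can be chosen per file. PDF page dimensions are given in points (72 per inch by default), so rendered pixels equal points / 72 × dpi. The sketch below picks a dpi that keeps the longest side under a pixel budget. `chooseDpi` and the 4000-pixel budget in the usage note are hypothetical. One possible way to get the page size is the `pdfinfo` tool from Poppler, which reports it in points, but that would be an extra dependency.

```typescript
const POINTS_PER_INCH = 72; // default size of a PDF user-space unit

// Picks a resolution so the longest side of the page renders to at
// most `maxPixels`, without ever exceeding `baseDpi`.
function chooseDpi(
  widthPt: number,
  heightPt: number,
  maxPixels: number,
  baseDpi: number = 150
): number {
  const longestInches = Math.max(widthPt, heightPt) / POINTS_PER_INCH;
  const cappedDpi = Math.floor(maxPixels / longestInches);
  return Math.max(1, Math.min(baseDpi, cappedDpi));
}
```

For a US Letter page (612 × 792 pt) and a 4000-pixel budget, this keeps the full 150 dpi. For a 36 × 48 inch panel (2592 × 3456 pt) it drops to 83 dpi, which would be passed to Ghostscript as `-r83`.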

Use restana

Use restana instead of express, since we don't really need all the features of express.
