micahstubbs / screenshot-service
services to create a screenshot of a web page. optimized for screenshotting interactive data graphics.
spin up a server that runs a headless browser to take screenshots
(and perhaps generate pdfs as well)
add init.d script to run screenshot-bot service on startup
https://askubuntu.com/questions/9382/how-can-i-configure-a-service-to-run-at-startup
a follow up from #9
docs https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagepdfoptions
speed up screenshot serving from cache
perhaps use an in-memory store like redis to speed up access to cached screenshots?
current process is to make two API calls to GCP
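Before reaching for redis, the in-memory idea can be pictured as a small in-process LRU cache that is checked before any call to GCP. This is a minimal sketch under assumptions: `ScreenshotCache`, `maxEntries`, `getScreenshot`, and `fetchFromStore` are hypothetical names, not part of this repo, and the cache key is assumed to be something like the page URL.

```javascript
// Minimal in-process LRU cache for screenshot buffers (hypothetical helper).
// A Map iterates in insertion order, so its first key is the least recently used.
class ScreenshotCache {
  constructor(maxEntries = 100) {
    this.maxEntries = maxEntries
    this.entries = new Map()
  }

  get(key) {
    if (!this.entries.has(key)) return undefined
    const value = this.entries.get(key)
    // re-insert so this key becomes the most recently used
    this.entries.delete(key)
    this.entries.set(key, value)
    return value
  }

  set(key, value) {
    if (this.entries.has(key)) this.entries.delete(key)
    this.entries.set(key, value)
    if (this.entries.size > this.maxEntries) {
      // evict the least recently used entry
      const oldestKey = this.entries.keys().next().value
      this.entries.delete(oldestKey)
    }
  }
}

// Look in memory first and only fall back to the slower store on a miss.
// `fetchFromStore` stands in for whatever currently makes the GCP calls.
async function getScreenshot(cache, key, fetchFromStore) {
  const cached = cache.get(key)
  if (cached !== undefined) return cached
  const buffer = await fetchFromStore(key)
  cache.set(key, buffer)
  return buffer
}
```

An in-process cache like this disappears on restart and is not shared between server instances; redis would cover both cases at the cost of running another service.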
screenshotmachine wrapper
custom filename param
feature idea:
support screenshot size presets like
preview 960px by 500px
thumbnail 230px by 120px
as well as caller-specified screenshot dimensions.
could fix the crop to originate at the top-left 0,0 point, or could support an x,y translate param to set the origin point for the crop
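A rough sketch of how presets, caller-specified dimensions, and an optional crop origin could be resolved into one rectangle. The result has the `{x, y, width, height}` shape that puppeteer's `page.screenshot({clip})` option accepts. The parameter names (`preset`, `width`, `height`, `x`, `y`) and the helper `resolveClip` are assumptions for illustration only.

```javascript
// Hypothetical preset table, using the sizes named above.
const PRESETS = {
  preview: { width: 960, height: 500 },
  thumbnail: { width: 230, height: 120 }
}

// Turn request params into a clip rectangle for page.screenshot({ clip }).
// Explicit width/height win over the preset; the origin defaults to 0,0.
function resolveClip(params = {}) {
  const hasPreset = Object.prototype.hasOwnProperty.call(PRESETS, params.preset)
  const preset = hasPreset ? PRESETS[params.preset] : {}
  const width = Number(params.width) || preset.width
  const height = Number(params.height) || preset.height
  if (!width || !height) {
    throw new Error('specify a known preset or both width and height')
  }
  return {
    x: Number(params.x) || 0,
    y: Number(params.y) || 0,
    width,
    height
  }
}
```

Note that a clip crops the rendered page rather than scaling it down, so a `thumbnail` clip shows only a 230px by 120px corner of the page. Producing a scaled-down thumbnail would need a separate resize step after the screenshot.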
per @curran #9 (comment)
Some snippets that may be useful (from the thumbnail generation of datavis.tech):
// render a document's sandbox HTML and store the result as a base64 thumbnail
const setThumbnail = require('../../db/actions/setThumbnail')
const generateThumbnailBuffer = require('./generateThumbnailBuffer')
module.exports = async (browser, sandbox, shareDbDoc) => {
  const html = await sandbox({id: shareDbDoc.id})
  const page = await browser.newPage()
  const thumbnailBuffer = await generateThumbnailBuffer(page, html)
  const thumbnail = thumbnailBuffer.toString('base64')
  setThumbnail(shareDbDoc, thumbnail)
}

// screenshot the page at 960x500, then shrink the image to 230x120
const sharp = require('sharp')
module.exports = async function generateThumbnailBuffer(page, html) {
  await page.setViewport({width: 960, height: 500})
  await page.setContent(html)
  await page.waitFor(5000) // give the page 5 seconds to render
  const buffer = await page.screenshot()
  await page.close()
  return sharp(buffer)
    .resize(230, 120)
    .toBuffer()
}

// launch headless chrome without the sandbox (run this inside an async function)
const puppeteer = require('puppeteer')
const browser = await puppeteer.launch({args: ['--no-sandbox']})
pages like these:
https://bl.ocks.org/isaacs/raw/1636859/
https://bl.ocks.org/GerHobbelt/raw/3192376/
need to read up on the best ways to handle errors like this when using puppeteer to control headless chrome
https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagepdfoptions
pages param for pdfs
let's reuse/directly expose the pageRanges param from puppeteer:
https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagepdfoptions
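One possible shape for this, as a sketch: take a `pages` query param, give it a light sanity check, and hand it to puppeteer's `pageRanges` option of `page.pdf()`. The param name `pages` and the helper `buildPdfOptions` are assumptions, not existing code in this repo.

```javascript
// Build the options object for puppeteer's page.pdf() from request params.
// `pages` is the assumed name of this service's query parameter; its trimmed
// value is passed through as puppeteer's `pageRanges` option.
function buildPdfOptions(query = {}) {
  const options = {}
  if (typeof query.pages === 'string' && query.pages.trim() !== '') {
    const pages = query.pages.trim()
    // accept only digits, commas, hyphens, and spaces, e.g. '1-5, 8, 11-13'
    if (!/^[\d\s,-]+$/.test(pages)) {
      throw new Error('invalid pages param: ' + pages)
    }
    options.pageRanges = pages
  }
  return options
}
```

A caller would then do something like `await page.pdf(buildPdfOptions(req.query))`. When `pages` is absent the option is omitted and puppeteer prints all pages.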
browshot wrapper
improve scale performance
better handle concurrency for
upgrade auth to use jwt token
something like this perhaps https://medium.com/@patrykcieszkowski/jwt-authentication-in-express-js-ee898b87a60
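To make the idea concrete, here is a self-contained sketch of HS256 token signing, verification, and an Express-style middleware, using only Node's built-in `crypto` module. It is hand-rolled for illustration; a maintained JWT library would be the safer choice in a real deployment. `signToken`, `verifyToken`, `requireToken`, and the `Authorization: Bearer` header convention are assumptions, not this service's current auth.

```javascript
const crypto = require('crypto')

// base64url without padding, the encoding JWTs use for each segment
const base64url = (input) =>
  Buffer.from(input).toString('base64')
    .replace(/=+$/, '').replace(/\+/g, '-').replace(/\//g, '_')

const hmac = (data, secret) =>
  base64url(crypto.createHmac('sha256', secret).update(data).digest())

// Sign a payload as an HS256 JWT.
function signToken(payload, secret) {
  const header = base64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }))
  const body = base64url(JSON.stringify(payload))
  return header + '.' + body + '.' + hmac(header + '.' + body, secret)
}

// Return the payload if the signature matches and the token has not
// expired; return null otherwise.
function verifyToken(token, secret) {
  const parts = String(token).split('.')
  if (parts.length !== 3) return null
  const given = Buffer.from(parts[2])
  const expected = Buffer.from(hmac(parts[0] + '.' + parts[1], secret))
  if (given.length !== expected.length) return null
  if (!crypto.timingSafeEqual(given, expected)) return null
  let payload
  try {
    // Node's base64 decoder also accepts the URL-safe alphabet
    payload = JSON.parse(Buffer.from(parts[1], 'base64').toString('utf8'))
  } catch (err) {
    return null
  }
  if (payload === null || typeof payload !== 'object') return null
  if (typeof payload.exp === 'number' && payload.exp * 1000 < Date.now()) return null
  return payload
}

// Express-style middleware: reject requests without a valid bearer token.
function requireToken(secret) {
  return (req, res, next) => {
    const match = /^Bearer (.+)$/.exec(req.headers.authorization || '')
    const payload = match && verifyToken(match[1], secret)
    if (!payload) return res.status(401).json({ error: 'invalid or missing token' })
    req.user = payload
    next()
  }
}
```

With Express this would be mounted as `app.use(requireToken(process.env.JWT_SECRET))`, where the environment variable name is also just a placeholder.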
Oct 31 19:22:17 screenshot-bot screenshot-bot[29770]: (node:29770) UnhandledPromiseRejectionWarning: Error: Page crashed!
Oct 31 19:22:17 screenshot-bot screenshot-bot[29770]: at Page._onTargetCrashed (/home/ubuntu/screenshot-service/dedicated-server/node_modules/puppeteer/lib/Page.js:102:24)
Oct 31 19:22:17 screenshot-bot screenshot-bot[29770]: at Session.Page.client.on.event (/home/ubuntu/screenshot-service/dedicated-server/node_modules/puppeteer/lib/Page.js:97:56)
Oct 31 19:22:17 screenshot-bot screenshot-bot[29770]: at emitOne (events.js:116:13)
Oct 31 19:22:17 screenshot-bot screenshot-bot[29770]: at Session.emit (events.js:211:7)
Oct 31 19:22:17 screenshot-bot screenshot-bot[29770]: at Session._onMessage (/home/ubuntu/screenshot-service/dedicated-server/node_modules/puppeteer/lib/Connection.js:210:12)
Oct 31 19:22:17 screenshot-bot screenshot-bot[29770]: at Connection._onMessage (/home/ubuntu/screenshot-service/dedicated-server/node_modules/puppeteer/lib/Connection.js:105:19)
Oct 31 19:22:17 screenshot-bot screenshot-bot[29770]: at emitOne (events.js:116:13)
Oct 31 19:22:17 screenshot-bot screenshot-bot[29770]: at WebSocket.emit (events.js:211:7)
Oct 31 19:22:17 screenshot-bot screenshot-bot[29770]: at Receiver._receiver.onmessage (/home/ubuntu/screenshot-service/dedicated-server/node_modules/ws/lib/WebSocket.js:141:47)
Oct 31 19:22:17 screenshot-bot screenshot-bot[29770]: at Receiver.dataMessage (/home/ubuntu/screenshot-service/dedicated-server/node_modules/ws/lib/Receiver.js:389:14)
Oct 31 19:22:17 screenshot-bot screenshot-bot[29770]: (node:29770) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
Oct 31 19:22:17 screenshot-bot screenshot-bot[29770]: (node:29770) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
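The trace above shows "Page crashed!" surfacing as an unhandled promise rejection. puppeteer emits an 'error' event on the page when it crashes, so one way to contain this is to race each page operation against that event and catch the result. A minimal sketch; `withCrashGuard` and `safeScreenshot` are hypothetical helpers, not existing code in this repo.

```javascript
// Race a page operation against the page's 'error' event, which
// puppeteer emits when the page crashes. `page` only needs to be an
// EventEmitter, so a stand-in object works in tests.
function withCrashGuard(page, operation) {
  return new Promise((resolve, reject) => {
    const onCrash = (err) => reject(err)
    page.once('error', onCrash)
    Promise.resolve()
      .then(() => operation(page))
      .then(resolve, reject)
      // clean up the listener once the operation has settled
      .then(() => page.removeListener('error', onCrash))
  })
}

// Take a screenshot, returning null instead of crashing the process
// when the page fails.
async function safeScreenshot(page) {
  try {
    return await withCrashGuard(page, (p) => p.screenshot())
  } catch (err) {
    console.error('screenshot failed:', err.message)
    return null
  }
}
```

Returning null is only one option; the server could equally respond with an error status or retry in a fresh page.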
urlbox.io wrapper
add aws lambda implementation
a tutorial for doing this with AWS Lambda
https://serverless.com/blog/building-a-serverless-screenshot-service-with-lambda/
source here https://github.com/svdgraaf/serverless-screenshot
explore writing some code to wrap the url2png service
explore using image buffer instead of writing to server filesystem
it looks like this is possible if we just omit the path parameter here (this example is for pdfs, but we could do the same thing in our png image generation code)
https://medium.com/@raphaelstbler/advanced-pdf-generation-for-node-js-using-puppeteer-e168253e159c
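For the png case this could look like the sketch below: when no `path` option is passed, puppeteer's `page.screenshot()` resolves with the image data in memory, which can go straight into the HTTP response. `sendScreenshot` is a hypothetical helper, and `res` is assumed to be an Express-style response object.

```javascript
// Take a screenshot without touching the filesystem. With no `path`
// option, page.screenshot() resolves with the image bytes, which are
// wrapped in a Buffer here and written to the response.
// Returns the number of bytes sent.
async function sendScreenshot(page, res) {
  const buffer = Buffer.from(await page.screenshot({ type: 'png' }))
  res.set('Content-Type', 'image/png')
  res.send(buffer)
  return buffer.length
}
```

Skipping the write to disk also removes the need to clean up temporary files, at the cost of holding each image in memory for the duration of the request.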