googlechromelabs / pptraas.com
Puppeteer as a service
Home Page: https://pptraas.com
License: Apache License 2.0
Puppeteer can also take a screenshot of a specific element; see https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#elementhandlescreenshotoptions
Can pptraas.com do that someday?
Hope so. Thx.
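A minimal sketch of what an element screenshot could look like, assuming the service accepted a CSS selector from the caller. The helper name `screenshotElement` and the selector parameter are hypothetical; `page.$()` and `elementHandle.screenshot()` are the Puppeteer calls referenced in the linked API docs.

```javascript
// Hypothetical helper: screenshot only the first element matching `selector`.
// `page` is a Puppeteer Page that has already navigated to the target URL.
async function screenshotElement(page, selector) {
  const element = await page.$(selector); // resolves to null when nothing matches
  if (!element) {
    throw new Error(`No element matches selector: ${selector}`);
  }
  // Resolves to a Buffer containing the image, clipped to the element's box.
  return element.screenshot();
}
```

A route could call this after `page.goto(url)` and send the returned buffer with an `image/png` content type.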
I'm building a service modeled after pptraas. We're doing some performance/load testing, and it seems that lack of caching is a big issue. Is there a reason that y'all don't re-use the browser? Wondering if there are any gotchas / threading issues with re-using the browser. Certainly seems like a big performance advantage to do so!
Loading the example https://pptraas.com/screenshot?url=https://paul.kinlan.me/ fails with a 500. Other sites, and pages inside the blog, work fine. Only the home page screenshot fails.
We have a React page and want to know when it has finished rendering. Can pptraas supply a function hook to call when rendering is complete, for the /pdf endpoint?
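One way this could work, sketched under the assumption that the page itself sets a global flag when React has finished rendering. The flag name `__RENDER_DONE__` and the helper `waitForRenderFlag` are hypothetical and not part of pptraas; `page.waitForFunction()` is an existing Puppeteer method.

```javascript
// Hypothetical helper: block until the page sets window[flagName] = true.
// The page would set the flag itself, e.g. in a componentDidMount callback.
async function waitForRenderFlag(page, flagName = '__RENDER_DONE__', timeout = 10000) {
  await page.waitForFunction(
    (name) => window[name] === true, // evaluated inside the page
    { timeout },                     // rejects if the flag never appears in time
    flagName
  );
}
```

The /pdf handler could await this between `page.goto()` and `page.pdf()`.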
We should handle errors and promise rejections more effectively - in some cases, we should outright crash the instance.
Provide a landscape argument to /pdf.
I've hit issue puppeteer/puppeteer#594 using a modified endpoint from this repo. Did I do something wrong?
I added an endpoint to simply return the html content:
app.get("/html", async (request, response) => {
  const url = request.query.url;
  if (!url) {
    return response.status(400).send("Please provide a URL. Example: ?url=https://example.com");
  }
  const browser = response.locals.browser;
  const page = await browser.newPage();
  const res = await page.goto(url, { waitUntil: "networkidle0" });
  const content = await page.content();
  response.status(res.status()).send(content);
  await browser.close();
});
Any idea how to address the "EventEmitter memory leak detected. 11 exit listeners added" warning, which eventually led to the server becoming unresponsive?
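One plausible cause, offered as a guess rather than a diagnosis: in the endpoint above the browser is only closed on the success path, so a missing `url` or a failed navigation leaves a launched browser (and its process listeners) behind. A minimal sketch that closes the browser in a `finally` block; the helper name `renderHtml` is hypothetical.

```javascript
// Hypothetical helper: fetch rendered HTML and always close the browser,
// even when newPage(), goto() or content() throws.
async function renderHtml(browser, url) {
  try {
    const page = await browser.newPage();
    const res = await page.goto(url, { waitUntil: 'networkidle0' });
    return {
      // goto() can resolve to null in some cases; fall back to 200 (an assumption).
      status: res ? res.status() : 200,
      body: await page.content(),
    };
  } finally {
    await browser.close();
  }
}
```

The 400 branch for a missing `url` would also need to close the browser, since the middleware launches one for every request.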
The pptraas.com website is down.
Trying to switch from rendertron, but it fails on startup with a message that Chromium isn't installed:
> [email protected] start /app
> node server.js
App is listening on port 8080
(node:19) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1):
AssertionError [ERR_ASSERTION]: Chromium revision is not downloaded. Run "npm install" or "yarn install"
Steps used:
git clone https://github.com/GoogleChromeLabs/pptraas.com.git
cd pptraas.com
npm install
docker build -t pptraas . --no-cache=true
docker run -it -p 8080:8080 pptraas
Version details:
$ node --version && npm --version && docker --version
v10.1.0
6.0.0
Docker version 18.05.0-ce-rc1, build 33f00ce
When taking a screenshot of webgdedeck.com, the content doesn't load. I suspect this is because it is loaded after the 'load' event.
It would be nice to let the user configure when the action should take place in the page life-cycle.
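A sketch of how the wait point could be made configurable, assuming a hypothetical `waitUntil` query parameter. The four values are the ones Puppeteer's `page.goto()` accepts; the helper name and the `'load'` fallback are assumptions.

```javascript
// Lifecycle events accepted by page.goto({waitUntil}).
const WAIT_UNTIL_VALUES = ['load', 'domcontentloaded', 'networkidle0', 'networkidle2'];

// Hypothetical helper: accept only known values, otherwise use the fallback.
function resolveWaitUntil(value, fallback = 'load') {
  return WAIT_UNTIL_VALUES.includes(value) ? value : fallback;
}
```

A handler could then call `page.goto(url, {waitUntil: resolveWaitUntil(request.query.waitUntil)})`, so a request with `waitUntil=networkidle0` waits for network activity to settle before the screenshot is taken.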
Does anyone know a good strategy for using this with an AWS serverless architecture? Something like running Lambda@Edge for each connection, checking whether it's a crawler, and redirecting it to pptraas.com? I have never done this and would really like to know how to do it and how much extra the Lambda will cost.
Provide a printBackground argument to /pdf.
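A minimal sketch of how /pdf could accept this and the `landscape` argument requested earlier, assuming both arrive as query-string flags. `landscape` and `printBackground` are existing `page.pdf()` options; the helper name and the accepted truthy strings are assumptions.

```javascript
// Hypothetical helper: translate query-string flags into page.pdf() options.
function pdfOptionsFromQuery(query) {
  // Query values arrive as strings, so compare explicitly rather than coercing.
  const isTrue = (value) => value === 'true' || value === '1';
  return {
    landscape: isTrue(query.landscape),
    printBackground: isTrue(query.printBackground),
  };
}
```

The handler could then call `page.pdf(pdfOptionsFromQuery(request.query))`; both options stay `false` unless explicitly enabled.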
Hi,
Thanks for the nice repo, it is very helpful. Just one issue:
app.all('*', async (request, response, next) => {
  response.locals.browser = await puppeteer.launch({
    dumpio: true,
    // headless: false,
    // executablePath: 'google-chrome',
    args: ['--no-sandbox', '--disable-setuid-sandbox'], // , '--disable-dev-shm-usage']
  });
  next(); // pass control on to routes.
});
This code creates a new browser for every request, which needs more memory and takes longer than opening a new tab in an existing browser.
I wonder what's the thought behind this?
Thanks,
Vincent
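For comparison, a sketch of the reuse approach: launch once, share the instance, and relaunch if Chrome disconnects. This is not how the repo works today; the helper name `createSharedBrowser` is hypothetical, and `launch` stands in for something like `() => puppeteer.launch({args: ['--no-sandbox']})`. The `'disconnected'` event is emitted by Puppeteer's Browser.

```javascript
// Hypothetical helper: returns a getter that launches the browser once
// and hands the same instance to every caller.
function createSharedBrowser(launch) {
  let shared = null; // Promise of the current browser, or null

  return function getBrowser() {
    if (shared) return shared;
    const current = launch().then((browser) => {
      // If Chrome crashes or is closed, forget it so the next call relaunches.
      browser.on('disconnected', () => {
        if (shared === current) shared = null;
      });
      return browser;
    });
    // Do not cache a failed launch.
    current.catch(() => {
      if (shared === current) shared = null;
    });
    shared = current;
    return shared;
  };
}
```

With this in place the middleware would set `response.locals.browser = await getBrowser()`, and each route would close its own page rather than the browser. One trade-off to keep in mind: tabs in a shared browser share cookies and cache unless each request gets its own incognito context.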
It appears nodejs8 has been removed
ERROR: (gcloud.app.deploy) INVALID_ARGUMENT: Invalid runtime 'nodejs8' specified. Accepted runtimes are: [php, php55, python27, java, java7, java8, go111, go112, go113, java11, nodejs10, nodejs12, php72, php73, python37, python38, ruby25]
On http://pptraas.com github link points to:
https://github.com/GoogleChromeLabs/puppeteeraas.com
should be:
Scale + speed + uptime!
Here's what I'm using in other projects.
Dockerfile
FROM node:9.5-slim
MAINTAINER Eric Bidelman <ebidel@>
# See https://crbug.com/795759
#RUN apt-get update && apt-get install -yq libgconf-2-4
# Install latest chrome dev package and fonts to support major charsets (Chinese, Japanese, Arabic, Hebrew, Thai and a few others)
# Note: this installs the necessary libs to make the bundled version of Chromium that Puppeteer
# installs, work.
RUN apt-get update && apt-get install -y wget --no-install-recommends \
    && wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
    && sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' \
    && apt-get update \
    && apt-get install -y google-chrome-unstable fonts-ipafont-gothic fonts-wqy-zenhei fonts-thai-tlwg fonts-kacst ttf-freefont \
      --no-install-recommends \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get purge --auto-remove -y curl \
    && rm -rf /src/*.deb
# It's a good idea to use dumb-init to help prevent zombie chrome processes.
ADD https://github.com/Yelp/dumb-init/releases/download/v1.2.0/dumb-init_1.2.0_amd64 /usr/local/bin/dumb-init
RUN chmod +x /usr/local/bin/dumb-init
COPY . /app/
WORKDIR /app
COPY package.json .
RUN yarn --production
COPY server.mjs .
# RUN chmod +x server.mjs
# Add user so we don't need --no-sandbox.
RUN groupadd -r pptruser && useradd -r -g pptruser -G audio,video pptruser \
    && mkdir -p /home/pptruser/Downloads \
    && chown -R pptruser:pptruser /home/pptruser \
    && chown -R pptruser:pptruser /app
# # Run everything after as non-privileged user.
USER pptruser
EXPOSE 8080
ENTRYPOINT ["dumb-init", "--"]
CMD ["npm", "run", "start"]
app.yaml
runtime: custom
env: flex
automatic_scaling:
  min_num_instances: 1
  max_num_instances: 4
resources:
  cpu: 4
  memory_gb: 16 # cpu * [0.9 - 6.5] - 0.4
  disk_size_gb: 100
skip_files:
- ^(.*/)?tests
- ^(.*/)?.*\.md$
Currently any branch gets deployed on Travis. Lock deployment to updates on master.
puppets => puppeteraas
We have a page we want to render with a POST request due to the amount of data required for the page. Can pptraas support proxying a POST request?
Right now it errors out. We should list examples from the readme there
At some point, someone, somewhere will rinse this so we need to come up with a decent solution.
It seems Travis is not enabled for pull requests.
When trying to run the Dockerfile, I get:
W: Failed to fetch http://deb.debian.org/debian/dists/jessie-updates/main/binary-amd64/Packages 404 Not Found