Comments (10)
Is there a way to connect puppeteer-cluster to a remote instance of Chromium (“connect” instead of “launch”)?
from puppeteer-cluster.
Hello - just wanted to get a feel for how active this project is. I see puppeteer-cluster as being useful for several projects I'd like to work on. However, I'm hesitant to use it if development is going to be abandoned. Is development still happening? Thanks!
- Add a mixed concurrency model, i.e., for the PAGE or CONTEXT concurrency model, add an option to distribute jobs across more than one browser instance. A crash then won't affect all jobs, which offers a good balance between reliability and resource usage.
- Add an API that returns the queue length, the time the oldest queued item was added, and the number of jobs processed in the last minute. For a continuously operating cluster, i.e., one where jobs are added continuously, this information is valuable.
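To make the second request more concrete, here is a minimal sketch of the three numbers being asked for. This is not part of puppeteer-cluster's API: `QueueStats`, `jobQueued`, `jobStarted`, `jobFinished`, and `snapshot` are hypothetical names, and the sketch assumes a simple first-in, first-out queue, which may not match how the library actually orders jobs.

```javascript
// Hypothetical helper; NOT part of puppeteer-cluster's API.
// Assumes jobs leave the queue in the order they were added (FIFO).
class QueueStats {
  constructor(now = () => Date.now()) {
    this.now = now;       // injectable clock, useful for testing
    this.pending = [];    // enqueue timestamps of waiting jobs, oldest first
    this.completed = [];  // completion timestamps, oldest first
  }

  // Call when a job is added to the queue.
  jobQueued() {
    this.pending.push(this.now());
  }

  // Call when a worker takes the oldest job off the queue.
  jobStarted() {
    this.pending.shift();
  }

  // Call when a job finishes.
  jobFinished() {
    this.completed.push(this.now());
  }

  // Returns queue length, enqueue time of the oldest waiting job,
  // and the number of jobs finished within the last `windowMs` milliseconds.
  snapshot(windowMs = 60000) {
    const cutoff = this.now() - windowMs;
    // Drop completion timestamps that have fallen out of the window.
    while (this.completed.length > 0 && this.completed[0] <= cutoff) {
      this.completed.shift();
    }
    return {
      queueLength: this.pending.length,
      oldestQueuedAt: this.pending.length > 0 ? this.pending[0] : null,
      processedInWindow: this.completed.length,
    };
  }
}
```

A caller would invoke the three `job*` methods from wherever jobs are queued and executed, then poll `snapshot()` periodically. Keeping only timestamps keeps the memory cost proportional to the queue length plus one minute of throughput.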
Cool, glad to hear that. Feel free to ping me if you need any help :)
I have a question. How many browsers can I spawn in parallel per processor core? Let's say my server has a 4-core processor. How many browsers can I spawn at once and still have my tests pass?
Next time, please open a separate issue if your question has nothing to do with this issue.
Regarding your question: it depends on your use case. For simple DOM handling I was able to run ~10 workers on my machine (i5 quad core). Just give it a try with the option `monitor: true` and see how your machine handles the tasks.
Unfortunately, the current implementation of custom concurrency doesn't cover the case where you need to pass custom puppeteer parameters to jobInstances. IMHO this would effectively solve #36 with puppeteer args: [ '--incognito', `--proxy-server=${proxyServer}` ] and await page.authenticate(credentials).
@thomasdondorf, what do you think about this?
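For readers following along, the per-job parameters mentioned above could look roughly like this. It is only a sketch: `buildLaunchOptions` and `authenticatePage` are hypothetical helper names, not puppeteer-cluster functions, and how they would be wired into a custom concurrency implementation is left open. The only real Puppeteer call used is `page.authenticate`.

```javascript
// Hypothetical helper: builds per-proxy launch options like the ones above.
function buildLaunchOptions(proxyServer) {
  return {
    args: [
      '--incognito',
      // Backticks are required here so that ${proxyServer} is interpolated;
      // with single quotes the flag would contain the literal text "${proxyServer}".
      `--proxy-server=${proxyServer}`,
    ],
  };
}

// Hypothetical helper: applies proxy credentials to a Puppeteer page.
// `credentials` is expected to look like { username, password }.
async function authenticatePage(page, credentials) {
  await page.authenticate(credentials);
  return page;
}
```

The resulting object is meant to be passed to whatever starts the browser for that job, and `authenticatePage` would be called on each page before navigation.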
I'm currently thinking about completely reworking the concurrency implementations. Then there would be no more "WorkerInstance" and "JobInstance", just one function that is called when a page is needed. The concurrency implementation would then have 100% flexibility over when a puppeteer instance is started and when one is reused.
Expect some code changes in the next two weeks ;)
+1 for Docker container support.
https://github.com/skalfyfan/dockerized-puppeteer
(Long-term runs of puppeteer-cluster #25) Make sure it's reliable and crawl more than 10 million pages with it (so far the maximum I crawled was ~800k pages)
I use k6 benchmarks in the CI tests for soketi to make sure releases pass the benchmarks in most cases.
Would it be a good idea for me to set this up for you for page-rendering tests?