Comments (5)
This cannot be done reliably until we have a middleware system for request interception (#204). It cannot be done at the Chromium level (with a flag), and adding request interception here would prevent adding any further request interceptions.
I'm just gonna drop a doc here for referencing it later perhaps:
@param {Boolean|Array} [blockResources=false]
* Uses the [`Apify.utils.puppeteer.blockResources()`](../api/puppeteer#puppeteer.blockResources)
* function to block downloads of resources such as images, videos or CSS. It accepts either
* a boolean `true`, which will enable the default blocking or a `string[]` listing
* the resources that should be blocked. See the
* [`Apify.utils.puppeteer.blockResources()`](../api/puppeteer#puppeteer.blockResources)
* for details.
from crawlee.
This should go to the new class called `PuppeteerEx` (see the other issue).
Now we have the new `Apify.utils.puppeteer` namespace for this!
And now we even have `Apify.utils.puppeteer.blockResources()` that can be used. Perhaps we can just add a `blockResourceTypes` option to `launchPuppeteer` and that's it.
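For reference, a rough sketch of what such an option could do under the hood. This is not the actual `blockResources()` implementation: the helper names (`resolveBlockedTypes`, `createBlockingHandler`, `applyBlockResourceTypes`) and the default list of types are assumptions made up for illustration. Only `page.setRequestInterception()`, `page.on('request', ...)`, `request.resourceType()`, `request.abort()` and `request.continue()` are real Puppeteer calls.

```javascript
// Hypothetical defaults for this sketch; the library's real defaults may differ.
const DEFAULT_BLOCKED_TYPES = ['image', 'media', 'stylesheet', 'font'];

// Mirrors the documented option shape: `true` means defaults, an array means custom types.
function resolveBlockedTypes(blockResources) {
    if (Array.isArray(blockResources)) return blockResources;
    return blockResources === true ? DEFAULT_BLOCKED_TYPES : [];
}

// Builds a 'request' event handler that aborts blocked types and continues the rest.
function createBlockingHandler(blockedTypes) {
    const blocked = new Set(blockedTypes);
    return (request) => {
        const action = blocked.has(request.resourceType())
            ? request.abort()
            : request.continue();
        // Swallow rejections, e.g. when another handler already resolved the request.
        Promise.resolve(action).catch(() => {});
    };
}

// Hypothetical wiring for a Puppeteer `page`.
async function applyBlockResourceTypes(page, blockResources = false) {
    const types = resolveBlockedTypes(blockResources);
    // Skip interception entirely when nothing is blocked, since enabling it disables the cache.
    if (types.length === 0) return;
    await page.setRequestInterception(true);
    page.on('request', createBlockingHandler(types));
}
```

With something like this, `applyBlockResourceTypes(page, ['image', 'font'])` would be called once per page, before navigation. Note that it still occupies the request interception slot, which is the limitation raised in the first comment.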
Since request interception disables cache, this might not bring the expected benefits. Closing for now.