Crawl a site for Cache-Control headers.
- extracts URLs from HTML and CSS files
- reports URLs grouped by distinct values for the Cache-Control header
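The grouping step in the report can be sketched roughly as below. This is a minimal illustration, not the project's own code: `groupByCacheControl`, the `{ url, cacheControl }` record shape, and the `(none)` label for a missing header are all assumptions made for the example.

```javascript
// Hypothetical helper, not taken from this project.
// `results` is an array of { url, cacheControl }, where cacheControl is
// undefined when the response carried no Cache-Control header.
function groupByCacheControl(results) {
  const groups = new Map();
  for (const { url, cacheControl } of results) {
    // Assumed label for responses without the header.
    const key = cacheControl === undefined ? '(none)' : cacheControl;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(url);
  }
  return groups;
}

const groups = groupByCacheControl([
  { url: 'http://example.com/', cacheControl: 'no-cache' },
  { url: 'http://example.com/app.css', cacheControl: 'max-age=3600' },
  { url: 'http://example.com/app.js', cacheControl: 'max-age=3600' },
  { url: 'http://example.com/logo.png' },
]);

// Print each distinct header value followed by the URLs that returned it.
for (const [value, urls] of groups) {
  console.log(value);
  for (const url of urls) console.log('  ' + url);
}
```

A `Map` keeps the header values in the order they were first seen, which makes the report stable between runs over the same URL list.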
$ git clone [email protected]:jameslnewell/cache-control-spider.git
$ cd cache-control-spider
$ npm i
$ node index.js http://dev.online4.nib.com.au
Create a new crawler.
Add a URL to be crawled.
Attach a plugin to the crawler.
Start crawling URLs.
Emitted before the crawler has started crawling URLs.
TODO: Emitted before a request is sent to the server.
- url : String - the URL
- res : Request - the request
Emitted after a response is received from the server.
- url : String - the URL
- res : Response - the response
Emitted after the crawler has stopped crawling URLs.
Emitted when an error occurs crawling a URL.
- err : Error - the error
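The event lifecycle described above can be sketched with Node's built-in `EventEmitter`. Everything named here is an assumption for illustration: `ToyCrawler`, its `add` and `start` methods, and the event names `start`, `response`, `error` and `end` are placeholders, not this project's documented API, and the fetch step is a synchronous stub rather than a real HTTP request.

```javascript
const { EventEmitter } = require('events');

// ToyCrawler is a hypothetical stand-in, NOT this project's crawler.
// `fetchHeaders` is a stub: function(url) -> headers object, may throw.
class ToyCrawler extends EventEmitter {
  constructor(fetchHeaders) {
    super();
    this.fetchHeaders = fetchHeaders;
    this.queue = [];
  }

  // Queue a URL to be crawled.
  add(url) {
    this.queue.push(url);
    return this;
  }

  // Walk the queue, emitting one event per lifecycle step.
  start() {
    this.emit('start'); // before any URL is crawled
    for (const url of this.queue) {
      try {
        const res = { headers: this.fetchHeaders(url) };
        this.emit('response', url, res); // after a response is received
      } catch (err) {
        this.emit('error', err); // when crawling a URL fails
      }
    }
    this.emit('end'); // after the crawler has stopped
    return this;
  }
}

const crawler = new ToyCrawler(() => ({ 'cache-control': 'max-age=60' }));
crawler.on('response', (url, res) => {
  console.log(url, res.headers['cache-control']);
});
crawler.on('error', (err) => console.error(err.message));
crawler.add('http://example.com/').start();
```

Note that an `EventEmitter` throws if an `error` event is emitted with no listener attached, so the sketch registers one before calling `start`.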
Extract URLs to crawl from CSS files.
- accept : function(url) : bool - whether the URL should be crawled
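As a rough sketch of what extracting URLs from CSS with an `accept` filter might look like: the function below is hypothetical (`extractCssUrls` and its parameters are not from this project), and it uses a simple regular expression over `url(...)` tokens rather than a real CSS parser, so `@import "file.css"` without `url()` is not covered.

```javascript
// Hypothetical sketch; the plugin's real implementation may differ.
// Finds url(...) references in CSS text, resolves them against the
// stylesheet's own URL, and keeps those the accept predicate allows.
function extractCssUrls(css, baseUrl, accept) {
  // Matches url(x), url('x') and url("x").
  const pattern = /url\(\s*(['"]?)([^'")]+)\1\s*\)/g;
  const urls = [];
  let match;
  while ((match = pattern.exec(css)) !== null) {
    const raw = match[2].trim();
    if (raw.startsWith('data:')) continue; // inline data, nothing to crawl
    const absolute = new URL(raw, baseUrl).href;
    if (accept(absolute)) urls.push(absolute);
  }
  return urls;
}

// Example accept predicate: only crawl URLs on one host (assumed host).
const acceptSameHost = (url) => new URL(url).host === 'example.com';

console.log(extractCssUrls(
  'body { background: url("img/bg.png"); } @font-face { src: url(/fonts/a.woff); }',
  'http://example.com/css/site.css',
  acceptSameHost
));
// prints [ 'http://example.com/css/img/bg.png', 'http://example.com/fonts/a.woff' ]
```

Relative references are resolved against the stylesheet URL, not the page URL, which is why `img/bg.png` ends up under `/css/`.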
Extract URLs to crawl from HTML files.
- accept : function(url) : bool - whether the URL should be crawled
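A comparable sketch for HTML, again under stated assumptions: `extractHtmlUrls` is a hypothetical helper, and it scans quoted `href` and `src` attributes with a regular expression instead of parsing the document, so unquoted attribute values and `srcset` are ignored.

```javascript
// Hypothetical sketch; the plugin's real implementation may differ.
// Collects quoted href/src values, resolves them against the page URL,
// and keeps the http(s) URLs that the accept predicate allows.
function extractHtmlUrls(html, baseUrl, accept) {
  const pattern = /\b(?:href|src)\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  const urls = new Set(); // de-duplicates repeated links
  let match;
  while ((match = pattern.exec(html)) !== null) {
    const raw = (match[1] !== undefined ? match[1] : match[2]).trim();
    if (raw === '' || raw.startsWith('#')) continue; // same-page anchors
    let absolute;
    try {
      absolute = new URL(raw, baseUrl);
    } catch (err) {
      continue; // skip values that cannot be parsed as URLs
    }
    if (absolute.protocol !== 'http:' && absolute.protocol !== 'https:') continue;
    absolute.hash = ''; // fragments do not change the resource requested
    if (accept(absolute.href)) urls.add(absolute.href);
  }
  return [...urls];
}

// Example accept predicate: only crawl URLs on one host (assumed host).
const onlyExampleCom = (url) => new URL(url).host === 'example.com';

console.log(extractHtmlUrls(
  '<a href="/about">About</a> <img src=\'img/logo.png\'> <a href="#top">Top</a>',
  'http://example.com/docs/index.html',
  onlyExampleCom
));
// prints [ 'http://example.com/about', 'http://example.com/docs/img/logo.png' ]
```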