puppeteer / examples

2.3K 84.0 300.0 559 KB

Use case-driven examples for using Puppeteer and headless Chrome

Home Page: https://developers.google.com/web/tools/puppeteer/

License: Apache License 2.0

JavaScript 77.84% HTML 22.16%

puppeteer demos automation browser-tools browser-testing

examples's Introduction

Useful Puppeteer demos!

Examples for using Puppeteer to do big, bold things.

Output from some of the examples:

code_coverage.js

Test lazy loading strategy by seeing CSS/JS code coverage usage across page load.

verify_sw_caching.js

Verify all the resources you expect are being cached by a service worker for offline.

google_search_features.js

Gut check your page to make sure it renders correctly for Google Search.

lazyimages_without_scroll_events.js

Determine if your lazy loaded images will be seen correctly by Google Search.

crawlsite.js

Discover all the URLs on a site and visualize the subpages.

side-by-side-pageload.js

Load 2 or more pages side-by-side to visually see the difference in page load. Optional desktop viewport and throttling settings.

examples's Issues

Fails to save screenshots when URL has query string

The site I was trying to crawl has query strings as part of the navigation, causing the script to fail when trying to save the screenshot on Windows (may or may not repro on other platforms). It appears slugify doesn't strip all characters that are illegal in Windows file names.

Example error sequence (sorting headers on a table adds to the query string):
Loading: https://example.org/index/99984?sort=NAME&order=asc
(node:19708) UnhandledPromiseRejectionWarning: Error: ENOENT: no such file or directory, open 'C:\Users\Reeves\source\repos\puppeteer\output\https___example.org\https___example.org\index_99984?sort=NAME&order=asc'

I corrected this in my script by adding 'sanitize-filename' and applying it in the screenshots section of the code (on line 146 at this hot second).
const path = `./${OUT_DIR}/${slugify(sanitize(page.url))}.png`;

The slugify in this context may be redundant.
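
For readers who would rather not add a dependency, the same idea can be sketched with a small helper. This is only an illustration: `urlToFileName` is a hypothetical name, not part of crawlsite.js, and it assumes that replacing each illegal character with an underscore is acceptable for the output file names.

```javascript
// Hypothetical helper (not from the repo): make a URL usable as a file name.
// Windows rejects < > : " / \ | ? * and control characters in file names.
function urlToFileName(url, maxLength = 200) {
  return url
    .replace(/[<>:"/\\|?*\u0000-\u001f]/g, '_') // replace illegal characters
    .slice(0, maxLength)                        // keep the name reasonably short
    .replace(/[. ]+$/, '') || '_';              // no trailing dots or spaces
}

console.log(urlToFileName('https://example.org/index/99984?sort=NAME&order=asc'));
// prints "https___example.org_index_99984_sort=NAME&order=asc"
```

The `?` that broke the original path becomes `_`, while `=` and `&` are left alone because Windows accepts them.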

crawl.json does not get added

I tried the crawlsite.js file and the links are shown in the console.

But crawl.json is not created under the output folder, and I am also not able to visualize it through D3.

Thanks in Advance

Adding examples related to Browser pool

Can I open a pull request showcasing browser pool examples? Various use cases would benefit from browser pooling, for example a screenshot app.

Error in npm install: build failing in Windows 10

Hello, I installed Python 3.8.2 and Visual Studio Community 2019 to run node-gyp, and tried to install the modules by executing "yarn".
But in the final step of the installation, the build, it gives a compilation error. How do I solve this?

question - lazyload example

Is the lazy load example script, https://github.com/GoogleChromeLabs/puppeteer-examples/blob/master/lazyimages_without_scroll_events.js, supposed to provide an accurate "view" of what Googlebot (image or search) sees, both visually and DOM-wise?

Many lazy-loading scripts try to compensate by sniffing the UA (including the included example). However, it's not clear whether the result of the DOM manipulation (say, swapping data-src to src) is indeed what Google grabs.

I understand it's Chrome, not Google, but do you have any insight into how we should relate the results to what Google sees? Or is this just a test of the events and not indicative of Google's view?

It (purposefully?) does not set the UA to Googlebot, so I'm guessing there's an interpretation we should be taking away from it.

Failing to run google_search_features.js

Very cool usages!

Unfortunately I'm having problems running the Google Search example, my output is:

oyvinds-mac:puppeteer-examples osmestad$ node google_search_features.js
Trace started.
Navigating to https://www.chromestatus.com/features
Waiting for page to be idle...
Trace complete.
(node:2259) UnhandledPromiseRejectionWarning: TypeError: Cannot read property 'pid' of undefined
    at trace.traceEvents.filter.e (/Users/osmestad/Code/puppeteer-examples/google_search_features.js:195:38)
    at Array.filter (<anonymous>)
    at collectFeatureTraceEvents (/Users/osmestad/Code/puppeteer-examples/google_search_features.js:193:36)
    at process._tickCallback (internal/process/next_tick.js:68:7)
(node:2259) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:2259) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

I've tried with Node v8.12.0 and v10.13.0. Any idea what might be causing this?
(lazyimages_without_scroll_events.js worked for me)

Google links scraper for all the pages of a Google search

Hello, I have created a Google links scraper that scrapes all the website links available for a Google search, and I want to share it as a Puppeteer example.

Example for writing tests for Chrome extensions

Right now I'm working on a Chrome extension and found your tip about testing useful. I think having a fully working example here might be useful for others.
I can help on this one. wdyt?

lazyimages_without_scroll_events not possible to run for Mobile

NAME	ABOUT
Feature Request	Support for different devices (Smartphone, Tablet...)

Is your feature request related to a problem? Please describe.

When trying to validate a URL in its mobile version, it is not possible to evaluate it with the script lazyimages_without_scroll_events.js.

Describe the solution you'd like

It would be nice to provide a flag to the command where the user can specify which device to load the URL with.
Keeping desktop as the default behaviour is fine.

Describe alternatives you've considered

Please describe alternative solutions or features you have considered.

Maybe an interactive prompt where the user can choose the device?
Or a user agent parameter instead?
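
A minimal sketch of how such a flag could be wired in, assuming the script already holds a Puppeteer `page`. `emulateIfRequested` and its parameters are hypothetical names, not existing options of lazyimages_without_scroll_events.js. `page.emulate()` is Puppeteer's call for applying a viewport and user agent together; the map of device descriptors is passed in because Puppeteer has exported its built-in list under different names across releases, so check the installed version.

```javascript
// Hypothetical helper: apply a device descriptor only when a device was requested.
// `knownDevices` is assumed to be Puppeteer's device map (name -> {viewport, userAgent}).
async function emulateIfRequested(page, deviceName, knownDevices) {
  if (!deviceName) {
    return false; // no flag given: keep the desktop default
  }
  const descriptor = knownDevices[deviceName];
  if (!descriptor) {
    throw new Error(`Unknown device: ${deviceName}`);
  }
  await page.emulate(descriptor); // sets viewport and user agent in one call
  return true;
}
```

The helper would need to run before `page.goto()` so that the page loads with the mobile viewport and user agent from the start.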

Add eslint rules

How to build a page resources dependency tree by initiator?

Hi, I am really excited about Puppeteer, you have built a really great tool. I have been using it since v0.11.0.
I am wondering if it is possible to build the page resources dependency tree by initiator.
For example, to visualize how many calls are sent by third-party tracking or analytics libs.
Lighthouse has similar functionality (critical request chains in the performance audit results), but I need the whole picture.
It would be great if some of you could share your thoughts or advice on how this could be done with Puppeteer.
As I understand it, all the information I need can be found in the trace of the page, but maybe there is a better solution than parsing that big and scary JSON file.
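
One possible direction, sketched under assumptions: the Chrome DevTools Protocol event `Network.requestWillBeSent` carries an `initiator` object for each request, with a `url` for parser-initiated requests and a `stack.callFrames` list for script-initiated ones. Given a collected array of such events, a tree can be assembled without touching the trace file. `initiatorUrl` and `buildInitiatorTree` are hypothetical names; collecting the events (for example through a CDP session) is left out.

```javascript
// Hypothetical helpers: group requests under the resource that initiated them.
function initiatorUrl(initiator) {
  if (!initiator) return null;
  if (initiator.url) return initiator.url; // e.g. parser-initiated
  const frames = initiator.stack && initiator.stack.callFrames;
  return frames && frames.length ? frames[0].url : null; // script-initiated
}

function buildInitiatorTree(events) {
  const nodes = new Map();
  const getNode = url => {
    if (!nodes.has(url)) nodes.set(url, {url, children: []});
    return nodes.get(url);
  };
  const roots = [];
  for (const {request, initiator} of events) {
    const node = getNode(request.url);
    const parentUrl = initiatorUrl(initiator);
    if (parentUrl && parentUrl !== request.url) {
      getNode(parentUrl).children.push(node);
    } else {
      roots.push(node); // no known initiator: treat as a top-level request
    }
  }
  return roots;
}
```

This simplification keys nodes by URL, so repeated requests to the same URL share one node, and an initiator that was never itself observed as a request will not appear among the roots.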

Lazy load example accuracy

https://github.com/GoogleChromeLabs/puppeteer-examples/blob/master/lazyimages_without_scroll_events.js#L110

I have doubts about screenshotPageWithoutScroll:

It uses DEFAULT_VIEWPORT, but the right viewport may vary from page to page. On a page with long content, part of the page lies outside the viewport, so IntersectionObserver is not triggered and the test FAILS.
For example, the sample at https://css-tricks.com/examples/LazyLoading/ calls processScroll();, so given a suitable viewport the images are shown even though the page does not use IntersectionObserver, and the test PASSES.
For the repo's own sample, https://rawgit.com/GoogleChromeLabs/puppeteer-examples/master/html/lazyload.html, the page has a height of about 2000, so it does not even need scrolling. If you set the height in the script to 1000, the test FAILS.

Any ideas for improving the accuracy?

Add puppeteer and accessibility

What do you think about adding puppeteer and e.g. axe?

crawlsite.js crashes on PDFs

When the script reaches a PDF, it crashes.

Example:

(node:23872) UnhandledPromiseRejectionWarning: Error: net::ERR_ABORTED at https://code.design/files/code-design-magazine-001.pdf
    at navigate (/Users/martin/Sites/crawlsite/node_modules/puppeteer/lib/Page.js:539:37)
    at process._tickCallback (internal/process/next_tick.js:68:7)
(node:23872) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:23872) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
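
A hedged sketch of two ways a crawler could avoid this crash: skip links whose path ends in a non-HTML extension, and catch navigation errors so that one bad URL does not reject the whole crawl. `looksLikeHtmlPage`, `gotoOrSkip` and the extension list are hypothetical, not part of crawlsite.js; `page.goto()` with `waitUntil` is Puppeteer's navigation call.

```javascript
const path = require('path');

// Assumed list of extensions that are not worth navigating to; adjust as needed.
const SKIPPED_EXTENSIONS = new Set(['.pdf', '.zip', '.png', '.jpg', '.jpeg', '.gif', '.svg', '.mp4']);

// Returns false for unparsable URLs and for paths ending in a skipped extension.
function looksLikeHtmlPage(href) {
  let pathname;
  try {
    pathname = new URL(href).pathname;
  } catch (err) {
    return false;
  }
  return !SKIPPED_EXTENSIONS.has(path.posix.extname(pathname).toLowerCase());
}

// Navigates, but resolves to null instead of rejecting when navigation fails.
async function gotoOrSkip(page, url) {
  try {
    return await page.goto(url, {waitUntil: 'networkidle2'});
  } catch (err) {
    console.warn(`Skipping ${url}: ${err.message}`);
    return null;
  }
}

console.log(looksLikeHtmlPage('https://code.design/files/code-design-magazine-001.pdf'));
// prints false
```

The extension check is only a heuristic, since a URL without a `.pdf` suffix can still serve a PDF, which is why the try/catch is kept as a second line of defence.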

Error: Failed to launch chrome!

I got the following error when I ran node speech.js

(node:75891) UnhandledPromiseRejectionWarning: Error: Failed to launch chrome! spawn /Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary ENOENT


TROUBLESHOOTING: https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md

    at onClose (/Users/abuduwail/Desktop/personal/Puppeteer/puppeteer-examples/node_modules/puppeteer/lib/Launcher.js:342:14)
    at ChildProcess.helper.addEventListener.error (/Users/abuduwail/Desktop/personal/Puppeteer/puppeteer-examples/node_modules/puppeteer/lib/Launcher.js:333:64)
    at ChildProcess.emit (events.js:182:13)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:238:12)
    at onErrorNT (internal/child_process.js:407:16)
    at process._tickCallback (internal/process/next_tick.js:63:19)
(node:75891) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:75891) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

crawlsite.js fails when the 'output' folder is missing

I received the error below. I could verify the script was trying to mkdir, but it wouldn't work. After manually creating the folder 'output', the script worked as expected. Unsure if this is a bug or user error...

UnhandledPromiseRejectionWarning: Error: ENOENT: no such file or directory, mkdir 'output/https___news.polymer-project.org_'
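
ENOENT on mkdir is what a non-recursive mkdir reports when the parent directory (here `output`) does not exist yet, which matches the manual workaround. A minimal sketch of a fix, assuming Node 10.12 or later where `fs.mkdirSync` accepts `{recursive: true}`; `ensureDir` is a hypothetical helper name, and the temporary base directory is used only to keep the example self-contained.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: create a directory and any missing parents.
// With {recursive: true}, an already existing directory is not an error.
function ensureDir(dir) {
  fs.mkdirSync(dir, {recursive: true});
  return dir;
}

// Demo in a throwaway temp folder so the example does not touch the repo.
const base = fs.mkdtempSync(path.join(os.tmpdir(), 'crawl-'));
const outDir = ensureDir(path.join(base, 'output', 'https___news.polymer-project.org_'));
console.log(fs.existsSync(outDir)); // prints true
```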

Stale Examples - last update 4 years ago

Is the project dead? Or is there just a TODO to update the examples?

Cannot visualize crawl.json in D3

I use Node v10.13.0. I checked out the puppeteer examples repo, ran yarn install, ran crawlsite.js, and got a good crawl.json in the output, but when I run node server.js to try to visualize it in D3, I just get a Cannot GET message in the browser. There is no console output from server.js in the terminal. Any ideas?

Crawlsite.js Error: TypeError: Converting circular structure to JSON

When I try to crawl with a DEPTH bigger than 2, I get this error:
Error: TypeError: Converting circular structure to JSON
line: await util.promisify(fs.writeFile)(`./${OUT_DIR}/crawl.json`, JSON.stringify(root, null, ' '));
If I use util.inspect, the error is gone, but the output seems to be invalid JSON.
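
`util.inspect` produces a debugging string rather than JSON, which would explain the invalid output. One way to keep `JSON.stringify` is to give it a replacer that swaps any object it has already serialized for a small marker. This is a sketch under the assumption that the error comes from a page object being reachable again from its own descendants; `stringifyCrawl` and the `repeated` flag are hypothetical names.

```javascript
// Hypothetical replacement for JSON.stringify(root, null, ' ') in the crawl output.
// Any object seen before (a cycle or a shared subtree) is written as a short stub.
function stringifyCrawl(root) {
  const seen = new WeakSet();
  return JSON.stringify(root, (key, value) => {
    if (typeof value === 'object' && value !== null) {
      if (seen.has(value)) {
        return {url: value.url, repeated: true};
      }
      seen.add(value);
    }
    return value;
  }, ' ');
}

// Two pages that link to each other, which plain JSON.stringify would reject.
const home = {url: 'https://example.org/', children: []};
const about = {url: 'https://example.org/about', children: [home]};
home.children.push(about);
console.log(stringifyCrawl(home));
```

Note that the stub also replaces pages that are merely linked from two places, not only true cycles, so a visualization would show each page's subtree once.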

How to know when the network is idle (not on loading)

Hi @ebidel

Could you tell me why this line is commented out?

https://github.com/GoogleChromeLabs/puppeteer-examples/blob/59355609ecb3c2e396a289b28f34d5116fc89b8e/lazyimages_without_scroll_events.js#L97

I would like to know if it's possible to know when the network is idle.
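
For the navigation itself, `page.goto()` accepts `waitUntil: 'networkidle0'` or `'networkidle2'`. For idleness at a later point, one option is to count in-flight requests using the page's `request`, `requestfinished` and `requestfailed` events. The following is a sketch only; `waitForNetworkQuiet` and `idleMs` are hypothetical names, and it assumes `page` exposes `on` and `off` like a Puppeteer page or a Node EventEmitter.

```javascript
// Hypothetical helper: resolves once no request has been in flight for `idleMs`.
function waitForNetworkQuiet(page, idleMs = 500) {
  return new Promise(resolve => {
    let inflight = 0;
    let timer = setTimeout(done, idleMs); // already quiet if nothing starts

    function onStart() {
      inflight++;
      clearTimeout(timer); // activity: cancel the pending "quiet" timer
    }
    function onEnd() {
      inflight = Math.max(0, inflight - 1);
      if (inflight === 0) {
        clearTimeout(timer);
        timer = setTimeout(done, idleMs); // restart the quiet period
      }
    }
    function done() {
      page.off('request', onStart);
      page.off('requestfinished', onEnd);
      page.off('requestfailed', onEnd);
      resolve();
    }

    page.on('request', onStart);
    page.on('requestfinished', onEnd);
    page.on('requestfailed', onEnd);
  });
}
```

It would be used as `await waitForNetworkQuiet(page)` after triggering whatever causes the requests, for example a scroll.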

Pages failing the lazyimages_without_scroll_events.js test

Hi.

I ran lazyimages_without_scroll_events.js to test two webpages that use different approaches for lazy loading images: one with IntersectionObserver and another with event listeners (DOMContentLoaded, scroll, resize).

Both pages didn't pass the test for some reason.

I expected that the page with the IntersectionObserver would not fail the test, since the images are loaded once the elements are in the viewport and the user doesn't need to scroll to them.

The same with the page with event listeners. Despite using the scroll event, this is not the only event that triggers the image to load. The page also uses the DOMContentLoaded event, so users can see hidden images as soon as the page loads (if the element is in the viewport when the page is loaded). That said, I don't understand why the page didn't pass the test.

If anyone can help me understand why those pages have not passed the tests, I would appreciate it.

Thanks in advance!

lazyimages_without_scroll_events.js finishes with error TimeoutError: Navigation Timeout Exceeded

I'm trying to check lazy loading for a site using Google's official guide.

I cloned the project from git and installed all dependencies.

When I run

node lazyimages_without_scroll_events.js -h --url==https://dns-shop.ru

I get the error below:

(node:7280) UnhandledPromiseRejectionWarning: TimeoutError: Navigation Timeout Exceeded: 30000ms exceeded
    at Promise.then (C:\Root\puppeteer\puppeteer-examples\node_modules\puppeteer\lib\LifecycleWatcher.js:142:21)
  -- ASYNC --
    at Frame.<anonymous> (C:\Root\puppeteer\puppeteer-examples\node_modules\puppeteer\lib\helper.js:110:27)
    at Page.goto (C:\Root\puppeteer\puppeteer-examples\node_modules\puppeteer\lib\Page.js:629:49)
    at Page.<anonymous> (C:\Root\puppeteer\puppeteer-examples\node_modules\puppeteer\lib\helper.js:111:23)
    at screenshotPageAfterScroll (C:\Root\puppeteer\puppeteer-examples\lazyimages_without_scroll_events.js:143:14)
    at process._tickCallback (internal/process/next_tick.js:68:7)
(node:7280) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:7280) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

And the process hangs forever.

Node version is v10.15.3, platform is Windows 7.

What are the steps to check the site correctly?

Add basic auth option to lazyimages_without_scroll_events.js

Sometimes people want to test lazy loading on websites protected by basic auth (usually in a preproduction environment).

I'd like to propose a new yargs option to pass basic auth credentials to the website in lazyimages_without_scroll_events.js.

E.g.: node lazyimages_without_scroll_events.js -b username::password -u https://example.com
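
A minimal sketch of what the option could do once parsed, assuming the `username::password` format proposed above. `parseBasicAuth` and `applyBasicAuth` are hypothetical names and the `-b` flag does not exist in the script today; `page.authenticate({username, password})` is Puppeteer's call for supplying HTTP authentication credentials.

```javascript
// Hypothetical parser for the proposed "-b username::password" value.
// Splits on the first "::" so the password itself may contain "::".
function parseBasicAuth(arg) {
  const sep = arg.indexOf('::');
  if (sep === -1) {
    throw new Error('Expected credentials as username::password');
  }
  return {username: arg.slice(0, sep), password: arg.slice(sep + 2)};
}

// Applies the credentials to a Puppeteer page when the flag was given.
async function applyBasicAuth(page, arg) {
  if (arg) {
    await page.authenticate(parseBasicAuth(arg));
  }
}

console.log(parseBasicAuth('username::password'));
// prints { username: 'username', password: 'password' }
```

`applyBasicAuth` would need to be called before `page.goto()` so that the first request already carries the credentials.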