
test-crawler's Introduction

test-crawler

★ Online documentation ★

► Try it directly on GitHub

test-crawler is a tool for end-to-end testing that crawls a website and makes snapshot comparisons. It is fully open source and can be self-hosted or used directly on GitHub.

Getting started

🛈 Note: you need at least Node.js v11

yarn global add test-crawler
test-crawler

Open http://127.0.0.1:3005/ in your browser and create a new project:

screenshot-start

There are two ways to crawl pages:

  • Spider bot crawling method will get all the links inside the page at the given URL and crawl the children. It will then keep doing the same with the children until no new link is found. Be careful: if you have a big website, this is most likely not the right solution for you.

  • URLs list crawling method will crawl a specific set of URLs. In the URL input field you must provide an endpoint containing a list of URLs (a simple text format, with one URL per line). The crawler will only crawl those URLs and will not try to find links in the pages.

URLs list example:

http://127.0.0.1:3005/
http://127.0.0.1:3005/page1
http://127.0.0.1:3005/category/page33
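
The endpoint just needs to return such a list as plain text. Here is a minimal sketch, assuming you want to serve the list yourself with Node.js (the file name, port and URLs are only placeholders):

// url-list-server.js - hypothetical helper, not part of test-crawler
const http = require('http');

const urls = [
    'http://127.0.0.1:3005/',
    'http://127.0.0.1:3005/page1',
    'http://127.0.0.1:3005/category/page33',
];

http.createServer((req, res) => {
    // Serve the list as plain text, one URL per line, as expected by the crawler
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end(urls.join('\n'));
}).listen(8080, () => console.log('URL list available at http://127.0.0.1:8080/'));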

🛈 Note: to avoid false visual differences, you must always run your tests in the same environment. A different OS, a different graphics card, etc. might trigger visual differences in the snapshots even if nothing has changed. Prefer to always run your tests on the same machine.

Pins

Pins are the reference screenshots to compare against. While crawling, the crawler compares each page to its pin. To create a pin, go to the result page of your crawl; each screenshot has some action buttons:

screenshot-action-buttons

Click the rightmost button with the little pin icon.

You can then visualize all your pins:

screenshot-pins

Crawling result

screenshot-diff

On the result page, you will see the screenshots, possibly with some differences found. A difference is represented by a yellow rectangle. Clicking on the rectangle pops up 3 buttons that let you report the difference (the rectangle will become red) or validate it (the rectangle will become green). You can also validate the difference "forever", in which case this area of the page will always be recognized as a valid place for changes.

Note: comparing pages that grow is very difficult (different heights). For the moment this results in weird behavior when comparing 2 screenshots of different sizes. To avoid this, use the code injection to remove the dynamic parts of the page. Hopefully in the future we will find a better algorithm to recognize such changes.

Inject code

screenshot-code

You can inject code into the crawler while it parses a page. This code is executed just after the page has finished loading, before the screenshot is taken and before the links are extracted.

This can be useful to remove some dynamic parts from a page, for example the comments on a blog page or the reviews on a product page. You could also inject code to simulate user behavior, like clicking or editing an input field (see the example after the basic snippets below).

Test-crawler uses Puppeteer to crawl the pages and take the screenshots. By injecting code, you can use all the functionality of Puppeteer.

In the editor, you need to export a function that receives as a parameter the page currently opened by Puppeteer.

module.exports = async function run(page) {
    // your code
}

You can then use this page variable to manipulate the page. The following example inserts "Test-crawler is awesome!" at the top of the page:

module.exports = async function run(page) {
    await page.evaluate(() => {
        const div = document.createElement("div");
        div.innerHTML = "Test-crawler is awesome!";
        document.body.insertBefore(div, document.body.firstChild);
    });
}
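
As mentioned above, injected code can also strip dynamic content from the page or simulate user behavior before the screenshot is taken. Here is a minimal sketch using the standard Puppeteer API; the selectors .comments, #search and #menu-toggle are hypothetical, adapt them to your own page:

module.exports = async function run(page) {
    // Remove dynamic parts of the page (e.g. a comment section) before the screenshot
    await page.evaluate(() => {
        document.querySelectorAll('.comments').forEach(el => el.remove());
    });

    // Simulate user behavior: type into an input field and click an element
    await page.type('#search', 'test-crawler');
    await page.click('#menu-toggle');
}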

You can also make assertions. Any failed assertion will be displayed on the result page.

screenshot-assertion

const expect = require('expect');

module.exports = async function run(page) {
  await expect(page.title()).resolves.toMatch('React App');
  expect('a').toBe('b'); // fail
}

By default, the expect library from Jest is installed, but you can use any assertion library of your choice.
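
For instance, here is a minimal sketch of the same kind of check written with Node's built-in assert module instead of expect:

const assert = require('assert');

module.exports = async function run(page) {
    // Assert on the page title using Node's built-in assert module
    const title = await page.title();
    assert.strictEqual(title, 'React App');
}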

Storybook

You can use code injection to crawl Storybook. Tell test-crawler to crawl your Storybook URL http://127.0.0.1:6006/ and then inject some code to extract the URLs of the stories and transform them into their iframe version. The code should look something like this:

module.exports = async function run(page) {
    await page.evaluate(() => {
        // Collect the story links and rewrite them to their iframe version
        const hrefs = Array.from(document.links).map(
            link => link.href.replace('/?', '/iframe.html?')
        );

        // Replace the page content with plain links so the crawler can follow them
        document.body.innerHTML = hrefs.map(
            href => `<a href="${href}">${href}</a>`
        ).join('<br />');
    });
}

You can find this code by clicking the Code snippet button in the code editor.

🛈 Note: feel free to make a pull request to propose new code snippets.

CLI

You can run tests directly from the CLI. This can be useful for continuous integration.

# test-crawler-cli --project the_id_of_the_project
test-crawler-cli --project f0258b6685684c113bad94d91b8fa02a

With npx:

ROOT_FOLDER=/the/target/folder npx test-crawler-cli --project the_id_of_the_project

Continuous integration

As mentioned before, to avoid false visual differences, you must always run your tests in the same environment. The best way to achieve this is to include test-crawler in your continuous integration, with tools like Travis or GitHub Actions. Test-crawler supports GitHub Actions out of the box. In order to run test-crawler in the CI container, you must use test-crawler-cli.

Example of GitHub action:

name: Test-crawler CI

on: [push]

jobs:
  build:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v2
    - name: Setup node
      uses: actions/setup-node@v1
    - name: Run test-crawler
      run: |
        ROOT_FOLDER=`pwd` npx test-crawler-cli --project ${{ github.event.client_payload.projectId }}
    - name: Commit changes
      run: |
        git config --local user.email "[email protected]"
        git config --local user.name "Test-crawler"
        git add .
        git status
        git commit -m "[test-crawler] CI save" || echo "No changes to commit"
        git pull
        git push "https://${{ secrets.GITHUB_TOKEN }}@github.com/${{ github.repository }}"
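
Note that github.event.client_payload is only populated when the workflow is triggered by a repository_dispatch event. A hedged sketch of triggering such an event through the GitHub API, where the owner, repository, token and project id are placeholders to replace with your own values:

curl -X POST \
  -H "Authorization: token $GITHUB_TOKEN" \
  -H "Accept: application/vnd.github+json" \
  https://api.github.com/repos/OWNER/REPO/dispatches \
  -d '{"event_type": "test-crawler", "client_payload": {"projectId": "the_id_of_the_project"}}'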

Contribution

If you are interested in working on this project, you are very welcome. There are many ways to help: testing, documentation, bug fixes, new features...

For those who want to dive into the code, you need to know about TypeScript, React and possibly Puppeteer, but the most important thing to be aware of is that test-crawler is based on isomor. It might be useful to understand the concept of this tool before touching the code.

Since you have been reading the doc, you now know that the code should be modified in "src-isomor".

To start the project in dev mode:

git clone https://github.com/apiel/test-crawler.git
cd test-crawler
npx lerna bootstrap
cd packages/test-crawler
yarn dev

yarn dev will start 3 processes using run-screen. The first process is the isomor-transpiler, the second is the backend server and the third is the React dev server. To switch between processes, press 1, 2 or 3.

If you have any questions, feel free to contact me at [email protected]

test-crawler's People

Contributors

actions-user, apiel, kerbe


test-crawler's Issues

How to install?

Hi,
your tool looks very nice.
I wanted to test it and I tried npx test-crawler as suggested in your doc; unfortunately I got the following error:

Downloading Chromium r669486 - 347.1 Mb [====================] 100% 0.0s 
info sharp Using cached [HOME]/.npm/_libvips/libvips-8.7.4-linux-x64.tar.gz
[HOME]/.npm/_npx/20761/bin/test-crawler: ligne 11: isomor-server : commande introuvable

Here is my platform information:

  • Ubuntu 18.04
  • node 12.4.0
  • npm 6.9.0

Do you have any suggestions?

Cannot GET /

I tried installing test-crawler on node v13.4.0 and node v12.14.1 with yarn global add test-crawler, followed by test-crawler to start it up.

Installation gave a few errors, but it seems to have installed happily:

yarn global v1.21.1
[1/4] Resolving packages...
warning test-crawler > antd > babel-runtime > [email protected]: core-js@<3 is no longer maintained and not recommended for usage due to the number of issues. Please, upgrade your dependencies to the actual version of core-js@3.
warning test-crawler > antd > rc-tree-select > rc-trigger > rc-animate > fbjs > [email protected]: core-js@<3 is no longer maintained and not recommended for usage due to the number of issues. Please, upgrade your dependencies to the actual version of core-js@3.
warning test-crawler > isomor > isomor-transpiler > @types/[email protected]: This is a stub types definition. chokidar provides its own type definitions, so you do not need this installed.
warning test-crawler > react-scripts > jest > jest-cli > jest-config > jest-environment-jsdom > jsdom > [email protected]: use String.prototype.padStart()
[2/4] Fetching packages...
warning [email protected]: Invalid bin field for "mini-css-extract-plugin".
warning [email protected]: Invalid bin entry for "sha.js" (in "sha.js").
info [email protected]: The platform "linux" is incompatible with this module.
info "[email protected]" is an optional dependency and failed compatibility check. Excluding it from installation.
info [email protected]: The platform "linux" is incompatible with this module.
info "[email protected]" is an optional dependency and failed compatibility check. Excluding it from installation.
info [email protected]: The platform "linux" is incompatible with this module.
info "[email protected]" is an optional dependency and failed compatibility check. Excluding it from installation.
[3/4] Linking dependencies...
warning "test-crawler > react-scripts > @typescript-eslint/[email protected]" has incorrect peer dependency "eslint@^5.0.0".
warning "test-crawler > react-scripts > @typescript-eslint/[email protected]" has incorrect peer dependency "eslint@^5.0.0".
warning "test-crawler > react-scripts > [email protected]" has incorrect peer dependency "@typescript-eslint/[email protected]".
warning "test-crawler > react-scripts > [email protected]" has incorrect peer dependency "@typescript-eslint/[email protected]".
warning "test-crawler > isomor > isomor-transpiler > @babel/[email protected]" has unmet peer dependency "@babel/core@^7.0.0-0".
[4/4] Building fresh packages...
success Installed "[email protected]" with binaries:
      - test-crawler
      - test-crawler-cli
Done in 96.66s.

Once I started it, all seemed to be good, apart from getting an error when opening the browser:

Start test crawler

> [email protected] isomor:server /usr/local/share/.config/yarn/global/node_modules/test-crawler
> node ../isomor-server/dist/bin/server.js

• info Starting server.
• info Created endpoints: [
  '/isomor/test-crawler/server-service/getSettings',
  '/isomor/test-crawler/server-service/getCrawlers',
  '/isomor/test-crawler/server-service/loadPresets',
  '/isomor/test-crawler/server-service/saveAndStart',
  '/isomor/test-crawler/server-service/getCrawler',
  '/isomor/test-crawler/server-service/getPages',
  '/isomor/test-crawler/server-service/getPins',
  '/isomor/test-crawler/server-service/getPin',
  '/isomor/test-crawler/server-service/setCode',
  '/isomor/test-crawler/server-service/getCode',
  '/isomor/test-crawler/server-service/getCodes',
  '/isomor/test-crawler/server-service/getThumbnail',
  '/isomor/test-crawler/server-service/pin',
  '/isomor/test-crawler/server-service/setZoneStatus',
  '/isomor/test-crawler/server-service/setZonesStatus',
  '/isomor/test-crawler/server-service/setStatus'
]
• success Server listening on port 3005!
• info Find API documentation at http://127.0.0.1:3005/api-docs
• GET / 404 4.235 ms - 139

And in the browser it states Cannot GET /

Is there some installation step missing, or did some breakage possibly happen with the code?

Error upon fresh start

Using npm version 1.0.0-alpha.0. Startup works fine, but once I open the browser, it spits out two errors, one for the path argument type and another for apparently not finding projects.

Log of those:

Start test crawler

> [email protected] isomor:server /usr/local/share/.config/yarn/global/node_modules/test-crawler
> node ../isomor-server/dist/bin/server.js

• info Starting server.
• info Created endpoints: [
  '/isomor/test-crawler/server-service/getSettings',
  '/isomor/test-crawler/server-service/getCrawlers',
  '/isomor/test-crawler/server-service/loadProject',
  '/isomor/test-crawler/server-service/loadProjects',
  '/isomor/test-crawler/server-service/saveProject',
  '/isomor/test-crawler/server-service/getCrawler',
  '/isomor/test-crawler/server-service/getPages',
  '/isomor/test-crawler/server-service/getPins',
  '/isomor/test-crawler/server-service/getPin',
  '/isomor/test-crawler/server-service/setCode',
  '/isomor/test-crawler/server-service/getCode',
  '/isomor/test-crawler/server-service/getCodes',
  '/isomor/test-crawler/server-service/getThumbnail',
  '/isomor/test-crawler/server-service/pin',
  '/isomor/test-crawler/server-service/setZoneStatus',
  '/isomor/test-crawler/server-service/setZonesStatus',
  '/isomor/test-crawler/server-service/setStatus',
  '/isomor/test-crawler/server-service/startCrawlerFromProject',
  '/isomor/test-crawler/server-service/startCrawlers'
]
• info Add static folder /usr/local/share/.config/yarn/global/node_modules/test-crawler/../test-crawler/build
• success Server listening on port 3005!
• info Find API documentation at http://127.0.0.1:3005/api-docs
• GET / 304 5.753 ms - -
• GET /static/css/main.1b7acf8b.chunk.css 304 3.890 ms - -
• GET /static/css/2.ff8bd605.chunk.css 304 4.144 ms - -
• GET /static/js/2.54874c91.chunk.js 304 3.331 ms - -
• GET /static/js/main.10031846.chunk.js 304 0.901 ms - -
• ERR TypeError [ERR_INVALID_ARG_TYPE]: The "path" argument must be of type string. Received type object
    at validateString (internal/validators.js:112:11)
    at Object.join (path.js:1039:7)
    at CrawlerProvider.<anonymous> (/usr/local/share/.config/yarn/global/node_modules/test-crawler/dist-server/server/lib/index.js:179:42)
    at Generator.next (<anonymous>)
    at /usr/local/share/.config/yarn/global/node_modules/test-crawler/dist-server/server/lib/index.js:8:71
    at new Promise (<anonymous>)
    at __awaiter (/usr/local/share/.config/yarn/global/node_modules/test-crawler/dist-server/server/lib/index.js:4:12)
    at CrawlerProvider.getAllCrawlers (/usr/local/share/.config/yarn/global/node_modules/test-crawler/dist-server/server/lib/index.js:178:16)
    at Object.getCrawlers (/usr/local/share/.config/yarn/global/node_modules/test-crawler/dist-server/server/service.js:21:28)
    at Object.<anonymous> (/usr/local/share/.config/yarn/global/node_modules/isomor-server/dist/lib/entrypoint.js:49:61) {
  code: 'ERR_INVALID_ARG_TYPE'
}
• GET /isomor/test-crawler/server-service/getCrawlers 500 5.700 ms - 64
• ERR [Error: ENOENT: no such file or directory, open '/home/pptruser/app/base/project.json'] {
  errno: -2,
  code: 'ENOENT',
  syscall: 'open',
  path: '/home/pptruser/app/base/project.json'
}
• GET /isomor/test-crawler/server-service/loadProjects 500 12.623 ms - 78
• GET /manifest.json 304 0.581 ms - -

Unable to run dev version

I followed the steps for setting up the dev version.

When I run yarn dev I get in the 3rd window:
./src/pin/Pins.tsx
Cannot find file: 'Search.ts' does not match the corresponding name on disk: './src/search/search.ts'.

I tried installing the npm package: "case-sensitive-paths-webpack-plugin"

But I still get the same error.

Using node v11.5.0 on macOS Mojave

Not able to run chrome crawler

When I set chrome-selenium in the project settings, I get an error:


Error: Command failed: cd /Users/damian/.config/yarn/global/node_modules/test-crawler && ISOMOR_DIST_SERVER_FOLDER=/Users/damian/.config/yarn/global/node_modules/test-crawler/../test-crawler/dist-server ISOMOR_STATIC_FOLDER=/Users/damian/.config/yarn/global/node_modules/test-crawler/../test-crawler/build npm run isomor:server
    at checkExecSyncError (child_process.js:631:11)
    at execSync (child_process.js:668:15)
    at startTestCrawler (/Users/damian/.config/yarn/global/node_modules/test-crawler/test-crawler.js:22:5)
    at Object.<anonymous> (/Users/damian/.config/yarn/global/node_modules/test-crawler/test-crawler.js:15:5)
    at Module._compile (internal/modules/cjs/loader.js:759:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:770:10)
    at Module.load (internal/modules/cjs/loader.js:628:32)
    at Function.Module._load (internal/modules/cjs/loader.js:555:12)
    at Function.Module.runMain (internal/modules/cjs/loader.js:824:10)
    at internal/main/run_main_module.js:17:11

selenium-server and chromedriver installed globally
Test-crawler v3.5.6

yarn global add test-crawler issue?

[4/4] ⠈ test-crawler
[-/4] ⠈ waiting...
[3/4] ⠈ es5-ext
error E:\soft\tools\meta\test\node_modules\test-crawler: Command failed.
Exit code: 1
Command: test-crawler-driver-manager '[{"type":"Chrome"},{"type":"Gecko"},{"type":"IE"}]' node_modules/.bin/
Arguments:
Directory: E:\soft\tools\meta\test\node_modules\test-crawler
Output:
• info Test-crawler driver manager
undefined:1
'[{type:Chrome},{type:Gecko},{type:IE}]'
^

SyntaxError: Unexpected token ' in JSON at position 0
at JSON.parse ()
at E:\soft\tools\meta\test\node_modules\test-crawler-driver-manager\cli\cli.js:29:28
at Generator.next ()
at E:\soft\tools\meta\test\node_modules\test-crawler-driver-manager\cli\cli.js:9:71
at new Promise ()
at __awaiter (E:\soft\tools\meta\test\node_modules\test-crawler-driver-manager\cli\cli.js:5:12)
at start (E:\soft\tools\meta\test\node_modules\test-crawler-driver-manager\cli\cli.js:16:12)
at Object. (E:\soft\tools\meta\test\node_modules\test-crawler-driver-manager\cli\cli.js:35:1)
