google / security-crawl-maze


Security Crawl Maze is a comprehensive testbed for web security crawlers. It contains pages representing many ways in which one can link resources from a valid HTML document.

License: Apache License 2.0

Dockerfile 3.37% Python 21.52% HTML 53.00% TypeScript 7.10% JavaScript 15.01%

security-crawl-maze's Introduction

Security Crawl Maze

Security Crawl Maze is a comprehensive testbed for web security crawlers. It contains pages representing many (hopefully all) ways in which one can link resources from a valid HTML document. A list of all the cases covered by Security Crawl Maze can be found below.

Crawling vs Security Crawling

Security crawlers are interested in different findings than regular web crawlers. They aim to maximize code coverage rather than content coverage. This application is meant to provide a unified and extensive way of testing the efficiency of web security crawlers. The first release contains only static linking of resources from HTML documents, but future development will focus on adding more complex cases such as Single Page Applications (Angular, Polymer), dynamically generated content (blogs, e-commerce systems) and many others.

Run / deploy the application

NOTE: Test cases for JS frameworks have to be built and bundled in order to work. If you use Docker, everything is automated. However, if you don't, you will have to build the projects manually.

The primary goal was to be able to run and deploy the app easily in any environment. Therefore, we provide a Dockerfile which enables you to deploy it to the cloud provider of your choice. For local development or testing you can also get the app up and running quickly, either in a local container or as a Python Flask app. Please find the instructions below.

Run locally in a container

  1. pull the project and enter the project's directory
  2. build the docker image: docker build -t crawlmaze .
  3. run the image and expose port 80: docker run -p 80:8080 --name crawlmaze crawlmaze
  4. to remove the container: docker rm -f crawlmaze

Run locally as a Flask app (Does not support JS frameworks)

  1. pull the project and enter the project's directory
  2. install pip dependencies: pip install -r requirements.txt
  3. run the app: python app.py

Deploy to Google Cloud/AWS/Azure

Run on Google Cloud or use the Dockerfile in the root directory to build a container image and deploy it to any other cloud of your choice.

Use public version

There is a publicly available instance of the application running at http(s)://security-crawl-maze.app.

Conventions

Naming

  • The HTML folder contains directories named after tag names, e.g. html/body/a contains tests for an <a> tag located in the HTML body element.
  • HTML files are usually named after the attribute that links a resource, e.g. html/body/a/href.html contains one test case for an href attribute inside an <a> tag.
  • Nested tags are placed in nested folders, e.g. html/body/form/button contains tests for a <button> tag placed inside a <form> tag.
  • Resources that are expected to be found by crawlers end with the '.found' suffix, e.g. html/body/a/href.html contains a link to http://<HOST>/html/body/a/href.found. This makes it easy to identify a test case that your crawler did not find.
  • Files without extensions under the test-cases/ directory are required so that links to API endpoints are generated.
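
The '.found' convention makes it straightforward to pull test-case hits out of a crawler's output. The following is a minimal sketch, assuming your crawler produces a flat list of absolute or relative URLs. The helper name found_paths is hypothetical and not part of the project.

```python
from urllib.parse import urlsplit


def found_paths(crawled_urls):
    """Return the sorted, de-duplicated URL paths that end with '.found'.

    Hypothetical helper for illustration; not part of Security Crawl Maze.
    """
    paths = set()
    for url in crawled_urls:
        # Keep only the path: drop scheme, host, query string and fragment.
        path = urlsplit(url).path
        if path.endswith(".found"):
            paths.add(path)
    return sorted(paths)


if __name__ == "__main__":
    sample = [
        "http://localhost/html/body/a/href.html",
        "http://localhost/html/body/a/href.found",
        "/html/body/form/action.found?x=1",
    ]
    print(found_paths(sample))
```

Comparing paths rather than full URLs keeps the result independent of the host the maze is deployed on.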

Expected results API endpoint

The application exposes an API endpoint that you can use to fetch a set of URLs that are expected to be found by crawlers. It is located under:

http://<HOST>/fetch-expected-results?path=<PATH>

where <PATH> is the starting URL of the crawl, e.g.

http://<HOST>/fetch-expected-results?path=/html/body/form

returns:

[
    "/test/html/body/form/action.found",
    "/test/html/body/form/button/formaction.found"
]
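
As a rough illustration of how the endpoint might be used to score a crawler, the sketch below builds the request URL, fetches the expected list and reports which entries the crawler missed. It assumes the response is a JSON array of paths as in the example above. The function names (expected_results_url, fetch_expected, missing) and the localhost host are assumptions made for this example, not part of the project.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def expected_results_url(host, path):
    """Build the endpoint URL for a crawl starting at `path`."""
    # safe="/" keeps slashes unescaped, matching the URL form shown above.
    query = urlencode({"path": path}, safe="/")
    return f"http://{host}/fetch-expected-results?{query}"


def fetch_expected(host, path):
    """Fetch the expected resources (assumed to be a JSON array of paths)."""
    with urlopen(expected_results_url(host, path)) as resp:
        return json.load(resp)


def missing(expected, crawled_paths):
    """Return expected entries absent from the crawler's results, in order."""
    seen = set(crawled_paths)
    return [p for p in expected if p not in seen]


if __name__ == "__main__":
    # Offline demonstration using the example response from this README.
    expected = [
        "/test/html/body/form/action.found",
        "/test/html/body/form/button/formaction.found",
    ]
    crawled = ["/test/html/body/form/action.found"]
    print(expected_results_url("localhost", "/html/body/form"))
    print(missing(expected, crawled))
```

Against a running instance, expected = fetch_expected("localhost", "/html/body/form") would replace the hard-coded list. The crawled paths need to be in the same form as the entries the endpoint returns (the example response above carries a /test prefix) for the comparison to be meaningful.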

Test cases

Implemented test cases (resources to be found) are available in the blueprints/utils/resources/expected-results.json file.

Adding a test case

  1. Create a file for your test case and place it in an appropriate directory.
    • If your test content is generated dynamically by an API endpoint, add a file without an extension (e.g. test-cases/headers/link). This is to make sure the link to the test case is generated and is discoverable by crawlers.
    • If you're NOT creating any new child folder in the test-cases/ directory, go to step 2.
    • Otherwise, you have to add a new blueprint directory with all the relevant components. You can reuse the structure of the existing blueprints.
  2. Add the record that is to be found to the blueprints/utils/resources/expected_results.json file.
  3. Test your crawler with the new test case!
  4. Before creating a PR, make sure your code follows the Google Python Language Rules.

Credits

Many of the test cases were borrowed from HTTPLeaks, a document by cure53.

License information

See the LICENSE file.

security-crawl-maze's People

Contributors

dacappo, hjkeller16, karthikuj, knoxdev, mariussteffens, mtrzos, psiinon, pyneda, security-automation-team, suskind


security-crawl-maze's Issues

Problem with JavaScript Framework pages?

I don't think the JavaScript Framework pages are set up correctly.

  1. Launch a browser and open https://security-crawl-maze.app/javascript/frameworks/
  2. Click on the angular/ link (the others act in the same way)
  3. Click on the routerLink navigation link
  4. Click on the button event handler navigation button

In these cases the following URLs are put into the URL bar:

In neither case are any requests made by the browser. I checked this in both Chrome and Firefox, using dev tools and ZAP.

In any case the above URLs both return "Not Found" - the correct URLs to detect are:

So it looks like there's a missing test/ at the start of the URL paths.

Is this a bug or am I misunderstanding how these tests work?

Issue building docker image due to no version pinning of Angular CLI

I ran into an issue building the docker image. It looks like the cause is that this line of the Dockerfile does not pin a version:

RUN npm install -g @angular/cli

I was getting this error

Text of the error:

docker build -t crawlmaze .

[+] Building 109.8s (24/37)                                                                                          docker:desktop-linux
 => [internal] load build definition from Dockerfile                                                                                 0.0s
 => => transferring dockerfile: 2.68kB                                                                                               0.0s
 => [internal] load .dockerignore                                                                                                    0.0s
 => => transferring context: 2B                                                                                                      0.0s
 => [internal] load metadata for docker.io/library/alpine:3.9                                                                        4.3s
 => [internal] load metadata for docker.io/library/node:14-alpine                                                                    4.1s
 => [auth] library/node:pull token for registry-1.docker.io                                                                          0.0s
 => [auth] library/alpine:pull token for registry-1.docker.io                                                                        0.0s
 => [builder  1/16] FROM docker.io/library/node:14-alpine@sha256:434215b487a329c9e867202ff89e704d3a75e554822e07f3e0c0f9e606121b33    5.3s
 => => resolve docker.io/library/node:14-alpine@sha256:434215b487a329c9e867202ff89e704d3a75e554822e07f3e0c0f9e606121b33              0.0s
 => => sha256:434215b487a329c9e867202ff89e704d3a75e554822e07f3e0c0f9e606121b33 1.43kB / 1.43kB                                       0.0s
 => => sha256:3380ed827b250d2db2bd38c15f090af2b303b3c9ebb42f5927cb5b9adeea7a6e 1.16kB / 1.16kB                                       0.0s
 => => sha256:d561716d42b6a03639723c382c62977347ed62f13925e836614583253308e6c3 6.45kB / 6.45kB                                       0.0s
 => => sha256:c41833b44d910632b415cd89a9cdaa4d62c9725dc56c99a7ddadafd6719960f9 3.26MB / 3.26MB                                       0.9s
 => => sha256:683339ce8d6b9be2ca150a8de67b895e20ea5594b91d3911c95b0b8fea3e314c 36.99MB / 36.99MB                                     4.0s
 => => sha256:4cf6a83c0e2af3c780abcda02cc33f9e812fdcb40b610ed1838281cc9ab94ec8 2.43MB / 2.43MB                                       1.5s
 => => extracting sha256:c41833b44d910632b415cd89a9cdaa4d62c9725dc56c99a7ddadafd6719960f9                                            0.1s
 => => sha256:686172e40c38722891b4004f55f6447548c8367968ac523a612591e0d92f9db3 447B / 447B                                           1.3s
 => => extracting sha256:683339ce8d6b9be2ca150a8de67b895e20ea5594b91d3911c95b0b8fea3e314c                                            1.2s
 => => extracting sha256:4cf6a83c0e2af3c780abcda02cc33f9e812fdcb40b610ed1838281cc9ab94ec8                                            0.0s
 => => extracting sha256:686172e40c38722891b4004f55f6447548c8367968ac523a612591e0d92f9db3                                            0.0s
 => [internal] load build context                                                                                                    0.0s
 => => transferring context: 107.14kB                                                                                                0.0s
 => [stage-1  1/14] FROM docker.io/library/alpine:3.9@sha256:414e0518bb9228d35e4cd5165567fb91d26c6a214e9c95899e1e056fcd349011        0.8s
 => => resolve docker.io/library/alpine:3.9@sha256:414e0518bb9228d35e4cd5165567fb91d26c6a214e9c95899e1e056fcd349011                  0.0s
 => => sha256:414e0518bb9228d35e4cd5165567fb91d26c6a214e9c95899e1e056fcd349011 1.64kB / 1.64kB                                       0.0s
 => => sha256:f920ccc826134587fffcf1ddc6b2a554947e0f1a5ae5264bbf3435da5b2e8e61 528B / 528B                                           0.0s
 => => sha256:9afdd4a290bf60cc642c5c85a91da9e08d3908d16b8fc96b8efd65716c02f0bb 1.51kB / 1.51kB                                       0.0s
 => => sha256:941f399634ec37b35e6764d0e6cf350593652f06f76586d45ddfc0d77de7a701 2.69MB / 2.69MB                                       0.7s
 => => extracting sha256:941f399634ec37b35e6764d0e6cf350593652f06f76586d45ddfc0d77de7a701                                            0.0s
 => [stage-1  2/14] RUN apk add --no-cache python3 &&     python3 -m ensurepip &&     rm -r /usr/lib/python*/ensurepip &&     pip3  16.1s
 => [builder  2/16] RUN npm config set unsafe-perm true                                                                              0.3s
 => [builder  3/16] RUN npm install -g @angular/cli                                                                                 34.1s
 => [stage-1  3/14] COPY requirements.txt /usr/src/app/                                                                              0.0s
 => [stage-1  4/14] RUN pip install -r /usr/src/app/requirements.txt                                                                 4.3s
 => [stage-1  5/14] COPY app.py /usr/src/app/                                                                                        0.0s
 => [stage-1  6/14] COPY blueprints /usr/src/app/blueprints                                                                          0.0s
 => [stage-1  7/14] COPY templates /usr/src/app/templates                                                                            0.0s
 => [stage-1  8/14] COPY test-cases /usr/src/app/test-cases                                                                          0.0s
 => [stage-1  9/14] RUN rm -rf /usr/src/app/test-cases/javascript/frameworks/angular/*                                               0.1s
 => [builder  4/16] RUN npm install -g polymer-cli                                                                                  31.7s
 => [builder  5/16] COPY test-cases/javascript/frameworks/angular /tmp/angular                                                       0.0s
 => [builder  6/16] WORKDIR /tmp/angular                                                                                             0.0s
 => [builder  7/16] RUN npm install                                                                                                 33.9s
 => ERROR [builder  8/16] RUN ng build --prod --baseHref=/javascript/frameworks/angular/                                             0.1s
------
 > [builder  8/16] RUN ng build --prod --baseHref=/javascript/frameworks/angular/:
0.124 Node.js version v14.21.3 detected.
0.124 The Angular CLI requires a minimum Node.js version of v18.13.
0.124
0.124 Please update your Node.js version or visit https://nodejs.org/ for additional instructions.
0.124
------
Dockerfile:18
--------------------
  16 |     WORKDIR /tmp/angular
  17 |     RUN npm install
  18 | >>> RUN ng build --prod --baseHref=/javascript/frameworks/angular/
  19 |
  20 |     # Build Polymer app.
--------------------
ERROR: failed to solve: process "/bin/sh -c ng build --prod --baseHref=/javascript/frameworks/angular/" did not complete successfully: exit code: 3

View build details: docker-desktop://dashboard/build/desktop-linux/desktop-linux/c2s3yrfszkrejvql1od0wla56

Fix

To fix it, I had to change that line to install a pinned version of Angular CLI that supports Node 14.

Here is what it looks like when I build the image after the fix:

(screenshot of the build output after the fix)

I'll submit this as a PR shortly.

Docker build doesn't work

I was looking to see if the docker version had the same problem as #8 but it doesn't build :(
Used: docker build -t crawlmaze .

 => [builder  7/16] RUN npm install                                                                                                                       34.7s 
 => ERROR [builder  8/16] RUN ng build --prod --baseHref=/javascript/frameworks/angular/                                                                   0.3s 
------                                                                                                                                                          
 > [builder  8/16] RUN ng build --prod --baseHref=/javascript/frameworks/angular/:                                                                              
#24 0.264 Node.js version v10.24.1 detected.                                                                                                                    
#24 0.264 The Angular CLI requires a minimum Node.js version of either v12.20, v14.15, or v16.10.                                                               
#24 0.264                                                                                                                                                       
#24 0.264 Please update your Node.js version or visit https://nodejs.org/ for additional instructions.
#24 0.264 
------
executor failed running [/bin/sh -c ng build --prod --baseHref=/javascript/frameworks/angular/]: exit code: 3

Is there a published image anywhere?
There doesn't appear to be one on DockerHub: https://hub.docker.com/search?q=crawlmaze
