18f / concourse-compliance-testing

Concourse CI assets for Compliance Toolkit

Home Page: https://compliance-viewer.18f.gov/

License: Other

Shell 5.83% Ruby 94.17%

concourse-compliance-testing's Introduction

Concourse CI Compliance Testing

This is a Concourse pipeline that scans sites for vulnerabilities using OWASP ZAP. This is part of 18F's Compliance Toolkit project, and is essentially the back end of Compliance Viewer.

Adding a Project

The file config/targets.json lists the projects to be scanned. Since ZAP can inject junk data into a site when it successfully finds certain vulnerabilities, we suggest using a staging URL. To get a new project added:

  1. Submit a pull request to this repository to add an entry in config/targets.json like this:

    {
      // Needs to be all lower-case.
      "name": "NAME",
      // (optional) Channel in the 18F Slack to get notifications in.
      "slack_channel": "CHANNEL",
      // Links to scan.
      "links": [
        {
          "url": "URL"
        }
      ]
    }
  2. After the pull request is merged, ask someone in #cloud-gov-highbar to run

    TARGET=<fly_target> rake init_targets
    TARGET=<fly_target> rake deploy

Attributes

  • name - This should be all lowercase.
  • slack_channel (optional) - This should be the channel where you'd like to get alerts for completed scans. If left out, the alerts will be sent to the default channel, currently #ct-bot-attack.
  • links - An array of links that should be scanned with ZAP. The results will be concatenated together.
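
The attribute rules above can be checked mechanically. Below is a minimal sketch of a validator for a single targets.json entry; `validate_target` is a hypothetical helper written for this illustration, not part of the repository.

```ruby
require "json"

# Hypothetical validator for one targets.json entry, following the
# documented attributes: lowercase name, optional slack_channel, and a
# non-empty links array of {"url": ...} objects.
def validate_target(entry)
  errors = []
  name = entry["name"].to_s
  errors << "name is required" if name.empty?
  errors << "name must be all lowercase" unless name == name.downcase
  links = entry["links"]
  if links.is_a?(Array) && !links.empty?
    links.each { |l| errors << "each link needs a url" unless l.is_a?(Hash) && l["url"] }
  else
    errors << "links must be a non-empty array"
  end
  errors
end

entry = JSON.parse('{"name": "example", "links": [{"url": "https://staging.example.gov/"}]}')
validate_target(entry) # => []
```

Running this in a pull-request check would catch the most common mistakes (uppercase names, missing links) before a deploy.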

Process Overview

Inputs

The running pipeline depends on this repository for the tasks to be performed and targets to scan. By default, the pipeline pulls the master branch for these tasks, but it can be pointed at a different branch for testing.

Outputs

Normal users of Compliance Toolkit do not need access to the Concourse CI. The pipeline publishes output in a few different modes.

Primarily, the pipeline publishes the ZAP scan results as a JSON file to S3. This is the information that is consumed by the user via Compliance Viewer.

The pipeline also publishes two types of Slack notifications. The first is a heartbeat notification; it is published to a central channel (currently #ct-bot-attack, but configurable in the pipeline) after every run to confirm that the run happened. This lets the Compliance Toolkit team monitor that the process is functioning.

The second is for the project teams. It is published to the channel defined in targets.json, or the central channel (as the above notifications) if no channel is defined. It is only published if there is a change in the results. It also includes a link to the results in Compliance Viewer.
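
The routing just described can be sketched in a few lines of Ruby. This is illustrative only; `notifications` and its message-hash format are invented here and are not the pipeline's actual code.

```ruby
DEFAULT_CHANNEL = "#ct-bot-attack" # the central channel named above

# Heartbeat always goes to the central channel; the project notification
# goes to the project's channel (or the central one if none is defined)
# and only fires when the results changed.
def notifications(target, previous_results, current_results)
  messages = [{ channel: DEFAULT_CHANNEL, type: :heartbeat }]
  if previous_results != current_results
    messages << { channel: target["slack_channel"] || DEFAULT_CHANNEL, type: :notification }
  end
  messages
end
```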

Process

For each project, there are two jobs defined: a scheduled job and an on-demand job. This is due to an oddity in the way Concourse jobs are triggered: if a job has a time-based trigger, it cannot also be run at another time. The scheduled job runs every day at midnight. All of the project scans are triggered simultaneously, but there are a limited number of workers available, so scans queue until a worker becomes free.

Each scan is a multi-step process:

  1. Triggered at 12:00 AM.
  2. Retrieves scripts to run from the GitHub repository.
  3. Retrieves the prior scan results from S3.
  4. Performs some filtering/scrubbing of the prior scan results.
  5. Runs the ZAP scan via zap-cli. The scan has several sub-steps of its own:
    1. Runs the spider against the current target.
    2. Runs the AJAX spider against the current target.
    3. Scans the target.
    4. Outputs the detected alerts.
  6. Repeats the sub-steps above for every target defined for the project in targets.json.
  7. Concatenates the results files into a single file.
  8. Uploads the results file to S3.
  9. Summarizes the results and the difference between the prior and current scan.
  10. Posts the two Slack messages (heartbeat and notification, described above).
  11. Uploads the summary results to S3.

These steps are performed for each project in a parallelized fashion.
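
Steps 7 and 9 above (concatenating per-target results and diffing against the prior scan) can be sketched roughly as follows. The alert fields used for matching are assumptions for illustration, not the pipeline's actual schema.

```ruby
# Concatenate per-target alert lists into one results set (step 7).
def concatenate(result_sets)
  result_sets.flatten
end

# Summarize the difference between the prior and current scan (step 9),
# matching alerts by name and URL -- an assumed key, for illustration.
def summarize_diff(prior, current)
  key = ->(a) { [a["alert"], a["url"]] }
  { new: (current.map(&key) - prior.map(&key)).size,
    resolved: (prior.map(&key) - current.map(&key)).size }
end
```

A non-empty `new` count is what would trigger the project-channel Slack notification described above.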

Feedback

Give us your feedback! We'd love to hear it. Open an issue and tell us what you think.

Public domain

This project is in the worldwide public domain. As stated in CONTRIBUTING:

This project is in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the CC0 1.0 Universal public domain dedication.

All contributions to this project will be released under the CC0 dedication. By submitting a pull request, you are agreeing to comply with this waiver of copyright interest.

concourse-compliance-testing's People

Contributors

adborden, adelevie, afeld, ctro, davidebest, dlapiduz, gbinal, jcscottiii, jeremiak, jmcarp, jseppi, juliaelman, linuxbozo, mogul, mzia, rogeruiz, shawnbot, stvnrlly


concourse-compliance-testing's Issues

Move to a new CI/CD system

This repository has been using Travis CI; however, access to Travis CI is being turned off. When you need to use this repository again, convert it to a CI/CD system that is in the ITSP, such as CircleCI.

remove dependence on Team API

A few issues:

  • Team API development has been stalled for a while (for various reasons), and the data isn't getting updated.
  • We are overriding the Team API data for most entries in targets.json, so there's cognitive and technical complexity in merging the data.
  • The Team API isn't relevant for projects not created/maintained by 18F, so there's diminishing benefit as we open up Compliance Toolkit to a broader range of users.
  • This project will likely need to point to staging URLs rather than production (as scanning sites can be destructive/disruptive), so Compliance Toolkit will want a different list of URLs than the ones teams provide for the Team API.

@mtorres253 FYI!

shrink the pipeline

Short of Concourse having additional features to handle what I'm calling multi-object builds, having the large number of groups in a single pipeline is no longer working; for starters, the group names no longer wrap, so the ones off to the right are no longer clickable.

(Screenshot from 2017-01-12: group names overflowing off the right edge of the pipeline UI.)

Since Concourse now supports teams (and #127 will recommend having a dedicated team for these scans), I suggest we instead generate one pipeline per site, rather than one group per site. There are tradeoffs (like not being able to easily pause all at once), but I think it's worthwhile.

look into failing builds for new projects

Seems to be consistent across all new projects after following the new project steps. In filter-project-data:

/tmp/build/af6792a7/scripts/lib/team_data_filterer.rb:32:in `initialize': No such file or directory @ rb_sysopen - /tmp/build/af6792a7/project-data/project.json (Errno::ENOENT)
    from /tmp/build/af6792a7/scripts/lib/team_data_filterer.rb:32:in `new'
    from /tmp/build/af6792a7/scripts/lib/team_data_filterer.rb:32:in `read_json'
    from ./scripts/tasks/filter-project-data/task.rb:6:in `<main>'

e.g. https://ci.cloud.gov/pipelines/zap/jobs/zap-ondemand-cg-landing/builds/2

/cc @dlapiduz
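
One way to make the failure above less opaque (a sketch, not the actual fix for this issue) is to guard the read and fall back to an empty document when project-data/project.json is missing:

```ruby
require "json"

# Defensive variant of reading project.json: a missing file yields an
# empty hash instead of crashing with Errno::ENOENT.
def read_json(path)
  return {} unless File.exist?(path)
  JSON.parse(File.read(path))
end
```

Whether to fall back silently or fail with a clearer error message is a judgment call; the point is to handle the missing input deliberately rather than let the exception surface from deep inside the task.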

filter-team-data is case sensitive, but it shouldn't be?

I added c2 to targets.json, which caused me to get

WARN: `c2` is missing from Team API data.

in Concourse.

The project's name in the Team API is C2. Confusingly (but only tangentially related), https://team-api.18f.gov/public/api/projects/C2/ returns a 404, but https://team-api.18f.gov/public/api/projects/c2/ returns successfully.
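
A case-insensitive lookup on the Team API side would avoid the warning. A minimal sketch, with an assumed data shape:

```ruby
# Match a targets.json name against Team API project entries without
# regard to case, so "c2" finds the project named "C2".
def find_project(team_projects, name)
  team_projects.find { |p| p["name"].to_s.downcase == name.downcase }
end
```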

clean up repository

  • Remove unused tasks/pipelines
  • Put tasks/pipelines into respective folders

@DavidEBest Happy to help with this – any quick pointers on which aren't needed?

test the tests

Would be really nice to be able to automatically verify that a pipeline works in a pull request (and have the commit status updated), without needing to take the contributor's word for it or pulling down and running the pipeline locally. Not sure how difficult this would be to do.

atf-eregs scan hangs creating a new ZAP session.

We saw this problem before, when we were relying on the new session command to loop over all the projects. When I ran fly intercept I found that it was stuck in ZAP, attempting to create a new session.

The ATF project consists of two VERY LARGE sites. I suspect there might be a bug in the new-session creation code when there is a lot of data in the previous session. A possible workaround (and test) would be to remove the new-session mechanism and just recreate the ZAP instance for every site. That incurs a greater startup penalty, but that is less of a concern now that we aren't looping over all the projects within a single session.

http://ci-tooling.cloud.gov/pipelines/zap/jobs/zap-scheduled-atf-eregs/builds/1

https://trello.com/c/od2IZwIu/134-atf-eregs-scan-hangs-creating-a-new-zap-session
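
The suggested workaround (a fresh ZAP instance per site instead of `session new`) could look roughly like this. The `run` hook is introduced here for illustration, and the exact zap-cli flags in the real pipeline may differ.

```ruby
# Sketch: restart ZAP for every site rather than reusing one long-lived
# instance with `session new`. `run` defaults to shelling out to zap-cli.
def scan_site(url, run: ->(*cmd) { system("zap-cli", *cmd) })
  run.("start")
  run.("spider", url)
  run.("ajax-spider", url)
  run.("active-scan", url)
  run.("alerts")
ensure
  run.("shutdown") # pay the startup cost again for the next site
end
```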

rename repository?

This repository is difficult to remember/find...maybe something like compliance-viewer-pipeline would make more sense? Or maybe we should actually merge this into https://github.com/18F/compliance-viewer? That being said, "Compliance Viewer" could be more aptly named "Compliance Checker" or something...

Create test-only rubocop config

The rubocop config we have is good, but too strict in some ways for running on test code. We should create a more relaxed version for running on the tests.

reduce false positives in uptime-check

Currently, that task reports "No links" for a number of projects that we want to ignore:

No `links` for about_yml.
No `links` for authdelegate.
No `links` for 18f-identity.
No `links` for dodsbir-scrape.
No `links` for fec-cms.
No `links` for fec-style.
No `links` for hmacauth.
No `links` for laptop.
No `links` for openFEC-web-app.
No `links` for SBIR-EZ.
No `links` for sbirez.
No `links` for team_api.
No `links` for uscis.
https://team-browser.18f.gov/ is NOT up.
https://18f.gsa.gov/team-browser/ is NOT up.
https://www.google.com/calendar/embed?src=gsa.gov_0samf7guodi7o2jhdp0ec99aks%40group.calendar.google.com&ctz=America/Los_Angeles is NOT up.
https://team-browser.18f.gov/team-browser/ is NOT up.

Issues identified in that list:

  • SBIR-EZ is a defunct project.
    • Is there a good way to indicate this in the about.yml?
  • about_yml and laptop don't have standalone sites—should links include a reference to the repository? We might want to say explicitly one way or another in the about.yml instructions.

Ideas for improvement:

  • Reach out to projects/teams where the about.yml seems out-of-date
  • Filter by status
    • I see this in some of the entries in /projects, but it's not listed in the instructions.
  • Filter by ____?
    • My initial thinking was "oh, there's just some field

Resources:

@mtorres253 @ertzeid @ccostino The background here is that I'm pulling a list of URLs from the links provided in the Projects list of the Team API, for automatically checking if the sites are up, doing security scans, etc. Any advice on how to do better filtering of that list?
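
As one concrete form of "filter by status", the uptime check could skip entries with no links and entries whose status marks them inactive. Both field names and the status vocabulary below are assumptions about the data, not the Team API's documented schema.

```ruby
SKIP_STATUSES = ["defunct", "decommissioned"] # assumed vocabulary

# Keep only projects worth checking: a non-empty links array and a
# status that isn't a known "don't scan" value.
def checkable(projects)
  projects.select do |p|
    links = p["links"]
    links.is_a?(Array) && !links.empty? && !SKIP_STATUSES.include?(p["status"])
  end
end
```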

Add cloud.gov sites to scheduled scans

Can we add:

  • cloud.gov
  • login.cloud.gov / uaa.cloud.gov
  • docs.cloud.gov
  • logs.cloud.gov
  • ci.cloud.gov
  • api.cloud.gov
  • console.cloud.gov
  • metrics.cloud.gov
  • community.cloud.gov

to the regular scans?

Thanks!

Bug: No such file or directory - project-data/project.json

filter-project-data step seems to be broken for all jobs 😢

https://ci.cloud.gov/pipelines/zap/jobs/zap-scheduled-cg-landing/builds/14

/tmp/build/af6792a7/scripts/lib/team_data_filterer.rb:32:in `initialize': No such file or directory @ rb_sysopen - /tmp/build/af6792a7/project-data/project.json (Errno::ENOENT)
    from /tmp/build/af6792a7/scripts/lib/team_data_filterer.rb:32:in `new'
    from /tmp/build/af6792a7/scripts/lib/team_data_filterer.rb:32:in `read_json'
    from ./scripts/tasks/filter-project-data/task.rb:6:in `<main>'
