This project forked from gsa/data.gov

Main repository for the Data.gov Platform

Home Page: https://www.data.gov

License: Other

Python 68.19% Shell 29.35% Makefile 1.77% Dockerfile 0.70%

datagov-deploy

This is the main repository for the Data.gov Platform. We use this repository to track our team's work and for our Ansible playbooks that deploy all the Data.gov site components:

  • www.data.gov (WordPress)
  • catalog.data.gov (CKAN 2.3)
  • inventory.data.gov (CKAN 2.5)
  • labs.data.gov/dashboard (Project Open Data Dashboard)

Additionally, each host is configured with common Services:

  • Baseline OS Hardening
  • GSA IT Security Agents
  • TLS host certificates
  • Postfix email server
  • Filebeat (Logging)
  • New Relic (Infrastructure Monitoring)
  • Trendmicro (OSSEC-HIDS)
  • and more...

See our Roadmap for where we're taking Data.gov.

Environments

Production and staging environments are deployed to FAS Cloud Services (FCS, formerly BSP). Our sandbox environments are provisioned by datagov-infrastructure-live.

GSA VPN access is required to access production and staging.

Environment  Deployed from                 ISP          Jumpbox
mgmt         master                        BSP          datagovjump1m.mgmt-ocsit.bsp.gsa.gov
production   master (manual)               BSP          datagov-jump2p.prod-ocsit.bsp.gsa.gov
staging      release/* or master (manual)  BSP          datagov-jump2d.dev-ocsit.bsp.gsa.gov
sandbox      develop (manual)              AWS sandbox  jump.sandbox.datagov.us
local        feature branches              laptop       localhost
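
As a minimal sketch, the environment-to-jumpbox mapping above can be captured in a small lookup function. The function name jumpbox_for is hypothetical and not part of this repository; the hostnames are copied from the table.

```sh
# Print the jumpbox hostname for an environment (values from the table above).
# jumpbox_for is a hypothetical helper, not something shipped in this repo.
jumpbox_for() {
  case "$1" in
    mgmt)       echo "datagovjump1m.mgmt-ocsit.bsp.gsa.gov" ;;
    production) echo "datagov-jump2p.prod-ocsit.bsp.gsa.gov" ;;
    staging)    echo "datagov-jump2d.dev-ocsit.bsp.gsa.gov" ;;
    sandbox)    echo "jump.sandbox.datagov.us" ;;
    local)      echo "localhost" ;;
    *)          echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

# Example (requires GSA VPN access for production and staging):
#   ssh "$(jumpbox_for staging)"
```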

Usage

All deployments are done from the jumpbox hosts, which are already configured with the requirements.

Running playbooks

Once you're SSH'd into the jumpbox, follow these steps to deploy.

  1. Assume the ubuntu user and start a tmux session to prevent disconnects.

    $ sudo su -l ubuntu
    $ tmux attach
    

    Or if there are no existing tmux sessions, start a new one.

    $ tmux
    
  2. Switch to the datagov-deploy directory.

    $ cd datagov-deploy
    
  3. Check that you are on the correct branch and that it is up to date. The branch depends on the environment you're working with. When doing a release, you should be on release/YYYYMMDD.

    $ git status
    $ git pull --ff-only
    
  4. Update python dependencies.

    $ pipenv sync
    
  5. Update Ansible role dependencies.

    $ pipenv run make vendor
    
  6. Run the playbook from the ansible directory.

    $ cd ansible
    $ pipenv run ansible-playbook site.yml
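
The branch check in step 3 can be scripted. The following is a minimal sketch, assuming release branches are named exactly release/ followed by eight digits; both function names are hypothetical and not part of this repository. Other environments deploy from other branches (see Environments), so the guard only applies when doing a release.

```sh
# Succeed only when the name has the form release/YYYYMMDD (eight digits).
# Hypothetical helper; it checks the shape of the name, not that the date is valid.
is_release_branch() {
  case "$1" in
    release/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical guard: stop before deploying if HEAD is not a release branch.
require_release_branch() {
  branch="$(git rev-parse --abbrev-ref HEAD)" || return 1
  if is_release_branch "$branch"; then
    echo "on $branch"
  else
    echo "expected release/YYYYMMDD, found $branch" >&2
    return 1
  fi
}

# Example, run from the datagov-deploy directory before step 4:
#   require_release_branch && git pull --ff-only
```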
    

Common plays

These commands assume you've activated the virtualenv with pipenv shell. Alternatively, prefix each command with pipenv run, e.g. pipenv run ansible.

Deploy the entire Platform, including Applications, into a consistent state.

$ ansible-playbook site.yml

If the playbooks failed to apply to a few hosts, you can address the failures and then retry with the --limit parameter and the retry file.

$ ansible-playbook site.yml --limit @site.retry

Or use --limit if you just want to focus on a single host or group.

$ ansible-playbook site.yml --limit catalog-web

Deploy the Catalog application.

$ ansible-playbook catalog.yml

Reboot, one by one, any hosts that require it, e.g. after an apt-get dist-upgrade.

$ ansible-playbook actions/reboot.yml

Force a reboot even if no reboot is required. Use this if you just need to reboot hosts for any reason.

$ ansible-playbook actions/reboot.yml -e '{"force_reboot": true}' --limit ${host}

Note: for Ansible to parse boolean values in --extra-vars we use the JSON syntax in the above example.
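
Ansible interprets values passed to --extra-vars in key=value form as strings, which is why the example above switches to JSON. As a minimal sketch, a hypothetical helper (bool_extra_var, not part of this repository) can build the JSON payload for a single boolean variable and reject anything that is not a literal true or false:

```sh
# Print a JSON --extra-vars payload for one boolean variable.
# bool_extra_var is a hypothetical helper, shown for illustration only.
bool_extra_var() {
  case "$2" in
    true|false) printf '{"%s": %s}\n' "$1" "$2" ;;
    *) echo "value must be true or false, got: $2" >&2; return 1 ;;
  esac
}

# Example:
#   ansible-playbook actions/reboot.yml -e "$(bool_extra_var force_reboot true)" --limit "${host}"
```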

Install the common Services.

$ ansible-playbook common.yml

Upgrade OS packages as a one-off command on all hosts. Note: If you find you're doing one-off ansible commands often, then you should consider creating a situational playbook.

$ ansible -m apt -a 'update_cache=yes upgrade=dist' all

Reload the apache2 service for catalog.

$ ansible -m service -a 'name=apache2 state=reload' catalog-web-v1

Run a one-off shell command. Just an example, don't ever run this ;)

$ ansible -m shell -a "/usr/bin/killall dhclient && dhclient -1 -v -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases eth0" all

Tail the logs using dsh.

$ dsh -g catalog-web-v1 -M -c tail -f /var/log/ckan/ckan.custom.log

Application playbooks

Application playbooks deploy a single Application and its Services (e.g. apache2). We document supported tags and common variables here, but you should refer to the individual roles for the complete documentation.

These commands assume you've activated the virtualenv with pipenv shell. Alternatively, prefix each command with pipenv run, e.g. pipenv run ansible.

Catalog

Provisions the Catalog app (catalog.data.gov).

$ ansible-playbook catalog.yml

Provision only catalog-web.

$ ansible-playbook catalog-web.yml

Provision only catalog-workers (harvesters).

$ ansible-playbook catalog-worker.yml

Common variables

Variable                  Description
catalog_ckan_app_version  Tag, branch, or commit of catalog-app to deploy

Supported tags

Tag       Description
pycsw     Deploys only the PyCSW application
database  Configure the database with CKAN and PyCSW users

Dashboard

Deploy the Project Open Data Dashboard.

$ ansible-playbook dashboard-web.yml

Common variables

Variable             Description
project_git_version  Tag, branch, or commit to deploy

Inventory

Deploy inventory.data.gov.

$ ansible-playbook inventory.yml

Common variables

Variable                    Description
inventory_ckan_app_version  Tag, branch, or commit of ckan to deploy

PyCSW

PyCSW is our implementation of the Catalog Service for Web (CSW).

$ ansible-playbook pycsw.yml

Note: PyCSW is currently deployed as part of catalog.data.gov but probably should be deployed and scaled independently.

Common variables

Variable           Description
pycsw_app_version  Tag, branch, or commit of pycsw to deploy

Supported tags

Tag       Description
database  Configure the database with PyCSW user

Solr

Deploy Solr.

$ ansible-playbook solr.yml

WordPress

Deploys the www.data.gov (WordPress) application.

$ ansible-playbook datagov-web.yml

Common variables

Variable             Description
project_git_version  Tag, branch, or commit to deploy

Ansible inventory groups

We use several cross-cutting groups that allow us to deploy to different hosts and set inventory variables based on different dimensions of our hosts.

Stacks

These groups represent different major configurations of the base image.

  • v1 Ubuntu Trusty 14.04
  • v2 Ubuntu Bionic 18.04

Additionally, the application groups have a -v1 suffix e.g. catalog-web-v1. This helps us transition between stacks incrementally.

Application processes

These groups represent different processes of applications, e.g. web and worker processes which might be slightly different configurations of the same application.

  • catalog-admin web hosts for the catalog admin app (subset of catalog-web). This is CKAN with database write permissions.
  • catalog-web web hosts for the catalog app. CKAN is configured read-only.
  • catalog-harvester worker hosts for the catalog app.
  • dashboard-web web hosts for the Dashboard app.
  • inventory-web web hosts for the inventory app.
  • pycsw-web web hosts running the PyCSW application.
  • pycsw-worker worker hosts running the PyCSW jobs.
  • wordpress-web web hosts for the datagov/wordpress app.

Service groups

  • jumpbox host where Ansible playbooks are executed from.
  • solr Solr hosts.
  • elasticsearch Elasticsearch hosts in mgmt vpc only.
  • kibana Kibana hosts in mgmt vpc only.
  • efk_nginx EFK Nginx hosts in mgmt vpc only.

Meta groups

  • web meta group containing any hosts with a web server (e.g. apache2 or nginx).

Development

Most development happens in the role repositories using molecule. There are still a few roles here that you can develop on individually.

Requirements

Setup

We use pipenv to manage the Python virtualenv and dependencies. Install the dependencies with make.

$ make setup

Install third-party Ansible roles.

$ pipenv run make vendor

Any commands mentioned within this README should be run within the virtualenv. You can activate the virtualenv with pipenv shell or you can run one-off commands with pipenv run <command>.

Tests

Run the molecule test suites locally. You probably don't want to do this, since it takes a long time; let CI do it instead. See below for more on how to work with individual test suites. Molecule tests rely on Docker for running tests in containerized hosts.

$ pipenv run make test

You can set the concurrency with make's -j option.

$ pipenv run make -j4 test

Lint your work.

$ make lint

Testing with molecule

Molecule is the preferred test suite for testing roles. Playbooks can be tested by including them in the molecule playbook.

Molecule is modular, so you must cd to the directory of the role you are testing.

$ cd roles/software/ckan/native-login
$ molecule test

During development, you'll want to run only the converge playbook to avoid creating/destroying the container every time.

$ molecule converge

If you have multiple scenarios, you can specify them individually.

$ molecule test -s <scenario>
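
Because Molecule runs per role, it can help to list which roles actually have scenarios before choosing one. The following is a minimal sketch, assuming the standard Molecule layout of molecule/<scenario>/molecule.yml inside each role; list_molecule_roles is a hypothetical helper, not part of this repository.

```sh
# List role directories that contain at least one Molecule scenario,
# i.e. a molecule/<scenario>/molecule.yml file somewhere below the root.
# list_molecule_roles is a hypothetical helper; the root defaults to ./roles.
list_molecule_roles() {
  find "${1:-roles}" -type f -path '*/molecule/*/molecule.yml' \
    | sed 's|/molecule/[^/]*/molecule\.yml$||' \
    | sort -u
}

# Example: converge every role that has a scenario (assumes no spaces in paths).
#   for role in $(list_molecule_roles roles); do
#     (cd "$role" && molecule converge)
#   done
```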

Manual testing with Vagrant

This is a work in progress. The Vagrant setup does not include the mysql or postgres databases. The local Ansible inventory is also incomplete.

Where possible, you should use Docker and Molecule for developing and testing your roles. There are some scenarios that you might want to manually test in a virtual machine with Vagrant. For example, some tasks are captured in a playbook instead of a role and playbooks are not tested with Molecule.

Initialize the vagrant environment.

$ vagrant up

Test that you can connect to the vagrant instance with Ansible.

$ ansible -i inventories/local -m ping all

Connect to the VM for debugging.

$ vagrant ssh

Run the wordpress playbooks locally.

$ ansible-playbook -i inventories/local common.yml datagov-web.yml

The local VM is considered to be in all Ansible groups, so running the site.yml playbook will apply every app and role to the VM, likely failing in unexpected ways. For this reason, you should avoid running the site.yml playbook and instead run common.yml with the application playbook.

Clean up the VM after your test.

$ vagrant destroy

Ansible Vault

Inventory secrets are stored encrypted within the git repository using Ansible Vault. In order to decrypt them for editing or review, you'll need the Ansible Vault password.

Set up the vault password(s)

On each invocation, you'll be prompted for the vault password. To avoid the prompt, add the password to a file and configure Ansible to read the password from that file. In order to use git log to review vault history, you'll probably want previous passwords as well. First, create a directory for the passwords.

$ mkdir -m 0700 .secrets

Create a file (e.g. .secrets/ansible-secret-v2.txt) containing only a single Ansible Vault password. You can add additional passwords if you wish, one password per file. Then, set ANSIBLE_VAULT_IDENTITY_LIST to the list of password file paths (comma separated).

$ echo "ANSIBLE_VAULT_IDENTITY_LIST=$(find "$(pwd)/.secrets" -type f | sort -r | paste -s -d, -)" > .env

Your .env should look like:

ANSIBLE_VAULT_IDENTITY_LIST="/home/gsa/projects/datagov/datagov-deploy/.secrets/ansible-secret-v2.txt,/home/gsa/projects/datagov/datagov-deploy/.secrets/ansible-secret-v1.txt"

pipenv will load this .env file automatically if included at the root of the project.
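
As a minimal sketch, the same .env file can be generated by two small functions that tolerate spaces in the project path and quote the value. The function names are hypothetical and not part of this repository, and the reverse name sort only puts the newest password first if the files are named with increasing version suffixes, as in the example.

```sh
# Print the files in a directory as a comma-separated list of absolute paths,
# in reverse name order (so ansible-secret-v2.txt precedes ansible-secret-v1.txt).
# vault_identity_list is a hypothetical helper.
vault_identity_list() {
  dir="$(cd "$1" && pwd)" || return 1
  find "$dir" -type f | sort -r | paste -s -d, -
}

# Write ANSIBLE_VAULT_IDENTITY_LIST to an env file (default: .env), quoted.
# write_vault_env is a hypothetical helper.
write_vault_env() {
  list="$(vault_identity_list "$1")" || return 1
  printf 'ANSIBLE_VAULT_IDENTITY_LIST="%s"\n' "$list" > "${2:-.env}"
}

# Example, run from the root of the project:
#   write_vault_env .secrets
```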

On jumpbox hosts, the vault password file should be installed to /etc/datagov/ansible-secret.txt (group readable by operators). This is a manual step for initial jumpbox provisioning. The ubuntu user may need to be added to the operators group.

Editing Vault secrets

If you have the Ansible Vault password (ask a team member), you can review and edit secrets with ansible-vault.

Review secrets in a vault.

$ ansible-vault view [path-to-vault.yml]

Edit secrets in a vault.

$ ansible-vault edit [path-to-vault.yml]

You can configure git to automatically decrypt Vault files for reviewing diffs.

$ git config --global diff.ansible-vault.textconv "ansible-vault view"
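
Git only runs a textconv driver on paths that are assigned to it through a diff attribute in a gitattributes file. The following is a minimal sketch of wiring both pieces together for a single repository; enable_vault_diff is a hypothetical helper, the pattern you pass is an assumption about how your vault files are named, and the repository may already ship a suitable .gitattributes.

```sh
# Configure the ansible-vault diff driver for one repository and assign it
# to files matching a pattern. enable_vault_diff is a hypothetical helper.
#   $1: path to the git working tree
#   $2: gitattributes pattern for the vault files (an assumption, e.g. vault.yml)
enable_vault_diff() {
  repo="$1"
  pattern="$2"
  # Same driver as above, but stored in the repository's own config.
  git -C "$repo" config diff.ansible-vault.textconv "ansible-vault view" || return 1
  # Tell git which paths should use the driver.
  printf '%s diff=ansible-vault\n' "$pattern" >> "$repo/.gitattributes"
}

# Example:
#   enable_vault_diff . vault.yml
#   git check-attr diff -- path/to/vault.yml
```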

Deployment

Note: this is a work in progress.

Currently, deployment to BSP environments is done manually by running Ansible playbooks from the jumpbox hosts, within the BSP firewall. We are moving to automated continuous deployment via a Jenkins CI server.

We still use CircleCI for the majority of our CI needs. Any tasks requiring access to the GSA network (like deployment) are handed over to Jenkins (via the Jenkins API).

Jenkins configuration

Using the configuration-as-code plugin, we are able to define the Jenkins configuration and its job configuration in code. After running the jenkins.yml playbook, there are a few manual steps that need to be done.

  1. Log into the new instance
  2. Configure credentials
  3. Add a CI bot user and API token

Log into Jenkins

Log into the Jenkins instance using jenkins_admin_username (default: admin) and jenkins_admin_password (see vault).

Configure credentials

Add credentials to manage secrets.

Id                    Type  Description
ansible-vault-secret  file  File containing the password to the Ansible vault (ansible-secret-v2.txt).
datagov-sandbox       file  Root SSH private key file for the environment.
github-datagov-bot    text  Personal access token from the datagov-bot GitHub user.

Note: evaluate whether we want to move credential creation into the configuration-as-code configuration.

CI user

Add a CI user. You can set a random password, but keep it: Jenkins does not allow creating the API token as the admin user, so you'll have to log in as the CI user itself to create it.

Create an API token, which you'll use below to configure CircleCI.

Assign the CI user to the build-manager so that the user is authorized to trigger a build.

CircleCI Setup

Add the following environment variables to CI configuration. These are required for the bin/jenkins_build script. Secret variables should be entered in the UI configuration only.

Variable           Description                                                                               Secret
JENKINS_USER       The Jenkins user with access to the API.                                                  Y
JENKINS_API_TOKEN  The API token for the Jenkins user.                                                       Y
JENKINS_JOB_TOKEN  The job token specified in the job configuration (see jenkins_job_authentication_token).  Y
JENKINS_URL        The URL to the Jenkins instance.                                                          N

In the CI job configuration (.circleci/config.yml), run the bin/jenkins_build <job-name> script.
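
To show how the four variables fit together, here is a minimal sketch of a remote build trigger using Jenkins' standard job/<name>/build?token=... endpoint. This is not the contents of bin/jenkins_build, which may work differently; both function names and the job name in the example are hypothetical.

```sh
# Print the remote-trigger URL for a Jenkins job.
# Reads JENKINS_URL and JENKINS_JOB_TOKEN; a trailing slash on the URL is dropped.
# jenkins_build_url is a hypothetical helper.
jenkins_build_url() {
  printf '%s/job/%s/build?token=%s\n' "${JENKINS_URL%/}" "$1" "$JENKINS_JOB_TOKEN"
}

# POST to the trigger URL, authenticating as the CI user with its API token.
# trigger_jenkins_build is a hypothetical helper.
trigger_jenkins_build() {
  curl --fail --silent --show-error --request POST \
    --user "${JENKINS_USER}:${JENKINS_API_TOKEN}" \
    "$(jenkins_build_url "$1")"
}

# Example (job name is hypothetical):
#   trigger_jenkins_build deploy-sandbox
```

Keeping URL construction separate from the request makes it possible to check the URL without contacting Jenkins.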

Troubleshooting

The CIS hardening benchmark sets a 027 umask, which means that by default new files are not world-readable. This is often a source of problems, where a service cannot read a configuration file.
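
The effect is easy to reproduce. As a minimal sketch, the hypothetical helper below (not part of this repository) creates a file under a given umask and prints the resulting permission string, so you can compare the hardened 027 setting with the common 022 default:

```sh
# Print the permission string a newly created file gets under the given umask.
# Runs in a subshell so the caller's umask is left unchanged.
# file_mode_under_umask is a hypothetical helper.
file_mode_under_umask() (
  umask "$1"
  tmp="$(mktemp -d)" || exit 1
  : > "$tmp/example.conf"
  # Keep only the ten type and permission characters from the listing.
  ls -ld "$tmp/example.conf" | cut -c1-10
  rm -rf "$tmp"
)

# Example:
#   file_mode_under_umask 027   # no permissions for "other"
#   file_mode_under_umask 022   # readable by "other"
```

When a service runs as a user outside the file's owner and group, one option is to set the file's mode or group explicitly in the task that installs it rather than relying on the default.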

Contributors

adborden, afeld, anup-khanal, avdata99, chris-macdermaid, dano-reisys, dependabot-preview[bot], dependabot[bot], eric-asongwed, fuhuxia, hareeshreddyg, hkdctol, jasonschulte, jbrown-xentity, jjediny, kvuppala, mogul, neilhunt1, pburkholder, philipashlock, pjsharpe07, skipkeats, snyk-bot, sstevens21, starsinmypockets, thejuliekramer, visar, woodt, ydave-reisys
