openfun / arnold
:construction_worker_woman: Deploy your applications to Kubernetes with Ansible
License: MIT License
OpenShift resources (builds, deployments, pods, jobs) must be purged.
We can do it automatically with some configuration on the templates.
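One way to do this declaratively, in the object templates themselves, could be history limits (a sketch; field availability depends on the OpenShift version in use):

```yaml
# Sketch: history limits let OpenShift prune old objects automatically
# instead of purging them by hand.
apiVersion: v1
kind: DeploymentConfig
metadata:
  name: "{{ app.name }}"
spec:
  revisionHistoryLimit: 2          # keep only the last 2 replication controllers
---
apiVersion: v1
kind: BuildConfig
metadata:
  name: "{{ app.name }}"
spec:
  successfulBuildsHistoryLimit: 2  # prune completed builds beyond this count
  failedBuildsHistoryLimit: 2
```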
The preprod environment name is not coherent with what we use in our apps. Rename the preprod environment to preproduction.
The current hosts pattern should be improved as follows:
For feature deployment, move the original feature route:
lms_host: "{{ project_name }}-lms--{{ feature_title }}.{{ domain_name }}"
to something like:
lms_host: "{{ feature_title }}.lms.{{ project_name }}.{{ domain_name }}"
And for common routes:
lms_host: "lms.{{ project_name }}.{{ domain_name }}"
lms_host: "previous.lms.{{ project_name }}.{{ domain_name }}"
This pattern uses the feature_title as a subdomain, and I think it's a great idea since the deployment will be generated by the CI platform. Also, project_name should be {{ env_type }}.{{ customer }} instead of {{ env_type }}-{{ customer }}.
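The proposed hosts could be sketched as group_vars like this (variable names are illustrative, not Arnold's actual ones):

```yaml
# Sketch of the proposed route hosts in group_vars
project_name: "{{ env_type }}.{{ customer }}"
lms_host: "lms.{{ project_name }}.{{ domain_name }}"
lms_previous_host: "previous.lms.{{ project_name }}.{{ domain_name }}"
lms_feature_host: "{{ feature_title }}.lms.{{ project_name }}.{{ domain_name }}"
```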
Expected behavior/code
Memcached should be deployed in "Replace" mode (see the Redis app), which implies having 2 instances mounted on the same volume just before switching from the blue to the green stack. This may screw up its state. "Replace" mode deployments imply a (very short) downtime. Memcached should therefore be in a separate app so that we can avoid these downtimes as much as possible.
Actual Behavior
Memcached is included in edxapp and is therefore:
Proposal
We propose moving Memcached to its own app. It will also allow sharing it between applications...
The following warning message appeared recently:
/usr/local/lib/python2.7/dist-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.23) or chardet (3.0.4) doesn't match a supported version!
RequestsDependencyWarning)
It should be fixed!
OpenShift can group services belonging to the same app in the console. While monitoring a project (customer + environment), we want to see all the pods of one app; this can be achieved thanks to the app name filter.
Use {{ apps.name }} in templates to group all services under the same app in OpenShift.
Add to the CI:
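In a template, the grouping could look like this (a sketch; the metadata name variable is hypothetical):

```yaml
# Sketch: label every object with its app name so the console groups them
metadata:
  name: "{{ service.name }}"   # hypothetical variable
  labels:
    app: "{{ apps.name }}"
```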
We want the project to be customizable so that anyone can use Arnold as a tool to build their own project.
Add a new directory with the following structure:
├── examples
│   ├── hello
│   │   ├── templates
│   │   │   └── ...
│   │   └── group_vars
│   │       └── ...
│   ├── openedx
│   │   ├── templates
│   │   │   └── ...
│   │   └── group_vars
│   │       └── ...
│   └── richie
│       ├── templates
│       │   └── ...
│       └── group_vars
│           └── ...
In order to deploy one of these projects, the scripts should run docker, mounting the templates and group_vars directories in the arnold directory. Leave out the files directory as it won't be used for the moment. Update the _docker_run bash command to take the project name as an argument.

We use OpenShift's initContainers to generate custom settings for an edxapp environment. This definition is repeated among multiple templates (DCs and jobs). This is not optimal. Even if we like DC or job templates to be fully declarative and self-contained, there is room for improvement.
My first attempt was driven by Jinja's template inheritance, but it was a dead-end as the Ansible template
lookup was failing when using the {% include "_partials/initContainers.yml.j2" %}
template tag. I think it's the way to go but I did not succeed and postponed it to later.
This issue is a follow-up of the issue #33; we need to improve code testing.
Add to the CI:
Add the possibility to plug Datadog APM into the apps.
We used the $ANSIBLE_VAULT_PASS variable to avoid having to type the ansible-vault password for encrypted files. This is no longer the case. I prefer having to type this password once in a while when creating OpenShift secrets. Remove ANSIBLE_VAULT_PASS from env.d/*.
When a deployment is finished, we have to go to OpenShift to see the routes. Print the route URLs as a debug message in Ansible at the end of the deployment process.
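Such a debug message could be a simple Ansible task at the end of the deployment playbook (a sketch; the lms_host variable is taken from the routes discussion above):

```yaml
# Sketch: print the deployed routes at the end of the deployment playbook
- name: Display application routes
  debug:
    msg: "The LMS is now available at: https://{{ lms_host }}"
```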
This issue is a follow-up of the issue #4; we need to test the deployment strategy:
- minishift locally,
- the init_project.yml playbook with the hello project,
- the deploy.yml playbook with the hello project.

On each new project, the Dev or Ops must do these steps. To bootstrap a project with all apps on the current route, we can create a bootstrap_project.yml playbook that performs these 3 steps.
The OpenShift host is set in the env.d environment files. This means that we could end up mistakenly deploying a "feature" environment to the production host, for example. Set the K8S_AUTH_HOST in group_vars and jinja templates instead of env.d files:
- set openshift_host in group_vars,
- remove the K8S_AUTH_HOST environment variable,
- update the create_objects playbook to enforce the host for all tasks in the playbook, as specified in the openshift_raw documentation: https://docs.ansible.com/ansible/2.5/modules/openshift_raw_module.html

Routes must be created from the first application deployment, not during initialization and then patched during the deployment.
- Remove the create_static_services_routes.yml template as it's almost a copy of the deploy_patch_route.yml template.
- Make the deploy_get_stamp_from_route.yml tasks work without any route already created for the project.

Currently, the creation of static routes is generic and based on the presence of host variables in a service.
If we use the directory apps auto-discovery method, we do not have this host variable in the dict. In the app's default vars file (main.yml), create a list of the services that need a route. The generic pattern of the routes is: (previous|current|next).{{ app.service.name }}.{{ project_name }}.{{ domain_name }}
For more accessibility, can we rename the index.md files to readme.md?
OpenShift labels are a useful way to filter objects and act on them quickly in the console or via the oc CLI. When using the console, objects are restricted to a namespace (customer + env_type), so there was no real need to add the customer or env_type label. Now that we are using Datadog to monitor OpenShift, adding those labels would be useful as they are used as metadata to qualify objects (pods, etc.) in Datadog's interface. Add customer and env_type labels to every object.

The bin/bootstrap command starts by deleting any existing project. It does not ask for confirmation, so there is a risk of someone unintentionally losing information.
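A confirmation prompt could be added with Ansible's pause module before the deletion task (a minimal sketch):

```yaml
# Sketch: ask for confirmation before deleting an existing project
- name: Confirm project deletion
  pause:
    prompt: "This will DELETE the {{ project_name }} project. Press Enter to continue, or Ctrl+C then 'A' to abort"
```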
Currently, our deployments create new complete stacks including routes to access each stack.
In a functional architecture, we need to manage routes differently to allow uninterrupted service when making a new deployment.
Stop creating routes together with the stack of svc/dc/ep.
Create only 3 routes for each service. The following hosts are examples for the lms:
Each time we make a new deployment, the following actions are taken:
Each time we confirm a deployment by triggering a route switch, the following actions are taken:
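A route switch can be performed by patching the route's target service, so the route object itself never changes host (a sketch; object names and the stack variable are illustrative, not Arnold's actual ones):

```yaml
# Sketch: a "current" route that is switched by patching spec.to.name
apiVersion: v1
kind: Route
metadata:
  name: lms-current
spec:
  host: "current.lms.{{ project_name }}.{{ domain_name }}"
  to:
    kind: Service
    name: "lms-{{ next_stack }}"  # points to the blue or the green service
```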
We currently only support one host per service definition to create underlying routes. As some apps (like edxapp
) may have many routes pointing to the same service (e.g. nginx), we need to improve hosts management.
Move the app.service.host
type from a simple string to a list of dictionaries with the route, the targeted service name and the targeted port.
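The new structure could look like this (a sketch; the dictionary keys and port value are assumptions based on the description above):

```yaml
# Sketch: app.service.host as a list of dictionaries instead of a string
services:
  - name: nginx
    hosts:
      - route: "lms.{{ project_name }}.{{ domain_name }}"
        service: nginx
        port: 8071
      - route: "preview.lms.{{ project_name }}.{{ domain_name }}"
        service: nginx
        port: 8071
```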
An env.d/development file is required to start using Arnold with Minishift. We want the default environment to be usable without any work from the developer: rename env.d/base to env.d/development.

As mentioned in #76, Arnold's docker image is a tool, so it can be ubuntu-based instead of debian-based. Switching to Ubuntu would allow us to use more recent packages, in particular Python 3.6.
Use the ubuntu:18.04 base image.

We need to automate the user account creation for every Django service/flavor/customer Arnold will deploy.
Add a create_users job that will create a superuser and other profiles using Django management commands and vaulted credentials.
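Such a job could be sketched as an OpenShift Job object (the image variable, job name, and credential variables are all hypothetical; real credentials would come from vaulted variables):

```yaml
# Sketch: a create_users job running a Django management shell one-liner
apiVersion: batch/v1
kind: Job
metadata:
  name: "edxapp-create-users"
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: create-users
          image: "{{ edxapp_image }}"   # hypothetical variable
          command:
            - "python"
            - "manage.py"
            - "shell"
            - "-c"
            - >
              from django.contrib.auth import get_user_model;
              User = get_user_model();
              User.objects.filter(username="{{ superuser_name }}").exists() or
              User.objects.create_superuser("{{ superuser_name }}",
              "{{ superuser_email }}", "{{ superuser_password }}")
```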
We wrote bash scripts to ease the developer experience. For now, they are intended to be used only for development purposes. But I think we must also consider using those scripts in other environments (even staging or production) to manually interact with a running stack.
I see two main advantages in doing so:
We need to make environments more configurable instead of hard-coding minishift-specific rules.
.vault_pass.sh
was used to ease ansible vault decryption from OpenShift. It is no longer required.
Remove this file from the repository 😄
Expected behavior/code
A symbolically linked app in the apps directory must be included in available_apps.
Actual Behavior
A symbolically linked app in the apps directory is ignored.
Environment
KVM is presented as the preferred option to run Minishift. Most developers will already have VirtualBox on their laptop, but not KVM. Furthermore, we experienced network issues when running KVM that made us prefer VirtualBox.
Replace KVM with VirtualBox:
Some objects are created in environments where they should not be, e.g. endpoints in development/feature, acme in development/feature, database containers in staging/preproduction/production, etc.
Improve the structure of objects in group_vars to allow this configuration:
Historically, this project has been initiated using our own GitLab instance, hence, we designed a GitLab-CI-based continuous integration (CI) workflow. When we decided to open-source this project and move it to GitHub, we lost our CI. As it is a prerequisite for this project (for every project in fact), we need to improve and re-implement it.
As we usually use CircleCI, let's stick to it for Arnold. We have the advantage that CircleCI and GitLab-CI philosophies are quite close. Therefore, we might think that this would be an easy task, but it's not. The way we designed our original CI workflow is not sustainable for an open source project (the last step is a deployment to our OpenShift instance).
My first idea is to use a trashable MiniShift instance in our CI workflow, but this will be a really tricky part.
Dear contributors, I invite you to make proposals; nothing has been decided yet.
We used to deploy to production in our default CI. This is no longer the case.
Remove the following vaulted secret: group_vars/secret/patient0/production/richie/credentials.vault.yml
As mentioned in #89, at the time of writing, our CI fails at the test-bootstrap job, which raises a timeout error from CircleCI (jobs cannot take longer than 10 minutes to execute). To bypass this CircleCI limitation, we will try to split the test-bootstrap job into multiple jobs (ideally one per app). 🤞
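The split could be sketched in the CircleCI config like this (job names and the per-app test entry point are assumptions, not the repository's actual commands):

```yaml
# Sketch: one test-bootstrap job per app instead of a single long job
version: 2
jobs:
  test-bootstrap-hello:
    machine: true
    steps:
      - checkout
      - run: bin/test hello    # hypothetical per-app entry point
  test-bootstrap-richie:
    machine: true
    steps:
      - checkout
      - run: bin/test richie
workflows:
  version: 2
  arnold:
    jobs:
      - test-bootstrap-hello
      - test-bootstrap-richie
```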
Adding a customer to deploy may require creating configMaps, vaults, etc. It may be useful to automate this task.
Create an add_customer.yml playbook that:
- creates group_vars/customer/foo/main.yml from a template file (customer_main.dist.yml),
- creates group_vars/customer/foo/{{ env_type }}/main.yml from the same template file (customer_main.dist.yml),
- creates the group_vars/customer/foo/{{ env_type }}/secrets/ directory,
- creates the group_vars/customer/foo/{{ env_type }}/configs/ directory,
with foo being the name of this new customer.
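The playbook could be sketched as follows (the paths come from the list above; the task structure is an assumption):

```yaml
# Sketch of an add_customer.yml playbook
- hosts: local
  vars:
    customer: foo
  tasks:
    - name: Create the customer's main.yml files from the template
      template:
        src: customer_main.dist.yml
        dest: "{{ item }}"
      with_items:
        - "group_vars/customer/{{ customer }}/main.yml"
        - "group_vars/customer/{{ customer }}/{{ env_type }}/main.yml"

    - name: Create the secrets and configs directories
      file:
        path: "group_vars/customer/{{ customer }}/{{ env_type }}/{{ item }}"
        state: directory
      with_items:
        - secrets
        - configs
```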
The playbook that creates configmaps automatically finds them in the templates/configmap directory and pushes them all to OpenShift. We want to be able to define which configmaps are created for each customer or env_type.
For example, the "edxapp-fixtures" configmap should only be created for the development
or feature
environments.
Apply the same mechanism as for OpenShift objects to configmaps:
- define a default openshift_configmaps variable that can be overridden by customer/env_type,
- only create the configmaps listed in the openshift_configmaps variable.

OpenShift jobs that run migrations or collectstatic tasks for deployed apps are currently unreliable since they are created as is, even if the targeted service is not up yet. We need to be able to:
- use oc to get the job execution status (wait loop),
- configure this in the app's vars/main.yml.

We now have too many playbooks at the project root. Move them to a playbooks/ directory.
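The job status wait loop mentioned above could be sketched as an Ansible task (the job name is illustrative):

```yaml
# Sketch: poll oc until the job reports success
- name: Wait for the migration job to complete
  command: oc get job edxapp-migrate -o jsonpath='{.status.succeeded}'
  register: job_status
  until: job_status.stdout == "1"
  retries: 30
  delay: 10
```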
directoryUsing minishift
and thus a VM to run OpenShift adds a complexity layer for development and is not required. Let's switch to an all-in-one docker container that runs OpenShift!
oc cluster [up|down]
* /etc/docker/daemon.json
{
  "insecure-registries": ["172.30.0.0/16"]
}
Routes defined in group_vars/all/main.yml are overridden by group_vars/all/openshift_routes.yml. At the time of writing, the cms & lms routes have been deactivated and are thus not created by the create_objects playbook / task list.
Un-comment the following two lines and we will be good to go:
https://github.com/openfun/arnold/blob/master/group_vars/all/openshift_routes.yml#L9
The files/ directory has been left almost empty, and its purpose may not appear obvious as it is not documented at all. The files/ directory may be used to override a project's configMap (see tasks/create_config.yml). Let's document this feature!
Arnold's image takes a long time to build. It should not be a necessary step for newcomers.
The Continuous Integration in CircleCI should push Arnold's Docker image to Docker Hub after building it.
This has already been done on Richie here: https://github.com/openfun/richie/blob/master/.circleci/config.yml#L432
We need to configure health checks.
I propose following Mozilla's Docker flow:
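Mozilla's Dockerflow convention exposes dedicated monitoring endpoints (/__lbheartbeat__ for a dependency-free liveness check, /__heartbeat__ for a check of the app's dependencies). The probes could then look like this (the port is illustrative):

```yaml
# Sketch: health checks following the Dockerflow endpoints
livenessProbe:
  httpGet:
    path: /__lbheartbeat__
    port: 8000
  initialDelaySeconds: 30
readinessProbe:
  httpGet:
    path: /__heartbeat__
    port: 8000
  initialDelaySeconds: 30
```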
When a route is not defined, OpenShift serves the default 404 page: "Application is not available". Create a 404 template to serve a custom 404 page.
Use Ansible's possibilities to:
Have an example of a functional CI on GitLab. It should not be useful to maintain a gitlab-ci configuration in parallel with CircleCI.
As a first approach, we drafted the project documentation as a navigable suite of markdown-formatted files. Let's switch to a more user-friendly documentation published on Read the Docs.
Constraints
Implementation
In the development environment, we want to have more tools installed on our images.
Use dev Docker images only for the development
environment: