
ecs-conex's Introduction

[deprecated] ecs-conex

⚠️ This repository is deprecated and will no longer be maintained ⚠️.

If you’re looking for alternatives for building Docker images for AWS ECS, we recommend checking out AWS CodeBuild.

What is ecs-conex?

ECS Container Express is a continuous integration service for building Docker images and uploading them to ECR repositories in response to push events to Github repositories.

Dockerfile

The Dockerfile contains the commands required to build an image, or snapshot of your repository, when you push to GitHub. This file is located in the root directory of your application code. If you are using private npm modules, your Dockerfile might require some additional commands, as listed here.

ECR Repository

ecs-conex will create one ECR repository for each Github repository. Each time a push is made to the Github repository, a Docker image will be built for the most recent commit in the push, and the image will be tagged with the SHA of that commit. If the most recent commit also corresponds to a git tag, the tag's name will be applied as an additional image tag in the ECR repository.
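For illustration, that tagging scheme might look roughly like the following in shell. This is not the actual ecs-conex code; ${repo}, ${repo_uri}, and ${after} are hypothetical variables for the local image name, the ECR repository URI, and the pushed commit's SHA.

# tag and push the image for the most recent commit in the push
docker tag "${repo}:${after}" "${repo_uri}:${after}"
docker push "${repo_uri}:${after}"

# if that commit is also a git tag, push the tag name as an additional image tag
if git_tag=$(git describe --tags --exact-match "${after}" 2> /dev/null); then
  docker tag "${repo}:${after}" "${repo_uri}:${git_tag}"
  docker push "${repo_uri}:${git_tag}"
fi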

Usage

You only need to run ecs-conex's watch.sh script once to subscribe your repository to the ecs-conex webhook. For more information about associating these resources, see the Getting started documentation.

Documentation

ecs-conex's People

Contributors

amishas157, brendanmcfarland, dnomadb, emilymdubois, ianshward, ingalls, jqtrde, norchard, perrygeo, rclark, sbma44, tadiseshan, vsmart, yhahn


ecs-conex's Issues

ecs-conex Node 4 EoL

Hello @taraadiseshan ! I’m a bot from your friendly neighborhood security team! Our systems have detected a lambda function in this repository running Node 4 (or older). Node 4 officially went end-of-life on April 30 of this year, and is no longer supported or receiving security patches from the Node community. As such, we’d like to get this repo updated at your earliest convenience.

Our systems aren’t perfect, so it’s possible this issue was created in error:

  • if there is nothing in this repo running outdated Node, please leave a comment to that effect and close the issue
  • if this repo is deprecated, no longer in use, or not deployed on our infrastructure, please leave a comment to that effect and close the issue
  • if you are not the right person to contact about this codebase, please leave a comment and tag in the most appropriate person or team you know of to handle it

Thank you so much for your help! If you have any questions or concerns, don’t hesitate to reach out!

Best,
~ Versioning Looker-Outer Bot 3001™


Do not overwrite existing images

We should have safeguards in place so that images cannot be overwritten. Later overwrites could lead to unexpected versions of dependencies that are "more modern" than the image should be.
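A minimal sketch of such a safeguard, assuming hypothetical ${reponame} and ${after} variables in the worker script; aws ecr describe-images fails when the requested tag does not exist in the repository.

# refuse to overwrite: check whether the tag already exists in ECR before pushing
if aws ecr describe-images \
    --repository-name "${reponame}" \
    --image-ids imageTag="${after}" > /dev/null 2>&1; then
  echo "image ${reponame}:${after} already exists in ECR, refusing to overwrite"
  exit 0
fi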

Implement a forced timeout

Let's force a timeout of 60 minutes. Presently, it's possible for workers to hang indefinitely, and that can lead to a stack that can't process any more messages.
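One way this could be enforced, sketched with GNU timeout (variable names hypothetical; the 60-minute figure comes from this issue; timeout exits with status 124 when the limit is hit):

timeout 3600 docker build --no-cache --tag "${repo}:${after}" "${tmpdir}"
status=$?
if [ "${status}" -eq 124 ]; then
  echo "build of ${repo}:${after} exceeded the 60 minute limit" >&2
fi
exit "${status}"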

Use cached layers if repository has a yarn lockfile

Right now, images are built with the --no-cache flag. Using cached layers is a way to significantly decrease build times, and long build times are one of the biggest bummers about our current CI flow.

One of the arguments in favor of --no-cache is that without it, npm install with semver version identifiers in package.json could lead to images that use old cached layers for node.js dependencies. This could lead to unexpected (and very non-deterministic) mismatches between your local environment and your production environment.

Yarn's use of a lockfile that pins node.js dependency versions and is committed to the repo avoids this misstep, and makes me wonder if we could drop the --no-cache flag when there's a yarn lockfile in the repo.

However there are still a few other questions to weigh against such a decision:

  • You would only get build-time caching benefits sometimes and not all the time. This depends on whether your conex worker task lands on an EC2 that still has the cached layers from a previous build.

  • Due to the above, you may want to try to keep cached layers lying around on the EC2s for longer, and this leads to disk space management problems.

It may be worth exploring this anyway, without adjusting anything about how we have our EC2s clean up old images/layers. If we can demonstrate a significant benefit for projects with hefty node.js dependency trees or huge unix package dependencies, it may be worthwhile.
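For reference, a sketch of what the conditional flag might look like, assuming the clone lives in a hypothetical ${tmpdir}:

# keep --no-cache unless the repository pins its dependencies with a yarn lockfile
cache_flag="--no-cache"
if [ -f "${tmpdir}/yarn.lock" ]; then
  cache_flag=""
fi
# ${cache_flag} intentionally unquoted so an empty flag disappears
docker build ${cache_flag} --tag "${repo}:${after}" "${tmpdir}"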

cc @springmeyer @scothis @mcwhittemore @mapsam @GretaCB

--no-cache all the builds

We should always build fresh images, without relying on a previous cache. This will make conex builds take longer, but it will ensure that conex builds are in sync with builds that may be done locally or on other parts of a CI pipeline (e.g. Travis or Circle).

  • remove the code which attempts to download a before image
  • add code that refuses a build if the after image already exists in ECR
  • specify --no-cache on all builds
  • (victory lap) this allows us to always remove the image we just built after it has been uploaded to ECR (see #29)

cc @emilymdubois @jakepruitt

Build multiple images

On AWS, a "task" can run containers from one or more Docker images. I can definitely imagine reasons why a Github repository might contain multiple Dockerfiles to build more than one image. It would be worth considering how this scenario would be handled by this repo.

Large push payloads

InvalidParameterException: Container Overrides length must be at most 8192

A large push payload that hits the webhook will be rejected when watchbot attempts to run the task. The Lambda proxy ought to pluck out the parts of the commit message that ecs-conex needs in order to circumvent this situation.
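A sketch of the kind of filtering the proxy could do, expressed here with jq for brevity; the exact field list ecs-conex needs is an assumption based on the log output quoted elsewhere in these issues (ref, after, pusher, repository name and owner).

# hypothetical jq filter: keep only the fields the worker appears to use and drop the rest
jq '{
  ref: .ref,
  after: .after,
  deleted: .deleted,
  repository: {name: .repository.name, owner: {name: .repository.owner.name}},
  pusher: {name: .pusher.name}
}' payload.json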

Conex support for multiple images in the same repo

Conex currently only builds one container per repo.

We have a repo, we'll call it tabby-cat, which defines multiple containers and uses docker-compose as well as custom bash scripts to build and push the multiple images to ECR. The multiple images are tagged with:

tabby-cat:<git-sha>-rails
tabby-cat:<git-sha>-cgimap
tabby-cat:<git-sha>-orcd

where each of those sub-images is defined in a different part of the services section of the docker-compose.yml file.

It would be great if we could rewrite conex to support multiple containers.

Plan for how to do this

The outputs of docker-compose build are multiple images with the format tabby-cat_rails, tabby-cat_cgimap, etc., where tabby-cat is the name of the folder and rails is the name of the section of the docker compose file. Using some string transformations, I think we could re-tag these images with the <repo>:<gitsha>-<subimage> format and push all of them to ECR.
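A sketch of that re-tagging pass, using the example service names above and hypothetical ${registry} and ${gitsha} variables:

# re-tag and push each docker-compose service image after `docker-compose build`
for service in rails cgimap orcd; do
  docker tag "tabby-cat_${service}" "${registry}/tabby-cat:${gitsha}-${service}"
  docker push "${registry}/tabby-cat:${gitsha}-${service}"
done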

cc/ @Yuffster @rclark

Can webhooks work?

API Gateway allows you to create an API key for your endpoint, and then expects that key to be provided as the x-api-key header in POST requests to the endpoint.

Github, on the other hand, allows you to specify a secret, and then uses that to provide an HMAC digest of the payload as the X-Hub-Signature header in the POST request.

I emailed Github support to ask whether they have any intention of ever supporting custom headers. An alternative would be to:

  • configure API Gateway to reject any request lacking the X-Hub-Signature header
  • set up the Lambda function to check the X-Hub-Signature header against the expected value for the payload provided, and proceed if acceptable (see the sketch below).
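A sketch of the second bullet, shown here with openssl rather than the Lambda runtime. Variable names are hypothetical; GitHub sends the signature as sha1=<hex HMAC of the raw request body> computed with the shared secret.

# recompute the HMAC-SHA1 of the raw payload and compare it with the X-Hub-Signature header
expected="sha1=$(openssl dgst -sha1 -hmac "${webhook_secret}" < payload.json | sed 's/^.* //')"
if [ "${expected}" != "${x_hub_signature}" ]; then
  echo "X-Hub-Signature mismatch, rejecting request" >&2
  exit 1
fi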

cc @zmully

Maybe not --quiet

docker build --no-cache --quiet ${args} --tag ${repo}:${after} ${tmpdir}

Suppressing the build output makes logs less useful to developers trying to determine why a particular image failed to build.

production not on master/ unhandled promise rejection

While working through the eng standards inventory (#141), I noticed that the latest master is not deployed to production; production is still on 112e8ae0.

I started the update and it failed - no healthy tasks could start up because of this error:

(node:1) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): Error: maxJobDuration: not a number

next steps

  • what introduced this regression?
  • deploy a safe fix

cc @mapbox/assembly-line

Load AWS credentials from environment

Right now, during a build ecs-conex attempts to read AWS credentials from the EC2 metadata service. It should read credentials from the environment first, then fall back to the metadata service if there are none in the environment.
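A sketch of that ordering, assuming the standard AWS environment variable names and the plain IMDSv1 metadata endpoints:

# prefer credentials from the environment, only fall back to the instance metadata service
if [ -n "${AWS_ACCESS_KEY_ID}" ] && [ -n "${AWS_SECRET_ACCESS_KEY}" ]; then
  echo "using credentials from the environment"
else
  role=$(curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/)
  creds=$(curl -s "http://169.254.169.254/latest/meta-data/iam/security-credentials/${role}")
fi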

Cleanup strategy

Currently, after building enough images, the host that ecs-conex is running on will run out of disk space and be 🙅 blocked from processing any more jobs. (aws/amazon-ecs-agent#349 (comment))

  • Stopgap: maybe we clean up an image immediately after the job that built it is complete (see the sketch after this list)
  • Ideally: keeping images around after a build will speed up subsequent builds. Some kind of LRU-like behavior around the images kept would be nice.
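A sketch of the stopgap, with hypothetical variable names; the before image is only removed if it was actually pulled.

# remove the image (and any pulled before-image) as soon as the push succeeds
docker push "${repo_uri}:${after}" && \
  docker rmi -f "${repo_uri}:${after}" "${repo_uri}:${before}" 2> /dev/null || true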

Do not retry

If a build fails, it should either

  • never be retried, or
  • be retried some (small) number of times

log-in to private ECR before building image

Sometimes the base images used by Dockerfiles are located in private ECR repositories. ecs-conex needs an option to log into private registries as needed to pull base images before building the Dockerfile.
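A sketch of such a pre-build login step; the account ID and region are placeholders, and older AWS CLIs would use `aws ecr get-login` instead of `get-login-password`.

# log in to the private registry that hosts the base image before running docker build
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin "${base_image_account}.dkr.ecr.us-east-1.amazonaws.com"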

ECR image limits

The documented limit to the number of images in an ECR repository is 1000. The ecr:ListImages request provides no insight into which images are older and which are newer -- in fact there's no clear indication of the sorting order.

How will we manage repository size?
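One observation: ecr:DescribeImages, unlike ecr:ListImages, does return an imagePushedAt timestamp, so a cleanup job could sort on it. A sketch (pagination ignored):

# list the tags of the ten oldest images in the repository, oldest first
aws ecr describe-images --repository-name "${reponame}" \
  | jq '.imageDetails | sort_by(.imagePushedAt) | .[0:10] | map(.imageTags)'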

cc @yhahn @emilymdubois @emilymcafee

Optionally, save image tarballs on S3

ecs-conex should allow the user to provide a list of S3 buckets, perhaps spanning several regions. If provided, each build job should

  • make a tagged version of the image as <service name>:<git sha>
  • docker save the tagged image and gzip the file
  • upload it into the specified buckets:
s3://<bucket name>/images/<service name>/<git sha>.tar.gz
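A sketch of the per-bucket save-and-upload step described above (variable names hypothetical):

# export the tagged image, compress it, and copy it into each configured bucket
docker save "${service}:${gitsha}" | gzip > "${gitsha}.tar.gz"
for bucket in ${buckets}; do
  aws s3 cp "${gitsha}.tar.gz" "s3://${bucket}/images/${service}/${gitsha}.tar.gz"
done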

cc @jakepruitt @zmully

Invalid signatures in Github payloads

@scothis has noticed that some Github payloads sent when a PR is merged have been getting rejected by conex with a 403. I checked and Github is actually providing an incorrect signature in the POST that it sends. I've filed a support request with Github for this.

It is unclear if this problem is repository-specific or not, but in case anyone else encounters it, the current workaround is to push a subsequent empty commit directly to the master branch of your repo. This will fire a webhook with the correct signature.

cc @yhahn @emilymcafee @jakepruitt @emilymdubois

Issues with --force and squashed merge

It appears that push events related to rewriting the git tree can cause conex to fail on this line. The current behavior is to delete the event from the queue and send a failure notification.

Is this the right approach? I'd like to catch one of these push payloads and understand why they refer to commits that are no longer part of the tree.


When performing a "Squash and Merge" from a PR on Github, conex receives two push payloads. One works, the other doesn't. For example:

[Tue, 17 May 2016 00:13:28 GMT] [ecs-conex] [ecd7dcfa-c378-411a-83e1-01c547b4f14a] processing commit 0000000000000000000000000000000000000000 by rclark to refs/heads/twice of mapbox/ecs-watchbot
[Tue, 17 May 2016 00:13:28 GMT] [ecs-conex] [ecd7dcfa-c378-411a-83e1-01c547b4f14a] Cloning into '/mnt/data/xj4q70'...
[Tue, 17 May 2016 00:13:29 GMT] [ecs-conex] [ecd7dcfa-c378-411a-83e1-01c547b4f14a] fatal: reference is not a tree: 0000000000000000000000000000000000000000

We should be able to silently ignore payloads with an .after sha of all zeros. I've also noticed that such a payload has .head_commit: null and .commits: [].
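A sketch of the guard this suggests (hypothetical placement near the top of the worker script):

# ignore push payloads that only rewrite or delete refs and point at no commit
if [ "${after}" = "0000000000000000000000000000000000000000" ]; then
  echo "after sha is all zeros (ref deleted or rewritten), nothing to build"
  exit 0
fi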

More readme

Todo:

  • how do I use it on an ongoing basis for several repositories?
  • how do I use bootstrap.sh to set up ecs-conex in my own account?
  • how do I use manual.ecs-conex.sh, and why would I want to?

Watching private repositories

The stack is provided a GitHub token that has permission to clone private repositories. watch.sh should accept something like a github user or team name, to make sure that the owner of the token being used is listed as a collaborator with (at least) read permission to the repository.
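A sketch of a check watch.sh could run, using the GitHub collaborator-permission endpoint; ${token_user} is a hypothetical variable for the login that owns GithubAccessToken.

# confirm the token's owner can at least read the repository before subscribing it
permission=$(curl -s -H "Authorization: token ${GithubAccessToken}" \
  "https://api.github.com/repos/${owner}/${repo}/collaborators/${token_user}/permission" \
  | jq -r '.permission')
if [ "${permission}" = "none" ] || [ "${permission}" = "null" ]; then
  echo "${token_user} cannot read ${owner}/${repo}; conex will not be able to clone it" >&2
  exit 1
fi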

related #2

Eng standards inventory

Required Elements

If any elements in the below list are not checked, this repo will fail standards compliance.

  • Not running node 4 or below
  • Has at least some test coverage?
  • Has a README?
  • Has no hard-coded critical secrets like API keys?

Rubric

  • 1 pt Is in Version Control/Github ✅ (free points)
  • 1-2 pt node version:
    • 2 pt Best: running node 8+ 🏅
    • 1 pt Questionable: node 6
    • 0 pt Not ok: running node4 or below ⛔️
  • 1 pt No hard-coded config parameters?
  • 1 pt No special branches that need to be deployed?
  • 1 pt All production stacks on latest master?
  • 1 pt No hard-coded secrets like API keys?
  • 1 pt No secrets in CloudFormation templates that don’t use [secure]?
  • 1 pt CI enabled for repo?
  • 1 pt Not running Circle CI version 1? (Point awarded if using Travis)
  • 1 pt nyc integrated to show test coverage summary?
  • 1-3 pt test coverage percentage from nyc?
    • 3 pt High coverage: > 90%
    • 2 pt Moderate coverage: between 75 and 90% total coverage
    • 1 pt 0 - 74% test coverage
  • 1-2 pt evidence of bug fixes/edge cases being tested?
    • 2 pt Strong evidence/several instances noted
    • 1 pt Some evidence
  • 1 pt no flags to enable different functionality in non-test environments?
  • 1 pt Has README?
  • 1-2 pt README explains purpose of a project and how it works to some detail?
    • 2 pt High (but appropriate) amount of detail about the project
    • 1 pt Some detail about the project documented, could be more extensive
  • 1 pt README contains dev install instructions?
  • 1 pt README contains CI badges, as appropriate?
  • 1-2 pt Code seems self-documenting: file/module names, function names, variables? No redundant comments to explain naming conventions?
    • 2 pt Strongly self-documented code, little to no improvements needed
    • 1 pt Some evidence of self-documenting code
  • 1 pt No extraneous permissions in IAM roles?
  • 1 pt Stack has alarms for AWS resources used routed to PagerDuty? (CPU utilization, Lambda failures, etc.)
  • 1 pt Stack has other appropriate alarms routed to PagerDuty? (Point awarded if no other alarms needed)
  • 1 pt Alarms documented?
  • master branch protected?
    • 1 pt PRs can only be merged if tests are passing?
    • 1 pt PRs must be approved before merging?
  • 2 pt BONUS: was this repo covered in a deep dive at some point?

Total possible: 30 points (+2 bonus)
Grading scale:

Point Total     Qualitative Description                                  Scaled Grade
28+ points      Strongly adheres to eng. standards                       5
23-27 points    Adheres to eng. standards fairly well                    4
18-22 points    Adheres to some eng. standards                           3
13-17 points    Starting to adhere to some eng. standards                2
9-12 points     Following a limited number of eng. standard practices    1
< 9 points      Needs significant work, does not follow most standards   0

Repo grade: 3 (21 points)

cc @mapbox/assembly-line

reveal GH authentication failure in `watch`

If GithubAccessToken is set, but the token doesn't have enough permissions, the error from Github ends up hidden here, causing the next line to fail with an unhelpful error.

I wonder if we could inspect the response from the Github API to make sure it hasn't errored before proceeding.
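A sketch of what inspecting that response could look like, assuming the hook is created with curl and a hypothetical ${hook_payload} body; the GitHub API returns an id on success and a message on failure.

# create the webhook, then fail loudly if the API did not return a hook id
response=$(curl -s -H "Authorization: token ${GithubAccessToken}" \
  --data "${hook_payload}" \
  "https://api.github.com/repos/${owner}/${repo}/hooks")
if [ -z "$(echo "${response}" | jq -r '.id // empty')" ]; then
  echo "failed to create webhook: $(echo "${response}" | jq -r '.message')" >&2
  exit 1
fi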

Make conex a transform stream

ecs-conex should announce when it has completed a build. This announcement could be an SNS message to a topic that conex controls, or maybe a custom cloudwatch event?

The message body could include

  • the details of the commit message
  • the repository URIs for the images that it dropped onto ECR
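A sketch of the SNS option, with a hypothetical topic name and message body:

# announce a completed build on an SNS topic owned by the conex stack
aws sns publish \
  --topic-arn "arn:aws:sns:us-east-1:${accountid}:ecs-conex-builds" \
  --subject "ecs-conex build complete" \
  --message "{\"after\":\"${after}\",\"imageUri\":\"${repo_uri}:${after}\"}"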

cc @jakepruitt @zmully

ecs-conex and tags

A moment ago I committed be77d88, tagged it as v0.2.0 and then git push && git push --tags.

These two pushes resulted in two conex jobs, but the ECR repository did not end up with an image tagged with the git sha -- only the v0.2.0 image exists. My hunch is that because the two images are identical, ECR doesn't retain both? This could lead to a botched deploy if the deploy tool assumes that it can use the sha in the stack's GitSha parameter.

I'm not sure if there's a way to mitigate this, maybe we just need to document the behavior?

cc @emilymdubois @yhahn @emilymcafee

ecs-conex check not appearing in github pull request checks

I've set up ecs-conex successfully: it shows up as an installed service under my repo's "Integrations & services" settings, and Docker images are getting built successfully on each commit.

The only issue is that the PR check is missing from the GitHub UI.

@arunasank noted that we should look at

ecs-conex/utils.sh, lines 42 to 50 at 8014ea9:

function github_status() {
  local status=$1
  local description=$2
  curl -s \
    --request POST \
    --header "Content-Type: application/json" \
    --data "{\"state\":\"${status}\",\"description\":\"${description}\",\"context\":\"ecs-conex\"}" \
    ${status_url} > /dev/null
}

cc @rclark @arunasank - not urgent unless this is impacting other users. Thanks!

Tests

I think a good test suite would

  • build the image from the Dockerfile in this repo, then
  • run the image, building an image for another, real Github repository
  • [maybe] run the cloudformation template, triggering a repository build via webhook

cc @karenzshea

"Install docker binary matching EC2 version" (sic)

From https://github.com/mapbox/ecs-conex/blob/master/Dockerfile#L15-L17:

# Install docker binary matching EC2 version
RUN curl -sL https://get.docker.com/builds/Linux/x86_64/docker-1.11.1.tgz > docker-1.11.1.tgz
RUN tar -xzf docker-1.11.1.tgz && cp docker/docker /usr/local/bin/docker && chmod 755 /usr/local/bin/docker

Except the Docker binary on the EC2 is a somewhat moving target. It'll probably be safest if conex enforces version consistency at runtime, checking the docker binary it's going to run and getting access (somehow?) to the host's docker version, to make sure that the docker binary within conex is going to talk to a docker service/socket that it's expecting.
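A sketch of such a runtime check, using docker version's Go-template output to compare the bundled client with the host daemon it talks to:

# warn when the client binary baked into conex does not match the host's docker daemon
client=$(docker version --format '{{.Client.Version}}')
server=$(docker version --format '{{.Server.Version}}')
if [ "${client}" != "${server}" ]; then
  echo "docker client ${client} does not match host daemon ${server}" >&2
fi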

cc @mapbox/platform

Issues with rapid pushes

push or pull ${accountid}.dkr.ecr.us-east-1.amazonaws.com/${reponame} is already in progress

If this occurs, conex will exit, watchbot will retry the job, and an error notification will be sent. Desired behavior would be a silent retry (exit code 4).

Change manifest format on ECR images

Ref: #97 (comment)

When you push and pull images to and from Amazon ECR, your container engine client (for example, Docker) communicates with the registry to agree on a manifest format that is understood by the client and the registry to use for the image.

When you push an image to Amazon ECR with Docker version 1.9 or older, the image manifest format is stored as Docker Image Manifest V2 Schema 1. When you push an image to Amazon ECR with Docker version 1.10 or newer, the image manifest format is stored as Docker Image Manifest V2 Schema 2.

When you pull an image from Amazon ECR by tag, Amazon ECR returns the image manifest format that is stored in the repository, but only if that format is understood by the client. If the stored image manifest format is not understood by the client (for example, if a Docker 1.9 client requests an image manifest that is stored as Docker Image Manifest V2 Schema 2), Amazon ECR converts the image manifest into a format that is understood by the client (in this case, Docker Image Manifest V2 Schema 1).

Next actions

  • Update the Docker version on ecs-conex
  • Set up a stack to pull and re-push existing images in ECR - this will be a one-time task for existing images.

cc/ @mapbox/platform

Provide Better Error Message for Timeout

It seems like there is a default limit of 20 minutes for building Docker images set here.

For more involved Docker images this is often not enough. I didn't get any error message for hitting the timeout; all I saw was the Docker build getting killed partway through building the image.

Is it possible to issue a timeout-reached error message?
