
degica / barcelona

PaaS built on top of AWS

License: MIT License

Ruby 94.75% CSS 0.12% HTML 2.63% Makefile 0.20% Shell 0.02% Go 1.15% JavaScript 0.33% Python 0.52% Dockerfile 0.29% Procfile 0.01%
aws barcelona docker ecs paas rails

barcelona's Introduction

⚔️ Degica Quest ⚔️

Welcome brave Ruby warrior. An epic adventure awaits you.

🛠 How to Play

Install the rubygem

gem install degica

And then execute:

$ degica

💪 Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/degica/degica. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

©️ License

MIT

barcelona's People

Contributors

camelmasa, davidsiaw, dependabot[bot], essa, ftlam11, greysteil, iorin0225, jdgc, k2nr, kenta-s, maduranga, matafc, metallion, ochko, pmq20, resonious, rramsden, shioyama, showwin, yuuki77

barcelona's Issues

Track deployment history

Right now I have no idea how to do it, but it would be very useful if Barcelona stored deployment history.
With a deploy history we could support something like bcn deploy rollback.
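
A minimal sketch of what a deploy history could look like, assuming each deployment is identified by an image tag. The DeployHistory class and its methods are hypothetical and kept in memory for illustration; Barcelona would presumably persist the entries in its database.

```ruby
# Hypothetical in-memory history; a real implementation would persist entries.
class DeployHistory
  Entry = Struct.new(:image_tag, :deployed_at)

  def initialize
    @entries = []
  end

  # Appends one deployment to the history and returns the new entry.
  def record(image_tag, at: Time.now)
    entry = Entry.new(image_tag, at)
    @entries << entry
    entry
  end

  # The most recent deployment, or nil when nothing has been deployed yet.
  def current
    @entries.last
  end

  # The tag a rollback would redeploy, or nil when there is no earlier deployment.
  def rollback_target
    @entries[-2]&.image_tag
  end
end

history = DeployHistory.new
history.record("v100")
history.record("v101")
puts history.rollback_target # prints "v100"
```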

Support Kinesis Stream as log destination

This is an internal request. Barcelona currently supports only Logentries as a log destination, which is not flexible enough for our logging requirements.
We need to send some logs to Logentries and others to Elasticsearch.
It doesn't make sense for Barcelona to support many log drivers. Instead, I want to support Kinesis Stream so that Barcelona users can pull logs from it. Users can then consume the logs stored in Kinesis Stream by implementing a "puller" component (usually, I guess, a Lambda function).

Find a way to deploy Barcelona to ECS cluster

The current Barcelona API is deployed on Heroku at https://barcelona-demo.herokuapp.com, but for production Barcelona has to be deployed to trusted infrastructure: ECS plus a secure VPC.

Because Barcelona cannot be used to deploy Barcelona, I have to write a CloudFormation template that describes the ECS cluster, service, and task definition dedicated to the Barcelona API service.

Use cloud-config in EC2 userdata

Barcelona currently uses a bash script for EC2 userdata. It should be written in cloud-config format because that makes long-running services easier to maintain.

HTTP Proxy

It looks like the ECS agent has started supporting HTTP proxies: aws/amazon-ecs-agent#211
It is still an experimental feature. Once it becomes stable, I'd like to set up an HTTP proxy (squid) in Barcelona clusters so that I can remove the NAT instances.

Expire login token

Login tokens should expire to improve security. For now I'll start with a fixed "1 day" expiration.
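
A rough sketch of a fixed one-day expiration, assuming a token records when it was issued. LoginToken and its methods are hypothetical names for illustration, not Barcelona's actual model.

```ruby
require "securerandom"

# Hypothetical token model; Barcelona's real user/token classes may differ.
class LoginToken
  TTL_SECONDS = 24 * 60 * 60 # the fixed "1 day" expiration from this issue

  attr_reader :value, :issued_at

  def initialize(issued_at: Time.now)
    @value = SecureRandom.hex(32)
    @issued_at = issued_at
  end

  def expires_at
    issued_at + TTL_SECONDS
  end

  # A token is rejected once its lifetime has elapsed.
  def expired?(now: Time.now)
    now >= expires_at
  end
end

token = LoginToken.new
puts token.expired?(now: token.issued_at)               # prints "false"
puts token.expired?(now: token.issued_at + 24 * 60 * 60) # prints "true"
```

Passing the clock in as an argument keeps the expiry check easy to test without waiting a day.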

Add swap file

Sometimes a container instance dies. I don't know why, but I guess it is because memory was exhausted.

I thought ECS instances could work without swap because the memory limit of each container is strictly defined and there is an additional 128MB of reserved memory that ECS does not use.

But in general it is good practice to use swap.

Scheduled tasks(cron)

This is the last essential feature that Barcelona currently lacks. Even without it, you can run a cron container for scheduled jobs, but that is not resource-efficient and it requires every application to implement cron tasks in its own way.

What I'd like to implement is the following heritage configuration:

{
  "name": "komoju-production",
  "services": [
    // ...
  ],
  // ...
  "scheduled_tasks": [
    {
      "schedule": "*/15 * * * *",
      "command": "rake cron:expire_old_payments"
    }
  ],
  // ...
}
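
A small sketch of how the proposed "scheduled_tasks" key might be validated, assuming the heritage has already been parsed into a Ruby hash. ScheduledTaskValidator is a hypothetical helper, and the check is deliberately shallow: it only counts the five cron fields and does not validate their values.

```ruby
# Hypothetical validator for the proposed "scheduled_tasks" key.
module ScheduledTaskValidator
  CRON_FIELDS = 5 # minute, hour, day of month, month, day of week

  # Returns a list of error messages; an empty list means the entries look valid.
  def self.errors_for(heritage)
    tasks = heritage.fetch("scheduled_tasks", [])
    tasks.each_with_index.flat_map do |task, i|
      errors = []
      unless task["schedule"].to_s.split.size == CRON_FIELDS
        errors << "scheduled_tasks[#{i}]: schedule must have #{CRON_FIELDS} fields"
      end
      if task["command"].to_s.strip.empty?
        errors << "scheduled_tasks[#{i}]: command is required"
      end
      errors
    end
  end
end

heritage = {
  "name" => "komoju-production",
  "scheduled_tasks" => [
    { "schedule" => "*/15 * * * *", "command" => "rake cron:expire_old_payments" }
  ]
}
puts ScheduledTaskValidator.errors_for(heritage).empty? # prints "true"
```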

Fine-grained permission control per team

The current permission control is so simple that we cannot control a user's access to districts and services. Per-user permissions are, I think, overkill, so I propose per-team permissions as follows.

Team resource

I propose adding a Team resource (model) that holds the following permissions:

Name | Default | Description
districts | [] | A list of districts that team members can access
role | member | One of admin, developer, or readonly
github_team | (none) | The name of the GitHub team that the Barcelona team is linked to
github_organization | (none) | The name of the GitHub organization that the Barcelona team is linked to

User

  • Each user can belong to multiple teams
  • A user should first provide a GitHub token
    • Barcelona gets GitHub information about the user and decides which teams the user will belong to

Roles

Each Team has one of the following roles.

admin role

The admin role can do anything to the districts the team belongs to.

developer role

The developer role can do anything admin can do except the following actions:

  • creating / updating / deleting a district
  • launching / terminating a container instance
  • deleting a heritage
  • creating / updating / deleting a team
  • updating / deleting a user (other than themselves)

readonly role

Maybe we don't need this role?

readonly role can do only the following actions:

  • show a district
  • show a heritage
  • show a team
  • show a user

@essa what do you think about this? Is this too complicated or not so flexible?
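
To make the proposal concrete, here is a minimal sketch of the role checks described above. The TeamPermissions module and the action symbols are hypothetical names; district membership is not modeled, only the role-to-action rules.

```ruby
# Hypothetical permission table derived from the role descriptions above.
module TeamPermissions
  # The only actions the readonly role may perform.
  READONLY_ACTIONS = %i[show_district show_heritage show_team show_user].freeze

  # Actions the proposal reserves for the admin role.
  ADMIN_ONLY_ACTIONS = %i[
    create_district update_district delete_district
    launch_instance terminate_instance
    delete_heritage
    create_team update_team delete_team
    update_other_user delete_other_user
  ].freeze

  def self.allowed?(role, action)
    case role
    when :admin     then true
    when :developer then !ADMIN_ONLY_ACTIONS.include?(action)
    when :readonly  then READONLY_ACTIONS.include?(action)
    else false # unknown roles get no access
    end
  end
end

puts TeamPermissions.allowed?(:developer, :delete_heritage) # prints "false"
puts TeamPermissions.allowed?(:developer, :show_heritage)   # prints "true"
```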

Add automated deploy trigger

This is not a required feature, but it would be very useful if Barcelona had a deploy trigger API. The workflow would be as follows:

  1. New commits are pushed to the GitHub master branch
  2. GitHub sends a webhook to quay.io (the Docker registry)
  3. quay.io starts building a Docker image from the latest commit
  4. When the build finishes, quay.io sends a webhook to Barcelona (this is the trigger API)
  5. Barcelona calls ECS's update_service API for the staging environment

Save (encrypted) AWS keys in DB

(If the infra migrations complete successfully) we will have at least 3 districts:

  • Komoju (AWS komoju account)
  • Degica production (degica2 account)
  • Degica staging (degica3 account)

Those districts will belong to different AWS accounts, so Barcelona needs to store AWS credentials in the DB per district.
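
One possible way to encrypt a credential before storing it, sketched with Ruby's standard OpenSSL bindings and AES-256-GCM. CredentialCipher is a hypothetical helper, and where the 32-byte key itself lives (environment variable, KMS, and so on) is an open question this sketch does not answer.

```ruby
require "openssl"

# Hypothetical helper; key management is out of scope for this sketch.
module CredentialCipher
  ALGORITHM = "aes-256-gcm"
  IV_BYTES = 12  # default IV length for GCM
  TAG_BYTES = 16 # default authentication tag length

  # Returns a Base64 string containing IV, auth tag, and ciphertext.
  def self.encrypt(plaintext, key)
    cipher = OpenSSL::Cipher.new(ALGORITHM)
    cipher.encrypt
    cipher.key = key
    iv = cipher.random_iv
    ciphertext = cipher.update(plaintext) + cipher.final
    [iv + cipher.auth_tag + ciphertext].pack("m0")
  end

  # Raises OpenSSL::Cipher::CipherError if the key is wrong or the data was altered.
  def self.decrypt(encoded, key)
    raw = encoded.unpack1("m0")
    cipher = OpenSSL::Cipher.new(ALGORITHM)
    cipher.decrypt
    cipher.key = key
    cipher.iv = raw[0, IV_BYTES]
    cipher.auth_tag = raw[IV_BYTES, TAG_BYTES]
    cipher.update(raw[(IV_BYTES + TAG_BYTES)..-1]) + cipher.final
  end
end

key = OpenSSL::Random.random_bytes(32)
stored = CredentialCipher.encrypt("example-secret-access-key", key)
puts CredentialCipher.decrypt(stored, key) # prints "example-secret-access-key"
```

GCM is used here because it authenticates the ciphertext, so a tampered database value fails to decrypt instead of yielding garbage credentials.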

m4.large instance failed to launch

When I tried to change the cluster instance type to m4.large, the init script failed with this error message: error reading information on service barcelona: Invalid argument

I had never seen this error until today, so I think the current configuration cannot work with m4.large instances (and probably other instance types; I have only used t2.* instance types).

Integrate with AutoScaling

The current algorithm of picking a subnet for a new instance is:

subnet_id: section.subnets.sample.subnet_id

This picks a subnet at random, which is really bad for high availability :(
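
For comparison, a sketch of a balanced selection that prefers the availability zone with the fewest running instances. This is not the Auto Scaling integration itself, only an illustration of the balancing that the random pick lacks; the Subnet struct, pick_subnet, and the instance_counts hash are hypothetical.

```ruby
# Hypothetical stand-in for the subnet objects held by a section.
Subnet = Struct.new(:subnet_id, :availability_zone)

# Picks the subnet whose availability zone currently hosts the fewest instances.
# `instance_counts` maps an availability zone name to its running instance count.
# Ties are broken by subnet_id so the result is deterministic.
def pick_subnet(subnets, instance_counts)
  subnets.min_by { |s| [instance_counts.fetch(s.availability_zone, 0), s.subnet_id] }
end

subnets = [
  Subnet.new("subnet-a", "ap-northeast-1a"),
  Subnet.new("subnet-c", "ap-northeast-1c")
]
counts = { "ap-northeast-1a" => 3, "ap-northeast-1c" => 1 }
puts pick_subnet(subnets, counts).subnet_id # prints "subnet-c"
```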

Replace system tasks with EC2 Run Command API

Now that AWS provides the Run Command API in the Tokyo region, we can use it to manage systems.
Currently these kinds of system tasks are done through ECS's RunTask API, but that is a somewhat magical approach, so I want to replace it with more straightforward shell scripting.

Provide CloudFormation template for a district

Related #125

The reason I haven't provided a district template is that users may want to set up their VPC differently. A user may need:

  • A VPC with public subnets only
  • A VPC with public/private subnets and NAT instances
    • They may want to set up high-availability NAT instances
  • A VPC with public/private subnets and outbound proxy services

So the desired configuration differs completely from user to user, depending on their needs.

But with NAT gateway, there is no reason not to choose public/private subnets with the managed NAT gateway. Barcelona, as an opinionated private PaaS powered by ECS, should provide a pre-configured VPC template.

Implement user roles and permissions

We need two user roles, mostly because of PCI requirements:

  • admin
    • can access all districts
    • can sudo in container instances
    • can register non-admin users' public keys
  • non-admin
    • cannot sudo
    • can access only districts permitted by admin users

To keep things simple, I will just use GitHub teams:

  • The GitHub admin developers team maps to Barcelona's admin users
  • The GitHub developers team maps to Barcelona's non-admin users

Find a way to safely deregister and terminate container instances

Currently the ECS agent has no way to safely terminate ECS container instances that run long-running services or background jobs. See aws/amazon-ecs-agent#130 and aws/amazon-ecs-agent#126.

What I want to do is:

  • Add a "terminate container instances" endpoint to Barcelona
  • When the endpoint is called, Barcelona calls the StartTask API to run a utility container that deregisters the instance, stops running containers, and finally terminates the instance.

Limit access to environment variables

Currently any user can get and post environment variables, which could be a security hole. Our first step would be to introduce a HeritagePolicy that defines who can access the environment variables of a particular heritage.

Confirm (and change) logentries TLS configuration

When I set up the Logentries plugin, I followed the official documentation (if I remember correctly) and configured api.logentries.com:20000 for the TLS-encrypted TCP connection. The documentation has since changed (https://logentries.com/doc/rsyslog/) and now says data.logentries.com:443 should be used.

I don't know which configuration is correct, so I sent an email to Logentries. If they say data.logentries.com:443 should be used (which would mean api.logentries.com:20000 is legacy?), I'll update the Logentries plugin configuration.

API Schema

JSON hyper schema? swagger? Maybe it's better to use JSON hyper schema

Don't retry delayed job if it fails

Retrying failed jobs is meaningless. At least for now, if a job fails, something is wrong in the Barcelona code and it cannot be recovered by retrying. It is enough to send an error notification to Slack.
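
The intended behaviour, sketched independently of any job library: run the job once and report the failure instead of re-enqueueing it. run_once and the notify callable are hypothetical; with the delayed_job gem the retry count is normally controlled through its max_attempts setting, and the Slack call would replace the lambda used here.

```ruby
# Hypothetical runner: executes a job exactly once and never retries.
# `notify` is any callable that delivers the error message (for example to Slack).
def run_once(job, notify:)
  job.call
  :ok
rescue StandardError => e
  notify.call("job failed: #{e.class}: #{e.message}")
  :failed
end

messages = []
status = run_once(-> { raise ArgumentError, "bad input" }, notify: ->(m) { messages << m })
puts status         # prints "failed"
puts messages.first # prints "job failed: ArgumentError: bad input"
```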

Integrate with VPC NAT gateway

I'll work on this once CloudFormation supports NAT gateway. With NAT gateway we won't need the proxy plugin or public sections.

Make sure that all degica delayed jobs finish in 30 seconds

When ECS's stop_task is executed, Docker first sends SIGTERM to the container. If the container has still not stopped 30 seconds later, Docker sends SIGKILL, the same signal that kill -9 sends. That 30 seconds cannot be customized.

For example, when I deploy a new version of an application, ECS first spins up a new container with the new image, and once the new container reaches a steady state, ECS tries to stop the old containers. If delayed_job inside an old container is processing a long-running job, the job will be forcibly killed after 30 seconds.
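
A sketch of the cooperative half of this problem: a worker that stops picking up new jobs once SIGTERM arrives, so only the job in flight has to finish inside the 30-second window. GracefulWorker is a hypothetical class, not delayed_job's worker, and it cannot rescue a single job that itself runs longer than 30 seconds.

```ruby
# Hypothetical worker loop; delayed_job's own worker is not modeled here.
class GracefulWorker
  def initialize
    @stopping = false
  end

  # Registers the SIGTERM handler; the handler only flips a flag.
  def install_signal_handler
    Signal.trap("TERM") { stop! }
  end

  def stop!
    @stopping = true
  end

  # Runs jobs in order and returns the results of the jobs that completed.
  # A job that is already running is allowed to finish; later jobs are skipped.
  def run(jobs)
    results = []
    jobs.each do |job|
      break if @stopping
      results << job.call
    end
    results
  end
end

worker = GracefulWorker.new
jobs = [-> { "first" }, -> { worker.stop!; "second" }, -> { "third" }]
p worker.run(jobs) # prints ["first", "second"]
```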

Can't patch user when container instances are zero

I failed to update the public_key of '/user'.

I tested it in the Rails console and found that it raises Aws::ECS::Errors::InvalidParameterException: Container Instances cannot be empty.

User tries to run UpdateUserTask for every District when public_key is updated, but some Districts may have no container instances. Such a DistrictSection should be skipped.

I think SystemTask#run should skip executing the task if container_instance_arns is empty.

@k2nr

I will make a PR for it, but I'd like to confirm whether you would prefer to check this at an earlier phase, such as in DistrictSection#update_instance_user_account.
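
A simplified sketch of the guard proposed for SystemTask#run. SystemTaskSketch is a hypothetical stand-in for the real class, and the ECS call is replaced by an injected callable so the example runs on its own.

```ruby
# Hypothetical, simplified stand-in for Barcelona's SystemTask.
class SystemTaskSketch
  def initialize(container_instance_arns, start_task:)
    @container_instance_arns = container_instance_arns
    @start_task = start_task
  end

  # Returns :skipped when there are no instances to run on,
  # otherwise whatever the injected start_task callable returns.
  def run
    return :skipped if @container_instance_arns.empty?

    @start_task.call(@container_instance_arns)
  end
end

started = []
start_task = ->(arns) { started.concat(arns); :started }
puts SystemTaskSketch.new([], start_task: start_task).run          # prints "skipped"
puts SystemTaskSketch.new(["arn:1"], start_task: start_task).run   # prints "started"
```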

Support before_deploy scripts

To make it possible to run migrations before a deploy starts, a before_deploy script should be supported in the heritage JSON.

I'm planning to support the following JSON format:

{
    "name": "my-application",
    "container_name": "quay.io/my-application",
    "container_tag": "v100",
    "before_deploy": [  // Adding this
        "bundle exec rake db:migrate"
    ],
    "services": [
        {
            "name": "web",
            "cpu": 1024,
            "memory": 512,
            "public": false,
            "port_mappings": [
                {"lb_port": 80, "container_port": 3000}
            ]
        }
    ]
}
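
A sketch of how the before_deploy commands might be processed, assuming they run in order and that any failure aborts the deploy. deploy_with_hooks, run_command, and deploy are hypothetical names; the actual command execution (for example via an ECS task) is injected as a callable.

```ruby
# Hypothetical deploy driver; `run_command` and `deploy` are injected callables.
# `run_command` is expected to return true when the command succeeded.
def deploy_with_hooks(heritage, run_command:, deploy:)
  heritage.fetch("before_deploy", []).each do |command|
    raise "before_deploy failed: #{command}" unless run_command.call(command)
  end
  # Only reached when every before_deploy command succeeded.
  deploy.call(heritage)
end

heritage = {
  "name" => "my-application",
  "before_deploy" => ["bundle exec rake db:migrate"]
}
executed = []
deploy_with_hooks(
  heritage,
  run_command: ->(cmd) { executed << cmd; true },
  deploy: ->(h) { executed << "deploy #{h['name']}" }
)
p executed # prints ["bundle exec rake db:migrate", "deploy my-application"]
```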

deploy lock

If two or more deployments are triggered simultaneously, something bad could happen. It's hard to guess exactly what, but maybe Barcelona should lock deployments.
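
A minimal sketch of a per-heritage deploy lock. DeployLock is hypothetical and only works inside a single process; with several API processes the lock would have to live in shared storage such as the database.

```ruby
# Hypothetical in-process lock keyed by heritage name.
class DeployLock
  class AlreadyLocked < StandardError; end

  def initialize
    @mutex = Mutex.new
    @locked = {}
  end

  # Runs the block while holding the lock for the given heritage.
  # Raises AlreadyLocked if another deployment of that heritage is in progress.
  def with_lock(heritage_name)
    acquire(heritage_name)
    begin
      yield
    ensure
      release(heritage_name)
    end
  end

  private

  def acquire(name)
    @mutex.synchronize do
      raise AlreadyLocked, "#{name} is already being deployed" if @locked[name]

      @locked[name] = true
    end
  end

  def release(name)
    @mutex.synchronize { @locked.delete(name) }
  end
end

lock = DeployLock.new
lock.with_lock("my-application") { puts "deploying" } # prints "deploying"
```

Rejecting the second deployment instead of queueing it keeps the sketch simple; whether to reject or wait is a design decision the issue leaves open.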

Speed up specs

Why is it so slow? Currently, running all specs (107 examples) takes 76 seconds.

Heritage access key

In most automation cases, we don't need access keys that can access everything. For example, Travis deployments only need access to a single heritage.

Support ECR

With ECR, ecs-agent gets permission to pull images via the ECR API, which means a user doesn't need to set up dockercfg. This makes initial setup easier.

I think there are 2 issues:

  • Unlike ecs-agent, bcn run depends on the standard docker pull, so bcn run doesn't work without an additional dockercfg setting.
  • Not related to Barcelona itself, but we depend heavily on quay's automated builds in our deployment pipeline, and ECR doesn't have an automated image build solution.

terminate_instance API doesn't terminate an EC2 instance

When the terminate_instance API is called, Barcelona performs a safe termination by running k2nr/ecs-instance-terminator on the target instance.

ecs-instance-terminator performs the following steps sequentially:

  1. Deregister the container instance from the ECS cluster
  2. Stop all Docker containers
  3. sleep $STOP_TIMEOUT (120 by default)
  4. Terminate the instance

Only the final step doesn't work as expected.

Change barcelona service_type to web

Sometimes an API request or response doesn't get through correctly, I guess because the Barcelona process is currently not behind a reverse proxy. Changing Barcelona's service_type to web should fix this problem.

The hard part is that service_type is immutable, so we need to recreate Barcelona somehow. But if I delete Barcelona, we cannot create it again because Barcelona no longer exists.
