NAME

Bespoke - an asset management system for every possible artifact involved with IT operations

FIRST

The thoughts below are a first-pass braindump. Take them with a grain of salt and flame me at [email protected], @AlTobey on Twitter, or "tobert" on Freenode IRC ##devops or ##infra-talk.

CHOICES

This implementation is in Perl. I chose Perl because I can write it as fast as I think of things, but the backend is by no means specific to Perl. I will likely implement pieces in Ruby, JavaScript/Node.js, and Python as personal experiments.

For the CAS, I chose SHA512 so I never have to think about that choice again or offer any kind of configuration around it. On 64-bit machines, it's plenty fast and storage is cheap.
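As a rough illustration of what a fixed SHA512 key looks like in practice, here is a minimal Python sketch (Python rather than Perl, purely for illustration). The function name `sha512_of_file` and the chunk size are assumptions made for this example and are not taken from Bespoke itself.

```python
import hashlib


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, reading it in chunks.

    Hypothetical helper: the name and the 1 MiB chunk size are
    illustrative choices, not part of Bespoke.
    """
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large artifacts never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The digest is always 128 hex characters, so every artifact gets a key of the same shape regardless of its size or type.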

The UUIDs for metadata use the default from Perl's Data::UUID, which appears to be UUIDv1. It doesn't matter one whit, so that could change down the road, though it will likely stay some kind of UUID.

The choice of both SHA512 and UUID is meant to make scalability as implicit as possible. The filesystem can be fragmented along the first two levels of the directory structure, which are named from the first 32 bits of the SHA512 in hex. It would be trivial to divide the storage across 256 vnodes and distribute those to many machines, but building a distributed system is not a goal at the moment. The UUID-based metadata files allow adding metadata instances with little regard to other instances or coordination.
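The directory fan-out and vnode split described above might look something like the following Python sketch. The exact layout is an assumption: the text only says two directory levels cover the first 32 bits, so this example guesses 4 hex characters (16 bits) per level, and it guesses that a vnode is picked from the first byte of the digest. The names `object_path` and `vnode_for` are hypothetical.

```python
import hashlib

# Assumptions for illustration only: two levels of 4 hex characters each
# (2 x 16 bits = the first 32 bits of the digest), and 256 vnodes chosen
# by the first byte. Bespoke's real layout may differ.
FANOUT_LEVELS = 2
CHARS_PER_LEVEL = 4
VNODE_COUNT = 256


def object_path(hex_digest):
    """Build a relative path such as 'abcd/ef01/<full digest>'."""
    levels = [
        hex_digest[i * CHARS_PER_LEVEL:(i + 1) * CHARS_PER_LEVEL]
        for i in range(FANOUT_LEVELS)
    ]
    return "/".join(levels + [hex_digest])


def vnode_for(hex_digest):
    """Map a digest to one of 256 vnodes using its first byte."""
    return int(hex_digest[:2], 16) % VNODE_COUNT


digest = hashlib.sha512(b"example artifact").hexdigest()
print(object_path(digest))
print(vnode_for(digest))
```

Because SHA512 output is uniformly distributed, objects spread evenly across both the directories and the vnodes without any coordination between writers.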

SYNOPSIS

Bespoke's goal is to track /all/ of the artifacts in your system - where system means all your operating systems, firmware, switch configs, scripts, binaries, packages, and intermediate objects, etc.

It will track all of these things across the past, present, and future. So down the road when you get asked to do forensics on some logs from a month ago, you don't have to wonder what the bit-for-bit configuration was when the logs were generated. You'll have the bits.

To accomplish this goal, Bespoke implements a simple Content Addressable Storage (CAS) system on top of an underlying filesystem. A CAS offers free deduplication at the cost of having metadata loosely coupled to the data it points at. For Bespoke, this is great, since it needs to generate a bunch of different views into the data anyway.
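To make the deduplication and the loosely coupled metadata concrete, here is a minimal Python sketch of a filesystem-backed CAS. Everything in it is an assumption made for illustration: the `TinyCAS` class, its `put`/`get`/`describe` methods, the flat `objects` and `metadata` directories, and the JSON metadata format are not Bespoke's actual design.

```python
import hashlib
import json
import os
import uuid


class TinyCAS:
    """Hypothetical toy CAS: blobs keyed by SHA-512, metadata keyed by UUID."""

    def __init__(self, root):
        # Flat directories keep the sketch short.
        self.objects = os.path.join(root, "objects")
        self.metadata = os.path.join(root, "metadata")
        os.makedirs(self.objects, exist_ok=True)
        os.makedirs(self.metadata, exist_ok=True)

    def put(self, data):
        """Store bytes under their digest; identical content is written once."""
        key = hashlib.sha512(data).hexdigest()
        path = os.path.join(self.objects, key)
        if not os.path.exists(path):
            with open(path, "wb") as handle:
                handle.write(data)
        return key

    def get(self, key):
        """Return the bytes stored under a digest."""
        with open(os.path.join(self.objects, key), "rb") as handle:
            return handle.read()

    def describe(self, key, **fields):
        """Write one metadata record pointing at a digest; return its UUID."""
        record_id = str(uuid.uuid1())
        record = dict(fields, sha512=key)
        record_path = os.path.join(self.metadata, record_id + ".json")
        with open(record_path, "w") as handle:
            json.dump(record, handle)
        return record_id
```

In this sketch, storing the same config file from two hosts results in one object and two metadata records. Each record gets its own UUID-named file, so writers never need to agree on a name or lock a shared index.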

DESCRIPTION

The infrastructure-as-code idea is growing by leaps and bounds right now. The current state of the art focuses this (grossly) around two subsystems: configuration management (e.g. Puppet, Chef, BCFG2, Spine, CFEngine, etc.) and real-time control systems (MCollective, Fabric, Capistrano).

Obviously, if you're managing your infrastructure as code, you should manage that code in some kind of source control system. This is good and true and dissenters should burn.

This leaves out a ton of data though.

packages

OS packages, jars, gems, pars, eggs, tarballs, rsync modules

hand-rolled binaries

/usr/local/bin/bash, /sbin/busybox

miscellaneous scripts

You know you still have some. Cron jobs, config scripts, monitoring hacks, etc.

/usr/local/bin/*.sh

deployed code

The stuff engineering flings over the wall. It could be anything.

intermediate artifacts

Puppet graph dumps, DDL (database table definitions) dumps, transition scripts, not-really-tempfiles

The reason a lot of this isn't kept around is optimization. A 200-node network could easily chew up a few terabytes keeping everything but the meat-and-potatoes data on a secondary system.

Bespoke solves the problem by providing its CAS and a multitude of views into the CAS so runtime data can be forward engineered, reverse engineered, and queried in as many useful ways as possible.
