anser -- Database Migration Toolkit

Summary

Anser is a toolkit for managing evolving data sets for applications. It focuses on on-line data transformations and providing higher-level tools to support data modeling and access.

For the Evergreen project, anser allows us to treat routine data migrations, data backfills, and retroactive schema changes to legacy data as part of application code rather than as one-off shell scripts.

Overview

In general, anser migrations take a two-phase approach. First, a generator runs with some configuration and an input query to collect input documents and create migration jobs. Then the jobs produced by these generators are executed in parallel.

You can define generators directly in your own code, or you can use the configuration-file based approach for more flexibility.
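
The two-phase flow described above can be sketched in plain Go. This is a minimal illustration using only the standard library, not anser's API: `Document`, `Job`, `Generate`, and `Execute` are hypothetical names, the "query" is an in-memory predicate, and the worker pool stands in for the amboy queue that anser actually uses.

```go
package main

import (
	"fmt"
	"sync"
)

// Document is a hypothetical stand-in for a stored record.
type Document map[string]interface{}

// Job is a hypothetical unit of migration work for a single document.
type Job struct {
	ID  string
	Run func() error
}

// Generate is phase one: it selects the documents matching a query
// predicate and creates one migration job per matching document.
func Generate(docs []Document, match func(Document) bool, migrate func(Document) error) []Job {
	var jobs []Job
	for i, doc := range docs {
		if !match(doc) {
			continue
		}
		doc := doc // capture the per-iteration value for the closure
		jobs = append(jobs, Job{
			ID:  fmt.Sprintf("migration-%d", i),
			Run: func() error { return migrate(doc) },
		})
	}
	return jobs
}

// Execute is phase two: it runs the generated jobs in parallel on a fixed
// number of workers and returns how many jobs reported an error.
func Execute(jobs []Job, workers int) int {
	if workers < 1 {
		workers = 1
	}
	queue := make(chan Job)
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex
		failed int
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for job := range queue {
				if err := job.Run(); err != nil {
					mu.Lock()
					failed++
					mu.Unlock()
				}
			}
		}()
	}
	for _, job := range jobs {
		queue <- job
	}
	close(queue)
	wg.Wait()
	return failed
}

func main() {
	docs := []Document{
		{"_id": 1, "name": "a"},
		{"_id": 2, "name": "b", "status": "active"},
		{"_id": 3, "name": "c"},
	}
	// Query: documents that are missing the "status" field.
	needsStatus := func(d Document) bool { _, ok := d["status"]; return !ok }
	// Migration: back-fill a default value.
	setStatus := func(d Document) error { d["status"] = "unknown"; return nil }

	jobs := Generate(docs, needsStatus, setStatus)
	failed := Execute(jobs, 2)
	fmt.Println("jobs:", len(jobs), "failed:", failed)
}
```

Keeping generation separate from execution means the set of jobs is known before any document is modified, which is what allows the execution phase to be parallelized and throttled independently.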

Concepts

There are three major types of migrations:

  • simple: these migrations perform their transformations using MongoDB's update syntax. Use these for very basic migrations, particularly when you want to throttle the rate of migrations and avoid larger, difficult-to-index multi-updates.
  • manual: these migrations call a user-defined function on a bson.RawDoc representation of the document to migrate. Use these migrations for more complex transformations or those migrations that you want to write in application code.
  • stream: these migrations are similar to manual migrations; however, they pass a database session and an iterator to all documents impacted by the migration. These jobs offer ultimate flexibility.
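
To make the difference between the three types concrete, here is a rough sketch of their shapes in plain Go. Every type and function here (`SimpleMigration`, `ManualMigration`, `StreamMigration`, `Iterator`, and so on) is a hypothetical illustration rather than anser's actual API; documents are modeled as maps instead of BSON, and the database session passed to stream migrations is omitted.

```go
package main

import (
	"errors"
	"fmt"
)

// Document is a hypothetical stand-in for a raw BSON document.
type Document map[string]interface{}

// SimpleMigration describes a change declaratively, in the spirit of
// MongoDB's update syntax; only a "$set"-like operation is modeled.
type SimpleMigration struct {
	Set map[string]interface{}
}

// Apply performs the declarative update on one document.
func (m SimpleMigration) Apply(doc Document) {
	for k, v := range m.Set {
		doc[k] = v
	}
}

// ManualMigration is a user-defined function called once per document.
type ManualMigration func(doc Document) error

// Iterator walks every document affected by a migration.
type Iterator interface {
	Next() (Document, bool)
}

// StreamMigration receives an iterator over all affected documents.
type StreamMigration func(iter Iterator) error

// sliceIterator is an in-memory Iterator used only for this example.
type sliceIterator struct {
	docs []Document
	pos  int
}

func (it *sliceIterator) Next() (Document, bool) {
	if it.pos >= len(it.docs) {
		return nil, false
	}
	doc := it.docs[it.pos]
	it.pos++
	return doc, true
}

// renameField is a manual migration: it validates the value's type in
// application code before renaming "nm" to "name".
func renameField(doc Document) error {
	v, ok := doc["nm"]
	if !ok {
		return nil // already migrated, or the field never existed
	}
	s, ok := v.(string)
	if !ok {
		return errors.New("field nm is not a string")
	}
	doc["name"] = s
	delete(doc, "nm")
	return nil
}

// assignSequence is a stream migration: it carries state (a counter)
// across documents, which a per-document migration cannot do.
func assignSequence(iter Iterator) error {
	seq := 0
	for doc, ok := iter.Next(); ok; doc, ok = iter.Next() {
		seq++
		doc["seq"] = seq
	}
	return nil
}

func main() {
	docs := []Document{{"nm": "alpha"}, {"nm": "beta"}}

	simple := SimpleMigration{Set: map[string]interface{}{"schema_version": 2}}
	var manual ManualMigration = renameField
	var stream StreamMigration = assignSequence

	for _, doc := range docs {
		simple.Apply(doc)
		if err := manual(doc); err != nil {
			fmt.Println("manual migration failed:", err)
		}
	}
	if err := stream(&sliceIterator{docs: docs}); err != nil {
		fmt.Println("stream migration failed:", err)
	}
	fmt.Println(docs)
}
```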

Internally, these jobs execute using amboy infrastructure, which makes it possible to express dependencies between migrations. Additionally, the MovingAverageRateLimitedWorkers and SimpleRateLimitingWorkers were developed to support anser migrations, as was the adaptive ordering local queue, which respects dependency-driven ordering.

Considerations

While it's possible to do any kind of migration with anser, we have found the following properties to be useful to keep in mind when building migrations:

  • Write your migration implementations to be idempotent, so that running them multiple times has the same effect as running them once.
  • Ensure that generator queries are supported by indexes; otherwise, the generator processes will force collection scans.
  • Rate limiting, provided by configuring the underlying amboy infrastructure, limits the number of migration (or generator) jobs executed, rather than limiting jobs based on their impact.
  • Use batch limits. Generators have limits to control the number of jobs that they will produce. This is particularly useful for tests, but may have adverse effects on job dependency, particularly if logical migrations are split across more than one generator function.
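
As an illustration of the idempotency and batch-limit points above, the following sketch uses plain Go and in-memory maps. The names `backfillSchemaVersion`, `selectBatch`, and `runPass`, as well as the `schema_version` field, are hypothetical and not part of anser; the convention that a limit of zero or less means "no limit" is likewise an assumption of this example only.

```go
package main

import "fmt"

// Document is a hypothetical stand-in for a stored record.
type Document map[string]interface{}

// backfillSchemaVersion is idempotent: it inspects the current state
// first, so applying it a second time leaves the document unchanged.
// It reports whether it modified the document.
func backfillSchemaVersion(doc Document) bool {
	if _, ok := doc["schema_version"]; ok {
		return false
	}
	doc["schema_version"] = 2
	return true
}

// selectBatch mimics a generator with a batch limit: it returns at most
// limit documents that still need migrating. A limit <= 0 means no limit
// (an assumption of this sketch).
func selectBatch(docs []Document, limit int) []Document {
	var batch []Document
	for _, doc := range docs {
		if limit > 0 && len(batch) >= limit {
			break
		}
		if _, ok := doc["schema_version"]; !ok {
			batch = append(batch, doc)
		}
	}
	return batch
}

// runPass generates one batch and migrates it, returning the number of
// documents that were actually changed.
func runPass(docs []Document, limit int) int {
	changed := 0
	for _, doc := range selectBatch(docs, limit) {
		if backfillSchemaVersion(doc) {
			changed++
		}
	}
	return changed
}

func main() {
	docs := make([]Document, 5)
	for i := range docs {
		docs[i] = Document{"_id": i}
	}
	// Repeated passes converge: once every document is migrated,
	// further passes change nothing.
	for pass := 1; pass <= 4; pass++ {
		fmt.Printf("pass %d changed %d documents\n", pass, runPass(docs, 2))
	}
}
```

Because the generator query and the migration both key off the same condition (the missing field), a pass that is interrupted or limited can simply be rerun until it reports no changes.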

Installation

Anser uses Go modules. To download the modules, run:

make mod-tidy

Resources

Please consult the godoc for most usage. Most of the API is in the top-level package; however, please also consider the model and bsonutil packages.

Additionally, you can use the interfaces in the db package as a wrapper around mgo to access MongoDB, which allows you to use mocks for testing without depending on a running database instance.

Project

Please file feature requests and bug reports in the EVG project of the MongoDB Jira instance. This is also the place to file related amboy and grip requests.

Future anser development will focus on supporting additional migration workflows, supporting additional MongoDB and BSON utilities, and providing tools to support easier data-life-cycle management.
