Humperdink

"He can track a falcon on a cloudy day." -- Princess Buttercup

About

Humperdink is a tool to track finite data sets in high-performance environments. Want to know which translation keys or Rails views are actually in use at runtime? Humperdink's your man.

Humperdink includes two core classes: DirtySet and Tracker. DirtySet tracks which items in the finite set have been persisted and which have not. Once a configurable size and/or duration is reached, the DirtySet persists any dirty items. The goal is to keep everything in memory and persist infrequently, so that tracking disrupts your application's performance as little as possible.
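The dirty/clean idea can be sketched in plain Ruby. This is a minimal illustration, not Humperdink's DirtySet: the class name `SketchDirtySet`, the persistence block, and the choice to persist as soon as the dirty count reaches the threshold are all assumptions made for the example.

```ruby
require 'set'

# Hypothetical stand-in for illustration; not Humperdink's DirtySet.
class SketchDirtySet
  attr_reader :clean, :dirty

  # max_dirty_items borrows the name of the option described in this README.
  # The block is an assumed persistence hook that receives the dirty items.
  def initialize(max_dirty_items:, &persist)
    @max_dirty_items = max_dirty_items
    @persist = persist
    @clean = Set.new # items already persisted
    @dirty = Set.new # items seen but not yet persisted
  end

  # Record an item; items that are already clean are ignored.
  def <<(item)
    @dirty << item unless @clean.include?(item)
    clean! if @dirty.size >= @max_dirty_items # assumed threshold rule
    self
  end

  # Persist all dirty items in one call, then mark them clean.
  def clean!
    return if @dirty.empty?
    @persist.call(@dirty.to_a)
    @clean.merge(@dirty)
    @dirty.clear
  end
end

persisted_batches = []
set = SketchDirtySet.new(max_dirty_items: 2) { |items| persisted_batches << items }
%w[a b a c d].each { |key| set << key }
```

Five tracked values result in only two persistence calls here, and the repeated `a` never reaches the persistence hook a second time.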

To a similar end, the Tracker class provides infrastructure for plugging in different persistence implementations, plus error handling to ensure that if something goes wrong, tracking quickly shuts itself down and gets out of the way.
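The "shut down and get out of the way" behavior might look roughly like the sketch below. `SketchTracker` and `FlakyStore` are hypothetical names invented for this example; the real Tracker's internals may differ.

```ruby
# Hypothetical stand-in for illustration; not Humperdink's Tracker.
class SketchTracker
  attr_reader :enabled, :last_error

  # store is anything that responds to <<, such as a dirty set.
  def initialize(store, enabled: true)
    @store = store
    @enabled = enabled
    @last_error = nil
  end

  # Track an item; any error disables tracking instead of propagating
  # into the host application.
  def track(item)
    return unless @enabled
    @store << item
  rescue StandardError => e
    shutdown(e)
  end

  def shutdown(error)
    @enabled = false
    @last_error = error
  end
end

# A store that fails on its second write, to exercise the shutdown path.
class FlakyStore
  attr_reader :items

  def initialize
    @items = []
  end

  def <<(item)
    raise IOError, 'persistence unavailable' if @items.size >= 1
    @items << item
  end
end

store = FlakyStore.new
tracker = SketchTracker.new(store)
%w[one two three].each { |item| tracker.track(item) }
```

After the failure on the second item the tracker disables itself, so the third call returns immediately without touching the store.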

At this early stage of development, Humperdink is a generic tracking mechanism: it still needs some integration work depending on what data you want to track, and it only provides Redis persistence.

It also supports configuration options that allow for easy tracking within long-running processes (e.g. Unicorn), short-running processes (e.g. cron jobs or Rake tasks) or less common forking setups (e.g. Resque).

Example - i18n Keys

Included in the source is an example of one way to integrate a Tracker into the I18n gem and track all keys being passed into the translate method.

examples/i18n/key_tracker.rb

Bundler.require
require File.expand_path('../i18n_util', __FILE__)

class KeyTracker
  def initialize(redis, key)
    redis_set = Humperdink::RedisDirtySet.new(:redis => redis, :key => key, :max_dirty_items => 9)
    @tracker = Humperdink::Tracker.new(redis_set, :enabled => true)
  end

  def on_translate(locale, key, options = {})
    begin
      if @tracker.tracker_enabled
        requested_key = normalize_requested_key(key, options)
        @tracker.track(requested_key)
      end
    rescue => e
      @tracker.shutdown(e)
    end
  end

  def normalize_requested_key(key, options)
    separator = options[:separator] || I18n.default_separator
    # this is a cheap way to reduce the amount of string manipulation
    # performed inside normalize_keys, based on the presumption that
    # the :scope is infrequently used. If that presumption is not true
    # then there may be some performance concerns with tracking many
    # translate calls in a short period of time.
    if options[:scope]
      requested_key = I18n.normalize_keys(nil, key, options[:scope], separator).join(separator)
    else
      requested_key = key.to_s
    end
    requested_key
  end
end

module KeyTrackerBackend
  def key_tracker
    @key_tracker
  end

  def key_tracker=(value)
    @key_tracker = value
  end

  def translate(locale, key, options = {})
    @key_tracker.on_translate(locale, key, options) if @key_tracker
    super
  end
end

def setup
  I18nFaker.new.load_em_up(:total => 2500, :max_depth => 7)
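  # Redis.connect was deprecated and later removed from redis-rb; Redis.new accepts the same :url option.
  @redis = Redis.new(:url => 'redis://127.0.0.1:6379/8')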
  @all_keys = KeyDumper.new.dump_all_fully_qualified_key_names.to_a
  @redis_key = 'humperdink:example:i18n'
  @redis.del(@redis_key)

  tracker = KeyTracker.new(@redis, @redis_key)
  I18n.backend = I18n::Backend::Simple.new
  I18n.backend.class.class_eval { include KeyTrackerBackend }
  I18n.backend.key_tracker = tracker
end

def execute
  @all_keys[0..99].each do |key|
    I18n.translate(key)
  end
end

def verify
  stored = @redis.smembers(@redis_key)
  raise "count mismatch #{stored.length}" unless stored.length == 100
  stored.each do |k|
    raise 'unknown key' unless @all_keys.include?(k)
  end
  puts 'OK'
end

setup
execute
verify

Configuration Options

Humperdink provides many different options to allow flexible control over the timing and frequency of potentially expensive persistence calls.

The DirtySet can be configured with these options:

  • :clean_timeout - number of seconds to wait between write calls.
  • :max_dirty_items - threshold on count of items needing to be persisted.
  • :exclude_from_clean - a regular expression to filter out items to be persisted.
  • :max_clean_items - caps the amount of already persisted data, to restrict memory usage in exchange for potential redundant persistence.

The RedisDirtySet adds an additional option:

  • :clean_at_exit to persist when the process exits.

Different Configuration Contexts

For a long-running process, it should be sufficient to configure max_dirty_items, clean_timeout, or both.

For short-running processes, clean_at_exit may be the only option of any value.
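As a rough sketch, the two contexts above could be expressed as option hashes. The option names come from this README; the numeric values are illustrative guesses, and treating :clean_at_exit as a boolean is an assumption.

```ruby
# Illustrative values only; tune them for your own workload.
LONG_RUNNING_OPTIONS = {
  :max_dirty_items => 100, # threshold on items waiting to be persisted
  :clean_timeout   => 300  # seconds to wait between write calls
}.freeze

SHORT_RUNNING_OPTIONS = {
  :clean_at_exit => true   # RedisDirtySet only; boolean value is assumed
}.freeze

# Either hash would be merged with the :redis and :key options used in the
# i18n example, along the lines of:
#   Humperdink::RedisDirtySet.new(LONG_RUNNING_OPTIONS.merge(:redis => redis, :key => key))
```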

For Resque, which runs jobs in a child process and bypasses any at_exit blocks, we came up with a ForkSavvyRedis wrapper and a ForkPiping mixin, which ensure that tracked items from child processes are piped up to the parent process and persisted through it.
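The general idea of piping tracked items from a forked child up to its parent can be sketched with Ruby's standard IO.pipe and fork. This is not the ForkPiping implementation, only an illustration of the underlying technique; the item names are made up, and the code requires a platform that supports fork.

```ruby
# POSIX only: Kernel#fork is not available on Windows or JRuby.
reader, writer = IO.pipe

pid = fork do
  # Child: report tracked items to the parent, one per line.
  reader.close
  %w[views/home views/cart].each { |item| writer.puts(item) }
  writer.close
  exit!(0) # exit! skips at_exit handlers, so nothing could be persisted there
end

# Parent: close the unused write end, then read until the child closes its end.
writer.close
received = reader.read.split("\n")
reader.close
Process.wait(pid)

# The parent now holds the child's items and can persist them itself.
```

Because the child hands its items over before exiting, it does not matter that its at_exit blocks never run.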

Future Plans

The design of Humperdink is expected to evolve, as its original design within LivingSocial was terribly coupled to I18n concepts. The 0.0.x versions here are a first shot at re-use, but there is still some confusion and inconsistency within the classes with regard to configuration and event listeners. In addition, it would be nice to add support for other persistence layers and ready-to-go options for tracking specific data sets, like i18n keys, Rails views or whatever other uses can be found.

If anyone in the community finds this tooling useful, we welcome your input.

FAQ

  • You misspelt "Humperdinck"

    Yeah, well, you know, that's just, like, your opinion, man.

Contributors

chrismo, doredesign