mozilla / kpiggybank (forked from jedp/kpiggybank)

INACTIVE - KPI interaction data store

kpiggybank's Introduction

KPIggyBank

A backend store for Key Performance Indicator (KPI) data.

Node.JS + CouchDB.

Piggybanks have a somewhat peculiar protocol: they are write-many, read-once. You can put as many blobs as you want into the piggybank (coins, bills, gift cards; it doesn't care, because it's document-centric), within allocated storage of course, but extracting resources from the piggybank is a one-time destructive operation (what old-timers referred to as "smashing it on the floor").

Requirements

  • An accessible CouchDB server for persistence.
  • Node.JS (0.6.17 or greater)

Installation

  • git clone https://github.com/mozilla/kpiggybank
  • npm install

Testing

  • npm test

The test suite simulates throwing a thousand login sequences at the KPI store.

It is anticipated that, with 1 million users, BrowserID will generate some 100 sign-in activities per second. The test suite requires that kpiggybank can store and retrieve records at a rate at least twice as fast as this.

If you want to experiment with the server without having couch installed, use the in-memory data store:

DB_BACKEND=memory node lib/server.js

Note that the in-memory data is not saved anywhere. It's just for testing.

Running

For configuration, the file env.sh.dist can be copied to env.sh and edited. kpiggybank will look for the following environment variables:

  • DB_BACKEND: One of "couchdb", "memory", "dummy". Default "couchdb".
  • DB_HOST: IP addr of couch server. Default "127.0.0.1".
  • DB_PORT: Port number of couch server. Default "5984".
  • DB_NAME: Name of the database. Default "bid_kpi".
  • DB_USER: Username for database if required. Default "kpiggybank".
  • DB_PASS: Password for database if required. Default "kpiggybank".
  • HOST: IP address the kpiggybank server binds to. Default "127.0.0.1".
  • PORT: Port for the kpiggybank server. Default "3000".
  • MODE: Governs how verbose logging should be. Set to "prod" for quieter logging. Default "dev".
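A resulting env.sh might look like the following sketch; these values simply restate the defaults listed above, not the contents of the shipped env.sh.dist:

```shell
# env.sh - sourced before starting kpiggybank (illustrative values only)
export DB_BACKEND=couchdb   # one of "couchdb", "memory", "dummy"
export DB_HOST=127.0.0.1
export DB_PORT=5984
export DB_NAME=bid_kpi
export DB_USER=kpiggybank
export DB_PASS=kpiggybank
export HOST=127.0.0.1
export PORT=3000
export MODE=dev             # set to "prod" for quieter logging
```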

Start the server like so:

  • npm start

Or like so:

  • node lib/server.js

Or change your env configuration with something like:

  • DB_NAME=bid_kpi_test npm start

When running kpiggybank for the first time on a given database, it will ensure that the db exists, creating it if it doesn't.

Please note that the database named bid_kpi_test is deleted as part of the test suite.

Running on AWS

You can use in-tree awsbox scripts to deploy kpiggybank on Amazon's cloud infrastructure.

This process is now just like the process of deploying browserid on AWS, see: https://github.com/mozilla/browserid/blob/dev/docs/AWS_DEPLOYMENT.md

The one modification is that kpiggybank's deploy script ignores mail setup.

JS API

Methods

  • api.saveData(blob [, callback]) - save a hopefully valid event blob
  • api.fetchRange([ options, ] callback) - fetch some or all events
  • api.count(callback) - get number of records in DB
  • api.followChanges() - connect to event stream

Events

  • change - a newly-arrived json blob of delicious KPI data
  • error - oh noes

Examples

The HTTP API calls are wrapped for convenience in a JS module. You can of course call the HTTP methods directly if you want. Example of using the JS API:

    var API = require("./lib/api");
    var api = new API(server_host, server_port);
    api.saveData(yourblob, yourcallback);

The callback is optional.

To query a range:

    var options = {start: 1, end: 42}; // optional 
    api.fetchRange(options, callback);

options are ... optional, so you can get all records like so:

    api.fetchRange(callback);

Subscribe to changes stream. The changes stream is an event emitter. Use like so:

    api.followChanges()  // now subscribed

    api.on('change', function(change) {
        // do something visually stunning
    });

HTTP API

Post Data

Post a blob of data to /wsapi/interaction_data.
The post data should contain a JSON object following the example here: https://wiki.mozilla.org/Privacy/Reviews/KPI_Backend#Example_data

In particular, the timestamp field is required, and should be a unix timestamp (seconds since the epoch); not an ISO date string.

  • url: /wsapi/interaction_data
  • method: POST
  • required param: {data: <your data blob>}

Get Data

Retrieve a range of records; returns a JSON string.

  • url: /wsapi/interaction_data?start=<date-start>&end=<date-end>
  • method: GET

Count Records

Retrieve a count of the number of records; returns a JSON encoded number.

  • url: /wsapi/interaction_data/count
  • method: GET

License

All source code here is available under the MPL 2.0 license, unless otherwise indicated.

kpiggybank's People

Contributors

jaredhirsch, jedp, kparlante, ozten


Forkers

kparlante

kpiggybank's Issues

readableDate is hosed.

somehow we've regressed here, readable date is:

  "readableDate": "undefined undefined-PM-11:30:00"

This is looking at recent kpi data while ssh'd into our prod KPI instances.

Document couchdb versions

In README or elsewhere, document version or versions of CouchDB.

Probably version we're targeting in production is enough.

Bonus points for Ubuntu and RHEL package names.

There appears to be a memory leak with the collector

From jedp#8, logged by @jrgm

In the process of driving some high load for interaction_data against the collector, it was noticed that RSS was steadily growing. I suspect there is a leak in that code path, but haven't looked further. And it wasn't a careful experiment as I was playing with kill and iptables tricks before it was noticed, so I may have set the collector on a bad code path (but still...).

Anyway, this could use some investigation in a clone of the kpiggybank-stage instance.

kpiggybank-stage slowed in throughput during a long loadtest on aws browserid stage environment.

At about 2013-04-22T12:00, the throughput of POSTs recorded in /home/app/code/kpiggybank.log dropped from ~15 req/sec to ~2 req/sec. This caused the browserid processes to begin buffering the KPI blobs in memory, so RSS on that process grew from the normal ~60MB to ~200MB, and would have kept growing until the v8 max heap size was exceeded and the browserid process crashed.

Not sure how to debug this further, as the logs say very little.

I restarted kpiggybank-stage, and the backlog of requests on the browserid process eventually cleared.

Wiki changes

FYI: The following changes were made to this repository's wiki:

  • defacing spam has been removed

  • the wiki has been disabled, as it was not used

These were made as the result of a recent automated defacement of publicly writeable wikis.

Revisit architecture for sending data from browserid to kpiggybank

The original plan was to use http (when everything lived behind mozilla firewall).

Moving to AWS, the current plan is to use https.

It has been suggested we move to use UDP. Another thought was to write to the filesystem and have a separate process send the data to kpiggybank.

Don't block on this, but take another look at it after we get up and running.
