
scales's Introduction

scales - Metrics for Python


Tracks server state and statistics, allowing you to see what your server is doing. It can also send metrics to Graphite for graphing or to a file for crash forensics.

scales is inspired by the fantastic metrics library, though it is by no means a port.

This is a brand new release - issue reports and pull requests are very much appreciated!

Installation

You can get a release from PyPI:

pip install scales

Or you can get it from GitHub:

git clone https://github.com/Cue/scales

cd scales

python setup.py install

The HTTP statistics viewer in scales requires one of the following web frameworks:

Flask

Tornado

Twisted

If you aren't sure, go with Flask; it's compatible with almost any other event loop. You can get it with pip install flask.

Scales is tested with Python 2.7 and 3.3. For some reason it does not work with PyPy; pull requests for this are welcome, if you can figure out what's up.

How to use it

Getting started and adding stats only takes a few lines of code:

from greplin import scales

STATS = scales.collection('/web',
    scales.IntStat('errors'),
    scales.IntStat('success'))

# In a request handler

STATS.success += 1

This code collects two integer stats. That's nice, but what you really want is to look at those stats and get insight into what your server is doing. There are two main ways of doing this: the HTTP server and Graphite logging.

The HTTP server is the simplest way to get stats out of a running server. The easiest way, if you have Flask installed, is to do this:

import greplin.scales.flaskhandler as statserver
statserver.serveInBackground(8765, serverName='something-server-42')

This will spawn a background thread that will listen on port 8765, and serve up a very convenient view of all your stats. To see it, go to

http://localhost:8765/status/

You can also get the stats in JSON by appending ?format=json to the URL. ?format=prettyjson is the same thing, but pretty-printed.
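
If you want to consume those stats from another program, here is a minimal sketch (Python 3 standard library only, assuming the status server from the example above is listening on localhost:8765):

import json
import urllib.request

# Fetch the stats served by the background status server as JSON.
with urllib.request.urlopen('http://localhost:8765/status/?format=json') as response:
    stats = json.loads(response.read().decode('utf-8'))

print(stats)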

The HTTP server is good for doing spot checks on the internals of running servers, but what about continuous monitoring? How do you generate graphs of stats over time? This is where Graphite comes in. Graphite is a server for collecting stats and graphing them, and scales has easy support for using it. Again, this is handled in a background thread:

from greplin.scales import graphite

graphitePeriodicPusher = graphite.GraphitePeriodicPusher('graphite-collector-hostname', 2003, 'my.server.prefix.')
graphitePeriodicPusher.allow("*") # Logs everything to Graphite
graphitePeriodicPusher.start()

That's it! Numeric stats will now be pushed to Graphite every minute. Note that if you never call allow, nothing is logged to Graphite by default.

You can also exclude stats from graphite logging with the forbid(prefix) method of the GraphitePeriodicPusher class.
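
For example, a short sketch that pushes everything except one subtree (the '/web/debugging' path is illustrative, and the exact prefix format forbid expects is an assumption here):

graphitePeriodicPusher.allow('*')                # push all numeric stats ...
graphitePeriodicPusher.forbid('/web/debugging')  # ... except this subtree
graphitePeriodicPusher.start()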

Timing sections of code

To better understand the performance of certain critical sections of your code, scales lets you collect timing information:

from greplin import scales

STATS = scales.collection('/web',
    scales.IntStat('errors'),
    scales.IntStat('success'),
    scales.PmfStat('latency'))

# In a request handler

with STATS.latency.time():
  do_something_expensive()

This will collect statistics on the running times of that section of code: mean time, median, standard deviation, and several percentiles to help you locate outlier times. This happens in pretty small constant memory, so don't worry about the cost; time anything you like.

You can gather the same kind of sample statistics about any quantity. Just make a PmfStat and assign new values to it:

# Assumes scales.PmfStat('wistfulness') was added to the STATS collection above.
for person in people:
  person.perturb(42)
  STATS.wistfulness = person.getFeelings('wistfulness')

Metering Rates

Scales can track 1/5/15 minute averages with MeterStat:

from greplin import scales
from greplin.scales.meter import MeterStat

STATS = scales.collection('/web', MeterStat('hits'))

def handleRequest(..):
  STATS.hits.mark() # or .mark(NUMBER), or STATS.hits = NUMBER

Class Stats

While global stats are easy to use, sometimes making stats class-based makes more sense. This is supported; just make sure to give each instance of the class a unique identifier with scales.init.

class Handler(object):

  requests = scales.IntStat('requests')
  latency = scales.PmfStat('latency')
  byPath = scales.IntDictStat('byPath')

  def __init__(self):
    scales.init(self, '/handler')


  def handleRequest(self, request):
    with self.latency.time():
      doSomething()
    self.requests += 1
    self.byPath[request.path] += 1
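
If you create several instances of the same class and want their stats kept apart, one pattern is to fold a per-instance identifier into the path passed to scales.init. A minimal sketch, where handlerId is a hypothetical identifier you supply, not something provided by the library:

from greplin import scales

class PerInstanceHandler(object):

  requests = scales.IntStat('requests')

  def __init__(self, handlerId):
    # Each instance gets its own stat path, e.g. /handler/0, /handler/1, ...
    scales.init(self, '/handler/%d' % handlerId)

  def handleRequest(self, request):
    self.requests += 1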

Gauges

Simple lambdas can be used to generate stat values.

STATS = scales.collection('/web', scales.Stat('currentTime', lambda: time.time()))

Of course this works with arbitrary function objects, so the example above could also be written:

STATS = scales.collection('/web', scales.Stat('currentTime', time.time))
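
Gauges are handy for exposing values you already track elsewhere, such as the depth of a work queue. A small sketch (the '/worker' path and the queue object are illustrative):

import time
from greplin import scales

workQueue = []  # some queue your application already maintains

STATS = scales.collection('/worker',
    scales.Stat('currentTime', time.time),
    scales.Stat('queueDepth', lambda: len(workQueue)))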

Hierarchical Stats + Aggregation

Stats can inherit their path from the object that creates them, and (non-gauge) stats can be aggregated up to ancestors.

class Processor(object):
  """Example processing management object."""

  threadStates = scales.HistogramAggregationStat('state')
  finished = scales.SumAggregationStat('finished')

  def __init__(self):
    scales.init(self, '/processor')
    self.threads = 0


  def createThread(self):
    threadId = self.threads
    self.threads += 1
    SomeThread(threadId).start()



class SomeThread(object):
  """Stub of a processing thread object."""

  state = scales.Stat('state')
  finished = scales.IntStat('finished')


  def __init__(self, threadId):
    scales.initChild(self, 'thread-%d' % threadId)


  def processingLoop(self):
    while True:
      self.state = 'waitingForTask'
      getTask()
      self.state = 'performingTask'
      doTask()
      self.finished += 1

This will result in a stat at the path /processor/finished which counts the total of the finished stats in each SomeThread object, as well as per-object stats with paths like /processor/thread-0/finished. There will also be stats like /processor/state/waitingForTask which aggregates the number of threads in the waitingForTask state.

Authors

Greplin, Inc.

License

Copyright 2011 The scales Authors.

Published under The Apache License, see LICENSE

scales's People

Contributors

aholmberg, alex, dparrol, frewsxcv, gregbowyer, hynek, joeshaw, kevinclark, northisup, onilton, peterscott, pydeveloper94, r3m0t, robbywalker, sketerpot, thobbs


scales's Issues

use case examples

I think it would help many folks if a few lines were added at the top explaining certain use cases and giving some examples of what exactly "Tracks server state and statistics, allowing you to see what your server is doing" means. Basically, make it clear that this is not something akin to Nagios, but more about application monitoring.

Usage within multiprocess environment (like gunicorn)

Is there any way to use this library within a multiprocess environment, for example when run with Gunicorn (with workers > 1)?

Currently each process collects its own metrics, and if you want to expose them with the Flask statsHandler, it will fail to bind because every process uses the same port. This can be worked around by writing your own stats handler.

However, the metrics which are gathered would still be per-process. If you enable the Graphite writer, each process will write out metrics with the same hostname and clobber the other processes.

Is there a way around this?

The only (quick) idea I have come up with is using the PID of each process to uniquely prefix the metrics. But then, whenever a process is restarted, a lot of extra series will be written to the Graphite server.

I am looking into using the gunicorn hooks as well - but wanted to check here first if this is something already solved?
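
For what it's worth, the PID-prefix idea mentioned above only takes a couple of lines; a rough sketch (the hostname, port, and prefix layout are illustrative, and this does not address the series churn caused by restarts):

import os
from greplin.scales import graphite

# Give each worker process its own Graphite prefix, e.g. my.server.12345.
pusher = graphite.GraphitePeriodicPusher('graphite-collector-hostname', 2003,
                                         'my.server.%d.' % os.getpid())
pusher.allow('*')
pusher.start()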

Syntax warning due to comparison of literals using is

find . -iname '*.py' | grep -Ev 'vendor|example|doc|tools|sphinx' | xargs -P4 -I{} python3.8 -Wall -m py_compile {}
./src/greplin/scales/__init__.py:158: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if subContext is not '':
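
For context, the warning is about comparing strings with the identity operator; a tiny illustration of the flagged form and the equality form the warning suggests:

subContext = ''           # illustrative value

if subContext is not '':  # identity comparison: emits SyntaxWarning on Python 3.8+
    pass

if subContext != '':      # equality comparison: what the warning recommends
    pass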

PmfStat: do not accumulate values forever

Hi, I love this Scales module, it's been very useful. However, one thing that bothers me is PmfStat. Suppose I am measuring the login delays on a web site. If I leave the web server running for a week, the pmf stats reported to Graphite combine the current login delays with all the login delays over the entire week. This means that if there is an anomaly and login times become 10 times bigger, the change will hardly be noticed in the mean value reported by PmfStat.

If I am sending data every minute to graphite I would like this data to represent only the last minute interval, not days or weeks of accumulated data. I am not sure how to achieve this with Scales.

Clarification about pypy support

I want to know whether PyPy is still an unsupported platform, and if so, what is preventing support. I see that Travis is running with PyPy, and I also have a project using scales with PyPy without any apparent issues.

release and update to pypi

The version on PyPI is from a long time ago, so if the current branch is stable, can you release a 1.0.3?

I'm confused about MeterStatDict

This is mainly about naming things, but tell me if I'm way wrong on this.

So I have the following stats:

STATS = scales.collection('/api',
    meter.MeterStat('api_v1'),
    meter.MeterStat('multiplexed'),
    scales.IntDictStat('status_code'),
    scales.PmfStat('latency'),
    )

I can do STATS.status_code[200] += 1, but what I really want is a default dictionary of meters, so that I could do STATS.status_code[200].mark(). However, MeterStatDict doesn't seem to be anything like IntDictStat.

Can somebody explain/document the thoughts behind some of these data structures?

p.s. you guys going to py con? I'd love to meet up.

GraphitePeriodicPusher from 1.0.7 fails in Python3

Using the below code:

from greplin import scales
from greplin.scales import graphite
import time
import logging
import sys

def main():
    # Configure logging
    logging.basicConfig(level=logging.DEBUG, stream=sys.stdout)
    # Gather stats
    stats = scales.collection('/reporter',scales.IntStat('hits'))
    stats.hits += 1
    # Push stats
    pusher = graphite.GraphitePeriodicPusher('192.168.0.10', 2003, 'mypc')
    pusher.allow("*")
    pusher.start()
    # Do something for a while
    for val in range(100):
        time.sleep(1)
        logging.info(val)
        stats.hits = val

if __name__ == '__main__':
    main()

Running in Python 2.7.6 correctly sends the data to Graphite.
Running in Python 3.4.0 logs the exact same messages but fails to send: nothing is received when watching the carbon logs on the Graphite server.

What can I do to send data to graphite from python3?

Resetting stats

Is there no way to reset the stats? I want to push out stats periodically and reset them after that. Is it possible to do this in the present implementation?

Question concerning `real` value of PmfStat

via src/greplin/scales/__init__.py

def addValue(self, value):
    """Updates the dictionary."""
    self['count'] += 1
    self.__sample.update(value)
    if time.time() > self.__timestamp + 20 and len(self.__sample) > 1:
      self.__timestamp = time.time()
      self['min'] = self.__sample.min
      self['max'] = self.__sample.max
      self['mean'] = self.__sample.mean
      self['stddev'] = self.__sample.stddev

      percentiles = self.__sample.percentiles([0.5, 0.75, 0.95, 0.98, 0.99, 0.999])
      self['median'] = percentiles[0]
      self['75percentile'] = percentiles[1]
      self['95percentile'] = percentiles[2]
      self['98percentile'] = percentiles[3]
      self['99percentile'] = percentiles[4]
      self.percentile99 = percentiles[4]
      self['999percentile'] = percentiles[5]

Several stats about the current metric are saved. However, it doesn't appear that the raw value is being preserved. Is this correct? If so, what is the reason this value isn't being preserved?

Inserting self['value'] = value in the method above worked locally.

Example on how to do dynamic stats collection?

(I couldn't find any other good contact info for this software, so I'm filing it as an issue. It would be good to add an example of this to the README. :)

What's the best way to dynamically define stats to collect? I don't know until runtime what the different possible values are. (Think different "types" that are pulled from a database at runtime.)

data = get_data()
stats = scales.collection('/data', *[scales.IntStat(n) for n in names])

for d in data:
    # increment the count
    setattr(stats, d, getattr(stats, d) + 1)

This is kinda ugly but it works. Is there a nicer way?
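
One alternative that avoids declaring every stat up front is the IntDictStat shown earlier in the README, which keeps an integer counter per key and accepts whatever keys show up at runtime. A sketch (the '/data' path, the 'byType' name, and get_data() follow the question above):

from greplin import scales

STATS = scales.collection('/data', scales.IntDictStat('byType'))

for d in get_data():        # get_data() is the hypothetical data source from the question
    STATS.byType[d] += 1    # each distinct value of d gets its own counter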

Python 3.8 compatibility

Hello.

scales is not compatible with Python 3.8 because the escape function from the cgi module, deprecated since Python 3.2, has now been removed.

I am going to prepare a PR to fix this.
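
For reference, the usual fix is to switch to html.escape, which has existed since Python 3.2; a sketch of the kind of compatibility shim such a PR might use (not the actual patch):

try:
    from html import escape  # Python 3.2+
except ImportError:
    from cgi import escape   # Python 2, which has no html.escape

# Note: html.escape quotes '"' by default, whereas cgi.escape did not,
# so callers relying on the old default may need escape(value, quote=False).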

ZeroDivisionError in PmfStatDict.addValue()

Calculating self.__sample.stddev for a PmfStatDict after calling addValue results in a ZeroDivisionError when the list of samples has 1 element but its count is 2 or greater, as when an operation takes zero time (e.g. when unit testing with time.time() patched out). This is due to these lines in ExponentiallyDecayingReservoir.update (samplestats.py:151):

    priority = self.__weight(timestamp - self.startTime) / random.random()

    self.count += 1
    if (self.count <= self.size):
      self.values[priority] = value

priority is obviously 0 when timestamp - self.startTime is 0, thus self.samples() returns a list of length 1 (self.values.values()) while self.count is 2 or greater. Because self.count decides len(self) for a Sampler, the test at the top of

  @property
  def stddev(self):
    """Return the sample standard deviation."""
    if len(self) < 2:
      return float('NaN')
    # The stupidest algorithm, but it works fine.
    arr = self.samples()
    mean = sum(arr) / len (arr)
    bigsum = 0.0
    for x in arr:
      bigsum += (x - mean)**2
    return sqrt(bigsum / (len(arr) - 1))

in Sampler (samplestats.py:54) returns False, allowing the following code to execute, with the inevitable ZeroDivisionError when it divides by len(arr) - 1.
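
One way to avoid the crash (a sketch of a possible guard, not the project's actual fix) is to base the early return on the number of stored samples rather than on the counter:

  @property
  def stddev(self):
    """Return the sample standard deviation."""
    arr = self.samples()
    # Guard on the samples actually stored, not on self.count, so a reservoir
    # holding a single value can never reach the division by (len(arr) - 1).
    if len(arr) < 2:
      return float('NaN')
    mean = sum(arr) / len(arr)
    bigsum = 0.0
    for x in arr:
      bigsum += (x - mean) ** 2
    return sqrt(bigsum / (len(arr) - 1))  # sqrt comes from math, as in samplestats.py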

Syntax error in greplin/scales/graphite.py file

import greplin.scales.graphite as graphite
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/greplin/scales/graphite.py", line 112
    elif type(value) in type_values and len(name) < 500:
       ^
SyntaxError: invalid syntax
