vimeo / graph-explorer Goto Github PK

A graphite dashboard powered by structured metrics

Home Page: http://vimeo.github.io/graph-explorer/

License: Apache License 2.0

Python 74.91% CSS 0.17% HTML 0.21% JavaScript 11.73% Smarty 12.69% Shell 0.27%

graph-explorer's Introduction

Graph explorer

A highly interactive dashboard to satisfy varying ad-hoc information needs across a multitude of metrics in a very powerful way:

The core of graph-explorer is a database containing your metrics extended with tags (key-value pairs that represent server, service, type, unit, ...)
You can use expressive queries which leverage this metadata to filter targets, group them into graphs, process and aggregate them on the fly. Something like SQL but metrics for rows and a list of graph definitions as a result set. All graphs are built dynamically.

The graphs themselves support annotated events and are also interactive because it uses timeserieswidget Furthermore, we aim for minimal, hackable code and as simple as possible deploy/install.

It also has:

dashboards which are pages that show N queries along with their results (0-N graphs each) and a (url-driven) field that gets applied to all queries, which you can use to narrow down to a specific server, apply a timeframe, etc.
an alerting system allowing you to set thresholds on queries or plain old graphite query strings.

Learn the basics

Tutorial
The query interface explained
introduction to metrics 2.0 & Graph-Explorer: https://vimeo.com/87194301

Metrics 2.0

In graphite, a metric has a name and a corresponding time series of values. Graph-explorer's metrics are structured: they contain key-value tags that describe all their attributes, the unit, the metric type, etc. You can generate the tag database by using plugins that parse metrics using regular expressions, or by tagging them as they flow into graphite. See the Structured Metrics page

GEQL, the Graph-Explorer Query Language

the Graph-Explorer Query Language is designed to:

be minimal, use a simple syntax and get a lot done with little input.
let you compose graphs from metrics in a flexible way: you can use tags and pattern matching to filter, group, process and aggregate targets and manipulate how the graph gets displayed.
let you create custom views of the exact information you need, and let you compare and correlate across different aspects.

At the most basic level you start by typing patterns that will filter out the metrics you're looking for. Then, you can extend the query by typing statements that have special meanings.

For more information see the Graph-Explorer Query Language page

Dependencies

python2: python2.6 or higher
install elasticsearch and run it (super easy, see docs just set a unique cluster name)
Graphite 0.9.10 or higher (tested with 0.9.12)
python2-pysqlite (if you want to use the alerting feature)

Installation

Using docker

you can easily use docker and the vimeo/graph-explorer docker image. Follow the instructions there

Via operating system packages

not sure which distros have graph-explorer packages. TBA.

Via python

There's two ways to go about this, from source and via pypi (see below).

But first an optional, though recommended step. Using virtualenv, you can install all packages in an isolated directory, so that you never have issues with conflicting library versions, conflicts with packages from other package managers, you can easily remove the install, etc.

path=/where/do/you/want/to/install  # this can be anywhere
virtualenv $path
source $path/bin/activate

The actual installation takes care of all dependencies and works the same whether you use virtualenv or not. See below for either the pypi or the git source approach.

From pypi

Pypi is the python package repository.

pip install graph-explorer

From source

Get a code checkout, initialize all git submodules and go in the directory, like so:

git clone --recursive https://github.com/vimeo/graph-explorer.git && cd graph-explorer

This will give you the latest bleeding edge code (master branch), which may be buggy. You can switch to the latest stable release with git checkout v<version>

The releases page has more info, but don't download from there, the downloads don't contain the needed submodules! Graph Explorer version numbering is based on semver.

Install:

python setup.py install

Instead of that, if you want to hack on Graph-Explorer, you can run.

python setup.py develop

This is like an installation, but it links back to the code. So when you run graph-explorer, it will automatically reload the server when you modify any python file, and changes in assets (js, css, ...) are visible for new requests. Templates however are cached by bottle and still need a manual restart for changes to become effective.

Configuration of graph-explorer

make a copy of config_example.cfg and edit it. Note that string values don't need wrapping quotes.
have a look at preferences.py, this is where you can configure timezone, targets colors, a few graph options, etc.
populate an elasticsearch database with structured metrics

Configuration of graphite server

you'll need a small tweak to allow this app to request data from graphite. For apache2 this works:

Header set Access-Control-Allow-Origin "*"
Header set Access-Control-Allow-Methods "GET, OPTIONS, POST"
Header set Access-Control-Allow-Headers "origin, authorization, accept"

Running

Make sure Graph-Explorer can write to the directories that you configured the log (and if enabled, alerting database file) to be in.

Linux / Unix

default, with Paste (included):

run_graph_explorer.py my_config_file.cfg and your page is available at <ip>:8080

alternatively, if you use gunicorn, you can run it with multi-workers like so: gunicorn -w 4 app:'default_app()' -b 0.0.0.0:8080

Windows

python %VIRTUAL_ENV%\scripts\run_graph_explorer.py my_config_file.cfg and your page is available at <ip>:8080

or with Powershell:

python $env:VIRTUAL_ENV/scripts/run_graph_explorer.py my_config_file.cfg

Troubleshooting

no graphs show up and I don't know why.

first check in the top section if there are target matching and 'total graphs' is > 0.
if not, your query expression might be too restricting, or maybe it didn't find your metrics from metrics.json (see 'targets matching: x/total')
if yes, check for any errors in the javascript console, (in firefox you need firebug, in chrome and such 'tools->javascript console')

also check all network requests in the network tab, and make sure they return http 200 or 304 especially, check that the http requests to graphite/render/?<...> return actual data. you may be suffering from this graphite bug or this graphite bug or maybe your graphite version is too old.

I get some error wrt graphite/apache cors access restriction

see section 'Configuration of graphite server' above

Graph Explorer pulls too much data over the network

This is because graphite doesn't support consolidation on its data API yet.

Documentation

wiki page

Unit tests

python setup.py test

Getting in touch

irc: #graph-explorer on freenode
github issues for bugs, feature requests, questions, feedback

graph-explorer's People

Contributors

Stargazers

Watchers

Forkers

xentac rjw1 obazoud lwz7512 sseshachala pcn thinkdnt finkregh alistairking glancesx areski waytai splee ldmosquera zehome adblockist bittorrent atdt rlodge ripe-ncc notsqrt thepaul yoanvillanueva knightseal presto53 web5design aculich-forks xiian k7 qz267 techdragon hackranger susiem bamdadd typerandom jaanek kgeis fkarb oswell brownman90 eliothedeman lemonhall yilab akshatknsl mikeatwork englishm underley tarnfeld kaeawc x-trade bnkr silky despiegk shundezhang mysl ezc hercynium deejay1 inthecloud247 why-not-sky videofy wfxiang08 zhurs nix8 modulexcite williamren ahlfors dbirchak jeffawang anukat2015 mwasyou i-am-dror tzubal mpower4ru geolibrerian dkrul solertis connectthefuture 1514louluo daniellyons178 sportsbitenews morganjk delta1766 huoyijie wl-tsui cnxtech connectionmaster isabella232 huiwenhan nameisnotvailable seanpm2001

graph-explorer's Issues

unify dashboards with default interface, use proper javascript framework

dashboards should be:

editable through web UI(add/remove queries, edit queries, set default value for apply to all)
saveable and loadable through web UI

the default interface:

should be able to add more queries
should be able to hide the query stats section to save space.

=> both features can be merged into one interface.
ideally using something a proper js toolkit like Angularjs, which would be beneficial anyway because right now the templates are a little messy and we have some explicit javascript/jquery logic that could be simplified by using a js toolkit that does databinding. It would simplify the code and provide a clean separation between api and frontend.

Elasticsearch request timed out

update_metrics - ERROR - sorry, something went wrong: HTTPConnectionPool(host='127.0.0.1', port=9200): Request timed out. (timeout=30)

But it seems that elasticsearch works fine when I get the cluster information through the same host and port.

Proper usage of Submodules

Hi,

I realize this is a small stumbling block, but it's very important to make sure people can use something you are going to publish.

Please fix the submodules :) thanks

tasseo@host1:~/graph-explorer$ git submodule update --init Submodule 'graphitejs' ([email protected]:Dieterbe/graphitejs.git) registered for path 'graphitejs' Cloning into 'graphitejs'... The authenticity of host 'github.com (207.97.227.239)' can't be established. RSA key fingerprint is 16:27:ac:a5:76:28:2d:36:63:1b:56:4d:eb:df:a6:48. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added 'github.com,207.97.227.239' (RSA) to the list of known hosts. Permission denied (publickey). fatal: The remote end hung up unexpectedly Clone of '[email protected]:Dieterbe/graphitejs.git' into submodule path 'graphitejs' failed

Adding instrumentation for graph-explorer

Per our conversations on irc/#graphite, I've added a first stab at an easy way to add instrumentation to graph-explorer to make it easier to see where it's spending time. The multi-threaded nature of the app means that it's not drop-in simple to just use the cProfile module, so here's an instrument.py module that adds a couple of decorators.

I've pushed it to pcn/graph-explorer in master, you can look at pcn@0a8e2d2.

This commit isn't appropriate to pull - I deliberately butchered app.py as a simple demonstration, but if this works for you I will clean it up for a pull request.

Enhancement: Trend lines to graphs

Ability to display a trend line based on the displayed data. This would be done by the client-side rendering library as Graphite does not support this.

NameError: name 'query' is not defined

I get this error when I start up the application:

NameError: name 'query' is not defined

I try entering in 'cpu', just to see if it needs something to start with, but same error.

data files not always loaded

since #12 : if you have no cache files present, then start GE, then run update_metrics.py (successfully), but go to a page like /debug, /debug/metrics, /inspect, it will keep waiting for the data files to be loaded, but they will never load because only the index page does that.

ideally i think we would use the bottle hook decorator to check this for every load of a page that requires the data files

Kudos and how to deal with rates?

H[ea]llo Dieter,

On my quest to find a suitable dashboard for Graphite I found your project. Your project is much more intuitive than all those 'create & save graph' dashboards. I have been thinking about using RE to group metrics together, but your tagging solution makes it much more powerful and intuitive. Kudos!

While deepdiving in your code, I get the feeling that the project is not yet stabile. Is that correct? This feeling is based on the project homepage containing a screenshot with a 'suggested queries' listing, for which I could not find code. And I could not get the "pattern_graph" template config working (but maybe I need to look better).

An issue that's currently preventing me to do a real setup is the lack of support for rates (ever-increasing counters). Most of my metrics are not routed thru statsd and still require a scaleToSeconds(nonNegativeDerivative(my.metric),1) skeleton. Do you have a suggestion for me?

Keep on coding! And I will try to contribute.

Much appreciated / hartelijk bedankt,

Renzo

"avg over" assumes minutely resolution

use config options or somehow query graphite for this.

How to scale it?

When there is a single worker, it will be blocked when do '@route("/refresh_data")'

backend.update_data log messages not shown in update_metrics run

when you run update_metrics, the log messages emitted from backend.update_data(s_metrics) aren't shown. (we do this since #12 )

integration tests with elasticsearch

to make sure, for different GEQL features, we get the correct results from ES.

not sure what's the best way to do this.

Update_metrics.py stuck after listing metrics

update_metrics.py is hanging after list_metrics, What could cause the problem, is it some python modules that are not loaded?

oot@mon-gui:/opt/graph-explorer# ./update_metrics.py 
2013-12-18 11:20:18,500 - update_metrics - INFO - fetching/saving metrics from graphite...
2013-12-18 11:20:18,590 - update_metrics - INFO - generating structured metrics data...
2013-12-18 11:20:18,590 - update_metrics - DEBUG - loading metrics
2013-12-18 11:20:18,595 - update_metrics - DEBUG - removing outdated targets
2013-12-18 11:20:18,596 - update_metrics - DEBUG - making sure index exists..
No handlers could be found for logger "elasticsearch"
2013-12-18 11:20:18,599 - update_metrics - DEBUG - making sure shard is started..
2013-12-18 11:20:18,600 - update_metrics - DEBUG - shard is ready!
2013-12-18 11:20:18,601 - update_metrics - DEBUG - making sure index exists..
2013-12-18 11:20:18,603 - update_metrics - DEBUG - making sure shard is started..
2013-12-18 11:20:18,604 - update_metrics - DEBUG - shard is ready!
2013-12-18 11:20:18,606 - update_metrics - DEBUG - removed 0 metrics from elasticsearch
2013-12-18 11:20:18,606 - update_metrics - DEBUG - updating targets
2013-12-18 11:20:18,607 - update_metrics - DEBUG - list_metrics with 24 plugins and 4340 metrics
2013-12-18 11:20:18,887 - update_metrics - DEBUG -          plugin name   metrics upgrade ok  metrics upgrade bad      metrics ignored
2013-12-18 11:20:18,887 - update_metrics - DEBUG -        native_proto2                    0                    0                    0
2013-12-18 11:20:18,887 - update_metrics - DEBUG -               carbon                   13                    0                    0
2013-12-18 11:20:18,887 - update_metrics - DEBUG -             collectd                    0                    0                    0
2013-12-18 11:20:18,888 - update_metrics - DEBUG -                  cpu                  420                    0                    0
2013-12-18 11:20:18,888 - update_metrics - DEBUG - diamondcollectortime                    0                    0                    0
2013-12-18 11:20:18,888 - update_metrics - DEBUG -            diskspace                  234                    0                    0
2013-12-18 11:20:18,888 - update_metrics - DEBUG -             filestat                    0                    0                    0
2013-12-18 11:20:18,888 - update_metrics - DEBUG -               iostat                 2376                    0                    0
2013-12-18 11:20:18,888 - update_metrics - DEBUG -                 load                   90                    0                    0
2013-12-18 11:20:18,889 - update_metrics - DEBUG -               memory                  252                    0                    0
2013-12-18 11:20:18,889 - update_metrics - DEBUG -              network                  448                    0                    0
2013-12-18 11:20:18,889 - update_metrics - DEBUG -             sockstat                  144                    0                    0
2013-12-18 11:20:18,889 - update_metrics - DEBUG -               statsd                    0                    0                    0
2013-12-18 11:20:18,889 - update_metrics - DEBUG -                swift                    0                    0                    0
2013-12-18 11:20:18,889 - update_metrics - DEBUG - swift_object_auditor                    0                    0                    0
2013-12-18 11:20:18,890 - update_metrics - DEBUG -  swift_object_server                    0                    0                    0
2013-12-18 11:20:18,890 - update_metrics - DEBUG -   swift_proxy_server                    0                    0                    0
2013-12-18 11:20:18,890 - update_metrics - DEBUG -       swift_tempauth                    0                    0                    0
2013-12-18 11:20:18,890 - update_metrics - DEBUG -                  tcp                  288                    0                    0
2013-12-18 11:20:18,890 - update_metrics - DEBUG -                  udp                    0                    0                    0
2013-12-18 11:20:18,890 - update_metrics - DEBUG -               vmstat                   72                    0                    0
2013-12-18 11:20:18,891 - update_metrics - DEBUG -     catchall_diamond                    0                    0                    0
2013-12-18 11:20:18,891 - update_metrics - DEBUG -      catchall_statsd                    0                    0                    0
2013-12-18 11:20:18,891 - update_metrics - DEBUG -             catchall                    3                    0                    0
2013-12-18 11:20:18,891 - update_metrics - DEBUG - making sure index exists..
2013-12-18 11:20:18,893 - update_metrics - DEBUG - making sure shard is started..
2013-12-18 11:20:18,894 - update_metrics - DEBUG - shard is ready!
^CTraceback (most recent call last):
  File "./update_metrics.py", line 36, in <module>
    backend.update_data(s_metrics)
  File "/opt/graph-explorer/backend.py", line 76, in update_data
    s_metrics.update_targets(metrics)
  File "/opt/graph-explorer/structured_metrics/__init__.py", line 337, in update_targets
    self.es_bulk(bulk_list)
  File "/opt/graph-explorer/structured_metrics/__init__.py", line 275, in es_bulk
    self.es.bulk(index='graphite_metrics', doc_type='metric', body=body)
  File "/opt/graph-explorer/structured_metrics/elasticsearch-py/elasticsearch/client/utils.py", line 70, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/opt/graph-explorer/structured_metrics/elasticsearch-py/elasticsearch/client/__init__.py", line 522, in bulk
    params=params, body=self._bulk_body(body))
  File "/opt/graph-explorer/structured_metrics/elasticsearch-py/elasticsearch/transport.py", line 223, in perform_request
    status, raw_data = connection.perform_request(method, url, params, body, ignore=ignore)
  File "/opt/graph-explorer/structured_metrics/elasticsearch-py/elasticsearch/connection/http_urllib3.py", line 44, in perform_request
    response = self.pool.urlopen(method, url, body, **kw)
  File "/opt/graph-explorer/structured_metrics/urllib3/urllib3/connectionpool.py", line 428, in urlopen
    body=body, headers=headers)
  File "/opt/graph-explorer/structured_metrics/urllib3/urllib3/connectionpool.py", line 288, in _make_request
    httplib_response = conn.getresponse(buffering=True)
  File "/usr/lib/python2.7/httplib.py", line 1030, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 407, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 365, in _read_status
    line = self.fp.readline()
  File "/usr/lib/python2.7/socket.py", line 447, in readline
    data = self._sock.recv(self._rbufsize)
KeyboardInterrupt

implement "avg by", akin to "sum by"

i.e. use avgerageSeries instead of sumSeries.

but note:

you can't sum by and avg by the same tag key in one query
however you should be able to use both at the same time (on different tag keys), i.e. do something like "avg by server sum by action"

so for example, given these 4 metrics that represent a latency:

server=web1.action=submit 10
server=web1.action=generate 12
server=web2.action=submit 6
server=web2.action=generate 7

I want to graph the latency (averaged across the servers), but the latencies must be added together across the actions (because total latency is the sum of individual latencies)

Fresh clone does not init submodules

$ git clone --recursive https://github.com/vimeo/graph-explorer.git
Cloning into graph-explorer...
remote: Counting objects: 2095, done.
remote: Compressing objects: 100% (1081/1081), done.
remote: Total 2095 (delta 1125), reused 1955 (delta 996)
Receiving objects: 100% (2095/2095), 3.44 MiB | 1.30 MiB/s, done.
Resolving deltas: 100% (1125/1125), done.
No submodule mapping found in .gitmodules for path 'DataTablesPlugins'

I think running git rm --cached DataTablesPlugins and committing should fix this.

don't show something like 'sumSeries (4 values)' if all those values are the same

sometimes with 'sum by' over more than 1 tag, for some metrics being summed together one of the tags is the same, (or it maybe has 2 different values? and still shows them as 4 cause it counts totals, not uniques?)

example, our internal query "remover swift /s sum by n1,n4" shows " sumSeries (4 values) sumSeries (4 values) " the latter all being files_processing.

switch elasticsearch client to official one

ES now has official clients, we should probably use
http://www.elasticsearch.org/guide/en/elasticsearch/client/python-api/current/_overview.html

that said, rawes served us pretty well.
benefits of official client:

will probably become the de facto over time
no extra deps like pytz
might fix #56 in an elegant way

use of scaleToSeconds function bumps graphite version requirement

@thepaul
is there an alternative? probably not and I should just update our graphite server @ vimeo, but it seems convenient if we can be compatible with older versions too.

elasticsearch timeout

127.0.0.1 - - [30/Oct/2013:16:36:59 -0400] "POST /graphs/ HTTP/1.1" 500 748 "http://localhost:8080/index/tc%20wait%20vimeo-df" "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0"
----------------------------------------
Exception happened during processing of request from ('127.0.0.1', 49506)
Traceback (most recent call last):
  File "/home/dieter/workspaces/eclipse/graph-explorer/paste/httpserver.py", line 1069, in process_request_in_thread
    self.finish_request(request, client_address)
  File "/usr/lib/python2.7/SocketServer.py", line 334, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib/python2.7/SocketServer.py", line 651, in __init__
    self.finish()
  File "/usr/lib/python2.7/SocketServer.py", line 710, in finish
    self.wfile.close()
  File "/usr/lib/python2.7/socket.py", line 279, in close
    self.flush()
  File "/usr/lib/python2.7/socket.py", line 303, in flush
    self._sock.sendall(view[write_offset:write_offset+buffer_size])
error: [Errno 32] Broken pipe
----------------------------------------
Traceback (most recent call last):
  File "/home/dieter/workspaces/eclipse/graph-explorer/bottle.py", line 764, in _handle
    return route.call(**args)
  File "/home/dieter/workspaces/eclipse/graph-explorer/bottle.py", line 1575, in wrapper
    rv = callback(*a, **ka)
  File "/home/dieter/workspaces/eclipse/graph-explorer/app.py", line 326, in graphs
    return handle_graphs(query, False)
  File "/home/dieter/workspaces/eclipse/graph-explorer/app.py", line 348, in handle_graphs
    return render_graphs(query, deps=deps)
  File "/home/dieter/workspaces/eclipse/graph-explorer/app.py", line 398, in render_graphs
    (query, targets_matching) = s_metrics.matching(query)
  File "/home/dieter/workspaces/eclipse/graph-explorer/structured_metrics/__init__.py", line 313, in matching
    self.assure_index()
  File "/home/dieter/workspaces/eclipse/graph-explorer/structured_metrics/__init__.py", line 174, in assure_index
    self.es.post('graphite_metrics', data=body)
  File "/home/dieter/workspaces/eclipse/graph-explorer/structured_metrics/rawes/rawes/elastic.py", line 61, in post
    return self.request('post', path, **kwargs)
  File "/home/dieter/workspaces/eclipse/graph-explorer/structured_metrics/rawes/rawes/elastic.py", line 83, in request
    return self.connection.request(method, new_path, **kwargs)
  File "/home/dieter/workspaces/eclipse/graph-explorer/structured_metrics/rawes/rawes/http_connection.py", line 40, in request
    response = self.session.request(method, "/".join((self.url, path)), **args)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 357, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 460, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 360, in send
    raise Timeout(e)
Timeout: HTTPConnectionPool(host='graphiteserver', port=9202): Read timed out. (read timeout=30)

might be related / same issue as #45

no graph data when there are many null data points

In my (admittedly suboptimal) graphite configuration, my collector is submitting data every 5 mins, but carbon is writing every minute, causing 4/5ths of my whisper data to be null data points. In this case, graph-explorer does not show any data at all unless I add a moving average like "avg over 5M"

Some example data from the POST response looks like:
[{"target": "servers.name.loadavg.01", "datapoints": [[null, 1385333760], [null, 1385333820], [null, 1385333880], [0.0, 1385333940], [null, 1385334000], [null, 1385334060], [null, 1385334120], [null, 1385334180], [0.0, 1385334240], [null, 1385334300], [null, 1385334360], [null, 1385334420], [null, 1385334480], [0.0, 1385334540], [null, 1385334600], [null, 1385334660], [null, 1385334720], [null, 1385334780], [0.01, 1385334840], [null, 1385334900], [null, 1385334960], [null, 1385335020], [null, 1385335080], [0.0, 1385335140], [null, 1385335200], [null, 1385335260], [null, 1385335320], [null, 1385335380], [0.0, 1385335440], [null, 1385335500], [null, 1385335560], [null, 1385335620], [null, 1385335680],

graph-explorer seems not to be able to get graphite data.

Hi,
Congratulations for graph-explorer!

I have one issue:
I have a working graphite+diamond installation.
I installed graph-explorer, started it, but in browser when I go to http://localhost:8080/debug, I get the message:
server is waiting until structured metrics dataset is ready. can't continue

How could I address this?
In the graph-explorer's docs, there's a mention to a file named "metrics.json".
I don't have that file. From what I understand, graphite is supposed to create that.

Andrei

How do I set the correct timezone for graph-explorer for non-american time?

Hi,

I have my timezone in graphite setup to be:
TIME_ZONE = 'Europe/London'

Graphs in graphite show the correct time but graphs in graph-explorer show offset (I assume North American default).

I can see references in timeserieswidget to setting the tz value correctly but am unsure where this needs to go or whether this is being wrapped by any other code.

Any help/pointers would be appreciated.

thanks.

Create python package

I understand that you don't want to depend on external packages, but on the other hand on some distros it would be much easier to package and deploy graph-explorer if python package would be made.

For start simple setup.py file would be nice ;)

Support for static images

Could you support for PNG images in Graph-explorer?

I know Javascript rendered graphs is fancier than PNG, but transferring large JSON datafiles takes time & resources to create, transfer and render, resulting in worse user experience than using PNG.

Your excellent tool allows for Google-style searching, but when using too generic terms (every new user will start with such queries) results in requesting way too many graphs for way too many metrics and/or datapoints.

Because I noticed a big delay between starting a query and getting graphs on my screen I did some performance testing by firing off an API call for 42 metrics/targets, each having 1 datapoint per 10 seconds, ranging the from-param from 2 to 50 hours, ranging the output-param and executing every test 3 times.

Graph below contains the test results. The from-param is put on the X-axis. The total-time in seconds (incl. download) is put on the Y-axis.

I have used less-than-optimal hardware, so please ignore the Y-axis value and focus on the difference between JSON, SVG and PNG.

document features introduced in #59

extended regex syntax (with the specifics of anchoring) on GEQL wiki page
automatic derivation for counters, wraparound/max_value tag, how to prevent it
more complete unit conversions. this already has a section on the GEQL page, which now needs to be reworked.

slowness when rendering a graph with a lot of targets

if you didn't filter down a lot, it can happen that graphs contain hundreds of metrics.

due to the grouping, there's basically no limit to how many targets appear on one graph. (upto limit, if all targets go on one graph)
200 targets on 1 graph (not aggregated) takes about 30s to render for me. it could be way worse, esp. with aggregation.
the same set of targets being rendered on one graph, vs on multiple graphs, is a big difference. I think the problem is, when graphite gets a big request (with multiple targets), it handles them all in sequence instead of in parrallel (need to confirm this). timeserieswidget used to have code that splits up a request in multiple subrequest, i could reintroduce that, or fix graphite to use parrallellisation (unlikely), than in GE/timeserieswidget we just need to do something to make the graph's height match the legend.

or in GE limit the amount of targets per graph.
(i.e. max targets per graph 50 or so)
you can see the difference usually with something like catchall statsd vs catchall statsd group by n1,n2,n3,n4

PRO of splitting up to multiple graphs:

you can see the entire course of a metric without scrolling
so many metrics on one graph can be overkill anyway (though not always, and the interactive popups and future highlighting support can alleviate this)

CON of splitting up to multiple graphs:

can be hard to relate targets on different graphs
doesn't really help for other tools.

we should probably always render <prefix><unit> as unit

@thepaul
when user requests unit=MiB/s or unit=TB or whatever, we should probably always process the data as B/s or B.

why? flot/timeserieswidget automatically shows markers on the Y axis with prefixes if needed. if it's showing B but you have tens of millions on them, it will show labels such 10M, 20M, etc.
say the user requests unit=kB and a value is 10M it would show as 10k kB, which seems pretty pointless and even confusing. I would rather have GE automatically show me 10M of unit B.

timeserieswidget actually supports a prefixes option allowing us to toggle between SI and IEC,
we actually already leverage this through perferences ( https://github.com/vimeo/graph-explorer/blob/master/preferences.py#L23)

so what I think we should do when we request unit=MiB is set unit=B and suffixes=binary

No Graphs, Just Lists

I can list metrics but cannot graph them. What might I be doing wrong?

I am using Sensu to feed Graphite with metrics. Here is an example from my local Vagrant env:
localhost.localdomain.cpu.total.user
localhost.localdomain.cpu.total.system
etc...

I have run upload metrics successfully:
[gexplorer@localhost ~]$ ./update_metrics.py
2013-10-13 23:14:07,385 - update_metrics - INFO - fetching/saving metrics from graphite...
2013-10-13 23:14:07,610 - update_metrics - INFO - generating structured metrics data...
2013-10-13 23:14:07,610 - update_metrics - DEBUG - loading metrics
2013-10-13 23:14:07,611 - update_metrics - DEBUG - removing outdated targets
2013-10-13 23:14:08,439 - update_metrics - DEBUG - removed 0 metrics from elasticsearch
2013-10-13 23:14:08,440 - update_metrics - DEBUG - updating targets
2013-10-13 23:14:08,591 - update_metrics - DEBUG - indexed 13 metrics
2013-10-13 23:14:08,591 - update_metrics - INFO - success!

When I query the elasticsearch server I get some metric indexes:
[gexplorer@localhost ~]$ curl http://localhost:9200/_search?pretty
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"hits" : {
"total" : 37,
"max_score" : 1.0,
"hits" : [ {
"_index" : "graphite_metrics",
"_type" : "metric",
"_id" : "localhost.localdomain.cpu.total.irq",
"_score" : 1.0, "_source" : {"tags": ["what=unknown", "plugin=catchall", "target_type=unknown", "source=unknown", "unit=unknown", "n1=localhost", "n2=localdomain", "n3=cpu", "n4=total", "n5=irq", "n6=", "n7="]}
}, {
"_index" : "graphite_metrics",
"_type" : "metric",
"_id" : "localhost.localdomain.cpu.intr",
"_score" : 1.0, "_source" : {"tags": ["what=unknown", "plugin=catchall", "target_type=unknown", "source=unknown", "unit=unknown", "n1=localhost", "n2=localdomain", "n3=cpu", "n4=intr", "n5=", "n6=", "n7="]}
}, {
"_index" : "graphite_metrics",
"_type" : "metric",
"_id" : "localhost.localdomain.cpu.total.steal",
"_score" : 1.0, "_source" : {"tags": ["what=unknown", "plugin=catchall", "target_type=unknown", "source=unknown", "unit=unknown", "n1=localhost", "n2=localdomain", "n3=cpu", "n4=total", "n5=steal", "n6=", "n7="]}
}, {
"_index" : "graphite_metrics",
"_type" : "metric",
"_id" : "localhost.localdomain.cpu.cpu0.guest",
"_score" : 1.0, "_source" : {"tags": ["what=unknown", "plugin=catchall", "target_type=unknown", "source=unknown", "unit=unknown", "n1=localhost", "n2=localdomain", "n3=cpu", "n4=cpu0", "n5=guest", "n6=", "n7="]}
}, {
"_index" : "graphite_metrics",
"_type" : "metric",
"_id" : "carbon.agents.localhost_localdomain-a.memUsage",
"_score" : 1.0, "_source" : {"tags": ["what=bytes", "plugin=carbon", "target_type=gauge", "agent=localhost_localdomain-a", "type=carbon_mem", "unit=B"]}
}, {
"_index" : "graphite_metrics",
"_type" : "metric",
"_id" : "localhost.localdomain.cpu.cpu0.irq",
"_score" : 1.0, "_source" : {"tags": ["what=unknown", "plugin=catchall", "target_type=unknown", "source=unknown", "unit=unknown", "n1=localhost", "n2=localdomain", "n3=cpu", "n4=cpu0", "n5=irq", "n6=", "n7="]}
}, {
"_index" : "graphite_metrics",
"_type" : "metric",
"_id" : "localhost.localdomain.cpu.ctxt",
"_score" : 1.0, "_source" : {"tags": ["what=unknown", "plugin=catchall", "target_type=unknown", "source=unknown", "unit=unknown", "n1=localhost", "n2=localdomain", "n3=cpu", "n4=ctxt", "n5=", "n6=", "n7="]}
}, {
"_index" : "graphite_metrics",
"_type" : "metric",
"_id" : "carbon.agents.localhost_localdomain-a.committedPoints",
"_score" : 1.0, "_source" : {"tags": ["what=datapoints", "plugin=carbon", "target_type=gauge", "agent=localhost_localdomain-a", "type=committed", "unit=datapoints"]}
}, {
"_index" : "graphite_metrics",
"_type" : "metric",
"_id" : "localhost.localdomain.cpu.cpu0.nice",
"_score" : 1.0, "_source" : {"tags": ["what=unknown", "plugin=catchall", "target_type=unknown", "source=unknown", "unit=unknown", "n1=localhost", "n2=localdomain", "n3=cpu", "n4=cpu0", "n5=nice", "n6=", "n7="]}
}, {
"_index" : "graphite_metrics",
"_type" : "metric",
"_id" : "localhost.localdomain.cpu.cpu0.iowait",
"_score" : 1.0, "_source" : {"tags": ["what=unknown", "plugin=catchall", "target_type=unknown", "source=unknown", "unit=unknown", "n1=localhost", "n2=localdomain", "n3=cpu", "n4=cpu0", "n5=iowait", "n6=", "n7="]}
} ]
}
}

I can list the cpu metrics in the graph explorer web app:
GEQL == list cpu total

Statement
list
Limit
500
Patterns
target_type=unit=cputotal
Group by
target_type=unit=server
Aggregation
avg by none, sum by none
X-axis
from -24hours, to now
Y-axis
min None, max None
Targets matching
9/37
Graphs matching
0/0
Graphs from matching targets
0
Total graphs
0
Tag legend
whattarget_typesourceunitn1n2n3n4n5n6n7plugin
localhost.localdomain.cpu.total.user
localhost.localdomain.cpu.total.guest
localhost.localdomain.cpu.total.idle
localhost.localdomain.cpu.total.softirq
localhost.localdomain.cpu.total.irq
localhost.localdomain.cpu.total.nice
localhost.localdomain.cpu.total.iowait
localhost.localdomain.cpu.total.steal
localhost.localdomain.cpu.total.system

When I try to graph the cpu totals with the following GEQL I get no graph:
GEQL == cpu total

Statement
graph
Limit
500
Patterns
target_type=unit=cputotal
Group by
target_type=unit=server
Aggregation
avg by none, sum by none
X-axis
from -24hours, to now
Y-axis
min None, max None
Targets matching
9/37
Graphs matching
0/0
Graphs from matching targets
1
Total graphs
1
Tag legend
whattarget_typesourceunitn1n2n3n4n5n6n7plugin
unknownunknownunknownunknownlocalhostlocaldomaincputotalcatchall

Metric strings prefix configurable

First of all: what an awesome app! I'm really amazed how quickly I managed to get it to work!

What I really miss (or haven't found yet) is how to prepend the metric strings if you don't follow the graph-explorer conventions. Within the structured_metrics/plugins directory I changed two of the plugins regular expressions to make it to work in our graphite environment as we chose to shard by environment and function.

For example I could have changed the collectd plugin from:

'match': '^collectd\.(?P<server>.+?)\.(?P<collectd_plugin>.+?)(?:-(?P<collectd_plugin_instance>.+?))?\.(?P<type>.+?)(?:-(?P<type_instance>.+?))?\.(?P<value>.+)$',

to:

'match': '^prd\.www\.collectd\.(?P<server>.+?)\.(?P<collectd_plugin>.+?)(?:-(?P<collectd_plugin_instance>.+?))?\.(?P<type>.+?)(?:-(?P<type_instance>.+?))?\.(?P<value>.+)$',

You can see we could have sharded here on environment prd (production) and function www. I can imagine others might also have similar cases where they have prepended their metric strings with their own prefix.

Elasticsearch shard needs time to activate after running update_metrics.py

Starting with an empty elasticsearch, if you run update_metrics.py it can fail with this error:

$ ./update_metrics.py
2013-08-16 02:30:46,132 - update_metrics - INFO - fetching/saving metrics from graphite...
2013-08-16 02:30:46,136 - update_metrics - INFO - generating structured metrics data...
2013-08-16 02:30:46,136 - update_metrics - DEBUG - loading metrics
2013-08-16 02:30:46,137 - update_metrics - DEBUG - removing outdated targets
2013-08-16 02:30:47,305 - update_metrics - ERROR - sorry, something went wrong: (ElasticException(...), 'ElasticSearch Error: {"error":"SearchPhaseExecutionException[Failed to execute phase [init_scan], total failure; shardFailures {[_na_][graphite_metrics][0]: No active shards}]","status":500}')

However, waiting a minute and then rerunning update_metrics.py will succeed.

$ ./update_metrics.py
2013-08-16 02:38:49,983 - update_metrics - INFO - fetching/saving metrics from graphite...
2013-08-16 02:38:49,990 - update_metrics - INFO - generating structured metrics data...
2013-08-16 02:38:49,990 - update_metrics - DEBUG - loading metrics
2013-08-16 02:38:49,991 - update_metrics - DEBUG - removing outdated targets
2013-08-16 02:38:50,052 - update_metrics - DEBUG - removed 0 metrics from elasticsearch
2013-08-16 02:38:50,053 - update_metrics - DEBUG - updating targets
2013-08-16 02:38:50,103 - update_metrics - DEBUG - indexed 16 metrics
2013-08-16 02:38:50,103 - update_metrics - INFO - success!

The theory is that the elasticsearch shard needs some time to activate, and graph-explorer should handle this gracefully.

weird wraparound check

@thepaul

query.py
170:        wraparound = target['tags'].get('wraparound')
171:        if wraparound is not None:
172:            cls.apply_graphite_function_to_target(target, 'nonNegativeDerivative', wraparound)

I don't see 'wraparound' being set anywhere?
either way, anything that counts up and wraps around warranting a nonNegativeDerivative should by definition have target_type counter.
so let's just check for that and remove the 'wraparound' tag stuff.

split graphite_url option in two

graphite_url_server and graphite_url_client; the first will be used by GE's update_metrics and such (connect from GE server), the latter will be used to connect from users's browser, clientside (tswidget)

add some way to filter anthracite events based on tags

In environments with lots of and different types of events, having all events displayed on all graphs will not be useable. It would be nice to have a way to display no events or all events or some combination of events with various tags.

Can't POST metrics from Graphite

Hi Dieterbe,

Wanted to test graph-explorer, but for some reason it's unable to POST anything from Graphite render, because it doesn't form URL fully. Here's some some debug info to illustrate that.

When I query for mem_usage (this metric is present in Graphite and in graph-explorer's metrics.json), graph-explorer generated page shows that one metric was found and there's graph present, but nothing's visible:

Patterns: target_type=what=mem_usage
Group by: target_type=what=server
Avg by:
Sum by:
From: -24hours
To: now
Limit: 500
Statement: graph
# targets matching: 1/29
# graphs matching: 0/0
# graphs from matching targets: 1
# total graphs: 1

I've noticed that the POST request that graph-explorer sends to Nginx hosting Graphite looks like this:

5.32.125.220 - - [01/Oct/2013:15:19:17 +0000] "POST /render/ HTTP/1.1" 200 3772 "http://5.32.60.23:8080/index/mem_usage" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"

Where it requests for '/render/' and doesn't form a full target URL with the metric name and parameters.

On graph-explorer side it seems to be processing just fine, but there's no data to process:

5.32.125.200 - - [01/Oct/2013:15:43:45 +0000] "GET /index/mem_usage HTTP/1.1" 200 8584 "http://5.32.60.23:8080/index/mem_usage" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"
5.32.125.200 - - [01/Oct/2013:15:43:45 +0000] "GET /timeserieswidget/jquery-ui.min.js HTTP/1.1" 304 - "http://5.32.60.23:8080/index/mem_usage" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"
5.32.125.200 - - [01/Oct/2013:15:43:45 +0000] "GET /timeserieswidget/jquery.tswidget.js HTTP/1.1" 304 - "http://5.32.60.23:8080/index/mem_usage" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"
5.32.125.200 - - [01/Oct/2013:15:43:45 +0000] "GET /timeserieswidget/graphite_helpers.js HTTP/1.1" 304 - "http://5.32.60.23:8080/index/mem_usage" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"
5.32.125.200 - - [01/Oct/2013:15:43:45 +0000] "GET /timeserieswidget/rickshaw/vendor/d3.min.js HTTP/1.1" 304 - "http://5.32.60.23:8080/index/mem_usage" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"
5.32.125.200 - - [01/Oct/2013:15:43:45 +0000] "GET /timeserieswidget/rickshaw/vendor/d3.layout.min.js HTTP/1.1" 304 - "http://5.32.60.23:8080/index/mem_usage" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"
5.32.125.200 - - [01/Oct/2013:15:43:45 +0000] "GET /timeserieswidget/rickshaw/rickshaw.js HTTP/1.1" 304 - "http://5.32.60.23:8080/index/mem_usage" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"
5.32.125.200 - - [01/Oct/2013:15:43:45 +0000] "GET /timeserieswidget/flot/jquery.flot.js HTTP/1.1" 304 - "http://5.32.60.23:8080/index/mem_usage" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"
5.32.125.200 - - [01/Oct/2013:15:43:45 +0000] "GET /timeserieswidget/flot/jquery.flot.selection.js HTTP/1.1" 304 - "http://5.32.60.23:8080/index/mem_usage" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"
5.32.125.200 - - [01/Oct/2013:15:43:45 +0000] "GET /timeserieswidget/flot/jquery.flot.time.js HTTP/1.1" 304 - "http://5.32.60.23:8080/index/mem_usage" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"
5.32.125.200 - - [01/Oct/2013:15:43:45 +0000] "GET /timeserieswidget/flot/jquery.flot.stack.js HTTP/1.1" 304 - "http://5.32.60.23:8080/index/mem_usage" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"
5.32.125.200 - - [01/Oct/2013:15:43:45 +0000] "GET /timeserieswidget/jquery.flot.axislabels.js HTTP/1.1" 304 - "http://5.32.60.23:8080/index/mem_usage" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"
5.32.125.200 - - [01/Oct/2013:15:43:45 +0000] "GET /timeserieswidget/rickshaw/rickshaw.css HTTP/1.1" 304 - "http://5.32.60.23:8080/index/mem_usage" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"
5.32.125.200 - - [01/Oct/2013:15:43:45 +0000] "GET /timeserieswidget/timezone-js/src/date.js HTTP/1.1" 304 - "http://5.32.60.23:8080/index/mem_usage" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"
5.32.125.200 - - [01/Oct/2013:15:43:45 +0000] "POST /graphs/ HTTP/1.1" 200 6465 "http://5.32.60.23:8080/index/mem_usage" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"
5.32.125.200 - - [01/Oct/2013:15:43:45 +0000] "GET /timeserieswidget/tz/northamerica HTTP/1.1" 304 - "http://5.32.60.23:8080/index/mem_usage" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"

Why doesn't it form request in full format ('/render/?<...>')? Have I missed anything in configuration? I've Graphite and Graph-Explorer running on two VMs in the same VLAN.

Can't start graph-explorer.py without using ./ or /

~/dev/graph-explorer% python graph-explorer.py
Traceback (most recent call last):
File "graph-explorer.py", line 7, in
os.chdir(os.path.dirname(file))
OSError: [Errno 2] No such file or directory: ''

adding ./ starts it successfully:

~/dev/graph-explorer% python ./graph-explorer.py
2013-04-02 21:30:40,557 - app - DEBUG - app starting

Access-Control-Allow-Origin Error

Access-Control-Allow-Origin Error:

We keep getting CORS errors when graph-explorer tries to communicate with the graphite server.
XMLHttpRequest cannot load http://:9080/render/. Origin http://:8080 is not allowed by Access-Control-Allow-Origin.

So we have graph-explorer running and can see the metrics but the graphs don't render.

graph-explorer and graphite are on the same machine on different ports.

We have added the following to /etc/apache2/sites-available/graphite:
Header set Access-Control-Allow-Origin "*"
Header set Access-Control-Allow-Methods "GET, OPTIONS, POST, HEAD, PUT, DELETE"
Header set Access-Control-Allow-Headers "X-Requested-With, X-Requested-By, Origin, Authorization, Accept, Content-Type, Pragma"

We have tried putting it directly in the VirtualHost section, in <Location "/"> , <Location "/render"> and nothing has worked.
We are not super experienced in configuring apache - so we must be missing something here and will appreciate any help.

rename "avg/sum by" to "avg/sum across"

since we aggregate across different values of a given tag, the terminology should IMHO be avg/sum across <tag> not avg/sum by <tag>.

thoughts?

Mousing over a graph label should bring the corresponding line to the "front"

For example,

I'm pretty sure the line for "success" is just hidden behind the line for "total", but it would be nice to make sure. If when mousing over the "success" label in the graph legend, the line for "success" should be drawn on top of the one for "total". This would also be helpful for graphs which have a large number of component metrics, even when the lines don't necessarily overlap.

make catchall bucket optional

it should be possible to choose whether you want a catchall bucket or not.
i think this makes sense for both aggregation and grouping.
a simple syntax and implementation could be like so:

# no catchall
>>> 'us-east|us-west'.split('|')
['us-east', 'us-west']
# with catchall
>>> 'us-east|us-west|'.split('|')
['us-east', 'us-west', '']

show buckets in query explanation section

for avg by, sum by, and group by which allow patterns for buckets,
show them in query explanation section

Line 11 in update_metrics.py is buggy

os.chdir(os.path.dirname(file))

in update_metrics.py doesn't always work. On Python 2.7.3 file returns the filename and os.path.dirname returns an empty string. It should be

os.chdir(os.path.dirname(os.path.abspath(file)))

so that dirname would return the containing directory.

Graphing JS errors when metric contains a ":"

The generated JS that's built at graph time uses the metric name as part of the variables, and if the metric name contains special characters (in our case, a ":"), the JS will throw an error.

Full eval'ed JS below, but here's the line from it that errors:

var graph_flot_SomeApi::someAPI__gauge__app2 = $.extend({}, defaults, graph_data);

Full JS

        $(document).ready(function () {
        var graph_data = {"promoted_constants": {"cluster": "jphx1", "app": "fulfill", "resource": "APIHandler__handleGetStuff", "plugin": "jetson"}, "from": "-24hours", "until": "now", "constants": {"what": "SomeApi::someAPI", "target_type": "gauge", "server": "app2"}, "targets": [{"variables": {"type": "requests", "unit": "requests/m"}, "id": "localhost__handleGetStuff.SomeApi::someAPI.measurements", "target": "localhost__handleGetStuff.SomeApi::someAPI.measurements"}, {"variables": {"type": "min", "unit": "ms"}, "id": "localhost__handleGetStuff.SomeApi::someAPI.min", "target": "localhost__handleGetStuff.SomeApi::someAPI.min"}]};
        graph_data['constants_all'] = jQuery.extend({}, graph_data['constants'], graph_data['promoted_constants']);

        $("#service::someAPI__gauge__app2").html(get_graph_name("SomeApi::someAPI__gauge__app2", graph_data));
        vtitle = get_vtitle(graph_data);
        if (vtitle != "") {
            graph_data["vtitle"] = vtitle;
        }

        // interactive legend elements -> use labelFormatter (specifying name: '<a href..>foo</a>' doesn't work)
        // but this function only sees the label and series, so any extra data must be encoded in the label
        labelFormatter = function(label, series) {
            var data = JSON.parse(label);
            if(data.name) {
                // name attribute is already set. this is probably a predefined graph, not generated from targets
                return data.name;
            }
            name = "";
            // at some point, we'll probably want to order the variables; just like how we compose graph titles.
            $.map(data["variables"], function (v,k) { name += " " + display_tag(k, v);});
            // there's nothing about this target that's not already in the graph title
            if (name == "") {
                name = "empty";
            }
            return get_inspect_url(data, name);
        }
        $.map(graph_data['targets'], function (v,k) {
            v["name"] = JSON.stringify(v, null, 2);
        });
        var defaults = {
            graphite_url: "http://localhost:9000/render/",
            from: "-24hours",
            until: "now",
            height: "300",
            width: "740",
            line_stack_toggle: 'line_stack_form_flot_SomeApi::someAPI__gauge__app2',
            series: {stack: true, lines: { show: true, lineWidth: 0, fill: true }},
            legend: {container: '#legend_flot_SomeApi::someAPI__gauge__app2', noColumns: 1, labelFormatter: labelFormatter },
            hover_details: true,
            zoneFileBasePath: '../timeserieswidget/tz',
            tz: "America/New_York",
        };
        var graph_flot_SomeApi::someAPI__gauge__app2 = $.extend({}, defaults, graph_data);
        $("#chart_flot_SomeApi::someAPI__gauge__app2").graphiteFlot(graph_flot_SomeApi::someAPI__gauge__app2, function(err) { console.log(err); });
        //$("#chart_flot_SomeApi::someAPI__gauge__app2").graphiteHighcharts(graph_flot_SomeApi::someAPI__gauge__app2, function(err) { console.log(err); });
        // TODO: error callback should actually show the errors in the html, something like:
        // function(err) { $("#chart_flot_SomeApi::someAPI__gauge__app2").append('<span class="label label-important">' + err + '</span>'); }
    });

anthracite events do not display correctly

When configured to use anthracite, the events don't display over the top of the graphs, but rather as a jumbled blob at the bottom of the page.

incorrect "Couldn't parse query: 'group by' (target_type=, unit=, server), 'sum by ()' and 'avg by (server)' cannot list the same tag keys " when using buckets

`swift_proxy_server GET timing 200 object median group by server:dfvimeo|lvim avg by server avg over 10M"

-->

Couldn't parse query: 'group by' (target_type=, unit=, server), 'sum by ()' and 'avg by (server)' cannot list the same tag keys
Traceback (most recent call last):
File "/opt/graph-explorer/app.py", line 390, in render_graphs
query = Query(query)
File "/opt/graph-explorer/query.py", line 26, in __init__
self.parse(query_str)
File "/opt/graph-explorer/query.py", line 92, in parse
(', '.join(self['group_by'].keys()), ', '.join(self['sum_by'].keys()), ', '.join(self['avg_by'].keys())))
Exception: 'group by' (target_type=, unit=, server), 'sum by ()' and 'avg by (server)' cannot list the same tag keys

when using buckets, this should be allowed.
the first bucket catches 2 diff. values of server, the other one about 10.
they should each avg by server, after doing the group by server with the buckets.

support clustered graphite

basically get metrics.json from multiple locations, and use one of the locations to query to (and graphite automatically distributes the targets when querying)

show graph rendering errors on screen instead of web console.

if timeserieswidget can't connect to graphite, or if there's CORS errors, those should be clearly displayed on screen, instead of hidden in the console or network tab of your browsers' dev tools

carbon-json-data prefixed with '.'

Hi,

i cloned the package, changed the config (graphite_url) and started the server.
Then i tried to open http://localhost:8080/index/carbon and got not data in my graphs because the script tries to GET these targets:

alias(.carbon.agents.foo.cache.size,'.carbon.agents.foo.cache.size')
alias(.carbon.agents.foo.memUsage,'.carbon.agents.foo.memUsage')

Any idea why there is a '.' as prefix?

Hi, I have copied some css style of GE into my new app

As I described in title: the https://github.com/huoxy/graphite-observer has copied some style sheet from GE.

I want you know that. :-)

download_metrics_json in backend should use urljoin for /metrics/index.json

Small issue but if you use urljoin, download_metrics_json would still work even if the user added a trailing slash to the end of the graphite_url

I had that in my URL and it failed in a non-obvious way.

Adding this made it work either way:

from urlparse import urljoin
response = urllib2.urlopen(urljoin (self.config.graphite_url, "/metrics/index.json"))

Get an error when going to the "debug" menu

I just setup graph-explorer for the first time. After I start it, I goto the debug menu and after a moment I get a 500 error. The console on the box running it throws the following error:

2013-07-12 13:43:33,271 - app - DEBUG - load_data() start
2013-07-12 13:43:38,245 - app - DEBUG - load_data() end ok
DEBUG:app:load_data() end ok
Traceback (most recent call last):
File "/opt/graph-explorer/bottle.py", line 764, in _handle
return route.call(*_args)
File "/opt/graph-explorer/bottle.py", line 1625, in wrapper
rv = callback(_a, *_ka)
File "/opt/graph-explorer/bottle.py", line 1575, in wrapper
rv = callback(_a, **ka)
File "/opt/graph-explorer/app.py", line 306, in view_debug
graphs_targets, graphs_targets_options = build_graphs_from_targets(targets_all)
File "/opt/graph-explorer/app.py", line 491, in build_graphs_from_targets
graphs[graph_key] = graph_option(graphs[graph_key])
File "/opt/graph-explorer/preferences_color.py", line 119, in apply_colors
graph['targets'][i]['color'] = color_assign_cpu[t]
KeyError: u'cpu.idle'

Almost all of my data is being populated using collectd.