pyr / cyanite
cyanite stores your metrics
Home Page: http://cyanite.io
License: Other
I think this would bring cyanite closer to the features of carbon with the exception of python pickle format. We are looking at use cases of sending data to brokers to fan out messaging to several cyanite instances.
I'm using the latest (as of now) clone of Cyanite and Cassandra 2.0.5 on a single node. I've created the metric namespace by doing bin/cqlsh < doc/schema.cql, and cyanite has been packaged using leiningen 2.3.4. When I start up cyanite using the configuration given in README.md, I can see it using a large amount of CPU, and after I've sent metrics to it using echo "test.metric $RANDOM $(date +%s)" | nc 127.0.0.1 2003, I'm not seeing anything when I do SELECT * FROM metric.metric; on Cassandra.
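For reference, the carbon plaintext protocol that the shell command above speaks is just one "path value timestamp" line per metric. A minimal Python sketch of the line formatting (the metric name and timestamp here are illustrative):

```python
import random
import time

def carbon_line(path, value, timestamp=None):
    """Format one metric for carbon's plaintext protocol:
    '<path> <value> <timestamp>\n' (space-separated, newline-terminated)."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (path, value, timestamp)

# Equivalent payload to: echo "test.metric $RANDOM $(date +%s)" | nc 127.0.0.1 2003
line = carbon_line("test.metric", random.randint(0, 32767), 1394380772)
print(line)  # e.g. "test.metric 12345 1394380772"
```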
If I crank up the logging level to trace in cyanite.yaml, I can see the following fly by multiple times a second in cyanite.log:
---
TRACE [2014-03-09 16:59:32,852] clojure-agent-send-off-pool-0 - com.datastax.driver.core.Connection - [localhost/127.0.0.1-2] writing request QUERY SELECT path from metric;
TRACE [2014-03-09 16:59:32,852] New I/O worker #2 - com.datastax.driver.core.Connection - [localhost/127.0.0.1-2] request sent successfully
TRACE [2014-03-09 16:59:32,860] New I/O worker #2 - com.datastax.driver.core.Connection - [localhost/127.0.0.1-2] received: ROWS [path(metric, metric), org.apache.cassandra.db.marshal.UTF8Type]
---
I've never used Cassandra before, but I set it up using these instructions and everything looked good; I was able to insert data and retrieve it. This has happened to me on both Mac and Linux (both using Java 1.7.0_51). I'm hoping there's something basic that I've missed. Any ideas?
Hi there,
Currently working with a carbon relay receiving about 1.2 million metrics per minute.
For testing purposes, I have deployed a two-node cassandra cluster, with each cassandra node running a cyanite process attempting to write metrics. This setup seems to function fine when I throw some simple stress-test metrics at it.
However, when I direct a portion of our production metrics at the cluster, CPU utilisation for the cyanite process jumps to near 100% across all available cores (currently 8 per instance), and it continues to spin at this level even after I stop sending metrics. Cassandra's writes and CPU utilisation remain very low throughout (around 5% usage on a single core).
I initially thought that @addisonj had a pull request (#37) that would address this issue, since a number of exceptions relating to badly formed metrics were being thrown in the cyanite.log file. However, after manually merging the pull request and retrying, the issue persists (although the formatting exceptions are now handled elegantly!)
Any pointers for this one? Quite excited to get cyanite working on our production metric volume!
-Paul
I've recently been losing lots of metrics when rendering data from graphite-api or grafana.
http://graphite-api:8000/render?target=collectd.server.memory.memory-used&from=-1h
The loss rate is higher for recent metrics than for older ones.
I also noticed that the loss happens when I have this error in the cyanite logs (not sure if it's related):
ERROR [2014-07-11 11:24:55,209] New I/O worker #63 - lamina.core.utils - error on inactive probe: tcp-server:error
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(Unknown Source)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
at sun.nio.ch.IOUtil.read(Unknown Source)
at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at aleph.netty.core$cached_thread_executor$reify__8830$fn__8831.invoke(core.clj:78)
at clojure.lang.AFn.run(AFn.java:22)
at java.lang.Thread.run(Unknown Source)
I should also mention that I gather data from 25 servers with collectd installed, and the server where cyanite/cassandra run has 12 GB of RAM, a good processor, and SSDs.
I'm using cyanite commit e708113
the infra layout/config:
Can you help me identify/resolve the issue please? Thanks!
I think it's possible that an exception while attempting to update path-db in store.clj can stop the whole process from continuing. If I replace update-path-db-every with:
(defn update-path-db-every
  "At each interval, fetch all known paths, and store the
   resulting set in path-db"
  [session interval]
  (while true
    (try
      (->> (alia/execute session pathq)
           (map :path)
           (set)
           (reset! path-db))
      (catch Exception e
        (error e "failure while updating path db")))
    (Thread/sleep (* interval 1000))))
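The fix above is the guard-the-body, sleep-outside pattern: a failure in one refresh is logged and the loop carries on. For illustration, the same idea sketched in Python (fetch_paths is a stand-in for the alia/execute query; the iterations parameter exists only to make the sketch terminate):

```python
import time

def update_path_db_every(fetch_paths, path_db, interval, iterations=None):
    """Periodically refresh path_db; an exception during one refresh is
    logged and skipped rather than killing the whole loop."""
    n = 0
    while iterations is None or n < iterations:
        try:
            path_db["paths"] = set(fetch_paths())
        except Exception as e:
            print("failure while updating path db: %s" % e)
        time.sleep(interval)
        n += 1

db = {"paths": set()}

def flaky():
    raise IOError("NoHostAvailable")

update_path_db_every(flaky, db, 0, iterations=1)              # survives the failure
update_path_db_every(lambda: ["a.b", "c.d"], db, 0, iterations=1)
```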
... I get:
ERROR [2014-03-26 10:22:57,505] clojure-agent-send-off-pool-0 - org.spootnik.cyanite.store - failure while updating path db
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /1.1.1.1 (Timeout during read), /2.2.2.2 (Timeout during read))
at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:64)
at com.datastax.driver.core.ResultSetFuture.extractCauseFromExecutionException(ResultSetFuture.java:269)
at com.datastax.driver.core.ResultSetFuture.getUninterruptibly(ResultSetFuture.java:183)
at com.datastax.driver.core.Session.execute(Session.java:111)
at qbits.alia$execute.doInvoke(alia.clj:190)
at clojure.lang.RestFn.invoke(RestFn.java:421)
at org.spootnik.cyanite.store$update_path_db_every$fn__86.invoke(store.clj:123)
at org.spootnik.cyanite.store$update_path_db_every.invoke(store.clj:122)
at org.spootnik.cyanite.store$cassandra_metric_store$fn__93.invoke(store.clj:140)
at clojure.core$binding_conveyor_fn$fn__4145.invoke(core.clj:1910)
at clojure.lang.AFn.call(AFn.java:18)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /1.1.1.1 (Timeout during read), /2.2.2.2 (Timeout during read))
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:103)
at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:170)
... 3 more
I think the actual exception is my problem - not sure whether my Cassandra setup is working properly (this is my first time playing with Cassandra). Though it could be an issue with the number of paths it's attempting to pull back. nodetool cfstats tells me that my metric keyspace has approximately 273k rows.
When querying from graphite-web to cyanite I get the following stacktrace :
DEBUG [2014-02-26 09:39:24,730] New I/O worker #7 - so.grep.cyanite.http - got request: {:remote-addr 127.0.0.1, :scheme :http, :request-method :get, :query-string path=d-dbsinf-0001_adm_dev10_aub_example_net.df-boot.df_inodes-free&from=1393321164&to=1393407564, :action :metrics, :content-type nil, :keep-alive? false, :uri /metrics, :server-name localhost, :params {:path d-dbsinf-0001_adm_dev10_aub_example_net.df-boot.df_inodes-free, :from 1393321164, :to 1393407564}, :headers {user-agent Python-urllib/2.7, connection close, host 127.0.0.1:8080, accept-encoding identity}, :content-length nil, :server-port 8080, :character-encoding nil, :body nil}
DEBUG [2014-02-26 09:39:24,730] New I/O worker #7 - so.grep.cyanite.http - fetching paths: d-dbsinf-0001_adm_dev10_aub_example_net.df-boot.df_inodes-free
DEBUG [2014-02-26 09:39:24,735] New I/O worker #7 - so.grep.cyanite.store - fetching paths from store: (d-dbsinf-0001_adm_dev10_aub_example_net.df-boot.df_inodes-free) 10 60480 1393321164 1393407564 8641
ERROR [2014-02-26 09:39:24,763] New I/O worker #7 - so.grep.cyanite.http - could not process request
java.lang.NullPointerException
at clojure.lang.Numbers.ops(Numbers.java:942)
at clojure.lang.Numbers.lt(Numbers.java:219)
at clojure.core$_LT_.invoke(core.clj:859)
at clojure.core$range$fn__4269.invoke(core.clj:2668)
at clojure.lang.LazySeq.sval(LazySeq.java:42)
at clojure.lang.LazySeq.seq(LazySeq.java:60)
at clojure.lang.RT.seq(RT.java:484)
at clojure.core$seq.invoke(core.clj:133)
at clojure.core$map$fn__4207.invoke(core.clj:2479)
at clojure.lang.LazySeq.sval(LazySeq.java:42)
at clojure.lang.LazySeq.seq(LazySeq.java:60)
at clojure.lang.RT.seq(RT.java:484)
at clojure.core$seq.invoke(core.clj:133)
at clojure.core.protocols$seq_reduce.invoke(protocols.clj:30)
at clojure.core.protocols$fn__6026.invoke(protocols.clj:54)
at clojure.core.protocols$fn__5979$G__5974__5992.invoke(protocols.clj:13)
at clojure.core$reduce.invoke(core.clj:6177)
at so.grep.cyanite.store$fetch.invoke(store.clj:232)
at so.grep.cyanite.http$fn__14101.invoke(http.clj:79)
at clojure.lang.MultiFn.invoke(MultiFn.java:227)
at so.grep.cyanite.http$wrap_process$fn__14111.invoke(http.clj:97)
at so.grep.cyanite.http$wrap_process.invoke(http.clj:93)
at so.grep.cyanite.http$start$handler__14122.invoke(http.clj:115)
at aleph.http.netty$start_http_server$fn$reify__13496$stage0_13482__13497.invoke(netty.clj:77)
at aleph.http.netty$start_http_server$fn$reify__13496.run(netty.clj:77)
at lamina.core.pipeline$fn__3666$run__3673.invoke(pipeline.clj:31)
at lamina.core.pipeline$resume_pipeline.invoke(pipeline.clj:61)
at lamina.core.pipeline$start_pipeline.invoke(pipeline.clj:78)
at aleph.http.netty$start_http_server$fn$reify__13496.invoke(netty.clj:77)
at aleph.http.netty$start_http_server$fn__13479.invoke(netty.clj:77)
at lamina.connections$server_generator_$this$reify__13275$stage0_13261__13276.invoke(connections.clj:376)
at lamina.connections$server_generator_$this$reify__13275.run(connections.clj:376)
at lamina.core.pipeline$fn__3666$run__3673.invoke(pipeline.clj:31)
at lamina.core.pipeline$resume_pipeline.invoke(pipeline.clj:61)
at lamina.core.pipeline$start_pipeline.invoke(pipeline.clj:78)
at lamina.connections$server_generator_$this$reify__13275.invoke(connections.clj:376)
at lamina.connections$server_generator_$this__13258.invoke(connections.clj:376)
at lamina.connections$server_generator_$this__13258.invoke(connections.clj:371)
at lamina.trace.instrument$instrument_fn$fn__6374$fn__6408.invoke(instrument.clj:140)
at lamina.trace.instrument$instrument_fn$fn__6374.invoke(instrument.clj:140)
at clojure.lang.AFn.applyToHelper(AFn.java:161)
at clojure.lang.RestFn.applyTo(RestFn.java:132)
at clojure.lang.AFunction$1.doInvoke(AFunction.java:29)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at lamina.connections$server_generator$fn$reify__13322.run(connections.clj:407)
at lamina.core.pipeline$fn__3666$run__3673.invoke(pipeline.clj:31)
at lamina.core.pipeline$resume_pipeline.invoke(pipeline.clj:61)
at lamina.core.pipeline$subscribe$fn__3699.invoke(pipeline.clj:118)
at lamina.core.result.ResultChannel.success_BANG_(result.clj:388)
at lamina.core.result$fn__1349$success_BANG___1352.invoke(result.clj:37)
at lamina.core.queue$dispatch_consumption.invoke(queue.clj:111)
at lamina.core.queue.EventQueue.enqueue(queue.clj:327)
at lamina.core.queue$fn__1980$enqueue__1995.invoke(queue.clj:131)
at lamina.core.graph.node.Node.propagate(node.clj:282)
Using Cassandra version 2.0.6
I start up graphite-api, cyanite and apache-cassandra. I start sending metrics from a DropWizard application and within minutes my Cassandra process dies. Just trying to find out if you are aware of any issues writing to Cassandra.
Sometimes I am able to write metrics for a few minutes, other times the crash is more immediate.
In Cassandra:
INFO [OptionalTasks:1] 2014-04-07 09:03:53,102 MeteredFlusher.java (line 63) flushing high-traffic column family CFS(Keyspace='metric', ColumnFamily='metric') (estimated 72791577 bytes)
INFO [OptionalTasks:1] 2014-04-07 09:03:53,103 ColumnFamilyStore.java (line 785) Enqueuing flush of Memtable-metric@133887464(20720419/72791577 serialized/live bytes, 339679 ops)
INFO [FlushWriter:12] 2014-04-07 09:03:53,104 Memtable.java (line 331) Writing Memtable-metric@133887464(20720419/72791577 serialized/live bytes, 339679 ops)
INFO [FlushWriter:12] 2014-04-07 09:03:54,526 Memtable.java (line 371) Completed flushing /datos/monitoring/apache-cassandra/data/metric/metric/metric-metric-jb-27-Data.db (6649345 bytes) for commitlog position ReplayPosition(segmentId=1396879560863, position=548607
In Cyanite:
ERROR [2014-04-07 22:00:42,062] New I/O worker #25 - lamina.core.utils - error on inactive probe: tcp-server:error
clojure.lang.ExceptionInfo: Query prepare failed {:query "UPDATE metric USING TTL ? SET data = data + ? WHERE tenant = '' AND rollup = ? AND period = ? AND path = ? AND time = ?;", :type :qbits.alia/prepare-error, :exception #<NoHostAvailableException com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (no host was tried)>}
at clojure.core$ex_info.invoke(core.clj:4403)
at qbits.alia$ex__GT_ex_info.invoke(alia.clj:125)
at qbits.alia$prepare.invoke(alia.clj:141)
at org.spootnik.cyanite.store$insertq.invoke(store.clj:34)
at org.spootnik.cyanite.store$channel_for.invoke(store.clj:150)
at org.spootnik.cyanite.carbon$handler$fn__13780.invoke(carbon.clj:29)
at aleph.tcp$start_tcp_server$fn__9355$fn__9357.invoke(tcp.clj:34)
at aleph.netty.server$server_message_handler$initializer__9141.invoke(server.clj:111)
at aleph.netty.server$server_message_handler$reify__9192.handleUpstream(server.clj:131)
at aleph.netty.core$upstream_traffic_handler$reify__8884.handleUpstream(core.clj:258)
at aleph.netty.core$connection_handler$reify__8877.handleUpstream(core.clj:240)
at aleph.netty.core$upstream_error_handler$reify__8867.handleUpstream(core.clj:199)
at org.jboss.netty.channel.Channels.fireChannelOpen(Channels.java:170)
at org.jboss.netty.channel.socket.nio.NioAcceptedSocketChannel.<init>(NioAcceptedSocketChannel.java:42)
at org.jboss.netty.channel.socket.nio.NioServerBoss.registerAcceptedChannel(NioServerBoss.java:137)
at org.jboss.netty.channel.socket.nio.NioServerBoss.process(NioServerBoss.java:104)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at aleph.netty.core$cached_thread_executor$reify__8830$fn__8831.invoke(core.clj:78)
at clojure.lang.AFn.run(AFn.java:22)
at java.lang.Thread.run(Thread.java:724)
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (no host was tried)
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:100)
at com.datastax.driver.core.SessionManager.execute(SessionManager.java:417)
at com.datastax.driver.core.SessionManager.prepareAsync(SessionManager.java:124)
at com.datastax.driver.core.SessionManager.prepare(SessionManager.java:108)
at qbits.alia$prepare.invoke(alia.clj:139)
... 20 more
After killing the cyanite process, it takes a very long time before starting it again no longer produces the error below. There are no other running instances and no other processes listening on TCP 2003.
starting with configuration: nil
DEBUG [2014-04-18 14:46:44,840] main - org.spootnik.cyanite.config - building :store with org.spootnik.cyanite.store/cassandra-metric-store
INFO [2014-04-18 14:46:44,841] main - org.spootnik.cyanite.store - connecting to cassandra cluster
INFO [2014-04-18 14:46:45,326] main - org.spootnik.cyanite.carbon - starting carbon handler
Exception in thread "main" org.jboss.netty.channel.ChannelException: Failed to bind to: VALID_DOMAIN/VALID_IP:2003
at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
at aleph.netty.server$start_server.invoke(server.clj:68)
at aleph.tcp$start_tcp_server.invoke(tcp.clj:31)
at org.spootnik.cyanite.carbon$start.invoke(carbon.clj:38)
at org.spootnik.cyanite$_main.doInvoke(cyanite.clj:31)
at clojure.lang.RestFn.applyTo(RestFn.java:137)
at org.spootnik.cyanite.main(Unknown Source)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:444)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.jboss.netty.channel.socket.nio.NioServerBoss$RegisterTask.run(NioServerBoss.java:193)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:366)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:290)
at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at aleph.netty.core$cached_thread_executor$reify__8828$fn__8829.invoke(core.clj:78)
at clojure.lang.AFn.run(AFn.java:22)
at java.lang.Thread.run(Thread.java:724)
I am currently running a carbon-relay -> carbon-aggregator -> carbon-cache setup, and since carbon-cache performs poorly I am looking into alternative backends.
To be able to use cyanite as a carbon-cache replacement, it needs to accept metrics as Python pickles, since this is how carbon-aggregator forwards metrics to carbon-cache.
I have found your interesting blog post [1] about the Python pickle format, but I wasn't able to find out whether it is already possible, and if so how, to configure cyanite to accept carbon's pickle output.
It would be great if you could give me a hint. I'm happy to contribute documentation updates as a pull request if you could point out here how to achieve this.
[1] http://spootnik.org/entries/2014/04/05_diving-into-the-python-pickle-format.html
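For context on what a pickle listener would have to accept: carbon's pickle protocol (port 2004 in a stock carbon setup) frames each batch as a 4-byte big-endian length header followed by a pickled list of (path, (timestamp, value)) tuples. A minimal sketch of the framing:

```python
import pickle
import struct

def pickle_frame(metrics):
    """Frame metrics for carbon's pickle protocol: metrics is a list of
    (path, (timestamp, value)) tuples; returns length-prefixed bytes."""
    payload = pickle.dumps(metrics, protocol=2)
    return struct.pack("!L", len(payload)) + payload

def unframe(data):
    """Inverse: read the 4-byte length header and unpickle the payload,
    as a receiver would."""
    (length,) = struct.unpack("!L", data[:4])
    return pickle.loads(data[4:4 + length])

frame = pickle_frame([("test.metric", (1394380772, 42.0))])
assert unframe(frame) == [("test.metric", (1394380772, 42.0))]
```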
Hello,
Sorry for the subject, I figured it would be eye-catching ;). So I have a question: this project is intriguing, but I am curious why it exists. What problem are you solving, and how well are you solving it? If this is about performance, what sort of improvements have you seen? Thanks, and looking forward to hearing from you.
-John
Hi everyone,
We are working with Cyanite to store metrics in Cassandra, store a cache in Elasticsearch, and read them through Graphite-web, all of it in a multiple-node cluster. After an upgrade of Cassandra to version 2.1 and of Cyanite to version 0.1.3, we have problems with the Cyanite configuration. When we want to view the metrics, Graphite-web doesn't find them.
cyanite.yaml:
carbon:
  host: "192.168.150.111"
  port: 2003
  rollups:
    - "60s:30d"
    - "5m:180d"
    - "1h:300d"
    - "1d:1y"
http:
  host: "192.168.150.111"
  port: 8080
logging:
  level: debug
  console: true
  files:
    - "/var/log/cyanite.log"
store:
  cluster: 'localhost'
  keyspace: 'metric'
index:
  use: "io.cyanite.es_path/es-native"
  index: "my_paths"              # defaults to "cyanite_paths"
  host: "localhost"              # defaults to localhost
  port: 9300                     # defaults to 9300
  cluster_name: "es_4_cyanite"   # REQUIRED: this is specific to your cluster and has no sensible default
/var/log/cyanite.log:
ERROR [2014-10-13 11:06:36,034] async-dispatch-27 - io.cyanite.es_path - No node available
org.elasticsearch.client.transport.NoNodeAvailableException: No node available
at org.elasticsearch.client.transport.TransportClientNodesService.execute(TransportClientNodesService.java:196)
at org.elasticsearch.client.transport.support.InternalTransportClient.execute(InternalTransportClient.java:94)
at org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:172)
at org.elasticsearch.client.transport.TransportClient.get(TransportClient.java:375)
at clojurewerkz.elastisch.native$get.invoke(native.clj:63)
at clojurewerkz.elastisch.native.document$get.invoke(document.clj:136)
at clojurewerkz.elastisch.native.document$present_QMARK_.invoke(document.clj:164)
at clojure.core$partial$fn__4328.invoke(core.clj:2503)
at io.cyanite.es_path$es_native$reify__5158$fn__5302$state_machine__4698__auto____5303$fn__5305.invoke(es_path.clj:219)
at io.cyanite.es_path$es_native$reify__5158$fn__5302$state_machine__4698__auto____5303.invoke(es_path.clj:217)
at clojure.core.async.impl.ioc_macros$run_state_machine.invoke(ioc_macros.clj:940)
at clojure.core.async.impl.ioc_macros$run_state_machine_wrapped.invoke(ioc_macros.clj:944)
at clojure.core.async.impl.ioc_macros$take_BANG_$fn__4714.invoke(ioc_macros.clj:953)
at clojure.core.async.impl.channels.ManyToManyChannel$fn__1714.invoke(channels.clj:102)
at clojure.lang.AFn.run(AFn.java:22)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Do you have an idea of what is going wrong? Is the configuration correct? At least in the previous version this worked fine.
Thank you very much!
With the following cyanite.yaml
...
carbon:
  rollups:
    - period: 21600
      rollup: 15
    - period: 259200
      rollup: 60
    - period: 1209600
      rollup: 300
    - period: 31536000
      rollup: 600
    - period: 94608000
      rollup: 3600
... my intention was to try and match these retentions (hopefully I've got the right idea):
retentions = 15s:6h,1m:72h,5m:2w,10m:1y,1h:3y
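The mapping does look right. Assuming standard graphite retention notation, converting a "15s:6h" style string into the rollup/period pairs above can be sketched as:

```python
# Unit multipliers used by graphite retention strings.
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}

def parse_retention(spec):
    """'15s:6h' -> (rollup_seconds, period_seconds) = (15, 21600)."""
    def seconds(token):
        return int(token[:-1]) * UNITS[token[-1]]
    rollup, period = spec.split(":")
    return seconds(rollup), seconds(period)

# The five retentions from the question map to the YAML above:
specs = "15s:6h,1m:72h,5m:2w,10m:1y,1h:3y".split(",")
print([parse_retention(s) for s in specs])
# -> [(15, 21600), (60, 259200), (300, 1209600), (600, 31536000), (3600, 94608000)]
```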
However, I get the following exception while sending data to cyanite:
ERROR [2014-03-17 10:33:17,565] New I/O worker #5 - lamina.core.utils - Error in permanent callback.
java.lang.IllegalArgumentException: Value out of range for int: 18921600000
at clojure.lang.RT.intCast(RT.java:1115)
at clojure.lang.RT.intCast(RT.java:1085)
at org.spootnik.cyanite.store$channel_for$fn__11999.invoke(store.clj:154)
at lamina.core.graph.propagator.CallbackPropagator.propagate(propagator.clj:42)
at lamina.core.graph.core$fn__1875$propagate__1880.invoke(core.clj:34)
at lamina.core.graph.node.Node.propagate(node.clj:282)
at lamina.core.graph.core$fn__1875$propagate__1880.invoke(core.clj:34)
at lamina.core.channel.Channel.enqueue(channel.clj:63)
at lamina.core.utils$fn__1070$enqueue__1071.invoke(utils.clj:74)
at lamina.core$enqueue$fn__4919.invoke(core.clj:111)
... lots more lines ...
Looking at the number 18921600000 (which is 31536000 * 600), does setting :values in store.clj:154 need to have more longs in it, along with a change to the cyanite cassandra schema? Sorry, I'd submit a pull request but my lack of understanding at this point would probably break something.
Just doing a /metrics request for something which exists (although I've tweaked the name in the log message below) with a from querystring parameter gives Query execution failed in the response error message, and the following entry in cyanite.log:
ERROR [2014-03-26 17:08:12,651] New I/O worker #7 - org.spootnik.cyanite.http - could not process request
clojure.lang.ExceptionInfo: Query execution failed {:values [("redacted.*.metrics") 60 4320 1395770400 1395853692 5556], :query #<BoundStatement com.datastax.driver.core.BoundStatement@3d8c3e90>, :type :qbits.alia/execute, :exception #<InvalidQueryException com.datastax.driver.core.exceptions.InvalidQueryException: Cannot page queries with both ORDER BY and a IN restriction on the partition key; you must either remove the ORDER BY or the IN and sort client side, or disable paging for this query>}
at clojure.core$ex_info.invoke(core.clj:4403)
at qbits.alia$ex__GT_ex_info.invoke(alia.clj:125)
at qbits.alia$ex__GT_ex_info.invoke(alia.clj:127)
at qbits.alia$execute.doInvoke(alia.clj:251)
at clojure.lang.RestFn.invoke(RestFn.java:457)
at org.spootnik.cyanite.store$fetch.invoke(store.clj:231)
at org.spootnik.cyanite.http$fn__15785.invoke(http.clj:80)
at clojure.lang.MultiFn.invoke(MultiFn.java:227)
at org.spootnik.cyanite.http$wrap_process$fn__15797.invoke(http.clj:102)
at org.spootnik.cyanite.http$wrap_process.invoke(http.clj:98)
at org.spootnik.cyanite.http$start$handler__15808.invoke(http.clj:120)
at aleph.http.netty$start_http_server$fn$reify__15180$stage0_15166__15181.invoke(netty.clj:77)
at aleph.http.netty$start_http_server$fn$reify__15180.run(netty.clj:77)
at lamina.core.pipeline$fn__3632$run__3639.invoke(pipeline.clj:31)
at lamina.core.pipeline$resume_pipeline.invoke(pipeline.clj:61)
at lamina.core.pipeline$start_pipeline.invoke(pipeline.clj:78)
at aleph.http.netty$start_http_server$fn$reify__15180.invoke(netty.clj:77)
at aleph.http.netty$start_http_server$fn__15163.invoke(netty.clj:77)
at lamina.connections$server_generator_$this$reify__14959$stage0_14945__14960.invoke(connections.clj:376)
at lamina.connections$server_generator_$this$reify__14959.run(connections.clj:376)
at lamina.core.pipeline$fn__3632$run__3639.invoke(pipeline.clj:31)
at lamina.core.pipeline$resume_pipeline.invoke(pipeline.clj:61)
at lamina.core.pipeline$start_pipeline.invoke(pipeline.clj:78)
at lamina.connections$server_generator_$this$reify__14959.invoke(connections.clj:376)
at lamina.connections$server_generator_$this__14942.invoke(connections.clj:376)
at lamina.connections$server_generator_$this__14942.invoke(connections.clj:371)
at lamina.trace.instrument$instrument_fn$fn__6340$fn__6374.invoke(instrument.clj:140)
at lamina.trace.instrument$instrument_fn$fn__6340.invoke(instrument.clj:140)
at clojure.lang.AFn.applyToHelper(AFn.java:154)
at clojure.lang.RestFn.applyTo(RestFn.java:132)
at clojure.lang.AFunction$1.doInvoke(AFunction.java:29)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at lamina.connections$server_generator$fn$reify__15006.run(connections.clj:407)
at lamina.core.pipeline$fn__3632$run__3639.invoke(pipeline.clj:31)
at lamina.core.pipeline$resume_pipeline.invoke(pipeline.clj:61)
at lamina.core.pipeline$subscribe$fn__3665.invoke(pipeline.clj:118)
at lamina.core.result.ResultChannel.success_BANG_(result.clj:388)
at lamina.core.result$fn__1315$success_BANG___1318.invoke(result.clj:37)
at lamina.core.queue$dispatch_consumption.invoke(queue.clj:111)
at lamina.core.queue.EventQueue.enqueue(queue.clj:327)
at lamina.core.queue$fn__1946$enqueue__1961.invoke(queue.clj:131)
at lamina.core.graph.node.Node.propagate(node.clj:282)
at lamina.core.graph.core$fn__1875$propagate__1880.invoke(core.clj:34)
at lamina.core.graph.node.Node.propagate(node.clj:282)
at lamina.core.graph.core$fn__1875$propagate__1880.invoke(core.clj:34)
at lamina.core.channel.Channel.enqueue(channel.clj:63)
at lamina.core.utils$fn__1070$enqueue__1071.invoke(utils.clj:74)
at lamina.core$enqueue.invoke(core.clj:107)
at aleph.http.core$collapse_reads$fn__14021.invoke(core.clj:229)
at lamina.core.graph.propagator$bridge$fn__2919.invoke(propagator.clj:194)
at lamina.core.graph.propagator.BridgePropagator.propagate(propagator.clj:61)
at lamina.core.graph.core$fn__1875$propagate__1880.invoke(core.clj:34)
at lamina.core.graph.node.Node.propagate(node.clj:282)
at lamina.core.graph.core$fn__1875$propagate__1880.invoke(core.clj:34)
at lamina.core.channel.SplicedChannel.enqueue(channel.clj:111)
at lamina.core.utils$fn__1070$enqueue__1071.invoke(utils.clj:74)
at lamina.core$enqueue.invoke(core.clj:107)
at aleph.netty.server$server_message_handler$reify__9192.handleUpstream(server.clj:135)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.handler.codec.http.HttpContentEncoder.messageReceived(HttpContentEncoder.java:81)
at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at aleph.netty.core$upstream_traffic_handler$reify__8884.handleUpstream(core.clj:258)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at aleph.netty.core$connection_handler$reify__8877.handleUpstream(core.clj:240)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at aleph.netty.core$upstream_error_handler$reify__8867.handleUpstream(core.clj:199)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at aleph.netty.core$cached_thread_executor$reify__8830$fn__8831.invoke(core.clj:78)
at clojure.lang.AFn.run(AFn.java:22)
at java.lang.Thread.run(Thread.java:724)
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Cannot page queries with both ORDER BY and a IN restriction on the partition key; you must either remove the ORDER BY or the IN and sort client side, or disable paging for this query
at com.datastax.driver.core.Responses$Error.asException(Responses.java:96)
at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:108)
at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:228)
at com.datastax.driver.core.RequestHandler.onSet(RequestHandler.java:354)
at com.datastax.driver.core.Connection$Dispatcher.messageReceived(Connection.java:571)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
... 1 more
It's Cassandra 2.0.6.
Hi, I'm testing cyanite and evaluating how difficult administration of this graphite backend is compared to the default carbon-whisper system.
I would like to know how to do some basic things with metrics:
a) rename metrics that match a pattern;
b) delete metrics that match a pattern;
c) change the global roll-up aggregation without affecting the current data.
But when I try a simple query, something goes wrong:
cqlsh:metric> select * from metric where path='collectd.dades.graphite0.graphitews0.system.cpu.percent-active';
Bad Request: partition key part path cannot be restricted (preceding part rollup is either not restricted or by a non-EQ relation)
Can you help me learn more about metric administration?
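For anyone hitting the same error: every column of the partition key has to be restricted by equality before other restrictions are allowed. A minimal sketch of building such a point query; the assumed key columns (rollup, period, path) are taken from the statements cyanite itself issues and may differ in your schema version:

```python
# Build a CQL point query restricting the assumed partition-key columns
# (rollup, period, path). Values are inlined for illustration only; real
# client code should use bound parameters instead.
def point_query(path, rollup, period):
    return (
        "SELECT path, data, time FROM metric "
        f"WHERE rollup = {int(rollup)} AND period = {int(period)} "
        f"AND path = '{path}';"
    )

print(point_query(
    "collectd.dades.graphite0.graphitews0.system.cpu.percent-active",
    600, 105120))
```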
How can I run multiple instances of Cyanite on the same machine with different settings? What configurations do I need to be aware of? Also, I am using Codahale in the code to publish metrics, with the Graphite publisher. Does anyone have experience configuring multiple Cyanite/Graphite instances in the Codahale graphite publisher?
Thanks in advance.
There are plans to set/update TTLs in ES; however, since the recent patch #36, 'index/present' is used to avoid excessive updates. This logic breaks the ability to maintain tree objects' TTLs properly.
Any ideas on how to maintain the tree index now?
Hello, I'm getting the following during my build:
(Could not transfer artifact org.apache.httpcomponents:httpcore:pom:4.3.2 from/to central (http://repo1.maven.org/maven2/): Checksum validation failed, could not read expected checksum: Failed to transfer file: http://repo1.maven.org/maven2/org/apache/httpcomponents/httpcore/4.3.2/httpcore-4.3.2.pom.sha1. Return code is: 500 , ReasonPhrase:Domain Not Found.)
(Could not transfer artifact org.apache.httpcomponents:httpcore:pom:4.3.1 from/to central (http://repo1.maven.org/maven2/): Checksum validation failed, could not read expected checksum: Failed to transfer file: http://repo1.maven.org/maven2/org/apache/httpcomponents/httpcore/4.3.1/httpcore-4.3.1.pom.sha1. Return code is: 500 , ReasonPhrase:Domain Not Found.)
(Could not transfer artifact org.apache.httpcomponents:httpmime:pom:4.3.2 from/to central (http://repo1.maven.org/maven2/): Checksum validation failed, could not read expected checksum: Failed to transfer file: http://repo1.maven.org/maven2/org/apache/httpcomponents/httpmime/4.3.2/httpmime-4.3.2.pom.sha1. Return code is: 500 , ReasonPhrase:Domain Not Found.)
(Could not transfer artifact io.netty:netty-parent:pom:4.0.19.Final from/to central (http://repo1.maven.org/maven2/): Failed to transfer file: http://repo1.maven.org/maven2/io/netty/netty-parent/4.0.19.Final/netty-parent-4.0.19.Final.pom. Return code is: 500 , ReasonPhrase:Domain Not Found.)
This could be due to a typo in :dependencies or network issues.
If you are behind a proxy, try setting the 'http_proxy' environment variable.
Uberjar aborting because jar failed: Could not resolve dependencies
It's downloaded other dependencies but not the above.
I'm using leiningen 2.4.3
I know nothing of clojure or leiningen except that clojure is a Lisp dialect and leiningen is a build system. I would find it extremely helpful if there was a build of cyanite that I could download.
Hi, we are working with the Graphite-Cyanite-Cassandra system, comparing it with other systems like InfluxDB or Carbon-Whisper. We would like to know some things about Cyanite:
- Is there any maximum on temporal retention in the rollups? Days, months, years?
- We know that Whisper is a fixed-size database because it reserves space for each retention group, and if there is no data, Whisper saves null values. The question is whether Cyanite works like that, or whether Cyanite doesn't reserve any space and the database size varies with the data it receives.
Questions about the effectiveness of cluster management:
We work with clusters of Cyanite and Elasticsearch on one side, and with a Cassandra cluster on the other side. Maybe these questions relate more to the Cassandra cluster, but we would appreciate any advice:
Thank you very much for any help that you can give us!
It would not take much to have a statsd listener in addition to the carbon one; our storage schema in cassandra makes the listener part stateless, which is a nice addition.
Hi Pierre,
I'm hoping to get cyanite working for showing stats from multiple data centers.
We're writing data into cassandra (via graphite-api) using something like the following:
So, in us-east, all stats are being stored in cassandra under statsd buckets starting with stats.us_east.app, etc.
I'm able to do a "select * from metric limit 100" and can see the data from the other DCs is in cassandra -- i.e. data is replicating between the cassandra nodes, cross-DC, just fine.
However, when I read /metrics/index.json from graphite-api, only the data from the local data center is showing up.
How is cyanite providing a list of metrics to graphite-api?
Thanks,
Jeff
Paths should be indexed with a different mechanism than cassandra, for fast queries
Received this error when starting Cyanite:
ERROR [2014-04-12 17:34:44,381] clojure-agent-send-off-pool-0 - org.spootnik.cyanite.store - could not update path database
clojure.lang.ExceptionInfo: Query execution failed {:values nil, :query #<SimpleStatement SELECT distinct tenant, path, rollup, period from metric;>, :type :qbits.alia/execute, :exception #<SyntaxError com.datastax.driver.core.exceptions.SyntaxError: line 1:16 no viable alternative at input 'tenant'>}
at clojure.core$ex_info.invoke(core.clj:4403)
at qbits.alia$ex__GT_ex_info.invoke(alia.clj:125)
at qbits.alia$ex__GT_ex_info.invoke(alia.clj:127)
at qbits.alia$execute.doInvoke(alia.clj:251)
at clojure.lang.RestFn.invoke(RestFn.java:421)
at org.spootnik.cyanite.store$update_path_db_every$fn__13702.invoke(store.clj:123)
at org.spootnik.cyanite.store$update_path_db_every.invoke(store.clj:122)
at org.spootnik.cyanite.store$cassandra_metric_store$fn__13709.invoke(store.clj:140)
at clojure.core$binding_conveyor_fn$fn__4145.invoke(core.clj:1910)
at clojure.lang.AFn.call(AFn.java:18)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: com.datastax.driver.core.exceptions.SyntaxError: line 1:16 no viable alternative at input 'tenant'
at com.datastax.driver.core.Responses$Error.asException(Responses.java:94)
at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:108)
at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:228)
at com.datastax.driver.core.RequestHandler.onSet(RequestHandler.java:354)
at com.datastax.driver.core.Connection$Dispatcher.messageReceived(Connection.java:571)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
... 3 more
There are two wrong CQL statements.
One uses DISTINCT, which is not allowed in cqlsh without one of the partition key elements mentioned:
SELECT distinct tenant .... from metrics;
The other is that tenant, which is part of the partition key, cannot be empty:
UPDATE metric USING TTL ? SET data = data + ? "
"WHERE tenant = '' AND rollup = ? AND period = ? AND path = ? AND time = ?;")))
Can you explain in your documentation how to erase all the metrics of a host?
We need this information when we shut down a server and want to clean the database.
In looking over the documentation and README, it's not clear to me what functionality cyanite has in comparison to carbon.
I was able to get it going yesterday along with graphite-web and everything seems to be working (and it's pretty awesome so far), but it's murky whether the rollup and pruning functionality is already working. In other words, what maintenance should I expect to have to do?
Also, is this being used in production anywhere? I am about to create a new graphite cluster, really like the idea, and would love to hear how your experience has been using it.
Hello,
Can you explain why we have to use the fork of GraphiteUI rather than the original? I see mention of this project in the graphite docs (http://graphite.readthedocs.org/en/latest/storage-backends.html), so is that still a requirement?
-John
Hi,
I'm not seeing how to set the cassandra hosts and port for cyanite to connect to. When I try running cyanite with cluster set to anything other than an IP address (e.g. either host:port or host,host), it fails; and attempting to give cluster a list of hosts, yaml style, also fails.
Thanks,
Jeff
Building debian packages from leiningen would make distribution far easier.
Seeing a lot of this since 470ab39 (multiple times a minute):
WARN [2014-03-26 17:00:45,965] Cassandra Java Driver worker-0 - com.datastax.driver.core.Cluster - Re-preparing already prepared query UPDATE metric USING TTL ? SET data = data + ? WHERE tenant = '' AND rollup = ? AND period = ? AND path = ? AND time = ?;. Please note that preparing the same query more than once is generally an anti-pattern and will likely affect performance. Consider preparing the statement only once.
Also seeing this:
0 [main] 2014-03-26 16:52:41,148 WARN com.datastax.driver.core.FrameCompressor - Cannot find Snappy class, you should make sure the Snappy library is in the classpath if you intend to use it. Snappy compression will not be available for the protocol.
3 [main] 2014-03-26 16:52:41,151 WARN com.datastax.driver.core.FrameCompressor - Cannot find LZ4 class, you should make sure the LZ4 library is in the classpath if you intend to use it. LZ4 compression will not be available for the protocol.
... on stdout
at startup.
Should mention that I'm using Cassandra 2.0.6.
Dear,
I set up cyanite and it works OK when I push some metrics. But after pushing a lot of data over 2-3 days, cyanite stopped displaying new metrics on graphite-web. So I restarted cyanite, and now no metrics display on graphite-web at all. I checked the cyanite API: http://10.30.12.133:8080/paths?query=* returns [], and fetching data from http://10.30.12.133:8080/metrics?path=UP_ZME_Test_30_14_10_30_12_42.loadavg.1min&from=1393150441&to=1393200441 returns {"error":"LIMIT must be strictly positive"}. Please help me. Also, I don't understand the rollups ("period" and "rollup") in the config, and why there are 2 rollups. I have set cyanite up 3 times and each time ended with web service errors.
Result Cassandra query:
cqlsh:metric> select path,data,time from metric where path in ('UP_ZME_Test_30_14_10_30_12_42.loadavg.1min') and rollup = 600 and period = 105120 and time >= 1393212735 and time <= 1394212735 order by time asc limit 1;
path | data | time
--------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------
UP_ZME_Test_30_14_10_30_12_42.loadavg.1min | [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] | 1393213200
I use the following command to start
sudo java -jar cyanite/target/cyanite-0.1.0.jar -f /etc/cyanite.yaml
I get the following errors
Exception in thread "main" clojure.lang.ExceptionInfo: Query prepare failed {:query "UPDATE metric USING TTL ? SET data = data + ? WHERE tenant = '' AND rollup = ? AND period = ? AND path = ? AND time = ?;", :type :qbits.alia/prepare-error, :exception #<SyntaxError com.datastax.driver.core.exceptions.SyntaxError: line 1:24 mismatched input '?' expecting INTEGER>}
at clojure.core$ex_info.invoke(core.clj:4403)
at qbits.alia$ex__GT_ex_info.invoke(alia.clj:168)
at qbits.alia$prepare.invoke(alia.clj:185)
at org.spootnik.cyanite.store$insertq.invoke(store.clj:33)
at org.spootnik.cyanite.store$cassandra_metric_store.invoke(store.clj:133)
at clojure.lang.Var.invoke(Var.java:379)
at org.spootnik.cyanite.config$instantiate.invoke(config.clj:91)
at org.spootnik.cyanite.config$get_instance.invoke(config.clj:99)
at clojure.lang.AFn.applyToHelper(AFn.java:156)
at clojure.lang.AFn.applyTo(AFn.java:144)
at clojure.core$apply.invoke(core.clj:626)
at clojure.core$update_in.doInvoke(core.clj:5698)
at clojure.lang.RestFn.invoke(RestFn.java:467)
at org.spootnik.cyanite.config$init.invoke(config.clj:121)
at org.spootnik.cyanite$_main.doInvoke(cyanite.clj:29)
at clojure.lang.RestFn.applyTo(RestFn.java:137)
at org.spootnik.cyanite.main(Unknown Source)
Caused by: com.datastax.driver.core.exceptions.SyntaxError: line 1:24 mismatched input '?' expecting INTEGER
at com.datastax.driver.core.Responses$Error.asException(Responses.java:94)
at com.datastax.driver.core.SessionManager$2.apply(SessionManager.java:209)
at com.datastax.driver.core.SessionManager$2.apply(SessionManager.java:184)
at com.google.common.util.concurrent.Futures$1.apply(Futures.java:720)
at com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:859)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Any help appreciated.
~Thanks
Is there a way to configure cyanite to use more than one node for elasticsearch?
It looks like the yaml file only supports passing in one node for elasticsearch when I follow the yaml convention. That is, this works:
index:
use: "io.cyanite.es_path/es-native"
index: "cyanite_stats_paths" #defaults to "cyanite_paths"
host: "192.168.1.103"
But this gives the below exception:
index:
use: "io.cyanite.es_path/es-native"
index: "cyanite_stats_paths" #defaults to "cyanite_paths"
host:
- "192.168.1.103"
- "192.168.1.102"
(Additionally, it would be nice to be able to pass the port number in as part of the host, so one can do "localhost:8300, localhost:9300" to test failover on a local machine.)
Thanks,
Jeff
Exception in thread "main" java.lang.ClassCastException: clojure.lang.LazySeq cannot be cast to java.lang.String
at clojurewerkz.elastisch.native.conversion$__GT_socket_transport_address.invokePrim(conversion.clj:174)
at clojurewerkz.elastisch.native$connect.invoke(native.clj:250)
at io.cyanite.es_path$es_native.invoke(es_path.clj:198)
at clojure.lang.Var.invoke(Var.java:379)
at io.cyanite.config$instantiate.invoke(config.clj:94)
at io.cyanite.config$get_instance.invoke(config.clj:102)
at clojure.lang.AFn.applyToHelper(AFn.java:156)
at clojure.lang.AFn.applyTo(AFn.java:144)
at clojure.core$apply.invoke(core.clj:628)
at clojure.core$update_in.doInvoke(core.clj:5853)
at clojure.lang.RestFn.invoke(RestFn.java:467)
at io.cyanite.config$init.invoke(config.clj:129)
at io.cyanite$_main.doInvoke(cyanite.clj:31)
at clojure.lang.RestFn.applyTo(RestFn.java:137)
at io.cyanite.main(Unknown Source)
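Until the config supports lists, something like the following parsing would be needed; `parse_hosts` is a hypothetical helper sketching the `host[:port]` syntax suggested above, not cyanite's actual config code:

```python
# Hypothetical helper: parse comma-separated "host[:port]" entries, falling
# back to a default port (9300, the usual ES native-transport port).
def parse_hosts(value, default_port=9300):
    hosts = []
    for entry in str(value).split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.partition(":")
        hosts.append((host, int(port) if port else default_port))
    return hosts

print(parse_hosts("localhost:8300, localhost:9300"))
# [('localhost', 8300), ('localhost', 9300)]
```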
I've got a very divergent fork going that I've been working on for some time to improve performance. Some highlights from the changes are:
I'd like to know what the appetite is for taking this rather large change set wholesale. That would obviously be simplest for me, but I appreciate that the original author might not want that. If not, then I'll see if I can separate out some bits to give back.
Hello,
Sorry for the subject, I figured it would be eye-catching ;). So I have a question: this project is intriguing, but I am curious why it exists. What problem are you solving, and how well are you solving it? If this is a performance issue, what sort of improvements have you seen? Thanks, and looking forward to hearing from you.
-John
When retrieving metrics the response time is quite slow.
I have about 8 million metrics, each one being updated about once every 60 seconds. This comes out to about 600 writes/sec according to opscenter, with average write latencies significantly less than 1 ms; the cassandra cluster seems to have no trouble keeping up.
However, when hitting the API endpoint with a request that retrieves the last hour of metrics, it takes about 4 or 5 seconds to get the data. Often it can be much worse and results in timeouts from the graphite-web frontend, where the graphs fail to render.
Looking at opscenter, the read latencies don't seem to go above a few milliseconds, and the average read latency as reported by nodetool is just over 6ms.
Any clues as to where the bottleneck might be?
I have one cyanite being used just for writes, another just for reads, but it doesn't seem to make too big a difference.
Here is my config:
carbon:
host: "0.0.0.0"
port: 2003
rollups:
- period: 60480
rollup: 10
- period: 105120
rollup: 600
http:
host: "0.0.0.0"
port: 8080
logging:
level: info
console: true
store:
cluster: 'mycluster.com'
keyspace: 'metric'
Using latest master, it looks as if the in-memory metric store no longer fills its cache from the DB.
Specifically, the update-path-db-every function looks like it's gone?
I see how the path store is used for new metrics, but I have a cyanite process for reads only that won't get any paths.
Any ideas on this one? (I realize this might still be work in progress stuff, just trying to figure out the direction)
Also, as an aside: after realizing this, I downgraded back, and it seems that with the old version the cache isn't getting reliably updated... I have about 8 million rows, so perhaps it's just taking a long time to fill the cache?
carbon's storage-schemas.conf supports one-letter suffixes for periods, such as 7d, 1m, y; supporting these would make migration easier.
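A sketch of what parsing those suffixes could involve; `parse_period` is a hypothetical name, not an existing cyanite function, and it assumes "m" means minutes (as in carbon) and that a bare number means seconds:

```python
# Hypothetical parser for carbon-style period suffixes ("10s", "7d", "y").
# Assumes "m" means minutes, as in carbon, and a bare number means seconds.
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}

def parse_period(spec):
    spec = spec.strip().lower()
    if spec[-1].isdigit():          # no suffix: plain seconds
        return int(spec)
    count = spec[:-1] or "1"        # bare "y" counts as "1y"
    return int(count) * UNITS[spec[-1]]

assert parse_period("7d") == 7 * 86400
assert parse_period("1m") == 60
assert parse_period("y") == 31536000
```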
I am new to Carbon/Cyanite and am trying to understand the meaning of the rollups settings:
rollups:
- period: 60480
rollup: 10
- period: 105120
rollup: 600
What does 105120/600 mean? I appreciate any help on this.
Thanks
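For what it's worth, my reading of these settings: rollup is the resolution in seconds and period is the number of points kept, so retention = rollup × period seconds. A quick check under that assumption:

```python
# Assuming rollup = resolution in seconds and period = points retained,
# retention is their product; divide by 86400 to express it in days.
def retention_days(rollup, period):
    return rollup * period / 86400

print(retention_days(10, 60480))    # 7.0   -> 10s resolution kept for a week
print(retention_days(600, 105120))  # 730.0 -> 10min resolution kept ~2 years
```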
Seems like the master branch is broken at this time? I just tried the commit (in subject) and I am getting errors; is this a known issue?
"Exception: " #<FileNotFoundException java.io.FileNotFoundException: Could not locate org/spootnik/cyanite/logging__init.class or org/spootnik/cyanite/logging.clj on classpath: >
Exception in thread "main" clojure.lang.ExceptionInfo: no such namespace: org.spootnik.cyanite.logging/start-logging {}
at clojure.core$ex_info.invoke(core.clj:4554)
at org.spootnik.cyanite.config$instantiate.invoke(config.clj:95)
at org.spootnik.cyanite.config$get_instance.invoke(config.clj:102)
at clojure.lang.AFn.applyToHelper(AFn.java:156)
at clojure.lang.AFn.applyTo(AFn.java:144)
at clojure.core$apply.invoke(core.clj:628)
at clojure.core$update_in.doInvoke(core.clj:5853)
at clojure.lang.RestFn.invoke(RestFn.java:467)
at org.spootnik.cyanite.config$init.invoke(config.clj:122)
at org.spootnik.cyanite$_main.doInvoke(cyanite.clj:31)
at clojure.lang.RestFn.applyTo(RestFn.java:137)
at org.spootnik.cyanite.main(Unknown Source)
So my test cluster has been filling up its disks at a rate MUCH higher than I expected. I went digging to see how the data was actually being stored in Cassandra. I was pretty surprised when I saw that the lower-resolution "rollups" were simply a row with an array, "data", that contains every single value for that path during the time period. i.e.:
Given sending stats every 10s (i.e.: with statsd)
rollups defined in cyanite.yaml:
10s:1d = 8,640 rows (each row with 1 value in "data")
1m:7d = 10,080 rows (each row with 6 values in "data")
5m:365d = 105,120 rows (each row with 30 values in "data")
So for each unique metric at the end of a year, I would have 123,840 rows, with 3,222,720 values total in the "data" arrays.
When querying a "lower resolution" value (i.e.: 5m in my example), I believe cyanite is returning the average of the values in that row's "data" array.
Is this correct?
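Checking that arithmetic under the stated assumptions (one row per rollup interval, raw samples every 10 seconds):

```python
# Rows per metric for each rollup, given (resolution_seconds, retention_days);
# with 10s raw samples, each row's "data" array holds res // 10 values.
rollups = [(10, 1), (60, 7), (300, 365)]

rows = [days * 86400 // res for res, days in rollups]
values = [n * (res // 10) for n, (res, _) in zip(rows, rollups)]

print(rows)         # [8640, 10080, 105120]
print(sum(rows))    # 123840 rows per metric across all rollups
print(sum(values))  # 3222720 raw samples represented in "data" arrays
```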
When trying to build the latest cyanite, I get the following error:
Could not transfer artifact cc.qbits:alia:pom:2.2.0 from/to clojars (https://clojars.org/repo/): Checksum validation failed, expected 7495cbbed368dee884510d3329979f53baa013b1 but is bc2637df032b359c6baf9bd847d5157cb3706a45
We're seeing a minor bug when running cyanite in a testing setup, where the testing setup resets and re-inits everything.
Specifically, if the metric keyspace doesn't exist right at startup of cyanite (because our init scripts are running it at the same time -- so we can fix this by delaying the startup of cyanite), then the below stack trace occurs and the process hangs. I'd suggest either having cyanite exit, or retry on some interval, and on success, continue. As it is now, the process spins up, but fails to reach a point where it's listening on any ports for traffic.
-J
DEBUG [2014-09-19 10:15:58,545] main - org.spootnik.cyanite.config - building :store with org.spootnik.cyanite.store/cassandra-metric-store
INFO [2014-09-19 10:15:58,558] main - org.spootnik.cyanite.store - creating cassandra metric store
at org.spootnik.cyanite.config$init.invoke(config.clj:124)
at org.spootnik.cyanite$_main.doInvoke(cyanite.clj:31)
at clojure.lang.RestFn.applyTo(RestFn.java:137)
at org.spootnik.cyanite.main(Unknown Source)
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Keyspace 'metric' does not exist
If I have a rollup value of 10s but insert data into Cassandra in 1s increments, I notice the data field is an array where I am actually inserting 10 records into the one rollup row. Is this expected? Looking at the code it is, but was this a way of handling more of a statsd approach in cyanite?
Hi,
graphite-api finders (like graphite-cyanite) expose a method, get_intervals() to provide hints about when a given path is valid from and to. Currently graphite-cyanite doesn't seem to have a way to query this information from cyanite, and returns that a given path is always valid for any given range.
If cyanite exposed this information, graphite-cyanite could make use of it, and the resulting frontends could make more sensible display information.
Cheers,
Is there already a way to import existing whisper files into cyanite? I guess it would make cyanite an even more attractive alternative if there was a way to migrate an existing carbon deployment.
If it is not possible yet, would you mind pointing out how an import could be achieved? I suppose it should be fairly easy to write a Python script that uses whisper to read and decode the whisper files and then simply streams the data into Cassandra?
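The bucketing such a script would need is straightforward if (as the schema suggests) cyanite keys each row by the start of its rollup interval: each whisper (timestamp, value) point maps to `time - time % rollup`. A sketch under that assumption; actually reading the points with the python whisper module is left out:

```python
from collections import defaultdict

# Group raw (timestamp, value) points into rollup buckets keyed by the start
# of each interval, skipping whisper's None placeholders for empty slots.
def bucket_points(points, rollup):
    buckets = defaultdict(list)
    for ts, value in points:
        if value is not None:
            buckets[ts - ts % rollup].append(value)
    return dict(buckets)

points = [(1393213201, 0.5), (1393213205, 1.5), (1393213801, 2.0)]
print(bucket_points(points, 600))
# {1393213200: [0.5, 1.5], 1393213800: [2.0]}
```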
Currently, we use a series of carbon processes to handle all of our metrics. We have "relays" and "caches" and we have to spin up multiple of these processes to take advantage of the cores in the server.
In preparing cyanite to receive a subset of our production metrics (~1M per minute), do you recommend I follow the same pattern of multiple processes? Is there some benchmark of how many metrics per second a single cyanite process could handle?
Hello,
https://github.com/brutasse/graphite-cyanite/blob/master/cyanite.py#L36 urlencodes parameters (like '*').
Cyanite replies with an empty body when query.pattern is urlencoded, and Django crashes.
cheers,
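To illustrate with Python's stdlib: percent-encoding turns the '*' wildcard into %2A, so an encoded pattern reaches cyanite as a literal that matches nothing:

```python
from urllib.parse import urlencode, quote

# urlencode percent-encodes '*', so the wildcard arrives as '%2A' ...
print(urlencode({"query": "*"}))        # query=%2A
# ... while marking '*' as safe keeps the wildcard intact
print("query=" + quote("*", safe="*"))  # query=*
```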
Cyanite binds the graphite listener to all interfaces. The "host" configuration setting is ignored:
Configuration:
carbon:
host: "127.0.0.1"
port: 2004
rollups:
- "1m:30d"
- "5m:90d"
http:
host: "127.0.0.1"
port: 8000
logging:
level: warn
console: false
files:
- "/var/log/cyanite/cyanite.log"
store:
cluster: 'localhost'
keyspace: 'metric'
index:
use: "io.cyanite.es_path/es-native"
index: "cyanite_paths"
host: "127.0.0.1"
port: 9300
cluster_name: "Monitoring"
Test:
$ netstat -tulpen |grep 2004
tcp 0 0 0.0.0.0:2004 0.0.0.0:* LISTEN 107 1763123 14763/java
where 14763 is the PID of the cyanite process.
The configuration should allow specifying alternate rollup strategies based on path regexes