Comments (8)
i think that exception is harmless.
how many queues do you have running? how many clients typically connect to it?
from kestrel.
We have 50 queues, but most of them (~35) have very low traffic (<1000 items per day). The number of clients typically connected is around 100.
The number of operations is around 1,400/second.
from kestrel.
Hm, yeah, none of those are particularly high numbers. You're definitely not running out of fds at 100 clients.
You might lower the max_memory_size: with 50 queues, if 10 of them fill up, that's 5GB (which is more than can fit in a 6GB JVM because of the way garbage collection works).
Things you can check: heap usage; the GC log (is it spending a lot of time in GC when it crashes?); how backed up the queues are.
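The arithmetic behind the max_memory_size suggestion can be sketched as below. This is only an illustration: the 512 MB per-queue figure is inferred from "10 queues = 5GB" and is not stated anywhere in the thread, and `usable_fraction` is a hypothetical knob standing in for "the JVM can't use all of its heap for the app", not a measured value.

```python
def worst_case_queue_memory_mb(full_queues, max_memory_size_mb):
    """Memory held in RAM if `full_queues` queues each reach max_memory_size."""
    return full_queues * max_memory_size_mb


def fits_in_heap(full_queues, max_memory_size_mb, heap_mb, usable_fraction):
    """True if the worst case fits in the share of the heap assumed usable.

    usable_fraction is an illustrative assumption (e.g. 0.75), not a JVM constant.
    """
    needed = worst_case_queue_memory_mb(full_queues, max_memory_size_mb)
    return needed <= heap_mb * usable_fraction


# 10 full queues at an assumed 512 MB each is 5120 MB (5 GB).
print(worst_case_queue_memory_mb(10, 512))
# Against a 6 GB heap with an assumed 75% usable share, that does not fit.
print(fits_in_heap(10, 512, 6 * 1024, 0.75))
```

Lowering the per-queue limit shrinks the worst case proportionally, which is the point of the suggestion above.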
from kestrel.
We noticed that when it crashes, the following error is written to nohup (not the kestrel log, but the nohup.out where it was started from):
java.lang.OutOfMemoryError: Java heap space
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2427)
    at java.lang.Class.getDeclaredMethod(Class.java:1935)
    at scala.runtime.RichString.format(RichString.scala:240)
    at net.lag.kestrel.KestrelHandler$$anonfun$get$2.apply(KestrelHandler.scala:218)
    at net.lag.kestrel.KestrelHandler$$anonfun$get$2.apply(KestrelHandler.scala:212)
    at net.lag.kestrel.QueueCollection$$anonfun$remove$1.apply(QueueCollection.scala:148)
    at net.lag.kestrel.QueueCollection$$anonfun$remove$1.apply(QueueCollection.scala:142)
    at net.lag.kestrel.PersistentQueue.operateReact(PersistentQueue.scala:292)
    at net.lag.kestrel.PersistentQueue.removeReact(PersistentQueue.scala:334)
    at net.lag.kestrel.QueueCollection.remove(QueueCollection.scala:142)
    at net.lag.kestrel.KestrelHandler.get(KestrelHandler.scala:212)
    at net.lag.kestrel.KestrelHandler.net$lag$kestrel$KestrelHandler$$handle(KestrelHandler.scala:113)
    at net.lag.kestrel.KestrelHandler$$anonfun$act$1$$anonfun$apply$1.apply(KestrelHandler.scala:68)
    at net.lag.kestrel.KestrelHandler$$anonfun$act$1$$anonfun$apply$1.apply(KestrelHandler.scala:66)
    at com.twitter.actors.Reaction.run(Reaction.scala:79)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
At the time of the crash all queue sizes were relatively small (a few of them around 20-30 MB, and the rest < 500 KB).
from kestrel.
We restarted java with -verbose:gc to see what garbage collection was doing.
These are the last lines of gc.log, after around 3 hours of running (before crashing). The number of items remained constant throughout these 3 hours.
11298.686: [Full GC 1521716K->1521716K(1578432K), 1.1662680 secs]
11299.852: [Full GC 1521716K->1520693K(1578432K), 2.0338540 secs]
11301.901: [Full GC 1521726K->1521726K(1578432K), 1.1637670 secs]
11303.065: [Full GC 1521727K->1520726K(1578432K), 1.9922750 secs]
11305.077: [Full GC 1521726K->1521726K(1578432K), 1.1747770 secs]
11306.252: [Full GC 1521726K->1521726K(1578432K), 1.1726630 secs]
11307.425: [Full GC 1521726K->1521726K(1578432K), 1.1758330 secs]
11308.601: [Full GC 1521726K->1520851K(1578432K), 2.0171970 secs]
11310.622: [Full GC 1521727K->1521727K(1578432K), 1.1734600 secs]
11311.795: [Full GC 1521727K->1521727K(1578432K), 1.1749680 secs]
11312.971: [Full GC 1521727K->1521727K(1578432K), 1.1907470 secs]
11314.162: [Full GC 1521728K->1520970K(1578432K), 1.1750460 secs]
11315.340: [Full GC 1521727K->1521727K(1578432K), 1.1664440 secs]
11316.507: [Full GC 1521727K->1521727K(1578432K), 1.1724850 secs]
11317.679: [Full GC 1521727K->1521007K(1578432K), 1.1712050 secs]
11318.863: [Full GC 1521727K->1520790K(1578432K), 1.1731550 secs]
11320.039: [Full GC 1521727K->1521727K(1578432K), 1.1760910 secs]
11321.216: [Full GC 1521727K->1521727K(1578432K), 1.1724200 secs]
11322.389: [Full GC 1521727K->1521727K(1578432K), 1.1749710 secs]
11323.564: [Full GC 1521727K->1520987K(1578432K), 1.1942830 secs]
11324.761: [Full GC 1521727K->1521727K(1578432K), 1.1956870 secs]
11325.957: [Full GC 1521727K->1521727K(1578432K), 1.1730420 secs]
The entire log is here: http://pastie.org/1871509
Any idea what this means? We restarted our production environment using 1.2.4 instead of 1.2.8, to see if this changes anything (in case there's a memory leak in 1.2.8).
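One way to read a log like the one above is to total the pause times and compare them with the wall-clock span they cover. The sketch below assumes entries in exactly the format shown (timestamp, `Full GC` or `GC`, before->after(total) in K, pause in seconds); the function and field names are made up for illustration.

```python
import re

# Matches entries such as:
#   11298.686: [Full GC 1521716K->1521716K(1578432K), 1.1662680 secs]
GC_ENTRY = re.compile(
    r"(?P<ts>\d+\.\d+): \[(?P<kind>Full GC|GC) "
    r"(?P<before>\d+)K->(?P<after>\d+)K\((?P<total>\d+)K\), "
    r"(?P<pause>\d+\.\d+) secs\]"
)


def summarize_gc(log_text):
    """Summarize GC entries: count, share of time in GC, memory reclaimed."""
    entries = [m.groupdict() for m in GC_ENTRY.finditer(log_text)]
    if not entries:
        return None
    first_ts = float(entries[0]["ts"])
    last = entries[-1]
    # Timestamps mark the start of each collection, so the window ends
    # when the last collection finishes.
    window = float(last["ts"]) + float(last["pause"]) - first_ts
    pause_total = sum(float(e["pause"]) for e in entries)
    reclaimed_kb = sum(int(e["before"]) - int(e["after"]) for e in entries)
    return {
        "collections": len(entries),
        "gc_time_fraction": pause_total / window if window > 0 else 0.0,
        "reclaimed_kb": reclaimed_kb,
        "occupancy_after_last": int(last["after"]) / int(last["total"]),
    }


sample = (
    "11298.686: [Full GC 1521716K->1521716K(1578432K), 1.1662680 secs]\n"
    "11299.852: [Full GC 1521716K->1520693K(1578432K), 2.0338540 secs]\n"
)
print(summarize_gc(sample))
```

In the excerpt above the collections run back to back and each one leaves the heap about 96% occupied, so almost all wall-clock time goes to GC while almost nothing is reclaimed.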
from kestrel.
a leak is a possibility. :( [1.2 uses mina instead of netty.] but it's more likely that there just isn't enough heap space for the queues that are backing up.
you can try adding more heap space -- when java is given 6GB, it can't actually use all 6GB for the app, because of GC overhead. you can also try reducing the memory size of queues, to keep less stuff in memory.
from kestrel.
The items on our queues are being constantly processed, and are not clustering up. The queue sizes remain approximately constant while kestrel is running, so I don't see why there would be a need for more heap space.
from kestrel.
We have our kestrels monitored by ganglia, but any monitoring system will do. At worst, set up a cron to pipe kestrel's "stats" output to a file. What you want to do is see (ideally, graph) curr_items and curr_connections and correlate those to misbehavior. We run our kestrels pretty hot, and generally if one crashes, it's due to running out of file descriptors or running out of heap.
The logfile you posted makes it pretty clear that the JVM just ran out of heap, and was growing gradually the whole time.
We're currently running 1.2.2 on most machines, it looks like, so if regressing to 1.2.4 works, that would be a valuable data point that 1.2.8 has some kind of leak. (We're also in the process of upgrading to 2.1, but I'll post to the mailing list as that happens. We'll almost certainly find a few bugs as it rolls out.)
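The cron idea above could look roughly like this. It is a sketch, assuming kestrel answers the memcache-style "stats" command with `STAT <name> <value>` lines terminated by `END`, and that it listens on localhost:22133; adjust host and port to your setup. The helper names are invented for illustration.

```python
import socket


def parse_stats(response):
    """Turn 'STAT name value' lines into a dict; other lines are ignored."""
    stats = {}
    for line in response.splitlines():
        parts = line.strip().split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def fetch_stats(host="localhost", port=22133, timeout=5.0):
    """Send 'stats' over the text protocol and read until the END marker.

    Host and port are assumptions; nothing in the thread states them.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"stats\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
            if b"".join(chunks).endswith(b"END\r\n"):
                break
    return b"".join(chunks).decode("ascii", errors="replace")


def format_sample(timestamp, stats, keys=("curr_items", "curr_connections")):
    """One log line per sample: epoch seconds followed by key=value pairs."""
    fields = [f"{k}={stats.get(k, 'NA')}" for k in keys]
    return " ".join([str(int(timestamp))] + fields)
```

A cron entry could then run a small wrapper that calls `format_sample(time.time(), parse_stats(fetch_stats()))` and appends the result to a file, giving a time series of curr_items and curr_connections to line up against crash times.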
from kestrel.