LevelDB Threads (about leveldb, OPEN)

nsaadouni commented on August 26, 2024
LevelDB Threads

from leveldb.

Comments (4)

matthewvon commented on August 26, 2024
  1. The thread pool is treated as a circular list. Each scheduler thread's ID is divided by the number of threads (71), and the remainder becomes a starting index into the circular list. A prime number for the thread count reduces the possibility of two scheduler threads starting their hunt for an available worker at the same index.

  2. Each POSIX thread defaults to a stack size of 8 MB. So with 71 threads there is a default allocation of 568 MB of RAM. Not all of that space is ever used, but it is something to keep in mind when trying to manage swap and disk cache space. A "developer mode" option exists that drops the thread count to 17 so that our Erlang programmers could run 5 instances of the server on one laptop for testing.

  3. This is open source code. Change it however you please. Just note that a given vnode typically has only one write request allowed at a time. Raising the thread count may therefore only help if your server is running more than 64 vnodes.

  4. The easiest solution is to try raising the thread count to see if it helps your scenario. My guess is that it will not, but there is no harm in trying. You should post your actual symptoms against riak_core to get thoughts from the Erlang programmers. I believe you are looking at the active anti-entropy (AAE) area of Riak. It may be "functioning as designed". You need Erlang eyes to give you better insight.
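
Points 1 and 2 above can be illustrated with a small Python sketch. This is not the leveldb source (which is C++); the function names are hypothetical, and the idea that thread IDs look like page-aligned addresses is an assumption made only to show why a prime pool size helps.

```python
def start_index(thread_id: int, pool_size: int) -> int:
    """Remainder of the thread ID, used as the starting slot in the circular pool."""
    return thread_id % pool_size


def find_available_worker(busy, thread_id):
    """Scan the pool circularly from the starting slot.

    `busy` is a list of booleans, one per worker (hypothetical representation).
    Returns the index of the first idle worker, or None if all are busy.
    """
    pool_size = len(busy)
    start = start_index(thread_id, pool_size)
    for offset in range(pool_size):
        index = (start + offset) % pool_size  # wrap around the end of the list
        if not busy[index]:
            return index
    return None


def stack_reservation_mb(thread_count: int, stack_mb: int = 8) -> int:
    """Default stack reservation for the pool, assuming 8 MB per thread."""
    return thread_count * stack_mb
```

With IDs spaced 4096 apart (the alignment assumption), a pool size of 64 sends every scheduler thread to the same starting slot, while 71 spreads them across different slots. `stack_reservation_mb(71)` gives the 568 MB figure above, and `stack_reservation_mb(17)` gives 136 MB for the developer-mode count.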

nsaadouni commented on August 26, 2024

We actually work on Riak, and have gone through the AAE code several times to make sure that neither its design nor its implementation is causing this bottleneck. That is why I was asking how those threads work, as they are one of the major changes (I believe) to the leveldb dependency between Riak's older stable versions (2.0.X) and the 2.2.5 releases.

I had also read some of your posts on the leveldb wiki in the past, which described changes to leveldb causing customers who used it as their 'storage backend' to experience these net_kernel tick timeouts.

We will keep on digging into it, thanks for taking the time to explain all of the above.

matthewvon commented on August 26, 2024

The leveldb code intentionally slows down the rate of user write operations when the volume of compactions gets too high. AAE will slam 1,000 or more keys into the system at once. There is some code to reduce the impact of the intentional write slowdowns for AAE, but when leveldb is behind, the user needs to wait.
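
As a rough illustration of that kind of backpressure, here is a toy Python sketch. It is not leveldb's actual throttle: the function name, the threshold, the linear delay, and the AAE discount factor are all hypothetical values chosen for the example.

```python
def write_delay_ms(pending_compactions: int,
                   is_aae: bool = False,
                   threshold: int = 4,      # hypothetical backlog allowed before throttling
                   step_ms: float = 1.0,    # hypothetical delay added per excess compaction
                   aae_factor: float = 0.5  # hypothetical discount applied to AAE writes
                   ) -> float:
    """Delay to impose on a write, growing with the compaction backlog."""
    excess = max(0, pending_compactions - threshold)
    delay = excess * step_ms
    if is_aae:
        delay *= aae_factor  # AAE writes are slowed less, but still slowed
    return delay
```

Under these assumed numbers, a backlog of 10 compactions delays a user write by 6 ms and an AAE write by 3 ms, while a backlog at or below the threshold adds no delay.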

I suggest you look at disk write activity relative to the times you see net_kernel tick counts. By the way, I do not know of a net_kernel statistic, at least not by that name.
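
One way to line the two up, sketched in Python. The input shapes are assumptions: a list of event timestamps in seconds, and a list of `(timestamp, bytes_written)` samples collected from whatever disk monitoring tool is in use. The function name and the 10-second window are hypothetical.

```python
def writes_before_events(event_times, write_samples, window_s: float = 10.0):
    """For each event time, total the bytes written in the preceding window.

    event_times:   timestamps (seconds) of the events under investigation
    write_samples: iterable of (timestamp_seconds, bytes_written) pairs
    """
    samples = list(write_samples)
    totals = {}
    for event in event_times:
        totals[event] = sum(
            written for ts, written in samples
            if event - window_s <= ts <= event
        )
    return totals
```

Events that consistently follow windows with heavy write volume would point toward compaction or tree rebuild activity rather than the thread pool.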

matthewvon commented on August 26, 2024

Oh, and there is a known bug in the Erlang AAE code where two processes share the same leveldb iterator token. This is a really bad thing. It has led to segfault crashes. It was not fixed before Basho closed. The eleveldb code has mutexes to help defend against this bad Erlang code. Sitting on one of those mutexes could result in the waits you are seeing. I believe the impact was typically seen during an AAE tree rebuild ... which I think posts status to one of Riak's command line tools. Again, there is heavy disk activity during the tree rebuild.
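
The general defensive pattern looks something like the Python sketch below. It is not the eleveldb code (which is a C++ NIF); the class and function names are hypothetical. It only shows how a mutex serializes two callers that share one iterator, which keeps the iterator consistent at the cost of one caller waiting on the other.

```python
import threading


class GuardedIterator:
    """Wraps a shared iterator so only one caller advances it at a time."""

    def __init__(self, items):
        self._it = iter(items)
        self._lock = threading.Lock()

    def next_item(self):
        # A second caller blocks here while the first holds the lock.
        with self._lock:
            return next(self._it, None)


def drain_from_two_threads(items):
    """Advance one shared iterator from two threads; return everything seen, sorted."""
    guarded = GuardedIterator(items)
    seen = []
    seen_lock = threading.Lock()

    def worker():
        while True:
            item = guarded.next_item()
            if item is None:  # iterator exhausted (items must not contain None)
                return
            with seen_lock:
                seen.append(item)

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(seen)
```

Every item is delivered exactly once, but each call to `next_item` may wait for the other thread, which is the kind of stall described above.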
