GithubHelp home page GithubHelp logo

Adaptive TinyLFU about cachelib HOT 3 CLOSED

facebook avatar facebook commented on April 18, 2024 1
Adaptive TinyLFU

from cachelib.

Comments (3)

haowu14 avatar haowu14 commented on April 18, 2024

Hi @ben-manes , thank you for the detailed post!

I think it's a good idea to add the support of hill climbing window size change into W-TinyLFU. It's unlikely that we can implement it any time soon since TinyLFU does not have a lot of use case in Facebook. But we are more than happy to review if you can send a pull request. We can also help supply traces for evaluation in cachebench.

Regarding the additional suggestions:

  1. CacheLib supports swappable CMS implementations down to 8 bits. For 4 bits, it is quite a bit of complexity compared to the saved memory (of course this varies by systems). On the other hand I'm curious about bloom filter. Is there evidence showing bloom filter can be as good as, or achieves a good tradeoff compared to CMS? CacheLib does have a bloom filter implemented and we can add a type parameter to have the user specify whether they want to use bloom filter or which CMS.
  2. The way CacheLib DRAM LRUs handle hot updates is to set an interval of lruRefreshTime. Accesses on the same key occurring within this time interval won't update the policy more than once. I think a ring buffer can complement this but we are able to get around with this time interval.
  3. Currently CacheLib DRAM cache is slab based. So to adopt to size-aware memory management would probably be a very long term change that we need to look carefully into. Or did I misunderstand your point here?
  4. I think this could be added as an option in TinyLFU.
  5. Absolutely!

It's nice to see you here Ben! I also worked at Addepar a while back on Greenbaums team and had the chance to take over a lot of your code in the financial graph.

from cachelib.

ben-manes avatar ben-manes commented on April 18, 2024

It's unlikely that we can implement it any time soon... But we are more than happy to review if you can send a pull request.

Unfortunately I am not much of a C++ programmer, so I'd leave this to someone else if interest. What is nice is that this avoids needing developers to think about the eviction policy, since those differences mostly goes away.

On that note, your 2Q policy is misnamed. The algorithm by Johnson & Shasha uses a non-resident queue, whereas cachelib implemented OpenBSD's TU-Q policy named after Ted Unangst.

  1. I'm curious about bloom filter. Is there evidence showing bloom filter can be as good as, or achieves a good tradeoff compared to CMS?

This is actually in addition to the CMS. The idea is to avoid polluting the CMS with one-valued counters, which increases the error rate and requires more counters to compensate. Instead the CMS counters are only incremented if found in the BF. For large caches this reduces the CMS space, which at 100Ms+ entries could be a benefit. Caffeine doesn't use this trick, though, because as an application cache it is usually small enough to not be worth the trouble.

  1. The way CacheLib DRAM LRUs handle hot updates is to set an interval of lruRefreshTime.

Ahh, thanks for pointing that out. This is what memcached does too. A flaw is that it is coupled to the entry, so operations like row scans would trigger many tryLock attempts. The docs do discuss lock contention problems, so if that does become a problem again then that is a technique to consider.

  1. Currently CacheLib DRAM cache is slab based. So to adopt to size-aware memory management would probably be a very long term change that we need to look carefully into. Or did I misunderstand your point here?

Nope, thanks for pointing that out.

It's nice to see you here Ben! I also worked at Addepar a while back on Greenbaums team and had the chance to take over a lot of your code in the financial graph.

ha, what a mess. I hope you don't hold that against me, I inherited much of it. My "starter project" was to make the graph distributed instead of running all of Addepar on one big machine. I created a transaction log, replayed to "catch-up" the nodes to allow for snapshot reads, a persistent data structure for concurrent readers, and used select-for-update to exclusively row-level lock the graph during a write transaction. Since everything was dependent on that logic in a large code base, but which was designed as a SPOF, that solution gave enough breathing room to allow for an eventual rebuild. I hope by now that situation is a lot better. Anyway, small world! 🙂

from cachelib.

haowu14 avatar haowu14 commented on April 18, 2024

I'll leave this open then.
The trick of bloom filter on top of CMS is interesting. We do have the interest to reduce the overhead of how access history is tracked in another application of CMS.

Anyway, small world! 🙂

Sure it is! AMP is a piece of art. Let's leave it there.

from cachelib.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.