Comments (3)
Hi @ben-manes , thank you for the detailed post!
I think it's a good idea to add the support of hill climbing window size change into W-TinyLFU. It's unlikely that we can implement it any time soon since TinyLFU does not have a lot of use case in Facebook. But we are more than happy to review if you can send a pull request. We can also help supply traces for evaluation in cachebench.
Regarding the additional suggestions:
- CacheLib supports swappable CMS implementations down to 8 bits. For 4 bits, it is quite a bit of complexity compared to the saved memory (of course this varies by systems). On the other hand I'm curious about bloom filter. Is there evidence showing bloom filter can be as good as, or achieves a good tradeoff compared to CMS? CacheLib does have a bloom filter implemented and we can add a type parameter to have the user specify whether they want to use bloom filter or which CMS.
- The way CacheLib DRAM LRUs handle hot updates is to set an interval of lruRefreshTime. Accesses on the same key occurring within this time interval won't update the policy more than once. I think a ring buffer can complement this but we are able to get around with this time interval.
- Currently CacheLib DRAM cache is slab based. So to adopt to size-aware memory management would probably be a very long term change that we need to look carefully into. Or did I misunderstand your point here?
- I think this could be added as an option in TinyLFU.
- Absolutely!
It's nice to see you here Ben! I also worked at Addepar a while back on Greenbaums team and had the chance to take over a lot of your code in the financial graph.
from cachelib.
It's unlikely that we can implement it any time soon... But we are more than happy to review if you can send a pull request.
Unfortunately I am not much of a C++ programmer, so I'd leave this to someone else if interest. What is nice is that this avoids needing developers to think about the eviction policy, since those differences mostly goes away.
On that note, your 2Q policy is misnamed. The algorithm by Johnson & Shasha uses a non-resident queue, whereas cachelib implemented OpenBSD's TU-Q policy named after Ted Unangst.
- I'm curious about bloom filter. Is there evidence showing bloom filter can be as good as, or achieves a good tradeoff compared to CMS?
This is actually in addition to the CMS. The idea is to avoid polluting the CMS with one-valued counters, which increases the error rate and requires more counters to compensate. Instead the CMS counters are only incremented if found in the BF. For large caches this reduces the CMS space, which at 100Ms+ entries could be a benefit. Caffeine doesn't use this trick, though, because as an application cache it is usually small enough to not be worth the trouble.
- The way CacheLib DRAM LRUs handle hot updates is to set an interval of lruRefreshTime.
Ahh, thanks for pointing that out. This is what memcached does too. A flaw is that it is coupled to the entry, so operations like row scans would trigger many tryLock attempts. The docs do discuss lock contention problems, so if that does become a problem again then that is a technique to consider.
- Currently CacheLib DRAM cache is slab based. So to adopt to size-aware memory management would probably be a very long term change that we need to look carefully into. Or did I misunderstand your point here?
Nope, thanks for pointing that out.
It's nice to see you here Ben! I also worked at Addepar a while back on Greenbaums team and had the chance to take over a lot of your code in the financial graph.
ha, what a mess. I hope you don't hold that against me, I inherited much of it. My "starter project" was to make the graph distributed instead of running all of Addepar on one big machine. I created a transaction log, replayed to "catch-up" the nodes to allow for snapshot reads, a persistent data structure for concurrent readers, and used select-for-update to exclusively row-level lock the graph during a write transaction. Since everything was dependent on that logic in a large code base, but which was designed as a SPOF, that solution gave enough breathing room to allow for an eventual rebuild. I hope by now that situation is a lot better. Anyway, small world! 🙂
from cachelib.
I'll leave this open then.
The trick of bloom filter on top of CMS is interesting. We do have the interest to reduce the overhead of how access history is tracked in another application of CMS.
Anyway, small world! 🙂
Sure it is! AMP is a piece of art. Let's leave it there.
from cachelib.
Related Issues (20)
- The performance deterioration is observed in the new version
- compilation error HOT 3
- Create a version with stable facebook project dependencies HOT 1
- make_unique was not declared when compiled HOT 1
- Runtime error (seem to be caused by link error) HOT 4
- Fail to persist nvm cache due to "exceeding metadata limit" HOT 1
- Build failure - cannot find python-six HOT 2
- build.sh: error: failed to build cachelib HOT 2
- build.sh: error: No build recipe for detected operating system 'fedora38' HOT 3
- New runtime error HOT 5
- cachelib fails to build HOT 3
- allocator-test-AllocationClassTest Failed On ARM
- Where is `cachelib/facebook/admin/CacheAdmin.h`? HOT 1
- cachelib fails to build HOT 3
- refCount alignment issue HOT 4
- The method config.enableChainedItems() cannot be found. HOT 1
- Some questions in resizing the cachelib pool size HOT 2
- Is there any plan to provide an Java SDK for this cachelib ? HOT 2
- Fail to build dependency fbthrift (with errors reported in fmt) HOT 4
- make clean option for contrib/build.sh HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from cachelib.