GithubHelp home page GithubHelp logo

smolcache's Introduction

smolcache

This package contains a CLOCK based approximate-LRU cache optimized for concurrent usage. It minimizes both the number of locks held, and the number of writes to shared cache lines. In the worst case, this implementation requires at most 1 shared-cache write on a hit, not counting lock acquisition.

Issues With Concurrent LRU

Typically, LRU is implemented with a map and a doubly-linked list, all elements are stored in both the map, and the list. Whenever there's a hit on a cache-item, it's moved to the head of the list; evictions are done from the tail of the list. This ensures that it's always the least-recently-hit element that's evicted.

While conceptually simple, the standard implementation of LRU as significant issues in a multi-processor setting, since every read to the cache writes to the underlying datastructures. Updating the LRU list requires at least 4 writes on shared cache lines; 2 to unlink the element from it's old place in the list, and 2 more to add it to the head. When the LRU is being accessed from multiple cores, these cache lines will bounce among them, causing a significant slowdown.

CLOCK-based LRU

The CLOCK algorithm approximates LRU. Conceptually, it augments a map with a bitset, with each element owning one bit in the bitset. Whenever an element is hit, it's bit is set to 1. The evictor maintains a cursor into the bitset, and reads it round-robin.

                    eviction cursor
                            v
bitset: 0 1 0 0 1 1 0 1 0 0 1 1 1 1 0

Whenever the evictor sees a 1 it sets the bit to 0. Whenever the evictor see a 0, it evicts the element. This means that eviction always evicts an element that hasn't been read since the last time the evictor ran, and, since it reads the element in order, the first such element is the element that hasn't been touched the longest since the last run.

Concurrent CLOCK

CLOCK two important benefits over regular LRU in a concurrent setting; the state that changes on a hit is per-element instead of shared, and the changes made by a hit are idempotent.

Regular LRU relies on a shared list, which is manipulated every time an element is hit. CLOCK only requires a per-element bit to be set on a hit. This means that on a hit, only a single value, and with the proper layout, a single cache line, needs to be written to. Already, this reduces the worst-case number of writes to 1 from 4. We can further reduce the number of writes by observing that for an LRU to be useful, hits must be more common than evictions, and in CLOCK any successive hits before an eviction need not change any state; the bit is already set to 1. This means that we can check the bit before writing it, and do nothing if it's already set, further reducing the number of writes.

Other Optimizations

The cache is built off a sharded hashmap; values are first hashed to see which hashmap they should be stored in, then stored in an individual shard. The shards are protected by Read/Write locks, with the read side taken for cache lookups, and the write side for cache inserts and evictions. Eviction locks a single shard at a time, going in round-robin fashion until it finds an element to evict.

Each shard consists of a hashmap and linkedlist. The map is used to perform key lookups, and the list is used to ensure a consistent order for eviction checks. The actual elements and staleness-bits are stored in the individual nodes of the list, there is no separate bitset. List nodes are padded to take a cache line each to prevent false-sharing. Staleness-bits are set using atomic operations, and are only written after they have been read to be 0.

smolcache's People

Stargazers

 avatar  avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.