simple-filecache's Introduction

simple-filecache package

Comparison against related packages

  • filestore -- manages

  • filecache --

  • cached-traversable -- cache parts of a single lazy Traversable, element by element.

  • persistent-sqlite --

  • TCache --

In-memory memoization libraries:

  • uglymemo
  • data-memocombinators
  • monad-memo

simple-filecache's People

Contributors

rrnewton, parfunc

simple-filecache's Issues

Add an arbitrary-bytes store as well as a Haskell-value store

We probably want to take what we have now, move it to an Internal module, and parameterize it so we can get these two different behaviors:

  • Filecache key val -- store Haskell values, which have a Binary instance
  • Filecache key -- untyped store of raw ByteString data. It's your job to give a meaning to the bytestring.

This is necessary to store raw nvcc-produced "cubin" files in the cache. Needless to say, calling "Data.Binary.decode" on those won't work, and having to read them back into memory just to "Data.Binary.encode" would be a big waste.
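
One way the split described above might look is sketched here. This is only an illustration under stated assumptions, not the package's actual API: the names RawCache, ValCache, entryPath, storeBytes, storeFile, lookupBytes, storeValue and lookupValue are all hypothetical, and the cache is assumed to be a flat directory with one file per key. The typed layer is a thin wrapper over the raw one, and storeFile copies an existing file (such as a cubin) into the cache without reading it into a ByteString first.

```haskell
import qualified Data.ByteString      as BS
import qualified Data.ByteString.Lazy as BL
import           Data.Binary          (Binary, decodeOrFail, encode)
import           System.Directory     (copyFile, createDirectoryIfMissing, doesFileExist)
import           System.FilePath      ((</>))

-- | Hypothetical untyped store: a directory plus a way to name entries.
data RawCache key = RawCache
  { cacheDir :: FilePath
  , keyFile  :: key -> FilePath  -- ^ file name (relative to cacheDir) for a key
  }

-- | Where a key's entry lives on disk.
entryPath :: RawCache key -> key -> FilePath
entryPath c k = cacheDir c </> keyFile c k

-- | Store raw bytes; giving them a meaning is the caller's job.
storeBytes :: RawCache key -> key -> BS.ByteString -> IO ()
storeBytes c k bytes = do
  createDirectoryIfMissing True (cacheDir c)
  BS.writeFile (entryPath c k) bytes

-- | Store a file that already exists on disk, without first reading
-- the whole file into a ByteString.
storeFile :: RawCache key -> key -> FilePath -> IO ()
storeFile c k src = do
  createDirectoryIfMissing True (cacheDir c)
  copyFile src (entryPath c k)

-- | Read an entry back as raw bytes, if present.
lookupBytes :: RawCache key -> key -> IO (Maybe BS.ByteString)
lookupBytes c k = do
  let p = entryPath c k
  exists <- doesFileExist p
  if exists then Just <$> BS.readFile p else pure Nothing

-- | Typed layer: the phantom 'val' records what the bytes decode to.
newtype ValCache key val = ValCache (RawCache key)

storeValue :: Binary val => ValCache key val -> key -> val -> IO ()
storeValue (ValCache c) k = storeBytes c k . BL.toStrict . encode

-- | Returns Nothing on a miss or when the bytes do not decode.
lookupValue :: Binary val => ValCache key val -> key -> IO (Maybe val)
lookupValue (ValCache c) k = do
  mbytes <- lookupBytes c k
  pure $ case mbytes of
    Nothing -> Nothing
    Just bs -> case decodeOrFail (BL.fromStrict bs) of
      Right (_, _, v) -> Just v
      Left _          -> Nothing
```

With this shape, a client caching cubin files would only ever touch the raw layer, while clients caching Haskell values would go through ValCache and never see a ByteString.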

Questions about storeCache (from Trevor)

TLM: questions for 'storeCache'

 1. Do we want to cache the entries that have been read from disk?

 2. If so, how strict will we be in maintaining the store limits if there
    are potentially several clients putting things into the cache
    concurrently?

    a. On insertion/store, do I completely update my local 'storeCache'
       to be consistent with what is on disk, and then apply the policy?
       This sounds like an expensive operation.

    b. On insertion/store, do I not care about the global view and just
       update my local cache, and apply the policy based on that? This
       means that two clients could have local views each within the
       policy limits but the union of which would exceed said limits.

    c. Have a separate thread that in the background periodically applies
       the versioning/limits policy based on the on-disk state.

    There may be no right choice for all use cases.

    On the other hand, what is the performance penalty for not having a
    local in-memory cache, and hitting disk every time? This also depends
    on the answer to the question of whether or not 'lookup' reads the
    file from disk, or just returns the path to said file (but then, can
    we guarantee that that path will remain valid?)

 3. When we add a file to the store, should we add it to the
    'storeCache'? This particularly applies to storing a file that might
    not exist in memory at all. Additionally, maybe the user wants to
    store files to disk so that they can be _purged_ from memory, and
    this certainly would defeat that.
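
To make options 2a and 2c more concrete, here is a minimal sketch of applying a size limit directly against the on-disk state. It assumes a flat cache directory of regular files and uses oldest-modification-time-first eviction as a stand-in for whatever policy is eventually chosen. The name enforceLimit is hypothetical, and races with other clients (for example, a file disappearing between the listing and the stat) are deliberately ignored.

```haskell
import Data.List        (sortOn)
import System.Directory (createDirectoryIfMissing, getFileSize,
                         getModificationTime, listDirectory, removeFile)
import System.FilePath  ((</>))

-- | Delete the oldest entries in 'dir' until the total size is at most
-- 'maxBytes'. Returns the paths that were removed. A missing directory
-- is created and treated as an empty cache.
enforceLimit :: FilePath -> Integer -> IO [FilePath]
enforceLimit dir maxBytes = do
  createDirectoryIfMissing True dir
  names   <- listDirectory dir
  entries <- mapM stat [dir </> n | n <- names]
  let total       = sum [sz | (_, sz, _) <- entries]
      oldestFirst = sortOn (\(_, _, t) -> t) entries
  go total oldestFirst
  where
    -- Collect path, size in bytes, and modification time for one entry.
    stat p = do
      sz <- getFileSize p
      t  <- getModificationTime p
      pure (p, sz, t)

    -- Walk from oldest to newest, stopping once we are under the limit.
    go _ [] = pure []
    go total ((p, sz, _) : rest)
      | total <= maxBytes = pure []
      | otherwise = do
          removeFile p
          (p :) <$> go (total - sz) rest
```

Calling this on every insertion corresponds to option 2a and costs a full directory scan each time. Running it periodically from a thread started with forkIO corresponds to option 2c, at the price of temporarily exceeding the limit between runs.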

Idea: idle timer based generation of index

Trevor wanted to have a quick-to-check in-memory index of which keys are on disk.

If the GHC runtime had a general-purpose way to run computations when idle (like Emacs idle timers), I think generating this index would be a candidate computation that could happen when idle or waiting on IO.

Of course, it becomes stale as soon as you generate it (or during!), so you have to be willing to have an approximation of what's on disk.
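
In the absence of real idle timers, a periodic background thread is one possible stand-in. The sketch below takes that approach and is only an approximation of the idea: Index, refreshIndex, startIndexer and probablyOnDisk are hypothetical names, the cache is assumed to be a flat directory whose file names are the keys, and the answer from probablyOnDisk can be stale in either direction.

```haskell
import           Control.Concurrent (ThreadId, forkIO, threadDelay)
import           Control.Monad      (forever)
import           Data.IORef         (IORef, atomicWriteIORef, newIORef, readIORef)
import qualified Data.Set           as Set
import           System.Directory   (createDirectoryIfMissing, listDirectory)

-- | Approximate snapshot of which entry names are on disk.
type Index = IORef (Set.Set FilePath)

-- | Rebuild the snapshot from the directory listing.
refreshIndex :: FilePath -> Index -> IO ()
refreshIndex dir ref = do
  createDirectoryIfMissing True dir
  names <- listDirectory dir
  atomicWriteIORef ref (Set.fromList names)

-- | Build the index once, then refresh it every 'micros' microseconds
-- on a background thread (a timer, not a true idle hook).
startIndexer :: FilePath -> Int -> IO (Index, ThreadId)
startIndexer dir micros = do
  ref <- newIORef Set.empty
  refreshIndex dir ref
  tid <- forkIO (forever (threadDelay micros >> refreshIndex dir ref))
  pure (ref, tid)

-- | Quick in-memory check; the result may already be out of date.
probablyOnDisk :: Index -> FilePath -> IO Bool
probablyOnDisk ref name = Set.member name <$> readIORef ref
```

A caller that gets False can skip the disk entirely and recompute; a caller that gets True still has to handle the file having been evicted in the meantime.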

Use hardlinks to preserve liveness

Trevor had an idea to use hardlinks to preserve the files in the store against the LRU replacement policy.

That is, when you do a lookup, you want to get back a "thing", such that you can:

withThing thing $ \ file -> do useTheFile file

The thing could have a finalizer that removes the hardlink and thus lets the disk actually let go of the file.
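
A minimal sketch of the hardlink idea follows, assuming a POSIX system and the unix package's System.Posix.Files module. It uses bracket for scoped release rather than a finalizer, which is a simplification of what is described above, and withPinned is a hypothetical name. The link must live on the same filesystem as the cache entry.

```haskell
import Control.Exception  (bracket)
import System.Posix.Files (createLink, removeLink)

-- | Pin a cache entry by hard-linking it to 'pinPath' while the action
-- runs. Even if the cache's replacement policy unlinks 'entry' in the
-- meantime, the data stays reachable through 'pinPath' until we remove
-- the link on the way out.
withPinned :: FilePath -> FilePath -> (FilePath -> IO a) -> IO a
withPinned entry pinPath =
  bracket (createLink entry pinPath >> pure pinPath) removeLink
```

Usage mirrors the withThing shape: withPinned entry pin $ \file -> useTheFile file, where useTheFile is whatever the client does with the path.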

Sizes of files we care about

This is just somewhere to keep this information together. Add other file types as we get information (e.g. object files from icc?)

CUDA

Looking over the cache in my home directory, most of the .cubin files are <10k in size. I currently have 895 files split over several versions of accelerate, totalling 4.2 MB. This would explain why nobody has complained about accelerate-cuda's lack of a cache purge policy yet.
