iu-parfunc / simple-filecache

A simple package for caching results to disk, and then in memory.

License: Other

Haskell 100.00%

simple-filecache's Introduction

simple-filecache package

Comparison against related packages

filestore -- manages
filecache --
cached-traversable -- cache parts of a single lazy Traversable, element by element.
persistent-sqlite --
TCache --

In-memory memoization libraries:

uglymemo
data-memocombinators
monad-memo

simple-filecache's Issues

Add an arbitrary-bytes store as well as a Haskell-value store

We probably want to take what we have now, move it to an Internal module, and parameterize it so we can get these two different behaviors:

Filecache key val -- store Haskell values, which have a Binary instance
Filecache key -- untyped store of raw ByteString data. It's your job to give a meaning to the bytestring.

This is necessary to store raw nvcc-produced "cubin" files in the cache. Needless to say, calling "Data.Binary.decode" on those won't work, and having to read them back into memory just to "Data.Binary.encode" would be a big waste.
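One possible shape for that parameterization is sketched below. This is a minimal sketch under assumptions, not the package's actual API: the names `Store`, `typedStore`, `rawStore`, `insert`, and `lookupKey` are hypothetical, keys are plain `String`s used directly as file names, and the store is a single flat directory.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import Data.Binary (Binary, encode, decode)
import System.Directory (createDirectoryIfMissing, doesFileExist)
import System.FilePath ((</>))

-- | Hypothetical internal store, parameterized over how values are
-- converted to and from the bytes that live on disk.
data Store val = Store
  { storeRoot :: FilePath
  , toBytes   :: val -> B.ByteString
  , fromBytes :: B.ByteString -> val
  }

-- | Typed behavior: Haskell values that have a Binary instance.
typedStore :: Binary val => FilePath -> Store val
typedStore root = Store root (BL.toStrict . encode) (decode . BL.fromStrict)

-- | Untyped behavior: raw bytes pass through unchanged, so a cubin file
-- is never run through Data.Binary.
rawStore :: FilePath -> Store B.ByteString
rawStore root = Store root id id

-- | Write a value under a key (assumed to be a valid file name).
insert :: Store val -> String -> val -> IO ()
insert s key v = do
  createDirectoryIfMissing True (storeRoot s)
  B.writeFile (storeRoot s </> key) (toBytes s v)

-- | Read a value back, or Nothing if the key is not on disk.
lookupKey :: Store val -> String -> IO (Maybe val)
lookupKey s key = do
  let path = storeRoot s </> key
  exists <- doesFileExist path
  if exists then Just . fromBytes s <$> B.readFile path else return Nothing
```

The raw variant still reads the whole file into memory on lookup; returning the path instead would avoid that, at the cost of the path-validity question raised in the storeCache issue.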

Questions about storeCache (from Trevor)

TLM: questions for 'storeCache'

 1. Do we want to cache the entries that have been read from disk?

 2. If so, how strict will we be in maintaining the store limits if there
    are potentially several clients putting things into the cache
    concurrently?

    a. On insertion/store, do I completely update my local 'storeCache'
       to be consistent with what is on disk, and then apply the policy?
       This sounds like an expensive operation.

    b. On insertion/store, do I not care about the global view and just
       update my local cache, and apply the policy based on that? This
       means that two clients could have local views each within the
       policy limits but the union of which would exceed said limits.

    c. Have a separate thread that in the background periodically applies
       the versioning/limits policy based on the on-disk state.

    There may be no right choice for all use cases.

    On the other hand, what is the performance penalty for not having a
    local in-memory cache, and hitting disk every time? This also depends
    on the answer to the question of whether or not 'lookup' reads the
    file from disk, or just returns the path to said file (but then, can
    we guarantee that that path will remain valid?)

 3. When we add a file to the store, should we add it to the
    'storeCache'? This particularly applies to storing a file that might
    not exist in memory at all. Additionally, maybe the user wants to
    store files to disk so that they can be _purged_ from memory, and
    this certainly would defeat that.
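To make option 2b concrete, here is a minimal sketch of a purely local policy. Everything in it is an assumption for illustration: the `LocalView` type, the `insertLocal` function, a byte-size limit, LRU eviction, and a caller-supplied logical clock (`tick`) that increases on every use.

```haskell
import qualified Data.Map.Strict as M
import Data.List (sortOn)

-- | Hypothetical local view of the store: key -> (last-use tick, size in bytes).
type LocalView = M.Map String (Int, Int)

-- | Total bytes this client believes are in the store.
totalSize :: LocalView -> Int
totalSize = sum . map snd . M.elems

-- | Record an insertion, then evict least-recently-used entries until the
-- local total is within the limit. Returns the new view and the keys this
-- client would delete from disk. The last remaining entry is never evicted.
insertLocal :: Int -> Int -> String -> Int -> LocalView -> (LocalView, [String])
insertLocal limit tick key size view = go (M.insert key (tick, size) view) []
  where
    go v evicted
      | totalSize v <= limit || M.size v <= 1 = (v, reverse evicted)
      | otherwise =
          case sortOn (fst . snd) (M.toList v) of
            (oldest, _) : _ -> go (M.delete oldest v) (oldest : evicted)
            []              -> (v, reverse evicted)
```

Because each client only consults its own `LocalView`, two clients can each stay under `limit` while the directory as a whole exceeds it, which is exactly the trade-off described in 2b.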

Idea: idle timer based generation of index

Trevor wanted to have a quick-to-check in-memory index of which keys are on disk.

If the GHC runtime had a general-purpose way to run computations when idle (like Emacs idle timers), I think generating this index would be a candidate computation that could happen when idle or waiting on IO.

Of course, it becomes stale as soon as you generate it (or during!), so you have to be willing to have an approximation of what's on disk.
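Absent a real idle hook, a periodic background thread is one approximation. The sketch below swaps in that technique and is not an idle timer; the names `Index`, `refreshIndex`, `startIndexer`, and `probablyOnDisk` are hypothetical, and it assumes one file per key in a flat directory.

```haskell
import Control.Concurrent (ThreadId, forkIO, threadDelay)
import Control.Monad (forever)
import Data.IORef (IORef, atomicWriteIORef, newIORef, readIORef)
import qualified Data.Set as S
import System.Directory (listDirectory)

-- | Hypothetical in-memory index: the set of key file names last seen on disk.
type Index = IORef (S.Set FilePath)

-- | Rescan the store directory and replace the index in one atomic write.
refreshIndex :: FilePath -> Index -> IO ()
refreshIndex root ref = do
  names <- listDirectory root
  atomicWriteIORef ref (S.fromList names)

-- | Start a background thread that rescans every 'periodMicros' microseconds.
-- If the directory cannot be listed, the exception ends the thread.
startIndexer :: Int -> FilePath -> IO (Index, ThreadId)
startIndexer periodMicros root = do
  ref <- newIORef S.empty
  tid <- forkIO $ forever $ do
    refreshIndex root ref
    threadDelay periodMicros
  return (ref, tid)

-- | Quick check against the possibly stale index. A True answer is only a
-- hint; the file may have been removed since the last scan.
probablyOnDisk :: Index -> FilePath -> IO Bool
probablyOnDisk ref key = S.member key <$> readIORef ref
```

Callers should treat a hit as "worth trying the disk" and a miss as "probably absent", since the index is stale by construction.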

Use hardlinks to preserve liveness

Trevor had an idea to use hardlinks to preserve the files in the store against the LRU replacement policy.

That is, when you do a lookup, you want to get back a "thing", such that you can:

withThing thing $ \ file -> do useTheFile file

The thing could have a finalizer that removes the hardlink, which lets the filesystem actually release the file once no other link to it remains.
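A minimal sketch of the hardlink idea follows, assuming a POSIX system and that the link lives on the same filesystem as the store. It uses `bracket` for scoped cleanup rather than a finalizer, since GHC does not guarantee when (or whether) finalizers run. `Pinned`, `pin`, `unpin`, and `withPinned` are hypothetical names; `createLink` and `removeLink` come from the `unix` package.

```haskell
import Control.Exception (bracket)
import System.Posix.Files (createLink, removeLink)

-- | Hypothetical handle: a private hardlink that keeps the file's data alive
-- even if the store's replacement policy unlinks the original name.
newtype Pinned = Pinned FilePath

-- | Create a second name for the stored file.
pin :: FilePath -> FilePath -> IO Pinned
pin storePath linkPath = do
  createLink storePath linkPath
  return (Pinned linkPath)

-- | Drop our name; the data is freed once no links remain.
unpin :: Pinned -> IO ()
unpin (Pinned p) = removeLink p

-- | Run an action with a path that stays valid for the action's duration,
-- removing the link afterwards even if the action throws.
withPinned :: FilePath -> FilePath -> (FilePath -> IO a) -> IO a
withPinned storePath linkPath use =
  bracket (pin storePath linkPath) unpin (\(Pinned p) -> use p)
```

There is still a window between lookup and `createLink` in which the store could remove the file; `createLink` would then fail and the caller would have to treat that as a cache miss.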

Sizes of files we care about

This is just somewhere to keep this information together. Add other file types as we get information (e.g. object files from icc?)

CUDA

Looking over the cache in my home directory, most of the .cubin files are <10k in size. I currently have 895 files split over several versions of accelerate, totalling 4.2 MB. This would explain why nobody has complained about accelerate-cuda's lack of cache purge policy yet.
