Comments (9)
A little context; here's the cache table:
sql('CREATE TABLE IF NOT EXISTS Cache ('
' rowid INTEGER PRIMARY KEY,'
' key BLOB,'
' raw INTEGER,'
' version INTEGER DEFAULT 0,'
' store_time REAL,'
' expire_time REAL,'
' access_time REAL,'
' access_count INTEGER DEFAULT 0,'
' tag BLOB,'
' size INTEGER DEFAULT 0,'
' mode INTEGER DEFAULT 0,'
' filename TEXT,'
' value BLOB)'
)
Nothing uses the "version" field now. That should be ignored.
The Disk uses the "key", "raw", "size", "mode", "filename", and "value" fields.
So the metadata is "store_time", "expire_time", "access_time" (used only by LRU eviction policy) and "access_count" (used only by LFU eviction policy).
I think you want to only change the "expire_time". I wonder if there's value in allowing users to change the "store_time", "tag" and other metadata-ish fields.
When you have time, write a snippet illustrating how you would want it to work and we can iterate from there.
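
As a starting point for that iteration, here is a minimal sketch of changing only "expire_time" for one key. It runs against a cut-down copy of the Cache table using the standard sqlite3 module; the helper name set_expire_time and its parameters are hypothetical and are not part of the diskcache API.

```python
import sqlite3
import time


def set_expire_time(con, key, expire, now=None):
    """Hypothetical helper: reset only expire_time for one key.

    `expire` is seconds from `now`; None means the item never expires.
    Returns True if a row was updated.
    """
    now = time.time() if now is None else now
    expire_time = None if expire is None else now + expire
    cur = con.execute(
        'UPDATE Cache SET expire_time = ? WHERE key = ?',
        (expire_time, key),
    )
    con.commit()
    return cur.rowcount == 1


# Cut-down version of the Cache table, enough to show the metadata fields.
con = sqlite3.connect(':memory:')
con.execute(
    'CREATE TABLE Cache ('
    ' rowid INTEGER PRIMARY KEY,'
    ' key BLOB,'
    ' store_time REAL,'
    ' expire_time REAL,'
    ' access_time REAL,'
    ' value BLOB)'
)
con.execute(
    'INSERT INTO Cache (key, store_time, expire_time, access_time, value)'
    ' VALUES (?, ?, ?, ?, ?)',
    (b'k', 1000.0, 1060.0, 1000.0, b'v'),
)

changed = set_expire_time(con, b'k', expire=3600, now=2000.0)
row = con.execute(
    'SELECT store_time, expire_time FROM Cache WHERE key = ?', (b'k',)
).fetchone()
```

Only "expire_time" moves here; "store_time" and the value are left alone, which matches the narrow version of the request.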
from python-diskcache.
Related: #56
from python-diskcache.
For our use case, I'd like to have all eviction happen in a separate, nightly cron process.
Can we have one cache handle with all expiration turned off, and then only the culling cron will have the specific limitations we want enabled, and then we call .expire()?
from python-diskcache.
Yes. You would set the cull_limit to 0 (https://github.com/grantjenks/python-diskcache/blob/v2.9.0/diskcache/core.py#L66) then add a new eviction policy (https://github.com/grantjenks/python-diskcache/blob/v2.9.0/diskcache/core.py#L82)
I'm about to add ".cull()" in addition to ".expire()". See #52 for background. Currently, "expire" only removes items that have expired. It does not meet size constraints. The idea of "cull()" will be to remove expired items and then apply the eviction policy to meet size constraints.
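
To make the difference concrete, here is a rough sketch of the two behaviours on a simplified Cache table, using plain sqlite3. The function bodies, the size_limit parameter, and the least-recently-used ordering are assumptions for illustration, not the actual diskcache implementation.

```python
import sqlite3


def make_cache():
    """Build a tiny in-memory stand-in for the Cache table."""
    con = sqlite3.connect(':memory:')
    con.execute(
        'CREATE TABLE Cache ('
        ' rowid INTEGER PRIMARY KEY,'
        ' key TEXT, expire_time REAL, access_time REAL, size INTEGER)'
    )
    con.executemany(
        'INSERT INTO Cache (key, expire_time, access_time, size)'
        ' VALUES (?, ?, ?, ?)',
        [('a', 50.0, 10.0, 40), ('b', None, 20.0, 40), ('c', None, 30.0, 40)],
    )
    return con


def expire(con, now):
    """Remove only items whose expire_time has passed."""
    cur = con.execute(
        'DELETE FROM Cache'
        ' WHERE expire_time IS NOT NULL AND expire_time < ?',
        (now,),
    )
    return cur.rowcount


def cull(con, now, size_limit):
    """Remove expired items, then evict (LRU here) until under size_limit."""
    count = expire(con, now)
    while True:
        (volume,) = con.execute(
            'SELECT COALESCE(SUM(size), 0) FROM Cache'
        ).fetchone()
        if volume <= size_limit:
            break
        row = con.execute(
            'SELECT rowid FROM Cache ORDER BY access_time LIMIT 1'
        ).fetchone()
        if row is None:
            break
        con.execute('DELETE FROM Cache WHERE rowid = ?', row)
        count += 1
    return count


def keys(con):
    return [k for (k,) in con.execute('SELECT key FROM Cache ORDER BY key')]


first = make_cache()
expired_only = expire(first, now=100.0)

second = make_cache()
culled = cull(second, now=100.0, size_limit=50)
```

With the same data, expire leaves "b" and "c" in place even though they exceed the limit together, while cull also evicts "b" as the least recently used item.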
from python-diskcache.
So the new eviction policy would be something like...
'least-recently-used-older-than-90': {
    'init': (
        'CREATE INDEX IF NOT EXISTS Cache_access_time ON'
        ' Cache (access_time)'
    ),
    'get': 'access_time = ((julianday("now") - 2440587.5) * 86400.0)',
    'cull': (
        'SELECT %s FROM Cache'
        ' WHERE access_time < ((julianday("now") - 2440587.5 - 90) * 86400.0)'
        ' ORDER BY access_time LIMIT ?'
    ),
},
then? (Edit: fixed clause ordering)
from python-diskcache.
That's about right. I think you have to put the "WHERE ..." clause before the "ORDER BY ..." clause though.
The 'cull' key is used in only one place at https://github.com/grantjenks/python-diskcache/blob/master/diskcache/core.py#L705. My only concern, looking at that code, is that two queries are made and I don't think we could guarantee that they return the same rows because your concept of "now" changes slightly between the two queries. Maybe it should be changed to use format strings with the same "now=now" passed into each of these queries.
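
Roughly what I have in mind, as a sketch only: the policy string takes new-style format fields, and the same "now" value is formatted into both the SELECT and the DELETE so they agree on the cutoff. The {fields} and {now} placeholder names, the CULL constant, and the function name are assumptions here, and the table is a simplified stand-in.

```python
import sqlite3
import time

# Hypothetical new-style policy string: one template, formatted twice.
CULL = (
    'SELECT {fields} FROM Cache'
    ' WHERE access_time < {now} - 90 * 86400.0'
    ' ORDER BY access_time LIMIT ?'
)


def cull_older_than_90_days(con, cull_limit, now=None):
    """Select and delete the same rows by sharing one `now` value."""
    now = time.time() if now is None else now
    select_filename = CULL.format(fields='filename', now=now)
    rows = con.execute(select_filename, (cull_limit,)).fetchall()
    if rows:
        delete = 'DELETE FROM Cache WHERE rowid IN (%s)' % CULL.format(
            fields='rowid', now=now
        )
        con.execute(delete, (cull_limit,))
    return [filename for (filename,) in rows]


DAY = 86400.0
con = sqlite3.connect(':memory:')
con.execute(
    'CREATE TABLE Cache ('
    ' rowid INTEGER PRIMARY KEY, access_time REAL, filename TEXT)'
)
con.executemany(
    'INSERT INTO Cache (access_time, filename) VALUES (?, ?)',
    [(5 * DAY, 'old.val'), (50 * DAY, 'new.val')],
)

# Pretend "now" is day 100: only rows last accessed before day 10 qualify.
removed = cull_older_than_90_days(con, cull_limit=10, now=100 * DAY)
remaining = [f for (f,) in con.execute('SELECT filename FROM Cache')]
```

Because the cutoff is computed from a single value rather than from julianday("now") in each statement, a row cannot slip across the 90-day boundary between the two queries.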
from python-diskcache.
Yeah, I see the timing window where something isn't in rows because that got queried first, but is in the DELETE query, because that happens a short time later.
Baking in now=time.time() or w/e (edit: right, now is a param to _cull, gotcha) to the cull SQL could work, but it seems like that might break existing custom eviction policies. Is that actually an issue?
from python-diskcache.
I don't think it's a big issue but I'm willing to bump to v3 to get new-style string format parameters. I can't remember why I chose the old-style.
v3 issues/features: #60
from python-diskcache.
Committed at 24fadab. To be deployed in v3.
from python-diskcache.