Description: We currently use temporary storage directly through th

Use a manager/proxy interface to access temporary storage (temporary directory)

nameexhaustion commented on June 23, 2024

Use a manager/proxy interface to access temporary storage (temporary directory)


Comments (2)

ritchie46 commented on June 23, 2024

This might run on our tokio runtime. Then we could spawn a static task (one that runs for the duration of the polars process) that sleeps most of the time and garbage collects once in a while.
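A minimal sketch of that sleep-then-collect loop, written in Python with a daemon thread standing in for a tokio task. The names `start_janitor`, `collect` and `interval_seconds` are hypothetical and not part of polars.

```python
import threading


def start_janitor(collect, interval_seconds):
    """Start a daemon thread that calls `collect()` once per interval.

    Returns the stop event and the thread so the caller can shut it down.
    """
    stop = threading.Event()

    def run():
        # Event.wait returns False when the timeout elapses (time to collect)
        # and True once `stop` has been set (time to exit).
        while not stop.wait(interval_seconds):
            collect()

    # daemon=True: the thread never keeps the process alive on its own.
    thread = threading.Thread(target=run, name="tmp-storage-janitor", daemon=True)
    thread.start()
    return stop, thread
```

Waiting on an event rather than calling `time.sleep` lets the task be stopped promptly instead of only after the next wake-up.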


ritchie46 commented on June 23, 2024

Alright, did a brainstorm. I think we have got some ideas.

Assuming our spill/cache directory is ~/.polars/.

We can dump spilled files under a folder named by a combination of the process ID and the current datetime. This folder can hold future spill files.
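As a rough illustration of the folder naming, here is a Python sketch. The helper `make_spill_dir`, the underscore separator and the timestamp format are assumptions; the thread only says the name combines process ID and datetime.

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def make_spill_dir(root: Path) -> Path:
    """Create and return `<root>/<pid>_<UTC timestamp>/` for this process."""
    # Assumed format: compact UTC timestamp, e.g. 20240623T101500.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")
    spill_dir = root / f"{os.getpid()}_{stamp}"
    spill_dir.mkdir(parents=True, exist_ok=True)
    return spill_dir
```

Putting the PID first makes it cheap for a later cleanup pass to parse the owner of a folder from its name.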

For file caching we should provide a time-to-live (TTL). The TTL could, for instance, be 1 day for files downloaded from the internet.

During startup we create a task that checks for old pid_datetime folders whose process is no longer alive (interrupted process) and for files that have surpassed their TTL, and cleans them up.

~/.polars/
    # Spills from the streaming engine. For future reference
    pid_datetime/
    pid_datetime/
    # files with a TTL
    cache/
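The startup cleanup over this layout could look roughly like the Python sketch below. `clean_storage`, `pid_is_alive` and `ttl_seconds` are hypothetical names; the liveness probe is POSIX-only, and file age is taken from the modification time, which is an assumption about how the TTL would be measured.

```python
import os
import shutil
import time
from pathlib import Path


def pid_is_alive(pid: int) -> bool:
    """POSIX-only probe: signal 0 checks that the PID exists without signalling it."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def clean_storage(root: Path, ttl_seconds: float, is_alive=pid_is_alive) -> list:
    """Remove spill folders of dead processes and cache files older than the TTL."""
    removed = []
    # Snapshot the listing first, since entries are deleted while looping.
    for entry in list(root.iterdir()):
        if entry.name == "cache" and entry.is_dir():
            cutoff = time.time() - ttl_seconds
            for cached in list(entry.iterdir()):
                if cached.is_file() and cached.stat().st_mtime < cutoff:
                    cached.unlink()
                    removed.append(cached)
            continue
        # Spill folders are assumed to be named "<pid>_<datetime>".
        pid_text = entry.name.split("_", 1)[0]
        if entry.is_dir() and pid_text.isdigit() and not is_alive(int(pid_text)):
            shutil.rmtree(entry)
            removed.append(entry)
    return removed
```

PIDs can be reused by the operating system, so a live PID does not prove the original process is still running; the datetime part of the folder name could be compared against the process start time to tighten the check.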

The spill manager can be a static struct that initially only deals with the downloads, caching and cleanup. I think that we should set an in-progress bit during downloading so that we don't start duplicate downloads.
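One way to picture the flag that prevents duplicate downloads is a lock-protected set of URLs currently being fetched. This is a Python sketch only; `DownloadTracker`, `try_begin` and `finish` are hypothetical names, and it covers a single process, not several processes sharing the same directory.

```python
import threading


class DownloadTracker:
    """Tracks which URLs are currently being downloaded in this process."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_progress = set()

    def try_begin(self, url: str) -> bool:
        """Mark `url` as downloading; return False if it already is."""
        with self._lock:
            if url in self._in_progress:
                return False
            self._in_progress.add(url)
            return True

    def finish(self, url: str) -> None:
        """Clear the mark once the download has completed or failed."""
        with self._lock:
            self._in_progress.discard(url)
```

A caller would wrap the download in `try`/`finally` so that `finish` always runs, otherwise a failed download would block later attempts for the same URL.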

