We realized that most eko operations do not depend on


Memory saving output about eko HOT 8 CLOSED

nnpdf commented on July 21, 2024

Memory saving output

from eko.

Comments (8)

felixhekhorn commented on July 21, 2024

separate Q2 storage: it will be {q2}.npy

at first I wanted to point out that we should use the byte representation, but then I thought maybe we could actually use this to introduce an approximation: we could use log(Q2) with a fixed precision (say 4 digits) and this way save some computations ...
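As a rough illustration of the fixed-precision idea, the sketch below builds a file label from log(Q2) rounded to 4 digits, so that scales agreeing to that precision share one label. The helper names `q2_key` and `q2_filename` are hypothetical and not part of eko.

```python
import math


def q2_key(q2: float, digits: int = 4) -> str:
    """Hypothetical helper: label a scale by log(Q2) with fixed precision."""
    if q2 <= 0:
        raise ValueError("Q2 must be positive to take its logarithm")
    # Adding 0.0 turns a rounded -0.0 into 0.0, so the label never reads "-0.0000".
    value = round(math.log(q2), digits) + 0.0
    return f"{value:.{digits}f}"


def q2_filename(q2: float, digits: int = 4) -> str:
    """Assumed naming scheme: <rounded log(Q2)>.npy"""
    return f"{q2_key(q2, digits)}.npy"
```

With this scheme two requests for nearly identical scales resolve to the same file, which is where the saved computations would come from; the price is that the exact Q2 can no longer be recovered from the name.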

merge separately computed (but input compatible) outputs

before being able to merge full output objects, we first need to be able to extend existing outputs
a simple condition for this is to dump the theory and operators card into the archive
such that we can do Output.load("a.tar").get(2.)
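A minimal sketch of what dumping the cards next to the operators could look like, assuming a plain tar archive. Everything here is a stand-in: `OutputSketch`, `dump`, the member names `theory.json` and `operators.json`, and the use of JSON instead of the real card and `.npy` formats are all hypothetical, chosen only to keep the example self-contained.

```python
import io
import json
import tarfile


def _add_json(tar, name, payload):
    """Write a JSON-serializable payload into an open tar archive."""
    data = json.dumps(payload).encode()
    info = tarfile.TarInfo(name=name)
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))


def dump(path, theory, operators_card, operators):
    """Store both cards and every operator in one archive (assumed layout)."""
    with tarfile.open(path, "w") as tar:
        _add_json(tar, "theory.json", theory)
        _add_json(tar, "operators.json", operators_card)
        for q2, op in operators.items():
            _add_json(tar, f"operators/{float(q2)!r}.json", op)


class OutputSketch:
    """Hypothetical reader that opens the archive only when asked."""

    def __init__(self, path):
        self.path = path

    @classmethod
    def load(cls, path):
        return cls(path)

    def _read(self, name):
        with tarfile.open(self.path) as tar:
            return json.load(tar.extractfile(name))

    def cards(self):
        """Return the (theory, operators) cards stored with the output."""
        return self._read("theory.json"), self._read("operators.json")

    def get(self, q2):
        """Load a single operator without reading the others."""
        return self._read(f"operators/{float(q2)!r}.json")
```

Because the cards travel with the archive, a later job can check them before adding new operators to the same file.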

support separately threshold operators and partial Q2 in Output

actually we could even join them only at loading time ...

they are also dumped on disk, with their own names: thresholds.npz

I'd rather put them in a separate sub-directory (just to keep them out of the way) - or we could even add them as regular operators, since after all they are regular operators leading to exactly the threshold (in the upper scheme)

alecandido commented on July 21, 2024

at first I wanted to point out that we should use the byte representation, but then I thought maybe we could actually use this to introduce an approximation: we could use log(Q2) with a fixed precision (say 4 digits) and this way save some computations ...

This is fine, but I wonder if Q2 might be simpler than log(Q2): I know the log is more relevant, but the other way it's easier to inspect manually, since you require fewer operations (but we can think about it).

before being able to merge full output objects, we first need to be able to extend existing outputs

I was just thinking the other way round: the moment we can merge, we use this to extend.

a simple condition for this is to dump the theory and operators card into the archive

I thought it was done, but maybe I'm only doing it for yadism...

actually we could even join them only at loading time ...

I'm not sure about this: it's easier, because in order to extend you just need to drop more .npy files into the archive, but it might be expensive. We can benchmark how long it takes: if it's negligible we can do it at loading time, but if the product is expensive, I would precompute it (or perhaps do it JIT: the first time you need a product you compute and dump it, unless you call .compile() or something similar, which does it AOT for all of them).
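The JIT versus AOT trade-off can be sketched as follows. This is only an illustration under stated assumptions: `JoinedOperators` is a hypothetical class, operators are plain nested lists rather than NumPy arrays, the cache lives in memory instead of being dumped to disk, and the joined operator is assumed to be the partial operator applied after the threshold one.

```python
def matmul(a, b):
    """Plain-Python matrix product, standing in for a NumPy matmul."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class JoinedOperators:
    """Hypothetical container joining partial and threshold operators lazily."""

    def __init__(self, threshold, partials):
        self.threshold = threshold
        self.partials = dict(partials)  # maps Q2 -> partial operator
        self._joined = {}  # cache; a real version might dump to disk instead
        self.products_computed = 0

    def get(self, q2):
        """JIT: compute the product on first access, then reuse it."""
        if q2 not in self._joined:
            self._joined[q2] = matmul(self.partials[q2], self.threshold)
            self.products_computed += 1
        return self._joined[q2]

    def compile(self):
        """AOT: compute every product that has not been requested yet."""
        for q2 in self.partials:
            self.get(q2)
```

The `products_computed` counter is there only to make the cost visible, which is the quantity a benchmark would need to weigh against the convenience of joining at loading time.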

I'd rather put them in a separate sub-directory (just to keep them out of the way) - or we could even add them as regular operators, since after all they are regular operators leading to exactly the threshold (in the upper scheme)

I was thinking of storing patches and matching separately, but maybe you're right and there is no purpose. If we store them as regular operators, we should mark them as thresholds in the metadata.

alecandido commented on July 21, 2024

We have a detailed plan for the output:

Plan

The folder will contain:

metadata.yaml, containing output metadata (already present)
runcards folder, containing all the runcards needed to reproduce the output
- in case of products and later manipulations, all the involved runcards are saved to this folder, and the full history to reproduce is written in metadata.yaml
recipes folder: this will contain brief YAML files with the recipes for jobs computing parts
parts folder: containing all the partial outputs that have to be Mellin integrated
- partial patches
- matching conditions
operators folder: containing the final results, after combining the parts
- the file names will be the binary representation of the Q2 floats + extension (all files here will be valid evolution operators)
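One possible reading of "binary representation of the Q2 floats" is the hex encoding of the 8-byte IEEE 754 double, which round-trips exactly. The sketch below assumes that reading; the big-endian byte order, the hex encoding, the `.npy` extension and the helper names are assumptions, not the format eko settled on.

```python
import struct


def q2_to_filename(q2: float, ext: str = ".npy") -> str:
    """Encode a Q2 value as the hex form of its big-endian double bytes."""
    return struct.pack(">d", q2).hex() + ext


def filename_to_q2(name: str, ext: str = ".npy") -> float:
    """Recover the exact Q2 value from a name produced by q2_to_filename."""
    stem = name[: -len(ext)] if ext and name.endswith(ext) else name
    return struct.unpack(">d", bytes.fromhex(stem))[0]
```

For example, `q2_to_filename(2.0)` gives `4000000000000000.npy`. Unlike a decimal string, this name loses no precision, at the cost of being harder to read by eye.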

alecandido commented on July 21, 2024

This is also more long term than #138 at this point: the core part of the memory structure has been done in #105, while the computation will be addressed in #138 (which will include OperatorGrid removal and split threshold operators).

Every other feature here is considered a "nice to have", but no more.

alecandido commented on July 21, 2024

@felixhekhorn the part strictly addressing the title is already implemented, and most of the rest will be implemented as a consequence of #138

The only two elements that are falling outside #138 and contained here are:

merge separately computed (but input compatible) outputs

split a single output into multiple ones

not strictly needed, but it's dual to the former one, and so nice to have
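To make the duality concrete, here is a small sketch of both operations on a toy output. The dictionary layout with `theory`, `settings` and `operators` keys is hypothetical, and "input compatible" is assumed to mean identical theory and settings; the real criterion may be looser.

```python
def merge(a, b):
    """Combine two outputs computed from the same inputs at different Q2."""
    if a["theory"] != b["theory"] or a["settings"] != b["settings"]:
        raise ValueError("outputs are not input compatible")
    overlap = a["operators"].keys() & b["operators"].keys()
    if overlap:
        raise ValueError(f"operators present in both outputs: {sorted(overlap)}")
    return {
        "theory": a["theory"],
        "settings": a["settings"],
        "operators": {**a["operators"], **b["operators"]},
    }


def split(output, q2s):
    """Split one output in two: the requested Q2 values and the rest."""
    selected = set(q2s)
    missing = selected - output["operators"].keys()
    if missing:
        raise KeyError(f"Q2 values not in output: {sorted(missing)}")

    def part(keep):
        ops = {q2: op for q2, op in output["operators"].items() if (q2 in selected) == keep}
        return {"theory": output["theory"], "settings": output["settings"], "operators": ops}

    return part(True), part(False)
```

Under these assumptions, splitting an output and merging the two pieces returns the same set of operators, which is what makes the pair dual.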

Should we keep this issue for them?

felixhekhorn commented on July 21, 2024

I still consider the items relevant - maybe we can put them into a new issue (with a clearer title)?

alecandido commented on July 21, 2024

Yes, maybe that's the way. If you open the new issue, feel free to close this one, otherwise I'll do it at some point.

felixhekhorn commented on July 21, 2024

Closed in favor of #193
