Comments (4)

rrtoledo commented on August 14, 2024

@ed255 do you know which FFT algo we are using?

With the Cooley–Tukey FFT algorithm, we pad the evaluations to the next power of 2 before doing iFFTs (as FFTs on power-of-2 sizes are much cheaper).

We should be using iFFTs directly for the commitment but also for polynomial multiplications and so the end polynomial degree can be much higher (unless you already posted that in the stats PR).

So memory-wise, we may have higher gain than what you said.

from zkevm-circuits.

hero78119 commented on August 14, 2024

Note that the benchmarks are run with k = 26 and 32 chunks.

What do you think about another straw-man idea: trading a smaller k for a larger number of chunks?
Ideally, if we reduce k by one, memory consumption during proof generation should also be cut nearly in half.
Under the current aggregation scheme, if a single prover generates all chunk proofs, the prover state for each chunk can be discarded once its proof has been generated.
If we adopt multiple provers, the overall latency should also improve thanks to parallelism.
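
To put rough numbers on the memory claim, here is a minimal sketch of how the size of a single column scales with k. The 32 bytes per field element is an assumption (it fits a roughly 256-bit scalar field), and `column_bytes` is a hypothetical helper, not part of halo2 or zkevm-circuits.

```rust
/// Bytes needed to store one column (one polynomial) with 2^k rows.
/// `bytes_per_element` is an assumption supplied by the caller;
/// 32 fits a ~256-bit scalar field.
fn column_bytes(k: u32, bytes_per_element: u64) -> u64 {
    (1u64 << k) * bytes_per_element
}

/// Ratio between the column size at k and at k - 1.
/// Always 2 for k >= 1, which is where "nearly half" comes from.
fn shrink_factor(k: u32, bytes_per_element: u64) -> u64 {
    assert!(k >= 1, "k must be at least 1");
    column_bytes(k, bytes_per_element) / column_bytes(k - 1, bytes_per_element)
}
```

Under these assumptions, `column_bytes(26, 32)` is 2 GiB per column and `column_bytes(25, 32)` is 1 GiB. Real usage would only be "nearly" halved, since fixed-size allocations do not shrink with k.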

ed255 commented on August 14, 2024

@ed255 do you know which FFT algo we are using?

With the Cooley–Tukey FFT algorithm, we pad the evaluations to the next power of 2 before doing iFFTs (as FFTs on power-of-2 sizes are much cheaper).

In halo2 all polynomials are stored in vectors, and these vectors are always preallocated with power-of-2 sizes, so I would say the padding happens implicitly (it's just elements in the vector that are not assigned and are 0 by default).
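
A minimal sketch of this implicit padding, using `u64` as a stand-in for field elements. Both `padded_column` and `min_k` are hypothetical helpers for illustration and are not halo2 APIs.

```rust
/// Allocate a zero-filled column of 2^k entries and copy in the assigned
/// values. The unassigned tail stays 0, which is the implicit padding.
fn padded_column(assigned: &[u64], k: u32) -> Vec<u64> {
    let n = 1usize << k;
    assert!(assigned.len() <= n, "too many values for 2^k rows");
    let mut column = vec![0u64; n];
    column[..assigned.len()].copy_from_slice(assigned);
    column
}

/// Smallest k such that 2^k >= len.
fn min_k(len: usize) -> u32 {
    len.next_power_of_two().trailing_zeros()
}
```

For example, five assigned values need `min_k(5) == 3`, and `padded_column(&[1, 2, 3, 4, 5], 3)` yields eight entries with three trailing zeros.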

We should be using iFFTs directly for the commitment but also for polynomial multiplications and so the end polynomial degree can be much higher (unless you already posted that in the stats PR).

The stats PR already considers the polynomials in the extended domain (which depends on the max expression degree). Is this what you mean? I believe the biggest source of memory consumption comes from the polynomials in the extended domain.
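
As a rough sketch of why the extended domain dominates: its size grows with the max expression degree. The rule below (the extended domain must hold at least 2^k · (max_degree − 1) evaluations, rounded up to a power of 2) is an assumption modeled on how halo2 is understood to size its extended domain; `extended_k` and `extended_blowup` are hypothetical names.

```rust
/// log2 of the extended domain size for a circuit with 2^k rows,
/// assuming the extended domain must hold 2^k * (max_degree - 1)
/// evaluations, rounded up to the next power of 2.
fn extended_k(k: u32, max_degree: u32) -> u32 {
    assert!(max_degree >= 1, "max_degree must be at least 1");
    let quotient_degree = (max_degree - 1) as u64;
    let n = 1u64 << k;
    let mut ext_k = k;
    while (1u64 << ext_k) < n * quotient_degree {
        ext_k += 1;
    }
    ext_k
}

/// How many times larger an extended-domain polynomial is than a
/// polynomial over the original 2^k domain.
fn extended_blowup(k: u32, max_degree: u32) -> u64 {
    1u64 << (extended_k(k, max_degree) - k)
}
```

Under this assumption, a max degree of 9 at k = 26 gives an extended domain of 2^29, so each extended polynomial is 8 times the size of a regular column, and a max degree of 10 pushes it to 2^30.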

So memory-wise, we may have higher gain than what you said.

On a related note, the numbers of the stats utility are theoretical. In practice the memory usage of the process may be higher due to:

  • allocations that we didn't consider, for example coming from iterators? (but maybe we could study those if they appear and try to fix them)
  • things we may have missed?

Note that the benchmarks are run with k = 26 and 32 chunks.

What do you think about another straw-man idea: trading a smaller k for a larger number of chunks? Ideally, if we reduce k by one, memory consumption during proof generation should also be cut nearly in half. Under the current aggregation scheme, if a single prover generates all chunk proofs, the prover state for each chunk can be discarded once its proof has been generated. If we adopt multiple provers, the overall latency should also improve thanks to parallelism.

Yes! I think that's something we could easily do now. On one hand, we have two dimensions: memory and compute. By changing k (and thus the number of chunks) we get different memory and compute values, which may be a tradeoff; I assume at some point we can trade memory for compute and vice versa. As you say, the good thing about compute is that we can parallelize or distribute the work (thanks to aggregation): we reduce memory and increase compute, but do not increase time, because we add more machines.
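
A minimal sketch of this tradeoff, assuming the total number of rows stays fixed when k changes (per-chunk overhead is ignored) and that every chunk proof takes one "round" on one machine. Both functions are hypothetical helpers for illustration.

```rust
/// Number of chunks needed to cover 2^total_rows_log2 rows with
/// circuits of 2^k rows each. k = 26 with 32 chunks corresponds to
/// total_rows_log2 = 31.
fn chunks_for_k(total_rows_log2: u32, k: u32) -> u64 {
    assert!(k <= total_rows_log2, "k cannot exceed the total row count");
    1u64 << (total_rows_log2 - k)
}

/// Sequential proving rounds when `chunks` proofs are spread evenly
/// over `machines` provers (ceiling division).
fn proving_rounds(chunks: u64, machines: u64) -> u64 {
    assert!(machines > 0, "need at least one machine");
    (chunks + machines - 1) / machines
}
```

Going from k = 26 to k = 25 doubles the chunk count from 32 to 64 while roughly halving memory per prover; with 64 machines, `proving_rounds(64, 64)` is still a single round, so wall-clock time need not grow.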

On the other hand, I think it would be great to find the sweet spot of the aggregation configuration:

  • If the circuit is too small, the aggregation overhead is too high overall
  • If the circuit is too big, the aggregation overhead is small (but maybe the aggregation proof takes longer?)
  • Moreover, we can play with k, and to some extent with the number of advice columns.
  • If we have an aggregation tree, how many children should a node have?
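
To explore the last question, here is a small sketch that counts the depth and the number of aggregation proofs for a given number of chunk proofs and node arity. It assumes every internal node aggregates up to `arity` children and says nothing about how expensive each aggregation proof is; `aggregation_tree` is a hypothetical helper.

```rust
/// For `leaves` chunk proofs and aggregation nodes with up to `arity`
/// children each, return (tree depth, number of aggregation proofs).
fn aggregation_tree(leaves: u64, arity: u64) -> (u32, u64) {
    assert!(leaves >= 1, "need at least one chunk proof");
    assert!(arity >= 2, "a node must aggregate at least two children");
    let mut level = leaves; // number of proofs at the current level
    let mut depth = 0u32;
    let mut nodes = 0u64;
    while level > 1 {
        // Each aggregation node consumes up to `arity` proofs.
        level = (level + arity - 1) / arity;
        nodes += level;
        depth += 1;
    }
    (depth, nodes)
}
```

With 32 chunk proofs, a binary tree needs 31 aggregation proofs over 5 levels, arity 4 needs 11 over 3 levels, and a single node with 32 children needs 1. Higher arity means fewer but larger aggregation circuits, which is the sweet spot to look for.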

rrtoledo commented on August 14, 2024

The stats PR already considers the polynomials in the extended domain (which depends on the max expression degree). Is this what you mean? I believe the biggest source of memory consumption comes from the polynomials in the extended domain.

I was not sure it was the extended domain, but you clarified this. I agree with you!

On a related note, the numbers of the stats utility are theoretical. In practice the memory usage of the process may be higher due to:

  • allocations that we didn't consider, for example coming from iterators? (but maybe we could study those if they appear and try to fix them)
  • things we may have missed?

Off the top of my head, the usual biggest costs are iFFTs, commitments, and computing the quotient polynomial. However, because of the number of columns we have, I wouldn't be surprised if the permutation argument (even if we split the polynomial) is very costly too.

Also, one straightforward thing that we can do for the time being (before thinking of merging columns and so on) is check if the FFT algo we use is sparse-friendly.
