anoma / ferveo
An implementation of a DKG protocol for front-running protection on Anoma.
Home Page: https://anoma.net
License: GNU General Public License v3.0
Could add support for enclaves (when they exist in the deployment environment) for:
Details TBD.
Relevant refs:
The loop in https://github.com/anoma/ferveo/blob/master/tpke/src/decryption.rs#L67 can be sped up by using arkworks' VariableBaseMSM.
The code below shows an initial approach (the draft as written fails to compile; this version adds the needed imports and converts each share to affine form, since `multi_scalar_mul` expects affine bases, assuming `decryption_share` is a projective point):

```rust
use ark_ec::msm::VariableBaseMSM;
use ark_ec::ProjectiveCurve;
use ark_ff::PrimeField;

// sum_D_j = { [\sum_j \alpha_{i,j}] D_i }
for (D, alpha_j, sum_alpha_D_i) in izip!(shares.iter(), alpha_ij.iter(), sum_D_j.iter_mut()) {
    // multi_scalar_mul takes affine bases and BigInteger scalar representations
    let Dj = D
        .iter()
        .map(|Dij| Dij.decryption_share.into_affine())
        .collect::<Vec<_>>();
    let alpha_j = alpha_j.iter().map(|a| a.into_repr()).collect::<Vec<_>>();
    *sum_alpha_D_i = VariableBaseMSM::multi_scalar_mul(&Dj, &alpha_j);
}
```
There is a nontrivial amount of symmetric crypto in the protocol; almost all of it should be provided by existing crates.
This task involves:
I think the high priority for symmetric crypto choices should be:
Caching
Since Ferveo is intended to be an "online" protocol, and some of the primitives in use are not constant-time or may have other side-channel vulnerabilities, a side-channel analysis should be performed and potential mitigations investigated as needed.
Fortunately, Ferveo is not like TLS, where latency is critical, so hopefully this should be straightforward.
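For illustration, the usual shape of a constant-time mitigation for one common side channel (early-exit equality comparison) looks like this. This is a generic sketch, not Ferveo code; in practice a vetted crate such as `subtle` should be preferred over a hand-rolled version:

```rust
/// Constant-time byte-slice equality: accumulate all byte differences
/// with bitwise OR instead of returning at the first mismatch, so the
/// running time does not depend on where the inputs differ.
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}
```

Note that a real implementation also has to prevent the compiler from optimizing the accumulation back into an early exit, which is exactly what dedicated crates take care of.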
Early idea for discussion.
For:
We could reduce the number of operations in `verify-poly` and `verify-point` by generating a sparse matrix for the binomial's coefficients. E.g. an O(t log t)-dense matrix will reduce both `verify-poly` and `verify-point` to O(t log t) exponentiations and O(t log t) multiplications.
Additionally, we could use compression for the commitment. If we consider commitment compression, we need to be careful, since it leaks information about which factors are zero. Some initial thoughts on what to be careful about:
Setting aside performance, the modified Haven VSS first needs to be implemented feature-complete. The three primary changes from the Haven paper are:
Implementing the VSS to feature-completeness will allow the performance tweaking to happen in parallel with building the rest of the protocol.
Part (I): Make sense of the code
The relevant modules are `dkg`, `msg`, and `pvdkg`. `PubliclyVerifiableDKG` will handle all the context for handling DKG messages (`impl PubliclyVerifiableDKG`). Constructing a new context requires `ed_key` and `params` (including `pvss_params`). The remaining items to work through are the `share` function, `aggregate`, `.find_by_key`, and `anounce`.
Open questions:
Is #24 up to date? We should also merge the rendered PDF with Git LFS.
The #14 implementation helped performance quite a bit, but there are so many subgroup checks that further improvement is needed.
Probably the best remaining trick is to batch many subgroup checks together. Maybe these PRs:
arkworks-rs/algebra#127
arkworks-rs/algebra#130
Other relevant issues:
zcash/zcash#3470
Consensys/gnark-crypto#94
https://cseweb.ucsd.edu/~mihir/papers/batch.pdf
apache/incubator-milagro-crypto-rust#37
The fundamental problem is described very well in the DKG in the Wild paper: "the difficulty of differentiating between a slow node and a faulty node in the asynchronous setting are the primary sources of the added complexity."
Basically a slow node (or a faulty node that comes back online) must be able to complete the DKG, however, if the other nodes have already signed "ready", then it's difficult to guarantee share distribution for the slow/faulty node. The DKG in the Wild paper solves this because when HybridVSS succeeds, the slow nodes are guaranteed to recover because even if the dealer goes offline, everyone else can share to them.
I still think using univariate polynomials is a good idea because HybridVSS is just pretty expensive by itself. But we have to figure out exactly how to do it.
Just to enumerate some options:
I actually rather like option 3 now that I consider it more. It uses very little on-chain data (more or less the protocol described in the call on Tuesday), has low cryptographic and computational complexity, and is particularly cheap in the optimistic case where no one tries to cheat and there are few or no network failures.
The scalability of (1) bothers me. It wastes a lot of resources to handle a non-optimistic case (bad behavior by the dealer), and the number of shares is itself part of the security of the system (a larger number of shares, and therefore increased granularity/resolution, can help security, particularly if there are many small-weight nodes in the system).
Since we assume a consensus mechanism is available, it appears we don't really need the leader-change protocol with its added complexity. The fundamental goal is to make sure everyone agrees on the HybridVSS instances used to construct the key, and the consensus mechanism can guarantee that. So probably everyone should simulate being the leader in the HybridDKG protocol and use the consensus choice of HybridVSS results.
There are still some questions about how to do this exactly. The first possibility is keeping HybridDKG with an optimistic phase and a pessimistic phase. In the optimistic phase, every HybridVSS result has been shared with every node, so there is no need for a consensus mechanism (I defer on the real-world likelihood of this happening).
In case consensus breaks, or if the optimistic phase is deemed unreliable, the pessimistic phase can happen. There are some decisions about how to implement this as well. The most straightforward is to give all seen ready messages to the consensus mechanism and stop when complete (with perhaps a timeout period before starting over).
Alternatively, if this is a lot of data, then possibly the consensus mechanism can simply decide which HybridVSS instances complete and then rely on nodes to gossip the actual HybridVSS results from exactly those instances to produce the final key. Or, maybe another alternative is to just use the first generated key signed by t+1 weight of nodes that appears in the consensus mechanism. In the case where nodes disagree on which HybridVSS instances completed, there doesn't appear to be any obvious issue with nodes signing all generated keys that appear valid to them (i.e. if a node sees >= t+1 weight of HybridVSS instances complete, it signs the resulting key; if a node sees another node sign a key and knows >= t+1 HybridVSS instances that result in that key, it signs that key too).
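The last signing rule can be written down concretely. A minimal sketch (all names here are hypothetical, not Ferveo APIs), with candidate keys and VSS instances reduced to integer ids and weights: a node signs a candidate key as soon as the accumulated weight of completed HybridVSS instances yielding that key reaches t + 1.

```rust
use std::collections::HashMap;

/// `completed` is a stream of (candidate key id, instance weight) pairs
/// for HybridVSS instances this node has seen complete. Return the keys
/// whose accumulated weight reaches t + 1, in the order they cross the
/// threshold. Schematic placeholder types, not real Ferveo structures.
fn keys_to_sign(completed: &[(u64, u64)], t: u64) -> Vec<u64> {
    let mut weight_by_key: HashMap<u64, u64> = HashMap::new();
    let mut signed = Vec::new();
    for &(key, w) in completed {
        let total = weight_by_key.entry(key).or_insert(0);
        *total += w;
        // sign once, as soon as the weight threshold is crossed
        if *total >= t + 1 && !signed.contains(&key) {
            signed.push(key);
        }
    }
    signed
}
```

One design point this makes visible: a node may end up signing more than one candidate key if disagreeing sets of instances each cross the threshold, which is exactly the case argued above to be harmless.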
There will probably need to be careful proofs of the new HybridDKG protocol, but at a high level we should figure out the informal tradeoffs first.
We have to address the handoff between epochs. This is something we have ignored up until now, but it is the primary remaining research question: when the previous epoch ends, two things happen:
The more performant the DKG, the less of an issue (a) and (b) are, but even in the best case it seems unavoidable to pause the network for 3-5 blocks, unless there is some clever pipelining or handoff trick we can use.
I believe we can and should run the DKG entirely during the previous epoch. This requires the staking phase to end slightly before the new epoch begins, and risks that the previous validator set can censor the DKG from proceeding (liveness risk) but I think this risk is very low when you consider:
Currently, the dealer in HybridVSS has to construct and commit to an m^2-size polynomial (where m is the number of shares), which is fairly expensive. Unfortunately, it seems nontrivial to improve this to subquadratic time; however, since it is part of the DKG, it might be an acceptable cost.
More importantly, doing a separate HybridVSS for every share is a lot of overhead, so we really only want to do HybridVSS once per identity (say n is the number of nodes/identities), and nm^2 is much more manageable. This seems like a realistic proposal, but the details need to be carefully sketched out.
The first observation is that we can have HybridVSS issue multiple shares to each identity (likely no problem with this). The second observation is that HybridDKG now needs t+1 weight of HybridVSS instances to finish, not just t+1 identities (the DKG proofs need to be modified accordingly).
The third observation is that some aggregation of messages may be possible (e.g. one ready message per HybridVSS per identity rather than per share) to potentially save more communication complexity and/or storage costs.
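A minimal sketch of the per-identity weighting, assuming shares are allocated to identities in proportion to their stake weight (hypothetical helper, not Ferveo code; integer rounding is glossed over here and needs careful handling in a real allocation so shares sum to the intended total):

```rust
/// Allocate `total_shares` shares across identities in proportion to
/// their weights. Uses floor division, so allocations can sum to less
/// than `total_shares`; a real scheme must distribute the remainder.
/// Assumes a non-empty, non-zero weight vector.
fn allocate_shares(weights: &[u64], total_shares: u64) -> Vec<u64> {
    let total_weight: u64 = weights.iter().sum();
    weights
        .iter()
        .map(|w| w * total_shares / total_weight)
        .collect()
}
```

With one HybridVSS per identity rather than per share, each identity then receives its whole allocation from a single instance, which is where the n·m^2 cost estimate above comes from.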
The amount of code changes is relatively small but we mostly have to make sure we don't break any of the important properties of HybridDKG doing this.
Sikka has an initial implementation of threshold decryption at https://github.com/sikkatech/arkworks-threshold-decryption, which can be integrated easily.
This is a two-part task:
Currently, the benchmarks are rather ad hoc. It would be nice to use criterion to scientifically microbenchmark each part of the protocol, so that we can experiment with changes and see whether they're beneficial.
It's possible for the dealer to precompute, offline, many of the computationally expensive parts of its VSS. This feature can be added if it's judged to be useful.
A security analysis should be done to assess the risk of having VSS dealer/private data stored on the dealer's server for potentially a nontrivial amount of time.
Decrypting involves a number of pairing checks; as @simonmasson observes, we can optimize that down to two Miller loops and one final exponentiation (example).
Additionally, since many transactions are decrypted at the same time, it would be nice to batch all these checks together (e.g. because you're verifying a bunch of decryptions made with the same key share).
Since tx decryption is going to need to be very fast, it's worth looking at these optimizations.
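The batching idea here is the standard random-linear-combination trick: instead of verifying each claimed equation separately, a verifier samples random coefficients and checks a single combined equation, which fails with overwhelming probability if any individual claim is false. A schematic sketch, modeling "group elements" as scalars in an additive group mod a prime rather than actual BLS12-381 points:

```rust
// Toy additive group Z_p with public generator G; a "point" claimed
// to equal a*G is represented as a plain residue mod P. This only
// shows the shape of the batching, not real curve arithmetic.
const P: u128 = 2_147_483_647; // prime modulus (2^31 - 1)
const G: u128 = 7;

/// Check all claims (a_i, P_i) of the form P_i == a_i * G at once:
/// with random r_i, verify (sum r_i * a_i) * G == sum r_i * P_i.
fn batch_check(claims: &[(u128, u128)], rs: &[u128]) -> bool {
    let mut s = 0u128; // running sum of r_i * a_i
    let mut q = 0u128; // running sum of r_i * P_i
    for (&(a, point), &r) in claims.iter().zip(rs) {
        s = (s + r * a) % P;
        q = (q + r * point) % P;
    }
    s * G % P == q
}
```

In the real setting the same shape replaces n pairing checks with one multi-scalar combination and a single pairing-product check, which is why it pays off when verifying many decryptions made with the same key share.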
The crypto protocol draft should be reviewed for errors, details added as necessary, TODOs resolved, etc. Ideally the crypto side of the protocol should be well documented and clearly comparable to both the written code and also to the original papers.
The cryptoeconomics discussion may be added to the same document or a separate document later.
This is not the highest priority task, but it is relatively straightforward and can also be good for testing the overall system.
Extremely high performance is a design goal of Ferveo; this will require some creativity to achieve. This issue is to track the discussion and ideas.
Relevant readings:
https://twitter.com/aniketpkate/status/1319345423811809291
https://people.csail.mit.edu/devadas/pubs/scalable_thresh.pdf
Tracking all major tasks and decisions to be made, both roughly in order of importance.
Decisions:
Tasks:
We're also dependent on some external tasks:
Unlike #10, there is really only one choice for public key primitives: the BLS12-381 curve with the described key agreement/encryption/signature schemes. This is mostly dictated by the small number of acceptable choices and by the desire to reuse existing trusted setups.
Nevertheless, alternatives should be discussed and choices finalized, and then implementations of primitives written and/or imported from existing crates.
Unless good crates already exist, implement the fancy KZG commitment schemes for high performance and small multi-polynomial, multi-value opening proof sizes.
https://github.com/khovratovich/Kate/blob/master/Kate_amortized.pdf
If KZG commitments are used, then a trusted setup has to be loaded (probably an existing setup, e.g. Zcash's or Filecoin's).
This is a straightforward task: implement in Rust the download, verification, and loading of the trusted setup. Ideally there should be extensive opportunity for code reuse/external crates here, since, for example, Zcash clients perform exactly this task.
Currently, there are quite a few expensive loops that should be parallelized with rayon.
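With rayon this is typically a one-line change (`iter()` to `par_iter()`). As a dependency-free illustration of the same idea, here is the chunk-per-thread shape using only std scoped threads, with a placeholder `x * x` standing in for whatever expensive per-element work the real loops do:

```rust
use std::thread;

/// Split the input into one chunk per thread, map each chunk in its
/// own scoped thread, and stitch results back in order. With rayon
/// this whole function collapses to
/// `data.par_iter().map(|x| x * x).collect()`.
fn parallel_map(data: &[u64], n_threads: usize) -> Vec<u64> {
    let n = n_threads.max(1);
    let chunk = ((data.len() + n - 1) / n).max(1); // ceil division
    let mut out = Vec::with_capacity(data.len());
    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk)
            .map(|c| s.spawn(move || c.iter().map(|x| x * x).collect::<Vec<_>>()))
            .collect();
        for h in handles {
            out.extend(h.join().unwrap());
        }
    });
    out
}
```

Rayon's work-stealing scheduler handles uneven chunk costs better than this fixed split, which matters for loops whose iterations vary in cost (e.g. per-validator work with uneven weights).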
I'm not sure whether the following has been discussed or proposed, but I'm opening this issue so we have a place to discuss this topic.
Once Ferveo is implemented and integrated into the consensus system, it should not be hard to also implement the following:
(1) Threshold (BLS?) signatures
(2) Compact certificates of finalized states using threshold signatures.
(3) Light clients that rely on verification of public key updates and state certificates.
Currently, it is assumed that every VSS transcript is added to a block, which incurs a lot of overhead when there are many validators.
When the gossip mechanism is redesigned, Ferveo should do VSS aggregation (basically, the aggregatable DKG approach) within the gossip layer, incurring a cost more like O(# key shares * log(# validators)).
When the aggregation step of the PVSS fails, the node handling the aggregation has to identify which PVSS instance was bad and then send a complaint message to everyone else.
The protocol extensively sends and receives points on the BLS12-381 curve from third parties, so many subgroup checks may be needed to ensure that all such points lie in the prime-order subgroup.
This task involves two parts:
Hopefully we can identify a suitable existing implementation, otherwise implement independently.
Any implementation should take into account the issues described here
https://ethresear.ch/t/fast-verification-of-multiple-bls-signatures/5407
NuBLS is a pure-Rust implementation, but of course the dependencies are an issue.
blstrs is another option; it has dependency issues and also depends on unsafe code and a C library.
Some parameters such as:
will need to be determined by experimentation once the protocol is implemented and initial performance tweaks are added. Experiments might also suggest changes to the abstract protocol in addition to the parameters.
While not an immediate issue, explore the possibility of using arkworks algebra for polynomial types and curves.
I have two primary reasons to consider this:
We need tests for the protocol and implementation.
[ ] End-to-end tests which run the entire DKG and threshold encryption/decryption from beginning to end
[ ] Tests for each individual component, to get 100% coverage of code paths
[ ] Test vectors for components where appropriate
`Signed<T : Serializable>` type