lambdaclass / zksync_era_precompiles

Yul precompile library to speed up elliptic curve operations.

License: Apache License 2.0

Yul 20.99% Rust 63.27% Python 1.76% Makefile 0.10% Solidity 10.79% JavaScript 0.02% TypeScript 2.47% Shell 0.09% Jupyter Notebook 0.52%

zksync_era_precompiles's Introduction

zkSync Era Precompiles

DISCLAIMER: This implementation is still being developed and has not been reviewed or audited. Use at your own risk.

This is a precompile library implemented in Yul to speed up arithmetic operations on elliptic curves. In the coming weeks, we will add more optimizations and benchmarks.

Current Status

| Precompile | MVP | Optimized | Optional Future Optimizations | Audited | Comments |
| --- | --- | --- | --- | --- | --- |
| ecAdd | ✅ | ✅ | Montgomery SOS Squaring | ✅ | - |
| ecMul | ✅ | ✅ | Montgomery SOS Squaring + Mul GLV | ✅ | - |
| ecPairing | ✅ | ✅ | - | 🏗️ | - |
| modexp | ✅ | ✅ | - | 🏗️ | - |
| P256VERIFY | ✅ | ✅ | Montgomery SOS Squaring | ✅ | - |
| secp256k1VERIFY | ✅ | ✅ | Montgomery SOS Squaring | ✅ | - |

Summary

ecAdd is optimized with finite field arithmetic in Montgomery form and an optimized modular inverse based on a modification of the binary extended Euclidean algorithm that skips the Montgomery reduction step when inverting. There is not much room left for optimization; one option is Montgomery squaring (SOS) to improve the finite field squaring. This precompile has been audited once and is currently being audited a second time (after the fixes).
ecMul is optimized with finite field arithmetic in Montgomery form, an optimized modular inverse based on a modification of the binary extended Euclidean algorithm that skips the Montgomery reduction step when inverting, and elliptic curve point arithmetic in homogeneous projective coordinates. Several further optimizations are possible: the endomorphism-based method discussed in the Slack channel (GLV or wGLV); the windowed method, the sliding-window method, or wNAF (windowed non-adjacent form) to improve the elliptic curve point arithmetic; Montgomery squaring (SOS) to improve the finite field squaring; and Jacobian projective coordinates (these would have performance and gas costs similar to homogeneous projective coordinates, but would be free to add since we need this representation for ecPairing). This precompile has been audited once and is currently being audited a second time (after the fixes).
modexp has been updated to support big integer arithmetic. It is now fully compatible with the EIP-198 specification and all the tests pass; however, the gas costs are very high. For example, passing a modulus with three limbs (three uint256s) will almost certainly make it run out of gas. Most of the cost is in the div_rem function, which we need in order to have a modulo operator on big integers and which takes around 80-90% of the gas used when calling the precompile. The gas cost grows quickly with the number of limbs. We are looking into optimization opportunities, but gas costs may remain very high. This precompile has not been audited yet.
ecPairing: We have based our algorithm implementation primarily on the guidelines presented in the paper "High-Speed Software Implementation of the Optimal Ate Pairing over Barreto–Naehrig Curves". This implementation uses tower extension field arithmetic and the Frobenius operator.

To enhance the performance of the Miller loop, we have optimized line evaluation based on the techniques outlined in "The Realm of the Pairings". Also, instead of using Jacobian coordinates, we have adopted projective coordinates. This choice is particularly advantageous given the large inversion/multiplication ratio in this context.

In the final exponentiation phase, we have integrated the methods presented in "Memory-saving computation of the pairing final exponentiation on BN curves". This includes the Fuentes et al. method and the addition chain. We have also applied faster squaring in the cyclotomic subgroup, as described in "Faster Squaring in the Cyclotomic Subgroup of Sixth Degree Extensions".

Remaining Optimizations: While our implementation has achieved notable results, there are still some straightforward optimizations that can be implemented:
- Optimizing Accumulated Value: We are currently naively multiplying two fp12 elements, which contain many zeros. Modifying this calculation could enhance efficiency. This is a work in progress.
Future Investigations: We need to investigate the reliability of additional optimizations, such as the application of the GLV method for multiplication of rational points of elliptic curves.
P256VERIFY is already working and optimized with Shamir’s trick. This precompile has been audited once and is currently being audited a second time (after the fixes).
secp256k1VERIFY is already working and optimized with Shamir’s trick. This precompile has been audited once and is currently being audited a second time (after the fixes).

Gas Consumption

Used Algorithms

| Arithmetic | Operation | ecAdd | ecMul | modexp | P256VERIFY | secp256k1VERIFY |
| --- | --- | --- | --- | --- | --- | --- |
| Prime Field Arithmetic | Addition | Montgomery Modular Addition | Montgomery Modular Addition | Big Unsigned Integer Addition | Montgomery Modular Addition | Montgomery Modular Addition |
| | Subtraction | Montgomery Modular Subtraction | Montgomery Modular Subtraction | Big Unsigned Integer Subtraction With Borrow | Montgomery Modular Subtraction | Montgomery Modular Subtraction |
| | Multiplication | Montgomery Modular Multiplication | Montgomery multiplication | Big Unsigned Integer Multiplication | Montgomery multiplication | Montgomery multiplication |
| | Exponentiation | - | - | Binary exponentiation | - | - |
| | Inversion | Modified Binary Extended GCD (adapted for Montgomery Form) | Modified Binary Extended GCD (adapted for Montgomery Form) | - | Modified Binary Extended GCD (adapted for Montgomery Form) | Modified Binary Extended GCD (adapted for Montgomery Form) |
| Elliptic Curve Arithmetic | Addition | Addition in Affine Form | Addition in Homogeneous Projective Form | - | Addition in Homogeneous Projective Form | Addition in Homogeneous Projective Form |
| | Double | Double in Affine Form | Double in Homogeneous Projective Form | - | Double in Homogeneous Projective Form | Double in Homogeneous Projective Form |
| | Scalar Multiplication | - | Double-and-add | - | Double-and-add | Double-and-add |
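
The "Binary exponentiation" entry for modexp can be illustrated with a minimal Python sketch. It uses native Python integers instead of limbs, and the function name `mod_exp` is hypothetical (it is not taken from the repository); it only shows the square-and-multiply idea and assumes a positive modulus.

```python
def mod_exp(base: int, exponent: int, modulus: int) -> int:
    """Right-to-left binary (square-and-multiply) modular exponentiation.

    Assumes modulus > 0; the modulus == 0 case is out of scope for this sketch.
    """
    if modulus == 1:
        return 0
    result = 1
    base %= modulus
    while exponent > 0:
        # Multiply the accumulator in when the current exponent bit is set.
        if exponent & 1:
            result = (result * base) % modulus
        # Square the base for the next bit.
        base = (base * base) % modulus
        exponent >>= 1
    return result
```

Each loop iteration performs one or two modular multiplications, which is why the cost of the big integer multiplication and remainder dominates the total.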

Resources

You can find a curated list of helpful resources that guided our implementations in References.

Development

Follow the instructions below to set up the repo and run a development L2 node.

Running an era-test-node

Run one of the following commands to have a working test node.

make run-node
make run-node-light # no call trace, no hash resolving, and no gas details

Run the tests

If you want to run all the tests:

make test

If you want to run a specific test:

make test PRECOMPILE=<precompile_name>

To pull changes from the zkSync Era test node LC fork on the precompiles branch

git subtree pull --prefix=.test-node-subtree --squash git@github.com:lambdaclass/era-test-node.git lambdaclasss_precompiles

To push changes from local node to the branch

This should be used if, for example, a precompile is added or modified and we want to push the changes to the upstream fork.

git subtree push -P .test-node-subtree git@github.com:lambdaclass/era-test-node.git lambdaclasss_precompiles

zksync_era_precompiles's Issues

Inefficient `overflowingAdd()` implementation

Context: EcMul.yul#L109

Description:

The overflowingAdd() function returns the sum of two unsigned integers and a flag indicating whether the addition overflowed. To detect unsigned integer addition overflow it is enough to compare only one of the arguments against the sum: overflow = sum < a. The current implementation compares both arguments, which is inefficient: overflow = (sum < a) or (sum < b).

Moreover, this addition with overflow check pattern is present in binaryExtendedEuclideanAlgorithm() twice although the overflowingAdd() is not used directly.

Recommendation:

Optimize overflowingAdd() and also use it directly in binaryExtendedEuclideanAlgorithm().
This, however, will not affect the resulting optimized assembly because of the contexts in which overflowingAdd() is used:

in let lo, overflowed := overflowingAdd(lowestHalfOfT, mul(m, P())) the lo is discarded so LLVM will optimize the overflow check,
in let newB, carry := overflowingAdd(b, modulus) the carry is optimized out because in and(iszero(modulusHasSpareBits), carry) the modulusHasSpareBits is constant 1.

diff --git a/precompiles/EcMul.yul b/precompiles/EcMul.yul
index 1eb80df..cf8223a 100644
--- a/precompiles/EcMul.yul
+++ b/precompiles/EcMul.yul
@@ -108,7 +108,7 @@ object "EcMul" {
             /// @return overflowed True if the addition overflowed, false otherwise.
             function overflowingAdd(augend, addend) -> sum, overflowed {
                 sum := add(augend, addend)
-                overflowed := or(lt(sum, augend), lt(sum, addend))
+                overflowed := lt(sum, augend)
             }

             /// @notice Checks if the LSB of a number is 1.
@@ -140,8 +140,7 @@ object "EcMul" {
                             b := shr(1, b)
                         }
                         case 1 {
-                            let newB := add(b, modulus)
-                            let carry := or(lt(newB, b), lt(newB, modulus))
+                            let newB, carry := overflowingAdd(b, modulus)
                             b := shr(1, newB)

                             if and(iszero(modulusHasSpareBits), carry) {
@@ -158,8 +157,7 @@ object "EcMul" {
                             c := shr(1, c)
                         }
                         case 1 {
-                            let newC := add(c, modulus)
-                            let carry := or(lt(newC, c), lt(newC, modulus))
+                            let newC, carry := overflowingAdd(c, modulus)
                             c := shr(1, newC)

                             if and(iszero(modulusHasSpareBits), carry) {
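
As a quick sanity check of the claim that one comparison is enough, here is a small Python model of 256-bit wrapping addition. The function names are illustrative, not the Yul ones, and Python integers are masked to emulate EVM word arithmetic.

```python
MASK_256 = (1 << 256) - 1  # emulate 256-bit wrapping words

def overflowing_add(augend: int, addend: int):
    """Wrapping add with a single-comparison overflow flag (sum < augend)."""
    total = (augend + addend) & MASK_256
    return total, total < augend

def overflowing_add_two_checks(augend: int, addend: int):
    """Same addition with the redundant two-comparison overflow flag."""
    total = (augend + addend) & MASK_256
    return total, (total < augend) or (total < addend)

# Compare both variants against the exact (unbounded) sum on a few edge cases.
SAMPLES = [0, 1, 2, MASK_256 - 1, MASK_256, 1 << 255, (1 << 255) - 1]
for a in SAMPLES:
    for b in SAMPLES:
        exact_overflow = (a + b) > MASK_256
        assert overflowing_add(a, b)[1] == exact_overflow
        assert overflowing_add_two_checks(a, b)[1] == exact_overflow
```

When the addition wraps, the result is `a + b - 2^256`, which is smaller than both operands; when it does not wrap, the result is at least as large as both. Either comparison alone therefore carries the full information.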

Optimization on `montgomeryAdd`

Context: EcMul.yul#L210

Description:

Replace the montgomeryAdd implementation with a more efficient one.

Recommendation:

function montgomeryAdd(augend, addend) -> ret {
    ret := add(augend, addend)
    // Reduce when ret >= P, so that a sum exactly equal to P maps to 0.
    if iszero(lt(ret, P())) {
        ret := sub(ret, P())
    }
}

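This will be more efficient because addmod also handles overflow of the intermediate sum, which cannot happen in our case: both operands are smaller than P, which is 254 bits long, so their sum always fits in a 256-bit word.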
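
The reasoning can be checked with a short Python sketch. It assumes both inputs are already reduced (in the range [0, P)) and uses the BN254 field prime; the name `modular_add` is hypothetical. The comparison is `>=` so that a sum exactly equal to P reduces to 0.

```python
# BN254 (alt_bn128) base field prime.
P = 21888242871839275222246405745257275088696311157297823662689037894645226208583

def modular_add(a: int, b: int) -> int:
    """Add two reduced field elements with one conditional subtraction."""
    assert 0 <= a < P and 0 <= b < P
    ret = a + b          # ret < 2P < 2**256, so a 256-bit word cannot overflow
    if ret >= P:         # covers ret == P as well
        ret -= P
    return ret

# The sum of two reduced elements always fits in 256 bits.
assert 2 * P < 1 << 256
```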

Remove calldata size check

Context: ModExp.yul#L51

Description:
The check should be removed, as the Yellow Paper specifies.

Migrate the official precompile testing from ethereum

Migrate the official testing from Ethereum as follows:

Convert to Rust tests using the test calldata for every case
Store the value returned from L1
Run the test against the era-test-node and compare the result against L1

Research about using Montgomery form in `modexp`

Context: ModExp.yul

Description:
It is difficult to optimize mulmod at the compiler level, so we need to optimize modular multiplications at the Yul level. We need to check whether it would be more efficient to use the Montgomery form even without the constants that can be precomputed when the modulus is known in advance, as in the elliptic curve precompiles.

Recommendation:

Implement the Montgomery modular multiplication algorithm. This is already done in the elliptic curve precompiles; the only difference is that values like R2_MOD_P and N_PRIME cannot be precomputed and must be computed once the modulus is known. Also, R cannot always be $2^{256}$, so we should find a proper R for every modulus. I think that we could use $R = 2^{256}$ for a modulus with fewer than 256 bits.
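
A minimal Python sketch of this idea follows, assuming a single-limb odd modulus smaller than $2^{256}$ and $R = 2^{256}$ (Montgomery reduction needs $\gcd(R, N) = 1$, so an even modulus is rejected here). The function names are hypothetical, and the constants are computed at call time rather than precomputed. It requires Python 3.8+ for `pow(x, -1, m)`.

```python
R_BITS = 256
R = 1 << R_BITS

def montgomery_params(modulus: int):
    """Compute N' = -N^{-1} mod R and R^2 mod N for a runtime modulus."""
    if modulus % 2 == 0 or not (1 < modulus < R):
        raise ValueError("sketch only supports odd moduli below 2**256")
    n_prime = (-pow(modulus, -1, R)) % R
    r2_mod_n = (R * R) % modulus
    return n_prime, r2_mod_n

def redc(t: int, modulus: int, n_prime: int) -> int:
    """Montgomery reduction: t * R^{-1} mod modulus, for 0 <= t < R * modulus."""
    m = ((t & (R - 1)) * n_prime) & (R - 1)
    u = (t + m * modulus) >> R_BITS
    return u - modulus if u >= modulus else u

def montgomery_mul(a: int, b: int, modulus: int, n_prime: int) -> int:
    """Multiply two values that are already in Montgomery form."""
    return redc(a * b, modulus, n_prime)

def mul_mod(a: int, b: int, modulus: int) -> int:
    """Compute a * b mod modulus by going through Montgomery form."""
    n_prime, r2 = montgomery_params(modulus)
    a_m = montgomery_mul(a % modulus, r2, modulus, n_prime)  # a * R mod N
    b_m = montgomery_mul(b % modulus, r2, modulus, n_prime)  # b * R mod N
    prod_m = montgomery_mul(a_m, b_m, modulus, n_prime)      # a * b * R mod N
    return redc(prod_m, modulus, n_prime)                    # back to normal form
```

For a single multiplication the conversions cost more than they save; the approach only pays off when many multiplications share one modulus, as in the exponentiation loop.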

Implement `P256VERIFY` precompile

Description: Implement the P256VERIFY precompile following EIP-7212.

Incorrect comparison of points in projective coordinates

Context: EcMul.yul#L508

Description:

In the multiplication loop, when the addition R = R + Q is performed, the special case R == Q must be identified so that the point doubling procedure is applied. The points R, denoted (xr, yr, zr), and Q, denoted (xq, yq, zq), are in projective coordinates. The comparison R == Q is done incorrectly as:

(xr == xq) and (yr == yq) and (zr == zq)

Projective coordinates represent an affine point on the curve as a line through the origin, i.e. (x, y, z) represents (x/z, y/z). Therefore the comparison should check

(xr/zr == xq/zq) and (yr/zr == yq/zq)

which is equivalent to

(xr*zq == xq*zr) and (yr*zq == yq*zr)

Because of the incorrect comparison, in some cases of R == Q the control flow will fall through to the common case // P1 + P2 = P3. This will compute an invalid sum.

To find a test case reproducing the issue the following calculation is needed:

Q = 2^k * P
R = c * P, c < 2^k
Q = R
(2^k - c) * P = 0
therefore
2^k - c = n, where n is the prime group order
2^k + (2^k - n) is the multiplication scalar

So for example (for k = 254):

36007801746818822489539086759086678838086627932404247676030587817380756324351 * (1,2)

will be computed incorrectly.
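
The derivation can be reproduced with a few lines of Python. This is only a sketch: `N` is the BN254 group order, and the helper name `doubling_collision_scalar` is made up for illustration.

```python
# BN254 (alt_bn128) prime group order n.
N = 21888242871839275222246405745257275088548364400416034343698204186575808495617

def doubling_collision_scalar(k: int) -> int:
    """Scalar 2^k + c with c = 2^k - n, so that R = c*P equals Q = 2^k*P at bit k."""
    c = (1 << k) - N
    assert 0 < c < (1 << k), "k must be large enough for 2^k to exceed n"
    scalar = (1 << k) + c
    assert scalar < (1 << 256), "scalar must fit in one 256-bit word"
    return scalar

scalar_254 = doubling_collision_scalar(254)
print(scalar_254)
print(format(scalar_254, "064x"))
```

Only k = 254 and k = 255 satisfy both constraints, since n is 254 bits long and the scalar must fit in 256 bits.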

A larger set of tests (including the one above) as hex input data:

000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000024f9bb18d1ece5fd647afba497e7ea7a2d7cc17b786468f6ebc1e0a6c0fffffff
00000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000002cf9bb18d1ece5fd647afba497e7ea7a2d7cc17b786468f6ebc1e0a6c0fffffff
13bdaa100d8dd914405f9f17986ec6df3f30096f0da40a39e5c07c941266205f1236dfd1de28dc00a000b40f11dfab9e6e101643f18e4dcfac70eef7021918624f9bb18d1ece5fd647afba497e7ea7a2d7cc17b786468f6ebc1e0a6c0fffffff
13bdaa100d8dd914405f9f17986ec6df3f30096f0da40a39e5c07c941266205f1236dfd1de28dc00a000b40f11dfab9e6e101643f18e4dcfac70eef702191862cf9bb18d1ece5fd647afba497e7ea7a2d7cc17b786468f6ebc1e0a6c0fffffff

Recommendation:

One way of fixing this is to use a proper comparison procedure for projective points: (xr*zq == xq*zr) and (yr*zq == yq*zr).
These values are already computed for the "P1 + P2 = P3" case.

diff --git a/precompiles/EcMul.yul b/precompiles/EcMul.yul
index 1eb80df..fabc94c 100644
--- a/precompiles/EcMul.yul
+++ b/precompiles/EcMul.yul
@@ -505,7 +505,15 @@ object "EcMul" {
                         scalar := shr(1, scalar)
                         continue
                     }
-                    if and(and(eq(xr, xq), eq(yr, yq)), eq(zr, zq)) {
+
+                    let t0 := montgomeryMul(yq, zr)
+                    let t1 := montgomeryMul(yr, zq)
+                    let t := montgomerySub(t0, t1)
+                    let u0 := montgomeryMul(xq, zr)
+                    let u1 := montgomeryMul(xr, zq)
+                    let u := montgomerySub(u0, u1)
+
+                    if iszero(or(t, u)) {
                         // P + P = 2P
                         xr, yr, zr := projectiveDouble(xr, yr, zr)

@@ -519,12 +527,6 @@ object "EcMul" {

                     // P1 + P2 = P3

-                    let t0 := montgomeryMul(yq, zr)
-                    let t1 := montgomeryMul(yr, zq)
-                    let t := montgomerySub(t0, t1)
-                    let u0 := montgomeryMul(xq, zr)
-                    let u1 := montgomeryMul(xr, zq)
-                    let u := montgomerySub(u0, u1)
                     let u2 := montgomeryMul(u, u)
                     let u3 := montgomeryMul(u2, u)
                     let v := montgomeryMul(zq, zr)
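
To illustrate why the coordinate-wise comparison fails, the following Python sketch compares projective points by cross-multiplication. It works on plain residues modulo the BN254 field prime rather than in Montgomery form, assumes both inputs are valid projective points (not (0, 0, 0)), and uses hypothetical function names.

```python
# BN254 (alt_bn128) base field prime.
P = 21888242871839275222246405745257275088696311157297823662689037894645226208583

def projective_eq(p1, p2) -> bool:
    """Equality of homogeneous projective points via cross-multiplication."""
    x1, y1, z1 = p1
    x2, y2, z2 = p2
    # x1/z1 == x2/z2  <=>  x1*z2 == x2*z1 (and likewise for y).
    return (x1 * z2 - x2 * z1) % P == 0 and (y1 * z2 - y2 * z1) % P == 0

def naive_eq(p1, p2) -> bool:
    """Coordinate-wise comparison, which misses scaled representations."""
    return p1 == p2

same_point_a = (1, 2, 1)
same_point_b = (5, 10, 5)   # the same affine point (1, 2), scaled by 5
print(naive_eq(same_point_a, same_point_b), projective_eq(same_point_a, same_point_b))
```

The two triples describe the same affine point, yet only the cross-multiplied comparison recognizes it.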

Implement multi-precision arithmetic for `modexp`

Context: ModExp.yul

Description:

As the base, exponent, and modulus are big integers (represented with limbs), the arithmetic between them needs to be multi-precision. In the case of this precompile we just need to implement the addition and multiplication.

Recommendation:

Given that the precompile EIP does not specify a maximum number of limbs and that Yul functions cannot receive a variable amount of parameters, we will send the pointers and the number of limbs for every number involved in the operation.

bigAdd(augendPointer, augendLimbs, addendPointer, addendLimbs, modulus).
bigMul(multiplicandPointer, multiplicandLimbs, multiplierPointer, multiplierLimbs, modulus).

Check this algorithm.

Implement Big UInt Addition for `modexp`

Context

modexp.yul

Description

The addition of big integers is needed for the Booth algorithm implementation.

Recommendation

We'll probably follow this implementation from the Lambdaworks repo.
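
As a rough illustration of limb-wise addition (not the Lambdaworks code and not the final Yul interface), here is a Python sketch. It assumes 256-bit limbs stored most significant limb first and operands with the same number of limbs; all names are hypothetical.

```python
WORD_BITS = 256
WORD_MASK = (1 << WORD_BITS) - 1

def to_limbs(value: int, n_limbs: int):
    """Split an integer into n_limbs 256-bit limbs, most significant first."""
    return [(value >> (WORD_BITS * (n_limbs - 1 - i))) & WORD_MASK for i in range(n_limbs)]

def from_limbs(limbs) -> int:
    """Recombine big-endian limbs into a Python integer."""
    value = 0
    for limb in limbs:
        value = (value << WORD_BITS) | limb
    return value

def big_uint_add(lhs, rhs):
    """Add two equal-length limb arrays; returns (sum_limbs, carry_out)."""
    assert len(lhs) == len(rhs)
    result = [0] * len(lhs)
    carry = 0
    # Walk from the least significant limb (last index) to the most significant.
    for i in reversed(range(len(lhs))):
        total = lhs[i] + rhs[i] + carry
        result[i] = total & WORD_MASK
        carry = total >> WORD_BITS
    return result, carry
```

In Yul the per-limb carry would have to be derived from comparisons, since a 256-bit word cannot hold the intermediate `total`.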

Optimize ecMul with the Montgomery form

Implement ecPairing

G2 Subgroup Check

Context: EcPairing.yul#L1383

Description:

Given a pair $P = (x, y)$ with $x, y \in F_{p^{2}}$, it is easy to check whether $P$ is a point on the twisted curve $E'(F_{p^{2}})$. However, we need to check further whether $P$ lies in the subgroup $G_{2} = E'(F_{p^{2}})[r]$. Let $\#E'(F_{p^{2}}) = c_{2} r$. Unfortunately, in our case $c_{2} \neq 1$. This number $c_{2}$ is known as the $G_{2}$ cofactor.

For the specific case of BN254, we get:

#E'(F_{p^2}) = 479095176016622842441988045216678740799252316531100822436447802254070093686356349204969212544220033486413271283566945264650845755880805213916963058350733
c_2 = 21888242871839275222246405745257275088844257914179612981679871602714643921549
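
The quoted cofactor can be cross-checked in Python. For BN curves the twist order factors as $r(2p - r)$, so the cofactor should equal $2p - r$. This sketch uses the BN254 constants `P` (field prime) and `R` (group order) and is a sanity check only, not part of the precompile.

```python
# BN254 (alt_bn128) base field prime p and prime group order r.
P = 21888242871839275222246405745257275088696311157297823662689037894645226208583
R = 21888242871839275222246405745257275088548364400416034343698204186575808495617

# Cofactor c_2 as quoted in the issue.
C2 = 21888242871839275222246405745257275088844257914179612981679871602714643921549

# For BN curves, #E'(F_{p^2}) = r * (2p - r), so the G2 cofactor is 2p - r.
cofactor_from_formula = 2 * P - R
twist_order = C2 * R

print(cofactor_from_formula == C2)
print(twist_order)  # compare with the #E'(F_{p^2}) value quoted above
```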

The `binaryExtendedEuclideanAlgorithm` is not tuned for constant sparse modulus

Context: EcMul.yul#L126

Description:

The binaryExtendedEuclideanAlgorithm() has a runtime check for whether the modulus has spare bits, i.e. whether its most significant bit is zero. Because the field modulus of BN254 is a constant with spare bits (it is 254 bits long), modulusHasSpareBits := iszero(and(modulus, mask)) is always 1.

Recommendation:

You can eliminate calculation of modulusHasSpareBits and conditional branches that depend on it (now always false): if and(iszero(modulusHasSpareBits), carry).

diff --git a/precompiles/EcMul.yul b/precompiles/EcMul.yul
index cf8223a..632be71 100644
--- a/precompiles/EcMul.yul
+++ b/precompiles/EcMul.yul
@@ -119,12 +119,7 @@ object "EcMul" {
             }

             function binaryExtendedEuclideanAlgorithm(base) -> inv {
-                // Precomputation of 1 << 255
-                let mask := 57896044618658097711785492504343953926634992332820282019728792003956564819968
                 let modulus := P()
-                // modulus >> 255 == 0 -> modulus & 1 << 255 == 0
-                let modulusHasSpareBits := iszero(and(modulus, mask))
-
                 let u := base
                 let v := modulus
                 // Avoids unnecessary reduction step.
@@ -140,12 +135,7 @@ object "EcMul" {
                             b := shr(1, b)
                         }
                         case 1 {
-                            let newB, carry := overflowingAdd(b, modulus)
-                            b := shr(1, newB)
-
-                            if and(iszero(modulusHasSpareBits), carry) {
-                                b := or(b, mask)
-                            }
+                            b := shr(1, add(b, modulus))
                         }
                     }

@@ -157,12 +147,7 @@ object "EcMul" {
                             c := shr(1, c)
                         }
                         case 1 {
-                            let newC, carry := overflowingAdd(c, modulus)
-                            c := shr(1, newC)
-
-                            if and(iszero(modulusHasSpareBits), carry) {
-                                c := or(c, mask)
-                            }
+                            c := shr(1, add(c, modulus))
                         }
                     }
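
The simplified halving step can be modeled in Python as follows. The sketch assumes 0 <= b < P (an assumption of this illustration about the loop invariant) and uses the BN254 field prime; `halve_mod_p` is a hypothetical name.

```python
# BN254 (alt_bn128) base field prime.
P = 21888242871839275222246405745257275088696311157297823662689037894645226208583
TOP_BIT = 1 << 255  # the mask removed by the diff above

# The top bit of P is zero, so the modulus "has spare bits".
modulus_has_spare_bits = (P & TOP_BIT) == 0

def halve_mod_p(b: int) -> int:
    """Return b / 2 mod P for 0 <= b < P."""
    assert 0 <= b < P
    if b & 1 == 0:
        return b >> 1
    # b is odd and P is odd, so b + P is even; b + P < 2P < 2**256, no carry.
    return (b + P) >> 1
```

Since 2P is below 2^256, the addition never wraps, which is what makes the carry handling dead code for this modulus.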

`burnGas` can be replaced with `invalid`

Context:
EcMul.yul#L90

Description:
On invalid input data, gas can be burned with invalid(), which does exactly the same as burnGas(). The latter can therefore be replaced.

Recommendation:
Remove burnGas() and use invalid() instead.

diff --git a/precompiles/EcMul.yul b/precompiles/EcMul.yul
index 1eb80df..0f65f95 100644
--- a/precompiles/EcMul.yul
+++ b/precompiles/EcMul.yul
@@ -79,20 +79,6 @@ object "EcMul" {
             //                      HELPER FUNCTIONS
             // ////////////////////////////////////////////////////////////////

-            /// @dev Executes the `precompileCall` opcode.
-            function precompileCall(precompileParams, gasToBurn) -> ret {
-                // Compiler simulation for calling `precompileCall` opcode
-                ret := verbatim_2i_1o("precompile", precompileParams, gasToBurn)
-            }
-
-            /// @notice Burns remaining gas until revert.
-            /// @dev This function is used to burn gas in the case of a failed precompile call.
-            function burnGas() {
-                // Precompiles that do not have a circuit counterpart
-                // will burn the provided gas by calling this function.
-                precompileCall(0, gas())
-            }
-
             /// @notice Retrieves the highest half of the multiplication result.
             /// @param multiplicand The value to multiply.
             /// @param multiplier The multiplier.
@@ -423,7 +409,7 @@ object "EcMul" {
             let x := calldataload(0)
             let y := calldataload(32)
             if iszero(affinePointCoordinatesAreOnGroupOrder(x, y)) {
-                burnGas()
+                invalid()
             }
             let scalar := calldataload(64)

@@ -437,7 +423,7 @@ object "EcMul" {

             // Ensure that the point is in the curve (Y^2 = X^3 + 3).
             if iszero(affinePointIsOnCurve(m_x, m_y)) {
-                burnGas()
+                invalid()
             }

             if eq(scalar, ZERO()) {

Implement Big UInt Multiplication for `modexp`

Context: modexp.yul

Description:

/// @notice Performs the big unsigned integer multiplication between two big unsigned integers with an arbitrary amount of limbs.
/// @dev The product is stored from `productPtr` to `productPtr + (WORD_SIZE * nLimbs)`.
bigUIntMul(multiplicandPtr, multiplierPtr, nLimbs, productPtr)

multiplicandPtr: The pointer to the MSB of the multiplicand.
multiplierPtr: The pointer to the MSB of the multiplier.
nLimbs: The number of limbs needed to represent the operands.
productPtr: The pointer to where you want the product to be stored.

Recommendation:

Follow Vlad's implementation (tag me for more details if needed).
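
For orientation, here is a schoolbook limb multiplication sketch in Python. It is not Vlad's implementation. It assumes 256-bit limbs stored most significant first and, unlike the signature above, returns the full product in 2 * nLimbs limbs. All names are hypothetical.

```python
WORD_BITS = 256
WORD_MASK = (1 << WORD_BITS) - 1

def to_limbs(value: int, n_limbs: int):
    """Split an integer into n_limbs 256-bit limbs, most significant first."""
    return [(value >> (WORD_BITS * (n_limbs - 1 - i))) & WORD_MASK for i in range(n_limbs)]

def from_limbs(limbs) -> int:
    """Recombine big-endian limbs into a Python integer."""
    value = 0
    for limb in limbs:
        value = (value << WORD_BITS) | limb
    return value

def big_uint_mul(lhs, rhs):
    """Schoolbook multiplication of two n-limb numbers into a 2n-limb product."""
    n = len(lhs)
    assert len(rhs) == n
    product = [0] * (2 * n)
    for i in reversed(range(n)):
        carry = 0
        for j in reversed(range(n)):
            k = i + j + 1  # product limb with the same weight as lhs[i] * rhs[j]
            total = product[k] + lhs[i] * rhs[j] + carry
            product[k] = total & WORD_MASK
            carry = total >> WORD_BITS
        product[i] += carry  # this limb has not been written yet for row i
    return product
```

The inner step needs a 512-bit intermediate (`lhs[i] * rhs[j]`), which in Yul would be split into high and low halves.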

Wrong naming: group order vs finite field's prime

Context: EcAdd.yul#L31

Description:

The elliptic curve is defined over a finite field defined by the prime number P. The elliptic curve itself is an algebraic group of order N. The implementations of the ecadd and ecmul incorrectly use the name "group order" in constant names, function names, and in comments where they exclusively mean the finite field prime number. It seems the issue might have been noticed in ecmul because e.g. ALT_BN128_GROUP_ORDER has been renamed to P recently, but there are still a lot of function names and comments referencing "group order".

Recommendation:

In EcAdd.yul and EcMul.yul replace all usages of name "group order" with name "field prime" (or at least "field order"). Constant and function names which should be changed:

ALT_BN128_GROUP_ORDER
R2_MOD_ALT_BN128_GROUP_ORDER
isOnGroupOrder
coordinateIsOnGroupOrder
affinePointCoordinatesAreOnGroupOrder
projectivePointCoordinatesAreOnGroupOrder

Optimize ecPairing

Add era-test-node to the repo to run tests on CI

We should add our own fork of the era-test-node, with the precompiles deployed, to the GitHub CI to run the tests.

Add test job for the CI

Implement wGLV for faster scalar multiplication (`ecMul` & `ecPairing`)

Context: EcMul.yul, EcPairing.yul

Description:

Implement wGLV scalar multiplication. In theory, it could reduce the cost of ecMul by up to 50% and significantly reduce the cost of ecPairing.

Recommendation:

Check these papers:

More efficient algorithm for doubling a point in projective coordinates

Context: EcMul.yul#L402

Description:

A more efficient algorithm for doubling a point in projective coordinates can be used:

"Algorithm 9: Exception-free point doubling for prime order j-invariant 0 short Weierstrass curves"
from Complete addition formulas for prime order elliptic curves.

The current algorithm uses 11 multiplications, 6 additions and 3 subtractions. It cannot handle the point-at-infinity, i.e. for any (0, y, 0) it produces the invalid value (0, 0, 0) (not an element of the projective space).

The new algorithm uses 9 multiplications, 8 additions and 1 subtraction (additions are cheaper than subtractions because the subtraction a-b is implemented as a+(p-b)). It properly handles the point-at-infinity, specifically for any (0, y, 0) it produces (0, y⁴, 0).

In two arbitrarily selected, non-trivial point multiplication tests the new doubling algorithm has reduced gas used:

409744 to 380434 (-7%)
422338 to 393118 (-7%)

And reduced EcMul code size (calculated as number of lines in the assembly) from 1262 to 1153 (-9%).

Recommendation:

Change the implementation of projectiveDouble() to the proposed algorithm.

diff --git a/precompiles/EcMul.yul b/precompiles/EcMul.yul
index 1eb80df..1406dd5 100644
--- a/precompiles/EcMul.yul
+++ b/precompiles/EcMul.yul
@@ -44,6 +44,13 @@ object "EcMul" {
                 m_three := 19052624634359457937016868847204597229365286637454337178037183604060995791063
             }

+            /// @notice Constant function for value 3*b (i.e. 9) in Montgomery form.
+            /// @dev This value was precomputed using Python.
+            /// @return m_b3 The value 9 in Montgomery form.
+            function MONTGOMERY_B3() -> m_b3 {
+                m_b3 := 13381388159399823366557795051099241510703237597767364208733475022892534956023
+            }
+
             /// @notice Constant function for the alt_bn128 group order.
             /// @dev See https://eips.ethereum.org/EIPS/eip-196 for further details.
             /// @return ret The alt_bn128 group order.
@@ -390,9 +397,9 @@ object "EcMul" {
             }

             /// @notice Doubles a point in projective coordinates in Montgomery form.
-            /// @dev See https://www.nayuki.io/page/elliptic-curve-point-addition-in-projective-coordinates for further details.
-            /// @dev For performance reasons, the point is assumed to be previously checked to be on the
-            /// @dev curve and not the point at infinity.
+            /// @dev See Algorithm 9 in https://eprint.iacr.org/2015/1060.pdf for further details.
+            /// @dev The point is assumed to be on the curve.
+            /// @dev It works correctly for the point at infinity.
             /// @param xp The x coordinate of the point P in projective coordinates in Montgomery form.
             /// @param yp The y coordinate of the point P in projective coordinates in Montgomery form.
             /// @param zp The z coordinate of the point P in projective coordinates in Montgomery form.
@@ -400,19 +407,28 @@ object "EcMul" {
             /// @return yr The y coordinate of the point 2P in projective coordinates in Montgomery form.
             /// @return zr The z coordinate of the point 2P in projective coordinates in Montgomery form.
             function projectiveDouble(xp, yp, zp) -> xr, yr, zr {
-                let x_squared := montgomeryMul(xp, xp)
-                let t := montgomeryAdd(x_squared, montgomeryAdd(x_squared, x_squared))
-                let yz := montgomeryMul(yp, zp)
-                let u := montgomeryAdd(yz, yz)
-                let uxy := montgomeryMul(u, montgomeryMul(xp, yp))
-                let v := montgomeryAdd(uxy, uxy)
-                let w := montgomerySub(montgomeryMul(t, t), montgomeryAdd(v, v))
-
-                xr := montgomeryMul(u, w)
-                let uy := montgomeryMul(u, yp)
-                let uy_squared := montgomeryMul(uy, uy)
-                yr := montgomerySub(montgomeryMul(t, montgomerySub(v, w)), montgomeryAdd(uy_squared, uy_squared))
-                zr := montgomeryMul(u, montgomeryMul(u, u))
+                let t0
+                let t1
+                let t2
+
+                t0 := montgomeryMul(yp, yp)
+                zr := montgomeryAdd(t0, t0)
+                zr := montgomeryAdd(zr, zr)
+                zr := montgomeryAdd(zr, zr)
+                t1 := montgomeryMul(yp, zp)
+                t2 := montgomeryMul(zp, zp)
+                t2 := montgomeryMul(MONTGOMERY_B3(), t2)
+                xr := montgomeryMul(t2, zr)
+                yr := montgomeryAdd(t0, t2)
+                zr := montgomeryMul(t1, zr)
+                t1 := montgomeryAdd(t2, t2)
+                t2 := montgomeryAdd(t1, t2)
+                t0 := montgomerySub(t0, t2)
+                yr := montgomeryMul(t0, yr)
+                yr := montgomeryAdd(xr, yr)
+                t1 := montgomeryMul(xp, yp)
+                xr := montgomeryMul(t0, t1)
+                xr := montgomeryAdd(xr, xr)
             }
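
A plain Python port of the proposed doubling can be used to check it against the affine doubling formula. This sketch works on ordinary residues modulo the BN254 field prime instead of Montgomery form, folds the repeated additions into small constant multiplications, and uses hypothetical helper names. It requires Python 3.8+ for `pow(x, -1, m)`.

```python
# BN254 (alt_bn128) base field prime; the curve is y^2 = x^3 + 3.
P = 21888242871839275222246405745257275088696311157297823662689037894645226208583
B3 = 9  # 3 * b with b = 3 (not in Montgomery form in this sketch)

def projective_double(xp: int, yp: int, zp: int):
    """Doubling following the operation order of the diff above."""
    t0 = yp * yp % P
    zr = 8 * t0 % P                    # three successive doublings of t0
    t1 = yp * zp % P
    t2 = B3 * (zp * zp % P) % P
    xr = t2 * zr % P
    yr = (t0 + t2) % P
    zr = t1 * zr % P
    t2 = 3 * t2 % P                    # t1 := t2 + t2; t2 := t1 + t2
    t0 = (t0 - t2) % P
    yr = (t0 * yr + xr) % P
    xr = 2 * (t0 * (xp * yp % P)) % P
    return xr, yr, zr

def to_affine(x: int, y: int, z: int):
    """Convert a finite projective point (z != 0) to affine coordinates."""
    z_inv = pow(z, -1, P)
    return x * z_inv % P, y * z_inv % P

def affine_double(x: int, y: int):
    """Textbook affine doubling for y^2 = x^3 + 3 (requires y != 0)."""
    lam = 3 * x * x * pow(2 * y, -1, P) % P
    x3 = (lam * lam - 2 * x) % P
    y3 = (lam * (x - x3) - y) % P
    return x3, y3

print(to_affine(*projective_double(1, 2, 1)) == affine_double(1, 2))
print(projective_double(0, 2, 0))  # point at infinity stays of the form (0, y, 0)
```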

Optimize modExp with the Montgomery form

Implement Big UInt Subtraction with Borrow for `modexp`

Context: modexp.yul

Description:

/// @notice Computes `lhs - (rhs + borrow)`, returning the result along with the new borrow.
/// @dev The result is stored from `resPtr` to `resPtr + (WORD_SIZE * nLimbs)`.
bigUIntSBB(lhsPtr, rhsPtr, borrowPtr, nLimbs, resPtr)

lhsPtr: The pointer to the MSB of the left operand.
rhsPtr: The pointer to the MSB of the right operand.
borrowPtr: The pointer to the MSB of the borrow.
nLimbs: The number of limbs needed to represent the operands.
resPtr: The pointer to where you want the result to be stored.

Recommendation:

Follow this implementation.
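
A possible shape of subtraction with borrow, sketched in Python under simplifying assumptions: 256-bit limbs stored most significant first, equal-length operands, and a single-bit incoming borrow instead of a borrow pointer. The names are hypothetical and not taken from the referenced implementation.

```python
WORD_BITS = 256
WORD_MASK = (1 << WORD_BITS) - 1

def to_limbs(value: int, n_limbs: int):
    """Split an integer into n_limbs 256-bit limbs, most significant first."""
    return [(value >> (WORD_BITS * (n_limbs - 1 - i))) & WORD_MASK for i in range(n_limbs)]

def from_limbs(limbs) -> int:
    """Recombine big-endian limbs into a Python integer."""
    value = 0
    for limb in limbs:
        value = (value << WORD_BITS) | limb
    return value

def big_uint_sub_with_borrow(lhs, rhs, borrow_in: int = 0):
    """Compute lhs - (rhs + borrow_in) limb by limb; returns (limbs, borrow_out)."""
    assert len(lhs) == len(rhs) and borrow_in in (0, 1)
    result = [0] * len(lhs)
    borrow = borrow_in
    # Walk from the least significant limb (last index) upwards.
    for i in reversed(range(len(lhs))):
        diff = lhs[i] - rhs[i] - borrow
        result[i] = diff & WORD_MASK      # wraps modulo 2**256 when negative
        borrow = 1 if diff < 0 else 0
    return result, borrow
```

A final borrow of 1 means the true result was negative and the limbs hold it modulo 2^(256 * nLimbs).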

Implement Big UInt Bitwise Or for `modexp`

Context: modexp.yul

Description:

/// @notice Performs the big unsigned integer bit or operation.
/// @dev The result is stored from `resPtr` to `resPtr + (WORD_SIZE * nLimbs)`.
bigUIntBitOr(lhsPtr, rhsPtr, nLimbs, resPtr)

lhsPtr: The pointer to the MSB of the left operand.
rhsPtr: The pointer to the MSB of the right operand.
nLimbs: The number of limbs needed to represent the operands.
resPtr: The pointer to where you want the result to be stored.

Recommendation:

Follow this implementation.

In the point multiplication loop the point Q can never be point-at-infinity

Context: EcMul.yul#L476

Description:

In the point multiplication loop the input point is P, with affine coordinates (x, y) and projective coordinates (xp, yp, zp). P is not the point-at-infinity in the loop because this is checked earlier by if affinePointIsInfinity(x, y).

In the loop the point Q (xq, yq, zq) is doubled each iteration starting from P: P, 2P, 4P, 8P, ... i.e. 2ᵏP.

For 2ᵏP to be 0 (point-at-infinity), 2ᵏ would need to be a multiple of the group order (n), which is a prime number. This is impossible by the prime factorization theorem: the number 2ᵏ does not have n as a prime factor.

Recommendation:

The code related to qIsInfinity variable can be safely removed (see the diff below).
In two arbitrarily selected, non-trivial point multiplication tests this has changed gas used:

409744 to 410176 (!)
422338 to 422050

And reduced EcMul code size (calculated as number of lines in the assembly) from 1262 to 1254.

diff --git a/precompiles/EcMul.yul b/precompiles/EcMul.yul
index 1eb80df..9bb36b7 100644
--- a/precompiles/EcMul.yul
+++ b/precompiles/EcMul.yul
@@ -473,13 +473,8 @@ object "EcMul" {
             let zr := ZERO()
             for {} scalar {} {
                 if lsbIsOne(scalar) {
-                    let qIsInfinity := projectivePointIsInfinity(xq, yq, zq)
                     let rIsInfinity := projectivePointIsInfinity(xr, yr, zr)
-                    if and(rIsInfinity, qIsInfinity) {
-                        // Infinity + Infinity = Infinity
-                        break
-                    }
-                    if and(rIsInfinity, iszero(qIsInfinity)) {
+                    if rIsInfinity {
                         // Infinity + P = P
                         xr := xq
                         yr := yq
@@ -490,10 +485,6 @@ object "EcMul" {
                         scalar := shr(1, scalar)
                         continue
                     }
-                    if and(iszero(rIsInfinity), qIsInfinity) {
-                        // P + Infinity = P
-                        break
-                    }
                     if and(and(eq(xr, xq), eq(montgomerySub(ZERO(), yr), yq)), eq(zr, zq)) {
                         // P + (-P) = Infinity
                         xr := ZERO()

Incorrect representation of point-at-infinity in projective coordinates

Context: EcMul.yul#L500

Description:

In the point multiplication loop, when performing the addition R = R + Q for the case R = -Q, R should be set to the point-at-infinity. However, an invalid representation of the point-at-infinity in projective coordinates is used: (0, 0, 0). The correct representation of the point-at-infinity is the equivalence class of (0:1:0). See https://www.johndcook.com/blog/2019/02/24/the-point-at-infinity/ for a short description.

Recommendation:

Change yr := ZERO() to yr := MONTGOMERY_ONE() to get valid representation of the point-at-infinity and be consistent with the yr initialization value before the multiplication loop.

In practice any nonzero value can be used in place of MONTGOMERY_ONE() because the result remains equivalent to (0:1:0). Therefore, simply using 1 may save some code size.
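The equivalence-class argument above can be sketched in Rust over a toy prime field. The modulus 13 and the names `Projective`, `is_valid`, `is_infinity` and `same_point` are assumptions made for illustration only; they do not appear in EcMul.yul, which works over the BN254 base field in Montgomery form.

```rust
/// Toy prime modulus (assumption); coordinates are expected to be < P.
const P: u64 = 13;

/// A triple (X, Y, Z) meant to represent the projective point (X : Y : Z).
#[derive(Clone, Copy, Debug)]
struct Projective {
    x: u64,
    y: u64,
    z: u64,
}

/// (0, 0, 0) belongs to no equivalence class, so it is not a projective point.
fn is_valid(p: Projective) -> bool {
    !(p.x % P == 0 && p.y % P == 0 && p.z % P == 0)
}

/// The point-at-infinity is the class of (0 : 1 : 0): X = Z = 0 and Y != 0.
fn is_infinity(p: Projective) -> bool {
    p.x % P == 0 && p.z % P == 0 && p.y % P != 0
}

/// Two valid triples represent the same point iff they are proportional,
/// i.e. every 2x2 cross product vanishes modulo P.
fn same_point(a: Projective, b: Projective) -> bool {
    let eq = |l: u64, r: u64| l % P == r % P;
    is_valid(a)
        && is_valid(b)
        && eq(a.x * b.y, a.y * b.x)
        && eq(a.x * b.z, a.z * b.x)
        && eq(a.y * b.z, a.z * b.y)
}
```

With these definitions, (0, 1, 0) and (0, 5, 0) compare as the same point, while (0, 0, 0) is rejected as invalid.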

Optimize accumulated value

Context: EcPairing.yul

Description: We are currently naively multiplying two $F_{p^{12}}$ elements, which contain many zeros. Modifying this calculation could enhance efficiency.

Add gh-pages workflow

Unused functions in EcMul

Context: EcMul.yul#L331

Description:

The following functions are unused in the EcMul code:

THREE()
P_MINUS_ONE()
montgomeryModExp()
projectivePointIsOnCurve()

Recommendation:

Remove the functions from the code.

Implement Big UInt Right Shift for `modexp`

Context: modexp.yul

Description:

/// @notice Performs the big unsigned integer right shift (>>).
/// @dev The result is stored from `shiftedPtr` to `shiftedPtr + (WORD_SIZE * nLimbs)`.
bigUIntShr(numberPtr, nLimbs, shiftedPtr)

numberPtr: The pointer to the MSB of the number to shift.
nLimbs: The number of limbs needed to represent the number to shift.
shiftedPtr: The pointer to where you want the result to be stored.
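
A minimal sketch of the idea in Rust, not the Yul implementation. It assumes 64-bit limbs stored most-significant first (the Yul code would use 256-bit words in memory) and a hypothetical `shift` parameter smaller than the limb width, since the signature above does not say how the shift amount is passed.

```rust
/// Shifts a big unsigned integer right by `shift` bits, with 0 <= shift < 64.
/// Limbs are stored most-significant first, mirroring "pointer to the MSB".
fn big_uint_shr(number: &[u64], shift: u32) -> Vec<u64> {
    assert!(shift < 64, "this sketch only handles sub-limb shifts");
    let mut shifted = vec![0u64; number.len()];
    // Bits that fell out of the previous (more significant) limb.
    let mut carry = 0u64;
    for (i, &limb) in number.iter().enumerate() {
        if shift == 0 {
            shifted[i] = limb;
        } else {
            shifted[i] = (limb >> shift) | carry;
            carry = limb << (64 - shift);
        }
    }
    shifted
}
```

Bits shifted out of the least significant limb are discarded, so the result always has the same number of limbs as the input.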

Recommendation:

Follow this implementation.

`binaryExtendedEuclideanAlgorithm` needs clarification

Context: EcMul.yul#L121

Description:

This algorithm is a modification of Algorithm 3 MontInvbEEA from Montgomery inversion but it's not explained how it was modified.

This comment is not clear:

// Avoids unnecessary reduction step.
let b := R2_MOD_P()

It is a proper initialization of the b variable to the adjuster R^2 mod P, which modifies the algorithm to return the inverse of aR mod P, but the comment says something different.
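
One way to read this initialization, assuming the function receives $\bar{a} = aR \bmod P$ and is expected to return $a^{-1}R \bmod P$, and assuming the usual binary inversion invariant $\bar{a} \cdot b \equiv u \cdot b_{0} \pmod{P}$ where $b_{0}$ is the initial value of b and the loop ends with $u = 1$:

$$b_{0} = 1: \quad b_{\text{final}} \equiv \bar{a}^{-1} \equiv a^{-1}R^{-1} \pmod{P}$$

$$b_{0} = R^{2}: \quad b_{\text{final}} \equiv \bar{a}^{-1}R^{2} \equiv a^{-1}R \pmod{P}$$

Under these assumptions, starting from $R^{2} \bmod P$ yields the inverse already in Montgomery form, so no extra Montgomery multiplication is needed afterwards.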

Apart from the unclear doc comments, the algorithm is correct.

Recommendation:

Add proper reference to paper/document describing the algorithm.
It should be described where the algorithm was taken from and how it was modified.

Implement Big UInt Division (with remainder) for `modexp`

Context: modexp.yul

Description:

/// @notice Performs the division with remainder of big unsigned integers with an arbitrary amount of limbs.
/// @dev The quotient is stored from `quotientPtr` to `quotientPtr + (WORD_SIZE * nLimbs)`.
/// @dev The remainder is stored from `remainderPtr` to `remainderPtr + (WORD_SIZE * nLimbs)`.
bigUIntDivRem(dividendPtr, divisorPtr, nLimbs, quotientPtr, remainderPtr)

dividendPtr: The pointer to the MSB of the dividend.
divisorPtr: The pointer to the MSB of the divisor.
nLimbs: The number of limbs needed to represent the operands.
quotientPtr: The pointer to where you want the quotient to be stored.
remainderPtr: The pointer to where you want the remainder to be stored.
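
As a rough reference for the expected behaviour, here is a sketch of bit-by-bit (shift and subtract) long division in Rust. It is an assumption-laden illustration rather than the intended Yul algorithm: limbs are 64-bit and most-significant first, both operands have the same number of limbs, and a zero divisor is reported as `None`. All function names are hypothetical.

```rust
/// In-place a -= b modulo 2^(64 * len); the final borrow is discarded.
fn sub_in_place(a: &mut [u64], b: &[u64]) {
    let mut borrow = false;
    for i in (0..a.len()).rev() {
        let (d1, b1) = a[i].overflowing_sub(b[i]);
        let (d2, b2) = d1.overflowing_sub(borrow as u64);
        a[i] = d2;
        borrow = b1 || b2;
    }
}

/// In-place shift left by one bit, inserting `bit` as the new lowest bit.
fn shl1_in_place(a: &mut [u64], bit: u64) {
    let mut carry = bit;
    for i in (0..a.len()).rev() {
        let next = a[i] >> 63;
        a[i] = (a[i] << 1) | carry;
        carry = next;
    }
}

/// Returns (quotient, remainder), both with the same limb count as the inputs.
fn big_uint_div_rem(dividend: &[u64], divisor: &[u64]) -> Option<(Vec<u64>, Vec<u64>)> {
    let n = dividend.len();
    if divisor.len() != n || divisor.iter().all(|&l| l == 0) {
        return None;
    }
    let mut quotient = vec![0u64; n];
    let mut remainder = vec![0u64; n];
    for &limb in dividend {
        for bit in (0..64).rev() {
            // The shift below may push a set bit out of the top limb.
            let overflow = remainder[0] >> 63 == 1;
            shl1_in_place(&mut remainder, (limb >> bit) & 1);
            shl1_in_place(&mut quotient, 0);
            // MSB-first slices of equal length compare like the numbers they encode.
            if overflow || remainder.as_slice() >= divisor {
                sub_in_place(&mut remainder, divisor);
                quotient[n - 1] |= 1;
            }
        }
    }
    Some((quotient, remainder))
}
```

This processes one bit per iteration, so it is simple to check but slow; a production version would more likely divide limb by limb.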

Subtasks:

Documentation

Implement Big UInt Left Shift for `modexp`

Context: modexp.yul

Description:

/// @notice Performs the big unsigned integer left shift (<<).
/// @dev The result is stored from `shiftedPtr` to `shiftedPtr + (WORD_SIZE * nLimbs)`.
bigUIntShl(numberPtr, nLimbs, shiftedPtr)

numberPtr: The pointer to the MSB of the number to shift.
nLimbs: The number of limbs needed to represent the number to shift.
shiftedPtr: The pointer to where you want the result to be stored.
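
A minimal Rust sketch of the left shift, under the same kind of assumptions as any limb-based illustration: 64-bit limbs stored most-significant first instead of 256-bit Yul words, and a hypothetical `shift` parameter smaller than the limb width, because the signature above does not specify the shift amount.

```rust
/// Shifts a big unsigned integer left by `shift` bits, with 0 <= shift < 64.
/// Limbs are stored most-significant first; bits shifted past the top limb are lost.
fn big_uint_shl(number: &[u64], shift: u32) -> Vec<u64> {
    assert!(shift < 64, "this sketch only handles sub-limb shifts");
    let mut shifted = vec![0u64; number.len()];
    // Bits that fell out of the previous (less significant) limb.
    let mut carry = 0u64;
    for i in (0..number.len()).rev() {
        let limb = number[i];
        if shift == 0 {
            shifted[i] = limb;
        } else {
            shifted[i] = (limb << shift) | carry;
            carry = limb >> (64 - shift);
        }
    }
    shifted
}
```

Unlike the right shift, the traversal starts at the least significant limb so that carries propagate towards the MSB.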

Recommendation:

Follow this implementation.

Bug: Tests for unhappy path pass when the function returns an HTTPError

What?

We have some tests that assert that the function call results in a failure. This is great, but we aren't checking the exact type of error. As a result, these tests also pass when no Era test node is connected.

Requirements

Assert on the exact error type we want and not just any error returned by the function.
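
The difference between the two kinds of assertion can be sketched as follows. `CallError` and `fake_call` are hypothetical stand-ins invented for this illustration; the real `era_call` helper and its error type may look different.

```rust
/// Hypothetical error type distinguishing transport failures from reverts.
#[derive(Debug, PartialEq)]
enum CallError {
    /// The node could not be reached (e.g. nothing is listening on port 8011).
    Transport(String),
    /// The precompile ran and rejected the input: the failure the test targets.
    Reverted,
}

/// Hypothetical stand-in for a precompile call helper.
fn fake_call(node_is_up: bool, input_is_valid: bool) -> Result<Vec<u8>, CallError> {
    if !node_is_up {
        return Err(CallError::Transport("connection refused".into()));
    }
    if !input_is_valid {
        return Err(CallError::Reverted);
    }
    Ok(vec![0u8; 64])
}

/// Weak check: passes for any error, including an unreachable node.
fn weak_check(result: &Result<Vec<u8>, CallError>) -> bool {
    result.is_err()
}

/// Strict check: passes only for the specific error under test.
fn strict_check(result: &Result<Vec<u8>, CallError>) -> bool {
    matches!(result, Err(CallError::Reverted))
}
```

With the node down, `weak_check` still succeeds while `strict_check` fails, which is the behaviour the requirement asks for.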

Steps to reproduce

Ensure that the era test node isn't running (port 8011)
cargo test

Implement Shamir's trick for `P256VERIFY`

Context: P256VERIFY.yul

Description:

Using Shamir's trick, a sum of two scalar multiplications $u_{1}\times G+u_{2}\times Q_{A}$ can be calculated faster than two scalar multiplications done independently.
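
The trick can be sketched in Rust with a toy group. As an assumption for illustration, the sketch uses the multiplicative group modulo a small prime instead of the P-256 curve, so point doubling becomes squaring and point addition becomes multiplication; `M`, `mul_mod`, `pow_mod` and `shamir` are hypothetical names.

```rust
/// Toy prime modulus standing in for the curve group (assumption).
const M: u64 = 1_000_000_007;

fn mul_mod(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % M as u128) as u64
}

/// Plain square-and-multiply: g^e mod M.
fn pow_mod(g: u64, mut e: u64) -> u64 {
    let mut base = g % M;
    let mut acc = 1u64;
    while e > 0 {
        if e & 1 == 1 {
            acc = mul_mod(acc, base);
        }
        base = mul_mod(base, base);
        e >>= 1;
    }
    acc
}

/// Shamir's trick: g^a * h^b mod M using a single shared squaring chain.
fn shamir(g: u64, a: u64, h: u64, b: u64) -> u64 {
    let (g, h) = (g % M, h % M);
    // Precompute the combined element used when both bits are set.
    let gh = mul_mod(g, h);
    let mut acc = 1u64;
    for i in (0..64).rev() {
        acc = mul_mod(acc, acc);
        match ((a >> i) & 1, (b >> i) & 1) {
            (1, 1) => acc = mul_mod(acc, gh),
            (1, 0) => acc = mul_mod(acc, g),
            (0, 1) => acc = mul_mod(acc, h),
            _ => {}
        }
    }
    acc
}
```

`shamir(g, a, h, b)` agrees with `mul_mod(pow_mod(g, a), pow_mod(h, b))` while performing one squaring per bit instead of two.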

Recommendation:

Check this (page 27).

Optimize first iterations of Miller Loop

Context: EcPairing.yul

Description:

We can avoid unnecessary multiplications by handling the first iterations of the Miller loop separately.

The constant ZERO, ONE, TWO are used inconsistently, probably not needed

Context:
EcMul.yul#L11
EcMul.yul#L17
EcMul.yul#L23

Description:
These constants are not needed, and their usages can be replaced by literal values. If we want to keep them, they should be used consistently in the code, e.g.:
c := shr(1, c) -> c := shr(ONE(), c)
if mod(aux, 2) -> if mod(aux, TWO())

Recommendation:

Remove these functions and replace their usages with 0, 1 and 2 accordingly.

Make `modexp` consistent with the other precompiles

Context: ModExp.yul

Description: Remove constant functions ZERO(), ONE(), TWO().

Optimize ecAdd with the Montgomery form

`P + (-P)` special case in main EcMul loop is not needed

Context: EcMul.yul#L497

Description:
This case can be handled by the general point addition algorithm (P1 + P2), which returns a proper point-at-infinity (0, 1, 0). There is no need to handle it separately: the case is very rare, so the special case provides no significant optimization while making the contract code bigger. See also #62.
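
To see why a general addition can cover this case, consider the standard homogeneous projective addition formulas for short Weierstrass curves. This is an assumption for illustration; the formulas actually used in EcMul.yul may differ and may scale the result differently.

$$u = y_q z_r - y_r z_q, \qquad v = x_q z_r - x_r z_q, \qquad w = u^{2} z_r z_q - v^{3} - 2 v^{2} x_r z_q$$

$$x_3 = v w, \qquad y_3 = u\,(v^{2} x_r z_q - w) - v^{3} y_r z_q, \qquad z_3 = v^{3} z_r z_q$$

For the case matched by the removed branch, $x_q = x_r$, $y_q = -y_r$ and $z_q = z_r = z$, so $v = 0$, $u = -2 y_r z$ and $w = u^{2} z^{2}$, which gives:

$$x_3 = 0, \qquad y_3 = -u w = 8\, y_r^{3} z^{5}, \qquad z_3 = 0$$

Since the group has odd prime order there are no points with $y_r = 0$, so $y_3 \neq 0$ and the result lies in the class of $(0:1:0)$.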

Recommendation:
This case can be safely removed.
In two arbitrarily selected, non-trivial point multiplication tests this has changed gas used:

409744 to 406774
422338 to 419068

And reduced EcMul code size (calculated as number of lines in the assembly) from 1262 to 1104.

diff --git a/precompiles/EcMul.yul b/precompiles/EcMul.yul
index 1eb80df..4986c41 100644
--- a/precompiles/EcMul.yul
+++ b/precompiles/EcMul.yul
@@ -494,17 +494,6 @@ object "EcMul" {
                         // P + Infinity = P
                         break
                     }
-                    if and(and(eq(xr, xq), eq(montgomerySub(ZERO(), yr), yq)), eq(zr, zq)) {
-                        // P + (-P) = Infinity
-                        xr := ZERO()
-                        yr := ZERO()
-                        zr := ZERO()
-
-                        xq, yq, zq := projectiveDouble(xq, yq, zq)
-                        // Check next bit
-                        scalar := shr(1, scalar)
-                        continue
-                    }
                     if and(and(eq(xr, xq), eq(yr, yq)), eq(zr, zq)) {
                         // P + P = 2P
                         xr, yr, zr := projectiveDouble(xr, yr, zr)

Unnecessary use of switch case

Context: EcMul.yul#L324

Description:

The use of the switch case could be removed in projectiveIntoAffine.

Recommendation:

xr and yr are both 0 when declared, so setting them to 0 when zp is 0 is not necessary. We only need to assign them when the result differs from 0, which is the case when zp is nonzero.

function projectiveIntoAffine(xp, yp, zp) -> xr, yr {
    if zp {
        let zp_inv := montgomeryModularInverse(zp)
        xr := montgomeryMul(xp, zp_inv)
        yr := montgomeryMul(yp, zp_inv)
    }
}

Document `montgomeryAdd` and `montgomerySub`

Context: EcAdd.yul, EcMul.yul, EcPairing.yul

Description: Function documentation is missing.

Implement Big UInt Square for `modexp`

Context: modexp.yul

Description:

/// @notice Performs the big unsigned integer square of big unsigned integers with an arbitrary amount of limbs.
/// @dev The product is stored from `productPtr` to `productPtr + (WORD_SIZE * nLimbs)`.
bigUIntSqr(numberPtr, nLimbs, productPtr)

numberPtr: The pointer to the MSB of the big unsigned integer to square.
nLimbs: The number of limbs needed to represent the number to square.
productPtr: The pointer to where you want the product to be stored.
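
A schoolbook sketch of the operation in Rust, under loudly stated assumptions: limbs are 64-bit and most-significant first, and the function returns `2 * nLimbs` limbs so that the full square always fits, which may differ from the result size in the doc comment above. It is a plain multiplication of the number by itself and does not exploit the symmetry that a dedicated squaring routine could use.

```rust
/// Squares a big unsigned integer given as MSB-first 64-bit limbs.
/// The result has twice as many limbs as the input.
fn big_uint_sqr(number: &[u64]) -> Vec<u64> {
    let n = number.len();
    let mut product = vec![0u64; 2 * n];
    for i in (0..n).rev() {
        let mut carry: u128 = 0;
        for j in (0..n).rev() {
            // With MSB-first indexing, limb i times limb j lands at index i + j + 1.
            let idx = i + j + 1;
            let cur = product[idx] as u128
                + number[i] as u128 * number[j] as u128
                + carry;
            product[idx] = cur as u64;
            carry = cur >> 64;
        }
        // Index i has not been written yet, so the final carry can be stored directly.
        product[i] = carry as u64;
    }
    product
}
```

The accumulator `cur` cannot overflow `u128`, because the largest possible value is (2^64 - 1)^2 + 2 * (2^64 - 1) = 2^128 - 1.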

Recommendation:

Follow this implementation.

Create a template for the Issues

Template

**Context:**

**Description:**

**Recommendation:**

Notes

Context: Line/s of code or code file/s involved in the issue.
Description: Detailed description of the issue.
Recommendation: If you have an idea of how to solve it, it is welcome.

No test case for point coordinates from the outside of the field in EcMul

Context: EcMul.yul#L426

Description:
This case is not covered by any unit tests.

Recommendation:
Add proper test cases that pass a point with coordinates >= P():

diff --git a/tests/tests/ecmul_tests.rs b/tests/tests/ecmul_tests.rs
index 770467d..b629467 100644
--- a/tests/tests/ecmul_tests.rs
+++ b/tests/tests/ecmul_tests.rs
@@ -1636,6 +1636,17 @@ async fn ecmul_13bdaa10_1236dfd1_dbl5_28000_96() {
 	assert_eq!(eth_response, era_response, "");
 }
 
+#[tokio::test]
+async fn ecmul_invalid_point_P_2_28000_96() {
+	assert!(eth_call(ECMUL_PRECOMPILE_ADDRESS, None, Some(Bytes::from(hex::decode("30644E72E131A029B85045B68181585D97816A916871CA8D3C208C16D87CFD4700000000000000000000000000000000000000000000000000000000000000020000000000000000000000000000000000000000000000000000000000000002").unwrap()))).await.is_err());
+	assert!(era_call(ECMUL_PRECOMPILE_ADDRESS, None, Some(Bytes::from(hex::decode("30644E72E131A029B85045B68181585D97816A916871CA8D3C208C16D87CFD4700000000000000000000000000000000000000000000000000000000000000020000000000000000000000000000000000000000000000000000000000000002").unwrap()))).await.is_err());
+}
+
+#[tokio::test]
+async fn ecmul_invalid_point_P_plus_1__P_plus_2_28000_96() {
+	assert!(eth_call(ECMUL_PRECOMPILE_ADDRESS, None, Some(Bytes::from(hex::decode("30644E72E131A029B85045B68181585D97816A916871CA8D3C208C16D87CFD4830644E72E131A029B85045B68181585D97816A916871CA8D3C208C16D87CFD490000000000000000000000000000000000000000000000000000000000000002").unwrap()))).await.is_err());
+	assert!(era_call(ECMUL_PRECOMPILE_ADDRESS, None, Some(Bytes::from(hex::decode("30644E72E131A029B85045B68181585D97816A916871CA8D3C208C16D87CFD4830644E72E131A029B85045B68181585D97816A916871CA8D3C208C16D87CFD490000000000000000000000000000000000000000000000000000000000000002").unwrap()))).await.is_err());
+}
 
 #[tokio::test]
 async fn ecmul_ethereum_tests() {

G2 membership check using efficient endomorphism

Context: EcPairing.yul

Description:

An easy, but slow, way to check if $P ∈ E′(F_{p^{2}})[r]$ is to see if $[r]P = O$. This is undesirable since $r$ has 254 bits in the case of BN254.

Recommendation:

Check this paper.

`intoMontgomeryForm` does not have to calculate `mod`

Context: EcMul.yul#L219

Description:

The line EcMul.yul#L220 can be removed. intoMontgomeryForm is only called after the point coordinates have been checked to be valid field elements (EcMul.yul#L425).
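
The claim can be checked on a toy instance. The sketch below is an assumption for illustration only: it uses a 61-bit prime modulus and R = 2^64 instead of the BN254 prime and R = 2^256, and its function names are hypothetical Rust counterparts of the Yul helpers.

```rust
/// Toy odd modulus (assumption); the Montgomery radix is R = 2^64.
const P: u64 = (1 << 61) - 1;

/// Computes -P^{-1} mod 2^64 by Newton iteration (each step doubles the valid bits).
fn neg_p_inv() -> u64 {
    let mut inv: u64 = 1; // P * 1 == 1 (mod 2) because P is odd
    for _ in 0..6 {
        inv = inv.wrapping_mul(2u64.wrapping_sub(P.wrapping_mul(inv)));
    }
    inv.wrapping_neg()
}

/// R^2 mod P.
fn r2_mod_p() -> u64 {
    let r = (1u128 << 64) % P as u128;
    ((r * r) % P as u128) as u64
}

/// Montgomery reduction: returns t * R^{-1} mod P for t < P * R.
fn redc(t: u128) -> u64 {
    let m = (t as u64).wrapping_mul(neg_p_inv());
    let u = ((t + m as u128 * P as u128) >> 64) as u64;
    if u >= P { u - P } else { u }
}

/// Conversion with the extra reduction, as in the current code.
fn into_montgomery_with_mod(a: u64) -> u64 {
    redc((a % P) as u128 * r2_mod_p() as u128)
}

/// Conversion without the reduction, as proposed.
fn into_montgomery(a: u64) -> u64 {
    redc(a as u128 * r2_mod_p() as u128)
}
```

For every input already below `P`, both conversions return the same value, and `redc(into_montgomery(a) as u128)` recovers `a`.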

Recommendation:

Remove the EcMul.yul#L220 line and change the code of intoMontgomeryForm accordingly.

In two arbitrarily selected, non-trivial point multiplication tests this has changed gas used:

409744 to 409732
422338 to 422326

And reduced EcMul code size (calculated as number of lines in the assembly) from 1262 to 1260.

diff --git a/precompiles/EcMul.yul b/precompiles/EcMul.yul
index 1eb80df..d720473 100644
--- a/precompiles/EcMul.yul
+++ b/precompiles/EcMul.yul
@@ -217,9 +217,8 @@ object "EcMul" {
             /// @param a The field element to encode.
             /// @return ret The field element in Montgomery form.
             function intoMontgomeryForm(a) -> ret {
-                let temp := mod(a, P())
-                let hi := getHighestHalfOfMultiplication(temp, R2_MOD_P())
-                let lo := mul(temp, R2_MOD_P())
+                let hi := getHighestHalfOfMultiplication(a, R2_MOD_P())
+                let lo := mul(a, R2_MOD_P())
                 ret := REDC(lo, hi)
             }

It is assumed that the value 0 is stored in the first 64 bytes of the memory. We are not sure if this is right every time so we need to ensure that.

Recommendation:

Ensure that the value 0 is stored before returning

EcMul.yul#L380

if affinePointIsInfinity(x, y) {
    // Infinity * scalar = Infinity
    mstore(0x00, 0x00)
    mstore(0x20, 0x00)
    return(0x00, 0x40)
}

EcMul.yul#L393

if eq(scalar, 0) {
    // P * 0 = Infinity
    mstore(0x00, 0x00)
    mstore(0x20, 0x00)
    return(0x00, 0x40)
}
