prysmaticlabs / prysm

Go implementation of Ethereum proof of stake

Home Page: https://www.offchainlabs.com/

License: GNU General Public License v3.0

Go 92.94% Shell 0.63% Dockerfile 0.01% Starlark 6.07% Smarty 0.20% Batchfile 0.06% PowerShell 0.03% Solidity 0.06% C++ 0.01%
ethereum

prysm's Introduction

Prysm: An Ethereum Consensus Implementation Written in Go

Build status Go Report Card Consensus_Spec_Version 1.4.0 Execution_API_Version 1.0.0-beta.2 Discord GitPOAP Badge

This is the core repository for Prysm, a Golang implementation of the Ethereum Consensus specification, developed by Offchain Labs. See the Changelog for details of the latest releases and upcoming breaking changes.

Getting Started

A detailed set of installation and usage instructions as well as breakdowns of each individual component are available in the official documentation portal. If you still have questions, feel free to stop by our Discord.

Staking on Mainnet

To participate in staking, you can join the official eth2 launchpad. The launchpad is the only recommended way to become a validator on mainnet. You can explore validator rewards/penalties via Bitfly's block explorer: beaconcha.in, and follow the latest blocks added to the chain on beaconscan.

Contributing

Branches

Prysm maintains two permanent branches:

  • master: This points to the latest stable release. It is ideal for most users.
  • develop: This is used for development and contains the latest PRs. Developers should base their PRs on this branch.

Guide

Want to get involved? Check out our Contribution Guide to learn more!

License

GNU General Public License v3.0

Legal Disclaimer

Terms of Use

prysm's Issues

Sharding-Enabled Merkle Trie Design Choices

Hi all,

This issue will explore all of the latest talk around what Merkle tree scheme we will follow as part of our sharding implementation. As #100 is being worked on, it is important to discuss the state trie design as it is beyond the scope of that PR. Sharding gives us a unique opportunity to optimize Ethereum's state tries in ways that were not possible before.

In its current form, Ethereum's trie has two layers. That is, there is a global accounts trie where each account has a corresponding storage trie. In a system with a sharded state, the inefficiencies behind this current scheme become immediately apparent as cross-shard transactions would have too many dependencies.

In addition, the data availability problem becomes a major concern when exploring stateless client mechanisms in a sharded environment. We want our trees to have built-in mechanisms for fraud proofs and data validity checks that are efficient.

Finally, we want to be storage-efficient with our trie implementation. We maintain that storing the trie representation should have as small a footprint as possible on sharding clients.

Storage Efficiency

To increase storage efficiency and create a single-layer trie instead of the current two layers (an accounts trie plus a storage trie for each account), a solid approach would be to use a Binary Tree as proposed by Dfinity, as it creates much smaller proofs.

The context of this approach was first explored in A Two Layer Account Trie Inside a Single Layer Trie on ETHResearch.

Vitalik's implementation in Python
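
As a first step toward the Go benchmarks suggested below, here is a minimal sketch of a binary Merkle root computation, assuming go-ethereum's crypto package; the function name and leaf handling are placeholders for experimentation, not the final trie design.

package trie

import "github.com/ethereum/go-ethereum/crypto"

// merkleRoot computes the root of a binary Merkle tree over the given
// leaves, duplicating the last node whenever a level has odd length.
func merkleRoot(leaves [][]byte) []byte {
	if len(leaves) == 0 {
		return crypto.Keccak256(nil)
	}
	level := leaves
	for len(level) > 1 {
		if len(level)%2 == 1 {
			level = append(level, level[len(level)-1])
		}
		next := make([][]byte, 0, len(level)/2)
		for i := 0; i < len(level); i += 2 {
			// Each parent commits to the concatenation of its two children.
			next = append(next, crypto.Keccak256(level[i], level[i+1]))
		}
		level = next
	}
	return level[0]
}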

Data Availability

See: Data-Availability Friendly Tries and Detailed Analysis of Stateless Client Witness Size.

Another possible approach to a modified Merkle tree that is friendly towards data availability proofs is a Sparse Merkle Tree construction. This approach would save what we call "intermediate state roots". This means that "a full node can now easily generate efficient fraud proofs for invalid state roots, because they only have to prove that one intermediate root is invalid for the whole block to be invalid." (ETHResearch Post) and would give the system massive efficiency gains.
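
For intuition on why the sparse construction is tractable: every empty subtree at a given depth hashes to the same constant, so proofs only need the non-default siblings. A minimal sketch of precomputing those defaults, again assuming go-ethereum's crypto package (names are illustrative):

// defaultHashes[i] is the root of an empty subtree of height i. With these
// precomputed, a proof over an astronomically large key space only needs
// the non-default siblings, which is what makes the tree "sparse".
func defaultHashes(depth int) [][]byte {
	hashes := make([][]byte, depth+1)
	hashes[0] = crypto.Keccak256(nil) // hash of an empty leaf
	for i := 1; i <= depth; i++ {
		hashes[i] = crypto.Keccak256(hashes[i-1], hashes[i-1])
	}
	return hashes
}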

Another approach is to use Reed-Solomon reparation schemes as our tree implementation, but this is a bit fuzzier and not talked about as much. It was proposed in the following ETHResearch post.

Potential Pull Request

A potential PR could be to just start with a binary tree similar to what Vitalik was creating in the Python implementation and test out some basic benchmarks in Go. As this issue is fleshed out more, we can discuss what other scheme we can follow next, including sparse Merkle trees.

Please let me know your thoughts, team.

Proposal Pool Implementation Notes

Let's use this issue to figure out the implementation details of the proposal pool. Proposers submit collations to the proposal pool, and validators subscribe to and fetch collations from it. Here is the functionality split up into multiple tasks:

  • API to inspect collations - list the details of all the collations currently pending/queued for inclusion in the next block
  • API for overall status - list the total number of collations currently pending/queued for inclusion in the next block
  • CLI flags - price limit / price bump / account slots / global slots

Let me know if I missed anything; we can start a discussion on this here.
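
To make the discussion concrete, here is a hypothetical sketch of the pool's surface in Go, assuming go-ethereum's event package for subscriptions; every name below is a placeholder for discussion, not a settled design.

// ProposalPool mediates between proposers, who submit collations, and
// validators, who subscribe to and fetch them.
type ProposalPool interface {
	// Submit queues a collation for inclusion in the next block.
	Submit(c *Collation) error
	// Pending lists details of all collations currently queued.
	Pending() []*Collation
	// Status returns the total number of queued collations.
	Status() int
	// Subscribe delivers newly queued collations to validators.
	Subscribe(ch chan<- *Collation) event.Subscription
}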

Running a notary always tries to deposit 1000 ETH

To reproduce:

Deploy the contract, deposit ETH as a notary, update config.go with the contract address, then run the client as a notary again without --deposit.

geth sharding-notary --datadir=/tmp/data --networkid=1337
INFO [05-13|11:15:55] Starting notary client
INFO [05-13|11:15:55] Joining notary pool
unable to deposit eth and become a notary: failed to estimate gas needed: gas required exceeds allowance or always failing transaction

SMC Changes for a minimal sharding protocol

The research team has identified a few new changes in "spec 1.1" that they are confident will not be removed in a future specification.

High level requirements (please check off these as PRs are merged):

  • Anyone can call addHeader(period_id, shard_id, chunks_root) at any time. The first header to get included for a given shard in a given period gets in, all others don’t. This function just emits a log.

  • For every combination of shard and period, N collators (now called “notaries”) are sampled. They try to download the collation body corresponding to any header that was submitted. They can call a function submitVote(period_id, shard_id, chunks_root). This function just emits a log.

  • Clients read logs. If a client sees that in some shard, for some period, a chunk has been included and >= 2N/3 notaries voted for it, it accepts it as part of the canonical chain.
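
A client-side sketch of the acceptance rule in the last point, assuming the vote tally and sample size have already been read from SMC logs; the helper name is hypothetical.

// accepted reports whether a chunk for a given shard and period has
// reached the 2N/3 notary vote threshold and can be treated as canonical.
func accepted(votes, sampledNotaries int) bool {
	// Integer form of votes >= 2N/3, avoiding floating point edge cases.
	return 3*votes >= 2*sampledNotaries
}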

Assigning to SMC expert @enriquefynn for now.
We should consider writing smaller tasks to achieve the above requirements unless this can be completed in 1 or 2 medium-sized pull requests.

References:

Modify EVM for Sharding

The EVM needs to be modified in order for sharding to work. Unfortunately, documentation is still in progress, so I'm creating this issue to centralize the documentation and discussions for the work-in-progress implementation in Geth.

List of useful links:

TODO:
Opcodes (ordered by complexity):

  • SIGHASH (Calculate transaction hash w/o signature)
  • CREATE2 (CREATE w/ salt argument)
  • PAYGAS (Refunds unused gas via a temporary account)
  • BREAKPOINT (Revert transaction up to a certain point)

Revamp the Sharding Client Entry Points / Architecture

Hi all,

This is an issue that expands upon #122 to restructure our sharding client effectively. We need to leverage the first-class concurrency Go offers and allow for more modularity in the services attached to our running clients.

This is a very big topic requiring extensive discussion and design, so I propose a simple PR to get things started.

Requirements

  • Refactor the entry point of the sharding command to instead take --nodetype="proposer" or --nodetype="notary" as a cli flag
  • Main entry point will launch a startShardingClient option that does the following:
    • Sets up all the basic config options for a sharding client in a simple and concise manner
    • Registers all services required by the sharding client, similar to how RegisterEthService does so in go-ethereum/cmd/utils/flags.go, depending on the command line flag: in this case, proposer or notary
  • Setup Notary and Proposer as implementations of a Service interface that satisfy certain methods such as .Start() and .Stop().
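
A minimal sketch of what that interface could look like, mirroring node.Service in go-ethereum; the names are illustrative, not settled.

// Service is the lifecycle contract every sharding actor implements so
// the client can manage it generically.
type Service interface {
	// Start spawns any goroutines the service requires.
	Start() error
	// Stop terminates the service's goroutines, blocking until they exit.
	Stop() error
}

// Compile-time checks that both actors satisfy the interface.
var (
	_ Service = (*notary.Notary)(nil)
	_ Service = (*proposer.Proposer)(nil)
)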

I can take hold of this PR and I'll keep it simple.

As discussed in #122, this approach would allow the sharding client instance to manage the lifecycle of its services without needing to be aware of how they function under the hood.

Once these requirements are done, we can wrap up this issue. Then, we can begin exploring the Notary and Proposer service implementations in greater detail in separate issues and PRs, analyzing the event loops they will require as well as their p2p requirements.

Let me know your thoughts.

Add gitcop and similar restrictions as go-ethereum

We're already facing a huge problem with go-ethereum coding guidelines

Please make sure your contributions adhere to our coding guidelines:

Code must adhere to the official Go formatting guidelines (i.e. uses gofmt).
Code must be documented adhering to the official Go commentary guidelines.
Pull requests need to be based on and opened against the master branch.
Commit messages should be prefixed with the package(s) they modify.
E.g. "eth, rpc: make trace configs optional"

We'll need to retroactively conform to these guidelines as well, and it likely won't be very fun, but this should be done sooner rather than later.

Sharding P2P Protocol Specs Proposition

Hi Team,

As part of this issue I would like to propose we use libp2p as our sharding networking protocol. It is a solid library created by Protocol Labs and currently powering IPFS. Let's use this issue to elaborate on this as much as we can and discuss if we want to integrate this.

Let's figure out how different it is from devp2p and if it would be worth our while to try this approach.

Here are some of the resources to get started:

libp2p site
libp2p specs
go libp2p
js libp2p

distinction between p2p libraries
libp2p video by juan benet
devp2p wiki

Let me know your thoughts below.

Setup a Simple libp2p Echo Server as a Service Attached to Sharding Node

Hi all,

As a simple way to integrate libp2p and begin our experimentation with it in our repo, we can start by setting up a bootnode that serves a simple echo server, and attach another libp2p node to our sharding node as a Service (see #127). Let's keep this constrained to a package called p2p under the sharding package in our repo.

We can register this echo service in the registerShardingServices func in cmd/geth/shardingcmd.go:

// registerShardingServices sets up either a notary or proposer
// sharding service dependent on the ClientType cli flag. We should be defining
// the services we want to register here, as this is the geth command entry point
// for sharding.
func registerShardingServices(n node.Node) error {
	protocolFlag := n.Context().GlobalString(utils.ProtocolFlag.Name)

	err := n.Register(func(ctx *cli.Context) (sharding.Service, error) {
		// TODO(terenc3t): handle case when we just want to start an observer node.
		if protocolFlag == "notary" {
			return notary.NewNotary(n.Context(), n)
		}
		return proposer.NewProposer(n.Context(), n)
	})

	if err != nil {
		return fmt.Errorf("failed to register the main sharding services: %v", err)
	}

	// TODO: registers the shardp2p service.
	// we can do n.Register and initialize a shardp2p.NewServer() or something like that.
	return nil
}

Then, in the submit collation function of the notary package, as the notary protocol will have access to the sharding node, we can make an echo request using this shardp2p service and inspect the output.

What This Achieves

This would at least allow us to get our hands dirty with libp2p and see how we can write useful functions we can call throughout our notary or proposer protocols. This code will eventually evolve into registering another service in the sharding node for discovery, and more.
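
For reference, a minimal echo handler sketch assuming the go-libp2p API; the protocol ID, package placement, and function name are placeholders.

package p2p

import (
	"io"

	"github.com/libp2p/go-libp2p"
	"github.com/libp2p/go-libp2p/core/network"
)

// StartEchoServer spins up a libp2p host that echoes back any bytes a
// dialing peer writes on the echo protocol stream.
func StartEchoServer() error {
	host, err := libp2p.New()
	if err != nil {
		return err
	}
	host.SetStreamHandler("/shard/echo/1.0.0", func(s network.Stream) {
		defer s.Close()
		// Copy the stream back onto itself to echo the payload.
		io.Copy(s, s)
	})
	return nil
}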

Proposer design for phase 1

Here are the functions the proposer needs to implement for the proposal/collator game in phase 1 sharding. Feel free to edit/add anything I missed:

  • Process transactions (dummy data) into collation bodies and collation headers
  • Propose collation header or collation body to collator
  • Check for commitment from collator
  • Challenge scheme to challenge collator who doesn't download collation body

Collator Client Modifications

Hi All,

As we have separated state execution and consensus into collators and proposers, respectively, we'll need to create new PRs to split up the work effectively.

I propose the following potential PRs:

  • Once a collator is selected, he/she needs to fetch collations from the proposals pool and take the one with the highest payout.
  • Collator needs to countersign the selected collation and request the body from the proposer
  • Collator needs to add the collation header to the SMC and then receive a payout from the proposer's deposit

Let me know if I missed anything here.

Reduce References to cli.Context of the Sharding Node in its Services

Hi all,

As pointed out by @prestonvanloon in the comments in #127, we have a bunch of places where we reference the sharding node's .Context(). It would be best if we simplified this by just passing in the values the services need: instead of calling .Context() to fetch a cli flag, we could pass in that flag directly. This makes it clear to the reader which cli args each service needs.

If anyone wants to learn more or understand the context of this discussion, reference this issue over at our Gitter channel.

--deposit tries to deposit as a notary when already deposited

This flag should really check whether we are already deposited, because running into this error is unpleasant.

To reproduce:
Deploy the contract, deposit ETH as a notary, update config.go with the contract address, then try to deposit again.

geth sharding-notary --deposit --datadir=/tmp/data --networkid=1337
INFO [05-13|11:13:32] Starting notary client
INFO [05-13|11:13:33] Joining notary pool
unable to deposit eth and become a notary: failed to estimate gas needed: gas required exceeds allowance or always failing transaction
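
One way to fix both this issue and the previous one would be a guard before depositing. A rough sketch, where IsNotaryDeposited stands in for whatever getter the SMC bindings end up exposing; all names here are hypothetical.

// joinNotaryPool deposits only if the account is not already registered.
func (c *notaryClient) joinNotaryPool() error {
	registered, err := c.smc.IsNotaryDeposited(&bind.CallOpts{}, c.account.Address)
	if err != nil {
		return fmt.Errorf("unable to check deposit status: %v", err)
	}
	if registered {
		log.Info("Account already deposited as a notary, skipping deposit")
		return nil
	}
	return c.deposit()
}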

Transaction simulator tool

Hey guys,

We'll need a transaction simulator tool that starts when the sharding client is launched so that the client has transactions to process into collations on the shards. Ideally, we can turn this into an independent cli tool that we can launch while our geth node is running.

Allow user to specify whether he/she wants to become a validator upon client start

Currently, the client will automatically deposit 100 ETH from the user's account into the VMC validator set. In the future, we need to allow users to specify whether or not they want to take this action, perhaps via a cli flag upon launching the sharding client.

geth shard --joinvalidatorset

will deposit the 100 ETH automatically into the VMC.

geth shard

will start the client and not deposit the ETH automatically.
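
A sketch of the opt-in flag using the urfave/cli package geth already vendors; the flag definition and the Deposit helper are illustrative, not existing code.

var JoinValidatorSetFlag = cli.BoolFlag{
	Name:  "joinvalidatorset",
	Usage: "Deposit 100 ETH into the VMC validator set on client start",
}

// At startup, only deposit when the user explicitly opted in.
if ctx.GlobalBool(JoinValidatorSetFlag.Name) {
	if err := client.Deposit(); err != nil {
		return err
	}
}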

Explain how we will forgo certain security considerations for purposes of the Ruby Release

Hi team,

As discussed earlier, we will need to explain to the community how we will forgo certain security practices for the purposes of the Ruby release (e.g. slashing, 2/3 signing from validators when supernodes validate collations, etc.).

This can be a specific section we talk about in the README, or we can create a Ruby Release README that has all the components necessary to run the first demo version of sharding.

CollatorClient interface: Refactor Account()

It's not a great developer experience to handle an error every time we need to access the account for validation. If the account can be unlocked at the time the sharding client is created, then Account() can simply return the already-unlocked account, and we should refactor the interface from
https://github.com/prysmaticlabs/geth-sharding/blob/972509a8c2d3934cb6711e467138ac471b43b344/sharding/collator.go#L17
to

type CollatorClient interface {
   Account() *accounts.Account
   ...
}

SMC missing notary deposit field

The notary's deposit balance is missing from the notary struct in the SMC. We should add it so we can track the deposit balance, since there will be slashing conditions for the notaries.

What to fix:

  1. Notary struct should track notary deposit balance
  2. When releasing a notary, send the current balance instead of the constant NOTARY_DEPOSIT

A similar issue was opened in py-evm.

Design Spec for a Sharding Visualization Front-End

The goal of this potential project is to create a web front-end that visualizes a sharded Ethereum blockchain network. Could be built as an extension to ethstats or as its own, independent interface.

Requirements / Acceptance Criteria

  • Ability to inspect transaction load on n number of shards
  • Ability to visualize cross-shard interactions
  • Ability to see number of nodes and distribution of nodes across shards
  • See collations happening in each period for each shard
  • Ability to inspect the size of canonical shard chains

Cool to Have

  • Some way of visualizing the actual p2p network topology

Alternative

Create a command line tool to visualize a sharded network. Can be baked into a sharding client implementation and could be a combination of ASCII art and text to visualize the different information pertaining to each shard.

This is a long, multi-step project, so at this point we are only looking for a design spec with possible visuals of how the front-end would look or how ethstats can be extended to accommodate these requirements.

The necessary reading to understand the context of the requirements and basic terms around sharding can be found in our project's README. Many sections, however, are deprecated while a minimal sharding protocol is still being researched.

We see this as a good first bounty for Prysmatic Labs.

Modify Sharding Management Contract to Align With New Phase 1

SMC needs to make the following modifications for Proposer

  • add Proposer struct and mapping (similar to the Validator struct implementation)
  • functions to deposit and withdraw
  • functions to verify Proposer's signature, deduct Proposer's bid from deposit upon receiving collation header

Let me know if I missed anything.

Start implementing Collation interface/structs

We need to start creating an interface for the Collation type and brainstorm the different methods we will require to interact with it.

When fetching information from the txpool and adding transactions into collations, we need to keep track of gas used, so a low-hanging fruit would be to start working on that functionality.

Specifically, I'm referring to the python pseudocode specified by the sharding spec:

# Sort by descending order of gasprice
txpool = sorted(copy(available_transactions), key=-tx.gasprice)
collation = new Collation(...)
while len(txpool) > 0:
    # Remove txs that ask for too much gas
    i = 0
    while i < len(txpool):
        if txpool[i].startgas > GASLIMIT - collation.gasused:
            txpool.pop(i)
        else:
            i += 1
    tx = copy.deepcopy(txpool[0])
    tx.witness = UPDATE_WITNESS(tx.witness, recent_trie_nodes_db)
    # Try to add the transaction, discard if it fails
    success, reads, writes = add_transaction(collation, tx)
    recent_trie_nodes_db = union(recent_trie_nodes_db, writes)
    txpool.pop(0)
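
As a starting point for that gas-tracking functionality, here is a rough Go translation of the loop above, assuming go-ethereum's core/types transactions; the function name and gas-limit parameter are illustrative.

import (
	"sort"

	"github.com/ethereum/go-ethereum/core/types"
)

// fillCollation greedily packs the highest-gasprice transactions that
// still fit under the collation gas limit, tracking gas used as we go.
func fillCollation(txpool []*types.Transaction, gasLimit uint64) []*types.Transaction {
	// Sort by descending order of gas price.
	sort.Slice(txpool, func(i, j int) bool {
		return txpool[i].GasPrice().Cmp(txpool[j].GasPrice()) > 0
	})
	var included []*types.Transaction
	var gasUsed uint64
	for _, tx := range txpool {
		// Skip txs that ask for more gas than remains in the collation.
		if tx.Gas() > gasLimit-gasUsed {
			continue
		}
		included = append(included, tx)
		gasUsed += tx.Gas()
	}
	return included
}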

Collator design for phase 1

Here are the functions collator needs to implement for proposal/collator game in phase 1 Sharding. Feel free to edit/add anything I missed

  • Fetch for incoming block headers
  • Check for eligible collator from SMC
  • Sort proposals by highest bid
  • Commitment scheme to proposers for the proposals it has looked at
  • Challenge response scheme to answer proposer's challenge
  • Windback scheme to get longest/valid chain
  • Add collation header to SMC

Header with GNU/GPL license text

Opening this issue to discuss the necessity of putting a license notice in our file headers. The license notice makes the header of each file cluttered, and we should have a sweeping license in the root of the project instead. Let's evaluate our options to avoid possible legal complications in the future.

Rename Terms Across Repository to Match New Glossary

Hi Everyone,

As recently mentioned on Gitter, we'll need to revise our old terminology spread out across our projects, pull requests, issues, code, and documentation to match the new glossary. The terms we have to change are the following:

  • collators (old glossary term is validator) are agents selected by the smart contract to reach consensus on collations that will be added to the mainchain via the contract.
  • proposers are in charge of packaging transaction data into collations along with an ETH deposit that will then be "sold" to collators.
  • sharding manager contract is the new name for the validator manager contract. We will be managing both collators and proposers through this contract. We store proposers here because we need a place where they can lockup the ETH deposit that comes with their collation headers and automatically transfer to collators upon addition of the collation into a shard.
  • collator pool is the new name for the validator set.

We also need to ensure that

  • canonical chain: The canonical collation chain of a shard, i.e. the longest chain with available collation bodies

refers to the above. In our documentation, we use canonical chain to refer to the Ethereum main chain.

Additionally, we need to mention that we will be managing proposers through the SMC (sharding manager contract) in addition to collators, as this is an important part of the phase 1 modifications that will go public soon.

Clients Can Join the Network, Sync, and Propagate Messages

This is a mega issue and might need to be broken up into smaller tasks.

  • Clients can join the network
  • Clients can sync collations
  • Clients propagate messages
  • Clients are selective about their peers (torus network?)
  • p2p messages are defined

TODO: Add more tasks to the list above and/or break into smaller issues.

We had some discussion in #72, but bumped it to Sapphire. I still agree with this since the purpose of this issue is to have some minimal and mostly local example of syncing with actors communicating over some network. A full implementation for Sapphire includes peer discovery and many other tasks.

Use Blob Serialization to Construct Collation Instances

We can resolve some TODOs by taking advantage of the blob serialization algorithm created in #92 by @nisdas. When creating a new collation object, given access to the collation's byte-serialized body, we can deserialize that body into a slice of transactions that we can pass into the NewCollation function.

An example of this is in shard.go, but there are other similar TODOs.

// CollationByHash fetches full collation.
func (s *Shard) CollationByHash(headerHash *common.Hash) (*Collation, error) {
	header, err := s.HeaderByHash(headerHash)
	if err != nil {
		return nil, fmt.Errorf("cannot fetch header by hash: %v", err)
	}

	body, err := s.BodyByChunkRoot(header.ChunkRoot())
	if err != nil {
		return nil, fmt.Errorf("cannot fetch body by chunk root: %v", err)
	}
	// TODO: deserialize the body into a txs slice instead of using
	// nil as the third arg to NewCollation.
	col := NewCollation(header, body, nil)
	return col, nil
}

Another one occurs when saving the collation's body:

// SaveBody adds the collation body to the shardDB and sets availability.
func (s *Shard) SaveBody(body []byte) error {
	// TODO: check if body is empty and throw error.
	// TODO: dependent on blob serialization.
	// right now we will just take the raw keccak256 of the body until #92 is merged.
	chunkRoot := common.BytesToHash(body)
	s.SetAvailability(&chunkRoot, true)
	return s.shardDB.Put(chunkRoot, body)
}

Refactor Overall Test Structure

Context

To efficiently maintain all our tests, I am creating this issue to discuss the state of our current tests and how we can refactor them. There's a lot of reusable test code that can be ported to a test helper/util file. We can then re-use the test helper functions to avoid repeated code across tests.

Current test structure:

  • contracts
    • sharding_manager_test.go
  • notary
    • notary_test.go
  • proposer
    • proposer_test.go
  • collation_test.go
  • config_test.go

What I propose:

  • contracts
    • contract_tests
      • notary_registration_test.go
      • notary_vote_test.go
      • proposer_add_header_test.go
  • notary
    • notary_test.go
  • proposer
    • proposer_test.go
  • collation
    • collation_test.go
  • config_test.go
  • test_utils
    • test_config.go
    • test_library.go

Test version of config

Similar to config.go, we should have a test version of the config named test_config.go. The test config will have shorter notary and proposer lockup periods. The current notary lockup period is 16128 periods; deregistering a notary takes 30 seconds with the simulated backend, which won't scale for testing multiple notaries.

Test library

The test library has the following functions

// register 10 notaries, start 0 end 9
func (s *SMC) Batch_register_notary(start int, end int) error {}
// deregister the last 5 notaries, start 4 end 9
func (s *SMC) Batch_deregister_notary(start int, end int) error {}
// fast forward 100 periods
func (s *SMC) Fast_forward_period(p int) error {}
// add header from shard 0 to shard 100, start 0 end 99
func (s *SMC) Batch_add_header(start int, end int) error {}
// notary 0-9 votes on shard 10, start 0, end 9, shardId 10
func (s *SMC) Batch_submit_vote(start int, end int, shardId int) error {}
// Hash returns the hash of a collation's entire contents
func (c *Collation) Hash() (hash common.Hash) {}

Optimize collation serializer and deserializer functions

We can further optimize the collation serializer and deserializer functions as follows:

  • Serializer function should be pure; it should not depend on a collation pointer receiver
  • collation.Serialize() can be deleted because we can get the same data from collation.Body()
  • collation.Deserialize() can be deleted because we can get the same data from collation.Transactions()

I can take on this issue because it's currently blocking #111

Proposer Client Implementation Notes

After the core team meeting on the 24th of March, and to align with the updated sharding spec, the proposer client implementation will be as follows:

In phase 1 of our implementation there will be no state execution, no P2P wire protocol will be implemented, and everything will work through the local file system. The collations will consist of blobs of random data rather than actual transactions. Updated goals to be accomplished for the proposer client:

  • Create a basic interface for the proposer, which will be able to interact with the main shard over JSON-RPC
  • The proposer client should be able to use the local filesystem to process arbitrary data blobs into collation bodies and collation headers
  • Then propose the collation header to the collator to be added to the shard chain.

Create a Shard Interface/Struct With Useful Receiver Methods in Client Package

Hey all,

To proceed with the shard local state storage, I am creating this issue and a PR shortly to include a Shard interface/struct in our client package. This interface will have a lot of useful receiver methods for checking availability, interfacing with the shardDB backend, and fetching collations/headers from within a certain shard.

Context

This was worked on in Py-EVM ethereum/py-evm#570 and would also serve as a useful wrapper class to thin out the code we would otherwise have to write for notaries/proposers.

IMO we need this in order to proceed, as the next steps would be to define the shardDB as perhaps a sparse merkle tree wrapped around levelDB or badgerDB.

This is being worked on in #100.
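
For concreteness, a rough sketch of the shape this could take; the backend interface is a placeholder until we settle on levelDB, badgerDB, or a sparse merkle tree over either.

// shardBackend abstracts the key-value store behind a shard.
type shardBackend interface {
	Get(k common.Hash) ([]byte, error)
	Put(k common.Hash, v []byte) error
	Has(k common.Hash) (bool, error)
	Delete(k common.Hash) error
}

// Shard wraps a shardDB with receiver methods for availability checks
// and collation/header fetching, as described above.
type Shard struct {
	shardDB shardBackend
	shardID *big.Int
}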

Exploring the Current Client Protocol Architecture

Hi all,

As more of our sharding client code is being created in our fork, it is critical to understand the design considerations of the current Ethereum nodes baked into go-ethereum. In particular, our notary/proposer clients need to be designed with good event loop management, pluggable services, and solid entry points for p2p functionality built in. As a case study, we will be looking at lightsync nodes as they are currently implemented in geth, understand their full responsibilities, and figure out the bigger picture behind the design considerations of their architecture.

The key question we will be asking ourselves is: what exactly happens when we start a light client? What are the design considerations that came into play when designing the code that gets the light client to work?

We will cap off this document by determining what aspects of the protocols in geth we can use as part of our sharding clients. We have an opportunity to write clean, straightforward code that does not have a massive number of file dependencies and complicated configs as geth currently does.

Let’s dive in.

Case Study: Light Client Nodes

Ethereum’s light client sync mode allows users to spin up a geth node that only downloads block headers and relies on merkle proofs to verify specific parts of the state tree as needed. Light peers are extremely commonplace and critical components in the Ethereum network today. Their architecture serves as a great starting point for anyone extending or redesigning geth in a secure, concurrent, and performant way.

Unfortunately, the current geth code is very hard to read, has a ton of dependencies across packages, and contains obscure configuration options. This doc will attempt to explain light client sync from start to finish, light node peer-to-peer networking, and other responsibilities of the protocol.

How is a Light Node Triggered?

Launching a geth light node is as easy as:

$ geth --syncmode="light"

Upon the command being executed, the main function within go-ethereum/cmd/geth/main.go runs as follows:

func main() {
  if err := app.Run(os.Args); err != nil {
    fmt.Fprintln(os.Stderr, err)
    os.Exit(1)
  }
}

This triggers the urfave/cli external package’s Run function, which will trigger the geth function a few lines below main().

func geth(ctx *cli.Context) error {
  node := makeFullNode(ctx)
  startNode(ctx, node)
  node.Wait()
  return nil
}

Based on the cli context, this function initializes a node instance, which is a critical entry point. Let’s take a look at how makeFullNode does this.

In go-ethereum/cmd/geth/config.go:

func makeFullNode(ctx *cli.Context) *node.Node {
  stack, cfg := makeConfigNode(ctx)

  utils.RegisterEthService(stack, &cfg.Eth)
  // a bunch of other services are configured below...
  // then it returns the node, which is a var called a "stack",
  // representing a protocol stack of the node (i.e. p2p services, rpc, etc.).
  return stack
}

Two important functions are at play here:

  • makeConfigNode returns a configuration object that uses the cli context to fetch relevant command line flags and returns a node instance + a configuration object instance.
  • utils.RegisterEthService is a function that, based on the command line flags from the context, will use configuration options to add a Service object to the node instance we just declared above. In this case, the cli context contains the --syncmode="light" flag that we will be using to set up a light client protocol instead of a full Ethereum node.

Let's see makeConfigNode in go-ethereum/cmd/geth/config.go:

func makeConfigNode(ctx *cli.Context) (*node.Node, gethConfig) {

  // Load defaults.
  cfg := gethConfig{
    Eth:       eth.DefaultConfig,
    Shh:       whisper.DefaultConfig,
    Node:      defaultNodeConfig(),
    Dashboard: dashboard.DefaultConfig,
  }

  // Load config file.
  if file := ctx.GlobalString(configFileFlag.Name); file != "" {
    if err := loadConfig(file, &cfg); err != nil {
      utils.Fatalf("%v", err)
    }
  }

  // Apply flags.
  utils.SetNodeConfig(ctx, &cfg.Node)
  stack, err := node.New(&cfg.Node)
  if err != nil {
    utils.Fatalf("Failed to create the protocol stack: %v", err)
  }
  utils.SetEthConfig(ctx, stack, &cfg.Eth)
  if ctx.GlobalIsSet(utils.EthStatsURLFlag.Name) {
    cfg.Ethstats.URL = ctx.GlobalString(utils.EthStatsURLFlag.Name)
  }

  utils.SetShhConfig(ctx, stack, &cfg.Shh)
  utils.SetDashboardConfig(ctx, &cfg.Dashboard)

  return stack, cfg

}

Cool, so this function just sets up some basic default configuration to start a node, including familiar options used across the Ethereum network:

var DefaultConfig = Config{
	SyncMode: downloader.FastSync,
	Ethash: ethash.Config{
		CacheDir:       "ethash",
		CachesInMem:    2,
		CachesOnDisk:   3,
		DatasetsInMem:  1,
		DatasetsOnDisk: 2,
	},
	NetworkId:     1,
	LightPeers:    100,
	DatabaseCache: 768,
	TrieCache:     256,
	TrieTimeout:   5 * time.Minute,
	GasPrice:      big.NewInt(18 * params.Shannon),

	TxPool: core.DefaultTxPoolConfig,
	GPO: gasprice.Config{
		Blocks:     20,
		Percentile: 60,
	},

}

The utils.SetEthConfig(ctx, stack, &cfg.Eth) line is what will modify the cfg option based on command line flags. In this case, if SyncMode is set to light, then the config is updated to reflect that flag. Then, we go into the actual code that initializes a Light Protocol instance and registers it as the node's ETH service.

In go-ethereum/cmd/utils/flags.go:

// RegisterEthService adds an Ethereum client to the stack.
func RegisterEthService(stack *node.Node, cfg *eth.Config) {

  var err error
  if cfg.SyncMode == downloader.LightSync {
    err = stack.Register(func(ctx *node.ServiceContext) (node.Service, error) {
      return les.New(ctx, cfg)
    })
  } else {
    err = stack.Register(func(ctx *node.ServiceContext) (node.Service, error) {
      fullNode, err := eth.New(ctx, cfg)
      if fullNode != nil && cfg.LightServ > 0 {
        ls, _ := les.NewLesServer(fullNode, cfg)
        fullNode.AddLesServer(ls)
      }
      return fullNode, err
    })
  }
  if err != nil {
    Fatalf("Failed to register the Ethereum service: %v", err)
  }

}

So here, if the config option for the downloader is set to LightSync, which was set in the makeConfigNode function we saw before, we register a Service object into the node (referred to as stack in the code above). Nodes contain an array of Service instances that all implement useful functions we will come back to later. In this case, the service is a LightEthereum instance that gives us all the functionality we need to run a light client.

How Do These Attached Services Start Running?

Here's where everything actually ties together. If you go back to the main function in go-ethereum/cmd/geth/main.go,

func geth(ctx *cli.Context) error {

  node := makeFullNode(ctx)
  startNode(ctx, node)
  node.Wait()
  return nil

}

the startNode func actually kicks things off.

// startNode boots up the system node and all registered protocols, after which
// it unlocks any requested accounts, and starts the RPC/IPC interfaces and the
// miner.
func startNode(ctx *cli.Context, stack *node.Node) {

  // Start up the node itself
  utils.StartNode(stack)

  // a lot of stuff below is related to wallet opening/closing events and setting up
  // full node mining functionality...
  ...
}

When we look at utils.StartNode in go-ethereum/cmd/utils/cmd.go:

func StartNode(stack *node.Node) {

  if err := stack.Start(); err != nil {
    Fatalf("Error starting protocol stack: %v", err)
  }

  // stuff below handles signal interrupts to stop the service...
  ...
}

...we see the actual code that starts off a node! Let's explore. In go-ethereum/node/node.go, a lot of things happen (simplified for readability):

func (n *Node) Start() error {

  n.lock.Lock()
  defer n.lock.Unlock()

  // Short circuit if the node's already running
  if n.server != nil {
    return ErrNodeRunning
  }
  if err := n.openDataDir(); err != nil {
    return err
  }

  // Initialize the p2p server. This creates the node key and
  // discovery databases.
  n.serverConfig = n.config.P2P
  n.serverConfig.PrivateKey = n.config.NodeKey()
  n.serverConfig.Name = n.config.NodeName()
  n.serverConfig.Logger = n.log

  // setting up more config stuff...
  ...

  // sets up a peer to peer server instance!
  running := &p2p.Server{Config: n.serverConfig}
  n.log.Info("Starting peer-to-peer node", "instance", n.serverConfig.Name)

  services := make(map[reflect.Type]Service)

  // serviceFuncs is an internal slice updated in a node whenever node.Register() is called!
  for _, constructor := range n.serviceFuncs {

    // Create a new context for the particular service
    ctx := &ServiceContext{
      config:         n.config,
      services:       make(map[reflect.Type]Service),
      EventMux:       n.eventmux,
      AccountManager: n.accman,
    }

    // does some stuff for threaded access...
    ...
   
    // Construct and save the service
    service, err := constructor(ctx)

    // sets up the service and adds it to the services slice defined above...
    ...

    // updates the services slice
    services[kind] = service
  }

  // this uses the .Protocols() property of each attached service (yes, LightEthereum has this defined)
  // and attaches it to the running p2p server instance.
  for _, service := range services {
    running.Protocols = append(running.Protocols, service.Protocols()...)
  }

  // this starts the p2p server!
  if err := running.Start(); err != nil {
    ...
  }
  // Start each of the services
  for kind, service := range services {
    // Start the next service, stopping all previous upon failure
    if err := service.Start(running); err != nil {
      ...
    }
  }

  // code below starts some RPC stuff and cleans up the node when it exits...

  return nil
}

Aha! So this is the function that iterates over each attached service and runs the .Start() function for each! The LightEthereum instance that was attached as a service to the node implements the Service interface that contains a .Start() function. This is how it all fits together!

The Light Ethereum Package

We will be focusing our attention on the go-ethereum/les package in this section, as this is the service that is attached to the running node upon launching a geth instance with the --syncmode="light" flag.

The light client needs to implement the Service interface defined in go-ethereum/node/service.go as follows:

type Service interface {

  // Protocols retrieves the P2P protocols the service wishes to start.
  Protocols() []p2p.Protocol

  // APIs retrieves the list of RPC descriptors the service provides.
  APIs() []rpc.API

  // Start is called after all services have been constructed and the networking
  // layer was also initialized to spawn any goroutines required by the service.
  Start(server *p2p.Server) error

  // Stop terminates all goroutines belonging to the service, blocking until they
  // are all terminated.
  Stop() error
  
}

The core of the entire light client is written in go-ethereum/les/backend.go. This is where we find the functions required to satisfy this Service interface, alongside the code that initializes an actual LightEthereum instance in a function called New.

func New(ctx *node.ServiceContext, config *eth.Config) (*LightEthereum, error) {
  
  // sets up the chainDB and genesis configuration for the light node...
  chainDb, err := eth.CreateDB(ctx, config, "lightchaindata")
  if err != nil {
    return nil, err
  }
  chainConfig, genesisHash, genesisErr := core.SetupGenesisBlock(chainDb, config.Genesis)
 
  ...

  log.Info("Initialised chain configuration", "config", chainConfig)

  leth := &LightEthereum{
    ...
  }

  // sets up a transaction relayer, a server pool, and info retrieval systems

  leth.relay = NewLesTxRelay(peers, leth.reqDist)
  leth.serverPool = newServerPool(chainDb, quitSync, &leth.wg)
  leth.retriever = newRetrieveManager(peers, leth.reqDist, leth.serverPool)
  
  ...

  // sets up the light tx pool
  leth.txPool = light.NewTxPool(leth.chainConfig, leth.blockchain, leth.relay)

  // sets up a protocol manager: we'll get into this shortly...
  if leth.protocolManager, err = NewProtocolManager(...); err != nil {
    return nil, err
  }

  // sets up the light ethereum APIs for RPC interactions
  leth.ApiBackend = &LesApiBackend{leth, nil}
 
  ...

  return leth, nil

}

Let's see what the light client's .Start() function does and how it sets up the p2p stack:

func (s *LightEthereum) Start(srvr *p2p.Server) error {

  ...

  log.Warn("Light client mode is an experimental feature")
  s.netRPCService = ethapi.NewPublicNetAPI(srvr, s.networkId)

  ...

  s.serverPool.start(srvr, lesTopic(s.blockchain.Genesis().Hash(), protocolVersion))
  ...
  return nil
  
}

Light Protocol Event Loop

The creation of the LightEthereum instance kicks off a bunch of goroutines, but where the actual sync and retrieval of state occurs is in the creation of a ProtocolManager in the New function.

In go-ethereum/les/handler.go, we see at the bottom of the NewProtocolManager function, code that runs some event loops:

if lightSync {
	manager.downloader = downloader.New(downloader.LightSync, chainDb, manager.eventMux, nil, blockchain, removePeer)
	manager.peers.notify((*downloaderPeerNotify)(manager))
	manager.fetcher = newLightFetcher(manager)
}

In this case, the manager starts a new downloader instance and a newLightFetcher, which work in tandem with the p2p layer to sync the state and respond to RPC requests that trigger events on peers, or to respond to incoming messages from peers.

The implementation diverges into a variety of files at this point, but an important aspect of the les package is the usage of on-demand requests, or ODRs. Through the p2p light server, nodes receive requests that are processed via goroutines, such as in the example below.

In go-ethereum/les/odr_requests.go:

func (r *TrieRequest) Validate(db ethdb.Database, msg *Msg) error {

  log.Debug("Validating trie proof", "root", r.Id.Root, "key", r.Key)

  switch msg.MsgType {
  case MsgProofsV1:
    proofs := msg.Obj.([]light.NodeList)
    if len(proofs) != 1 {
      return errInvalidEntryCount
    }
    nodeSet := proofs[0].NodeSet()
    // Verify the proof and store if checks out
    if _, err, _ := trie.VerifyProof(r.Id.Root, r.Key, nodeSet); err != nil {
      return fmt.Errorf("merkle proof verification failed: %v", err)
    }
    r.Proof = nodeSet
    return nil

  case MsgProofsV2:
    proofs := msg.Obj.(light.NodeList)
    // Verify the proof and store if checks out
    nodeSet := proofs.NodeSet()
    reads := &readTraceDB{db: nodeSet}
    if _, err, _ := trie.VerifyProof(r.Id.Root, r.Key, reads); err != nil {
      return fmt.Errorf("merkle proof verification failed: %v", err)
    }
    // check if all nodes have been read by VerifyProof
    if len(reads.reads) != nodeSet.KeyCount() {
      return errUselessNodes
    }
    r.Proof = nodeSet
    return nil

  default:
    return errInvalidMessageType
  }

}

The node in question has the capacity to immediately respond to a message received from other peers, which is a critical piece of functionality we will need the more we elaborate on our notary/proposer clients.

Key Takeaways

Overall, taking full advantage of Go's concurrency primitives along with mutexes for managing services is a great benefit of working with the geth client. We should maintain the pluggability of Services via a Service-like interface and allow for easy management and testing of relevant code.

What we should avoid, however, is the extremely interdependent spaghetti code around configuration options. There is a lot of heterogeneity around configuring structs in the geth client, with packages often following their own approaches compared to others throughout the project. We should aim to constrain all configuration to a single, initial entrypoint and avoid redundancy of .Start() methods. After reading this code, it often feels like the geth team really drove themselves into a corner here. We have the opportunity to keep things simple, DRY, and performant.

We have to leverage the powerful constructs shown above in our notary/proposer implementations to make the most out of Go. Please let me know your thoughts below as to how we can improve upon what the go-ethereum team has done.

Let's go for it.

Calculate Chunk Root Using Blob Serialization & Proof of Custody

Hey all,

To continue with the work done on #100 and #92 with respect to creating a Shard struct and the blob serialization implementation done by @nisdas, we now need to calculate ChunkRoot according to the serialization algorithm on the body of a collation.

Context

In #100, I included a simple CalculateChunkRoot method on the Collation type. Given this was out of the scope of my PR and blocked by #92 being merged, I chose to have the chunk root be a simple sha3 hash of the collation body. We need to open a PR that uses blob serialization to modify the following function created in #100:

// CalculateChunkRoot updates the collation header's chunk root based on the body.
func (c *Collation) CalculateChunkRoot() {
	// TODO: this needs to be based on blob serialization.
	// For proof of custody we need to split chunks (body) into chunk + salt and
	// take the merkle root of that.
	chunkRoot := common.BytesToHash(c.body)
	c.header.data.ChunkRoot = &chunkRoot
}

Additionally, for proof of custody (see the ETHResearch post for context), we will need to split the chunks of the body into chunk + salt pairs and take the merkle root of those combined values. This proof of custody mechanism adds more skin-in-the-game to the notarization scheme, preventing notaries from being lazy when voting on collations.
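
A sketch of what that could look like, reusing a binary merkleRoot helper like the one sketched in the trie issue earlier on this page; the chunk size and salt source are placeholders pending #92.

// calculateChunkRoot splits the body into fixed-size chunks, commits to
// chunk || salt for proof of custody, and merkleizes the results.
func calculateChunkRoot(body []byte, salts [][]byte, chunkSize int) common.Hash {
	var leaves [][]byte
	for i, idx := 0, 0; i < len(body); i, idx = i+chunkSize, idx+1 {
		end := i + chunkSize
		if end > len(body) {
			end = len(body)
		}
		// Each leaf commits to its chunk plus a per-chunk salt.
		leaves = append(leaves, crypto.Keccak256(body[i:end], salts[idx]))
	}
	return common.BytesToHash(merkleRoot(leaves))
}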

Consider Case When ChainID != NetworkID When Signing Transactions

On Ethereum Classic, ChainID != NetworkID and the following code would fail when signing the initial contract creation transaction in vmc.go:

return c.keystore.SignTx(accounts[0], tx, networkID /* chainID */)

We currently fetch networkID via the ethclient's NetworkID method, but we should consider situations in which this could break. Keeping it as an open issue for now.
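
One hedge would be an explicit chain ID override that falls back to the network ID; the ChainIDFlag below is hypothetical, and keystore.SignTx already accepts a *big.Int chain ID.

// Prefer an explicitly configured chain ID; fall back to the network ID
// only when no override is set (the two happen to match on mainnet).
chainID := networkID
if ctx.GlobalIsSet(utils.ChainIDFlag.Name) {
	chainID = big.NewInt(ctx.GlobalInt64(utils.ChainIDFlag.Name))
}
return c.keystore.SignTx(accounts[0], tx, chainID)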
