
portal-network-specs's Introduction

The Portal Network

This specification is a work-in-progress and should be considered preliminary.

Introduction

The Portal Network is an in-progress effort to enable lightweight protocol access for resource-constrained devices. The term "portal" indicates that these networks provide a view into the protocol but are not critical to the operation of the core Ethereum protocol.

The Portal Network is composed of multiple peer-to-peer networks which together provide the data and functionality necessary to expose the standard JSON-RPC API. These networks are specially designed to ensure that clients participating in them can do so with minimal expenditure of networking bandwidth, CPU, RAM, and HDD resources.

The term 'Portal Client' describes a piece of software which participates in these networks. Portal Clients typically expose the standard JSON-RPC API.

Motivation

The Portal Network is focused on delivering reliable, lightweight, and decentralized access to the Ethereum protocol.

Prior Work on the "Light Ethereum Subprotocol" (LES)

The term "light client" has historically referred to a client of the existing DevP2P based LES network. This network is designed using a client/server architecture. The LES network has a total capacity dictated by the number of "servers" on the network. In order for this network to scale, the "server" capacity has to increase. This also means that at any point in time the network has some total capacity which if exceeded will cause service degradation across the network. Because of this the LES network is unreliable when operating near capacity.

Architecture

The Portal Network is built on the Discovery v5 protocol and operates over the UDP transport.

The Discovery v5 protocol allows building custom sub-protocols via the built-in TALKREQ and TALKRESP messages. All sub-protocols use the Portal Wire Protocol, which uses the TALKREQ and TALKRESP messages as transport. This wire protocol allows for quick development of the network layer of any new sub-protocol.
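
A minimal sketch of that wrapping, assuming a hypothetical `discv5` transport object exposing `talk_request(node, protocol_id, payload)`; real Portal Wire Protocol messages are SSZ-encoded with defined selectors, which are omitted here.

```python
# Hypothetical transport wrapper; the protocol identifier value comes from the
# staged test network plan later in this document.
HISTORY_PROTOCOL_ID = (0x500B).to_bytes(2, "big")

def send_portal_message(discv5, node, payload: bytes) -> bytes:
    """Send one Portal Wire Protocol message and return the TALKRESP payload."""
    # TALKREQ carries (protocol id, opaque request payload); the peer answers
    # with a TALKRESP carrying the opaque response payload.
    return discv5.talk_request(node, HISTORY_PROTOCOL_ID, payload)
```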

The Portal Network is divided into the following sub-protocols.

  • Execution State Network
  • Execution History Network
  • Beacon Chain Network
  • Execution Canonical Transaction Index Network (preliminary)
  • Execution Verkle State Network (preliminary)
  • Execution Transaction Gossip Network (preliminary)

Each of these sub-protocols is designed to deliver a specific unit of functionality. Most Portal clients will participate in all of these sub-protocols in order to deliver the full JSON-RPC API. Each sub-protocol however is designed to be independent of the others, allowing clients the option of only participating in a subset of them if they wish.

All of the sub-protocols in the Portal Network establish their own overlay DHT that is managed independently of the base Discovery v5 DHT.

Terminology

The term "sub-protocol" is used to denote an individual protocol within the Portal Network.

The term "network" is used contextually to refer to either the overall set of multiple protocols that comprise the Portal Network or an individual sub-protocol within the Portal Network.

Design Principles

Each of the Portal Network sub-protocols follows these design principles.

  1. Isolation
  • Participation in one network should not require participation in another network.
  2. Distribution of Responsibility
  • Normal operation of the network should result in a roughly even spread of responsibility across the individual nodes in the network.
  3. Tunable Resource Requirements
  • Individual nodes should be able to control the amount of machine resources (HDD/CPU/RAM) they provide to the network.

These design principles are aimed at ensuring that participation in the Portal Network is feasible even on resource-constrained devices.

The JSON-RPC API

The following JSON-RPC API endpoints are directly supported by the Portal Network and exposed by Portal clients.

  • eth_getBlockByHash
  • eth_getBlockByNumber
  • eth_getBlockTransactionCountByHash
  • eth_getBlockTransactionCountByNumber
  • eth_getUncleCountByBlockHash
  • eth_getUncleCountByBlockNumber
  • eth_blockNumber
  • eth_call
  • eth_estimateGas
  • eth_getBalance
  • eth_getStorageAt
  • eth_getTransactionCount
  • eth_getCode
  • eth_sendRawTransaction
  • eth_getTransactionByHash
  • eth_getTransactionByBlockHashAndIndex
  • eth_getTransactionByBlockNumberAndIndex
  • eth_getTransactionReceipt

In addition to these endpoints, the following endpoints can be exposed by Portal clients using the data available through the Portal Network.

  • eth_syncing

The following endpoints can be exposed by Portal clients as they require no access to execution layer data.

  • eth_protocolVersion
  • eth_chainId
  • eth_coinbase
  • eth_accounts
  • eth_gasPrice
  • eth_feeHistory
  • eth_newFilter
    • TODO: explain complexity.
  • eth_newBlockFilter
  • eth_newPendingTransactionFilter
  • eth_uninstallFilter
  • eth_getFilterChanges
  • eth_getFilterLogs
  • eth_getLogs
    • TODO: explain complexity
  • eth_mining
  • eth_hashrate
  • eth_getWork
  • eth_submitWork
  • eth_submitHashrate
  • eth_sign
  • eth_signTransaction

JSON-RPC Specs

Bridge Nodes

The term "bridge node" refers to Portal clients which, in addition to participating in the sub-protocols, also inject data into the Portal Network. Any client with valid data may participate as a bridge node. From the perspective of the protocols underlying the Portal Network there is nothing special about bridge nodes.

The planned architecture for bridge nodes is to pull data from the standard JSON-RPC API of a Full Node and "push" this data into their respective networks within the Portal Network.
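
A minimal sketch of that flow, assuming a hypothetical `portal` object with a `gossip(network, content_key, content_value)` injection hook; the JSON-RPC call itself is standard.

```python
import requests

def fetch_block(rpc_url: str, block_number: int) -> dict:
    """Fetch a block (with full transaction objects) from a full node over JSON-RPC."""
    response = requests.post(rpc_url, json={
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_getBlockByNumber",
        "params": [hex(block_number), True],
    })
    return response.json()["result"]

def bridge_block(rpc_url: str, portal, block_number: int) -> None:
    block = fetch_block(rpc_url, block_number)
    # A real bridge would also push the block body and receipts into their
    # respective sub-protocols; only the history-network push is sketched here.
    portal.gossip("history", block["hash"], block)
```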

Network Functionality

State Network: Accounts and Contract Storage

The State Network facilitates on-demand retrieval of the Ethereum "state" data. This includes:

  • Reading account balances or nonce values
  • Retrieving contract code
  • Reading contract storage values

The responsibility for storing the underlying "state" data should be evenly distributed across the nodes in the network. Nodes must be able to choose how much state they want to store. The data is distributed in a manner that allows nodes to determine the appropriate nodes to query for any individual piece of state data. When retrieving state data, a node should be able to validate the response using a recent header from the header chain.

The network will be dependent on receiving new and updated state for new blocks. Full "bridge" nodes acting as benevolent state providers are responsible for bringing in this data from the main network. The network should be able to remain healthy even with a small number of bridge nodes. As new data enters the network, nodes are able to validate the data using a recent header from the header chain.

Querying and reading data from the network should be fast enough for human-driven wallet operations, like estimating the gas for a transaction or reading state from a contract.

History Network: Headers, Blocks, and Receipts

The History Network facilitates on-demand retrieval of the history of the Ethereum chain. This includes:

  • Headers
  • Block bodies
  • Receipts

The responsibility for storing this data should be evenly distributed across the nodes in the network. Nodes must be able to choose how much history data they want to store. The data is distributed in a manner that allows nodes to determine the appropriate nodes to query for any individual piece of history data.

Participants in this network are assumed to have access to the canonical header chain.

All data retrieved from the history network is addressed by block hash. Headers retrieved from this network can be validated to match the requested block hash. Block Bodies and Receipts retrieved from this network can be validated against the corresponding header fields.

All data retrieved from the history network can be immediately verified by the requesting node. For block headers, the requesting node always knows the expected hash of the requested data and can reject responses with an incorrect hash. For block bodies and receipts, the requesting node is expected to have the corresponding header and can reject responses that do not validate against the corresponding header fields.
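
A rough sketch of the shape of these checks. Recomputing the transactions root and uncles hash from a body requires the usual RLP/trie machinery, which is out of scope here; `keccak256` comes from pycryptodome.

```python
from Crypto.Hash import keccak

def keccak256(data: bytes) -> bytes:
    return keccak.new(digest_bits=256, data=data).digest()

def validate_header(requested_block_hash: bytes, header_rlp: bytes) -> bool:
    # A header is valid iff it hashes to the block hash that was requested.
    return keccak256(header_rlp) == requested_block_hash

def validate_body(header: dict, recomputed_tx_root: bytes, recomputed_uncles_hash: bytes) -> bool:
    # A body (or receipts bundle) is valid iff the roots recomputed from its
    # contents match the corresponding fields of the already-validated header.
    return (recomputed_tx_root == header["transactions_root"]
            and recomputed_uncles_hash == header["uncles_hash"])
```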

Canonical Transaction Index Network: Transactions by Hash

The Canonical Transaction Index Network facilitates retrieval of individual transactions by their hash.

The responsibility for storing the records that make up this index should be evenly distributed across the nodes in the network. Nodes must be able to choose how many records from this index they wish to store. The records must be distributed across the network in a manner that allows nodes to determine the appropriate nodes to query for an individual record.

Transaction information returned from this network includes a merkle proof against the Header.transactions_trie for validation purposes.

Transaction Gossip Network: Sending Transactions

The Transaction Gossip Network facilitates broadcasting new transactions for inclusion in a future block.

Nodes in this network must be able to limit how much of the transaction pool they wish to process and gossip.

The goal of the transaction gossip network is to make sure nodes can broadcast transactions such that they are made available to miners for inclusion in a future block.

Transactions which are part of this network's gossip are able to be validated without access to the Ethereum state. This is accomplished by bundling a proof which includes the account balance and nonce for the transaction sender. This validation is required to prevent DOS attacks.
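
A sketch of that check, assuming the bundled proof has already been verified against a recent state root and yields the sender's balance and nonce; field names are illustrative.

```python
from dataclasses import dataclass

@dataclass
class SenderProof:
    balance: int   # wei, proven against a recent state root
    nonce: int

@dataclass
class GossipedTx:
    sender_nonce: int
    value: int            # wei
    gas_limit: int
    max_fee_per_gas: int

def is_plausible(tx: GossipedTx, proof: SenderProof) -> bool:
    """Cheap DoS filter: reject transactions the sender clearly cannot pay for."""
    max_cost = tx.value + tx.gas_limit * tx.max_fee_per_gas
    return tx.sender_nonce >= proof.nonce and proof.balance >= max_cost
```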

This network is a pure gossip network and does not implement any form of content lookup or retrieval.

Network Specifications


portal-network-specs's Issues

Thin websocket to UDP socket proxy to allow browsers to join the club

An idea for how we can allow browsers to be part of the network.

An open question is whether there is a viable way for a browser based node, which ideally is running in some form of a browser extension, to execute long running background processes that aren't constantly put to sleep. For the context of this issue, we'll assume this is possible.

Why browser based clients are hard

The thing blocking browsers from easily joining the network is the lack of access to the UDP transport.

The initial idea for shimming browsers in was to use websockets or webrtc to communicate between browsers, with some form of bridging to the outside world. However, two things probably make this non-viable: 1) websockets and webrtc are both long-lived connections, which is at odds with the network designs that are based around UDP datagrams; trying to do discv5 over websockets/webrtc would likely incur significant overhead from continuously establishing and tearing down these connections. 2) the idea of bridging is fundamentally not easy to make compatible with the way the network functions, since the browser nodes would be somewhat partitioned off from the network; either they could only communicate with nodes that supported the bridge or we'd need some sort of relay system, both of which are complex solutions.

A simple solution

The proposed solution isn't pure peer-to-peer.

It should be relatively simple to write a piece of software that does the following:

  1. Run a websocket server and listen for new incoming connections.
  2. When a new incoming connection is established, open up an external facing UDP socket.
  3. Bridge the websocket stream with the UDP socket, relaying packets between the two
+-----------+     +------------------------------+
|  Browser  |     |       Proxy Server Thing     |
+-----------+     +-----------+     +------------+
|           | --> |           | --> |            |
| websocket |     | websocket |     | UDP socket |
| (client)  | <-- | (server)  | <-- |            |
|           |     |           |     |            |
+-----------+     +-----------+-----+------------+

This proxy should be quite lightweight, likely being able to support many browsers from a single server. The proxy would have the same visibility into the data you are sending as your ISP does, since all of the packet data is encrypted. Running this service "benevolently" at the scale of supporting millions of browsers would likely cost well under $100k USD per year.
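
A bare-bones sketch of such a proxy, using Python's asyncio UDP support and the `websockets` package (the handler signature varies slightly across `websockets` versions). The `host:port|payload` framing is an invented convention for illustration; error handling, authentication, and rate limiting are omitted.

```python
import asyncio
import websockets

class UdpRelay(asyncio.DatagramProtocol):
    """Forwards every inbound UDP datagram to one websocket client."""

    def __init__(self, ws):
        self.ws = ws
        self.transport = None

    def connection_made(self, transport):
        self.transport = transport

    def datagram_received(self, data, addr):
        # Tag each datagram with its source so the browser knows who sent it.
        source = f"{addr[0]}:{addr[1]}".encode()
        asyncio.create_task(self.ws.send(source + b"|" + data))

async def handle_browser(ws):
    # Step 2 from the list above: one external-facing UDP socket per connection.
    loop = asyncio.get_running_loop()
    transport, _relay = await loop.create_datagram_endpoint(
        lambda: UdpRelay(ws), local_addr=("0.0.0.0", 0))
    try:
        # Step 3: relay browser frames (assumed "host:port|payload") out as datagrams.
        async for message in ws:
            if isinstance(message, str):   # tolerate text frames from the browser
                message = message.encode()
            destination, payload = message.split(b"|", 1)
            host, port = destination.decode().rsplit(":", 1)
            transport.sendto(payload, (host, int(port)))
    finally:
        transport.close()

async def main():
    # Step 1: run a websocket server and wait for incoming browser connections.
    async with websockets.serve(handle_browser, "0.0.0.0", 9000):
        await asyncio.Future()  # run forever

if __name__ == "__main__":
    asyncio.run(main())
```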

content-id vs content-key clarification?

I know content-id = sha256(content-key). Currently in my implementation of responding to the OFFER message, the node does 2 checks before requesting the content: 1) making sure that the distance between node-id and content-id is less than the data_radius, then 2) using the content-key to index into the node's database to check if it currently has the data being offered. If not, it then sets the corresponding bit to 1.

My main confusion is whether content-id is used to index into the database where content is stored. Previously, this is what I had been using in my code, and it gave me a type error...I changed the index to content-key and it now works, and looking in other parts of the trin code, it uses content-key to index into the db as well.

Additionally, I had these notes on how to handle the storing of content after receiving a set of content-keys, now I realize they might be wrong:

// content-key = string with semantics, has info about the specific data to store (bodies, receipts, headers)
// content-id = hash(content-key), "dummy" way to store data, can index into db
// 2 methods:
// 1. just hash the key to get content-id and use to index into db
// 2. parse the content-key for specific semantics to "smartly" store the data using custom handlers,
//    then hash the key to get content-id to index into db

So is content-key what is actually used as the index into the db, and content-id is just purely for calculation in the distance function?
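
A small sketch of the relationship as described above: content-id is the hash of the content-key, the radius check operates on content-id, and the local database is keyed by content-key. The XOR distance here is purely illustrative; each network defines its own metric.

```python
import hashlib

def content_id(content_key: bytes) -> int:
    return int.from_bytes(hashlib.sha256(content_key).digest(), "big")

def distance(a: int, b: int) -> int:
    return a ^ b   # placeholder metric for illustration only

def should_accept(node_id: int, radius: int, db: dict, content_key: bytes) -> bool:
    cid = content_id(content_key)
    in_radius = distance(node_id, cid) <= radius   # check 1: within our radius
    already_stored = content_key in db             # check 2: db indexed by content-key
    return in_radius and not already_stored
```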

Header oracle

I would like a simple-to-run web service that does the following things:

  • uses a web3 connection to track the header chain.
  • maintains our "header accumulator" for the chain.
  • maintains a public/private key pair
  • exposes an API with all responses containing a signature from the key pair:
    • getting any of the most recent 256 blocks of HEAD info (hash, and accumulator root)
    • pulling a full copy of the latest accumulator.

The idea would be that the service could be hosted somewhere on the normal web. We can put the public key in the DNS records. Clients could query known/trusted instances of this service to jump to the HEAD of the chain.

OFFER/ACCEPT/STORE message semantics

I would like to propose the following with respect to OFFER/ACCEPT/STORE messages as they are specified in #67. This also presumes that the messages are updated to support multiple content keys, i.e. the OFFER/ACCEPT message payloads are List[ContentKey], allowing a set of keys to be transmitted in a single message.

The initial OFFER message offers the keys {k0, k1, ..., kn}. The message is invalid if it contains any duplicate keys.

The ACCEPT message must contain a subset of the offered keys. The key ordering is up to the requester and independent of the order from the corresponding OFFER message. The message is invalid if it contains any duplicate keys.

The STORE response will be List[Bytes] and must contain a number of payloads equal to the number of keys accepted in the ACCEPT. aka, if I accept 4 things, then the response must contain four payloads. Empty payloads should be included for any content that the offering node is unable to return. The order of the response payloads must be the same as the keys accepted in the ACCEPT message. These requirements are intended to simplify the parsing and validation of the returned data. Allowing empty responses is intended to handle the edge case where something gets evicted in between it being offered and accepted.
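
The proposed rules reduce to a few simple checks; the sketch below is illustrative and not tied to the wire encoding.

```python
from typing import List

def valid_offer(keys: List[bytes]) -> bool:
    # No duplicate keys allowed in an OFFER.
    return len(keys) == len(set(keys))

def valid_accept(offered: List[bytes], accepted: List[bytes]) -> bool:
    # ACCEPT must be a duplicate-free subset of the offered keys,
    # in whatever order the requester chooses.
    return len(accepted) == len(set(accepted)) and set(accepted) <= set(offered)

def valid_store(accepted: List[bytes], payloads: List[bytes]) -> bool:
    # One payload per accepted key, in the same order; a payload may be empty
    # if the content was evicted between OFFER and ACCEPT.
    return len(payloads) == len(accepted)
```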

Preliminary spec for header gossip network

We need a specification for the header gossip network.

  • PING/PONG/FIND/FOUND for building overlay network routing table
  • OFFER/ACCEPT/STORE for transmission of headers and accumulator snapshots
  • specification for the header accumulator

The Merry Go Round

This is an idea that can in theory be applied to a few different things. I'm going to explain it in the context of a client performing a full sync of the header chain.

At present, fully syncing the header chain is expensive in the context of the current plans for the History Network. A client would likely use the Canonical Indices Network to fetch a "skeleton" of the header chain, meaning that they would do something like retrieve every 1000th header. Then, they would fill in the gaps between those headers concurrently, using Header.parent_hash to retrieve the parent header until they've retrieved all 999 headers in the gap.

This is an established approach for syncing the header chain, but due to the storage layout in the portal network, every single header has to be fetched individually. For a chain of 13 million headers, a client would have to maintain a sustained rate of 150 retrievals per second to sync the full chain within 24 hours. We can do better.

The "Merry Go Round" approach to these things is meant to take advantage of emergent coordination to speed up processes like this. The loose idea is as follows.

  • There is a known schedule such as: We sync the full header chain once per hour.
  • There is a gossip network dedicated to this synchronization process.
  • Over the course of the hour, nodes gossip contiguous sections of the header chain around the network on a known schedule.
    • Example: Headers 0-4096 are gossiped during the first 10 seconds, 4096-8192 in the second 10 seconds...

This process is continuous, and since the schedule is known ahead of time, it requires zero actual coordination but allows for emergent cooperation. Even just a few nodes fetching and assembling these contiguous blocks of headers can supply the data for an arbitrary number of other nodes to consume via gossip. The end result should be the ability for nodes to roughly saturate their inbound bandwidth and quickly and efficiently sync the data.
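
One way the known schedule could be instantiated, using the example numbers above (4096-header sections in 10-second slots); the `schedule_start` timestamp and the wrap-around rule are assumptions for illustration.

```python
SECTION_SIZE = 4096   # headers gossiped per slot, from the example above
SLOT_SECONDS = 10     # length of one slot, from the example above

def current_section(now: float, schedule_start: float, chain_length: int) -> range:
    """Header-number range being gossiped during the current slot."""
    total_sections = (chain_length + SECTION_SIZE - 1) // SECTION_SIZE
    slot = int((now - schedule_start) // SLOT_SECONDS)
    section = slot % total_sections        # wrap around: the carousel never stops
    start = section * SECTION_SIZE
    return range(start, min(start + SECTION_SIZE, chain_length))

# Note: at these example numbers a 13-million-header chain takes roughly 8.8
# hours per full pass, so section size or slot length would be tuned to hit a
# once-per-hour target.
```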

Other Applications

We can apply this idea to other things like...

  • Full history sync of block bodies and receipts
  • Full state sync of the leaf data

Create an official Stateless Ethereum Glossary

I think this would be a very useful step to achieve 2 goals:

  1. To make it easier to onboard new people to this idea, so they know our lingo;

  2. To make communications in our community more clear and concise. So we know exactly what we mean if we mention witnesses, semi-stateless, implicit/explicit states, etc.

Looking at this post by Alexey, and trying to explain stateless Ethereum to other people, I would find this extremely useful.

Witness spec: little endian and U32

The witness specification has some basic types, including:

<U32> := u32:<Byte>^4		{u32 as a 32-bit unsigned integer in little-endian}

Is there any reason for this choice? Is little endian selected because most clients are little endian and this could be read without byteswapping? Was any kind of compression (like leb128) considered?

It would be nice to document the reason for the decision.
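
For reference, the fixed-width encoding as currently specified can be expressed directly with Python's struct module.

```python
import struct

def encode_u32(value: int) -> bytes:
    # "<I" is a 4-byte little-endian unsigned integer, e.g. 1 -> b"\x01\x00\x00\x00".
    return struct.pack("<I", value)

def decode_u32(data: bytes) -> int:
    return struct.unpack("<I", data[:4])[0]
```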

Require uint compression

Currently, the numbers in our witness spec aren't compressed, so they always take 4 bytes regardless of whether they could be stored in fewer bytes.

The possible options:

  • (currently in turbo-geth) use CBOR variable uint compression
  • use LEB128 encoding (with a mention that only the shortest encoding should be considered valid, to remove ambiguity);
  • use VLQ with a note about limiting ambiguity

One more thing that would be nice to keep in consideration: it should be relatively easy to implement in different languages and on different platforms.
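
A sketch of the LEB128 option from the list above; producing only the shortest encoding by construction avoids the ambiguity mentioned.

```python
def encode_uleb128(value: int) -> bytes:
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # continuation bit set: more groups follow
        else:
            out.append(byte)
            return bytes(out)

def decode_uleb128(data: bytes) -> int:
    result, shift = 0, 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return result

# Small numbers cost one byte instead of four: encode_uleb128(4) == b"\x04".
```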

Staged Test Network Plan

I would like to outline some proposed structure for how we roll out our test network.

The goal of this test network is to iterate towards the full functionality necessary for the Chain History network.

Phases

Phase Zero: Overlay Only

During this phase we will simply establish our ability to maintain a stable overlay network using the Portal Wire Protocol messages.

  • A: Clients can be externally configured in some way to specify bootnodes.
  • B: Clients can use the Portal Wire Protocol with the protocol identifier: 0x500B (History Network)
  • C: Clients can send and receive PING/PONG messages using the overlay protocol
    • "fake" radius information in the payload
  • D: Clients can send and respond to FINDNODE/FOUNDNODES messages using the overlay protocol.
  • E: Clients periodically check the liveness of nodes in their routing table and evict unreachable nodes
  • F: Clients actively populate their routing table by exploring the DHT (typically via random exploration using the recursive find nodes algorithm)
  • G: Clients have published instructions on how to build, configure, and launch a node for the test network.

In this phase, each team will be responsible for deploying however many nodes they wish. Each team may supply the ENR records for any stable "bootnodes" they will be operating.

What we are testing:

  1. Client interoperability and stability
    • Message encoding and decoding for PING/PONG/FINDNODE/FOUNDNODES
  2. Healthy overlay network
    • Do clients populate and maintain their routing tables as expected?
    • Can we navigate the network as expected to find the nodes closest to a given location?

| Client     | A  | B  | C  | D  | E  | F  | G  |
|------------|----|----|----|----|----|----|----|
| Fluffy     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| Trin       | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| Ultralight | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |

Phase One: Simple Content Retrieval

During this phase we will be testing the ability to find and retrieve content stored in the network by other nodes.

  • A. Clients have a basic implementation of a content database
    • The database can be manually populated with data
    • support for looking up a piece of content by its content-id
  • B. Clients have support for the content-key format for headers referenced by their block hash.
  • C. Clients support a subset of the functionality for serving FINDCONTENT requests
    • Clients only need to support responding when the payload can be returned within the CONTENT message response.
    • Clients do not need to support uTP based transmission at this stage
  • D. A client can be launched with a pre-populated database of content
    • In this phase we will populate our test network with nodes which have had their content databases pre-populated.
    • We will likely do something like pre-loading the first 1 million block headers

What we are testing

  1. The ability for nodes to traverse the DHT and find the nodes which have the content they need
    • Any block header in the first million blocks should be retrievable by its block hash.
  2. The basic transmission of content without uTP

| Client     | A  | B  | C  | D  |
|------------|----|----|----|----|
| Fluffy     | ✔️ | ✔️ | ✔️ | ✔️ |
| Trin       | ✔️ | ✔️ | ✔️ | ✔️ |
| Ultralight | ✔️ | ✔️ | ✔️ | ✔️ |

Phase Two: Gossip

During this phase we will build out the mechanisms needed for gossip, including use of the uTP protocol.

  • A. Clients support uTP
    • Listening for inbound connections
    • Establishing outbound connections
    • Receiving data sent over the uTP stream
  • B. Client support for gossip
    • Responding to OFFER messages if the content is of interest
    • Establishing outbound connection upon receiving an ACCEPT message and sending the data over uTP
    • Listening for an inbound connection after sending an ACCEPT message and receiving the data over uTP
    • Support for the block header content-key and validation of the block header.
    • Upon successful receipt of new content, performing neighborhood gossip.
  • C. Client support for the POKE mechanic.
    • Tracking of the nodes along the search path which did not have the content but should have been interested.
    • Once content was successfully retrieved, offering it to nodes along the search path.
  • D. Non critical but related functionality
    1. Management of the content database, eviction when full based on radius
    2. Validation that offered headers are indeed part of the canonical chain (depends on header accumulator)

What we are testing

  1. uTP based transmission of content
  2. The effective spread of new content using gossip
  3. The passive replication of existing content using the POKE mechanic.

| Client     | A  | B  | C  | D.i | D.ii |
|------------|----|----|----|-----|------|
| Fluffy     | ✔️ | ✔️ | ✔️ | ✔️  | 🍕   |
| Trin       | ✔️ | ✔️ | ✔️ | ✔️  | 🍕   |
| Ultralight | ✔️ | ✔️ | ✔️ | ✔️  |      |

Phase Three: Head Tracking

TODO: we can no longer simply follow the chain with the highest total difficulty (TD) after the merge. Need to implement the light client protocol for Eth2.

NOTE: establish the "correct" solution and then look for a quick solution that will get us the needed functionality quickly.

Phase Four: Full Content Transmission

During this phase we will flesh out the remaining functionality for transmission of content, specifically, large payloads that must be sent over uTP

  • A. Clients support CONTENT responses that contain a uTP connection ID
    • Upon receipt of a CONTENT response with a uTP connection-id, the client initiates a uTP stream and receives the content payload over the stream.
  • B. Clients serve FINDCONTENT requests for all content types (headers, bodies, receipt bundles, master and epoch accumulators)
  • C. Clients support block body and receipt bundle content types
  • D. Received block body and receipt content is validated using the block header.

What we are testing

  1. support for remaining content types (bodies and receipt bundles)
  2. clients implement full support for FINDCONTENT/CONTENT including full uTP based content transmission.

| Client     | A  | B  | C  | D  |
|------------|----|----|----|----|
| Fluffy     | ✔️ | ✔️ | ✔️ | ✔️ |
| Trin       | ✔️ | 🍕 | ✔️ | ✔️ |
| Ultralight | ✔️ | ✔️ | 🍕 | ✔️ |

Phase Five: The Road To Production

TODO: what is left?

NOTE: visibility into the state of the network is probably valuable at this stage.

NOTE: verifying cross client compatibility will need to be done

Test vectors for Portal Wire Protocol messages

We need test vectors for the various portal wire protocol messages. I would encourage us to start simple by just embedding them in the markdown documents. We can do something more machine readable at a later date.

Handling multiple content items over uTP

At present, the specs do not state any opinion regarding how to stream multiple content items over uTP when a node ACCEPTs multiple contentKeys OFFERd by a peer. Since uTP is agnostic to what is sent over it, I suggest the following possible options for handling multiple content items:

  1. Run the existing uTP stream process (SYN -> DATA -> FIN) for each piece of content while reusing the same uTP connection ID for each piece of content being sent. This would require the clients to maintain some metadata to track how many individual streams had been received and only clear the socket once all content items have been received.
  2. Define a new uTP message payload that uses an SSZ List of the form Container(content_keys: List[ByteList, max_length=64]) (corresponding to the payload of the OFFER/ACCEPT messages that limit the number of content keys to 64)
    (thanks @kdeme for the idea!) I think I like this option better, though it would require a decision on whether all uTP payloads should use this new container, or whether to define an SSZ union of either this list of bytelists or a single bytelist (in the case of a FOUNDCONTENT response where the content needs to be sent over uTP).

I haven't embedded this in our uTP code yet but this PR implements the new list type and verifies that it can serialize and deserialize multiple blocks and easily access each item in the deserialized payload -

Reliable Transaction Gossip to Full Radius Nodes

@pipermerriam noted a gap in the design of the gossip mechanism:

https://github.com/ethereum/portal-network-specs/blob/77925bf2623db8ddc22fcb1ea8363f10ecc271af/transaction-gossip.md

The current gossip design does not provide reliable guarantees that transactions broadcast by low radius nodes will successfully be gossiped to full radius nodes.

Consider a transaction that starts at a node with a low radius r. There is no distribution requirement with respect to r in a node's routing table. It is possible that the node's routing table contains only nodes with small radius; furthermore, the traversal of those nodes' routing tables also only touches low radius nodes. This transaction is stuck in a local topology containing only small radius nodes.

The issue arises because the routing table's node distribution is only designed to have some structure with respect to "node_id". Note that I am assuming that the routing table follows the Kademlia algorithm. The node distribution is such that each k-bucket covers a specific distance range, namely [2^i, 2^(i+1)), where i is the k-bucket index. That means that if the node wants to get to a certain part of the network, the routing table can help the node get closer. However, if the node wants to reach a node with a larger radius, eventually reaching a node with r = 1, the routing table does not contain any useful deterministic or even probabilistic structure.
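
For reference, a sketch of the Kademlia structure being described: the XOR distance between two node-ids determines the k-bucket index, and bucket i covers [2^i, 2^(i+1)). As the issue notes, nothing analogous constrains the radius values found in those buckets.

```python
def xor_distance(a: int, b: int) -> int:
    return a ^ b

def bucket_index(local_id: int, remote_id: int) -> int:
    d = xor_distance(local_id, remote_id)
    assert d > 0, "a node does not bucket itself"
    return d.bit_length() - 1     # the i such that 2**i <= d < 2**(i+1)

def bucket_range(i: int) -> range:
    # The distance range covered by k-bucket i.
    return range(2 ** i, 2 ** (i + 1))
```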

Snappy compression for data transmitted over the wire

One thing we could consider to shrink the amount of data that needs to be transmitted over the wire would be to follow RLPx and add snappy compression for the payloads of Portal Network messages (or at least a subset of them). My thought would be that, at a minimum, all BlockBodies messages in the History Network should be compressed. If we assume the numbers from EIP-706 scale, this would dramatically decrease the amount of data that needs to be shoved across the network, especially if EIP-4444 goes through and our History Network becomes one of the de facto standards for syncing the historical chain.

Co-locate ContentKey definitions in the specifications

What is wrong

Currently, all of the ContentKey definitions are defined within the individual specifications for each of the different portal networks.

This presents 3 problems

  1. Inconsistent formatting
  2. Duplicate definitions for any ContentKey used in multiple networks
  3. Duplication of selector values across networks, or inconsistent selector values for any key used across multiple networks.

How to fix

I am going to move all of the ContentKey definitions into a common document and adjust the specifications to simply reference which content keys are supported within an individual network.

Define witness serialization fully

Currently, the spec mentions the CBOR spec, a subset of which we use to serialize variable-length fields of a witness. It makes sense to include the full serialization spec in this doc, so it can be implemented standalone.

The distance metric is of ring geometry instead of XOR

The overlay network's routing table uses a custom distance function that has "ring geometry" instead of XOR. discv5's lookup mechanism should not work unless the routing table itself is Kademlia-like, i.e. each k-th bucket has appropriate coverage of a range in the node space. If the only change to the algorithm is swapping out the distance metric, the Kademlia "lookup" algorithm does not work.

If we are also modifying the FINDNODES algorithm to use one that works with a distance measure that is essentially distance in a ring, should we make it explicit and specify the algorithm as well? For example, are we expecting the routing table to be more like Chord or Symphony?
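
A sketch of a ring-geometry distance over a 256-bit id space, shown only to make the contrast with XOR concrete; it is not a statement about which lookup algorithm should accompany it.

```python
ID_SPACE = 2 ** 256

def ring_distance(a: int, b: int) -> int:
    # The distance between two points is the shorter way around the ring.
    clockwise = (a - b) % ID_SPACE
    counter_clockwise = (b - a) % ID_SPACE
    return min(clockwise, counter_clockwise)

# ring_distance(0, ID_SPACE - 1) == 1, whereas the XOR distance between the
# same two ids would be maximal.
```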

Utp cross client testing

As it currently stands there are 3 upcoming uTP implementations: fluffy, trin and ultralight. It would be great to establish a few practices and test cases which would guarantee compatibility between all of the clients.

Things we can do (in increasing complexity order):

  1. Generate test vectors for encoding and decoding of each uTP packet (DATA, FIN, STATE, SYN, RESET); each client could use them in their implementation unit test suites.
  2. Describe a few test cases each implementation needs to pass in its integration suite between 2 nodes of the same type.
  3. Each implementation could create a test app which would start a uTP instance with some config and expose a simple api (via command line, json-rpc or whatever we decide) like:
connect
read
write
close

Having such an app would make it possible to create a CI integration test suite in each client or even to test certain conditions between clients manually. Ideally we would also find/implement something similar to https://github.com/Shopify/toxiproxy but for udp transport (as discv5 works over udp) to inject network failures (lost packets, timeouts) while working on localhost.
  4. Imo the gold standard test would be to have some small testnet (like 3 nodes, one of each type), with some monitoring attached, where every node would disseminate some content via OFFER/ACCEPT messages using uTP. But this is more of an integration between the uTP subprotocol and the history (or state) subprotocol.

It would be great to hear other teams thoughts on this topic :) cc: @victor-wei126 @acolytec3

Spec out what are the sub-DHT and gossip topics needed for beacon-chain

  • beacon-chain block headers gossip
  • sync_aggregation gossip
  • attestation gossip
  • beacon-chain state DHT
  • beacon-chain history DHT
  • etc.

Each of these sub-protocols could get its own page.

Note:

  1. We should tackle the first few that are required for beacon-chain to sync to the latest head.
  2. The next set should allow beacon-chain to function sufficiently for attesting, but probably not good enough for proposing.
  3. Provide sufficient data coverage to power a light-weight beacon-chain client fully capable of servicing a validator.

Relationship between content-key and content-id

In the history network, one of our base types of content is the "block body" which contains a list of transactions and uncles. One expected use case is retrieval of an individual transaction. If the only way to do this is retrieval of the full block body, we will incur a high overhead for each retrieval. For this reason we probably want to support retrieval of individual transactions.

In order to retrieve an individual transaction, it will need to come back with a merkle proof against the Header.transactions_root. There are two obvious ways to serve this data.

  1. A: Store individual transactions with accompanying proofs
  2. B: Serve them from the block body and construct the proof on-demand.

Option A will incur a very high additional storage overhead. Option B has implications on the content-key and content-id scheme.

Exploring Option B:

We will want a content-key for both the full block body and for an individual transaction, and we want both of these to map to the same content-id. This implies that the relationship between content-keys and content-ids is many-to-one, which potentially has some implications for implementations, and these might end up being "foot guns".

Witness tests

This is a solicitation for discussion towards a witness test format.

Some desired properties of a test format:

  • Tests are agnostic to witness binary format, and can be updated as the binary format changes.
  • Tests are easily parsable from various client languages. Perhaps JSON format.
  • Tests allow specifying both success cases and error cases.
  • Tests have tags allowing testing only certain things. For example, during implementation, it may be useful to try specific basic test cases.

Mechanism for keeping proofs up to date for cold state.

At present, the conceptual gossip mechanism for distributing new and updated state into the network looks roughly like this.

At each block, a proof P is generated which contains all of the state data touched during block execution. This proof P is comprised of account leafs, contract storage leafs, and the intermediate trie data necessary to anchor that data to the Header.state_root. A bridge node will then take this proof, and enumerate the leaf data, finding nodes that are interested in each piece of leaf data and distribute it to those nodes using OFFER/ACCEPT/STORE messages. These nodes will then repeat this process, offering the proof data they received to a subset of their interested neighbors. This process repeats until it hits the natural termination condition of none of the offered data being accepted.

The mechanism above ensures that the network contains updated proof information for each piece of leaf data at each block state root where that data was either created or modified.

Now, we consider an account which was modified at block N - 1 when we are currently on block N. In this case, there can be nodes that will have received a proof against the state root at height N - 1 who need to update their proof to be valid against the state root at height N.

Suppose we have the following trie of data.

We use a binary trie for simplified visuals

    0:                           X
                                / \
                              /     \
                            /         \
                          /             \
                        /                 \
                      /                     \
                    /                         \
    1:             0                           1
                 /   \                       /   \
               /       \                   /       \
             /           \               /           \
    2:      0             1             0             1
           / \           / \           / \           / \
          /   \         /   \         /   \         /   \
    3:   0     1       0     1       0     1       0     1
        / \   / \     / \   / \     / \   / \     / \   / \
    4: 0   1 0   1   0   1 0   1   0   1 0   1   0   1 0   1

       A   B C   D   E   F G   H   I   J K   L   M   N O   P

Now we suppose that during block N-1 the state data C was touched. The
corresponding proof that nodes which store C would receive would be as
follows.

The trie nodes with a * next to them are the nodes that actually changed.

    0:                         (N-1)*
                                / \
                              /     \
                            /         \
                          /             \
                        /                 \
                      /                     \
                    /                         \
    1:             0*                          1
                 /   \  
               /       \ 
             /           \
    2:      0*            1
           / \
          /   \
    3:   0     1*
              / \ 
    4:       0*  1 

       A   B C   D   E   F G   H   I   J K   L   M   N O   P

Now suppose that during block N the state data G was changed. The
corresponding proof that nodes which store G would receive would be as
follows.

The trie nodes with a * next to them are the nodes that actually changed.

    0:                           N
                                / \
                              /     \
                            /         \
                          /             \
                        /                 \
                      /                     \
                    /                         \
    1:             0*                          1
                 /   \  
               /       \ 
             /           \
    2:      0             1*
                         / \
                        /   \
    3:                 0     1*
                            / \
    4:                     0*  1

       A   B C   D   E   F G   H   I   J K   L   M   N O   P

The nodes storing C would need to receive the following proof in order to
update their proof from height N-1 to be valid against the state root at
height N.

    0:                           N
                                / \
                              /     \
                            /         \
                          /             \
                        /                 \
                      /                     \
                    /                         \
    1:             0*                          1
                 /   \  
               /       \ 
             /           \
    2:      0             1*
            
           
    3:    
         
    4:  

       A   B C   D   E   F G   H   I   J K   L   M   N O   P

This visualization is intended to highlight that the data needed to update the
proof for C at height N-1 to be valid against height N is a subset of the
proof at height N. This means that the data necessary to update proofs for
cold state is already present in the main proof P that is generated at each
block. We just need to design a mechanism that allows nodes in possession of
cold state to receive the appropriate subset of the proof data they need in
order to update their proofs.

Portal Network Roadmap

Originally pulled from trin testnet plans: ethereum/trin#118

What Is This?

A high level roadmap of the portal network build-out plans. Initially writing this up as an issue after which we can move this to a living document.

Phase 0: Experimentation and Research (DONE-ish)

This is what occurred from mid 2020 through early 2021. During this time the focus was on ensuring that we fully understood:

  • The data that was necessary to serve the JSON-RPC API
  • How that data could be spread out among the nodes of a DHT in a manner that allowed efficient retrieval
  • A preliminary plan for how we would deal with the problem of state data being inherently imbalanced

By early 2021 we had satisfactory answers to these issues.

Phase 1: Build out "State Network" (In-Progress)

Given that the State Network is the most complex of the networks, we chose to focus our efforts on it first since if we couldn't do state then all of the other networks are significantly less useful.

  • Initial build out only includes support for establishing overlay network
    • nodes support PING/PONG/FINDNODES/FOUNDNODES
    • nodes bootstrap into the network using hard coded ENR records
    • nodes pong when they receive a ping
    • nodes populate their routing table using the "recursive find nodes" algorithm to discover nodes in the network
    • nodes maintain their routing table by occasionally verifying liveness of the records from their routing table
    • nodes serve FINDNODES requests with data from their routing tables
  • Support for basic content transmission
    • nodes support FINDCONTENT/FOUNDCONTENT
    • only support for small payloads. No uTP support yet
    • nodes serve FINDCONTENT requests from a local content database
  • Support for large content
    • FINDCONTENT/FOUNDCONTENT based retrieval supports uTP streams for larger content
  • Support for Gossip Primitives
    • nodes support OFFER/ACCEPT + uTP transfer of information
    • nodes support basic neighborhood gossip rules for re-transmission to nearby peers.

Phase 2: Build out "History Network" (In-Progress)

The history network is currently being worked on by members of the Core Developer Apprenticeship Program.

Phase 3+: Build out remaining networks (Pending)

Transaction gossip, header gossip, and canonical indices networks

Phase XXX: Beacon chain data

Explore hosting the data for the beacon chain in a similar structure.

State network violates discv5 specification

Has been stated before on chat/calls but adding here as an issue for visibility/tracking:

https://github.com/ethereum/stateless-ethereum-specs/blob/master/discv5-utp.md#specification states: This protocol does not use the TALKRESP message.

https://github.com/ethereum/devp2p/blob/26e380b1f3a57db16fbdd4528dde82104c77fa38/discv5/discv5-wire.md#talkreq-request-0x05 states: The recipient must respond with a TALKRESP message containing the response to the request.

A proper solution could be to not use TALKREQ/TALKRESP for the underlying uTP transport but a different (stream focused) packet type (whether or not part of the discv5 protocol).

Additionally, whether it is allowed to use several TALKRESP messages to respond to a single TALKREQ is rather ambiguous. The discv5 specification does not state that this is disallowed, but the only place where it explicitly says to do this is in the NODES response message.
This is currently specified in the Portal Network for the NODES message, similarly to how it is done in the discv5 base layer.
The FOUNDCONTENT message, which can be similar (or probably larger) in size, does not specify this however, so according to the specification that response cannot consist of several messages.

Chain History Section clarification

This is not an issue per se, but rather a thread to resolve some questions regarding the chain history section.

1. The current spec states that When a portal client requests a certain header, the response includes two accumulator proofs, which suggests that at any point a portal client can request one header, receive one header in response along with two accumulator proofs, and be sure that the received header is part of the canonical chain. Unless I am missing something, it would be good to describe in the spec that in reality the client needs to randomly sample log(n) blocks from the peer to correctly validate that headers are in fact part of the canonical chain. We can probably also re-use some of the ideas from the https://eprint.iacr.org/2019/226.pdf paper.

2. It would be nice to add some clarification on what changes will be necessary to client software and the protocol to make this feature available and secure:

  • should a new field be added to the block header to make the accumulator part of consensus? This is the approach used in the FlyClient paper for research purposes, but it is probably a no-go due to the amount of breakage it would cause.
  • maybe we can adapt another field from the block header for this purpose, like parentHash. This is the way Zcash went when adapting the FlyClient protocol - https://zips.z.cash/zip-0221#abstract. Although they adapted a field other than parentHash due to troubles with chain reorganisations.
  • The FlyClient paper mentions something called velvet forks, where upgraded and non-upgraded clients work in parallel.

Maybe there is some other way to do this; either way it would be great to include it in the spec.

I just wanted to kick-start a discussion about those things (or get some clarifications added to the spec if those problems are already solved and all is fine). From a usability point of view the state network is probably the most important, as it enables users to check their balances and send transactions, but from a security point of view chain history is probably the more important one, as all state inclusion proofs are validated against a header which the portal client thinks is part of the canonical chain.

Limiting software churn

Software engineers say "churn" to describe "flying the plane as you build it". In other words, rewriting software as new ideas emerge.

To limit such churn, it may be wise to delay breaking changes so that there are periods of stability. After a period of stability, we can batch any ready breaking changes and apply them over a short period of time. Examples of breaking changes include size optimizations like #38 and code merkleization #11.

There may be exceptions. For example, it may be wise to immediately apply bug fixes, for example #45.

Non-breaking changes (which don't affect software) may be useful at any time. For example simplifications like #41.

Placement of content-type within content-key

Currently the state network specifies that the content-key is a concatenation of a 1-byte content-type along with an SSZ Container which contains all other relevant info. The history network, on the other hand, uses just a single SSZ container, with the content-type as one of the fields within the container. I think we decided in the last call that we would settle on the latter, since that would be more efficient.

However @carver mentioned that "if we use a union type, the 4-byte value that SSZ uses to identify the type could fully replace the content-type byte." From my understanding of union types, then, we would only have one value, something like: Union[Container], where the value.value is the Container, and the value.selector is a 4-byte value that indicates the content-type of the Container it is containing.

But why would we want this? If we include content-type as a field in the Container, then this uses only 1 byte, whereas the union solution would waste 3 extra bytes. My only thought on why we would do this is that by leaving content-type out of the Container, we get a self-contained struct that only contains fields relevant to the data item, not the protocol.

Questions regarding uTP

  • Why can't we use TCP purely to stream data, and avoid the discv5/UDP stack entirely during this transfer? What are the benefits of routing it through the current stack?
  • Why can't we build TCP on top of discv5? (My knowledge of TCP fails a bit here, but traditionally TCP would pass a segment down to IP, not discv5. What's preventing us from stuffing a TCP segment into a discv5 packet?)
  • Why uTP over other UDP-extensible protocols like QUIC?

Witness examples

It would be super-nice to have some encoded witness examples together with some note on the expected validity and output tree structure.

These can be really simple as a starter - so both on the number of examples and scope of the tries - but this would already help a lot to have some reference to check against.

Witness chunking

Describe how to rebuild a trie from a witness split into "chunks".

Witness spec: 0xbb canary

<Tree> := 0xbb m:<Metadata> n:<Tree_Node(0)>
          {a tuple (m, n)}

Is the 0xbb a canary or does it have any other use? If only a canary, it would be nice to mention it in a note.

Passive Bridges

This is an idea I had for a new bridge design. Any feedback would be appreciated.

The current bridge design requires nodes to actively push content into the network. This has the drawback of adding content that is potentially not as useful to end users of the network. Passive bridges solve this issue by fetching missing data lazily.

Passive bridges will work like normal network participants but can be identified by a special radius value. The radius value allows other participants to first query the network and if not found query the bridge. This allows for any missing content to be added on demand.

Truthful revelation of radius value

Copied from a side discussion on #89

The major thing that I believe needs to be addressed is that a node's published radius value cannot be trusted and nodes have an incentive to lie about this value. The larger your radius, the more responsibility you carry in the network... and the closer you are to the nodes in the network which take transactions and include them in a block. A node that wants to increase its probability of getting its transaction included would be incentivized to lie about its radius in order to get closer to the full radius nodes.

So by this logic, we cannot structure our network based on radius unless we are able to make radius a value we can trust. Which leads me towards trying to understand if we can accurately "measure" a node's radius simply by observing what transactions they gossip.

Additional discussions about incentives and how to infer radius values are in issue 89.

Witness compression: storage leafs

The current specification is as follows:

<Storage_Leaf_Node(d<65)> := key:<Bytes32> val:<Bytes32>	
                             {leaf node with value (key, val)}

The key is always a hashed key, so it is not possible to do any reasonable compression on it. (Perhaps it is worth mentioning this in the spec?)

The val however is raw data. Since Solidity is the most widely used language, we can inspect what kinds of values it is storing (this is only a subset of possibilities):

  1. Value types (those fitting 256 bits) are stored as single storage elements. Numbers are stored as 256-bit big-endian numbers. Short fixed-size byte-arrays and strings are stored left-to-right.

  2. Larger than 256-bits types are split across multiple keys. This is the case for dynamic-length arrays for example. If the array has regular value types as elements, they can be individually stored as described in 1.

  3. Short types can be packed in "structs".

Based on the above, the case we could provide a simple optimisation for is when numbers are stored in individual storage slots. Is that however a common case? Yes! Solidity "mappings" of balances, a common feature of ERC-20 tokens, would store the values in such a manner.

As a naive option we could extend the storage leafs as follows -- I hope I got the notation correctly:

<Storage_Leaf_Node(d<65)> := 0x00 key:<Bytes32> val:<Bytes32>	
                             {leaf node with value (key, val)}

<Storage_Leaf_Node(d<65)> := 0x01 key:<Bytes32> val:<Integer(256)>
                             {leaf node with value (key, val)}

In the above I refer to the Integer() notation introduced in #36. It could potentially also be combined with #38.

I am not sure it will be a worthwhile optimisation in the end, but I think it would be a valid one to check once there are prototypes ready for experimentation.
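
A sketch of the proposed two-selector leaf encoding, using LEB128 as a stand-in for the Integer() notation from #36; whether this pays off in practice is exactly the open question above.

```python
def encode_uleb128(value: int) -> bytes:
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        out.append(byte | (0x80 if value else 0x00))
        if not value:
            return bytes(out)

def encode_storage_leaf(key: bytes, val: bytes) -> bytes:
    assert len(key) == 32 and len(val) == 32
    as_int = int.from_bytes(val, "big")        # numbers are stored big-endian
    compressed = encode_uleb128(as_int)
    if len(compressed) < len(val):
        return b"\x01" + key + compressed      # integer-valued leaf (selector 0x01)
    return b"\x00" + key + val                 # raw 32-byte leaf (selector 0x00)
```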

Suggestion to abstract the wire protocol from the state and history network specifications.

This is an issue to discuss on how to avoid redoing the same or similar wire protocol code for the different networks.

Abstract wire protocol

The suggestion is to simply abstract out the wire protocol from the specifications, into its own specification that can be reused by the different network specifications, nothing more.
It is different from the Extensible Discovery v5 Sub-Protocol Architecture proposal, but attempts to achieve the same, hopefully with less complexity involved. From what we know now from the state and history specifications (and probably also for tx and header), it looks like the Sub-Protocol Architecture proposal is not needed, and adds complexity with little in return.

So the idea is to simply have the wire protocol in a separate specification and have the option to configure the talk protocol id (e.g. "portal:state", "portal:history", etc.). Each network has its own instance of the wire protocol, and still has its own routing table of course. Anything content related that needs to be customized can be done a layer above.

Both proposals do require the actual wire protocol to be the same; that doesn't change. (Although this is probably not strictly necessary to get it done in implementations, it makes it much easier; see custom messages above.)
This means:

  • Same messages (request & responses)
  • Same SSZ Containers (aside from custom payload)

So some adjustments will have to be made to the current specifications (one is already being done: #75)

This would all require abstracting out content_key serialization, content_id generation, content storage, and probably other parts. But that should be a good thing, as it is something that ideally gets abstracted away from the networking code, and it is likely also something that would need to be done with the other proposal anyhow.
This can all be done fairly easily, as the actual SSZ sedes for content_key, for example, is a ByteList, not the serialized SSZ container held inside that ByteList.
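A small sketch of that abstraction, under the same caveat that the names are illustrative: the wire layer only ever sees content_key as an opaque ByteList, and the network layer supplies the content_id derivation and storage (the sha256 rule here is just a placeholder).

```python
import hashlib
from typing import Callable, Dict


class ContentNetwork:
    def __init__(self, derive_content_id: Callable[[bytes], bytes]) -> None:
        self.derive_content_id = derive_content_id  # network-specific rule
        self.store: Dict[bytes, bytes] = {}

    def handle_content(self, content_key: bytes, content: bytes) -> None:
        # `content_key` arrives as raw bytes; decoding the container it was
        # serialized from is a concern of this layer, not of the wire protocol.
        self.store[self.derive_content_id(content_key)] = content


# history_net = ContentNetwork(lambda key: hashlib.sha256(key).digest())
```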

Pros/cons

In my opinion this suggestion is less complex to implement, and to me it is also a bit more intuitive in terms of how the protocols are layered.
Additionally, you can avoid deserializing the SSZ containers of the messages for networks you do not support, stopping the data flow at the discv5 (TALKREQ/TALKRESP) level. That seems like a better place to "split" the different network functionality.

Splitting the "overlay" messages from the "content" messages instead splits the wire protocol not at the level of the different networks, but at the level of message functionality. This is rather annoying, as for example both overlay and content messages need access to the same routing table (it is the same network). I don't think we really need to split these messages into different protocols; wouldn't they always go hand in hand?
Additional custom messages (per network) could be handled by some form of wire protocol extensions, if there is a need for such a thing.

The Extensible Discovery v5 Sub-Protocol Architecture proposal does have the benefit of allowing a custom payload per protocol message, and that is probably the main reason for having it.
Right now I don't see any of that customized data being necessary, but that can of course change. If someone already knows about specific payloads that are necessary, that would be good to know. Otherwise I don't believe it is wise to complicate this spec/code for something we are not yet sure will be useful; the spec/code can always be altered if this turns out to be needed.
Additionally, I believe such a custom payload can also be supported with this proposal, by adding an (optional?) ByteList field to the container, which can then be constructed differently depending on the network. (Yes, this would make it slightly more complex again, but at least you only do so once you see it is needed.)
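Purely as a schematic illustration of that last point (this is not the spec's SSZ definition), a shared message shape could carry an opaque payload field that each network interprets, or ignores, on its own:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ping:
    enr_seq: int            # uint64 in an actual SSZ container
    custom_payload: bytes   # ByteList; empty if the network defines no payload


def interpret_custom_payload(msg: Ping, network: str) -> Optional[bytes]:
    if network == "portal:state":
        return msg.custom_payload  # parse a network-specific container here
    return None                    # other networks simply ignore the field
```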

Thoughts?

Additional info

Witness compression idea for account balances

(I shared this idea back on 21st April on Discord, where it was briefly discussed. I also acknowledge this is likely premature optimisation, but left it here for future reference.)

Account balances are 256-bit values and Ether has 18 decimal digits, so a balance of 1 Ether equals 1e18 wei, which is quite large even when encoded as LEB128 as proposed in #30.

Since most account balances fall between some dust amount and a few hundred Ether, we could use this to our advantage. (The actual median balance could be extracted from the state; as a starting point for experimentation I would suggest 0.1 ETH.)

An idea would be to encode each balance as its difference from this median value and store that as signed LEB128. By signed LEB128 I mean a representation where a sign bit is stored, not a two's complement encoding.

Note: I would not suggest making this median value "rolling"; it should just be defined in the specification.
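A minimal sketch of this encoding, under the assumptions stated above: a fixed median of 0.1 ETH, and the sign stored as a separate leading byte purely for illustration (the proposal only requires that a sign bit is stored somewhere, not this exact placement).

```python
MEDIAN_WEI = 10**17  # 0.1 ETH, fixed in the specification rather than rolling


def leb128_unsigned(value: int) -> bytes:
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        out.append(byte | (0x80 if value else 0x00))
        if not value:
            return bytes(out)


def encode_balance(balance_wei: int) -> bytes:
    delta = balance_wei - MEDIAN_WEI
    sign = b"\x01" if delta < 0 else b"\x00"
    return sign + leb128_unsigned(abs(delta))


# A balance exactly at the median encodes in 2 bytes; encoding 10**17 wei
# directly as unsigned LEB128 would take 9 bytes.
```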

Canonical witness verification

We need to be able to mitigate the following issues (one possible check is sketched after this list):

  • a malicious witness builder including trie nodes that aren't used for block execution; the root hash stays correct, but bandwidth is wasted;
  • a malicious witness builder omitting (hashing out) needed information. It is very easy to take a trie that has all the nodes needed for execution and replace one leaf with a hash node while keeping the root hash correct; that trie will look valid until you actually try to execute the block against it.
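As a rough illustration only (this is not a specified algorithm), both attacks can be caught with the same bookkeeping over a simplified in-memory trie where nodes are ("branch", children), ("leaf", path, value) or ("hash", h): walking each accessed key must never land on a bare hash node, and every supplied node must be touched by at least one accessed key.

```python
def resolve(node, path, visited):
    """Walk one key through the witness, recording every node touched."""
    visited.add(id(node))
    kind = node[0]
    if kind == "hash":
        raise ValueError("missing data: accessed path ends in a bare hash node")
    if kind == "leaf":
        return node[2] if node[1] == path else None
    child = node[1][path[0]]        # branch: descend on the next nibble
    if child is None:
        return None                 # key is absent from the trie
    return resolve(child, path[1:], visited)


def check_witness(root, all_nodes, accessed_paths):
    visited = set()
    for path in accessed_paths:
        resolve(root, path, visited)            # catches the hash-node attack
    if any(id(n) not in visited for n in all_nodes):
        raise ValueError("witness contains nodes not needed for execution")
```

(Extension nodes and proofs of exclusion are ignored here for brevity.)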

Defining the structure of hexary patricia trie proofs

We need to define our data structure for working with proofs from the Hexary Patricia Merkle trie (the current trie structure used by Ethereum).

Desired Properties

  • A: There is exactly one correct proof for any account or storage slot.
  • B: Proofs are minimal, containing only data that is necessary for the proof and nothing extra.
  • C: Validating proofs can be done in a single pass over the data with no sorting. This means that either the format includes metadata about the structure of the trie nodes or the structure of the trie nodes is fundamentally tied to the correct ordering and serialization of the proof.

Option A: Use eth_getProof

The simplest option is to use the format defined by EIP-1186 which defines the eth_getProof JSON-RPC endpoint.

This approach does not give us any of the expressed goals A, B, or C without additional specification.

Option B: Extend eth_getProof

We can extend the specification from EIP-1186 such that it satisfies A, B, and C.

This involves precisely specifying the traversal ordering of trie nodes in the proof.
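As a sketch of what that could mean in practice, over a simplified trie representation where a branch node is ("branch", children) and an unexpanded reference is ("hash", h): emit proof nodes depth-first from the root, children in nibble order 0x0..0xf. A fixed order gives exactly one serialization per proof (property A) and lets a validator consume the stream in a single pass (property C).

```python
def ordered_proof_nodes(node):
    """Yield proof nodes in one canonical order: depth-first, nibble 0x0..0xf."""
    yield node
    if node[0] == "branch":
        for child in node[1]:               # fixed left-to-right nibble order
            if child is not None and child[0] != "hash":
                yield from ordered_proof_nodes(child)
```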

Option C: Write our own

We can still leverage the existing work from EIP-1186, but write our own specification.

Other considerations

Another thing to consider is a possible future where we send what I'll refer to as "sub-proofs" for bandwidth savings. Suppose the proof I'm receiving updates my view from block N to the new state at block N+1, and suppose that I have already received some of the intermediate trie data for block N+1. To reduce the amount of redundant data that must be transferred, we could allow the receiver to communicate which parts of the trie they already have. Instead of anchoring every proof firmly to the state root, proofs could then be anchored to some intermediate trie node, since the receiving node has already signaled that it holds the necessary intermediate trie nodes locally to complete the proof.
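A hedged sketch of that idea, using the same simplified node shapes as the sketches above and with `node_hash` standing in for the real node hashing: given the set of node hashes the receiver reports it already holds, the sender prunes those subtrees down to bare hash references, so the transmitted proof is anchored at intermediate nodes the receiver can supply locally.

```python
def prune_proof(node, known_hashes, node_hash):
    h = node_hash(node)
    if h in known_hashes:
        return ("hash", h)              # receiver already has this whole subtree
    if node[0] == "branch":
        return ("branch", [
            None if child is None else prune_proof(child, known_hashes, node_hash)
            for child in node[1]
        ])
    return node                         # leaf or hash node: send as-is
```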
