chainsformer's Introduction

Overview

Chainsformer is a stateless adaptor service that exposes ChainStorage data through Apache Arrow Flight. It currently supports batch data processing and micro-batch data streaming from the ChainStorage service to the Spark data processing platform.

It aims to provide a set of easy-to-use interfaces that let Spark consumers read and process ChainStorage data on the Spark platform:

  • It defines standardized block and transaction data schemas for each asset class (e.g., EVM assets or Bitcoin).
  • It transforms data from protobuf to the Arrow format.
  • As a stateless service, it can be scaled out to support higher data throughput.
  • It can be easily integrated via the Chainsformer Spark Connector (https://github.com/coinbase/chainsformer-spark-source) for structured data streaming.

Quick Start

Make sure your local Go version is 1.18 by running the following commands:

brew install go@1.18
brew unlink go
brew link go@1.18

brew install [email protected]
brew unlink protobuf
brew link protobuf

To set up for the first time (only done once):

make bootstrap

Rebuild everything:

make build

Configuration

Environment Variables

Chainsformer depends on the following environment variables to resolve the path of the configuration. The directory structure is as follows: config/chainsformer/{blockchain}/{network}/{environment}.yml.

  • CHAINSFORMER_CONFIG: This env var, in the format of {blockchain}-{network}, determines the blockchain and network managed by the service. The naming is defined in chainstorage/protos/coinbase/c3/common/common.proto.
  • CHAINSFORMER_ENVIRONMENT: This env var controls the {environment} in which the service is deployed. Possible values include production, development, and local (which is also the default value).
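
As a rough illustration of how the two variables map onto the directory structure above, the following Go sketch builds the config path from their values. It is not the service's actual loader: the function name configPath is hypothetical, and splitting CHAINSFORMER_CONFIG on its first hyphen is an assumption about how {blockchain} and {network} are separated.

```go
package main

import (
	"fmt"
	"strings"
)

// configPath is a hypothetical helper that maps the two environment variable
// values onto config/chainsformer/{blockchain}/{network}/{environment}.yml.
func configPath(config, environment string) (string, error) {
	// Assumption: the first hyphen separates the blockchain from the network.
	blockchain, network, ok := strings.Cut(config, "-")
	if !ok || blockchain == "" || network == "" {
		return "", fmt.Errorf("invalid CHAINSFORMER_CONFIG %q, want {blockchain}-{network}", config)
	}
	// "local" is the documented default environment.
	if environment == "" {
		environment = "local"
	}
	return fmt.Sprintf("config/chainsformer/%s/%s/%s.yml", blockchain, network, environment), nil
}

func main() {
	// Example values; a real program would read them with os.Getenv.
	path, err := configPath("ethereum-mainnet", "")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(path)
}
```

With CHAINSFORMER_CONFIG=ethereum-mainnet and CHAINSFORMER_ENVIRONMENT unset, the sketch resolves to config/chainsformer/ethereum/mainnet/local.yml.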

Service Configurations

Asset-specific configurations are stored in the config directory of the Chainsformer service repo. The folder structure has the form ./config/chainsformer/{blockchain}/{network}/base.yml.

New Blockchain Configurations

  • Follow the config folder structure to add configurations for new blockchains or for new networks of existing blockchains.
  • Add new tests in config_test.go.
  • Add new test configs in testapp.go.

Development

Running Chainsformer Server

Clone the Chainsformer service repo:

git clone https://github.com/coinbase/chainsformer.git

Change directory to the Chainsformer service repo:

cd chainsformer

Set up the ChainStorage SDK credentials:

export CHAINSTORAGE_SDK_AUTH_HEADER=cb-nft-api-token
export CHAINSTORAGE_SDK_AUTH_TOKEN=****

To set up Chainsformer for the first time (only done once):

make bootstrap

Rebuild Chainsformer:

make build

Start the Chainsformer service with default CHAINSFORMER_CONFIG=ethereum-mainnet:

make server

Run test client

Query Chainsformer for a range of blocks

go run ./cmd/client --env local --blockchain ethereum --network mainnet --start 0 --end 10 --table blocks

Query Chainsformer for a range of block events

go run ./cmd/client --env local --blockchain ethereum --network mainnet --start 0 --end 10 --table streamed_blocks

Use grpcurl

Query Chainsformer for a range of blocks

Calling the GetSchema API

cmd=$(echo -n '{"table": "blocks"}' | base64)
grpcurl --plaintext -d '{"cmd":'"\"$cmd\""',"type":2}' localhost:9090 arrow.flight.protocol.FlightService.GetSchema

Calling the GetFlightInfo API to partition the data

# tr removes the line wraps that GNU base64 inserts into long output
cmd=$(echo -n '{"batch_query": {"start_height": 0, "end_height": 10, "table": "blocks"}}' | base64 | tr -d '\n')
grpcurl --plaintext -d '{"cmd":'"\"$cmd\""',"type":2}' localhost:9090 arrow.flight.protocol.FlightService.GetFlightInfo

Take one of the tickets returned by the above command:

...
"endpoint": [
    {
      "ticket": {
        "ticket": "eyJiYXRjaF9xdWVyeSI6eyJlbmRfaGVpZ2h0IjoiMTAiLCJ0YWJsZSI6ImJsb2NrcyJ9fQ=="
      }
    }
  ]
...
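
A ticket of this form is base64-encoded JSON, so it can be decoded to see which partition it describes. The Go sketch below decodes the sample ticket shown above; the function name decodeTicket is hypothetical, and the sketch assumes every ticket carries a JSON payload like this one.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// decodeTicket base64-decodes a ticket and checks that the payload is JSON.
func decodeTicket(ticket string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(ticket)
	if err != nil {
		return "", fmt.Errorf("decode ticket: %w", err)
	}
	if !json.Valid(raw) {
		return "", fmt.Errorf("ticket payload is not JSON: %q", raw)
	}
	return string(raw), nil
}

func main() {
	// Sample ticket copied from the GetFlightInfo response above.
	const ticket = "eyJiYXRjaF9xdWVyeSI6eyJlbmRfaGVpZ2h0IjoiMTAiLCJ0YWJsZSI6ImJsb2NrcyJ9fQ=="
	payload, err := decodeTicket(ticket)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(payload) // {"batch_query":{"end_height":"10","table":"blocks"}}
}
```

The decoded payload has no start_height field, presumably because a zero value is omitted when the ticket is serialized.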

Calling the DoGet API to get data for one of the partitions

grpcurl --plaintext -d '{"ticket": "eyJiYXRjaF9xdWVyeSI6eyJlbmRfaGVpZ2h0IjoiMTAiLCJ0YWJsZSI6ImJsb2NrcyJ9fQ=="}' localhost:9090 arrow.flight.protocol.FlightService.DoGet

Calling the DoGet API to get data of a specific partition

cmd=$(echo -n '{"batch_query":{"start_height":"1", "end_height":"2", "table":"blocks"}}' | base64 | tr -d '\n')
grpcurl --plaintext -d '{"ticket": '"\"$cmd\""'}' localhost:9090 arrow.flight.protocol.FlightService.DoGet

Calling the DoAction API to get the tip in ChainStorage via Chainsformer

grpcurl --plaintext -d '{"type": "TIP"}' localhost:9090 arrow.flight.protocol.FlightService.DoAction | jq '.body | @base64d'

Query Chainsformer for a range of block events

Calling the GetSchema API

cmd=$(echo -n '{"table": "streamed_blocks"}' | base64)
grpcurl --plaintext -d '{"cmd":'"\"$cmd\""',"type":2}' localhost:9090 arrow.flight.protocol.FlightService.GetSchema

Calling the GetFlightInfo API to partition the data

cmd=$(echo -n '{"stream_query": {"start_sequence": 0, "end_sequence": 10, "table": "streamed_blocks"}}' | base64 | tr -d '\n')
grpcurl --plaintext -d '{"cmd":'"\"$cmd\""',"type":2}' localhost:9090 arrow.flight.protocol.FlightService.GetFlightInfo

Take one of the tickets returned by the above command:

...
"endpoint": [
    {
      "ticket": {
        "ticket": "eyJzdHJlYW1fcXVlcnkiOnsic3RhcnRfc2VxdWVuY2UiOiIxIiwiZW5kX3NlcXVlbmNlIjoiMTAiLCJ0YWJsZSI6InN0cmVhbWVkX2Jsb2NrcyJ9fQ=="
      }
    }
  ]
...

Calling the DoGet API to get data for one of the partitions

grpcurl --plaintext -d '{"ticket": "eyJzdHJlYW1fcXVlcnkiOnsic3RhcnRfc2VxdWVuY2UiOiIxIiwiZW5kX3NlcXVlbmNlIjoiMTAiLCJ0YWJsZSI6InN0cmVhbWVkX2Jsb2NrcyJ9fQ=="}' localhost:9090 arrow.flight.protocol.FlightService.DoGet

Calling the DoGet API to get data of a specific partition

cmd=$(echo -n '{"stream_query":{"start_sequence":"1", "end_sequence":"2", "table":"streamed_blocks"}}' | base64 | tr -d '\n')
grpcurl --plaintext -d '{"ticket": '"\"$cmd\""'}' localhost:9090 arrow.flight.protocol.FlightService.DoGet

Calling the DoAction API to get the tip in ChainStorage via Chainsformer

grpcurl --plaintext -d '{"type": "STREAM_TIP"}' localhost:9090 arrow.flight.protocol.FlightService.DoAction | jq '.body | @base64d'

Testing

Unit Test

# Run everything
make test

Integration Test

Under development

Functional Test

Under development
