
This project forked from databendcloud/bend-ingest-kafka


License: Apache License 2.0


bend-ingest-kafka

Ingest Kafka data into Databend.

Installation

go install github.com/databendcloud/bend-ingest-kafka@latest

Or download the binary from the release page.

Usage

JSON transform mode

JSON transform mode is the default mode. It transforms the Kafka JSON data into rows of a Databend table. To enable it explicitly, set --is-json-transform to true.

Create a table according to your Kafka data structure

For example, if the Kafka data looks like

{"i64": 10,"u64": 30,"f64": 20,"s": "hao","s2": "hello","a16":[1],"a8":[2],"d": "2011-03-06","t": "2016-04-04 11:30:00"}

you should create a table using

CREATE TABLE test_ingest (
    i64 Int64,
    u64 UInt64,
    f64 Float64,
    s   String,
    s2  String,
    a16 Array(Int16),
    a8  Array(UInt8),
    d   Date,
    t   DateTime
);
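To make the correspondence between the sample message and the table columns concrete, here is a minimal Go sketch that decodes the message into a struct with one field per column. The `Record` type and `ParseRecord` function are hypothetical names used for illustration only; they are not part of bend-ingest-kafka.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Record is a hypothetical struct mirroring the sample Kafka message.
// Each field lines up with one column of the test_ingest table.
type Record struct {
	I64 int64   `json:"i64"` // Int64
	U64 uint64  `json:"u64"` // UInt64
	F64 float64 `json:"f64"` // Float64
	S   string  `json:"s"`   // String
	S2  string  `json:"s2"`  // String
	A16 []int16 `json:"a16"` // Array(Int16)
	// Array(UInt8): a wider element type is used here because
	// encoding/json marshals []uint8 ([]byte) as a base64 string.
	A8 []uint16 `json:"a8"`
	D  string   `json:"d"` // Date, kept as a string in this sketch
	T  string   `json:"t"` // DateTime, kept as a string in this sketch
}

// ParseRecord decodes one Kafka message value into a Record.
func ParseRecord(value []byte) (Record, error) {
	var r Record
	err := json.Unmarshal(value, &r)
	return r, err
}

func main() {
	msg := []byte(`{"i64": 10,"u64": 30,"f64": 20,"s": "hao","s2": "hello","a16":[1],"a8":[2],"d": "2011-03-06","t": "2016-04-04 11:30:00"}`)
	r, err := ParseRecord(msg)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%+v\n", r)
}
```

A message that fails to decode this way (for example, a string where the table expects a number) is a hint that the table definition and the Kafka data have drifted apart.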

Execute bend-ingest-kafka

Command line mode

bend-ingest-kafka \
  --kafka-bootstrap-servers="127.0.0.1:9092,127.0.0.2:9092" \
  --kafka-topic="Your Topic" \
  --kafka-consumer-group="Consumer Group" \
  --databend-dsn="http://root:[email protected]:8000"\
  --databend-table="db1.tbl" \
  --data-format="json" \
  --batch-size=100000 \
  --batch-max-interval=300

Config file mode

Edit the config file config/conf.json

{
  "kafkaBootstrapServers": "localhost:9092",
  "kafkaTopic": "ingest_test",
  "KafkaConsumerGroup": "test",
  "mockData": "",
  "isJsonTransform": true,
  "databendDSN": "https://cloudapp:password@tn3ftqihs--medium-p8at.gw.aws-us-east-2.default.databend.com:443",
  "databendTable": "default.kfk_test",
  "batchSize": 1,
  "batchMaxInterval": 5,
  "dataFormat": "json",
  "workers": 1,
  "copyPurge": false,
  "copyForce": false
}

and execute the command

./bend-ingest-kafka 
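As a rough illustration of how a config file with these keys can be read, the following Go sketch decodes it into a struct. The `Config` type, the `ParseConfig` and `LoadConfig` functions, and the `config/conf.json` path used in `main` are assumptions made for this example; the tool's own loader may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Config is a hypothetical struct whose JSON tags match the keys
// of the config file shown above.
type Config struct {
	KafkaBootstrapServers string `json:"kafkaBootstrapServers"`
	KafkaTopic            string `json:"kafkaTopic"`
	KafkaConsumerGroup    string `json:"KafkaConsumerGroup"`
	MockData              string `json:"mockData"`
	IsJsonTransform       bool   `json:"isJsonTransform"`
	DatabendDSN           string `json:"databendDSN"`
	DatabendTable         string `json:"databendTable"`
	BatchSize             int    `json:"batchSize"`
	BatchMaxInterval      int    `json:"batchMaxInterval"`
	DataFormat            string `json:"dataFormat"`
	Workers               int    `json:"workers"`
	CopyPurge             bool   `json:"copyPurge"`
	CopyForce             bool   `json:"copyForce"`
}

// ParseConfig decodes the JSON bytes of a config file.
func ParseConfig(data []byte) (*Config, error) {
	var c Config
	if err := json.Unmarshal(data, &c); err != nil {
		return nil, fmt.Errorf("parse config: %w", err)
	}
	return &c, nil
}

// LoadConfig reads the file at path and decodes it.
func LoadConfig(path string) (*Config, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("read config: %w", err)
	}
	return ParseConfig(data)
}

func main() {
	// The path is an assumption; point it at your own config file.
	cfg, err := LoadConfig("config/conf.json")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("topic=%s table=%s batchSize=%d\n",
		cfg.KafkaTopic, cfg.DatabendTable, cfg.BatchSize)
}
```

Decoding the file this way before starting the tool is a quick check that the JSON is well-formed and that each value has the expected type.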

Raw mode

Raw mode ingests the raw Kafka data into a Databend table; enable it by setting isJsonTransform to false. In this mode, bend-ingest-kafka creates a table named by databendTable with the columns (uuid, raw_data, record_metadata, add_time) and ingests the raw data into it. record_metadata holds the metadata of the Kafka record (topic, partition, offset, create_time, and key), and add_time is the time at which the record was added to Databend.
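The shape of a raw-mode row can be sketched in Go as follows. This is only an illustration of the four columns described above: the `RawRow` and `RecordMetadata` types, the Go field types, and the `NewRawRow` and `newUUID` helpers are all assumptions, not the tool's actual code or column types.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"time"
)

// RecordMetadata holds the Kafka record fields listed in the text.
// The Go types are assumptions for this sketch.
type RecordMetadata struct {
	Topic      string
	Partition  int32
	Offset     int64
	CreateTime time.Time
	Key        string
}

// RawRow is a hypothetical in-memory form of one raw-mode row.
type RawRow struct {
	UUID           string
	RawData        string
	RecordMetadata RecordMetadata
	AddTime        time.Time
}

// newUUID builds a random (version 4) UUID using only the standard library.
func newUUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x",
		b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// NewRawRow wraps an unmodified Kafka message value together with its
// metadata; AddTime is set to the moment the row is built.
func NewRawRow(meta RecordMetadata, value []byte) (RawRow, error) {
	id, err := newUUID()
	if err != nil {
		return RawRow{}, err
	}
	return RawRow{
		UUID:           id,
		RawData:        string(value),
		RecordMetadata: meta,
		AddTime:        time.Now(),
	}, nil
}

func main() {
	meta := RecordMetadata{Topic: "ingest_test", Partition: 0, Offset: 42, Key: "k1"}
	row, err := NewRawRow(meta, []byte(`{"i64": 10}`))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%+v\n", row)
}
```

The point of the sketch is that raw mode leaves the message body untouched and stores it next to the metadata, so no table has to be designed around the message fields in advance.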

Example

If the kafka json data is:

{"i64": 10,"u64": 30,"f64": 20,"s": "hao","s2": "hello","a16":[1],"a8":[2],"d": "2011-03-06","t": "2016-04-04 11:30:00"}

run the command

./bend-ingest-kafka

with config/conf.json; the table default.kfk_test will be created and the data will be ingested into it.

Parameter References

| Parameter | Description | Default | Example |
|---|---|---|---|
| kafkaBootstrapServers | kafka bootstrap servers | "127.0.0.1:64103" | "127.0.0.1:9092,127.0.0.2:9092" |
| kafkaTopic | kafka topic | "test" | "test" |
| KafkaConsumerGroup | kafka consumer group | "kafka-bend-ingest" | "test" |
| mockData | mock data | "" | "" |
| isJsonTransform | is json transform | true | true |
| databendDSN | databend dsn | no | "http://localhost:8000" |
| databendTable | databend table | no | "db1.tbl" |
| batchSize | batch size | 1000 | 1000 |
| batchMaxInterval | batch max interval | 30 | 30 |
| dataFormat | data format | json | "json" |
| workers | workers thread number | 1 | 1 |
| copyPurge | copy purge | false | false |
| copyForce | copy force | false | false |
| DisableVariantCheck | disable variant check | false | false |
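The defaults listed above can be written down as a small Go sketch. Since databendDSN and databendTable have no default, the sketch treats them as required, which is an assumption drawn from the table rather than from the tool's source. The `Params` type and the `DefaultParams` and `Validate` functions are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Params is a hypothetical struct holding the parameters listed above.
type Params struct {
	KafkaBootstrapServers string
	KafkaTopic            string
	KafkaConsumerGroup    string
	MockData              string
	IsJsonTransform       bool
	DatabendDSN           string
	DatabendTable         string
	BatchSize             int
	BatchMaxInterval      int
	DataFormat            string
	Workers               int
	CopyPurge             bool
	CopyForce             bool
	DisableVariantCheck   bool
}

// DefaultParams returns the defaults from the parameter table.
// Fields left out keep Go's zero value ("" or false), which matches
// the table for mockData, copyPurge, copyForce and DisableVariantCheck.
func DefaultParams() Params {
	return Params{
		KafkaBootstrapServers: "127.0.0.1:64103",
		KafkaTopic:            "test",
		KafkaConsumerGroup:    "kafka-bend-ingest",
		IsJsonTransform:       true,
		BatchSize:             1000,
		BatchMaxInterval:      30,
		DataFormat:            "json",
		Workers:               1,
	}
}

// Validate reports the parameters that have no default and are still empty.
func (p Params) Validate() error {
	if p.DatabendDSN == "" {
		return errors.New("databendDSN has no default and must be set")
	}
	if p.DatabendTable == "" {
		return errors.New("databendTable has no default and must be set")
	}
	return nil
}

func main() {
	p := DefaultParams()
	p.DatabendDSN = "http://localhost:8000"
	p.DatabendTable = "db1.tbl"
	if err := p.Validate(); err != nil {
		fmt.Println("invalid parameters:", err)
		return
	}
	fmt.Printf("%+v\n", p)
}
```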
