Stock Streaming Demo with Confluent Datagen and ksqlDB

This demo uses the kafka-connect-datagen Kafka Connect connector (Confluent's Datagen source connector, the Connect-based counterpart of the ksql-datagen CLI) to generate fake stock data, which is then processed with ksqlDB.

Goals

  1. Produce fake stock data for two fictitious companies (ACME and HOOLI) to a Kafka topic, where each stock quote is a random double chosen between 100 and 200

  2. Compute the average stock price over a 1-minute tumbling window, keeping only windows whose average is over 170 (tagged with a SELL action) or under 130 (tagged with a BUY action); a plain-Python sketch of this tagging rule follows the list

  3. Write the filtered 1-minute stock quote windows identified in step 2 out to a backing topic/stream so that a downstream consumer can react to those buy/sell actions
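
For illustration only, here is the tagging rule from goal 2 expressed as a small Python function. This is just a sketch of the logic that the ksqlDB statements in the Process section implement; the function name and example values are made up for this sketch.

from typing import Optional

def tag_window(avg_1min_quote: float) -> Optional[str]:
    # Windows averaging above 170 are SELL signals, below 130 are BUY
    # signals; anything in between is filtered out entirely (None).
    if avg_1min_quote > 170:
        return "SELL"
    if avg_1min_quote < 130:
        return "BUY"
    return None

print(tag_window(175.2))  # SELL
print(tag_window(150.0))  # None -- window is dropped
print(tag_window(110.8))  # BUY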

Process

  1. Clone the Repo
git clone https://github.com/amcquistan/kafka-stocks-datagen-with-ksql.git
  2. Fire Up the Docker Compose Services
cd kafka-stocks-datagen-with-ksql
docker-compose up -d
  3. Create the stocks Topic in Kafka
docker exec -it broker kafka-topics \
	--bootstrap-server localhost:9092 --create --topic stocks
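
The kafka-topics CLI above is the simplest route; if you would rather create the topic programmatically, here is a minimal Python sketch using confluent-kafka-python's AdminClient (an assumption: the library is installed and the broker listener is published on localhost:9092 as in this demo's docker-compose; the partition and replication values are illustrative).

from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

# A single partition with replication factor 1 is plenty for a one-broker demo.
futures = admin.create_topics([NewTopic("stocks", num_partitions=1, replication_factor=1)])

for topic, future in futures.items():
    try:
        future.result()  # raises if creation failed (e.g. topic already exists)
        print(f"created topic {topic}")
    except Exception as exc:
        print(f"topic {topic} not created: {exc}")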
  4. Add the stocks-datagen Connector

Below are the Avro schema and the Datagen connector config. The arg.properties hints in the schema tell the Datagen connector how to generate values: options picks one of the listed strings at random, and range produces a uniform random double between min and max.

Schema: stocks.avsc

{
  "type": "record",
  "name": "stockquote",
  "fields": [
    {
      "name": "symbol",
      "type": {
        "type": "string",
        "arg.properties": {
          "options": [
            "ACME",
            "HOOLI"
          ]
        }
      }
    },
    {
      "name": "quote",
      "type": {
        "type": "double",
        "arg.properties": {
          "range": {
            "min": 100,
            "max": 200
          }
        }
      }
    }
  ]
}

Connector config: stocks-data-config.json

{
  "name": "stocks-datagen",
  "config": {
    "connector.class": "io.confluent.kafka.connect.datagen.DatagenConnector",
    "kafka.topic": "stocks",
    "schema.string": "{\"type\":\"record\",\"name\":\"stockquote\",\"fields\":[{\"name\":\"symbol\",\"type\":{\"type\":\"string\",\"arg.properties\":{\"options\":[\"ACME\",\"HOOLI\"]}}},{\"name\":\"quote\",\"type\":{\"type\":\"double\",\"arg.properties\":{\"range\":{\"min\":100,\"max\":200}}}}]}",
    "schema.keyfield": "symbol",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://schema-registry:8081",
    "value.converter.schemas.enable": "false",
    "max.interval": 2000,
    "iterations": 10000
  }
}


I prefer the HTTPie CLI over curl, so all REST requests below are shown with HTTPie.

http POST http://localhost:8083/connectors @stocks-data-config.json
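
If you want to script the same call, here is a minimal Python sketch with the requests library (an assumption; any HTTP client works) that posts the same stocks-data-config.json payload to the Connect REST API:

import json
import requests  # third-party: pip install requests

with open("stocks-data-config.json") as f:
    connector_config = json.load(f)

resp = requests.post(
    "http://localhost:8083/connectors",
    json=connector_config,  # sent as Content-Type: application/json
    timeout=10,
)
resp.raise_for_status()
print(resp.json())  # Connect echoes back the created connector definition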
  5. Verify the Connector Was Added and Is Running

Is it added?

http http://localhost:8083/connectors

You should see output similar to the following.

HTTP/1.1 200 OK
Content-Length: 18
Content-Type: application/json
Date: Fri, 21 May 2021 19:33:53 GMT
Server: Jetty(9.4.33.v20201020)

[
    "stocks-datagen"
]

Is it running?

http http://localhost:8083/connectors/stocks-datagen/status

You should see output similar to the following.

HTTP/1.1 200 OK
Content-Length: 164
Content-Type: application/json
Date: Fri, 21 May 2021 19:34:00 GMT
Server: Jetty(9.4.33.v20201020)

{
    "connector": {
        "state": "RUNNING",
        "worker_id": "connect:8083"
    },
    "name": "stocks-datagen",
    "tasks": [
        {
            "id": 0,
            "state": "RUNNING",
            "worker_id": "connect:8083"
        }
    ],
    "type": "source"
}
  6. Fire Up the KSQL Shell
docker exec -it ksqldb-cli ksql http://ksqldb-server:8088
  7. Create the stocks_stream_avro Stream on the stocks Topic in KSQL
CREATE STREAM stocks_stream_avro (
  symbol VARCHAR,
  quote DOUBLE
) WITH (KAFKA_TOPIC='stocks', VALUE_FORMAT='avro');
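
You can also sanity-check the new stream without the KSQL shell by issuing a push query against ksqlDB's REST /query endpoint. Below is a minimal Python sketch with requests; it assumes port 8088 is published to the host as in this demo's docker-compose, and the LIMIT 5 is only there to make the query terminate.

import json
import requests

payload = {
    "ksql": "SELECT symbol, quote FROM stocks_stream_avro EMIT CHANGES LIMIT 5;",
    "streamsProperties": {"ksql.streams.auto.offset.reset": "earliest"},
}

# ksqlDB streams the result back as a chunked JSON array; print it as it arrives.
with requests.post(
    "http://localhost:8088/query",
    headers={"Content-Type": "application/vnd.ksql.v1+json; charset=utf-8"},
    data=json.dumps(payload),
    stream=True,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if line:
            print(line.decode("utf-8"))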
  8. Create the stocks_1min_windows_tbl Aggregate Table in KSQL
CREATE TABLE stocks_1min_windows_tbl 
WITH (KAFKA_TOPIC='stocks_1min_windows_tbl') AS
SELECT
  symbol,
  AVG(quote) avg_1min_quote,
  CASE
    WHEN AVG(quote) > 170 THEN 'SELL'
    WHEN AVG(quote) < 130 THEN 'BUY'
  END action
FROM stocks_stream_avro
WINDOW TUMBLING ( SIZE 1 MINUTE )
GROUP BY symbol
HAVING AVG(quote) > 170 OR AVG(quote) < 130
EMIT CHANGES;
  9. Create the stocks_buysell_interim_stream Stream in KSQL

This stream reads the aggregate table's underlying topic back in as a plain stream so the windowed buy/sell results can be re-keyed and written out as a regular stream in the next step.
CREATE STREAM stocks_buysell_interim_stream (
  symbol VARCHAR KEY,
  avg_1min_quote DOUBLE,
  action VARCHAR
) WITH (KAFKA_TOPIC='stocks_1min_windows_tbl', VALUE_FORMAT='avro');
  10. Create the Final stocks_buysell_stream Stream in KSQL
CREATE STREAM stocks_buysell_stream
WITH (KAFKA_TOPIC='stocks_buysell_stream') AS
SELECT * FROM stocks_buysell_interim_stream
WHERE symbol IS NOT NULL
PARTITION BY symbol
EMIT CHANGES;
  11. Start kafka-avro-console-consumer

In a real system this would be a consumer (written in Java or Python, for example) subscribed to the stocks_buysell_stream topic that reacts to the buy/sell signals (actually buying or selling the stock). For a quick look at the data, kafka-avro-console-consumer works; a Python sketch of such a consumer follows the command below.

docker exec -it broker kafka-avro-console-consumer --bootstrap-server localhost:9092 \
  --topic stocks_buysell_stream --from-beginning \
  --property schema.registry.url=http://localhost:8081
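
Here is a minimal sketch of what that real consumer might look like in Python, using confluent-kafka-python's Schema Registry integration (assumptions: confluent-kafka[avro] >= 1.8.2 is installed, ports 9092 and 8081 are published to the host as in this demo's docker-compose, and the "reaction" is just a print).

from confluent_kafka import DeserializingConsumer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroDeserializer
from confluent_kafka.serialization import StringDeserializer

schema_registry = SchemaRegistryClient({"url": "http://localhost:8081"})

consumer = DeserializingConsumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "stocks-buysell-reactor",
    "auto.offset.reset": "earliest",
    "key.deserializer": StringDeserializer("utf_8"),
    # The writer schema is fetched from Schema Registry for each message.
    "value.deserializer": AvroDeserializer(schema_registry_client=schema_registry),
})
consumer.subscribe(["stocks_buysell_stream"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        value = msg.value()  # dict of the stream's value columns (avg quote and BUY/SELL action)
        # A real implementation would place or close an order here instead of printing.
        print(f"{msg.key()}: {value}")
finally:
    consumer.close()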
