This project forked from redpanda-data/kgo-verifier

Test utility based on franz-go, for consistency checking of Redpanda reads vs. writes

kgo-verifier

This is a test utility for validating Redpanda data integrity under various produce/consume patterns, especially random reads that stress tiered storage.

Purpose

Stress test Redpanda with concurrent random reads. This case is particularly important for validating tiered storage (shadow indexing), where random reads tend to cause a lot of cache promotion and demotion.

The tool is meant to be chaos-tolerant, i.e. it should not drop out when brokers become unavailable or cannot respond to requests, and it should continue to be able to validate its own output if that output was written in bad conditions (e.g. with producer retries).

The tool also verifies the content of reads to check for offset translation issues: the key of each message includes the offset where the producer expects it to land. The producer keeps track of which messages were successfully committed at the expected offset and writes that set of offsets out to a file. Consumers then consult this file while reading to check whether a particular offset is expected to contain a valid message (i.e. key matches offset) or not.
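The consumer-side check described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the zero-padded key encoding in `makeKey`, the in-memory `map[int64]bool` standing in for the contents of the offsets file, and the names `Verdict` and `checkRecord` are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strconv"
)

// Verdict is the outcome of checking one consumed record (hypothetical type).
type Verdict int

const (
	Valid    Verdict = iota // offset is expected to be valid and the key matches it
	Skipped                 // offset is not recorded as valid, so content is not checked
	Mismatch                // offset is expected to be valid but the key disagrees
)

// makeKey is an assumed key encoding: the expected offset, zero-padded.
// The real tool's key format may differ.
func makeKey(expectedOffset int64) string {
	return fmt.Sprintf("%018d", expectedOffset)
}

// checkRecord compares the key of a consumed record against the offset it was
// actually read from, but only for offsets the producer recorded as valid.
func checkRecord(offset int64, key string, validOffsets map[int64]bool) Verdict {
	if !validOffsets[offset] {
		return Skipped
	}
	parsed, err := strconv.ParseInt(key, 10, 64)
	if err != nil || parsed != offset {
		return Mismatch
	}
	return Valid
}

func main() {
	// Stand-in for the offsets the producer wrote to its file.
	valid := map[int64]bool{0: true, 1: true, 3: true}

	fmt.Println(checkRecord(1, makeKey(1), valid) == Valid)    // key matches offset
	fmt.Println(checkRecord(2, makeKey(7), valid) == Skipped)  // offset 2 was never recorded as valid
	fmt.Println(checkRecord(3, makeKey(4), valid) == Mismatch) // key says 4, record is at 3
}
```

Treating unrecorded offsets as skipped rather than as failures is what lets a consumer validate a topic that was written under bad conditions, such as with producer retries.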

Usage

  • Brokers must not use TLS (in BYOC, that means running this tool inside your k8s cluster and referring to brokers by pod IP)

1. Quick produce+consume smoke test: produce and then consume in the same process

kgo-verifier --brokers $BROKERS --username $SASL_USER --password $SASL_PASSWORD --topic $TOPIC --msg_size 128000 --produce_msgs 10000 --rand_read_msgs 10 --seq_read=1

2. A long running producer

Run exactly one of these at a time: it writes out a valid_offsets_{topic}.json file, so multiple concurrent producers would interfere with one another.

kgo-verifier --brokers $BROKERS --username $SASL_USER --password $SASL_PASSWORD --topic $TOPIC --msg_size 128000 --produce_msgs 10000 --rand_read_msgs 0 --seq_read=0

3. A sequential consumer

Run one of these inside a while loop to continuously stream the whole content of the topic.

kgo-verifier --brokers $BROKERS --username $SASL_USER --password $SASL_PASSWORD --topic $TOPIC --msg_size 128000 --produce_msgs 0 --rand_read_msgs 0 --seq_read=1 

4. A parallel random consumer

The --parallel flag sets how many read fibers run concurrently.

kgo-verifier --brokers $BROKERS --username $SASL_USER --password $SASL_PASSWORD --topic $TOPIC --msg_size 128000 --produce_msgs 0 --rand_read_msgs 10 --seq_read=0 --parallel 4

5. A very parallel random consumer

This aims to emit so many concurrent reads that the shadow index cache may violate its size bounds (e.g. 64 concurrent reads of 1GB segments when the cache size limit is only 50GB). Keep rand_read_msgs at 1 to constrain memory usage.

kgo-verifier --brokers $BROKERS --username $SASL_USER --password $SASL_PASSWORD --topic $TOPIC --msg_size 128000 --produce_msgs 0 --rand_read_msgs 1 --seq_read=0 --parallel 64
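The arithmetic behind the example figures can be checked with a few lines of Go. This is only a back-of-the-envelope sketch: it assumes each concurrent read may pull one whole segment into the cache, and it treats 1GB as 2^30 bytes. The function name `worstCaseCacheBytes` is made up for the illustration.

```go
package main

import "fmt"

const gib = int64(1) << 30

// worstCaseCacheBytes assumes every concurrent read promotes one whole
// segment into the cache at the same time.
func worstCaseCacheBytes(parallel int, segmentBytes int64) int64 {
	return int64(parallel) * segmentBytes
}

func main() {
	demand := worstCaseCacheBytes(64, 1*gib)
	limit := 50 * gib
	fmt.Printf("demand=%dGiB limit=%dGiB exceeds=%v\n", demand/gib, limit/gib, demand > limit)
	// prints "demand=64GiB limit=50GiB exceeds=true"
}
```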

Contributors

jcsp, nyalialui
