This project forked from avast/kafka-tests

Integration test of Apache Kafka 0.9.0+ and Java clients.

License: BSD 3-Clause "New" or "Revised" License

Java 96.52% Shell 3.48%

kafka-tests's Introduction

Kafka 0.9 Tests

A group of tools to verify that Kafka 0.9 is reliable enough and ready for production.

High-Level Description

The basic idea is to verify that every produced message is also consumed by all consumers in a reasonable amount of time.

  • GeneratorProducer
    • Generate groups of N messages and send them to Kafka.
    • The message key is a string containing the ID of the message group.
    • The message value is an integer giving the message's order within its group.
    • The sending of each message is recorded in the database.
    • Each send confirmation from the Kafka broker is recorded in the database.
  • AutoCommitConsumer
    • Consume messages from Kafka and mark them in database.
    • Let the consumer client commit offsets automatically.
  • SeekingConsumer
    • Consume messages from Kafka and mark them in database.
    • Commit offsets manually after a predefined number of messages and handle rebalancing notifications.
    • Occasionally skip marking messages to simulate an error (e.g. in HDFS or Cassandra), then seek back in the queue to the last committed offset.
  • ResultsUpdater
    • Periodically recompute the current state and garbage-collect processed entries in the database.
    • Verify, at the level of message groups, that all produced messages were really consumed by all consumers.
    • Increment counters and print their values.
    • Exactly one instance must be running at all times.
  • ChaoticManager
    • Periodically and randomly revise previous decisions: start and stop producers and consumers.
    • There are bounds on the min. and max. number of running components, the frequency of updates, and the number of decisions per update.
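The SeekingConsumer behavior described above (manual commits after a fixed number of messages, seeking back to the last committed offset after a simulated failure) can be sketched without any Kafka dependency. The following is a minimal simulation, not code from this project: the class name `SeekBackSketch`, the method `consume`, and the array standing in for a partition are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical simulation of "commit every N messages, seek back on failure".
public class SeekBackSketch {

    /**
     * Simulates reading offsets 0..logSize-1 from a single partition.
     * Returns how many times each offset was marked as processed.
     * Each offset in failOnceAt triggers exactly one simulated storage error.
     */
    public static int[] consume(int logSize, int commitEvery, Set<Integer> failOnceAt) {
        int[] marks = new int[logSize];
        Set<Integer> pendingFailures = new HashSet<>(failOnceAt);
        int committed = 0;   // offset to resume from after a seek back
        int position = 0;    // next offset to read
        int sinceCommit = 0; // messages marked since the last commit

        while (position < logSize) {
            if (pendingFailures.remove(position)) {
                // Simulated error: skip marking and rewind to the last commit.
                position = committed;
                sinceCommit = 0;
                continue;
            }
            marks[position]++;
            position++;
            sinceCommit++;
            if (sinceCommit == commitEvery) {
                committed = position; // manual commit
                sinceCommit = 0;
            }
        }
        return marks;
    }

    public static void main(String[] args) {
        int[] marks = consume(10, 3, Collections.singleton(4));
        System.out.println(Arrays.toString(marks)); // [1, 1, 1, 2, 1, 1, 1, 1, 1, 1]
    }
}
```

In this run offset 3 is marked twice, because it was processed after the last commit and then re-read following the seek back. Under these assumptions every message is marked at least once, so whatever tracks consumption would need to tolerate duplicates.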

Preconditions and Requirements

Kafka and ZooKeeper

Install Kafka and ZooKeeper in the standard way, standalone or as a cluster.

  • The tests expect the topic below to exist.
  • Adjust the replication factor and the number of partitions according to your needs.
ZOOKEEPER=localhost:2181 && cd ~/kafka && bin/kafka-topics.sh --zookeeper $ZOOKEEPER --create --replication-factor 2 --partitions 9 --topic kafka-test

Redis

Redis is used as a database for tracking the flow of messages.

  • Confirmation that each produced message is also consumed.
  • Storage of results.
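To illustrate the kind of bookkeeping this implies, here is a minimal sketch that uses in-memory maps in place of Redis. The class `GroupTrackerSketch`, its method names, and the `consumer:groupId` key layout are assumptions for illustration only; the project's actual data model may differ.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical tracker: in-memory maps stand in for the Redis database.
public class GroupTrackerSketch {
    // message group ID -> orders (message values) that were sent
    private final Map<String, Set<Integer>> sent = new HashMap<>();
    // "consumer:groupId" -> orders that this consumer marked as consumed
    private final Map<String, Set<Integer>> consumed = new HashMap<>();

    public void markSent(String groupId, int order) {
        sent.computeIfAbsent(groupId, k -> new HashSet<>()).add(order);
    }

    public void markConsumed(String consumer, String groupId, int order) {
        // A set absorbs duplicates, e.g. messages re-read after a seek back.
        consumed.computeIfAbsent(consumer + ":" + groupId, k -> new HashSet<>()).add(order);
    }

    /** True if all groupSize messages were sent and seen by every listed consumer. */
    public boolean isFullyConsumed(String groupId, int groupSize, List<String> consumers) {
        Set<Integer> empty = Collections.emptySet();
        if (sent.getOrDefault(groupId, empty).size() != groupSize) {
            return false;
        }
        for (String consumer : consumers) {
            if (consumed.getOrDefault(consumer + ":" + groupId, empty).size() != groupSize) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        GroupTrackerSketch tracker = new GroupTrackerSketch();
        for (int order = 0; order < 3; order++) {
            tracker.markSent("group-1", order);
            tracker.markConsumed("auto", "group-1", order);
        }
        List<String> consumers = Arrays.asList("auto", "seeking");
        // "seeking" has not consumed anything yet, so the group is incomplete.
        System.out.println(tracker.isFullyConsumed("group-1", 3, consumers));
    }
}
```

Checking completeness per message group rather than per message keeps the verification step small: a group is either done for every consumer or still pending.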

Installation (on Debian)

apt-get install redis-server redis-tools

Compilation

mvn package

Execution

Prefer the shell scripts in the top-level project directory, or read the sources to understand exactly how the tools work.

Start

  • Make sure all data are consumed from Kafka by all consumers.
    • Committed offsets should be at the latest positions.
    • Start consumers and stop them after a while if you are unsure.
  • Reset state stored in database.
  • Update configuration according to your needs.
  • Start ResultsUpdater, always as a single instance.
  • Start one or more instances of AutoCommitConsumer and SeekingConsumer.
    • Note that each instance may run multiple consumers/threads, depending on the configuration.
  • Start one or more instances of GeneratorProducer.

Long term stability test

The following test periodically and pseudo-randomly starts and stops producers and consumers, which also triggers consumer rebalancing.

# On all nodes
./erase_all_data.sh
# Single instance
./run_ResultsUpdater.sh &
# On all nodes, possibly multiple instances
./run_AutoCommitConsumer_ChaoticManager.sh &
./run_SeekingConsumer_ChaoticManager.sh &
./run_GeneratorProducer_ChaoticManager.sh &
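The bounded random start/stop decisions made by ChaoticManager can be pictured roughly as follows. This is a hedged sketch only: `ChaoticManagerSketch`, `update`, and the parameter names are hypothetical, and the real implementation may choose its decisions differently.

```java
import java.util.Random;

// Hypothetical sketch of bounded, pseudo-random start/stop decisions.
public class ChaoticManagerSketch {

    /**
     * Applies a number of random decisions to the count of running components.
     * Each decision either starts or stops one component, but never moves the
     * count outside the [min, max] bounds. Assumes min <= running <= max.
     */
    public static int update(int running, int min, int max, int decisions, Random random) {
        for (int i = 0; i < decisions; i++) {
            boolean start = random.nextBoolean();
            if (start && running < max) {
                running++;          // start one more component
            } else if (!start && running > min) {
                running--;          // stop one component
            }
            // Otherwise the decision would break a bound and is ignored.
        }
        return running;
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed for a repeatable run
        int running = 2;
        for (int round = 0; round < 5; round++) {
            running = update(running, 1, 4, 3, random);
            System.out.println("round " + round + ": running = " + running);
        }
    }
}
```

Keeping the minimum at one or more means at least one component of a kind stays up, which matches the advice elsewhere in this README to keep at least one consumer per group running.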

Add partitions test

Occasionally increase the number of partitions and check the logs of the producers and consumers to verify that they notice the change.

# On all nodes
./erase_all_data.sh
# Single instance
./run_ResultsUpdater.sh &
# On all nodes, possibly multiple instances
./run_AutoCommitConsumer.sh &
./run_AutoCommitConsumer.sh &
./run_SeekingConsumer.sh &
./run_SeekingConsumer.sh &
./run_GeneratorProducer.sh &
./run_GeneratorProducer.sh &
# Sometimes
date ; /opt/kafka/bin/kafka-topics.sh --zookeeper localhost --topic kafka-test --alter --partitions 42

Shutdown broker test

  • Replication factor configured for a topic must be at least 2.
  • Stop and start one of the Kafka brokers.
  • Verify that the loss of confirmed messages is within the expected range (see the acks parameter of the producer).
  • Verify assignment of leaders to Kafka brokers.
# On all nodes
./erase_all_data.sh
# Single instance
./run_ResultsUpdater.sh &
# On all nodes, possibly multiple instances
./run_AutoCommitConsumer.sh &
./run_AutoCommitConsumer.sh &
./run_SeekingConsumer.sh &
./run_SeekingConsumer.sh &
./run_GeneratorProducer.sh &
./run_GeneratorProducer.sh &

./run_AutoCommitConsumer.sh &
./run_SeekingConsumer.sh &
./run_GeneratorProducer.sh &
# Show assigned leaders and replicas
/opt/kafka/bin/kafka-topics.sh --zookeeper localhost --topic kafka-test --describe
# Stop or crash Kafka broker (multiple choices)
date ; stop kafka
date ; /opt/kafka/bin/kafka-shutdown-broker.sh
date ; /opt/kafka/bin/kafka-server-stop.sh
date ; pkill -9 -f kafka.Kafka
# Rebalance Kafka leaders, http://kafka.apache.org/documentation.html#basic_ops_leader_balancing
# Alternatively, auto.leader.rebalance.enable is enabled by default, so waiting for a while should be enough
/opt/kafka/bin/kafka-preferred-replica-election.sh --zookeeper localhost
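Since the expected amount of data loss in this test depends on the producer's acks parameter, it may help to see where that setting lives. The sketch below only builds a `java.util.Properties` object with the standard `bootstrap.servers` and `acks` producer keys; the class `ProducerAcksSketch`, the helper `producerConfig`, and the broker address are assumptions, and the project's own configuration handling is not shown here.

```java
import java.util.Properties;

// Hypothetical helper that builds a producer configuration with a chosen acks level.
public class ProducerAcksSketch {

    /**
     * acks = "0":   the producer does not wait for any acknowledgement.
     * acks = "1":   the partition leader acknowledges after its own write;
     *               a confirmed message can still be lost if the leader fails
     *               before followers have replicated it.
     * acks = "all": the leader waits for the full set of in-sync replicas.
     */
    public static Properties producerConfig(String bootstrapServers, String acks) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("acks", acks);
        return props;
    }

    public static void main(String[] args) {
        // The broker address is an assumed local default.
        Properties props = producerConfig("localhost:9092", "all");
        System.out.println(props.getProperty("acks"));
    }
}
```

With a replication factor of at least 2, comparing runs with `acks` set to `1` and to `all` while a broker is stopped should make the difference in lost confirmed messages visible in the ResultsUpdater output.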

More instances, rebalancing

  • Look at the state in the logs of ResultsUpdater.
  • Start and stop producers to raise or lower the message load.
  • Start and stop consumers to test consumer behavior during rebalancing.
    • At least one consumer instance in each group should always be running.
  • Or use ChaoticManager to start and stop them periodically and pseudo-randomly.

Stop

  • Shut down all producers first.
  • Let all consumers consume all messages from Kafka.
  • Let ResultsUpdater process all data in the database.

Issues found using this tool

kafka-tests's People

Contributors

mixalturek
