GithubHelp home page GithubHelp logo

qpc-github / kafka-monitor Goto Github PK

View Code? Open in Web Editor NEW

This project forked from shopify/kafka-monitor

1.0 2.0 1.0 811 KB

Monitor the availability of Kafka clusters with generated messages.

License: Apache License 2.0

Shell 1.99% JavaScript 16.90% Java 76.30% Makefile 0.28% HTML 2.41% Batchfile 1.79% Dockerfile 0.34%

kafka-monitor's Introduction

Xinfra Monitor

Build Status

Xinfra Monitor (formerly Kafka Monitor) is a framework to implement and execute long-running kafka system tests in a real cluster. It complements Kafka’s existing system tests by capturing potential bugs or regressions that are only likely to occur after prolonged period of time or with low probability. Moreover, it allows you to monitor Kafka cluster using end-to-end pipelines to obtain a number of derived vital stats such as end-to-end latency, service availability, consumer offset commit availability, as well as message loss rate. You can easily deploy Xinfra Monitor to test and monitor your Kafka cluster without requiring any change to your application.

Xinfra Monitor can automatically create the monitor topic with the specified config and increase partition count of the monitor topic to ensure partition# >= broker#. It can also reassign partition and trigger preferred leader election to ensure that each broker acts as leader of at least one partition of the monitor topic. This allows Xinfra Monitor to detect performance issue on every broker without requiring users to manually manage the partition assignment of the monitor topic.

Xinfra Monitor is used in conjunction with different middle-layer services such as li-apache-kafka-clients in order to monitor single clusters, pipeline desination clusters, and other types of clusters as done in Linkedin engineering for real-time cluster healthchecks.

Getting Started

Prerequisites

Xinfra Monitor requires Gradle 2.0 or higher. Java 7 should be used for building in order to support both Java 7 and Java 8 at runtime.

Xinfra Monitor supports Apache Kafka 0.8 to 2.0:

  • Use branch 0.8.2.2 to work with Apache Kafka 0.8
  • Use branch 0.9.0.1 to work with Apache Kafka 0.9
  • Use branch 0.10.2.1 to work with Apache Kafka 0.10
  • Use branch 0.11.x to work with Apache Kafka 0.11
  • Use branch 1.0.x to work with Apache Kafka 1.0
  • Use branch 1.1.x to work with Apache Kafka 1.1
  • Use master branch to work with Apache Kafka 2.0

Configuration Tips

  1. We advise advanced users to run Xinfra Monitor with ./bin/kafka-monitor-start.sh config/kafka-monitor.properties. The default kafka-monitor.properties in the repo provides an simple example of how to monitor a single cluster. You probably need to change the value of zookeeper.connect and bootstrap.servers to point to your cluster.

  2. The full list of configs and their documentation can be found in the code of Config class for respective service, e.g. ProduceServiceConfig.java and ConsumeServiceConfig.java.

  3. You can specify multiple SingleClusterMonitor in the kafka-monitor.properties to monitor multiple Kafka clusters in one Xinfra Monitor process. As another advanced use-case, you can point ProduceService and ConsumeService to two different Kafka clusters that are connected by MirrorMaker to monitor their end-to-end latency.

  4. Xinfra Monitor by default will automatically create the monitor topic based on the e.g. topic-management.replicationFactor and topic-management.partitionsToBrokersRatio specified in the config. replicationFactor is 1 by default and you probably want to change it to the same replication factor as used for your existing topics. You can disable auto topic creation by setting produce.topic.topicCreationEnabled to false.

  5. Xinfra Monitor can automatically increase partition count of the monitor topic to ensure partition# >= broker#. It can also reassign partition and trigger preferred leader election to ensure that each broker acts as leader of at least one partition of the monitor topic. To use this feature, use either EndToEndTest or TopicManagementService in the properties file.

  6. When using Secure Sockets Layer (SSL) or any non-plaintext security protocol for AdminClient, please configure the following entries in the single-cluster-monitor props, produce.producer.props, as well as consume.consumer.props. https://docs.confluent.io/current/installation/configuration/admin-configs.html
    1. ssl.key.password
    2. ssl.keystore.location
    3. ssl.keystore.password
    4. ssl.truststore.location
    5. ssl.truststore.password

Build Xinfra Monitor

$ git clone https://github.com/linkedin/kafka-monitor.git
$ cd kafka-monitor 
$ ./gradlew jar

Start KafkaMonitor to run tests/services specified in the config file

$ ./bin/kafka-monitor-start.sh config/kafka-monitor.properties

Run Xinfra Monitor with arbitrary producer/consumer configuration (e.g. SASL enabled client)

Edit config/kafka-monitor.properties to specify custom configurations for producer in the key/value map produce.producer.props in config/kafka-monitor.properties. Similarly specify configurations for consumer as well. The documentation for producer and consumer in the key/value maps can be found in the Apache Kafka wiki.

$ ./bin/kafka-monitor-start.sh config/kafka-monitor.properties

Run SingleClusterMonitor app to monitor kafka cluster

Metrics produce-availability-avg and consume-availability-avg demonstrate whether messages can be properly produced to and consumed from this cluster. See Service Overview wiki for how these metrics are derived.

$ ./bin/single-cluster-monitor.sh --topic test --broker-list localhost:9092 --zookeeper localhost:2181

Run MultiClusterMonitor app to monitor a pipeline of Kafka clusters connected by MirrorMaker

Edit config/multi-cluster-monitor.properties to specify the right broker and zookeeper url as suggested by the comment in the properties file

Metrics produce-availability-avg and consume-availability-avg demonstrate whether messages can be properly produced to the source cluster and consumed from the destination cluster. See config/multi-cluster-monitor.properties for the full jmx path for these metrics.

$ ./bin/kafka-monitor-start.sh config/multi-cluster-monitor.properties

Get metric values (e.g. service availability, message loss rate) in real-time as time series graphs

Open localhost:8000/index.html in your web browser.

You can edit webapp/index.html to easily add new metrics to be displayed.

Query metric value (e.g. produce availability and consume availability) via HTTP request

curl localhost:8778/jolokia/read/kmf.services:type=produce-service,name=*/produce-availability-avg

curl localhost:8778/jolokia/read/kmf.services:type=consume-service,name=*/consume-availability-avg

You can query other JMX metric value as well by substituting object-name and attribute-name of the JMX metric in the query above.

Run checkstyle on the java code

./gradlew checkstyleMain checkstyleTest

Build IDE project

./gradlew idea
./gradlew eclipse

Wiki

kafka-monitor's People

Contributors

andrewchoi5 avatar arsenaid avatar ccl0326 avatar coffeepac avatar d1egoaz avatar gliptak avatar hackerwin7 avatar jlisam avatar jonathansantilli avatar lindong28 avatar nullne avatar oeegee avatar slevinbe avatar smccauliff avatar warrengreen avatar xiowu0 avatar zhuchen1018 avatar zigarn avatar

Stargazers

 avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.