GithubHelp home page GithubHelp logo

kholis / flink-service-discovery Goto Github PK

View Code? Open in Web Editor NEW

This project forked from eastcirclek/flink-service-discovery

0.0 0.0 0.0 10 KB

Discover Flink clusters on Hadoop YARN for Prometheus

Python 100.00%

flink-service-discovery's Introduction

flink-service-discovery

Discover your Flink clusters on Hadoop YARN and let Prometheus know. flink-service-discovery runs in two different modes:

  • Single mode
    • Given an application ID, it discovers just a single Flink cluster identified by the ID.
  • Service mode
    • Otherwise, it runs as a standalone service which monitors new applications on YARN and for each newly launched application, does the same as in the single mode.

flink-service-discovery communicates with YARN ResourceManager and Flink JobManager via REST APIs, and communicates with Prometheus via its file-based service discovery mechanism.

Getting Started

Prerequisites

Make sure you have all of the following prerequisites installed:

Usage

usage: discovery.py [-h] [--app-id APP_ID] [--name-filter NAME_FILTER]
                    [--target-dir TARGET_DIR] [--poll-interval POLL_INTERVAL]
                    rm_addr

Discover Flink clusters on Hadoop YARN for Prometheus

positional arguments:
  rm_addr               (required) Specify yarn.resourcemanager.webapp.address
                        of your YARN cluster.

optional arguments:
  -h, --help            show this help message and exit
  --app-id APP_ID       If specified, this program runs once for the
                        application. Otherwise, it runs as a service.
  --name-filter NAME_FILTER
                        A regex to specify applications to watch.
  --target-dir TARGET_DIR
                        If specified, this program writes the target
                        information to a file on the directory. Files are
                        named after the application ids. Otherwise, it prints
                        to stdout.
  --poll-interval POLL_INTERVAL
                        Polling interval to YARN in seconds to check
                        applications that are newly added or recently
                        finished. Default is 5 seconds.

Examples

Single mode

Let's discover all the Prometheus exporters of a Flink cluster identified by "application_1528160315197_0081". Here I assume that the cluster consists of one JobManager and three TaskManagers.

Output to stdout

$ python discovery.py http://master:8088 --app-id application_1528160315197_0081
[{"targets": ["slave03:9249", "slave01:9249", "slave02:9250", "slave02:9249"]}]

Output to a directory

Let's say I have configured prometheus.yml as shown in this link:

scrape_configs:
  - job_name: 'flink-service-discovery'
    file_sd_configs:
      - refresh_interval: 10s
        files:
        - /etc/prometheus/flink-service-discovery/application_*.json

As Prometheus is configured to watch /etc/prometheus/flink-service-discovery/application_*.json, I want to set --targer-directory to /etc/prometheus/flink-service-discovery. When --target-directory is specified, a file named after the application id is created under the specified target directory.

$ python discovery.py http://master:8088 --app-id application_1528160315197_0081 --target-directory /etc/prometheus/flink-service-discovery
/etc/prometheus/flink-service-discovery/application_1528160315197_0081.json : [{"targets": ["slave03:9249", "slave01:9249", "slave02:9250", "slave02:9249"]}]

Prometheus starts scrapping from the new Flink cluster soon after application_1528160315197_0081.json is created under /etc/prometheus/flink-service-discovery/

Service mode

flink-service-discovery in a service mode checks whether a new job is submitted or a previously running job is finished every 5 seconds. You can adjust the polling interval by giving other values to the --poll-interval argument. The --name-filter argument is quite necessary in production Hadoop clusters because you only want to discover Flink clusters out of the whole list of applications.

In this example, I am going to submit a Flink cluster to YARN and shut it down after a while to see whether flink-service-discovery detects changes in the list of applications and accordingly takes actions.

$ python discovery.py http://master:8088 --target-dir /etc/prometheus/flink-service-discovery
start polling every 5 seconds.
==== Wed Aug  1 16:53:00 2018 ====
# running apps :  3
# added        :  {'application_1528160315197_0082'}
# removed      :  set()
runningContainers(4) != jobmanager(1)+taskmanagers(0)
runningContainers(4) != jobmanager(1)+taskmanagers(2)
/etc/prometheus/flink-service-discovery/application_1528160315197_0082.json : [{"targets": ["slave01:9250", "slave02:9251", "slave03:9250", "slave03:9251"]}]

==== Wed Aug  1 16:55:00 2018 ====
# running apps :  2
# added        :  set()
# removed      :  {'application_1528160315197_0082'}
/etc/prometheus/flink-service-discovery/application_1528160315197_0082.json deleted

flink-service-discovery's People

Contributors

eastcirclek avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.