bartosz347 / telemetry-demo

Demo project for application monitoring and distributed tracing.

Dockerfile 1.65% Go 94.20% Python 4.15%
monitoring opentelemetry prometheus tracing microservices

telemetry-demo's Introduction

About

This demo presents 3 ways of collecting metrics from distributed systems based on microservices:

  1. Prometheus
  2. OpenTelemetry metrics
  3. OpenTelemetry metrics generated from spans (using spanmetrics processor)

It also includes a demonstration of OpenTelemetry tracing (see the Tools section).

There are 2 variants of this demo:

  1. Basic – requests are sent directly to the microservices.
  2. Extended – HAProxy acts as a load balancer in front of the microservices, which allows them to be scaled horizontally.

Project structure

Demo microservice

A simple microservice has been created for demonstration purposes. It has two functions: (1) calling other microservices and (2) simulating internal processing (using a dummy loop). The list of microservices that a given microservice should call is set in its SERVICES_TO_CALL environment variable (see docker-compose.yaml).

Endpoints

The demo microservice provides the following HTTP API endpoints:

  • /api/action – action endpoint that calls the services listed in the SERVICES_TO_CALL variable and runs a dummy loop that simulates internal processing. When SERVICES_TO_CALL is empty, the microservice runs only its internal processing.

    This endpoint returns OK after receiving a response from all called microservices or ERROR if at least one microservice does not return success or at least one request times out.

  • /api/health – health check endpoint that always returns OK.
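
The behaviour of the action endpoint can be sketched in Go as follows. This is only an illustration, not the project's actual code: the names callAll, dummyLoop, statusText and actionHandler are hypothetical, and the sketch assumes that downstream services are called concurrently, that success means HTTP 200, that a failure is reported with HTTP 500 and the body ERROR, and that the client timeout is 5 seconds.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"strings"
	"sync"
	"time"
)

// statusText maps the aggregated result to the response body (OK or ERROR).
func statusText(ok bool) string {
	if ok {
		return "OK"
	}
	return "ERROR"
}

// dummyLoop simulates internal processing; n is the loop complexity.
func dummyLoop(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	return sum
}

// callAll requests /api/action on every service concurrently and reports
// whether all of them answered with HTTP 200 (assumed success criterion).
// Network errors and timeouts count as failures.
func callAll(client *http.Client, services []string) bool {
	results := make([]bool, len(services))
	var wg sync.WaitGroup
	for i, addr := range services {
		wg.Add(1)
		go func(i int, addr string) {
			defer wg.Done()
			resp, err := client.Get("http://" + addr + "/api/action")
			if err != nil {
				return
			}
			defer resp.Body.Close()
			results[i] = resp.StatusCode == http.StatusOK
		}(i, addr)
	}
	wg.Wait()
	for _, ok := range results {
		if !ok {
			return false
		}
	}
	return true
}

// actionHandler calls the downstream services, runs the dummy loop,
// and answers OK or ERROR.
func actionHandler(client *http.Client, services []string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ok := callAll(client, services)
		dummyLoop(100000) // hypothetical default complexity
		if !ok {
			w.WriteHeader(http.StatusInternalServerError)
		}
		fmt.Fprint(w, statusText(ok))
	}
}

func main() {
	// Split on commas and spaces, so "app2:8080, app3:8080" also works.
	services := strings.Fields(strings.ReplaceAll(os.Getenv("SERVICES_TO_CALL"), ",", " "))
	client := &http.Client{Timeout: 5 * time.Second}

	// Invoke the handler once in-process instead of starting a server.
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/api/action", nil)
	actionHandler(client, services).ServeHTTP(rec, req)
	fmt.Println(rec.Code, rec.Body.String())
}
```

With SERVICES_TO_CALL unset, the handler skips the downstream calls and only runs the dummy loop, which matches the "internal processing only" case described above.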

Example

Let's assume we have prepared the following configuration:

  • 3 microservices: app1 (available publicly at port 8081), app2 and app3.
  • Variable SERVICES_TO_CALL for app1 is set to: app2:8080, app3:8080.
  • Value of SERVICES_TO_CALL for app2 and app3 is empty.

Calling the action endpoint of app1 (http://localhost:8081/api/action) makes app1 call both app2 and app3 internally. app2 and app3 each run their 'internal processing' (dummy loop) and respond to app1. app1 also runs its own 'internal processing' and finally returns the response.

The complexity of the dummy loop can also be set for each microservice individually by passing a config query parameter with one name:complexity pair per microservice: http://localhost:8081/api/action?config=app1:100000,app2:100000,app3:100000
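
As a rough illustration of how such a config value could be interpreted, the sketch below parses the comma-separated name:complexity pairs into a map. The function name parseConfig and its error handling are assumptions for this example. The demo's real parsing code may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseConfig turns "app1:100000,app2:50000" into
// map[app1:100000 app2:50000]. An empty string yields an empty map.
func parseConfig(raw string) (map[string]int, error) {
	out := make(map[string]int)
	if strings.TrimSpace(raw) == "" {
		return out, nil
	}
	for _, item := range strings.Split(raw, ",") {
		name, value, found := strings.Cut(strings.TrimSpace(item), ":")
		if !found || name == "" {
			return nil, fmt.Errorf("invalid config entry %q", item)
		}
		n, err := strconv.Atoi(value)
		if err != nil || n < 0 {
			return nil, fmt.Errorf("invalid complexity in %q", item)
		}
		out[name] = n
	}
	return out, nil
}

func main() {
	cfg, err := parseConfig("app1:100000,app2:100000,app3:100000")
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg["app1"], cfg["app2"], cfg["app3"])
}
```

Each microservice would then look up its own APP_NAME in the map to find its loop complexity.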

Please note that other scenarios can be easily prepared by creating more microservices and adjusting their SERVICES_TO_CALL variables.

Configuration – environment variables

  • APP_NAME – name of the app (e.g. app1)
  • SERVICES_TO_CALL – comma-separated list of services (address and port) that this service should call (e.g. app2:8080,app3:8080). Be careful to avoid loops, i.e. services that end up calling each other.
  • OTEL_AGENT – address and port of OpenTelemetry agent (e.g. otel-agent:4317)
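
A minimal sketch of reading these variables in Go is shown below. The Config struct and the loadConfig and splitList helpers are hypothetical names chosen for this example and are not taken from the demo's source.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Config mirrors the three environment variables listed above.
type Config struct {
	AppName        string
	ServicesToCall []string
	OtelAgent      string
}

// splitList parses "app2:8080, app3:8080" into ["app2:8080" "app3:8080"],
// dropping empty entries and surrounding spaces.
func splitList(raw string) []string {
	var out []string
	for _, item := range strings.Split(raw, ",") {
		if item = strings.TrimSpace(item); item != "" {
			out = append(out, item)
		}
	}
	return out
}

// loadConfig reads the settings through getenv so that it can be tested
// without touching the real process environment.
func loadConfig(getenv func(string) string) Config {
	return Config{
		AppName:        getenv("APP_NAME"),
		ServicesToCall: splitList(getenv("SERVICES_TO_CALL")),
		OtelAgent:      getenv("OTEL_AGENT"),
	}
}

func main() {
	cfg := loadConfig(os.Getenv)
	fmt.Printf("app=%q calls=%v otel=%q\n", cfg.AppName, cfg.ServicesToCall, cfg.OtelAgent)
}
```

Passing the lookup function as a parameter is a design choice for testability. Calling os.Getenv directly would work just as well in a small service.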

Tools

This demo consists of the following frontend tools:

Additional resources:

Deployment

Basic variant:

Start

cd deployment
docker-compose -f docker-compose_basic.yaml up -d

Rebuild the images

docker-compose -f docker-compose_basic.yaml up -d --build

Extended variant (with HAProxy load balancer):

Start

cd deployment
docker-compose -f docker-compose.yaml up -d

Rebuild the images

docker-compose -f docker-compose.yaml up -d --build

Available metrics

The following metrics are available in Prometheus for each microservice:

  1. Native Prometheus metrics:
    <APP_NAME>_operation_latency_bucket{status="OK|ERROR", type="internal-only|total"}
    Labels:
    • type=total – whole request processing time (internal processing + external service calls)
    • type=internal-only – internal processing time (dummy loop only)
  2. OpenTelemetry metrics (collected with OpenTelemetry and exported to Prometheus):
    otel_<APP_NAME>_operation_latency_bucket{status="OK|ERROR", type="internal-only|total"}
    Labels:
    • type=total – whole request processing time (internal processing + external service calls)
    • type=internal-only – internal processing time (dummy loop only)
  3. OpenTelemetry metrics generated from spans (using the spanmetrics processor, exported to Prometheus). See Jaeger for more information about labels.
    otel_latency_bucket{service_name="<APP_NAME>", operation="internal-processing|/api/action|GET", status_code="STATUS_CODE_OK|STATUS_CODE_ERROR"}

<APP_NAME> – name of the application (APP_NAME variable), e.g. app1
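
Since all three metrics are histograms, latency percentiles can be derived from them with PromQL's histogram_quantile function. The Go sketch below only builds such query strings for the metric names listed above. The helper name quantileQuery, the 0.95 quantile and the 5m rate window are illustrative assumptions, not settings taken from the demo.

```go
package main

import "fmt"

// quantileQuery builds a PromQL expression that estimates quantile q
// from a histogram's _bucket series over the given rate window.
func quantileQuery(q float64, metric, selector, window string) string {
	return fmt.Sprintf("histogram_quantile(%g, sum by (le) (rate(%s{%s}[%s])))",
		q, metric, selector, window)
}

func main() {
	app := "app1" // value of APP_NAME

	// 1. Native Prometheus metric.
	fmt.Println(quantileQuery(0.95, app+"_operation_latency_bucket", `type="total"`, "5m"))

	// 2. OpenTelemetry metric exported to Prometheus.
	fmt.Println(quantileQuery(0.95, "otel_"+app+"_operation_latency_bucket", `type="total"`, "5m"))

	// 3. Metric generated from spans by the spanmetrics processor.
	fmt.Println(quantileQuery(0.95, "otel_latency_bucket",
		fmt.Sprintf(`service_name=%q, operation="/api/action"`, app), "5m"))
}
```

The printed expressions can be pasted into the Prometheus expression browser to compare the three collection methods for the same service.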

Notes

  • DNS discovery may delay Prometheus metric updates and updates to the HAProxy node list.
  • Metric buckets should be adjusted to your workload (see bucketsConfig in app/monitoring/common.go and latency_histogram_buckets in otel-collector-config.yaml).
  • In the current configuration, OpenTelemetry traces all requests.
