liushv0 / prometheus-benchmark

This project forked from victoriametrics/prometheus-benchmark

Benchmark for Prometheus-compatible systems

License: MIT License

prometheus-benchmark's Introduction

Update log

Added the range-querier component. It must be built manually; see services/range-querier/README.md.

Known issues

1. By default node_exporter is started as the data source, but the number of time series it actually exposes may not reach 1230.
2. Some of the metrics configured in alert.yaml may not be exported by node_exporter, so part of the configuration in alert.yaml has no effect.
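
A rough way to check how many time series a single node_exporter instance actually exposes is to count the non-comment lines on its /metrics page. This is only a sketch and assumes node_exporter is reachable on its default port 9100 on the local host:

curl -s http://localhost:9100/metrics | grep -vc '^#'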

Prometheus benchmark

Prometheus-benchmark allows testing data ingestion and querying performance of Prometheus-compatible systems under a production-like workload.

Any remote storage system that supports the Prometheus remote_write API can be tested with prometheus-benchmark. Query performance can additionally be measured for systems that expose a Prometheus-compatible querying API. See "How does it work?" below for details.

How does it work?

The prometheus-benchmark scrapes metrics from node_exporter and pushes the scraped metrics to the configured Prometheus-compatible remote storage systems. These systems must support the Prometheus remote_write API, which is used for measuring data ingestion performance. Optionally, these systems may support the Prometheus querying API, which is used for measuring query performance.

The helm chart deploys the following pods:

  • vmagent with the following containers:
    • nodeexporter - collects real metrics from the Kubernetes node where it runs.
    • nginx - caches responses from nodeexporter for 1 second in order to reduce the load on it when scraping a big number of targets.
    • vmagent-config-updater - generates the scrape target config. It is also responsible for generating time series churn by periodically updating the generated targets.
    • vmagent - scrapes nodeexporter metrics via nginx for the targets generated by vmagent-config-updater.
  • vmalert with the following containers:
    • vmalert - periodically executes the alerting rules from chart/files/alerts.yaml (aka read queries) against the tested remote storage.
    • alertmanager - receives notifications from vmalert. It is configured as a blackhole for the received notifications. The vmalert pod is optional - it is only used for generating read query load.
  • vmsingle - this pod runs a single-node VictoriaMetrics instance, which collects metrics from the vmagent and vmalert pods, so they can be analyzed during benchmark execution.
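
The generated workload is controlled via chart/values.yaml. The snippet below is only an illustrative sketch of the kind of parameters involved; the key names and values are examples and may differ from the actual chart in this repository:

# Illustrative sketch of chart/values.yaml parameters (example names/values):
targetsCount: 1000                 # number of node_exporter scrape targets to generate
scrapeInterval: 10s                # how frequently vmagent scrapes every target
scrapeConfigUpdatePercent: 1       # share of targets updated per interval (drives time series churn)
scrapeConfigUpdateInterval: 10m    # how often the generated targets are updated
writeConcurrency: 16               # concurrent connections from vmagent to each remote storage
remoteStorages:
  vm-single:                       # illustrative remote storage name
    writeURL: http://victoria-metrics:8428/api/v1/write   # remote_write endpoint of the tested system
    readURL: http://victoria-metrics:8428                 # optional: used by vmalert for read queries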

Articles

Benchmarking Prometheus-compatible time series databases.

How to run

It is expected that Helm3 is already installed and configured to communicate with the Kubernetes cluster where the prometheus-benchmark should run.

Check out the prometheus-benchmark sources:

git clone https://github.com/VictoriaMetrics/prometheus-benchmark
cd prometheus-benchmark

Then edit chart/values.yaml with the desired config params. Optionally edit chart/files/alerts.yaml with the queries to execute against the tested remote storage systems (an example rule is shown after the install command below). Then run the following command in order to install the prometheus-benchmark components in Kubernetes and start the benchmark:

make install
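
For reference, the queries in chart/files/alerts.yaml are regular vmalert/Prometheus alerting rules. The sketch below is only an illustration; the group name, alert name and expression are made up and not taken from this repository:

groups:
  - name: benchmark-read-load        # illustrative group name
    rules:
      - alert: HighCpuUsage          # illustrative alert; its expr generates read load on the tested storage
        expr: avg(rate(node_cpu_seconds_total{mode!="idle"}[5m])) by (instance) > 0.9
        for: 5m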

Run the following command in order to inspect the metrics collected by the benchmark:

make monitor

After that, go to http://localhost:8428/targets in order to see the targets scraped by the benchmark's own monitoring. See the Monitoring section below for details.

After the benchmark is complete, run the following command for removing the prometheus-benchmark components from Kubernetes:

make delete

By default the prometheus-benchmark is deployed in the vm-benchmark Kubernetes namespace. The namespace can be overridden via the NAMESPACE environment variable. For example, the following command starts the prometheus-benchmark chart in the foobar k8s namespace:

NAMESPACE=foobar make install

See the Makefile for more details on available make commands.

Monitoring

The benchmark collects various metrics from its components. These metrics are available for querying at http://localhost:8428/vmui after running the make monitor command. The following metrics might be interesting to look at during the benchmark:

  • Data ingestion rate:
sum(rate(vm_promscrape_scraped_samples_sum{job="vmagent"})) by (remote_storage_name)
  • 99th percentile of the duration for executing the alerting rules (aka read queries) against the remote storage:
max(vmalert_iteration_duration_seconds{quantile="0.99",job="vmalert"}) by (remote_storage_name)
  • 99th percentile of the duration for pushing the collected data to the remote storage systems configured at chart/values.yaml:
histogram_quantile(0.99,
  sum(increase(vmagent_remotewrite_duration_seconds_bucket{job="vmagent"}[5m])) by (vmrange,remote_storage_name)
)

It is also recommended to check the following metrics in order to verify whether the configured remote storage is capable of handling the configured workload:

  • The number of dropped data packets when sending them to the configured remote storage. If the value is bigger than zero, then the remote storage refuses to accept incoming data. In this case it is recommended to inspect the remote storage logs and vmagent logs.
sum(rate(vmagent_remotewrite_packets_dropped_total{job="vmagent"})) by (remote_storage_name)
  • The number of retries when sending data to the remote storage. If the value is bigger than zero, then this is a sign that the remote storage cannot handle the workload. In this case it is recommended to inspect the remote storage logs and vmagent logs.
sum(rate(vmagent_remotewrite_retries_count_total{job="vmagent"})) by (remote_storage_name)
  • The amount of pending data on the vmagent side, which hasn't been sent to the remote storage yet. If the graph grows, then the remote storage cannot keep up with the given data ingestion rate. Sometimes increasing the writeConcurrency at chart/values.yaml may help if there is high network latency between the vmagent in prometheus-benchmark and the remote storage.
sum(vm_persistentqueue_bytes_pending{job="vmagent"}) by (remote_storage_name)
  • The number of errors when executing queries from chart/files/alerts.yaml. If the value is bigger than zero, then the remote storage cannot handle the query workload. In this case it is recommended to inspect the remote storage logs and vmalert logs.
sum(rate(vmalert_execution_errors_total{job="vmalert"})) by (remote_storage_name)

The prometheus-benchmark doesn't collect metrics from the tested remote storage systems. It is expected that separate whitebox monitoring is set up for the tested remote storage systems.

prometheus-benchmark's People

Contributors

dmitryk-dk, f41gh7, frezes, hagen1778, liushuan-ls, valyala, zekker6

