sinkingpoint / prometheus-gravel-gateway

A Prometheus Aggregation Gateway for FAAS applications

License: GNU Lesser General Public License v3.0
Languages: Rust 99.01%, Dockerfile 0.99%
Topics: prometheus, push-gateway, aggregation

prometheus-gravel-gateway's Introduction

Gravel Gateway

Crates.io

Gravel Gateway is a Prometheus Push Gateway for FAAS applications. In particular, it allows aggregation to be controlled by the incoming metrics, and thus provides much more flexibility in the semantics that your metrics can follow. In general, the Gravel Gateway functions as a standard aggregating push gateway - by default, everything except gauges is summed, so e.g. if you push

# TYPE value_total counter
value_total 1
# TYPE value2 gauge
value2 1

three times, then Prometheus will scrape

# TYPE value_total counter
value_total 3
# TYPE value2 gauge
value2 1

Where the Gravel Gateway differs is that it allows you to specify a special clearmode label that dictates how metrics are aggregated.

We currently support three different values of clearmode - aggregate (the default for non-gauges), replace (the default for gauges), and family, which provides info-like semantics. As a practical example, if we push:

# TYPE value_total counter
value_total 1
# TYPE value2 gauge
value2{clearmode="aggregate"} 1
# TYPE version gauge
version{version="0.0.1",clearmode="family"} 1

and then

# TYPE value_total counter
value_total 3
# TYPE value2 gauge
value2{clearmode="aggregate"} 1
# TYPE version gauge
version{version="0.0.2",clearmode="family"} 1

(note the changed version label), Prometheus will scrape:

# TYPE version gauge
version{version="0.0.2"} 1
# TYPE value2 gauge
value2 2
# TYPE value_total counter
value_total 4

With the counter values summed, the value2 gauge summed (because of clearmode="aggregate"), and the new version labelset completely replacing the old one. You'll also note that the clearmode label is removed by the gateway - it's not included in the metrics exposed to the Prometheus scrape. In that way, this aggregating process is completely transparent to Prometheus.
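As a sketch of these merge rules (illustrative Python; the gateway's actual implementation is in Rust):

```python
def merge_sample(clearmode: str, old: float, new: float) -> float:
    """Merge a newly pushed sample into the stored one, per clearmode.

    - "aggregate" (default for counters): sum the old and new values
    - "replace" (default for gauges): keep only the new value
    - "family": the whole labelset is replaced at the family level,
      so at the sample level it behaves like "replace"
    """
    if clearmode == "aggregate":
        return old + new
    return new  # "replace" and "family"

# Pushing value_total 1 three times with the counter default:
total = 0.0
for _ in range(3):
    total = merge_sample("aggregate", total, 1)
print(total)  # 3.0
```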

Usage

Prometheus Gravel Gateway 

USAGE:
    gravel-gateway [FLAGS] [OPTIONS]

FLAGS:
    --cluster-enabled    
        Whether or not to enable clustering

    -h, --help               
            Prints help information

    -V, --version            
            Prints version information


OPTIONS:
        --basic-auth-file <basic-auth-file>    
            The file to use for basic authentication validation.
                            This should be a path to a file of bcrypt hashes, one per line,
                            with each line being an allowed hash.
    -l <listen>                                
            The address/port to listen on [default: localhost:4278]

        --peer <peers>...                      
            The address/port of a peer to connect to

        --peers-file <peers-file>              
            The file to read to discover peers

        --peers-srv <peers-srv>                
            The SRV record to look up to discover peers

        --tls-cert <tls-cert>                  
            The certificate file to use with TLS

        --tls-key <tls-key>                    
            The private key file to use with TLS

To use, run the gateway:

gravel-gateway

You can then make POSTs to /metrics to push metrics:

echo '# TYPE value_total counter
value_total{clearmode="replace"} 3
# TYPE value2 gauge
value2{clearmode="aggregate"} 1
# TYPE version gauge
version{version="0.0.2",clearmode="family"} 1' | curl --data-binary @- localhost:4278/metrics

And point Prometheus at it to scrape:

global:
  scrape_interval: 15s
  evaluation_interval: 30s
scrape_configs:
  - job_name: prometheus
    honor_labels: true
    static_configs:
      - targets: ["127.0.0.1:4278"]

Authentication

Gravel Gateway supports (pseudo) Basic authentication (with the auth feature). To use it, populate a file with bcrypt hashes, one per line, e.g.

htpasswd -bnBC 10 "" supersecrets | tr -d ':\n' > passwords

and then start gravel-gateway pointing to that file:

gravel-gateway --basic-auth-file ./passwords

Requests to the POST /metrics endpoint will then be rejected unless they contain a valid Authorization header:

curl http://localhost:4278/metrics -vvv --data-binary @metrics.txt -u :supersecrets

TLS

TLS is provided by the tls-key and tls-cert args. Both are required to start a TLS server; they specify the private key and the certificate that is presented, respectively.

Clustering

To horizontally scale the gateway, you can use clustering. The Gravel Gateway supports clustering by maintaining a hash ring of peers, provided by either a static list, an SRV record, or a file. When a request comes in and clustering is enabled, the job label is hashed to produce an "authoritative" node for that job, and the request is forwarded accordingly. That node thus becomes the only node that exposes metrics for the given job.

To enable clustering, use the cluster-enabled flag, and provide a discovery mechanism. For example:

./gravel-gateway --cluster-enabled --peer localhost:4279 --peer localhost:4280
./gravel-gateway --cluster-enabled -l localhost:4279 --peer localhost:4278 --peer localhost:4280
./gravel-gateway --cluster-enabled -l localhost:4280 --peer localhost:4278 --peer localhost:4279

starts three gravel gateway instances, clustered such that they forward requests between each other.
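The job-hashing described above can be sketched like this (a generic hash ring in Python for illustration; the gateway's actual ring and hash function may differ):

```python
import bisect
import hashlib

def ring_positions(peers, vnodes=16):
    """Place each peer at several hashed positions on a ring."""
    ring = []
    for peer in peers:
        for i in range(vnodes):
            h = hashlib.sha256(f"{peer}#{i}".encode()).hexdigest()
            ring.append((int(h, 16), peer))
    ring.sort()
    return ring

def authoritative_node(ring, job):
    """Hash the job label and walk clockwise to the next peer."""
    h = int(hashlib.sha256(job.encode()).hexdigest(), 16)
    idx = bisect.bisect(ring, (h,)) % len(ring)
    return ring[idx][1]

peers = ["localhost:4278", "localhost:4279", "localhost:4280"]
ring = ring_positions(peers)
# Every node computes the same owner for a given job, so any node
# can forward a push to the authoritative one:
print(authoritative_node(ring, "testjob"))
```

Because every peer builds the same ring from the same peer list, pushes for a given job always land on (and are scraped from) the same node.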

Pebbles

Sometimes, for gauges, you don't want to keep just the latest of your values (the default for gauges is "replace"). If we have, say, a new release that doubles the memory usage, then we probably want to know about that increase without it being pulled down by weeks of data from the previous version. For this use case, the Gravel Gateway supports "pebbles". Pebbles are effectively a circular buffer of time-based buckets. Each bucket represents a distinct timeslice and tracks a pre-aggregated value inside that timeslice. The final value for the metric is the same aggregation applied over each bucket.

This means we actually get an "aggregate of aggregates" out the other end, which loses some precision compared to storing all the raw data, but in practice this hasn't affected us much.

You can start a pebble using a clearmode in the form <aggregation><time>, e.g. {clearmode="mean5m"} will take a mean over the last 5 minutes of incoming data. Available aggregations at the moment are "sum" and "mean"; "median" is coming soon, and maybe "percentile" would be a good PR.
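Conceptually, a pebble looks something like this (an illustrative Python sketch with made-up bucket counts and widths, not the gateway's Rust implementation):

```python
import time

class Pebble:
    """A circular buffer of time-based buckets.

    Each bucket covers `width` seconds and stores (slice_id, sum, count)
    for samples that arrived in that timeslice; the exposed value is the
    aggregation (here: mean) applied over all live buckets. Stale buckets
    are overwritten when their ring slot is reused by a newer timeslice.
    """
    def __init__(self, buckets=5, width=60):
        self.width = width
        self.buckets = [None] * buckets  # (slice_id, sum, count)

    def push(self, value, now=None):
        now = time.time() if now is None else now
        slice_id = int(now // self.width)
        i = slice_id % len(self.buckets)
        b = self.buckets[i]
        if b is None or b[0] != slice_id:  # stale bucket: overwrite
            self.buckets[i] = (slice_id, value, 1)
        else:
            self.buckets[i] = (slice_id, b[1] + value, b[2] + 1)

    def mean(self):
        live = [b for b in self.buckets if b is not None]
        total = sum(b[1] for b in live)
        count = sum(b[2] for b in live)
        return total / count if count else 0.0

p = Pebble()
p.push(22, now=0)
p.push(10, now=61)  # lands in the next 1-minute bucket
print(p.mean())  # (22 + 10) / 2 = 16.0
```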

Motivation

I recently wrote about my frustrations with trying to orchestrate Prometheus in an FAAS (Functions-as-a-Service) system that shall remain nameless. My key frustration was that the number of semantics I was trying to extract from my Prometheus metrics was too great for the limited amount of data you can ship with them. In particular, there were three semantics I was trying to drive:

  1. Aggregated Counters - Things like request counts. FAAS applications only process one request (in general), so each sends a 1 to the gateway and I want to aggregate that into a total request count across all the invocations
  2. Non aggregated Gauges - It doesn't really make sense to aggregate Gauges in the general case, so I want to be able to send gauge values to the gateway and have them replace the old value
  3. Info values - Things like the build information. When a new labelset comes along for these metrics, I want to be able to replace all the old labelsets, e.g. upgrading from {version="0.1"} to {version="0.2"} should replace the {version="0.1"} labelset

Existing gateways, like the prom-aggregation-gateway or the pushgateway, are all-or-nothing with regards to aggregation - the pushgateway does not aggregate at all, completely replacing values as they come in. The aggregation gateway is the opposite - it aggregates everything. What I wanted was something that allows more flexibility in how metrics are aggregated. To that end, I wrote the Gravel Gateway.

prometheus-gravel-gateway's People

Contributors

kdeberk, lorello, sinkingpoint


prometheus-gravel-gateway's Issues

Docker build is broken?

Ran docker build . but it's taking an absurdly long time and seems to be stuck on the cargo build step:

=> [builder 3/3] RUN cargo build --release                                                                                                           863.3s
 => => # ode-bidi-0.3.7/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,l
 => => # ink -C opt-level=3 -C embed-bitcode=no --cfg 'feature="default"' --cfg 'feature="std"' -C metadata=9385220479cd704e -C extra-filename=-9385220479cd
 => => # 704e --out-dir /home/rust/src/target/x86_64-unknown-linux-musl/release/deps --target x86_64-unknown-linux-musl -L dependency=/home/rust/src/target/
 => => # x86_64-unknown-linux-musl/release/deps -L dependency=/home/rust/src/target/release/deps --cap-lints allow` (signal: 11, SIGSEGV: invalid memory ref
 => => # erence)
 => => # warning: build failed, waiting for other jobs to finish..

Found some unsupported items

Example:

# HELP jvm_memory_bytes_used Used bytes of a given JVM memory area
# TYPE jvm_memory_bytes_used gauge
jvm_memory_bytes_used{area="heap",} 9.4094392E12


# Service=myapp, date=Fri May 27 11:34:23 UTC 2022
  1. a comma at the end of the label list is unsupported
  2. an uppercase exponent letter "E" is unsupported; only lowercase "e" works
  3. two or more blank lines after the last metric are unsupported
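If you can post-process the exposition text before pushing, the three quirks can be normalized client-side (a sketch; the regexes are illustrative and not exhaustive):

```python
import re

def normalize_exposition(text):
    """Rewrite the three quirks listed above into the strict form."""
    # 1. Drop a trailing comma before the closing label brace.
    text = re.sub(r',\s*}', '}', text)
    # 2. Lowercase the exponent marker in scientific notation.
    text = re.sub(r'(\d)E([+-]?\d)', r'\1e\2', text)
    # 3. Collapse runs of blank lines into a single one.
    text = re.sub(r'\n{3,}', '\n\n', text)
    return text

raw = 'jvm_memory_bytes_used{area="heap",} 9.4094392E12\n\n\n'
print(normalize_exposition(raw))
```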

clearmode="family" only allows removing labels

Hi, when using the clearmode="family" label, it's not possible to add or rename labels on the series.

If a new function version needs to change its labels, the gateway needs to be restarted to handle this.

However, it's still possible to remove labels from a series; I believe that's a side-effect of the are_label_names_equivalent function not comparing against existing labels.
https://github.com/sinkingpoint/prometheus-gravel-gateway/blob/master/src/aggregator.rs#L355

echo "> clearmode="family" initial series"
cat <<EOF | curl --data-binary @- localhost:4278/metrics
# TYPE family_total counter
family_total{foo="bar",clearmode="family"} 5
EOF
curl localhost:4278/metrics

# > clearmode=family initial series
# # TYPE family_total counter
# family_total{foo="bar"} 5

echo "> clearmode="family" replace labels"
cat <<EOF | curl --data-binary @- localhost:4278/metrics
# TYPE family_total counter
family_total{bar="baz",clearmode="family"} 4
EOF
curl localhost:4278/metrics

# > clearmode=family replace labels
# invalid push - new push has different label names than the existing family# TYPE family_total counter
# family_total{foo="bar"} 5

echo "> clearmode="family" remove label"
cat <<EOF | curl --data-binary @- localhost:4278/metrics
# TYPE family_total counter
family_total{clearmode="family"} 3
EOF
curl localhost:4278/metrics

# > clearmode=family remove label
# # TYPE family_total counter
# family_total 3

echo "> clearmode="family" add label"
cat <<EOF | curl --data-binary @- localhost:4278/metrics
# TYPE family_total counter
family_total{foo="bar",baz="foo",clearmode="family"} 2
EOF
curl localhost:4278/metrics

# > clearmode=family add label
# invalid push - new push has different label names than the existing family# TYPE family_total counter
# family_total 3

[Questions] Clustering

Hello @sinkingpoint, I'd like to ask a few questions about clustering:

  • does this feature use any membership algorithm (like serf, for example), so that if a node of the cluster goes down, its metrics get re-assigned to another node?
  • is the metric proxying best-effort? If a node is down or temporarily unavailable, is the data point just discarded?
  • can the cluster be scaled in and out? Is the data re-organized when this happens?

From what I'm reading, the current clustering feature is just best-effort and, depending on the requirements, not production-ready?

Support K8s deployment via helm charts

In order to deploy the gravel-gateway within our K8s cluster, a Helm chart would be helpful.
This would make operations like HA scaling easier.
We tried a drop-in replacement in the pushgateway chart, but needed to disable the probes for that.
If a Helm chart is not feasible, could you add a section to the README describing which probes (liveness, readiness) should be configured?

Cluster Returning Values from All Peers

We have a Gravel Gateway setup in a three-peer cluster and are pushing metrics via the Python Prometheus client library. Specifically, we are using the push_to_gateway method, which requires a "job" parameter to be set. The push is successful, but Gravel is not writing values to the same instance on successive pushes. Over multiple pushes, we end up with three different values across the instances.

When scraping, the value returned seems randomly chosen out of the 3 instances.

Expectation: Based on the documentation, the metric should be written to the same instance each time and be retrieved from that same instance, regardless of which instance received the request.

Pebbles - mean not correct

The result of using mean5m seems incorrect.
To reproduce, run the gateway locally and send:

echo '# TYPE test_value gauge
test_value{clearmode="mean5m"} 22' | curl --data-binary @- localhost:4278/metrics/job/testjob

Expected 22, but the gateway returns 0:

curl localhost:4278/metrics
# TYPE test_value gauge
test_value{job="testjob"} 0

The result is wrong for all successive submissions.

metric label overriding

I have 2 lambda functions pointing to the same gateway.
The first lambda sends:

  • requests_num_total{LAMBDA_NAME="test_function"}

The second lambda sends:

  • requests_num_total{job="test"}

When I curl the gateway endpoint, I see only:

requests_num_total{LAMBDA_NAME="test_function"} 1
requests_num_total{LAMBDA_NAME="test"} 1

where the label job has been overridden by LAMBDA_NAME.

How to reproduce

  • run the gateway locally
  • send the first metric
echo '# TYPE requests_num_total counter
requests_num_total{LAMBDA_NAME="test_function"} 1' | curl --data-binary @- 127.0.0.1:4278/metrics
  • send the second metric
echo '# TYPE requests_num_total counter
requests_num_total{job="test"} 1' | curl --data-binary @- 127.0.0.1:4278/metrics

curl the gateway endpoint curl 127.0.0.1:4278/metrics:

# TYPE requests_num_total counter
requests_num_total{LAMBDA_NAME="test_function"} 1
requests_num_total{LAMBDA_NAME="test"} 1

As you can see, the label job has been renamed to LAMBDA_NAME;
the first label name sent always overrides any that follow.

Error when dealing with metrics that only contain HELP and TYPE lines

Some metric-pushing clients can generate metrics with HELP and TYPE lines, but without the actual metric value lines.

For example:

# HELP number_of_transactions_total Number of transactions
# TYPE number_of_transactions_total counter
<empty line>

When the client tries to push those metrics to the prometheus-gravel-gateway, the push will fail with the error invalid push - new push has different label names than the existing family. There are two scenarios where this error is currently thrown where I think the gravel gateway should actually accept the push:

  1. the first push did mention a value for the metric along with labels, but a later one didn't:
# HELP number_of_transactions_total Number of transactions
# TYPE number_of_transactions_total counter
number_of_transactions_total{label="value"} 1

followed by:

# HELP number_of_transactions_total Number of transactions
# TYPE number_of_transactions_total counter
  2. The first push mentioned neither a value nor labels, but a later one did:
# HELP number_of_transactions_total Number of transactions
# TYPE number_of_transactions_total counter

followed by:

# HELP number_of_transactions_total Number of transactions
# TYPE number_of_transactions_total counter
number_of_transactions_total{label="value"} 1

What the gravel-gateway can do for these scenarios is as follows:

  1. Don't merge the new family with the older one; the merge function returns early.
  2. Replace the old family with the new family, the same as if the should_clear_family condition in the merge function had been met.

Label value not decoded

When pushing metrics, most libraries encode the label values before sending them, following the spec set by the pushgateway (https://github.com/prometheus/pushgateway/blob/master/README.md#url)

So sending metrics{instance="localhost:80"} 1 should return metrics{instance="localhost:80"} 1 on the gateway side.

The issue I have with gravel-gateway is that the label values are not decoded, and the gateway instead displays:

metrics{instance="localhost%3A80"} 1

The fix would be to decode all label values before any processing.
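In Python terms, the proposed fix amounts to something like this (illustrative; the gateway itself is written in Rust):

```python
from urllib.parse import unquote

def decode_labels(labels):
    """Percent-decode every label value before further processing,
    matching the pushgateway's URL-encoding convention."""
    return {name: unquote(value) for name, value in labels.items()}

print(decode_labels({"instance": "localhost%3A80"}))
# {'instance': 'localhost:80'}
```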

Support PUT for clearing all existing metrics

The prometheus_client examples all use the push_to_gateway function to push metrics to the Prometheus gateway.
https://github.com/prometheus/client_python#exporting-to-a-pushgateway

The push_to_gateway method uses PUT as http method: https://github.com/prometheus/client_python/blob/v0.14.1/prometheus_client/exposition.py#L448

meanwhile pushadd_to_gateway uses POST as HTTP method:
https://github.com/prometheus/client_python/blob/v0.14.1/prometheus_client/exposition.py#L479

It would be nice to support the PUT method as well, so that developers can switch gateways with no code changes.

URL label values are being swapped

If I specify more than one label in the push URL, it looks like the values get swapped randomly:

Here's a script to replicate the issue:

#!/bin/bash

for i in {1..10}; do
  cat <<EOF | curl --data-binary @- localhost:4278/metrics/namespace/foo/label2/bar/
# TYPE test_total counter
test_total 1
EOF
done

curl localhost:4278/metrics

It outputs the following, whereas I'd expect no {namespace="bar",label2="foo"} metric:

# TYPE test_total counter
test_total{namespace="foo",label2="bar"} 5
test_total{namespace="bar",label2="foo"} 5

Pushing a metric with a subset or superset of existing label names causes error

Hey there!

First off, thanks for your work on this. It's nice to see a well-thought-out solution to getting serverless functions' metrics into Prometheus.

I bumped into an issue pushing metrics of this form:

function_calls_total{function="rabbit",objective_name="animalz"} 1
function_calls_total{function="wildRabbit"} 1

As a workaround, I can set the objective_name label to the empty string whenever it does not exist.

It seems like, under the hood, your https://github.com/sinkingpoint/openmetrics-parser library wants all metrics in a family to have the same labels. (Maybe I should open an issue there instead?)

It gets mad when one metric has a superset or subset of the other's label names.

According to the Prometheus docs, setting a label to the empty string is the same as not setting it at all.

So, to me, the expected behavior here is that

function_calls_total{function="rabbit",objective_name="animalz"} 1
function_calls_total{function="wildRabbit"} 1

would be equivalent to pushing

function_calls_total{function="rabbit",objective_name="animalz"} 1
function_calls_total{function="wildRabbit",objective_name=""} 1

Curious to hear your thoughts!
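Under that interpretation, pushed labelsets could be normalized by dropping empty values before comparing label names (a sketch of the proposal, not the gateway's current behavior):

```python
def normalize_labels(labels: dict) -> dict:
    """Drop empty-valued labels, since label="" is equivalent to an
    absent label in the Prometheus data model."""
    return {k: v for k, v in labels.items() if v != ""}

a = normalize_labels({"function": "wildRabbit", "objective_name": ""})
b = normalize_labels({"function": "wildRabbit"})
print(a == b)  # True
```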

Segmentation Fault (SIGSEGV) when cluster is enabled

Using version 1.6.2 (specifically prometheus-gravel-gateway_v1.6.2_x86_64-unknown-linux-musl), I've noticed that when clustering is enabled, a segmentation fault crashes the gateway when a metric is written.

GDB shows it as:

0x00007fb8f0195ab9 in parking_lot::condvar::Condvar::wait_until_internal ()
0x00007fb8f017d0ae in _ZN5tokio4park6thread5Inner4park17h6473e14d105f4ff5E.llvm.5059386539350869202 ()
0x00007fb8f017d44b in <tokio::park::thread::CachedParkThread as tokio::park::Park>::park ()
0x00007fb8efe4f655 in tokio::park::thread::CachedParkThread::block_on ()

Thread 2 "tokio-runtime-w" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 11614]
0x0000000000000000 in ?? ()

It was simple enough to reproduce using:

gravel-gateway --cluster-enabled -l localhost:4280 --peer localhost:4281 --peer localhost:4282
gravel-gateway --cluster-enabled -l localhost:4281 --peer localhost:4280 --peer localhost:4282
gravel-gateway --cluster-enabled -l localhost:4282 --peer localhost:4281 --peer localhost:4280

echo 'gravel_test_value_4{clearmode="aggregate"} 1' | curl -Lvv --data-binary @- localhost:4281/metrics
*   Trying 127.0.0.1:4281...
* Connected to localhost (127.0.0.1) port 4281 (#0)
> POST /metrics HTTP/1.1
> Host: localhost:4281
> User-Agent: curl/7.81.0
> Accept: */*
> Content-Length: 45
> Content-Type: application/x-www-form-urlencoded
> 
* Empty reply from server
* Closing connection 0

[Pebbles] AggregationError

Trying to use pebbles with {clearmode="mean5m"}, and it doesn't seem to work; maybe I'm reading the docs wrong.
To replicate, run the gateway locally and perform:

echo '# TYPE value2 gauge
value2{clearmode="mean5m"} 1' | curl --data-binary @- localhost:4278/metrics/job/lttest

This POST completes successfully, but performing a GET /metrics returns:

# TYPE value2 gauge
value2{job="lttest"} 0

Let's now try to push another value for the same gauge metric:

echo '# TYPE value2 gauge
value2{clearmode="mean5m"} 10' | curl --data-binary @- localhost:4278/metrics/job/lttest

and we receive the following error:

Unhandled rejection: AggregationError(Error("invalid push - new push has different label names than the existing family"))

Reset Endpoint

It would be nice to expose a "reset endpoint" or support the DELETE method, to make tests/integrations more convenient.

Base64 encode the authorization header

I understand the comment and intention here:

You'll note that we don't base64 the authorization header, so it's not technically Basic Auth, but I don't like Base64ing it because I believe that gives a false sense of security. Instead, you should enable TLS

However, the Basic auth spec itself mentions that base64 has nothing to do with security and should be used in conjunction with TLS: https://www.rfc-editor.org/rfc/rfc7617#section-1

The downside of not following the spec is the integration effort with libraries. For example, I want to use prom-client to push to the gravel gateway, but I can't use the methods meant for that, because they will obviously base64-encode the authorization header for me.

I would request a change to expect a base64-encoded authorization header, to improve client integration - or maybe accept both forms for backwards compatibility.

Persistent state

Hi

I saw your talk at KubeCon EU and found this project quite interesting. One thing I was wondering is how you tackle persistent state in your use case (if it is needed at all).

As far as I can see in the readme and the code, nothing is written to disk, and the gateway is "more or less" stateless.

I assume this means that if the gateway is restarted, it will lose the metrics that were already in it, and you "risk" a scrape with absent metrics?

How do you tackle this in your setup? And is persisting the state something you would consider useful? The state could, for example, be written to blob/S3-like storage periodically, allowing the gateway to start up again with data from the last session.

Error 500 when pushing same metric with different label

Hi,

thanks for developing Gravel Gateway!

I have an issue pushing the same metric with different label values.
I'm not sure whether it's something I'm missing, but here's how to reproduce:

  • clone master@06e50df0210797b47a7ca13ccf8fb0f4b8bc159e
  • cargo run (rustc 1.57.0 (f1edd0429 2021-11-29))
  • The following curl returns correctly with 200 OK
echo '# HELP mymetric My Metric
> # TYPE mymetric gauge
> mymetric{clearmode="aggregate",label="value1",job="my_metric"} 1.0' | curl -v -H 'Authorization: x' --data-binary @- localhost:4278/metrics
  • Multiple curl invocations as above yield the expected results (summed values for the metric).
  • But the following curl returns a 500 with the message Unhandled rejection: AggregationError(ParseError(InvalidMetric("Cannot add a sample with 3 labels into a family with 2"))) (see label="value2"):
echo '# HELP mymetric My Metric
> # TYPE mymetric gauge
> mymetric{clearmode="aggregate",label="value2",job="my_metric"} 1.0' | curl -v -H 'Authorization: x' --data-binary @- localhost:4278/metrics

Is this something to be expected?

Please let me know if I can be of further help.

Thanks!

Metrics that do not always have a recorded value cause 400 errors.

Hello, while trialing this project I encountered a bug when a job ran without recording a value for one of the metrics. Any metric detailed in the PUT after the empty metric causes the server to return a 400 error; the Prometheus pushgateway handles empty metrics without errors.

Sample PUT:

# HELP metric_without_values_total This metric does not always have values
# TYPE metric_without_values_total counter
# HELP metric_with_values_total This metric will always have values
# TYPE metric_with_values_total counter
metric_with_values_total{a_label="label_value",another_label="a_value"} 1.0
# HELP metric_with_values_created This metric will always have values
# TYPE metric_with_values_created gauge
metric_with_values_created{a_label="label_value",another_label="a_value"} 1.665577650707084e+09

Gives the following error back:

Invalid metric name in family. Family name is metric_without_values_total, but got a metric called metric_with_values_total

Code to reproduce (Python):

from prometheus_client import CollectorRegistry, Counter, push_to_gateway

registry = CollectorRegistry()

no_values = Counter(
    "metric_without_values",
    "This metric does not always have values",
    ["label"],
    registry=registry
)

has_values = Counter(
    "metric_with_values",
    "This metric will always have values",
    ["a_label", "another_label"],
    registry=registry
)

has_values.labels(a_label="label_value", another_label="a_value").inc()
push_to_gateway("localhost:4278", job="test_job", registry=registry)

Use per-job aggregators

Hello,

Currently, there is a single metrics aggregator used for the whole gateway process. That means that if 2 different jobs push metrics to the same gateway, and they want to use metrics with the same name (let's say "http_requests"), it's an issue:

  • Job 1 might be using a http library that adds a code label
  • Job 2 might be using a http library that adds a status_code label

When that happens, either Job 1 or 2 (the last one to push metrics) won't be able to ever push metrics because there would be different labels for the single metric family http_requests.

It would be nice if aggregation only errored on "non-matching labels" when:

  • the 2 families have no job label, or
  • the 2 families have the same job label.

This way the gateway could receive series from multiple different jobs, and report a time series for each job when scraped by Prometheus later:

http_requests{job="one", code="500"} 2
http_requests{job="two", status_code="500"} 1

I think this could be allowed by using a HashMap of Aggregators instead of a single one, keyed by job name. And I suppose that answering a scrape would come down to merging all existing aggregators together. It's also somewhat related to #8, since we could have a less destructive endpoint if we could ask to wipe only a single job's aggregator.
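The per-job scheme proposed here can be sketched as follows (toy Python stand-ins with hypothetical names; the real aggregator is Rust):

```python
class Aggregator:
    """Toy stand-in for the gateway's aggregator: one family store,
    summing samples with identical labelsets."""
    def __init__(self):
        self.families = {}  # name -> {frozenset(labels): value}

    def push(self, name, labels, value):
        series = self.families.setdefault(name, {})
        key = frozenset(labels.items())
        series[key] = series.get(key, 0) + value

# One aggregator per job, so differing label names never collide.
aggregators = {}

def push(job, name, labels, value):
    agg = aggregators.setdefault(job, Aggregator())
    agg.push(name, {**labels, "job": job}, value)

push("one", "http_requests", {"code": "500"}, 2)
push("two", "http_requests", {"status_code": "500"}, 1)

# Answering a scrape merges all per-job aggregators into one exposition:
for job, agg in sorted(aggregators.items()):
    for name, series in agg.families.items():
        for key, value in series.items():
            labels = ",".join(f'{k}="{v}"' for k, v in sorted(key))
            print(f"{name}{{{labels}}} {value}")
```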
