ringcentral / metrics-facade

RingCentral Metrics Facade is a Java library for working with metrics, allowing extremely flexible configuration of metrics and their export, designed to be generic and not tied to a specific implementation.

License: MIT License

Java 100.00%
metrics prometheus java monitring ringcentral jmx zabbix open-source histogram hdrhistogram

metrics-facade's People

Contributors

devromik, ikurovsky, kkolyan, mairovichaa

metrics-facade's Issues

Scale histogram: prevent overflow when calculating TOTAL_SUM and other measurables

// 1) Create registry
var registry = new DefaultMetricRegistry();

// 2) Define labels
var service = new Label("service");
var server = new Label("server");

// 3) Register metric
Histogram histogram = registry.histogram(withName("failover", "count", "histogram"), () -> withHistogram()
    .description("Failover count histogram")
    .labels(service, server)
    .measurables(TOTAL_SUM, MEAN)
    .impl(scale().with(linearScale().from(0).steps(1, 2).withInf())));

// 4) Update metric
histogram.update(4, forLabelValues(service.value("service-1"), server.value("server-1-1")));
histogram.update(5, forLabelValues(service.value("service-1"), server.value("server-1-1")));

// Metric instances are added asynchronously
sleep(100);

// 5) Create exporter
PrometheusMetricsExporter exporter = new PrometheusMetricsExporter(registry);

// 6) Export metrics
System.out.println(exporter.exportMetrics());

Output:

# HELP failover_count_histogram Failover count histogram
# TYPE failover_count_histogram summary
failover_count_histogram_sum{service="service-1",server="server-1-1",} -2.0
# HELP failover_count_histogram_mean Failover count histogram
# TYPE failover_count_histogram_mean gauge
failover_count_histogram_mean{service="service-1",server="server-1-1",} -1.0
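
The negative values are consistent with signed 64-bit overflow. The following is a minimal sketch, not the library's code. It assumes (the issue does not confirm this) that a scale histogram attributes each update to the upper bound of its bucket and represents the infinity bucket as Long.MAX_VALUE. Both updates (4 and 5) fall into that bucket, so the sum becomes 2 * Long.MAX_VALUE, which wraps to -2, and the mean becomes -1. The class and method names are hypothetical.

```java
public class TotalSumOverflowSketch {

    // Hypothetical stand-in for the upper bound of the infinity bucket.
    static final long INF_BOUND = Long.MAX_VALUE;

    // Plain long arithmetic: silently wraps around on overflow.
    static long wrappingSum(long bucketBound, long count) {
        return bucketBound * count;
    }

    // Saturating alternative for non-negative inputs:
    // clamps to Long.MAX_VALUE instead of wrapping.
    static long saturatingSum(long bucketBound, long count) {
        try {
            return Math.multiplyExact(bucketBound, count);
        } catch (ArithmeticException e) {
            return Long.MAX_VALUE;
        }
    }

    public static void main(String[] args) {
        long count = 2; // two updates in the infinity bucket
        System.out.println(wrappingSum(INF_BOUND, count));   // -2
        System.out.println(saturatingSum(INF_BOUND, count)); // 9223372036854775807
    }
}
```

Saturation is only one option. Accumulating into a double or skipping the infinity bucket when computing TOTAL_SUM would avoid the wrap-around as well.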

Always deregister/register *Var metrics

com/ringcentral/platform/metrics/var/AbstractVar.java:178 contains a check that prevents registration of a metric if it already exists.
Registration (as well as removal) is performed asynchronously, whereas the check is not.
This is why the following:

            LongVar metric = registry.longVar(getName(), Var.noTotal(),
                    () -> withLongVar().dimensions(DIMENSION_1, DIMENSION_2)
            );
            // 1
            metric.register(successProvider, successDimensionValues);
            metric.register(failedProvider, failedDimensionValues);
            // 2
            metric.deregister(successDimensionValues);
            metric.deregister(failedDimensionValues);
            // 3
            metric.register(successProvider, successDimensionValues);
            metric.register(failedProvider, failedDimensionValues);

could result in no metrics being registered at all:

  1. (1) registration is scheduled and metrics are registered right away
  2. (2) deregistration is scheduled
  3. (3) registration finds out that the metrics exist, so registration is skipped
  4. the actual deregistration happens

Suggestion: remove the check.
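
The interleaving can be reproduced deterministically with a small model. This is a sketch under assumptions, not the code of AbstractVar: a queue that is drained by hand stands in for the background executor, and all names are hypothetical. With the check enabled the metric ends up missing. Without it the scheduled tasks run in order and the metric is present.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

public class RegisterRaceSketch {

    private final Set<String> registered = new HashSet<>();
    private final Queue<Runnable> pending = new ArrayDeque<>();
    private final boolean checkBeforeScheduling;

    RegisterRaceSketch(boolean checkBeforeScheduling) {
        this.checkBeforeScheduling = checkBeforeScheduling;
    }

    void register(String key) {
        // The existence check runs immediately, on the caller's thread...
        if (checkBeforeScheduling && registered.contains(key)) {
            return;
        }
        // ...while the actual registration is only scheduled.
        pending.add(() -> registered.add(key));
    }

    void deregister(String key) {
        pending.add(() -> registered.remove(key));
    }

    // Stands in for the executor running the scheduled tasks in order.
    void drain() {
        Runnable task;
        while ((task = pending.poll()) != null) {
            task.run();
        }
    }

    static boolean registeredAfterScenario(boolean checkBeforeScheduling) {
        RegisterRaceSketch metric = new RegisterRaceSketch(checkBeforeScheduling);
        metric.register("success");   // (1) scheduled
        metric.drain();               // (1) executed right away
        metric.deregister("success"); // (2) scheduled, not yet executed
        metric.register("success");   // (3) skipped if the check sees the stale entry
        metric.drain();               // remaining scheduled tasks run
        return metric.registered.contains("success");
    }

    public static void main(String[] args) {
        System.out.println(registeredAfterScenario(true));  // false
        System.out.println(registeredAfterScenario(false)); // true
    }
}
```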

Support accumulative modifications

By modifications we mean MetricRegistry.preConfigure, MetricRegistry.postConfigure, PrometheusInstanceSampleSpecModsProvider, etc.

Currently, modifications are based on the original entity (Metric, MetricInstance) and overwrite the previous ones. We need to support accumulative modifications that take into account both the original entity and all the previous modifications.
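
The difference between the two behaviours can be sketched as follows. This is an illustration only: Spec is a hypothetical, simplified stand-in for the modified entity, and modifications are modelled as plain functions rather than the library's types.

```java
import java.util.List;
import java.util.function.UnaryOperator;

public class AccumulativeModsSketch {

    // Hypothetical, simplified stand-in for a modified entity.
    static final class Spec {
        final String name;
        final String description;

        Spec(String name, String description) {
            this.name = name;
            this.description = description;
        }
    }

    // Behaviour described above: every modification sees the original,
    // so the result of the last one overwrites the previous ones.
    static Spec overwrite(Spec original, List<UnaryOperator<Spec>> mods) {
        Spec result = original;
        for (UnaryOperator<Spec> mod : mods) {
            result = mod.apply(original);
        }
        return result;
    }

    // Accumulative behaviour: every modification sees the result of the previous ones.
    static Spec accumulate(Spec original, List<UnaryOperator<Spec>> mods) {
        Spec result = original;
        for (UnaryOperator<Spec> mod : mods) {
            result = mod.apply(result);
        }
        return result;
    }

    public static void main(String[] args) {
        Spec original = new Spec("counter", "original");
        List<UnaryOperator<Spec>> mods = List.of(
            s -> new Spec("renamed", s.description),
            s -> new Spec(s.name, "updated"));

        System.out.println(overwrite(original, mods).name);  // counter (the rename is lost)
        System.out.println(accumulate(original, mods).name); // renamed
    }
}
```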

Add stop with MetricDimensionValues parameter to Stopwatch interface

Let's consider the following case:

    public static void main(String[] args) throws InterruptedException {
        DropwizardMetricRegistry registry = new DropwizardMetricRegistry();

        MetricDimension statusDimension = new MetricDimension("status");

        Timer timer = registry.timer(
                withName("timer"),
                () -> withTimer().dimensions(statusDimension)
        );
        
        Stopwatch stopwatch = timer.stopwatch();
        try {
            operation();
            stopwatch.stop(dimensionValues(statusDimension.value("success")));
        } catch (Exception ex) {
            stopwatch.stop(dimensionValues(statusDimension.value("failure")));
        }
    }

At the moment the stopwatch is created, the value of statusDimension isn't known yet.

So it's not possible to use Stopwatch in this case, and a custom solution is needed.

However, a stop method with a MetricDimensionValues dimensionValues parameter seems to fit naturally.

I suggest adding it.

P.S.
In my opinion, dimensionValues should be passed to a Stopwatch instance only once.

A second attempt should lead to an exception.
For example,

        Stopwatch stopwatch = timer.stopwatch(dimensionValues(statusDimension.value("success")));
        operation();
        // an exception should be thrown
        stopwatch.stop(dimensionValues(statusDimension.value("fatal")));
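
The proposed semantics can be sketched in isolation. This is a hypothetical model, not the library's Stopwatch: a plain String stands in for MetricDimensionValues, a map stands in for the timer, and IllegalStateException is an assumed choice of exception.

```java
import java.util.HashMap;
import java.util.Map;

public class StopwatchSketch {

    private final Map<String, Long> sink; // stands in for the timer being updated
    private final long startNanos = System.nanoTime();
    private String status;                // stands in for MetricDimensionValues

    // status may be null when the value is not known at creation time
    StopwatchSketch(Map<String, Long> sink, String status) {
        this.sink = sink;
        this.status = status;
    }

    // Proposed overload: supplies the dimension value at stop time, at most once.
    long stop(String status) {
        if (this.status != null) {
            throw new IllegalStateException("dimension values were already supplied");
        }
        this.status = status;
        return stop();
    }

    long stop() {
        if (status == null) {
            throw new IllegalStateException("dimension values were not supplied");
        }
        long elapsedNanos = System.nanoTime() - startNanos;
        sink.put(status, elapsedNanos);
        return elapsedNanos;
    }

    public static void main(String[] args) {
        Map<String, Long> sink = new HashMap<>();
        Runnable operation = () -> { };
        StopwatchSketch stopwatch = new StopwatchSketch(sink, null);
        try {
            operation.run();
            stopwatch.stop("success");
        } catch (RuntimeException e) {
            stopwatch.stop("failure");
        }
        System.out.println(sink.keySet()); // [success]
    }
}
```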

Add a styleguide

We should create an automatic formatter for IDEA.
The list of rules:

  1. There should be an empty line after a class header:
public abstract class A {

    public final MetricName name;
    ...
}

There should be no empty lines if the definition of the class consists only of constants.

public abstract class A {
    public static final MetricName DEFAULT_NAME_PREFIX = MetricName.of("Buffers");
}
  2. We use only 4-space indentation for all the cases.
  3. We use the following formatting schema for long enough parameter lists (2+ usually but it depends):
protected Supplier<LongVarConfigBuilder> longVarConfigBuilderSupplier(
        String description,
        MetricDimension... dimensions) {

        ...
}

or

protected Supplier<LongVarConfigBuilder> longVarConfigBuilderSupplier(
        String description, MetricDimension... dimensions) {

        ...
}

We add an empty line after a multi-line definition. For example,

registry.longVar(
    nameWithSuffix("startTime"),
    runtimeMxBean::getStartTime,
    longVarConfigBuilderSupplier("The start time of the Java virtual machine in milliseconds"));

registry.longVar(
    nameWithSuffix("uptime", "ms"),
    runtimeMxBean::getUptime,
    longVarConfigBuilderSupplier("The uptime of the Java virtual machine in milliseconds"));
  4. We don't use the prefix 'is' for fields or parameters, only for methods: isDimensional.
  5. We don't add an empty line after a method signature:
public void produceMetrics(MetricRegistry registry) {
    // no empty line here
    for (int i = 0; i < ATTRS.length; ++i) {
        ...
    }
}
  6. We don't use a space for type casting: (double)i
  7. There should be an empty line before and after if/try inside a method body (but not for the last statement).
  8. There should be no empty line before the closing brace in a class definition.

Add methods for setting metric implementation builders that take an appropriate interface as a parameter instead of a java.lang.Object

For example, the method .with(...) used in com.ringcentral.platform.metrics.samples.histogram.HistogramSample takes a java.lang.Object, which is error-prone:

.with(hdrImpl()
  .resetByChunks(6, Duration.ofMinutes(2))
  .highestTrackableValue(1000, REDUCE_TO_HIGHEST_TRACKABLE)
  .significantDigits(3)
  .snapshotTtl(30, SECONDS))

We should add a method that takes some appropriate interface instead.
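
A minimal sketch of what such a method could look like. All names here (HistogramImplConfigBuilder, HdrImplConfigBuilder, HistogramConfigBuilder and its impl method) are hypothetical and only illustrate the idea of moving the check from run time to compile time.

```java
public class TypedImplBuilderSketch {

    // Hypothetical interface shared by all histogram implementation config builders.
    interface HistogramImplConfigBuilder {
        String implName();
    }

    // Hypothetical builder standing in for the one returned by hdrImpl().
    static class HdrImplConfigBuilder implements HistogramImplConfigBuilder {
        @Override
        public String implName() {
            return "hdr";
        }
    }

    static class HistogramConfigBuilder {
        private HistogramImplConfigBuilder impl;

        // Typed setter: an argument that is not a histogram implementation
        // builder (a String, a timer builder, ...) is rejected by the compiler.
        HistogramConfigBuilder impl(HistogramImplConfigBuilder impl) {
            this.impl = impl;
            return this;
        }

        String implName() {
            return impl == null ? "default" : impl.implName();
        }
    }

    public static void main(String[] args) {
        HistogramConfigBuilder config = new HistogramConfigBuilder().impl(new HdrImplConfigBuilder());
        System.out.println(config.implName()); // hdr
        // new HistogramConfigBuilder().impl("hdr"); // would not compile
    }
}
```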

Support annotation-based metric implementations discovery

Currently, we use the following approach to register custom metric implementations:

DefaultMetricRegistry registry = new DefaultMetricRegistry();
registry.extendWith(LastValueHistogramImplConfig.class, new LastValueHistogramImplMaker());

See com.ringcentral.platform.metrics.samples.histogram.HistogramSample for more details.
We need to add support for annotation-based automatic metric implementations discovery.
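
One possible shape of such discovery, sketched with plain reflection. The annotation MetricImplMakerFor and the classes below are hypothetical, and classpath scanning is left out: the candidate classes are passed in explicitly.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.Map;

public class AnnotatedImplDiscoverySketch {

    // Hypothetical annotation linking a maker to the config class it handles.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface MetricImplMakerFor {
        Class<?> value();
    }

    static class LastValueConfig { }

    @MetricImplMakerFor(LastValueConfig.class)
    static class LastValueMaker { }

    static class UnrelatedClass { }

    // Instantiates every annotated candidate and maps it to its config class.
    static Map<Class<?>, Object> discover(Class<?>... candidates) {
        Map<Class<?>, Object> makers = new HashMap<>();
        for (Class<?> candidate : candidates) {
            MetricImplMakerFor annotation = candidate.getAnnotation(MetricImplMakerFor.class);
            if (annotation == null) {
                continue; // not a maker
            }
            try {
                makers.put(annotation.value(), candidate.getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + candidate.getName(), e);
            }
        }
        return makers;
    }

    public static void main(String[] args) {
        Map<Class<?>, Object> makers = discover(LastValueMaker.class, UnrelatedClass.class);
        System.out.println(makers.containsKey(LastValueConfig.class)); // true
        System.out.println(makers.size());                             // 1
    }
}
```

Each discovered pair could then be passed to registry.extendWith(...) as in the manual approach above.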

Add docs for Spring integration

There is an example app that shows how to integrate metrics-facade with Spring, but the docs say nothing about Spring support.

I think it's worth adding a section that describes the corresponding functionality (existing configuration options, how to customize them, etc.).

Use custom Spring bean name prefix

An application may be based on a Spring-based framework that already defines a metrics registry bean without qualifiers. Using it together with the metrics facade starter results in a bean definition conflict.

We need to set a custom bean name prefix to avoid such conflicts.

Improve eviction and expiration of labeled metrics

  1. AbstractMeter: order MetricInstances within the same millisecond based on their insertion order.
  2. Provide CompletableFuture<Iterator> iterator(Executor completionExecutor) to return iterator after removeExpiredInstances(). Update exporters accordingly.
  3. Make periodic actualisation (removeExpiredInstancesAndSchedule) of MetricInstances optional.
  4. Consider eliminating EXPIRED_INSTANCES_REMOVAL_ADDITIONAL_DELAY_MS in
     executor.schedule(
         this::removeExpiredInstancesAndSchedule,
         baseDelayMs + EXPIRED_INSTANCES_REMOVAL_ADDITIONAL_DELAY_MS, MILLISECONDS);

Consider min(baseDelayMs, 10000)
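
Item 1 can be sketched with a sequence number used as a tie-breaker. This is an assumption about one way to do it, not the AbstractMeter code, and the names are hypothetical. Ordering by timestamp alone would leave the order of entries with the same millisecond undefined (and a sorted set would even treat them as duplicates).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicLong;

public class ExpirationOrderSketch {

    static final class Entry {
        final String key;
        final long updateTimeMs;
        final long seq; // insertion order, used only to break timestamp ties

        Entry(String key, long updateTimeMs, long seq) {
            this.key = key;
            this.updateTimeMs = updateTimeMs;
            this.seq = seq;
        }
    }

    private final AtomicLong nextSeq = new AtomicLong();

    // Oldest timestamp first; within the same millisecond, earliest insertion first.
    private final TreeSet<Entry> byExpiration = new TreeSet<>(
        Comparator.comparingLong((Entry e) -> e.updateTimeMs)
            .thenComparingLong(e -> e.seq));

    void add(String key, long updateTimeMs) {
        byExpiration.add(new Entry(key, updateTimeMs, nextSeq.getAndIncrement()));
    }

    List<String> keysInExpirationOrder() {
        List<String> keys = new ArrayList<>();
        for (Entry entry : byExpiration) {
            keys.add(entry.key);
        }
        return keys;
    }

    public static void main(String[] args) {
        ExpirationOrderSketch instances = new ExpirationOrderSketch();
        instances.add("b", 1000);
        instances.add("a", 1000); // same millisecond as "b", inserted later
        instances.add("c", 999);
        System.out.println(instances.keysInExpirationOrder()); // [c, b, a]
    }
}
```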

Split the test methods into smaller ones

This would allow more descriptive test names and descriptions, which could state the preconditions and the expected outcome.
The following tests could be refactored:

  • PrometheusMetricsExporterTest
  • PrometheusInstanceSampleTest
  • PrometheusSampleMakerTest
  • SimpleCollectorRegistryPrometheusInstanceSamplesProviderTest

Get rid of forLabelValues wrapper

The 'forLabelValues' wrapper currently provides no useful functionality and produces unnecessary garbage whenever a labeled metric is updated. This task is to eliminate the wrapper.

Predicate passed in forMetricInstancesMatching isn't applied

    public static void main(String[] args) {
        DropwizardMetricRegistry registry = new DropwizardMetricRegistry();
        PrometheusInstanceSampleSpecModsProvider instanceSampleSpecModsProvider = new PrometheusInstanceSampleSpecModsProvider();
        PrometheusInstanceSamplesProvider instanceSamplesProvider = new PrometheusInstanceSamplesProvider(
                instanceSampleSpecModsProvider,
                registry
        );


        MetricName name = MetricName.of("counter_1", "suffix");
        System.out.println(name + " part 0 is 'counter_1': " + name.part(0).equals("counter_1"));
        Counter counter = registry.counter(name);
        counter.inc();

        instanceSampleSpecModsProvider.addMod(
                forMetricInstancesMatching(
                        nameMask("counter_1.suffix"),
                        mi -> mi.name().part(0).equals("counter_1")
                ),
                (metric, instance) -> prometheusInstanceSampleSpec()
                        .name(MetricName.name("new_name_for_counter_1"))
        );


        MetricName name2 = MetricName.of("counter_2", "type_1", "suffix");
        System.out.println(name2 + " part 0 is 'counter_2': " + name2.part(0).equals("counter_2"));
        Counter counter2 = registry.counter(name2);
        counter2.inc();

        instanceSampleSpecModsProvider.addMod(
                forMetricInstancesMatching(
                        nameMask("counter_2.**.suffix"),
                        mi -> mi.name().part(0).equals("counter_2")
                ),
                (metric, instance) -> prometheusInstanceSampleSpec()
                        .name(MetricName.name("new_name_for_counter_2"))
        );

        PrometheusMetricsExporter prometheusMetricsExporter = new PrometheusMetricsExporter(instanceSamplesProvider);
        System.out.println(prometheusMetricsExporter.exportMetrics());
    }

Actual output:

counter_1.suffix part 0 is 'counter_1': true
counter_2.type_1.suffix part 0 is 'counter_2': true
# HELP counter_2_type_1_suffix Generated from metric instances with name counter_2.type_1.suffix
# TYPE counter_2_type_1_suffix gauge
counter_2_type_1_suffix 1.0
# HELP new_name_for_counter_1 Generated from metric instances with name counter_1.suffix
# TYPE new_name_for_counter_1 gauge
new_name_for_counter_1 1.0

Expected output:

counter_1.suffix part 0 is 'counter_1': true
counter_2.type_1.suffix part 0 is 'counter_2': true
# HELP new_name_for_counter_1 Generated from metric instances with name counter_1.suffix
# TYPE new_name_for_counter_1 gauge
new_name_for_counter_1 1.0
# HELP new_name_for_counter_2 Generated from metric instances with name counter_2.type_1.suffix
# TYPE new_name_for_counter_2 gauge
new_name_for_counter_2 1.0

It's expected that names for both counters will be replaced.
But currently only counter_1's name is replaced.

It seems that the following line contains a problem:
https://github.com/ringcentral/metrics-facade/blob/master/metrics-facade-base/src/main/java/com/ringcentral/platform/metrics/infoProviders/MaskTreeMetricNamedInfoProvider.java#L149

named should be used instead of name.
At least it solves the issue in my case.

Please, take a look.

Support OpenMetrics

Extend support for OpenMetrics including appropriate export.
Currently there is no way to define some of its domain entities, such as types (e.g. info and stateset), and there is no support for units.

Generate metrics used by SystemMetricsProducer using dimensions

Now if one registers SystemMetricsProducer and exports it (for example using Prometheus exporter), there will be metrics in the following format:

...
# HELP Memory_pools_G1_Survivor_Space_usage Generated from metric instances with name Memory.pools.G1-Survivor-Space.usage
# TYPE Memory_pools_G1_Survivor_Space_usage gauge
Memory_pools_G1_Survivor_Space_usage 0.04085540771484375
# HELP Memory_pools_G1_Old_Gen_usage Generated from metric instances with name Memory.pools.G1-Old-Gen.usage
# TYPE Memory_pools_G1_Old_Gen_usage gauge
Memory_pools_G1_Old_Gen_usage 0.017514586448669434
# HELP Memory_pools_Compressed_Class_Space_usage Generated from metric instances with name Memory.pools.Compressed-Class-Space.usage
# TYPE Memory_pools_Compressed_Class_Space_usage gauge
Memory_pools_Compressed_Class_Space_usage 0.006308794021606445
# HELP Memory_pools_CodeCache_usage Generated from metric instances with name Memory.pools.CodeCache.usage
# TYPE Memory_pools_CodeCache_usage gauge
Memory_pools_CodeCache_usage 0.2337773640950521
# HELP Memory_non_heap_usage Generated from metric instances with name Memory.non-heap.usage
# TYPE Memory_non_heap_usage gauge
Memory_non_heap_usage -6.582004E7
# HELP Memory_pools_G1_Eden_Space_usage Generated from metric instances with name Memory.pools.G1-Eden-Space.usage
# TYPE Memory_pools_G1_Eden_Space_usage gauge
Memory_pools_G1_Eden_Space_usage 0.35365853658536583
# HELP Memory_pools_Metaspace_usage Generated from metric instances with name Memory.pools.Metaspace.usage
# TYPE Memory_pools_Metaspace_usage gauge
Memory_pools_Metaspace_usage 0.9908933534726991
...

The example above contains part of MemoryMetricsProducer's output.

It could be reworked in the following way:

# HELP jvm_Memory_pools_usage Generated from metric instances with name jvm.Memory.pools.usage
# TYPE jvm_Memory_pools_usage gauge
jvm_Memory_pools_usage{type="G1-Old-Gen",} 0.0026739835739135742
jvm_Memory_pools_usage{type="G1-Survivor-Space",} 1.0
jvm_Memory_pools_usage{type="Metaspace",} 0.960856277461597
jvm_Memory_pools_usage{type="CodeHeap-'non-profiled-nmethods'",} 0.04282428770387274
jvm_Memory_pools_usage{type="Compressed-Class-Space",} 0.007082119584083557
jvm_Memory_pools_usage{type="CodeHeap-'non-nmethods'",} 0.1676891268980477
jvm_Memory_pools_usage{type="G1-Eden-Space",} 0.7045454545454546

The second format seems more convenient. It also avoids redundant HELP and TYPE lines and makes it possible to provide a proper description for the whole group of metrics.

So I suggest adding a parameter to SystemMetricsProducer (and the producers used by it) that specifies the format of the metrics.

Here "specifies the format" means that, when constructed, the producer creates its metrics using dimensions (the "type" dimension in the example above) instead of creating a dedicated metric per value.
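
The dimension-based variant can be sketched with the standard java.lang.management API alone. This is an illustration, not the producers' code: the class and method names are hypothetical, and falling back to the committed size when a pool reports an undefined maximum (-1) is an assumed choice. A negative value such as Memory_non_heap_usage in the first listing may be the result of dividing by such an undefined maximum.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

public class PoolUsageByTypeSketch {

    // Collects one value per memory pool, keyed by the value of the "type" dimension.
    static Map<String, Double> poolUsageByType() {
        Map<String, Double> usageByType = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when the maximum is undefined.
            long limit = usage.getMax() >= 0 ? usage.getMax() : usage.getCommitted();
            if (limit <= 0) {
                continue;
            }
            usageByType.put(pool.getName().replace(' ', '-'), (double)usage.getUsed() / limit);
        }
        return usageByType;
    }

    // Renders a single metric with one sample per "type" value.
    static String render(String metricName, Map<String, Double> usageByType) {
        StringBuilder out = new StringBuilder();
        out.append("# TYPE ").append(metricName).append(" gauge\n");
        usageByType.forEach((type, value) -> out
            .append(metricName).append("{type=\"").append(type).append("\",} ")
            .append(value).append('\n'));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(render("jvm_Memory_pools_usage", poolUsageByType()));
    }
}
```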

Structure the docs

  • split README.md up into multiple smaller docs
  • add info on where to find the samples output (localhost:9095) and samples in general
  • add Contribution section
  • export model
  • benchmarks
  • migrate to wiki
  • TBD

Support removing export modifications

Some of the users of the library are currently migrating from Dropwizard. In Dropwizard, they test monitoring triggers by first deregistering a real metric and then registering a mock one with the same name and a value that activates the corresponding trigger. Since we believe that modifying real metrics is not a good solution (e.g., possible concurrent metric updates could be lost), we've decided to solve the same problem by supporting temporary export modifications so as not to affect the real metrics. Example:

@RequestMapping(value = "/counter/prometheus/value/{value}", method = POST)
public synchronized void modifyPrometheusValue(@PathVariable String value) {
    sampleSpecModsProvider.removeMod("testTriggerFor.counter");

    if (!"real".equalsIgnoreCase(value)) {
        sampleSpecModsProvider.addMod(
            "testTriggerFor.counter",
            forMetricWithName("counter"),
            (instanceSampleSpec, instance, measurableValues, measurable, currSpec) ->
                measurable instanceof Count ? sampleSpec().value(parseDouble(value)) : null);
    }
}

Fix "le" label for Histogram

.measurables(
    ..                        
    Bucket.of(27.5),
    ..)

Expected

histogram_fullConfig_byService_bucket{sample="histogram",service="service_1",le="27.5",} 2.0

Actual

histogram_fullConfig_byService_bucket{sample="histogram",service="service_1",le="27p5",} 2.0
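
The "27p5" value looks like the result of an encoding meant for metric names being applied to a label value, although the issue does not say where it comes from. In the Prometheus text format, label values may contain any characters, and only the backslash, the double quote and the line feed are escaped. A minimal sketch of rendering the label under that rule, with hypothetical class and method names:

```java
public class LeLabelSketch {

    // Escapes a label value for the Prometheus text format:
    // backslash, double quote and line feed are the only escaped characters.
    static String escapeLabelValue(String raw) {
        return raw
            .replace("\\", "\\\\")
            .replace("\"", "\\\"")
            .replace("\n", "\\n");
    }

    // Renders the "le" label of a bucket from its upper bound.
    static String leLabel(double upperBound) {
        String value = upperBound == Double.POSITIVE_INFINITY
            ? "+Inf"
            : Double.toString(upperBound);
        return "le=\"" + escapeLabelValue(value) + "\"";
    }

    public static void main(String[] args) {
        System.out.println(leLabel(27.5));                     // le="27.5"
        System.out.println(leLabel(Double.POSITIVE_INFINITY)); // le="+Inf"
    }
}
```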

Add SystemMetricsProducer to spring autoconfiguration

SystemMetricsProducer isn't part of Spring autoconfiguration.

I suggest making it possible to enable its autoconfiguration via the application's properties.

For example,

# application.yml
...
management:
    metrics:
        mf:
            collect:
                system: true
...

In this case
org.springframework.boot.actuate.autoconfigure.metrics.JvmMetricsAutoConfiguration
and
org.springframework.boot.actuate.autoconfigure.metrics.SystemMetricsAutoConfiguration
should be excluded from autoconfiguration as they collect similar metrics.

Redesign export

  1. Make modification API simpler and fluent. TBD
  2. DSL?
  3. Avoid recreating specs (InstanceSampleSpec, SampleSpec) for MetricInstances that have not changed since the last export
  4. Eliminate boxing/unboxing of measurable values to improve performance.
