Comments (15)

mxinden commented on May 23, 2024

If I can find some time it’d be nice yeah, hopefully I’ll be able to find time to dig into this.

Great. Let me know in case you need any help!

I think I’d like a way to specify running the encode function in a separate thread

Can you expand on what you want to optimize? All metric implementations are synchronized (e.g. Counter is just an atomic integer), thus allowing metric recording and metric collection to happen in different threads.
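
To make the point above concrete, here is a minimal sketch using only the standard library. `SketchCounter` and `record_and_collect` are hypothetical names for illustration, not types from this crate; the sketch only assumes, as stated above, that a counter is essentially an atomic integer.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Simplified stand-in for a counter metric: a shared atomic integer.
#[derive(Clone, Default)]
pub struct SketchCounter {
    value: Arc<AtomicU64>,
}

impl SketchCounter {
    /// Record one event; callable from any thread.
    pub fn inc(&self) {
        self.value.fetch_add(1, Ordering::Relaxed);
    }

    /// Read the current value; this is what a collection pass would do.
    pub fn get(&self) -> u64 {
        self.value.load(Ordering::Relaxed)
    }
}

/// Record on several worker threads, then read from the calling thread.
pub fn record_and_collect(workers: u64, incs_per_worker: u64) -> u64 {
    let counter = SketchCounter::default();
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let c = counter.clone();
            thread::spawn(move || {
                for _ in 0..incs_per_worker {
                    c.inc();
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    counter.get()
}
```

Because every update goes through the atomic, no lock is needed and the reading thread does not have to be the recording thread.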

(which would require the registry to be Sync?)

I am guessing that you are referring to interior mutability and not the Sync marker trait? I don't see why one would need interior mutability for the Registry itself, unless one wants to register metrics after startup, which I would argue is an anti-pattern.

from client_rust.

gagbo commented on May 23, 2024

Hello,

Sorry for the late answer, I've been struggling to find time these days :(

For the time being, I don't see myself working much on this, as our monitoring coverage is decent now and we have other priorities, but I'll keep this in mind when it becomes an important subject again!

Can you expand on what you want to optimize?

I thought that all encode calls had to be executed in the same thread as the one recording the events, but as you clarified it shouldn't be an issue after all.

I am guessing that you are referring to interior mutability and not the Sync marker trait?

I was really thinking about the marker trait here (so that registries can be borrowed by an encode call in another thread), but it doesn't matter as it's irrelevant now; my understanding of the Registry structure seems to have been wrong.
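
For readers wondering what the Sync marker trait buys here, this is a small sketch under stated assumptions: `SketchRegistry`, `assert_sync` and `encode_on_other_thread` are hypothetical, and the registry is reduced to a list of named atomics. Borrowing it from a scoped thread only compiles because the type is Sync.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical stand-in for a registry: it only owns named atomic values.
pub struct SketchRegistry {
    metrics: Vec<(String, AtomicU64)>,
}

impl SketchRegistry {
    pub fn new(names: &[&str]) -> Self {
        let metrics = names
            .iter()
            .map(|n| (n.to_string(), AtomicU64::new(0)))
            .collect();
        Self { metrics }
    }

    /// Record an event on the metric at `index` through a shared reference.
    pub fn inc(&self, index: usize) {
        self.metrics[index].1.fetch_add(1, Ordering::Relaxed);
    }

    /// Render every metric as a `name value` line.
    pub fn encode(&self) -> String {
        let mut out = String::new();
        for (name, value) in &self.metrics {
            out.push_str(&format!("{} {}\n", name, value.load(Ordering::Relaxed)));
        }
        out
    }
}

/// Compile-time check: this call only type-checks if `T` is `Sync`.
fn assert_sync<T: Sync>() {}

/// Borrow the registry from another thread and encode there.
pub fn encode_on_other_thread(registry: &SketchRegistry) -> String {
    assert_sync::<SketchRegistry>();
    thread::scope(|scope| {
        scope
            .spawn(|| registry.encode())
            .join()
            .expect("encoder thread panicked")
    })
}
```

The compiler derives Sync automatically here because every field is Sync; nothing has to be implemented by hand.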

mxinden commented on May 23, 2024

Hi Gerry,

As you noted above, open-metrics-client does not itself support exposition of process metrics today.

I was thinking probably as another crate

That sounds good. I am open to whether that crate would live in this repository or not. The former would likely make development easier as one can make atomic changes across both crates at once (with one pull request).

Off the top of my head, the interface of such a crate could look like:

fn register_process_metrics(registry: &mut Registry)

The crate would register multiple custom metrics, e.g. a gauge metric exposing the number of threads. Those custom metrics would each implement EncodeMetric and collect the concrete metric values from the system on EncodeMetric::encode.
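
A rough sketch of that idea, assuming simplified stand-ins rather than the crate's real types: `SketchEncodeMetric` and `SketchRegistry` are hypothetical and do not mirror the actual `EncodeMetric` signature, and the metric name `process_threads` is chosen for illustration. The gauge reads the thread count from `/proc/self/status` (Linux only) each time it is encoded.

```rust
use std::fmt::Write;
use std::fs;

/// Simplified stand-in for the crate's `EncodeMetric` trait (not its real signature).
pub trait SketchEncodeMetric {
    fn encode(&self, name: &str, out: &mut String) -> std::fmt::Result;
}

/// Simplified stand-in for a registry holding named custom metrics.
pub struct SketchRegistry {
    metrics: Vec<(String, Box<dyn SketchEncodeMetric>)>,
}

impl SketchRegistry {
    pub fn new() -> Self {
        Self { metrics: Vec::new() }
    }

    pub fn register(&mut self, name: &str, metric: Box<dyn SketchEncodeMetric>) {
        self.metrics.push((name.to_string(), metric));
    }

    /// Ask every registered metric to write itself into one output string.
    pub fn encode(&self) -> Result<String, std::fmt::Error> {
        let mut out = String::new();
        for (name, metric) in &self.metrics {
            metric.encode(name, &mut out)?;
        }
        Ok(out)
    }
}

/// Extract the `Threads:` value from the contents of `/proc/<pid>/status`.
pub fn parse_thread_count(status: &str) -> Option<u64> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("Threads:"))
        .and_then(|rest| rest.trim().parse().ok())
}

/// Gauge whose value is read from the system on every encode call.
pub struct ThreadCountGauge;

impl SketchEncodeMetric for ThreadCountGauge {
    fn encode(&self, name: &str, out: &mut String) -> std::fmt::Result {
        // Linux only: elsewhere the read fails and the metric is skipped.
        let status = match fs::read_to_string("/proc/self/status") {
            Ok(contents) => contents,
            Err(_) => return Ok(()),
        };
        if let Some(threads) = parse_thread_count(&status) {
            writeln!(out, "# TYPE {} gauge", name)?;
            writeln!(out, "{} {}", name, threads)?;
        }
        Ok(())
    }
}

/// Entry point in the shape proposed above, against the stand-in registry.
pub fn register_process_metrics(registry: &mut SketchRegistry) {
    registry.register("process_threads", Box::new(ThreadCountGauge));
}
```

The value is gathered only when `encode` runs, so nothing has to poll the system between scrapes.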

@gagbo does the above make sense? Would you be interested in contributing such a crate?

gagbo commented on May 23, 2024

If I can find some time it’d be nice yeah, hopefully I’ll be able to find time to dig into this.

I think I’d like a way to specify running the encode function in a separate thread (which would require the registry to be Sync?), so that all the metrics-related calls can be handled by a different core if need be? (My point is that you probably don’t need to be running in the same thread to collect process info, so offloading those calls to another core and another cache might be nice to have.)

dovreshef commented on May 23, 2024

Hi, all

I've tried to take a stab at implementing this, since we need this as well. I looked at the current implementation for the other prometheus crate.

As part of the implementation it reads data from procfs to figure out stats on the process. In the current implementation, it is implemented as a custom collector, so the data is read once, and then used for all metrics. But with the design of the current crate, I found it hard to emulate this, since each metric is its own separate thing, and the logic is spread out across each encode metric impl (if I understood it correctly).

I think it would help if it was also possible to have something analogous to the Collect trait for a group of metrics that share a source, so to speak.

I also found issue #49, which I think shows other use cases where it would be helpful.

Just my 2c.

mxinden commented on May 23, 2024

@dovreshef would you be retrieving the information from the system just in time (on each scrape) or on an interval? I think the former is the Prometheus way.

Would the custom metric example not be the interface you need? I.e. be called on scrape to retrieve and generate the metrics?

https://github.com/prometheus/client_rust/blob/master/examples/custom-metric.rs

dovreshef commented on May 23, 2024

@dovreshef would you be retrieving the information from the system just in time (on each scrape) or on an interval? I think the former is the Prometheus way.

Sorry, I'm not sure I understand the question. I'll be retrieving the info in the EncodeMetric trait encode function, as demoed in the example.

There are multiple metrics that the process collector gathers, and the process to gather each one is similar. We read the /proc file system for the calling process, and extract the data from several files there.
The issue is that they all share a few (relatively) expensive initial steps, and if I gather the data for each metric separately I'll be repeating the steps for each metric, which is a bit of a waste.

In the existing Prometheus client implementation all the metrics are gathered in a single collect call, and so they share those initial steps.

So I think it would help if we had a way to collect/encode multiple metrics in a single call.
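
To illustrate the shared-step concern, here is a sketch that parses `/proc/<pid>/stat` once and derives several values from the result. `ProcStat`, `parse_proc_stat` and `process_samples` are hypothetical names; the field positions follow the proc(5) man page, and the clock tick rate and page size are passed in as parameters because the standard library does not expose them.

```rust
/// Values parsed once from `/proc/<pid>/stat` and shared by several metrics.
#[derive(Debug, PartialEq)]
pub struct ProcStat {
    pub utime_ticks: u64,
    pub stime_ticks: u64,
    pub num_threads: u64,
    pub vsize_bytes: u64,
    pub rss_pages: u64,
}

/// Parse the single line found in `/proc/<pid>/stat`.
pub fn parse_proc_stat(stat: &str) -> Option<ProcStat> {
    // The command name (field 2) is wrapped in parentheses and may contain
    // spaces, so everything up to the last ')' is skipped first.
    let rest = &stat[stat.rfind(')')? + 1..];
    let fields: Vec<&str> = rest.split_whitespace().collect();
    // `fields[0]` is field 3 (state), so field n lives at index n - 3.
    let get = |i: usize| -> Option<u64> { fields.get(i)?.parse().ok() };
    Some(ProcStat {
        utime_ticks: get(11)?, // field 14
        stime_ticks: get(12)?, // field 15
        num_threads: get(17)?, // field 20
        vsize_bytes: get(20)?, // field 23
        rss_pages: get(21)?,   // field 24
    })
}

/// Derive several samples from one parsed snapshot.
pub fn process_samples(
    stat: &ProcStat,
    ticks_per_second: f64,
    page_size: u64,
) -> Vec<(&'static str, f64)> {
    vec![
        (
            "process_cpu_seconds_total",
            (stat.utime_ticks + stat.stime_ticks) as f64 / ticks_per_second,
        ),
        ("process_threads", stat.num_threads as f64),
        ("process_virtual_memory_bytes", stat.vsize_bytes as f64),
        (
            "process_resident_memory_bytes",
            (stat.rss_pages * page_size) as f64,
        ),
    ]
}
```

With one metric per `encode` call, the read and parse step would run once per metric; with a single collect call it runs once per scrape.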

mxinden commented on May 23, 2024

The issue is that they all share a few (relatively) expensive initial steps, and if I gather the data for each metric separately I'll be repeating the steps for each metric, which is a bit of a waste.

Ah, sorry, I forgot that we had this discussion in the past.

As suggested on #49 (comment), what do you think of the option to register a Collector on a Registry? A Collector would be able to return a set of metrics where each metric can have a different metric type.

For the process collector, you would implement the Collector trait and register an instance with a Registry. On encode we would iterate the Collectors registered with the Registry, call Collector::collect and encode each returned metric.

Does that make sense @dovreshef? If so, would you like to prototype this?

As an aside, we would likely want to introduce StaticCounter, StaticGauge, ... so that you don't have to pay the cost of an AtomicU64 on each Collector::collect call.
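
One possible reading of this proposal, as a sketch only: `SketchCollector`, `SketchRegistry`, `ConstMetric` and `FixedCollector` are hypothetical names, and the trait shape is an assumption, not an API this crate is known to have. `ConstMetric` plays the role of the suggested StaticCounter/StaticGauge: a plain value created per collect call, with no atomic involved.

```rust
/// Plain value snapshot: created fresh on each collect call, so no atomics.
#[derive(Debug, Clone, PartialEq)]
pub enum ConstMetric {
    Counter(u64),
    Gauge(f64),
}

/// Hypothetical collector: returns a set of metrics of mixed types.
pub trait SketchCollector {
    fn collect(&self) -> Vec<(String, ConstMetric)>;
}

/// Hypothetical registry that owns collectors instead of single metrics.
#[derive(Default)]
pub struct SketchRegistry {
    collectors: Vec<Box<dyn SketchCollector>>,
}

impl SketchRegistry {
    pub fn register_collector(&mut self, collector: Box<dyn SketchCollector>) {
        self.collectors.push(collector);
    }

    /// On encode, iterate the collectors and encode each returned metric.
    pub fn encode(&self) -> String {
        let mut out = String::new();
        for collector in &self.collectors {
            for (name, metric) in collector.collect() {
                match metric {
                    ConstMetric::Counter(v) => {
                        out.push_str(&format!("# TYPE {} counter\n{} {}\n", name, name, v))
                    }
                    ConstMetric::Gauge(v) => {
                        out.push_str(&format!("# TYPE {} gauge\n{} {}\n", name, name, v))
                    }
                }
            }
        }
        out
    }
}

/// Collector with fixed values, standing in for a real process collector.
pub struct FixedCollector;

impl SketchCollector for FixedCollector {
    fn collect(&self) -> Vec<(String, ConstMetric)> {
        vec![
            ("process_cpu_seconds_total".to_string(), ConstMetric::Counter(12)),
            ("process_open_fds".to_string(), ConstMetric::Gauge(26.0)),
        ]
    }
}
```

A real process collector would do its expensive reads once inside `collect` and return all of its metrics from that single snapshot.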

dovreshef commented on May 23, 2024

Does that make sense @dovreshef? If so, would you like to prototype this?

Sure.

So the design is:

  • Have a new Collector trait that looks something like:
trait Collector<'a, M>
where
    M: EncodeMetric + 'a,
    Self::List: Iterator<Item = &'a (Descriptor, M)>
{
    type List;

    fn collect(&self) -> Self::List;
}
  • Registry implements Collector, which returns RegistryIterator.
  • text::encode calls the collect method on the registry to retrieve the iterator.

Now I can see two ways to continue from here:

Either:

  • Registry no longer holds sub_registries: Vec<Registry<M>> but instead holds Vec<Box<dyn Collector>>.
  • Registry will have a new function to add a subregistry as a Box<dyn Collector>.
  • No need to add new fields.

Or:

  • Have a new field on the registry that holds the Box<dyn Collector>.
  • RegistryIterator will also iterate over all the collectors.

WDYT? Any other design?
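
For the first option, here is a sketch of an object-safe variant. It deliberately differs from the trait above: the generic parameter and the associated type are replaced by a boxed iterator over owned `(String, f64)` pairs, because a trait with an unspecified associated type cannot be stored as `Box<dyn Collector>`. All names (`SketchCollector`, `SketchRegistry`, `FixedCollector`) and the prefix handling are assumptions for illustration.

```rust
/// Object-safe stand-in for the proposed trait.
pub trait SketchCollector {
    /// Boxing the iterator keeps the trait usable as `dyn SketchCollector`.
    fn collect<'a>(&'a self) -> Box<dyn Iterator<Item = (String, f64)> + 'a>;
}

/// Registry that holds only collectors; a sub-registry is just another one.
pub struct SketchRegistry {
    prefix: String,
    collectors: Vec<Box<dyn SketchCollector>>,
}

impl SketchRegistry {
    pub fn with_prefix(prefix: &str) -> Self {
        Self {
            prefix: prefix.to_string(),
            collectors: Vec::new(),
        }
    }

    /// Accepts leaf collectors and sub-registries alike.
    pub fn register(&mut self, collector: Box<dyn SketchCollector>) {
        self.collectors.push(collector);
    }
}

impl SketchCollector for SketchRegistry {
    fn collect<'a>(&'a self) -> Box<dyn Iterator<Item = (String, f64)> + 'a> {
        Box::new(
            self.collectors
                .iter()
                // Chain the output of every registered collector.
                .flat_map(|collector| collector.collect())
                // Prepend this registry's prefix to each metric name.
                .map(move |(name, value)| (format!("{}{}", self.prefix, name), value)),
        )
    }
}

/// Leaf collector with fixed samples, used only to exercise the sketch.
pub struct FixedCollector(pub Vec<(&'static str, f64)>);

impl SketchCollector for FixedCollector {
    fn collect<'a>(&'a self) -> Box<dyn Iterator<Item = (String, f64)> + 'a> {
        Box::new(self.0.iter().map(|(name, value)| (name.to_string(), *value)))
    }
}
```

The cost of this shape is one heap allocation per collect call and owned names instead of borrowed descriptors, which may or may not be acceptable for the real crate.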

mxinden commented on May 23, 2024

  • Have a new Collector trait that looks something like:
trait Collector<'a, M>
where
    M: EncodeMetric + 'a,
    Self::List: Iterator<Item = &'a (Descriptor, M)>
{
    type List;

    fn collect(&self) -> Self::List;
}

👍

Small nit, maybe type Collection; would be more intuitive.

  • Registry implements Collector, which returns RegistryIterator.

👍

  • text::encode calls the collect method on the registry to retrieve the iterator.

Instead of taking a Registry, text::encode could now even take some C: Collector.

Either:

  • Registry no longer holds sub_registries: Vec<Registry<M>> but instead holds Vec<Box<dyn Collector>>.
  • Registry will have a new function to add a subregistry as a Box<dyn Collector>.
  • No need to add new fields.

That would be very clean in my opinion. My gut feeling tells me we will run into some trait object issues. That said, I think we should give it a try.

Thanks @dovreshef for looking into this!

mxinden commented on May 23, 2024

@dovreshef are you still interested in contributing the Collector pattern? :)

mxinden commented on May 23, 2024

Cross referencing proposal for Collector here: #82

dovreshef commented on May 23, 2024

@dovreshef are you still interested in contributing the Collector pattern? :)

Sorry I missed that message (and disappeared) but I planned to do this for work, and got pulled to other issues.

baryluk commented on May 23, 2024

Is there a standardized process collector available as a library? I would like to have, as a minimum, something similar to Go and Python:

Generic - MUST haves really:

# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 138100.24
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 26
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 3.7982208e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.70893894953e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.346695168e+09
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes 1.8446744073709552e+19

(I often also add own process_uptime_seconds)
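
As a small illustration of the uptime gauge mentioned above, here is a sketch that derives it from the process start time and renders it in the same HELP/TYPE layout as the sample output. `uptime_seconds` and `encode_gauge` are hypothetical helpers, the help text is invented, and Rust's default float formatting differs from the exponent notation shown in the Go output.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Seconds elapsed between the process start time (Unix seconds) and `now`.
pub fn uptime_seconds(start_time_seconds: f64, now: SystemTime) -> f64 {
    let now_seconds = now
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs_f64())
        .unwrap_or(0.0);
    // Clamp at zero in case the clock moved backwards.
    (now_seconds - start_time_seconds).max(0.0)
}

/// Render one gauge with its HELP and TYPE lines.
pub fn encode_gauge(name: &str, help: &str, value: f64) -> String {
    format!(
        "# HELP {} {}\n# TYPE {} gauge\n{} {}\n",
        name, help, name, name, value
    )
}
```

Calling `uptime_seconds(start, SystemTime::now())` at scrape time keeps the value current without a background thread.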

Rust specific, with some inspiration from Go (of course goroutines and GC do not make sense, but compiler version, thread count, and any allocation statistics, e.g. allocator cache hits, fragmentation estimates, number and sum of allocations, etc. would be nice):

go_build_info{checksum="",path="",version=""} 1
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 2.6526e-05
go_gc_duration_seconds{quantile="0.25"} 3.1393e-05
go_gc_duration_seconds{quantile="0.5"} 4.3811e-05
go_gc_duration_seconds{quantile="0.75"} 6.8233e-05
go_gc_duration_seconds{quantile="1"} 0.003802359
go_gc_duration_seconds_sum 4.637331431
go_gc_duration_seconds_count 76360
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 13
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.20.6"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 1.8776528e+07
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 2.72165399288e+11
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 1.615149e+06
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 6.057485348e+09
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 8.426024e+06
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 1.8776528e+07
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 4.145152e+06
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 2.0430848e+07
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 205940
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 1.744896e+06
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 2.4576e+07
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.7107767458342338e+09
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 0
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 6.057691288e+09
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 2400
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 15600
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 282880
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 326400
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 2.123056e+07
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 689603
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 589824
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 589824
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 3.62386e+07
# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 10

Th

mxinden commented on May 23, 2024

Is there a standardized process collector available as a library? I would like to have, as a minimum, something similar to Go and Python:

As the above conversation says, the prometheus-client crate is still missing the process collector functionality. Contributions welcome.
