beam-telemetry / telemetry_poller
Periodically gather measurements and publish them as Telemetry events
Home Page: https://hexdocs.pm/telemetry_poller
License: Apache License 2.0
Would it make sense to have something like https://github.com/ferd/recon/blob/master/src/recon.erl#L358 in telemetry_poller_builtin.erl?
Hi!
Is there a particular reason why telemetry 1.0 is required? AFAIK there were no API changes compared to 0.4.
The reason I'm asking is that we're currently in a weird transition period, where many packages still depend on telemetry 0.4 without allowing 1.0, so if some packages (like this one) don't allow 0.4, we're forced to specify 1.0 with an override. In my project this succeeds during mix deps.get dependency resolution, but unfortunately other tasks like mix test fail to start (I'm not exactly sure why, but other people may also be hitting this problem).
I think it would be good to specify ~> 0.4 or ~> 1.0, at least for some time.
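To see what the proposed requirement would accept, the standard library's Version module can check candidate versions against it. This is only a sketch: the version numbers are illustrative, and the dependency entry in the comment is a hypothetical mix.exs line, not this project's actual configuration.

```elixir
# Hypothetical dependency entry using the relaxed requirement:
#   {:telemetry, "~> 0.4 or ~> 1.0"}

relaxed = "~> 0.4 or ~> 1.0"
strict = "~> 1.0"

# Print, for each candidate version, whether each requirement accepts it.
for version <- ["0.4.3", "1.0.0", "0.3.0"] do
  IO.puts(
    "#{version}: relaxed=#{Version.match?(version, relaxed)} " <>
      "strict=#{Version.match?(version, strict)}"
  )
end
```

With the relaxed form, projects pinned to a 0.4.x release and projects already on 1.x would both resolve without an override.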
As Fred Hebert wrote in Stuff Goes Bad: Erlang in Anger: "Tracking [process count] over time can be extremely helpful to try and characterize load or detect process leaks..."
Adding an additional measurement to Telemetry.Poller to report the process count at periodic intervals could be very useful.
If we agree this is useful and someone wants to take this on, I would greatly appreciate it, as I am personally not incredibly adept at writing Erlang. If no one is available to take it on, I'll probably give it a try, but it may take me some time.
Bonus feature: I'm also thinking it would be very neat to be able to poll for a top-list of processes based on reds or memory, similar to :etop, but I understand if this is more niche and maybe not appropriate as a default for this lib.
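As a rough illustration of both ideas, the process count and a top-list can be read with standard VM calls. The ProcessStats module and its function names are hypothetical and are not part of telemetry_poller; this is a sketch of what such measurements might collect, not a proposed implementation.

```elixir
defmodule ProcessStats do
  # Total number of processes currently alive on the local node.
  def process_count, do: :erlang.system_info(:process_count)

  # Top `n` processes ranked by `key` (:memory or :reductions), highest first.
  # Process.info/2 returns nil for a process that exited after Process.list/0
  # was taken, so those entries are skipped.
  def top(key, n) when key in [:memory, :reductions] do
    Process.list()
    |> Enum.flat_map(fn pid ->
      case Process.info(pid, key) do
        {^key, value} -> [{pid, value}]
        nil -> []
      end
    end)
    |> Enum.sort_by(fn {_pid, value} -> value end, &>=/2)
    |> Enum.take(n)
  end
end

IO.inspect(ProcessStats.process_count(), label: "process_count")
IO.inspect(ProcessStats.top(:memory, 3), label: "top 3 by memory")
```

Note that the top-list walks every process on the node, so on systems with a very large number of processes it would likely need a longer polling period than the plain count.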
My team is creating long-living dynamically created GenServer instances (created with a DynamicSupervisor) when our assets connect to our servers. Looking at the code, and as far as I understand, it is currently not possible to dynamically use telemetry_poller to measure such processes.
Which approach is best in this case, and would it make sense to extend telemetry_poller for this use case? It seems to me that adding an add_measurement call in telemetry_poller.erl would be enough.
Edit: probably not enough, since process_info/3 takes a name, so this would not work on unnamed processes, but you get the idea.
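For comparison, the underlying VM call does accept a pid, so an unnamed process can be inspected directly. The sketch below uses an Agent as a stand-in for a dynamically started GenServer and only gathers the values; how they would be wired into telemetry_poller is left open, as in the question above.

```elixir
# Start an unnamed process, much as a DynamicSupervisor child would be.
{:ok, pid} = Agent.start_link(fn -> %{} end)

# Process.info/2 accepts a pid directly, so no registered name is needed.
# It returns nil if the process has already exited; here it is still alive.
measurements =
  pid
  |> Process.info([:message_queue_len, :memory])
  |> Map.new()

IO.inspect(measurements, label: "measurements")
```

A map like this has the shape that telemetry:execute/3 takes as its measurements argument.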
A default poller is started with telemetry_poller, responsible for emitting measurements for memory and total_run_queue_lengths. You can customize the behaviour of the default poller by setting the default key under the telemetry_poller application environment. Setting it to false disables the poller.
This makes it pretty hard to use multiple instances of :telemetry_poller (which is currently required to use multiple periods). I think a good approach would be to move it to :telemetry_poller startup options. This would also fit with the "avoid config files" approach, which is more flexible.
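For reference, disabling the default poller as the quoted documentation describes would look roughly like this in an Elixir project's config/config.exs. This is a sketch based only on the default key named in that quote.

```elixir
import Config

# Disable the default poller through the application environment.
config :telemetry_poller, default: false
```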
Now that we have standardized telemetry events to include multiple measurements, I think it becomes a bit more straightforward to provide an API that provides general process information. In particular, I would like to include process_info (with perhaps port_info and ets_info coming in the future). My biggest question is how we are going to expose this to users.
One option is to have a Telemetry.Poller.Measurements module and have users do something like this:
{Telemetry.Poller.Measurements, :process_info, [process_name, [:message_queue_len], [:my_app, :event]]}
I do find this API too verbose though. So my other proposal is something like this:
{:process_info, name: MyApp.Foo, event: [:my_app, :process_data], measurements: [:message_queue_len, :memory]}
Then I would unify vm_measurements and measurements, into a single key like this:
:memory
| :total_run_queue_lengths
| {:process_info, keyword()}
| {module(), function :: atom(), args :: list()}
And we would replace vm_measurements: :default with measurements: :default.
@arkgil thoughts?
For atoms, ets, ports and processes. We need to discuss alarms, though, because the value by itself is not enough.
We do not need to use the accurate one but the one that sums the values of all schedulers.
Wondering if this library shouldn't also be Erlang since telemetry is and this is something that will definitely be written again if telemetry catches on with Erlang.
I may ask the same regarding metrics but after discussing more today I'm not sure its use is what I was thinking, so need to think about that one some more.
It should be configurable like this:
config :telemetry_poller,
  vm_measurements: :default, # :default | [:foo, :bar] | :all
  period: ...
The default values for this poller are:
Keyword.merge([name: Telemetry.Poller, vm_measurements: :default], Application.get_all_env(:telemetry_poller))
We should change start_link to have the vm_measurements function so we unify the APIs. The vm_measurements function that exists should be removed.
@fishcakez brought to my attention that the name sampling is confusing, especially in regards to the OpenCensus nomenclature.
To be more precise, the confusion arises because this library provides a "sampling rate", which is a very specific case of sampling, compared to sampling in general.
One option is to rename this library, but we could not come up with a good name. Maybe telemetry_timer. In any case, we thought we could raise the topic for debate.
If persistent_term is available, we should emit an event with this:
http://erlang.org/doc/man/persistent_term.html#info-0
This will help us measure any bad usage of persistent_term.
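As a minimal sketch of what such an event could carry, the linked function returns a map with the number of stored terms and the memory they occupy. The code only reads the values; the event name and how the measurement would be registered are not decided here.

```elixir
# :persistent_term.info/0 returns a map with the number of stored terms
# (:count) and the total memory they use in bytes (:memory).
# The persistent_term module is available from OTP 21.2 onwards.
info = :persistent_term.info()

IO.inspect(Map.take(info, [:count, :memory]), label: "persistent_term")
```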
The telemetry_poller 0.4.0 release on Hex does not include the telemetry dependency. Version 0.3.0 does.
The problem is that when we add telemetry_poller to a project that has an older version of telemetry, like 0.3.0, there is no dependency error; it runs and then crashes, since telemetry:execute/3 changed its API.
When using telemetry_poller on OTP-22 + macOS Catalina, I would expect atom_count to be included in the system_counts event, but it's missing:
[Telemetry.Metrics.ConsoleReporter] Got new event!
Event name: vm.system_counts
All measurements: %{port_count: 18, process_count: 310}
All metadata: %{}
I suspect the problem is in rebar.config:
{erl_opts, [{platform_define, "19", 'OTP19'}, debug_info]}.
When the platform string happens to contain 19, for instance 22.0-x86_64-apple-darwin19.3.0, this matches because the regex isn't anchored to the start of the string. See: erlang/rebar3#2105 (comment)
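The effect is easy to reproduce with a plain regex match against the platform string from the report. The anchored pattern below is an assumption about one possible fix, not necessarily the change the project adopted.

```elixir
platform = "22.0-x86_64-apple-darwin19.3.0"

# Unanchored pattern: finds the "19" inside "darwin19".
IO.inspect(Regex.match?(~r/19/, platform), label: "unanchored")

# Pattern anchored to the start: only matches when the string begins with 19.
IO.inspect(Regex.match?(~r/^19/, platform), label: "anchored")
```

On this string the unanchored pattern matches and the anchored one does not, which is consistent with the OTP19 macro being defined on an OTP 22 build.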