netdata / dashboard

Netdata Agent v1 Dashboard (deprecated)

License: GNU General Public License v3.0

JavaScript 62.98% HTML 12.20% CSS 1.95% TypeScript 22.70% SCSS 0.18%

dashboard's Introduction

Monitor your servers, containers, and applications,
in high-resolution and in real-time.


Visit the Project's Home Page


MENU: WHAT IS NEW | GETTING STARTED | HOW IT WORKS | FAQ | DOCS | COMMUNITY | CONTRIBUTE

Netdata collects metrics per second and presents them in beautiful low-latency dashboards. It is designed to run on all of your physical and virtual servers, cloud deployments, Kubernetes clusters, and edge/IoT devices, to monitor your systems, containers, and applications.

It scales nicely from just a single server to thousands of servers, even in complex multi/mixed/hybrid cloud environments, and given enough disk space it can keep your metrics for years.

WHAT CAN BE MONITORED WITH NETDATA:

Netdata monitors all the following:

| Component | Linux | FreeBSD | macOS | Windows* |
|---|---|---|---|---|
| System Resources (CPU, Memory and system shared resources) | Full | Yes | Yes | Yes |
| Storage (Disks, Mount points, Filesystems, RAID arrays) | Full | Basic | Basic | Basic |
| Network (Network Interfaces, Protocols, Firewall, etc) | Full | Basic | Basic | Basic |
| Hardware & Sensors (Fans, Temperatures, Controllers, GPUs, etc) | Full | Some | Some | Some |
| O/S Services (Resources, Performance and Status) | Yes (systemd) | - | - | Basic |
| Logs | Yes (systemd-journal) | - | - | - |
| Processes (Resources, Performance, OOM, and more) | Yes | Yes | Yes | Yes |
| Network Connections (Live TCP and UDP sockets per PID) | Yes | - | - | - |
| Containers (Docker/containerd, LXC/LXD, Kubernetes, etc) | Yes | - | - | - |
| VMs, from the host (KVM, qemu, libvirt, Proxmox, etc) | Yes (cgroups) | - | - | Yes (Hyper-V) |
| Synthetic Checks (Test APIs, TCP ports, Ping, Certificates, etc) | Yes | Yes | Yes | Yes |
| Packaged Applications (nginx, apache, postgres, redis, mongodb, and hundreds more) | Yes | Yes | Yes | Yes |
| Custom Applications (OpenMetrics, StatsD) | Yes | Yes | Yes | Yes |

When Netdata runs on Linux, it monitors every kernel feature available, providing full coverage of all kernel technologies that can be monitored.

Netdata provides full enterprise hardware coverage, monitoring all components that provide hardware error reporting, like PCI AER, RAM EDAC, IPMI, S.M.A.R.T., NVMe, Fans, Power, Voltages, and more.

* Netdata runs on Linux, FreeBSD and macOS. For Windows, we rely on Windows Exporter (so a Netdata running on Linux, FreeBSD or macOS is required, next to the monitored Windows servers).

KEY CHARACTERISTICS:

  • 💥 Collects data from 800+ integrations
    Operating system metrics, container metrics, virtual machines, hardware sensors, application metrics, OpenMetrics exporters, StatsD, and logs.

  • 💪 Real-Time, Low-Latency, High-Resolution
    All data are collected per second and are on the dashboard immediately after data collection.

  • 😶‍🌫️ Unsupervised Anomaly Detection
    Trains multiple Machine-Learning (ML) models for each metric and uses AI to detect anomalies based on the past behavior of each metric.

  • 🔥 Powerful Visualization
    Fully automated dashboard providing correlated visualization of all metrics, allowing you to understand any dataset at first sight, but also to filter, slice and dice the data directly on the dashboards, without the need to learn a query language.

  • 🔔 Out-of-the-box Alerts
    Comes with hundreds of alerts out of the box to detect common issues and pitfalls, revealing issues that can easily go unnoticed. It supports several notification methods to let you know when your attention is needed.

  • 😎 Low Maintenance
    Fully automated in every aspect: automated dashboards, out-of-the-box alerts, auto-detection and auto-discovery of metrics, zero-touch machine-learning, easy scalability and high availability, and CI/CD friendly.

  • ⭐ Open and Extensible
    Netdata is a modular platform that can be extended in all possible ways and it also integrates nicely with other monitoring solutions.


💥 NEW: Network Connections Explorer 💥

The Network Connections viewer is currently in the nightly builds of Netdata!

This tool visualizes all the sockets each server has (IPv4 and IPv6, TCP and UDP). It classifies them as inbound, outbound, listen and local, and allows filtering on them.

The visualization has 4 sides:

  • public (i.e. public IPs),
  • private (i.e. private and reserved IPs),
  • servers (i.e. listening and inbound sockets),
  • clients (i.e. sockets towards other servers).

The position of each application on the chart is determined by the classification of the sockets it has. Clients are at the top, servers at the bottom, internet-facing applications on the right, and internal network applications on the left.

The size of each application in the chart is determined by the number of sockets it has, and each application is a pie chart representing the percentage of each kind of sockets it has.
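
The four sides described above can be illustrated with a minimal sketch. It is not Netdata's implementation: the function names, the `(direction, ip)` tuple shape, and the choice to leave "local" sockets unplaced are all assumptions made for illustration. It relies only on Python's standard `ipaddress` module, treating every address that is not globally routable as "private".

```python
import ipaddress
from collections import Counter
from typing import Optional


def network_side(ip: str) -> str:
    """Return 'public' for globally routable addresses, 'private' otherwise."""
    addr = ipaddress.ip_address(ip)
    # is_global is False for private, loopback and reserved ranges alike.
    return "public" if addr.is_global else "private"


def role_side(direction: str) -> Optional[str]:
    """Map a socket direction to the 'servers' or 'clients' side."""
    if direction in ("listen", "inbound"):
        return "servers"
    if direction == "outbound":
        return "clients"
    return None  # "local" sockets: their placement is an open assumption here


def summarize(sockets):
    """Count sockets per side for one application.

    sockets: iterable of (direction, ip) tuples (hypothetical shape).
    """
    counts = Counter()
    for direction, ip in sockets:
        counts[network_side(ip)] += 1
        role = role_side(direction)
        if role is not None:
            counts[role] += 1
    return dict(counts)
```

The per-side counts are the kind of input from which a chart position and a pie-chart breakdown could be derived.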


⭐ Netdata is the most energy-efficient monitoring tool ⭐

Dec 11, 2023: The University of Amsterdam published a study on the impact of monitoring tools on Docker-based systems, aiming to answer 2 questions:

  1. What is the impact of monitoring tools on the energy efficiency of Docker-based systems?
  2. What is the impact of monitoring tools on the performance of Docker-based systems?

  • 🚀 Netdata excels in energy efficiency: "... Netdata being the most energy-efficient tool ...", as the study says.
  • 🚀 Netdata excels in CPU Usage, RAM Usage and Execution Time, and has a similar impact on Network Traffic as Prometheus.

The study did not normalize the results based on the number of metrics collected. Given that Netdata usually collects significantly more metrics than the other tools, Netdata managed to outperform the other tools, while ingesting a much higher number of metrics. Read the full study here.


On the same workload, Netdata uses 35% less CPU, 49% less RAM, 12% less bandwidth, and 98% less disk I/O than Prometheus, and is 75% more disk space efficient on high resolution metrics storage. It also provides more than a year of overall retention on the same disk footprint that gives Prometheus 7 days of retention. Read the full analysis in our blog.


NEW: Netdata and LOGS! 🥳

Check out the systemd-journal plugin of Netdata, which lets you view, explore, analyze and query systemd journal logs!

Netdata actively supports and is a member of the Cloud Native Computing Foundation (CNCF)

...and due to your love ❤️, it is one of the most ⭐'d projects in the CNCF landscape!

Below is an animated image, but you can see Netdata live!
FRANKFURT | NEWYORK | ATLANTA | SANFRANCISCO | TORONTO | SINGAPORE | BANGALORE
They are clustered Netdata Parents. They all have the same data. Select the one closer to you.
All these run with the default configuration. We only clustered them to have multi-node dashboards.


Important 💡
People get addicted to Netdata. Once you use it on your systems, there's no going back!


What's New and Coming?

Our immediate development plans and a summary view of the last 12 months' releases:

| What | Description | When | Status |
|---|---|---|---|
| Netdata Cloud On-Prem | Netdata Cloud available for On-Prem installation! | available | fill this form |
| State manager monitor | Centralized and immediate visibility to the state of your apps and services. | soon | planned |
| More Customizable | Set default settings for all charts and views! | soon | in progress |
| AWS Integrated billing | Run Netdata on your AWS instances and get your billing integrated on your AWS account. | soon | in progress |
| Alert Silence Manager R2 | Improvements to the Alert Silencing Manager with recurring schedules and more! | soon | in progress |
| Okta SSO | Facilitate the integration of Netdata into your organization's user management process. | soon | in progress |
| Prometheus/OpenMetrics improvements | Allow users to configure how metrics should be ingested and presented. | soon | in progress |
| Loki logs | Another Logs integration, bring your Loki logs onto the UI! | soon | in progress |
| UCUM Units | Migrate all metrics to the Unified Code for Units of Measure. | soon | in progress |
| Dynamic Configurations | Configure Alerts and Data Collectors from the UI! | soon | Beta release v1.45 - in progress |
| WebRTC | Browser to Agent communication via WebRTC. | later | interrupted |
| Advanced Troubleshooting | Expanded view of dashboard charts integrating Metrics Correlations, Anomaly Advisor, and many more. | later | interrupted |
| Homelab plan | Unlimited Netdata plan targeted for homelabbers or students. | Feb 2024 | v1.45 |
| Easy Custom Dashboards | Drag and drop charts to create custom dashboards on the fly, while troubleshooting! | Feb 2024 | v1.45 |
| Netdata Notifications Mobile App | You can receive and manage alert and reachability notifications from your subscribed spaces. | Jan 2024 | v1.45 |
| systemd journal | View the systemd journal logs of your systems on the dashboard. | Oct 2023 | v1.43 |
| Integrations | Netdata Integrations Marketplace! | Aug 2023 | v1.42 |
| New Agent UI | Now Netdata Cloud and Netdata Agent share the same dashboard! | Jul 2023 | v1.41 |
| Summary Dashboards | High level tiles everywhere! | Jun 2023 | v1.40 |
| Machine Learning | Multiple ML models per metric. | Jun 2023 | v1.40 |
| SSL | Netdata Agent gets a new SSL layer. | Jun 2023 | v1.40 |
| New Cloud UI | Filter, slice and dice any dataset from the UI! ML-first! | May 2023 | v1.39 |
| Microsoft Windows | Monitor Windows hosts and apps! | May 2023 | v1.39 |
| Virtual Nodes | Go collectors can now be assigned to virtual nodes! | May 2023 | v1.39 |
| DBENGINE v2 | Faster, more reliable, far more scalable! | Feb 2023 | v1.38 |
| Netdata Functions | Netdata beyond metrics! Monitoring anything! | Feb 2023 | v1.38 |
| Events Feed | Live feed of events about topology changes and alerts. | Feb 2023 | v1.38 |
| Role Based Access Control | More roles, offering finer control over access to infrastructure. | Feb 2023 | v1.38 |
| Infinite Scalability | Streaming compression. Replication. Active-active clustering. | Nov 2022 | v1.37 |
| Grafana Plugin | Netdata Cloud as a data source for Grafana. | Nov 2022 | v1.37 |
| PostgreSQL | Completely rewritten, to reveal all the info, even at the table level. | Nov 2022 | v1.37 |
| Metrics Correlations | Advanced algorithms to find the needle in the haystack. | Aug 2022 | v1.36 |
| Database Tiering | Netdata gets unlimited retention! | Aug 2022 | v1.36 |
| Kubernetes | Monitor your Kubernetes workloads. | Aug 2022 | v1.36 |
| Machine Learning | Anomaly Rate information on every chart. | Aug 2022 | v1.36 |
| Machine Learning | Anomaly Advisor! Bottom-up unsupervised anomaly detection. | Jun 2022 | v1.35 |
| Machine Learning | Metrics Correlation on the Agent. | Jun 2022 | v1.35 |

Getting Started

1. Install Netdata everywhere ✌️

Netdata can be installed on all Linux, macOS, and FreeBSD systems. We provide binary packages for the most popular operating systems and package managers.

Check also the Netdata Deployment Guides to decide how to deploy it in your infrastructure.

A local dashboard is available immediately after installation. Netdata starts a web server for its dashboard at port 19999. Open up your web browser of choice and navigate to http://NODE:19999, replacing NODE with the IP address or hostname of your Agent. If installed on localhost, you can access it through http://localhost:19999.

2. Configure Collectors 💥

Netdata auto-detects and auto-discovers most operating system data sources and applications. However, many data sources require some manual configuration, usually to allow Netdata to get access to the metrics.

  • For a detailed list of the 800+ collectors available, check this guide.
  • To monitor Windows servers and applications use this guide.
  • To monitor SNMP devices check this guide.

3. Configure Alert Notifications 🔔

Netdata comes with hundreds of pre-configured alerts that automatically check your metrics as soon as they start being collected.

Netdata can dispatch alert notifications to multiple third party systems, including: email, Alerta, AWS SNS, Discord, Dynatrace, flock, gotify, IRC, Matrix, MessageBird, Microsoft Teams, ntfy, OPSgenie, PagerDuty, Prowl, PushBullet, PushOver, RocketChat, Slack, SMS tools, Syslog, Telegram, Twilio.

By default, Netdata will send e-mail notifications, if there is a configured MTA on the system.

4. Configure Netdata Parents 👪

Optionally, configure one or more Netdata Parents. A Netdata Parent is a Netdata Agent that has been configured to accept streaming connections from other Netdata agents.

Netdata Parents provide:

  • Infrastructure level dashboards, at http://parent.server.ip:19999/.

    Each Netdata Agent has an API listening at the TCP port 19999 of each server. When you hit that port with a web browser (e.g. http://server.ip:19999/), the Netdata Agent UI is presented. When the Netdata Agent is also a Parent, the UI of the Parent includes data for all nodes that stream metrics to that Parent.

  • Increased retention for all metrics of all your nodes.

    Each Netdata Agent maintains its own database of metrics. But Parents can be given additional resources to maintain a much longer database than individual Netdata Agents.

  • Central configuration of alerts and dispatch of notifications.

    Using Netdata Parents, all the alert notification integrations can be configured only once, at the Parent, and they can be disabled at the Netdata Agents.

You can also use Netdata Parents to:

  • Offload your production systems (the parents run ML, alerts, queries, etc. for all their children)
  • Secure your production systems (the parents accept user connections, for all their children)

5. Connect to Netdata Cloud ☁️

Optionally, sign-in to Netdata Cloud and claim your Netdata Agents and Parents. If you connect your Netdata Parents, there is no need to connect your Netdata Agents. They will be connected via the Parents.

When your Netdata nodes are connected to Netdata Cloud, you can (on top of the above):

  • Access your Netdata agents from anywhere
  • Access sensitive Netdata agent features (like "Netdata Functions": processes, systemd-journal)
  • Organize your infra in Spaces and Rooms
  • Create, manage, and share custom dashboards
  • Invite your team and assign roles to them (Role Based Access Control - RBAC)
  • Get infinite horizontal scalability (multiple independent Netdata Agents are viewed as one infra)
  • Configure alerts from the UI (coming soon)
  • Configure data collection from the UI (coming soon)
  • Netdata Mobile App notifications (coming soon)

🤟 Netdata Cloud does not prevent you from using your Netdata Agents and Parents directly, and vice versa.

👌 Your metrics are still stored in your network when you connect your Netdata Agents and Parents to Netdata Cloud.



How it works

Netdata is built around a modular metrics processing pipeline.

Each Netdata Agent can perform the following functions:

  1. COLLECT metrics from their sources
    Uses internal and external plugins to collect data from their sources.

    Netdata auto-detects and collects almost everything from the operating system, including CPU, Interrupts, Memory, Disks, Mount Points, Filesystems, Network Stack, Network Interfaces, Containers, VMs, Processes, systemd units, Linux Performance Metrics, Linux eBPF, Hardware Sensors, IPMI, and more.

    It collects application metrics from applications: PostgreSQL, MySQL/MariaDB, Redis, MongoDB, Nginx, Apache, and hundreds more.

    Netdata also collects your custom application metrics by scraping OpenMetrics exporters, or via StatsD.

    It can convert web server log files to metrics and apply ML and alerts to them, in real-time.

    And it also supports synthetic tests / white box tests, so you can ping servers, check API responses, or even check filesystem files and directories to generate metrics, train ML and run alerts and notifications on their status.

  2. STORE metrics to a database
    Uses database engine plugins to store the collected data, either in memory and/or on disk. We have developed our own dbengine for storing the data in a very efficient manner, allowing Netdata to have less than 1 byte per sample on disk and amazingly fast queries.

  3. LEARN the behavior of metrics (ML)
    Trains multiple Machine-Learning (ML) models per metric to learn the behavior of each metric individually. Netdata uses the k-means algorithm and by default creates one model per metric per hour, based on the values collected for that metric over the last 6 hours. The trained models are persisted to disk.

  4. DETECT anomalies in metrics (ML)
    Uses the trained machine learning (ML) models to detect outliers and mark collected samples as anomalies. Netdata stores anomaly information together with each sample and also streams it to Netdata Parents so that the anomaly is also available at query time for the whole retention of each metric.

  5. CHECK metrics and trigger alert notifications
    Uses its configured alerts (you can configure your own) to check the metrics for common issues and uses notifications plugins to send alert notifications.

  6. STREAM metrics to other Netdata Agents
    Push metrics in real-time to Netdata Parents.

  7. ARCHIVE metrics to 3rd party databases
    Export metrics to industry standard time-series databases, like Prometheus, InfluxDB, OpenTSDB, Graphite, etc.

  8. QUERY metrics and present dashboards
    Provide an API to query the data and present interactive dashboards to users.

  9. SCORE metrics to reveal similarities and patterns
    Score the metrics according to the given criteria, to find the needle in the haystack.
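
To make the LEARN and DETECT steps more concrete, here is a deliberately simplified sketch of training a k-means model on a window of samples and flagging outliers with it. It is not Netdata's implementation. The one-dimensional features, `k=2`, the "largest training distance" threshold, and all function names are assumptions chosen for illustration. Only the 6-hour training window comes from the text above.

```python
import random

TRAIN_WINDOW_SAMPLES = 6 * 3600  # 6 hours of per-second samples


def kmeans_1d(values, k=2, iterations=20, seed=0):
    """Plain k-means on scalar values; returns the list of cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center when a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers


def train(values, k=2):
    """Train on the most recent window and derive an anomaly threshold."""
    window = list(values)[-TRAIN_WINDOW_SAMPLES:]
    centers = kmeans_1d(window, k)
    # Assumption: the largest distance seen during training is the threshold.
    threshold = max(min(abs(v - c) for c in centers) for v in window)
    return centers, threshold


def is_anomalous(value, model):
    """A sample is anomalous when it is farther from every center than the threshold."""
    centers, threshold = model
    return min(abs(value - c) for c in centers) > threshold


model = train([10, 11, 9, 10, 50, 51, 49, 50])
print(is_anomalous(10.5, model), is_anomalous(30, model))
```

Retraining once per hour on a sliding window would yield one model per metric per hour, which matches the cadence described in step 3.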

When using Netdata Parents, all the functions of a Netdata Agent (except data collection) can be delegated to Parents to offload production systems.

The core of Netdata is developed in C. We have our own libnetdata, that provides:

  • DICTIONARY
    A high-performance algorithm to maintain both indexed and ordered pools of structures Netdata needs. It uses JudyHS arrays for indexing, although it is modular: any hashtable or tree can be integrated into it. Despite being in C, dictionaries follow object-oriented programming principles, so there are constructors, destructors, automatic memory management, garbage collection, and more. For more see here.

  • ARAL
    ARray ALlocator (ARAL) is used to minimize the system allocations made by Netdata. ARAL is optimized for maximum multi-threaded performance. It also allows all structures that use it to be allocated in memory-mapped files (shared memory) instead of RAM. For more see here.

  • PROCFILE
    A high-performance /proc (but also any) file parser and text tokenizer. It achieves its performance by keeping files open and adjusting its buffers to read the entire file in one call (which is also required by the Linux kernel). For more see here.

  • STRING
    A string interning mechanism, for string deduplication and indexing (using JudyHS arrays), optimized for multi-threaded usage. For more see here.

  • ARL
    Adaptive Resortable List (ARL) is a very fast list iterator that keeps the expected items on the list in the same order they are found in the input list. So, the first iteration is somewhat slower, but all the following iterations are perfectly aligned for best performance. For more see here.

  • BUFFER
    A flexible text buffer management system that allows Netdata to automatically handle dynamically sized text buffer allocations. The same mechanism is used for generating consistent JSON output by the Netdata APIs. For more see here.

  • SPINLOCK
    Like POSIX MUTEX and RWLOCK but a lot faster, based on atomic operations, with significantly smaller memory impact, while being portable.

  • PGC
    A caching layer that can be used to cache any kind of time-related data, with automatic indexing (based on a tree of JudyL arrays), memory management, evictions, flushing, pressure management. This is extensively used in dbengine. For more see here.

The above, and many more, allow Netdata developers to work on the application fast and with confidence. Most of the business logic in Netdata is a work of mixing the above.
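
As an illustration of the ARL idea, the sketch below reorders a list of expected names to follow the order in which they arrive, so that later passes over the same input need no searching. It is a Python approximation of the concept, not libnetdata's C API. The class name, its methods, the `slow_lookups` counter, and the assumption that names are unique within one pass are all hypothetical.

```python
class AdaptiveResortableList:
    """Dispatch (name, value) items to handlers while learning the input order."""

    def __init__(self):
        self._names = []       # expected names, in the order last seen in the input
        self._handlers = {}    # name -> callable(value), or None for ignored names
        self._cursor = 0       # position of the name expected next
        self.slow_lookups = 0  # items that needed a search (for demonstration)

    def expect(self, name, handler):
        self._names.append(name)
        self._handlers[name] = handler

    def begin(self):
        """Start a new pass over the input."""
        self._cursor = 0

    def check(self, name, value):
        names = self._names
        if not (self._cursor < len(names) and names[self._cursor] == name):
            # Slow path: the item is not where it was expected.
            self.slow_lookups += 1
            if name in self._handlers:
                # Known name: move it to the cursor for the next pass.
                found = names.index(name, self._cursor)
                names.insert(self._cursor, names.pop(found))
            else:
                # Unknown name: remember it (no handler) to keep passes aligned.
                names.insert(self._cursor, name)
                self._handlers[name] = None
        handler = self._handlers[name]
        if handler is not None:
            handler(value)
        self._cursor += 1


seen = {}
arl = AdaptiveResortableList()
arl.expect("MemTotal", lambda v: seen.update(MemTotal=v))
arl.expect("MemFree", lambda v: seen.update(MemFree=v))

lines = [("MemFree", 1), ("Buffers", 2), ("MemTotal", 3)]
for _ in range(2):  # two passes over input that arrives in the same order
    arl.begin()
    for name, value in lines:
        arl.check(name, value)
```

Only the first pass pays for searches. On the second pass every item matches the cursor position directly, which is the "first iteration is slower, later iterations are aligned" behavior described for ARL.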

Netdata data collection plugins can be developed in any language. Most of our application collectors though are developed in Go.

FAQ

🛡️ Is Netdata secure?

Of course it is! We do our best to ensure it is!

We understand that Netdata is a piece of software that is installed on millions of production systems across the world. So, it is important to us that Netdata is as secure as possible.

🌀 Will Netdata consume significant resources on my servers?

No. It will not! We promise this will be fast!


Although each Netdata Agent is a complete monitoring solution packed into a single application, and despite the fact that Netdata collects every metric every single second and trains multiple ML models per metric, you will find that Netdata has amazing performance! In many cases, it outperforms other monitoring solutions that have significantly fewer features or far smaller data collection rates.

This is what you should expect:

  • For production systems, each Netdata Agent with default settings (everything enabled, ML, Health, DB) should consume about 5% CPU utilization of one core and about 150 MiB of RAM.

    By using a Netdata parent and streaming all metrics to that parent, you can disable ML & health and use an ephemeral DB mode (like alloc) on the children, leading to utilization of about 1% CPU of a single core and 100 MiB of RAM. Of course, these depend on how many metrics are collected.

  • For Netdata Parents, for about 1 to 2 million metrics, all collected every second, we suggest a server with 16 cores and 32GB RAM. Less than half of it will be used for data collection and ML. The rest will be available for queries.

Netdata has extensive internal instrumentation to help us reveal how the resources consumed are used. All these are available in the "Netdata Monitoring" section of the dashboard. Depending on your use case, there are many options to optimize resource consumption.

Even if you need to run Netdata on extremely weak embedded or IoT systems, you will find that Netdata can be tuned to be very performant.


📜 How much retention can I have?

As much as you need!


Netdata supports tiering, to downsample past data and save disk space. With default settings, it has 3 tiers:

  1. tier 0, with high resolution, per-second, data.
  2. tier 1, mid-resolution, per minute, data.
  3. tier 2, low-resolution, per hour, data.

All tiers are updated in parallel during data collection. Just increase the disk space you give to Netdata to get a longer history for your metrics. Tiers are automatically chosen at query time depending on the time frame and the resolution requested.
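
The idea behind tiering can be sketched as follows. This is an illustration under stated assumptions, not Netdata's dbengine: it downsamples after the fact with a plain average (whereas the text says tiers are updated in parallel during collection), and the "coarsest tier that still yields enough points" rule in `pick_tier` is a hypothetical selection policy.

```python
from statistics import mean

# Seconds per point for tier 0, tier 1 and tier 2 (the defaults listed above).
TIER_SECONDS = [1, 60, 3600]


def downsample(samples, factor):
    """Average consecutive groups of `factor` samples; a trailing partial group is dropped."""
    full = len(samples) - len(samples) % factor
    return [mean(samples[i:i + factor]) for i in range(0, full, factor)]


def pick_tier(window_seconds, points_wanted):
    """Pick the coarsest tier that still yields at least `points_wanted` points."""
    for tier in reversed(range(len(TIER_SECONDS))):
        if window_seconds // TIER_SECONDS[tier] >= points_wanted:
            return tier
    return 0  # fall back to the highest resolution


per_second = list(range(120))             # two minutes of per-second samples
per_minute = downsample(per_second, 60)   # what a tier 1 view would hold

for window in (3600, 7 * 86400, 365 * 86400):
    print(window, "seconds ->", "tier", pick_tier(window, 500))
```

A one-hour window at 500 points needs per-second data, while a one-year window is served comfortably from the per-hour tier, which is why longer retention costs comparatively little disk space.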


🚀 Does it scale? I really have a lot of servers!

Netdata is designed to scale and can handle large volumes of data.

Netdata is a distributed monitoring solution. You can scale it to infinity by spreading Netdata servers across your infrastructure.

With the streaming feature of the Agent, we can support monitoring ephemeral servers but also allow the creation of "monitoring islands" where metrics are aggregated to a few servers (Netdata Parents) for increased retention, or for offloading production systems.

  • ✈️ Netdata Parents provide great vertical scalability, so you can have as big parents as the CPU, RAM and Disk resources you can dedicate to them. In our lab we constantly stress test Netdata Parents with several million metrics collected per second, to ensure it is reliable, stable, and robust at scale.

  • 🚀 In addition, Netdata Cloud provides virtually unlimited horizontal scalability. It "merges" all the Netdata parents you have into one unified infrastructure at query time. Netdata Cloud itself is probably the biggest single installation monitoring platform ever created, currently monitoring about 100k online servers with about 10k servers changing state (added/removed) per day!

Example: the following chart comes from a single Netdata Parent. As you can see on it, 244 nodes stream to it metrics of about 20k running containers. On this specific chart there are 3 dimensions per container, so a total of about 60k time-series queries are needed to present it.

[image: the chart described above]

💾 My production servers are very sensitive to disk I/O. Can I use Netdata?

Yes, you can!


Netdata has been designed to spread disk writes across time. Each metric is flushed to disk every 17 minutes, but metrics are flushed evenly across time, at an almost constant rate. Also, metrics are packed into bigger blocks we call extents and are compressed with LZ4 before saving them, to minimize the number of I/O operations made.

Netdata also employs direct I/O for all its database operations, ensuring optimized performance. By managing its own caches, Netdata avoids overburdening system caches, facilitating a harmonious coexistence with other applications.

Single node Agents (not Parents) should have a constant write rate of about 50 KiB/s or less, with some spikes above that every minute (flushing of tier 1) and higher spikes every hour (flushing of tier 2).

Health Alerts and Machine-Learning run queries to evaluate their expressions and learn from the metrics' patterns. These are also spread over time, so there should be an almost constant read rate too.

To make Netdata not use the disks at all, we suggest the following:

  1. Use database mode alloc or ram to disable writing metric data to disk.
  2. Configure streaming to push in real-time all metrics to a Netdata Parent. The Netdata Parent will maintain metrics on disk for this node.
  3. Disable ML and health on this node. The Netdata Parent will do them for this node.
  4. Use the Netdata Parent to access the dashboard.

Using the above, the Netdata Agent on your production system will not use a disk.
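
The steps above could translate into configuration along the lines of the sketch below, which uses Python's `configparser` to render two INI-style fragments. The section and option names (`[db] mode`, `[ml] enabled`, `[health] enabled`, and the `[stream]` options) are assumptions that may differ between Netdata versions, and `PARENT_IP` and `YOUR-API-KEY` are placeholders. Check the configuration reference for your version before using any of it.

```python
import configparser
import io


def render(sections):
    """Render a dict of {section: {option: value}} as INI-style text."""
    parser = configparser.ConfigParser()
    parser.read_dict(sections)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Assumed netdata.conf settings on the child: no metrics on disk, no ML, no health checks.
NETDATA_CONF = {
    "db": {"mode": "ram"},
    "ml": {"enabled": "no"},
    "health": {"enabled": "no"},
}

# Assumed stream.conf settings on the child: push everything to the Parent.
STREAM_CONF = {
    "stream": {
        "enabled": "yes",
        "destination": "PARENT_IP:19999",  # placeholder
        "api key": "YOUR-API-KEY",         # placeholder
    },
}

print(render(NETDATA_CONF))
print(render(STREAM_CONF))
```

With settings like these on the child, the Parent keeps the metrics on disk, runs ML and health checks, and serves the dashboard for that node.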


🤨 How is Netdata different from a Prometheus and Grafana setup?

Netdata is a "ready to use" monitoring solution. Prometheus and Grafana are tools to build your own monitoring solution.

Netdata is also a lot faster, requires significantly fewer resources and puts almost no stress on the server it runs on. For a performance comparison check this blog.

First, we have to say that Prometheus as a time-series database and Grafana as a visualizer are excellent tools for what they do.

However, we believe that such a setup is missing a key element: A Prometheus and Grafana setup assumes that you know everything about the metrics you collect and that you deeply understand how they are structured and how they should be queried and visualized.

In reality, this setup has a lot of problems. The vast number of technologies, operating systems, and applications we use in our modern stacks, makes it impossible for any single person to know and understand everything about anything. We get testimonials regularly from Netdata users across the biggest enterprises, that Netdata manages to reveal issues, anomalies and problems they were not aware of and they didn't even have the means to find or troubleshoot.

So, the biggest difference of Netdata to Prometheus, and Grafana, is that we decided that the tool needs to have a much better understanding of the components, the applications, and the metrics it monitors.

  • When compared to Prometheus, Netdata needs for each metric much more than just a name, some labels, and a value over time. A metric in Netdata is a structured entity that correlates with other metrics in a certain way and has specific attributes that depict how it should be organized, treated, queried, and visualized. We call this the NIDL (Nodes, Instances, Dimensions, Labels) framework.

    Maintaining such an index is a challenge: first, because the raw metrics collected do not provide this information, so we have to add it, and second, because we need to maintain this index for the lifetime of each metric, which, with our current database retention, is usually more than a year.

    At the same time, Netdata provides better retention than Prometheus due to database tiering, scales easier than Prometheus due to streaming, supports anomaly detection and it has a metrics scoring engine to find the needle in the haystack when needed.

  • When compared to Grafana, Netdata is fully automated. Grafana has more customization capabilities than Netdata, but Netdata presents fully functional dashboards by itself and most importantly it gives you the means to understand, analyze, filter, slice and dice the data without the need for you to edit queries or be aware of any peculiarities the underlying metrics may have.

    Furthermore, to help you when you need to find the needle in the haystack, Netdata has advanced troubleshooting tools provided by the Netdata metrics scoring engine, which can score metrics based on their anomaly rate, or their differences or similarities, for any given time frame.

Still, if you are already familiar with Prometheus and Grafana, Netdata integrates nicely with them, and we have reports from users who use Netdata with Prometheus and Grafana in production.


🤨 How is Netdata different from DataDog, New Relic, Dynatrace, X SaaS Provider?

With Netdata your data are always on-prem and your metrics are always high-resolution.


Most commercial monitoring providers face a significant challenge: they centralize all metrics to their infrastructure and this is, inevitably, expensive. It leads them to one or more of the following:

  1. be unrealistically expensive
  2. limit the number of metrics they collect
  3. limit the resolution of the metrics they collect

As a result, they try to find a balance: collect the least possible data, but collect enough to have something useful out of it.

We, at Netdata, see monitoring in a completely different way: monitoring systems should be built bottom-up and be rich in insights, so we focus on each component individually to collect, store, check and visualize everything related to each of them, and we make sure that all components are monitored. Each metric is important.

This is why Netdata trains multiple machine-learning models per metric, based exclusively on their own past (no sampling of data, no sharing of trained models) to detect anomalies based on the specific use case and workload each component is used for.

This is also why Netdata alerts are attached to components (instances) and are configured with dynamic thresholds and rolling windows, instead of static values.

The distributed nature of Netdata helps scale this approach: your data is spread inside your infrastructure, as close to the edge as possible. Netdata is not one data lane. Each Netdata Agent is a data lane and all of them together build a massive distributed metrics processing pipeline that ensures all your infrastructure components and applications are monitored and operating as they should.


🤨 How is Netdata different from Nagios, Icinga, Zabbix, etc?

Netdata offers real-time, comprehensive monitoring, with a user-friendly interface and the ability to monitor everything, without any custom configuration required.


While Nagios, Icinga, Zabbix, and other similar tools are powerful and highly customizable, they can be complex to set up and manage. Their flexibility often comes at the cost of ease-of-use, especially for users who are not systems administrators or do not have extensive experience with these tools. Additionally, these tools generally require you to know what you want to monitor in advance and configure it explicitly.

Netdata, on the other hand, takes a different approach. It provides a "ready to use" monitoring solution with a focus on simplicity and comprehensiveness. It automatically detects and starts monitoring many different system metrics and applications out-of-the-box, without any need for custom configuration.

In comparison to these traditional monitoring tools, Netdata:

  • Provides real-time, high-resolution metrics, as opposed to the often minute-level granularity that tools like Nagios, Icinga, and Zabbix provide.

  • Automatically generates meaningful, organized, and interactive visualizations of the collected data. Unlike other tools, where you have to manually create and organize graphs and dashboards, Netdata takes care of this for you.

  • Applies machine learning to each individual metric to detect anomalies, providing more insightful and relevant alerts than static thresholds.

  • Is designed to be distributed, so your data is spread inside your infrastructure, as close to the edge as possible. This approach is more scalable and avoids the potential bottleneck of a single centralized server.

  • Has a more modern and user-friendly interface, making it easy for anyone, not just experienced administrators, to understand the health and performance of their systems.

Even if you're already using Nagios, Icinga, Zabbix, or similar tools, you can use Netdata alongside them to augment your existing monitoring capabilities with real-time insights and user-friendly dashboards.

😳 I feel overwhelmed by the amount of information in Netdata. What should I do?

Netdata is designed to provide comprehensive insights, but we understand that the richness of information might sometimes feel overwhelming. Here are some tips on how to navigate and utilize Netdata effectively...


Netdata is indeed a very comprehensive monitoring tool. It's designed to provide you with as much information as possible about your system and applications, so that you can understand and address any issues that arise. However, we understand that the sheer amount of data can sometimes be overwhelming.

Here are some suggestions on how to manage and navigate this wealth of information:

  1. Start with the Overview Dashboard
    Netdata's Overview Dashboard provides a high-level summary of your system's status. We have added summary tiles to almost every section to reveal the most important information. This is a great place to start, as it can help you identify any major issues or trends at a glance.

  2. Use the Search Feature
    If you're looking for specific information, you can use the search feature to find the relevant metrics or charts. This can help you avoid scrolling through all the data.

  3. Customize your Dashboards
    Netdata allows you to create custom dashboards, which can help you focus on the metrics that are most important to you. Sign in to Netdata to create and keep your custom dashboards (coming soon to the agent dashboard too).

  4. Leverage Netdata's Anomaly Detection
    Netdata uses machine learning to detect anomalies in your metrics. This can help you identify potential issues before they become major problems. We have added an AR button above the dashboard table of contents to reveal the anomaly rate per section so that you can easily spot what could need your attention.

  5. Take Advantage of Netdata's Documentation and Blogs
    Netdata has extensive documentation that can help you understand the different metrics and how to interpret them. You can also find tutorials, guides, and best practices there.

Remember, it's not necessary to understand every single metric or chart right away. Netdata is a powerful tool, and it can take some time to fully explore and understand all of its features. Start with the basics and gradually delve into more complex metrics as you become more comfortable with the tool.

☁️ Do I have to subscribe to Netdata Cloud?

Subscribing to Netdata Cloud is optional but many users find it enhances their experience with Netdata.


The Netdata Agent dashboard and the Netdata Cloud dashboard are the same. Still, Netdata Cloud provides additional features that the Netdata Agent alone cannot offer. These include:

  1. Access to your infrastructure from anywhere.
  2. SSO to protect sensitive features.
  3. Customization (custom dashboards and other settings are persisted when you are signed in to Netdata Cloud).
  4. Configuration of alerts and data collection from the UI (coming soon).
  5. Security (role-based access control, RBAC).
  6. Horizontal scalability ("blend" multiple independent parents into one uniform infrastructure).
  7. Central dispatch of alert notifications (even when multiple independent parents are involved).
  8. Mobile app for alert notifications (coming soon).

So, although it is not required, you can get the most out of your Netdata setup by using Netdata Cloud.

We encourage you to support Netdata by buying a Netdata Cloud subscription. A successful Netdata is a Netdata that evolves and improves to provide simpler, faster and easier monitoring for all of us.

For organizations that need a fully on-prem solution, we provide Netdata Cloud for on-prem installation. Contact us for more information.

🔎 What does the anonymous telemetry collected by Netdata entail?

Your privacy is our utmost priority. As part of our commitment to improving Netdata, we rely on anonymous telemetry data from our users who choose to leave it enabled. This data greatly informs our decision-making processes and contributes to the future evolution of Netdata.

Should you wish to disable telemetry, instructions for doing so are provided in our installation guides.


Netdata is in a constant state of growth and evolution. The decisions that guide this development are ideally rooted in data. By analyzing anonymous telemetry data, we can answer questions such as: "What features are being used frequently?", "How do we prioritize between potential new features?" and "What elements of Netdata are most important to our users?"

By leaving anonymous telemetry enabled, users indirectly contribute to shaping Netdata's roadmap, providing invaluable information that helps us prioritize our efforts for the project and the community.

We are aware that for privacy or regulatory reasons, not all environments can allow telemetry. To cater to this, we have simplified the process of disabling telemetry:

  • During installation, you can append --disable-telemetry to our kickstart.sh script, or
  • Create the file /etc/netdata/.opt-out-from-anonymous-statistics and then restart Netdata.

These steps will disable the anonymous telemetry for your Netdata installation.

Please note, even with telemetry disabled, Netdata still requires a Netdata Registry for alert notifications' Call To Action (CTA) functionality. When you click an alert notification, it redirects you to the Netdata Registry, which then directs your web browser to the specific Netdata Agent that issued the alert for further troubleshooting. The Netdata Registry learns the URLs of your agents when you visit their dashboards.

Any Netdata Agent can act as a Netdata Registry. Simply designate one Netdata Agent as your registry, and our global Netdata Registry will no longer be in use. For further information on this, please refer to this guide.

😏 Who uses Netdata?

Netdata is a widely adopted project...


Browse the Netdata stargazers on GitHub to discover users from renowned companies and enterprises, such as ABN AMRO Bank, AMD, Amazon, Baidu, Booking.com, Cisco, Delta, Facebook, Google, IBM, Intel, Logitech, Netflix, Nokia, Qualcomm, Realtek Semiconductor Corp, Redhat, Riot Games, SAP, Samsung, Unity, Valve, and many others.

Netdata also enjoys significant usage in academia, with notable institutions including New York University, Columbia University, New Jersey University, Seoul National University, University College London, among several others.

Netdata is also used by numerous governmental organizations worldwide.

In a nutshell, Netdata proves invaluable for:

  • Infrastructure intensive organizations
    Such as hosting/cloud providers and companies with hundreds or thousands of nodes, who require a high-resolution, real-time monitoring solution for a comprehensive view of all their components and applications.

  • Technology operators
    Those in need of a standardized, comprehensive solution for round-the-clock operations. Netdata not only facilitates operational automation and provides controlled access for their operations engineers, but also enhances skill development over time.

  • Technology startups
    Who seek a feature-rich monitoring solution from the get-go.

  • Freelancers
    Who seek a simple, efficient and straightforward solution without sacrificing performance and outcomes.

  • Professional SysAdmins and DevOps
    Who appreciate the fine details and understand the value of holistic monitoring from the ground up.

  • Everyone else
    All of us, who are tired of the inefficiency in the monitoring industry and would love a refreshing change and a breath of fresh air. 🙂

🌐 Is Netdata open-source?

The Netdata Agent back-end is entirely open-source. We ship 3 different versions of the UI: 2 open-source versions and 1 closed-source version.


The entire back-end of the Netdata Agent is open-source, licensed under GPLv3+. We don't develop a separate enterprise version. All users, including commercial ones, use the same Netdata Agent.

The Netdata Agent is shipped with multiple UI versions:

  • http://agent.ip:19999/v0/, the original open-source single-node UI, GPLv3+.
  • http://agent.ip:19999/v1/, the latest open-source single-node UI, GPLv3+.
  • http://agent.ip:19999/v2/, a snapshot of the latest Netdata Cloud UI as it was at the time the agent was released, licensed to be distributed with Netdata Agents under NCUL1.

When you access a Netdata Agent via http://agent.ip:19999/ a splash screen attempts to use the latest live version of Netdata Cloud UI (downloaded from Cloudflare). This only happens when the web browser has internet connectivity and Netdata Cloud is not disabled at the agent configuration. Otherwise, it falls back to http://agent.ip:19999/v2/.
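
The selection logic described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `pickDashboardUrl`, its parameters, and the `cloudUiUrl` placeholder address are hypothetical and are not taken from the agent's source.

```javascript
// Hypothetical sketch of the splash-screen decision described above.
// None of these names come from the Netdata code base.
function pickDashboardUrl({ agentBase, browserOnline, cloudDisabled, cloudUiUrl }) {
  // The live Cloud UI is only attempted when the browser has internet
  // connectivity and Netdata Cloud is not disabled in the agent configuration.
  if (browserOnline && !cloudDisabled) {
    return cloudUiUrl;
  }
  // Otherwise fall back to the UI snapshot shipped with the agent.
  return `${agentBase}/v2/`;
}

console.log(
  pickDashboardUrl({
    agentBase: "http://agent.ip:19999",
    browserOnline: false,
    cloudDisabled: false,
    cloudUiUrl: "https://cloud-ui.example/", // placeholder, not a real endpoint
  })
);
```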

The Netdata Cloud UI is not open-source. But we thought that it is to the benefit of the community to allow everyone to use it directly with Netdata Agents, for free, even if Netdata Cloud is not used.

💰 What is your monetization strategy?

Netdata generates revenue through subscriptions to advanced features of Netdata Cloud and sales of on-premise and private versions of Netdata Cloud.


Netdata generates revenue from these activities:

  1. Netdata Cloud Subscriptions
    Direct funding for our project's vision comes from users subscribing to Netdata Cloud's advanced features.

  2. Netdata Cloud On-Prem or Private
    Purchasing the on-premises or private versions of Netdata Cloud supports our financial growth.

Our Open-Source Community and the free access to Netdata Cloud, contribute to Netdata in the following ways:

  • Netdata Cloud Community Use
    The free usage of Netdata Cloud demonstrates its market relevance. While this doesn't generate revenue, it reinforces trust among new users and aids in securing appropriate project funding.

  • User Feedback
    Feedback, especially issues and bug reports, is invaluable. It steers us towards a more resilient and efficient product. This, too, isn't a revenue source but is pivotal for our project's evolution.

  • Anonymous Telemetry Insights
    Users who keep anonymous telemetry enabled help us make data-informed decisions in refining and enhancing Netdata. This isn't a revenue stream, but knowing which features are used, and how, contributes to building a better product for everyone.

We don't monetize, directly or indirectly, users' or "device heuristics" data. Any data collected from community members are exclusively used for the purposes stated above.

Netdata grows financially when technology-intensive organizations and operators need, due to regulatory or business requirements, the entire Netdata suite (including Netdata Cloud) on-prem or private, bundled with top-tier support. It is a win-win for all parties involved: these companies get a battle-tested, robust and reliable solution, while the broader community that helps us build this product enjoys it at no cost.

📖 Documentation

Netdata's documentation is available at Netdata Learn.

This site also hosts a number of guides to help newer users better understand how to collect metrics, troubleshoot via charts, export to external databases, and more.

🎉 Community


Netdata is an inclusive open-source project and community. Please read our Code of Conduct.

Join the Netdata community:

Meet Up 🧑‍🤝‍🧑🧑‍🤝‍🧑🧑‍🤝‍🧑
The Netdata team and community members have regular online meetups, usually every 2 weeks.
You are welcome to join us! Click here for the schedule.

You can also find Netdata on:
Twitter | YouTube | Reddit | LinkedIn | StackShare | Product Hunt | Repology | Facebook

🙏 Contribute

Contributions are essential to the success of open-source projects. In other words, we need your help to keep Netdata great!

What is a contribution? All the following are highly valuable to Netdata:

  1. Let us know of the best-practices you believe should be standardized
    Netdata should out-of-the-box detect as many infrastructure issues as possible. By sharing your knowledge and experiences, you help us build a monitoring solution that has baked into it all the best-practices about infrastructure monitoring.

  2. Let us know if Netdata is not perfect for your use case
    We aim to support as many use cases as possible and your feedback can be invaluable. Open a GitHub issue, or start a GitHub discussion about it, to discuss how you want to use Netdata and what you need.

    Although we can't implement everything imaginable, we try to prioritize development on use-cases that are common to our community, are in the same direction we want Netdata to evolve and are aligned with our roadmap.

  3. Support other community members
    Join our community on GitHub, Discord and Reddit. Generally, Netdata is relatively easy to set up and configure, but still people may need a little push in the right direction to use it effectively. Supporting other members is a great contribution by itself!

  4. Add or improve integrations you need
    Integrations tend to be easier and simpler to develop. If you would like to contribute your code to Netdata, we suggest that you start with the integrations you need, which Netdata does not currently support.

General information about contributions:

  • Check our Security Policy.
  • Found a bug? Open a GitHub issue.
  • Read our Contributing Guide, which contains all the information you need to contribute to Netdata, such as improving our documentation, engaging in the community, and developing new features. We've made it as frictionless as possible, but if you need help, just ping us on our community forums!

Package maintainers should read the guide on building Netdata from source for instructions on building each Netdata component from the source and preparing a package.

License

Netdata is released under GPLv3+. Netdata re-distributes other open-source tools and libraries. Please check the third party licenses.

The latest Netdata UI is distributed under NCUL1. It also uses third party open source components. Check the UI third party licenses.

dashboard's People

Contributors

alexnti, allelos, andrewm4894, apostolidhs, azein, builat, burbuli8ra, christophidesp, cosmix, danshilm, dependabot[bot], ferroin, habetdin, hugovalente-pm, ilyam8, jacekkolasa, jjtsou, joelhans, ktsaou, lokerhp, mjnaderi, novykh, odyslam, rex4539, vlegakis, zack-shoylev


dashboard's Issues

Dashboard: clicking time selection causes RangeError: Invalid time value

Bug report summary

Cannot change timeframe

OS / Environment

Ubuntu host
Chrome 94.0.4606.81

Netdata version

v1.31.0-370-nightly

Installation method

script

Component Name

Dashboard

Steps To Reproduce
  1. Open dashboard, url http://xx.xx.xx.xx:19999/#after=-300;before=0;theme=slate;utc=Europe%2FHelsinki
  2. Click time selection
    => The dialog does not open and nothing works after that. See
    image

Console log shows:
react-dom.production.min.js:196 RangeError: Invalid time value
at Module.H (index.js:373)
at Lt (react-datepicker.min.js:1)
at n.renderCurrentMonth (react-datepicker.min.js:1)
at n.renderDefaultHeader (react-datepicker.min.js:1)
at n.renderHeader (react-datepicker.min.js:1)
at n.renderMonths (react-datepicker.min.js:1)
at n.value (react-datepicker.min.js:1)
at Ja (react-dom.production.min.js:181)
at Qa (react-dom.production.min.js:180)
at Tc (react-dom.production.min.js:261)

Expected behavior

User can select time

dashboard ignores /usr/share/netdata/web/version.txt

@jacekkolasa
The solution mentioned in netdata/netdata#233 (comment) no longer works.

Another bad user experience caused by this bug is that when you've installed a packaged version of netdata it's not covered by the documentation (since it's impossible to cover all distros etc) and it's not advisable to work around package management frameworks in general
https://repology.org/project/netdata/versions

Please make this optional and preferably hide the icon when disabled

Tested operating system (not relevant): FreeBSD 13
Netdata version: 1.31.0

Best regards,
Daniel

[BUG] Datetime picker not integrated with snapshots

Bug report summary

The new datetime picker is not integrated correctly with the snapshot feature.

If I change the time selection using the datetime picker, the change isn't taken into consideration when I export a snapshot. Only after I interact with the charts (pan, zoom, etc.) and then export the snapshot is the current datetime selection (the one triggered by the chart interaction) taken into consideration.

OS / Environment

Reproduced on katsuna

Netdata version

v1.31.0-366-nightly

Installation method

Don't know

Component Name

Snapshots with Datetime picker

Steps To Reproduce
  1. Check that datetime picker is for 15min
  2. Go to export a snapshot and for a precision of 1 second per point see that you have 540 points per dimension. Expected size on disk 2.6 MB, at browser memory 2.5 MB (witk pako.deflate.base64)
  3. Change the datetime picker is for 24 hours/1 day
  4. Go to export a snapshot and for a precision of 1 second per point see that you still have 540 points per dimension. Expected size on disk 2.6 MB, at browser memory 2.5 MB (witk pako.deflate.base64)
  5. Pan the chart with your mouse
  6. Go to export a snapshot and see that precision was automatically changed to 113 seconds per point, if you place it 1 second per point you see. "The snapshot will have 86676 points per dimension. Expected size on disk 324.9 MB, at browser memory 307.8 MB."
Expected behavior

The changes on datetime picker should be reflected on the snapshot feature
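
The relationship between the selected window, the precision, and the number of points reported in the steps above can be sketched as follows. This is a minimal illustration, assuming the point count is simply the window length divided by the seconds per point; `pointsPerDimension` is a hypothetical helper, not a function from the dashboard code.

```javascript
// Hypothetical helper: number of points per dimension for a time window
// (in seconds) sampled at a given precision (seconds per point).
function pointsPerDimension(afterSec, beforeSec, secondsPerPoint) {
  const duration = beforeSec - afterSec;
  if (duration <= 0 || secondsPerPoint <= 0) {
    throw new RangeError("window and precision must be positive");
  }
  return Math.ceil(duration / secondsPerPoint);
}

// A 24-hour window at 1 second per point.
console.log(pointsPerDimension(0, 24 * 60 * 60, 1)); // 86400
```

Under that assumption, a window picked in the datetime picker should feed the same calculation as a window produced by panning or zooming a chart.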

create a release on new git tag

When new git tag is created, CI/CD should trigger a pipeline:

npm install (it's important that it's npm, not yarn)
npm run build

The contents of the built /build folder should be part of the release.
Checksum file should be built, to validate the compilation
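
The checksum step could be sketched with Node's built-in `crypto` module. This is only an assumption about how the file might be produced: the issue does not name a hash algorithm or file layout, so SHA-256 and the `<digest>  <name>` line format (as used by `sha256sum`) are illustrative choices, and both helper names are hypothetical.

```javascript
const crypto = require("crypto");

// Returns the hex-encoded SHA-256 digest of a Buffer or string.
function sha256Hex(data) {
  return crypto.createHash("sha256").update(data).digest("hex");
}

// Formats one checksum line as "<digest>  <name>" (assumed layout).
function checksumLine(name, data) {
  return `${sha256Hex(data)}  ${name}`;
}

console.log(checksumLine("example.txt", "abc"));
```

In a pipeline, each file under the built folder would be read and passed through `checksumLine`, with the resulting lines written to the checksum file.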

Contributing guidelines

Hey @jacekkolasa,

I want to align this repo with the others, in the work that I am doing to remove friction in the contribution process.

Can you please walk me through the contribution process for the dashboard?

  • What are the parts that have to do with Netdata Cloud (thus the contributor does not care)
  • How should a contributor develop? (simply fork and edit files? Any hints on the structure and/or workflow?
  • From the readme I understand that I don't need to do anything on the agent installation, it talks directly with the API so I just edit my fork, it is served by some static server in the npm scripts and it connects to the agent that is running on my computer, right?

Can't plot tiny plots with netdata

Hi!

I'm trying to plot small plots on the bottom of my screen using uebersicht and netdata.

I am able to achieve the following

image

However, I would like to even further decrease the height of the box, but the issue is that the content then is cut off, instead of being scaled. Is there a solution for this?

This is what it looks like after

image

Thanks!

[Bug]: Fragment id in URL is mangled after zooming/moving/highlighting in graph area

Bug description

Every now and then I share or temporarily bookmark references to specific events, and recently (somewhere after updating to 1.30 or so) I noticed that links do not work as expected, i.e. all time ranges, zoom scale and event location in menu is reset.

For instance, I have URL like this after scrolling in or clicking on menu:

http://192.168.1.1:19999/#menu_cpu;after=0;before=0;theme=slate;utc=Europe/Berlin

Fragment is #menu_cpu, which is correct, and this URL could be easily shared with someone.

But as soon as I zoom in/out, move to different time period or use highlight, the URL is mangled - looks like sorting is applied, and fragment reference "#menu_cpu" gets a "value":

http://192.168.1.1:19999/#after=-726;before=0;menu_cpu=undefined;theme=slate;utc=Europe%2FBerlin

At this point, the location and interval are definitely lost - reload, bookmarking or sending URL to someone will set interval to last 7m and zoom will be reset - which makes URL sharing pointless unless you set highlight (clicking on highlight will display marked interval).

Navigation (at least scrolling) to other graphs will temporarily restore "proper" URL - but only until next zoom/move/highlight action.

This problem exists in 1.33.0, few previous versions, and is browser independent - reproducible in all current Edge/Chrome/Firefox.

Expected behavior

Location on page and highlight/zoom/interval should be preserved when URL is shared or bookmarked and used later or in different browser.
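
One way to avoid the `menu_cpu=undefined` result is to treat a bare token in the fragment as an anchor rather than as a key with a missing value. The sketch below illustrates that idea only; `parseFragment` and `buildFragment` are hypothetical names, and this is not the dashboard's actual URL handling.

```javascript
// Hypothetical sketch: keep a bare anchor (e.g. "menu_cpu") separate from
// key=value settings so it is never serialized as "menu_cpu=undefined".
function parseFragment(hash) {
  const parts = hash.replace(/^#/, "").split(";").filter(Boolean);
  const result = { anchor: null, params: {} };
  for (const part of parts) {
    const eq = part.indexOf("=");
    if (eq === -1) {
      result.anchor = part; // bare token, no value
    } else {
      const key = part.slice(0, eq);
      result.params[key] = decodeURIComponent(part.slice(eq + 1));
    }
  }
  return result;
}

function buildFragment({ anchor, params }) {
  const pairs = Object.entries(params).map(
    ([key, value]) => `${key}=${encodeURIComponent(value)}`
  );
  // Emit the anchor first and without "=", so it survives a round trip.
  return "#" + [anchor, ...pairs].filter(Boolean).join(";");
}

const state = parseFragment("#menu_cpu;after=0;before=0;theme=slate;utc=Europe/Berlin");
state.params.after = "-726"; // simulate a zoom or pan
console.log(buildFragment(state));
// #menu_cpu;after=-726;before=0;theme=slate;utc=Europe%2FBerlin
```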

Steps to reproduce

  1. Scroll in to any graph to update URL anchor
  2. Zoom in/out, move to another timeframe or use highlight
  3. Check the URL and open it in different browser/window/tab - it is mangled now, only highlight position is preserved (if set)

Installation method

kickstart-static64.sh

System info

/etc/lsb-release:DISTRIB_ID=Ubuntu
/etc/lsb-release:DISTRIB_RELEASE=18.04
/etc/lsb-release:DISTRIB_CODENAME=bionic
/etc/lsb-release:DISTRIB_DESCRIPTION="Ubuntu 18.04.6 LTS"

Netdata build info

Version: netdata v1.33.0
Configure options:  '--prefix=/opt/netdata/usr' '--sysconfdir=/opt/netdata/etc' '--localstatedir=/opt/netdata/var' '--libexecdir=/opt/netdata/usr/libexec' '--libdir=/opt/netdata/usr/lib' '--with-zlib' '--with-math' '--with-user=netdata' '--enable-cloud' '--without-bundled-protobuf' '--with-bundled-libJudy' 'CFLAGS=-static -O3 -I/openssl-static/include' 'LDFLAGS=-static -L/openssl-static/lib' 'PKG_CONFIG_PATH=/openssl-static/lib/pkgconfig'
Install type: kickstart-static
    Binary architecture: x86_64

Additional info

No response

add event_source = "agent dashboard" to posthog.register()

It will be useful to add an attribute called "event_source" to all agent dashboard events. We will use event_source as important metadata in PostHog and elsewhere.

So I think we can just add an "event_source" attribute set to "agent dashboard" to the posthog.register() call.
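
A minimal sketch of the proposed change follows. To keep it runnable on its own it uses a small stand-in object instead of the real PostHog client; the stand-in and the `registerEventSource` helper are hypothetical, and only the `register()` call with the `event_source` property reflects the proposal.

```javascript
// Minimal stand-in for the PostHog client so the sketch runs on its own.
// The real posthog-js client exposes register(), which attaches
// "super properties" to every subsequent event.
const posthogStub = {
  superProperties: {},
  register(properties) {
    Object.assign(this.superProperties, properties);
  },
};

// Hypothetical helper; in the dashboard the real client would be passed in.
function registerEventSource(client) {
  client.register({ event_source: "agent dashboard" });
}

registerEventSource(posthogStub);
console.log(posthogStub.superProperties);
```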

add `noreferrer` to replicated nodes links

To avoid any unwanted ip and referrer info from getting into PostHog we should add 'noreferrer' option to the replicated nodes links in the UI to avoid any risk of leakage.
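
As an illustration, a link to a replicated node could be rendered with `rel="noopener noreferrer"`. The helper names and the markup below are assumptions for the sake of the example and do not reproduce the dashboard's components.

```javascript
// Escapes text for safe use inside HTML attributes and element content.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper that renders a link to a replicated node.
// rel="noopener noreferrer" asks the browser not to send a Referer header
// and not to expose window.opener to the target page.
function replicatedNodeLink(url, label) {
  return `<a href="${escapeHtml(url)}" target="_blank" rel="noopener noreferrer">${escapeHtml(label)}</a>`;
}

console.log(replicatedNodeLink("http://node-b.example:19999/", "node-b"));
```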

illegal characters (;) in URLS (fragment)

@karlhungus commented on Fri Aug 16 2019

Bug report summary

netdata uses a ; as a separator for its fragments. This trips up some security software. RFC 3986 describes valid characters after fragments.

OS / Environment

N/A

Netdata version (output of netdata -V)

1.10.0 (but also head)

Component Name

web

Steps To Reproduce

Visit a dashboard on the provided docker image; the URL contains ;, e.g. http://localhost:19999/#;theme=slate

Expected behavior

Ideally replace semicolons with something valid like .

Related Issue netdata/netdata#3819


@thiagoftsm commented on Fri Aug 16 2019

Hi @karlhungus ,

The Netdata webserver follows the RFC, using '&' as separator. This specific feature is used in our dashboard, which will be changed in the near future. I will discuss the problem you are reporting with our team, but when I tried to access Netdata (I am using the latest version) I did not have any problem seeing the charts and working with the software.

Best regards!


@ilyam8 commented on Fri Aug 16 2019

The Netdata webserver follows the RFC using '&' as separator

That separator is for queries; the issue is about the fragment separator.

rel: https://stackoverflow.com/questions/566276/what-two-separator-characters-would-work-in-a-url-anchor

cc: @jacekkolasa

Basic Dashboard Issues (tooltips and more)

Please use a tooltip library that has a tip indicating the place it refers to.


Screenshot from 2021-09-17 18-03-13

  • Change the tooltip to:
Sign in to manage all your systems and applications together,
have infrastructure wide composite dashboards, create
custom dashboards, central dispatch of alarm notifications,
intelligent troubleshooting features and more.

Screenshot from 2021-09-17 18-09-40

Now there are 2 tooltips:

  • on the critical badge
  • on the warning badge

Both of the above badges are clickable and they open the alerts modal.

  • Please add a tooltip on the ring icon See all health checks and make it clickable.

The following appears on hover to the caret next to play/pause button:

Screenshot from 2021-09-17 18-14-08

  • Change it to Manually select play mode

When the caret is clicked the following appears:

Screenshot from 2021-09-17 18-16-11

Please add a tooltip for each option:

  • Play -> Automatically refresh charts with new data
  • Pause -> Don't auto-refresh the charts
  • Force Play -> Automatically refresh, even when you work on other windows

Screenshot from 2021-09-17 18-22-37

There is no tooltip on the timezone picker.

  • Add tooltip: Select the timezone of the dashboard

  • There is no tooltip on the hostname (clicking the hostname should be somewhat smart, especially when the node is claimed - TBD)
  • There is no tooltip on the netdata logo (it should be clickable too to lead to our site)
  • There is no tooltip on the caret to expand the tab on the left.
  • There are no tooltips on the space selection list (neither when signed in, nor when signed out)
  • The tooltip on "help" has a different tooltip engine
  • The tooltip on "Settings" has a different tooltip engine
  • The tooltip on "Profile" has a different tooltip engine and it says User Settings. It should say User profile

[BUG] Constant gaps in all charts

Describe the bug

I recently installed a fresh copy of Ubuntu LTS 20.04 with a virtualmin server configured on it, running a couple websites. I’m having a strange issue with the netdata dashboard on both the agent and cloud side of things where every minute or so I get a gap of a few seconds of lost data - on everything. I’ve tried googling this issue and cannot seem to find other occurrences like mine.

The odd thing is - my server is hardly being utilized. It is using 5% CPU at max, and has plenty of RAM and bandwidth available. The gaps also do appear to be pretty consistent, appearing every 30-35 seconds. Does anyone have any ideas about what may be causing this?

User reported bug on forums: https://community.netdata.cloud/t/constant-gaps-in-all-charts/2342

To Reproduce
Steps to reproduce the behavior:
N/A

Expected behavior

No gap is expected to appear on charts when no valid cause seems to exist.

Screenshots

Check forum thread

Error logs

Check forum thread

Desktop (please complete the following information):

  • OS: [e.g. iOS]
  • Browser [e.g. chrome, safari]
  • Version [e.g. 22]

Additional context
Add any other context about the problem here.

Missing units

Netdata dashboard is missing the following conversion:

'bytes/s': {
            'bytes/s': 1,
            'kilobytes/s': 1024,
            'megabytes/s': 1024 * 1024,
            'gigabytes/s': 1024 * 1024 * 1024,
            'terabytes/s': 1024 * 1024 * 1024 * 1024
        },

 'B/s': {
            'B/s': 1,
            'KiB/s': 1024,
            'MiB/s': 1024 * 1024,
            'GiB/s': 1024 * 1024 * 1024,
            'TiB/s': 1024 * 1024 * 1024 * 1024
        },

The PR netdata/netdata#9333 fixes this for the old dashboard, but we also need to fix it here.
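
To show how a table of this shape is typically used, here is a small sketch that picks the largest unit not exceeding the value and scales to it. The `unitScales` object mirrors the `B/s` entry above; the `autoScale` helper is hypothetical and is not the dashboard's own conversion routine.

```javascript
// Conversion table in the shape shown above (divisors relative to B/s).
const unitScales = {
  "B/s": {
    "B/s": 1,
    "KiB/s": 1024,
    "MiB/s": 1024 * 1024,
    "GiB/s": 1024 * 1024 * 1024,
    "TiB/s": 1024 * 1024 * 1024 * 1024,
  },
};

// Hypothetical helper: pick the largest unit whose divisor does not exceed
// the absolute value, then scale the value to it.
function autoScale(value, baseUnit) {
  const scales = unitScales[baseUnit];
  if (!scales) return { value, unit: baseUnit }; // unknown unit: leave as-is
  let best = baseUnit;
  for (const [unit, divisor] of Object.entries(scales)) {
    if (Math.abs(value) >= divisor && divisor >= scales[best]) best = unit;
  }
  return { value: value / scales[best], unit: best };
}

console.log(autoScale(3 * 1024 * 1024, "B/s")); // { value: 3, unit: 'MiB/s' }
```

Without an entry for the base unit, a helper like this has nothing to scale with and the raw value is shown unchanged, which matches the symptom of the missing conversion.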

Horizontal scrollbar while netdata is loading

Bug report summary

While netdata is loading and before the CSS fully kicks in, there's a horizontal scrollbar on the page. I believe this happens before bootstrap css loads, and there's no border-box styles, which causes the padding on body to overflow.

image

OS / Environment

Windows 10, Chrome 83 and Firefox 77

Netdata version

netdata v1.22.1-218-nightly

Component Name

Loading UI.

Steps To Reproduce
  1. Open https://london.my-netdata.io/default.html
  2. Observe the loading overlay
Expected behavior

No horizontal scrollbar.

Turn posthog on for all agent dashboards and disable GA

As we are happy with the PostHog data from our demo servers we are now ready to turn it on by default for all agent dashboards.

I think we should:

  • Turn it on for all agent dashboards.
  • Disable the GA code but leave it in place should we need to re-enable it (the idea being that we can then fully remove it later as part of a final cleanup).

@jacekkolasa does that sound ok to you? Or is it better to just take out the GA code now, since it would be easy to add it back in if needed?

Whatever you think is best.

Agent dashboard top bar gets cut with lower resolutions

Bug report summary

When opening the agent dashboard and you are on lower resolution (below 1330px) the top bar elements start to get cut.
Note: This issue was also reported on the Cloud

OS / Environment

Not relevant

Netdata version

v1.31.0-447-nightly

Installation method

Not relevant

Component Name

Dashboard

Steps To Reproduce
  1. Open the agent dashboard
  2. Using browsers device toolbar, go to a resolution below 1330px width
  3. See that Sign in button, if you are not signed in, gets cut
Expected behavior

Ideally the top bar should be fully responsive, but some other improvements could be made to minimize this:

  1. The spacing of the header element can be improved by lowering the whitespace/padding between elements
  2. The date-time picker can be improved by auto-resizing to fit its contents. Also, the "▢️ Playing" can be shortened by removing the label since the visual metaphor is sufficient.
  3. Below 1280px width resolution the left navigation menu will collapse. When expanded, it will act as an overlay, i.e. it will not "push" the elements of the page. --> this could be a "nice to have"

Incorrect time display in charts past midnight (24:01:30... till 01:00)

@aldem commented on Mon Jul 13 2020

Bug report summary

In the dashboard charts, times between midnight and 01:00 are reported as 24:01:30 and so on instead of 00:01:30:
image

I see it in the most recent nightly (v1.23.1-77-g013c9da6) but not in v1.23.1 stable, and it affects all displayed charts, not only one.

OS / Environment
Linux pve 5.4.44-2-pve #1 SMP PVE 5.4.44-2
Debian GNU/Linux 10 (buster) (Proxmox)
Netdata version

v1.23.1-77-g013c9da6

Component Name

web (probably)

Steps To Reproduce

Not sure if this is relevant, but after installing the nightly version I deleted all dbengine data, so it was a clean start.

My configuration has nothing special:

[global]
        history = 90000
        memory mode = dbengine
        page cache size = 512
        dbengine disk space = 2048
        access log = none

[plugins]
        ebpf = no
        ebpf_process = no
Expected behavior

Display time since midnight properly, not as 24:..


@cakrit commented on Mon Oct 26 2020

I upped the prio to medium, this shouldn't be hard to fix.


@jacekkolasa commented on Tue Oct 27 2020

This is related to local browser settings, timezone and language.
@aldem Do you still experience this issue? If yes, can you please share which timezone settings you have in the Dashboard? It's under the settings/locale/"Show browser local time or server time?" section

Can you also tell me which browser you are using and check your detected language by running navigator.language in the browser console?


@aldem commented on Tue Oct 27 2020

@jacekkolasa Yes, I still do (at least in 1.26 stable, didn't try nightlies yet).

Indeed, the problem is probably browser related - I have no issue with Firefox but it is present in Chrome and Edge (all most recent). Language is detected as "en" (Edge) and "en-US" (Chrome). Timezone is set to server which is in turn set to CET (though this is also client's side time zone).


@jacekkolasa commented on Tue Nov 17 2020

I need to prioritise it lower since I cannot recreate it, and perhaps the issue doesn't cause that many problems.
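
One possible explanation, which this thread does not confirm, is hour-cycle handling in `Intl.DateTimeFormat`: with `hour12: false`, some Chromium-based browsers have rendered the hour after midnight as `24` for `en-US`, while `hourCycle: "h23"` explicitly requests a 00 to 23 clock. The sketch below assumes chart times are formatted through `Intl.DateTimeFormat`; the function name `formatChartTime` is hypothetical and is not taken from the dashboard code.

```typescript
// Hypothetical helper: format a timestamp as HH:MM:SS on a 00-23 clock.
// Assumption: the dashboard renders times through Intl.DateTimeFormat.
function formatChartTime(date: Date, locale: string, timeZone: string): string {
  const options = {
    hour: "2-digit",
    minute: "2-digit",
    second: "2-digit",
    // "h23" asks for hours 00-23, so the first hour of the day is "00".
    hourCycle: "h23",
    timeZone,
  } as Intl.DateTimeFormatOptions;
  return new Intl.DateTimeFormat(locale, options).format(date);
}

// 00:01:30 UTC on the day the issue was reported.
const justAfterMidnight = new Date(Date.UTC(2020, 6, 13, 0, 1, 30));
console.log(formatChartTime(justAfterMidnight, "en-US", "UTC"));
```

If this is indeed the cause, it would also be consistent with the report that Firefox is unaffected while Chrome and Edge are.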

[Bug]: agent dashboard `after` and `before` url params not respected

Bug description

If i try to load an agent dashboard url with specific after and before params they get overwritten on page load with -420 and 0 respectively.

link to internal slack thread on this: https://netdata-cloud.slack.com/archives/CQELFFSUB/p1642003156001200

Expected behavior

The specific after and before url params take precedence.
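
A minimal sketch of that precedence, assuming the parameters arrive as `key=value` pairs separated by `;` or `&` in the URL hash (the exact URL format is an assumption here). The name `resolveTimeWindow` is hypothetical; the defaults of -420 and 0 are the values mentioned in the bug description.

```typescript
interface TimeWindow {
  after: number;
  before: number;
}

// Defaults from the bug description: the values that currently overwrite
// whatever the URL carried.
const DEFAULT_WINDOW: TimeWindow = { after: -420, before: 0 };

// Hypothetical helper: values from the URL win; a default applies only when
// the parameter is missing or is not a finite number.
function resolveTimeWindow(hash: string): TimeWindow {
  const result: TimeWindow = { after: DEFAULT_WINDOW.after, before: DEFAULT_WINDOW.before };
  const pairs = hash.replace(/^#/, "").split(/[;&]/);
  for (const pair of pairs) {
    const [key, value] = pair.split("=");
    if (value === undefined || value === "") continue;
    const parsed = Number(value);
    if (!isFinite(parsed)) continue;
    if (key === "after") result.after = parsed;
    if (key === "before") result.before = parsed;
  }
  return result;
}

console.log(resolveTimeWindow("#menu_system;after=1642000000000;before=1642003156000"));
```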

Steps to reproduce

  1. scroll to a specific url with an after and before populated.
  2. copy and paste the url into a new tab.
  3. refresh the page - you do not go to the after and before you expected.
    ...

Installation method

kickstart.sh

System info

no

Netdata build info

no

Additional info

No response

[BUG] Chart still plays when export snapshot modal is open

Bug report summary

Reproduced on katsuna

OS / Environment

Reproduced on katsuna

Netdata version

v1.31.0-366-nightly

Installation method

Don't know

Component Name

Snapshots with Play/Pause controls

Steps To Reproduce
  1. Check that your chart is playing
  2. Open the export snapshot modal
  3. Chart continues to play on the background
Expected behavior

The chart should be paused while the user has the export snapshot modal open; the modal refers to the period the user was viewing when they opened it.

mask referrer attributes in posthog.

We should have a list of "allowed_referrer_domains"

e.g.

allowed_referrer_domains = ['$direct', 'www.google.com', 'duckduckgo.com', 'app.netdata.cloud', 'www.reddit.com', 'my-netdata.io', 'github.com', 'www.netdata.cloud', 'staging.netdata.cloud']
  • If $initial_referring_domain is not in allowed_referrer_domains then we should hash the values of $initial_referrer and $initial_referring_domain and set them in the posthog.register().
  • If $referring_domain is not in allowed_referrer_domains then we should hash the values of $referrer and $referring_domain and set them in the posthog.register().

This will avoid any data we don't want like ip addresses etc from coming in via those attributes.
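
The rule above could be sketched roughly as follows. This is only an illustration: `hashValue` is a placeholder, non-cryptographic hash used to keep the example self-contained, and `maskReferrer` and `ReferrerProps` are hypothetical names. The result would then be passed to posthog.register() under the corresponding property names.

```typescript
// Allow-list copied from the issue text.
const ALLOWED_REFERRER_DOMAINS: string[] = [
  "$direct", "www.google.com", "duckduckgo.com", "app.netdata.cloud",
  "www.reddit.com", "my-netdata.io", "github.com", "www.netdata.cloud",
  "staging.netdata.cloud",
];

// Placeholder hash (djb2-style, 32-bit). Not cryptographic; a real
// implementation would likely use a stronger hash.
function hashValue(value: string): string {
  let hash = 5381;
  for (let i = 0; i < value.length; i++) {
    hash = ((hash << 5) + hash + value.charCodeAt(i)) | 0;
  }
  return (hash >>> 0).toString(16);
}

interface ReferrerProps {
  referrer: string;
  referringDomain: string;
}

// Hypothetical helper: keep allowed referrers as-is, hash everything else.
function maskReferrer(props: ReferrerProps): ReferrerProps {
  if (ALLOWED_REFERRER_DOMAINS.indexOf(props.referringDomain) !== -1) {
    return props;
  }
  return {
    referrer: hashValue(props.referrer),
    referringDomain: hashValue(props.referringDomain),
  };
}

console.log(maskReferrer({ referrer: "http://192.168.1.10:19999/", referringDomain: "192.168.1.10:19999" }));
```

The same function would be applied twice, once to the initial referrer pair and once to the current referrer pair.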

units conversion (seconds): do not convert negative values to positive

Convert seconds to time current behavior:

-1 => 00:00:01

Expected behavior:

-1 => -00:00:01

My use case: Redis has the rdb_current_bgsave_time_sec metric, which means

Duration of the on-going RDB save operation if any

-1 means there is no on-going operation, but on the dashboard I see it as 00:00:01, which suggests there is an on-going operation and its duration is 1 second.

I may be missing something, and there may be a need to take the absolute value, but I think it is much better not to do it, because the current output is misleading.
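
A minimal sketch of the requested behavior, assuming a plain HH:MM:SS output. The names `secondsToTime` and `pad2` are hypothetical and this is not the dashboard's actual conversion code; the only point is that the sign is kept and applied to the formatted magnitude.

```typescript
// Hypothetical helper: left-pad a non-negative integer to two digits.
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Format seconds as HH:MM:SS, keeping a leading "-" for negative input
// instead of silently converting it to a positive duration.
function secondsToTime(seconds: number): string {
  const sign = seconds < 0 ? "-" : "";
  const total = Math.floor(Math.abs(seconds));
  const hours = Math.floor(total / 3600);
  const minutes = Math.floor((total % 3600) / 60);
  const secs = total % 60;
  return sign + pad2(hours) + ":" + pad2(minutes) + ":" + pad2(secs);
}

console.log(secondsToTime(-1)); // -00:00:01
console.log(secondsToTime(1));  // 00:00:01
```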

Slave overwrite menu

When I select a slave on the menu:
err9

Instead of opening in the main screen, the response overwrites the menu:

err10

Disabling proc plugin causes snapshot save to report error

@KickerTom commented on Mon Nov 02 2020

Bug report summary

When I disable plugin:proc using netdata.conf configuration and then try saving a snapshot using the default dashboard.js, I receive an error that one graph failed to be saved.
This seems to be caused by this section in index.html (starting line 124), hardcoding access to system.intr data:

    <div id="masthead" style="display: none;">
        <div class="container">
            <div class="row">
                <div class="col-md-7">
                    <h1>Netdata
                        <p class="lead">Real-time performance monitoring, in the greatest possible detail</p>
                    </h1>
                </div>
                <div class="col-md-5">
                    <div class="well well-lg">
                        <div class="row">
                        <div class="col-md-6">
                            <b>Drag</b> charts to pan.
                            <b>Shift + wheel</b> on them, to zoom in and out.
                            <b>Double-click</b> on them, to reset.
                            <b>Hover</b> on them too!
                            </div>
                        <div class="col-md-6">
                            <div class="netdata-container" data-netdata="system.intr" data-chart-library="dygraph" data-dygraph-theme="sparkline" data-dygraph-type="line" data-dygraph-strokewidth="3" data-dygraph-smooth="true" data-dygraph-highlightcirclesize="6" data-after="-90" data-height="60px" data-colors="#C66"></div>
                            </div>
                        </div>
                    </div>
                </div>
            </div>
        </div>
    </div>

For my use-case, I have commented out this section entirely, but that's not a good generic solution. Is there any reason the system.intr reference is hardcoded here?

Netdata version

1.26.0, but present in master as well.

Component Name

web

Steps To Reproduce
  1. Add this to netdata.conf:
[plugins]
proc = no

  2. Restart netdata
  3. Try exporting a snapshot; you will receive the error "1 failed to be downloaded".

@stelfrag commented on Fri Apr 02 2021

Hi @KickerTom

Sorry it took so long to get back to you 😞 I tried this in the latest master and the issue seems to be resolved. Could you please confirm so we can close this or investigate further if this doesn't work for you?


@KickerTom commented on Tue Apr 06 2021

I am sorry to bring the bad news :-), but it still fails for me. What I did:

  • clean clone of master
  • build (autoreconf -ivf; ./configure --prefix=/opt/netdata --localstatedir=/tmp/netdata --disable-ebpf --disable-cloud --enable-https=no --enable-dbengine=no; make)
  • install (make install)
  • copy example netdata.conf from system folder in sources, add the section disabling proc plugin
  • start netdata, let it run for couple of seconds
  • try exporting snapshot

What I got was red line in the dialog "1 charts have failed to be downloaded" and when the progress line is full, I get a popup saying "1 failed to be downloaded". Tested with Firefox and Chromium, both on Linux.

Let me know if you need anything else.


@ilyam8 commented on Tue Apr 06 2021

@stelfrag i confirm the issue

using the default dashboard.js

noticed that sentence and decided to check our old dashboard (/old)

Screenshot 2021-04-06 at 19 33 58


@stelfrag commented on Tue Apr 06 2021

Confirmed on the old dashboard .. thank you @KickerTom !


@stelfrag commented on Tue Apr 06 2021

@jacekkolasa can this be addressed?


@KickerTom commented on Tue Apr 06 2021

Just out of curiosity - I am using the default dashboard installed by make install. If this is "the old" dashboard, where is "the new" one, and why it is not the default one? :-)


@Ferroin commented on Wed Apr 07 2021

Just out of curiosity - I am using the default dashboard installed by make install. If this is "the old" dashboard, where is "the new" one, and why it is not the default one? :-)

The ‘new’ one can be found at https://github.com/netdata/dashboard. The same code is also used as part of the Netdata Cloud UI, so it got split to a separate repo so it could be more easily shared in both places.

As of right now, it’s installed correctly by our install scripts (packaging/installer/kickstart.sh and netdata-installer.sh), but not by a plain manual install with make install. The lack of make install support is honestly something we should fix though.

@jacekkolasa @stelfrag I think the cleanest way to actually handle the dashboard properly is to make it a submodule (given that we are now obviously allowing usage of them) and have it handled by existing build system instead of the installer. That would get us the proper up-to-date dashboard in our release tarballs, which would also mean that most third-party packages would have it as well, as well as allowing make install to properly install the current version of the dashboard.


@KickerTom commented on Thu Apr 08 2021

Hmm, more bad news :-). I just tried the new dashboard, and the problem is still there.
To make sure I haven't done anything stupid, I used the official release installation script (netdata-v1.30.0.gz.run). You need to make sure netdata does not start automatically at all after install. Then you modify the configuration and disable the proc plugin, and then you can start netdata and try exporting a snapshot.
If you had proc plugin running in the past, or started during install, you need to clear your history data (by deleting the local db, caches etc). Otherwise even the old data in the db are enough to hide this problem.


@ilyam8 commented on Thu Apr 08 2021

This is the case, i confirm.


@odyslam commented on Thu Apr 08 2021

Thanks @KickerTom for helping us figure this out.

@stelfrag @jacekkolasa I second @Ferroin's proposal in the sense that we need to address this discrepancy if we really want to remove unnecessary friction from Netdata installation via different methods (e.g. Homebrew, distro packages, etc.)


@jacekkolasa commented on Thu Apr 15 2021

@KickerTom Indeed, there is a bug (on new Dashboard, too), your instructions are very accurate, thanks!
FTR it still allows using snapshots, the problem is only that annoying alert popup.

@Ferroin re: submodules we should also consult with @novykh and @netdata/cloud-fe , i like the idea.


@Ferroin commented on Fri Apr 16 2021

@jacekkolasa Do you want to set up a meeting to discuss this some time next week? The requirements for getting the dashboard integrated as a submodule are not exactly trivial, but should not be particularly difficult either, and it would be good to make sure everyone is on the same page about what would be needed before we start trying to decide if we want to do it or not.

convert URL in the `info` field to a link (href) when rendering Netdata Alarms

I want to add a link to the alarm info field.

My use case is notifying a user that he uses a deprecated collector, it's gonna be removed soon and he needs to migrate his configuration to the new collector. And I want to add a link to the new collector documentation.


plaintext link

info: You are using deprecated python.d sensors module that will be removed in v1.30.3. \
      Migrate your configuration to the go.d apache module \
      (https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/apache).

The result is not good because the right part (warning when, calculation, etc.) doesn't fit on the screen if the link is long.

Screenshot 2021-04-21 at 16 30 23

using HTML tags

info: You are using deprecated python.d sensors module that will be removed in <code>v1.30.3</code>. \
      Migrate your configuration to the \
      <a href="https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/apache" target="_blank">go.d apache</a> module.

Looks perfect. However, it doesn't really work, because using HTML tags breaks some notification methods (for instance Telegram, where you get a 400).

Screenshot 2021-04-21 at 16 35 25

[Request] convert URL in the info field to a link (href)

info: You are using deprecated python.d sensors module that will be removed in <code>v1.30.3</code>. \
      Migrate your configuration to the go.d apache module (https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/apache).

What I want to get on the Dashboard:

You are using the deprecated python.d sensors module that will be removed in v1.30.3.
Migrate your configuration to the go.d apache module (link).
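
The request could be sketched on the rendering side roughly as below, so that the alarm config stays plain text (and notification methods keep working) while the dashboard turns URLs into anchors. The names `linkifyInfo` and `escapeHtml` are hypothetical, and the URL pattern is a deliberately simple assumption, not a full URL parser. Note that this sketch escapes any HTML in the info text rather than rendering it.

```typescript
// Escape characters that are significant in HTML text and attributes.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: replace each plain-text URL with an <a> element whose
// visible text is `label`, escaping everything else.
function linkifyInfo(info: string, label: string = "link"): string {
  const urlPattern = /https?:\/\/[^\s()<>"]+/g;
  let html = "";
  let lastIndex = 0;
  let match: RegExpExecArray | null;
  while ((match = urlPattern.exec(info)) !== null) {
    // Trailing sentence punctuation is treated as text, not part of the URL.
    const url = match[0].replace(/[.,!?]+$/, "");
    html += escapeHtml(info.slice(lastIndex, match.index));
    html += '<a href="' + escapeHtml(url) + '" target="_blank" rel="noopener noreferrer">'
      + escapeHtml(label) + "</a>";
    lastIndex = match.index + url.length;
  }
  return html + escapeHtml(info.slice(lastIndex));
}

console.log(linkifyInfo(
  "Migrate your configuration to the go.d apache module " +
  "(https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/apache)."
));
```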

Text font size is far too small to read without large zooming on gauges when using an iPad.

@TheOldMan2000 commented on Sat Oct 17 2020

Bug report summary

Although responsive, the text that accompanies dashboard gauges is far too small to read when using a standard iPad.
(Gauges shown at top in System Overview and those recently restored in Go.d Web_Log)

OS / Environment

ipad OS 14x

Netdata version

v1.26.0-6-g33c7a110

Component Name

web

Steps To Reproduce

General usage.

Expected behavior

Larger gauges to make better use of available horizontal space with increased font size to ensure readable text labels.

A wonderful example of node dep hell

From peer dependencies not being installed, precompiled versions not avail, to weirdness like this.

Building: /usr/local/bin/node /tmp/dashboard/node_modules/node-gyp/bin/node-gyp.js rebuild --verbose --libsass_ext= --libsass_cflags= --libsass_ldflags= --libsass_library=
gyp info it worked if it ends with ok
gyp verb cli [
gyp verb cli   '/usr/local/bin/node',
gyp verb cli   '/tmp/dashboard/node_modules/node-gyp/bin/node-gyp.js',
gyp verb cli   'rebuild',
gyp verb cli   '--verbose',
gyp verb cli   '--libsass_ext=',
gyp verb cli   '--libsass_cflags=',
gyp verb cli   '--libsass_ldflags=',
gyp verb cli   '--libsass_library='
gyp verb cli ]
gyp info using node-gyp@3.8.0
gyp info using node@14.18.1 | freebsd | x64
gyp verb command rebuild []
gyp verb command clean []
gyp verb clean removing "build" directory
gyp verb command configure []
gyp verb check python checking for Python executable "/usr/local/bin/python3.8" in the PATH
gyp verb `which` succeeded /usr/local/bin/python3.8 /usr/local/bin/python3.8
gyp ERR! configure error
gyp ERR! stack Error: Command failed: /usr/local/bin/python3.8 -c import sys; print "%s.%s.%s" % sys.version_info[:3];
gyp ERR! stack   File "<string>", line 1
gyp ERR! stack     import sys; print "%s.%s.%s" % sys.version_info[:3];
gyp ERR! stack                       ^
gyp ERR! stack SyntaxError: invalid syntax
gyp ERR! stack
gyp ERR! stack     at ChildProcess.exithandler (child_process.js:383:12)
gyp ERR! stack     at ChildProcess.emit (events.js:400:28)
gyp ERR! stack     at maybeClose (internal/child_process.js:1058:16)
gyp ERR! stack     at Process.ChildProcess._handle.onexit (internal/child_process.js:293:5)
gyp ERR! System FreeBSD 13.0-STABLE
gyp ERR! command "/usr/local/bin/node" "/tmp/dashboard/node_modules/node-gyp/bin/node-gyp.js" "rebuild" "--verbose" "--libsass_ext=" "--libsass_cflags=" "--libsass_ldflags=" "--libsass_library="
gyp ERR! cwd /tmp/dashboard/node_modules/node-sass
gyp ERR! node -v v14.18.1
gyp ERR! node-gyp -v v3.8.0
gyp ERR! not ok

I have many more, all hail node!

poor Dashboard->Alarms performance when there are a lot of alarms under the same section (family)

@ilyam8 commented on Mon Apr 05 2021

Bug report summary

It takes 6-7 seconds to load netdata alarms when there are a lot of alarms under the same section.

Screenshot 2021-04-05 at 15 39 57

Exact number of alarms in my test is 100

[pc ilyam]# curl -s http://127.0.0.1:19999/api/v1/alarms?all | jq ".alarms" | jq length
100

I was trying to add alarms for systemd unit states. The systemdunits collector creates a chart per unit type. The number of dimensions in each chart equals the number of units of that type (up to 90 on my server).

I added the following alarm

template: systemd_service_units_state
      on: systemd.service_units_state
  lookup: max -1s foreach *
   units: ok/failed
   every: 10s
    warn: $this != nan AND $this == 5
   delay: down 5m multiplier 1.5 max 1h
    info: systemd service crashed
      to: sysadmin

And noticed performance problems.

OS / Environment
Linux pc 5.10.23-1-MANJARO #1 SMP PREEMPT Thu Mar 11 18:47:18 UTC 2021 x86_64 GNU/Linux
release [Click]
/etc/arch-release:Manjaro Linux
/etc/lsb-release:DISTRIB_ID=ManjaroLinux
/etc/lsb-release:DISTRIB_RELEASE=21.0
/etc/lsb-release:DISTRIB_CODENAME=Ornara
/etc/lsb-release:DISTRIB_DESCRIPTION="Manjaro Linux"
/etc/manjaro-release:Manjaro Linux
/etc/os-release:NAME="Manjaro Linux"
/etc/os-release:ID=manjaro
/etc/os-release:ID_LIKE=arch
/etc/os-release:BUILD_ID=rolling
/etc/os-release:PRETTY_NAME="Manjaro Linux"
/etc/os-release:ANSI_COLOR="32;1;24;144;200"
/etc/os-release:HOME_URL="https://manjaro.org/"
/etc/os-release:DOCUMENTATION_URL="https://wiki.manjaro.org/"
/etc/os-release:SUPPORT_URL="https://manjaro.org/"
/etc/os-release:BUG_REPORT_URL="https://bugs.manjaro.org/"
/etc/os-release:LOGO=manjarolinux
Netdata version

Version: netdata v1.30.0-3-g0068b7c1

buildinfo [Click]
Configure options:  '--prefix=/opt/netdata/usr' '--sysconfdir=/opt/netdata/etc' '--localstatedir=/opt/netdata/var' '--libexecdir=/opt/netdata/usr/libexec' '--libdir=/opt/netdata/usr/lib' '--with-zlib' '--with-math' '--with-user=netdata' '--disable-cloud' 'CFLAGS=-O2' 'LDFLAGS='
Features:
    dbengine:                YES
    Native HTTPS:            YES
    Netdata Cloud:           NO (by user request)
    TLS Host Verification:   YES
Libraries:
    jemalloc:                NO
    JSON-C:                  YES
    libcap:                  YES
    libcrypto:               YES
    libm:                    YES
    LWS:                     YES shared-lib
    mosquitto:               YES
    tcalloc:                 NO
    zlib:                    YES
Plugins:
    apps:                    YES
    cgroup Network Tracking: YES
    CUPS:                    YES
    EBPF:                    YES
    IPMI:                    NO
    NFACCT:                  NO
    perf:                    YES
    slabinfo:                YES
    Xen:                     NO
    Xen VBD Error Tracking:  NO
Exporters:
    AWS Kinesis:             NO
    GCP PubSub:              NO
    MongoDB:                 NO
    Prometheus Remote Write: YES
Installation method

From source.

Component Name

web/gui

Steps To Reproduce

Easy to reproduce using go.d/example collector

  1. activate go.d/example collector
# go.d.conf

modules:
  example: yes
  2. configure go.d/example job
# go.d/example.conf

jobs:
  - name: myexample
    charts:
      num: 5
      dimensions: 20
  3. create an alarm
# /etc/netdata/health.d/example.conf

template: my_example
      on: example.random
  lookup: max -1s foreach *
   units: ok/failed
   every: 10s
    warn: $this != nan AND $this == nan
   delay: down 5m multiplier 1.5 max 1h
    info: example
      to: sysadmin
  4. Restart the netdata service, go to the Dashboard->Alarms section
Expected behavior

It doesn't take 6-7 seconds to load netdata alarms when there are a lot of alarms under the same section. I expect it to be fast (less than a second?).


@ilyam8 commented on Mon Apr 05 2021

@jacekkolasa I'm not 100% sure, but I think this is a web-related issue, so I'm assigning it to you. If you need help reproducing the problem (see Steps To Reproduce), let me know.

Rename release file to include version number

Hello! I am the package maintainer for netdata in MacPorts. I noticed that the release tarball for the dashboard is named simply dashboard.tar.gz, without any version information in the file name.

Unfortunately, this causes a few complications for us (and likely for other package management software): MacPorts caches source tarballs and uses the version information to both 1) be able to store multiple versions in our cache and 2) detect when a new version is released. When a source tarball does not contain version information, we lose the ability to do both of these things.

Would you consider renaming the release tarballs to dashboard-${version}.tar.gz?

Thanks!

Add some metadata event attributes to posthog implementation

We would like to add the following event attributes, derived from /api/v1/info, as additional event attributes sent with each posthog event. This will provide a lot more context about a user's session to help with product analytics and usage statistics.

  • netdata_version = version
  • mirrored_host_count = count of mirrored_hosts
  • alarms_normal = alarms.normal
  • alarms_warning = alarms.warning
  • alarms_critical = alarms.critical
  • host_os_name = os_name
  • host_os_id = os_id
  • host_os_id_like = os_id_like
  • host_os_version = os_version
  • host_os_version_id = os_version_id
  • host_os_detection = os_detection
  • system_cores_total = cores_total
  • system_total_disk_space = total_disk_space
  • system_cpu_freq = cpu_freq
  • system_ram_total = ram_total
  • system_kernel_name = kernel_name
  • system_kernel_version = kernel_version
  • system_architecture = architecture
  • system_vitrualization = virtualization
  • system_virt_detection = virt_detection
  • system_container = container
  • system_container_detection = container_detection
  • container_os_name = container_os_name
  • container_os_id = container_os_id
  • container_os_id_like = container_os_id_like
  • container_os_version = container_os_version
  • container_os_version_id = container_os_version_id
  • host_collectors_count = count of collectors
  • host_cloud_enabled = cloud-enabled
  • host_cloud_available = cloud-available
  • host_agent_claimed = agent-claimed
  • host_aclk_available = aclk-available
  • host_is_parent = _is_parent from host_labels
  • mirrored_hosts_reachable = Count of reachable=true from mirrored_hosts_status
  • mirrored_hosts_unreachable = Count of reachable=false from mirrored_hosts_status
  • host_collector_modules = '|' separated string concatenated list of plugin from collectors e.g. apps.plugin|python.d.plugin|proc.plugin|proc.plugin… there should be one for each collector; duplicates are OK in the string, as we want the order to match that of host_collector_plugins.
  • host_collector_plugins = '|' separated string concatenated list of module from collectors e.g. |redis|/proc/diskstats|/proc/softirqs|/proc/uptime … notice the “ ” at the start of that list - that's OK.

The event attributes can be set at the same time of the initial posthog.register call that we use to mask things like ip and host etc.

https://github.com/netdata/dashboard/blob/posthog-test/src/domains/global/sagas.ts#L86

This will mean that important metadata like netdata version etc will now be available as part of the agent telemetry and so will make things like spotting anomalies and bugs in usage behaviour by netdata version etc much much easier.
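
As a rough illustration of the mapping, the sketch below derives a handful of the listed attributes from an /api/v1/info response. The `AgentInfo` interface is a partial, assumed shape whose field types are guesses, and `buildEventAttributes` is a hypothetical name; the resulting object would be merged into the properties passed to posthog.register().

```typescript
// Partial, assumed shape of the /api/v1/info response. Only fields named in
// the list above are included, and their types are guesses.
interface AgentInfo {
  version: string;
  mirrored_hosts: string[];
  alarms: { normal: number; warning: number; critical: number };
  os_name: string;
  kernel_version: string;
  collectors: { plugin: string; module: string }[];
  "cloud-enabled": boolean;
}

// Hypothetical helper: map a subset of the info payload to event attributes.
function buildEventAttributes(info: AgentInfo): { [key: string]: string | number | boolean } {
  return {
    netdata_version: info.version,
    mirrored_host_count: info.mirrored_hosts.length,
    alarms_normal: info.alarms.normal,
    alarms_warning: info.alarms.warning,
    alarms_critical: info.alarms.critical,
    host_os_name: info.os_name,
    system_kernel_version: info.kernel_version,
    host_collectors_count: info.collectors.length,
    host_cloud_enabled: info["cloud-enabled"],
  };
}

// Made-up sample payload, for illustration only.
const sampleInfo: AgentInfo = {
  version: "v1.30.0",
  mirrored_hosts: ["parent", "child-1"],
  alarms: { normal: 98, warning: 2, critical: 0 },
  os_name: "Manjaro Linux",
  kernel_version: "5.10.23-1-MANJARO",
  collectors: [{ plugin: "proc.plugin", module: "/proc/uptime" }],
  "cloud-enabled": false,
};
console.log(buildEventAttributes(sampleInfo));
```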

add person_id to `posthog.register()`

  • If the person_id is available then we should pass it as the distinct_id when calling posthog.register(). If it is not available then we should just not set any distinct_id and so fallback to the default one from PostHog
  • We should also add an event attribute in the posthog.register called netdata_person_id and if the person_id is not available or known then pass the string "Unavailable" as the event attribute.
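
The two rules above might look like this in code. `buildIdentityProps` and `IdentityProps` are hypothetical names; the sketch only builds the properties object and leaves the actual posthog.register() call to the caller.

```typescript
interface IdentityProps {
  distinct_id?: string;
  netdata_person_id: string;
}

// Hypothetical helper: include distinct_id only when a person_id is known.
function buildIdentityProps(personId?: string | null): IdentityProps {
  if (personId) {
    return { distinct_id: personId, netdata_person_id: personId };
  }
  // No distinct_id key at all, so PostHog can fall back to its default one.
  return { netdata_person_id: "Unavailable" };
}

console.log(buildIdentityProps("person-123"));
console.log(buildIdentityProps());
```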

PostHog dev dashboard v2.12.0-alpha.5 release

Hello @jacekkolasa

I just wanted to make a ticket for the changes I think we want to look at in another iteration on the PostHog stuff.

  1. We want to use the default PostHog cookies so i think all we need to do for this is to not set distinct_id here and hoping then the posthog library will handle it.
  2. Can we add a new event attribute called "netdata_machine_guid" = machineGuid in the posthog.register call (since we are removing it in the bullet point above)
  3. Can we add logic whereby PostHog is only turned on for demo servers and some dev servers that we can define in some way? Not sure what best option here is but the thinking is we would like to have it enabled for demo servers and our own few specific development machines on launch. If this is too messy or complicated i'm happy to just not do it, and instead maybe we just have our own internal version that just adds our dev servers to the list of demo servers. So not worth time if this is a complicated thing to do and all throwaway effort.
  4. Manolis Vasilakis (@MrZammler) is working to add a few more config-based event attributes to the agent backend events, and I am asking him if we can also make them available via the /api/v1/info endpoint. If so, then we might have a few more event attributes we want to pull from there to add to the posthog.register call too. I'll update the issue when I have a list of what we want to add. (we will do this in a later dashboard release instead)

Any questions just give me a ping.

unmask certain event attributes on demo sites

For demo sites we actually should not mask certain event attributes and understanding a bit more about our demo site traffic will actually be very useful for various things.

So for demo sites we should just not register the masked values for the following attributes: $current_url, $pathname, $host

Instead we should just let PostHog capture them in its normal way.

I think we can do this by just not setting those attributes in the posthog.register() when on netdata demo sites.
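
A small sketch of that idea, under the assumption that the masked values are set through a single properties object. `buildMaskedProps`, the `isDemoSite` flag, and the mask value are all hypothetical; how a demo site is detected is left open here.

```typescript
// Attributes named in the issue text.
const MASKED_ATTRIBUTES: string[] = ["$current_url", "$pathname", "$host"];

// Hypothetical helper: on demo sites return no overrides, so PostHog captures
// the real values; elsewhere override each attribute with the mask value.
function buildMaskedProps(isDemoSite: boolean, maskValue: string): { [key: string]: string } {
  const props: { [key: string]: string } = {};
  if (isDemoSite) {
    return props;
  }
  for (const attribute of MASKED_ATTRIBUTES) {
    props[attribute] = maskValue;
  }
  return props;
}

console.log(buildMaskedProps(false, "masked"));
console.log(buildMaskedProps(true, "masked"));
```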

Can't view replicated node

When trying to open netdata for child from parent netdata I see nothing but this error in JS console:

Screenshot

Parent netdata version: v1.31.0-265-ge7cd69dba
Child netdata version: v1.19.0-10-g48525f9

Packaging dashboard

I'm trying to package the new netdata dashboard, but it seems that the build process does not produce a dashboard_info.js. Is this expected? Where should I obtain this file from?

Problematic handling of sub-second durations

@cakrit commented on Fri Jan 29 2021

Durations in certain charts can be returned in a strange precision (hundreds of nanoseconds), but our charts can only display durations in seconds, causing the following:

image

The specific charts are from Redis (collected via the prometheus collector) in our production.

posthog send `host_collectors` as an array

It's looking like it will be better and easier to maintain and validate if we just send the json from /api/v1/info/ collectors as a string to PostHog.

It can then be wrangled further down the pipeline and will be easier for data validation using this approach.

Issues detected on dashboard

I executed the following tests using Google Chrome and Mozilla Firefox:

  • I scrolled the Netdata dashboard from top to bottom to verify that all charts were displayed correctly.
  • I used the direct zoom on charts using the keyboard, and I also used + and - to zoom.
  • I tried to open alarms to visualize them.
  • I raised alarms to verify that alarm centralization was working.
  • I had one slave running to be sure that moving from one host to another would not create problems with the charts.
  • I hid dimensions and selected multiple dimensions on different charts.
  • I repeated the previous tests with the browser console open to check for possible problems with JS variables.

During the tests with latest dashboard the following issues were found:

  • Small charts used to describe cpu utilization are on top of text.
  • When I try to open Alarms using dashboard, the first time that I click on link it raises the error Cannot load the required JS library: bootstrap-table, but when I close the message and the alarm and I try to access again, everything works.
  • On the right side menu, eBPF and Applications have the same image, but eBPF is not showing.
  • When dashboard tries to access badge.svg?chart=system.cpu&alarm=10min_cpu_iowait&refresh=auto:1 I receive NS_ERROR_FAILURE.
  • At different times I see the message scrollToId is not defined on onclick events. I have now received a different message for this variable: Uncaught ReferenceError: scrollToId is not defined

wrong message?

text: "This agent cannot connect to Netdata Cloud. Please talk to system's administrator for"
+ " more information.",

This message has 2 issues:

  1. This agent cannot connect to Netdata Cloud, what does it mean? The browser cannot connect to Netdata Cloud or the agent is not connected to netdata cloud (lost ACLK)?

  2. Please talk to system's administrator for more information. assumes that the user is not the system administrator and it is funny and awkward if the user is indeed the admin. So, do not make assumptions on corporate processes and procedures. Just state the problem and let the user decide how to deal with it.
