naemon / naemon-livestatus

Naemon - Livestatus Eventbroker Module

License: GNU General Public License v2.0

naemon-livestatus's Introduction

This is a fork of Mathias Kettner's mk_livestatus. It has been ported to
Naemon by OP5 and contains some extra features, such as sorting and page offsets.
This branch is maintained by the Naemon team until everything we need has
been merged back upstream.

To find out more about mk_livestatus, see
 http://mathias-kettner.de/checkmk_livestatus.html

=== Differences between mk_livestatus and Naemon's fork of livestatus ===

Three additions have been made to the Livestatus Query Language:

-----
Sort: <column name> <asc/desc>

Sorts the result set by the specified column in the given direction. Multiple
Sort lines can be added; the first Sort line takes precedence.

Example:

GET hosts
Sort: last_hard_state_change desc

-----
Offset: <number of lines>

Number of lines to skip from the beginning of the result set. Useful for
pagination in combination with the Limit header: for example, Limit: 100 with
Offset: 300 returns rows 301-400, i.e. the fourth page of 100 rows.

Example:

GET services
Sort: host_name asc
Sort: description asc
Limit: 100
Offset: 300
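
For illustration, a client can page through results by advancing Offset by the
page size until a short page comes back. A minimal shell sketch, assuming the
unixcat client and a socket at /var/cache/naemon/live (both appear elsewhere
in this document, but vary per installation):

 #!/bin/sh
 SOCKET=/var/cache/naemon/live
 LIMIT=100
 OFFSET=0
 while :; do
     # Fetch one page of services, sorted so pages are stable between calls.
     PAGE=$(printf 'GET services\nSort: host_name asc\nLimit: %d\nOffset: %d\n' \
         "$LIMIT" "$OFFSET" | unixcat "$SOCKET")
     [ -z "$PAGE" ] && break                                  # past the end
     printf '%s\n' "$PAGE"
     [ "$(printf '%s\n' "$PAGE" | wc -l)" -lt "$LIMIT" ] && break  # short page: done
     OFFSET=$((OFFSET + LIMIT))
 done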

-----
OutputFormat: wrapped_json

An extension to the JSON output format.
The result set is packed in a JSON object with a couple of possible fields:
- columns: an array of column names (optional).
- data: an array of arrays describing the result set, in the same syntax as
  the common json output, but without embedded column names.
- total_count: the number of lines in the result set, disregarding the
  truncation applied by the Limit and Offset headers.
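
For illustration, a short wrapped_json query and a response of roughly the
shape it produces (host names and counts are made up; whether the columns
field appears depends on the query):

GET hosts
Columns: name state
Limit: 2
OutputFormat: wrapped_json

{"columns":["name","state"],"data":[["alpha",0],["beta",1]],"total_count":538}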

=== About Naemon ===

To find out more about Naemon, see https://www.naemon.io

=== INSTALL ===

To install, run:
 autoreconf -s
 automake --add-missing
 ./configure CPPFLAGS=-I$(path to naemon include files, usually /usr/local/naemon/include)
 make
 sudo make install
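
To load the module, add a broker_module line to the Naemon configuration and
restart. A sketch using paths that appear elsewhere in this document; the
module path and socket location vary per platform and packaging:

 broker_module=/usr/lib/naemon/naemon-livestatus/livestatus.so /var/cache/naemon/live
 event_broker_options=-1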

naemon-livestatus's People

Contributors: ageric, catharsis, chifac08, dereckson, dhoffend, jacobbaungard,
jdagemark, llange, mathiaskettner, ozamosi, pbiering, psharmaop5, rhagman,
sjoegren, sni

naemon-livestatus's Issues

naemon-livestatus crashes on CentOS 8 if "Sort: name asc" is used

On CentOS 8 with packages installed from Consol Stable Repo:

rpm -qa | egrep '(naemon|thruk)' | sort
libnaemon-1.2.0-0.x86_64
libthruk-2.34-0.x86_64
naemon-1.2.0-0.noarch
naemon-core-1.2.0-0.x86_64
naemon-livestatus-1.2.0-0.x86_64
naemon-thruk-1.2.0-0.noarch
thruk-2.34-2.x86_64
thruk-base-2.34-2.x86_64
thruk-plugin-reporting-2.34-2.x86_64

Thruk causes naemon to crash. I manually reduced the query; the issue is the "Sort: name asc" line:

echo -e "GET hosts\nColumns: address alias\nLimit: 100\nSort: name
asc\nOutputFormat: wrapped_json\nResponseHeader: fixed16" | unixcat
/var/cache/naemon/live

Running naemon in the foreground shows this as the last line before the crash:

/usr/include/c++/8/bits/stl_vector.h:932: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = void*; _Alloc = std::allocator<void*>; std::vector<_Tp, _Alloc>::reference = void*&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n < this->size(), true)' failed.

On CentOS 7 the same query behaves fine.

2 processes listening on livestatus socket

I've previously commented on this livestatus issue but probably should have opened a new one here instead. Sorry.

Basically, the problem I see is that even in a fresh install without any custom configuration except for the TCP livestatus socket, after a systemctl reload naemon, there are two processes listening:

vagrant@bookworm:~$ sudo netstat -tupan | grep -e Recv -e naemon
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:6557          0.0.0.0:*               LISTEN      4067/naemon         
tcp        3      0 0.0.0.0:6557          0.0.0.0:*               LISTEN      4072/naemon   

One of them is not responding (waiting to be reaped? although top does not say
it's a zombie). As a result, Thruk sometimes behaves erratically, saying the
backend is down, etc.

This is the config:

vagrant@bookworm:~$ cat /etc/naemon/module-conf.d/livestatus.cfg 
# Naemon config
broker_module=/usr/lib/naemon/naemon-livestatus/livestatus.so inet_addr=0.0.0.0:6557 debug=1
event_broker_options=-1

This can be easily reproduced in vagrant.

$ vagrant init debian/bookworm64
$ vagrant up
$ vagrant ssh

Next, copy these commands into a script and execute it.

vagrant@bookworm:~$ wget -O reproduce https://github.com/naemon/naemon-livestatus/files/13950328/reproduce.txt
vagrant@bookworm:~$ chmod +x reproduce
vagrant@bookworm:~$ ./reproduce

This should result in something like

<installation>


----------
Restarting

tcp        0      0 0.0.0.0:6557            0.0.0.0:*               LISTEN      5408/naemon         

naemon,5408 --daemon /etc/naemon/naemon.cfg
  ├─naemon,5409 --worker /var/lib/naemon/naemon.qh
  ├─naemon,5410 --worker /var/lib/naemon/naemon.qh
  ├─naemon,5411 --worker /var/lib/naemon/naemon.qh
  ├─naemon,5412 --worker /var/lib/naemon/naemon.qh
  └─naemon,5413 --daemon /etc/naemon/naemon.cfg

systemd,5367 --user
  └─(sd-pam),5368


----------------------
Reloading until broken

Success.
tcp        0      0 0.0.0.0:6557            0.0.0.0:*               LISTEN      5408/naemon         
tcp        0      0 0.0.0.0:6557            0.0.0.0:*               LISTEN      5413/naemon         

naemon,5408 --daemon /etc/naemon/naemon.cfg
  ├─naemon,5413 --daemon /etc/naemon/naemon.cfg
  ├─naemon,5434 --worker /var/lib/naemon/naemon.qh
  ├─naemon,5435 --worker /var/lib/naemon/naemon.qh
  ├─naemon,5436 --worker /var/lib/naemon/naemon.qh
  └─naemon,5437 --worker /var/lib/naemon/naemon.qh

systemd,5367 --user
  └─(sd-pam),5368

---------------------------------
Running "GET status" every second, response size 0 is not good:
2024-01-16T13:08:05+00:00 976
2024-01-16T13:08:06+00:00 976
2024-01-16T13:08:07+00:00 0
2024-01-16T13:08:10+00:00 977
2024-01-16T13:08:11+00:00 0
^C

Notice that process 5413 already exists when naemon is first started, but only after the reload does it also start listening on that socket.

My current workaround is to restart instead of reload after each config change, but this takes a lot longer than reloading (rather large config). Or I should go back to xinetd.

Livestatus query containing sort causes naemon process SIGABRT

After upgrading from naemon 1.0.7 and Thruk 2.19 (a long-overdue upgrade on my part), the naemon process terminates with a SIGABRT when accessing status information from Thruk. Some sections display the information as expected, while others report "connection refused" for the livestatus socket (obviously due to the crash above). I tracked the issue down to livestatus being sent a query containing a sort option. I captured the query sent just before the crash; removing the sort options and sending it again (via unixcat to the livestatus socket) returned data, while adding the sort options back resulted in a backend process crash. Not sure if this is a livestatus or naemon-core issue.

Current versions

Naemon = 1.0.8
Thruk = 2.22

Query that causes the crash (not limited to this one; any query with a sort seems to be affected):

GET services
Columns: accept_passive_checks acknowledged action_url action_url_expanded active_checks_enabled check_command check_interval check_options check_period check_type checks_enabled comments current_attempt current_notification_number description event_handler event_handler_enabled custom_variable_names custom_variable_values execution_time first_notification_delay flap_detection_enabled groups has_been_checked high_flap_threshold host_acknowledged host_action_url_expanded host_active_checks_enabled host_address host_alias host_checks_enabled host_check_type host_latency host_plugin_output host_perf_data host_current_attempt host_check_command host_comments host_groups host_has_been_checked host_icon_image_expanded host_icon_image_alt host_is_executing host_is_flapping host_name host_notes_url_expanded host_notifications_enabled host_scheduled_downtime_depth host_state host_accept_passive_checks host_last_state_change icon_image icon_image_alt icon_image_expanded is_executing is_flapping last_check last_notification last_state_change latency low_flap_threshold max_check_attempts next_check notes notes_expanded notes_url notes_url_expanded notification_interval notification_period notifications_enabled obsess_over_service percent_state_change perf_data plugin_output process_performance_data retry_interval scheduled_downtime_depth state state_type modified_attributes_list last_time_critical last_time_ok last_time_unknown last_time_warning display_name host_display_name host_custom_variable_names host_custom_variable_values in_check_period in_notification_period host_parents long_plugin_output
Filter: host_name = ..*
Sort: host_name asc
Sort: description asc
OutputFormat: wrapped_json
ResponseHeader: fixed16

The same query without the sorts works fine.

GET services
Columns: accept_passive_checks acknowledged action_url action_url_expanded active_checks_enabled check_command check_interval check_options check_period check_type checks_enabled comments current_attempt current_notification_number description event_handler event_handler_enabled custom_variable_names custom_variable_values execution_time first_notification_delay flap_detection_enabled groups has_been_checked high_flap_threshold host_acknowledged host_action_url_expanded host_active_checks_enabled host_address host_alias host_checks_enabled host_check_type host_latency host_plugin_output host_perf_data host_current_attempt host_check_command host_comments host_groups host_has_been_checked host_icon_image_expanded host_icon_image_alt host_is_executing host_is_flapping host_name host_notes_url_expanded host_notifications_enabled host_scheduled_downtime_depth host_state host_accept_passive_checks host_last_state_change icon_image icon_image_alt icon_image_expanded is_executing is_flapping last_check last_notification last_state_change latency low_flap_threshold max_check_attempts next_check notes notes_expanded notes_url notes_url_expanded notification_interval notification_period notifications_enabled obsess_over_service percent_state_change perf_data plugin_output process_performance_data retry_interval scheduled_downtime_depth state state_type modified_attributes_list last_time_critical last_time_ok last_time_unknown last_time_warning display_name host_display_name host_custom_variable_names host_custom_variable_values in_check_period in_notification_period host_parents long_plugin_output
Filter: host_name = ..*
OutputFormat: wrapped_json
ResponseHeader: fixed16

Naemon sigsegv when using thruk logcache

Hello,

I'm seeing quite strange behavior with one of our naemon instances.

The naemon daemon segfaults when the logcache update runs.

Here is the gdb backtrace:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7e48700 (LWP 13796)]
strlen () at ../sysdeps/x86_64/strlen.S:106
106	../sysdeps/x86_64/strlen.S: No such file or directory.
(gdb) bt
#0  strlen () at ../sysdeps/x86_64/strlen.S:106
#1  0x00007ffff67b9e6c in LogEntry::serviceStateToInt(char*) () from /usr/lib/naemon/naemon-livestatus/livestatus.so
#2  0x00007ffff67ba33f in LogEntry::handleNotificationEntry() () from /usr/lib/naemon/naemon-livestatus/livestatus.so
#3  0x00007ffff67ba549 in LogEntry::LogEntry(unsigned int, char*) () from /usr/lib/naemon/naemon-livestatus/livestatus.so
#4  0x00007ffff67ba8a4 in Logfile::processLogLine(unsigned int, unsigned int) () from /usr/lib/naemon/naemon-livestatus/livestatus.so
#5  0x00007ffff67ba9e5 in Logfile::loadRange(_IO_FILE*, unsigned int, TableLog*, long, long, unsigned int) () from /usr/lib/naemon/naemon-livestatus/livestatus.so
#6  0x00007ffff67baae0 in Logfile::load(TableLog*, long, long, unsigned int) () from /usr/lib/naemon/naemon-livestatus/livestatus.so
#7  0x00007ffff67bad17 in Logfile::answerQueryReverse(Query*, TableLog*, long, long, unsigned int) () from /usr/lib/naemon/naemon-livestatus/livestatus.so
#8  0x00007ffff67bf444 in TableLog::answerQuery(Query*) () from /usr/lib/naemon/naemon-livestatus/livestatus.so
#9  0x00007ffff678b6af in Store::answerGetRequest(InputBuffer*, OutputBuffer*, char const*) () from /usr/lib/naemon/naemon-livestatus/livestatus.so
#10 0x00007ffff678b9fb in Store::answerRequest(InputBuffer*, OutputBuffer*) () from /usr/lib/naemon/naemon-livestatus/livestatus.so
#11 0x00007ffff678ab29 in store_answer_request () from /usr/lib/naemon/naemon-livestatus/livestatus.so
#12 0x00007ffff67c4073 in client_thread () from /usr/lib/naemon/naemon-livestatus/livestatus.so
#13 0x00007ffff7bc7064 in start_thread (arg=0x7ffff7e48700) at pthread_create.c:309
#14 0x00007ffff6e3e62d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

The issue seems to happen here:

char *last = s + strlen(s) - 1;
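
For illustration, if s can be NULL here (e.g. a malformed log line leaves the
state field unset), strlen(s) dereferences NULL exactly as in the backtrace. A
hypothetical guard of the following shape would avoid it; this is a sketch of
the failure mode, not the project's actual fix:

 #include <cstring>

 static int service_state_to_int(const char *s)
 {
     if (s == nullptr || *s == '\0')
         return 3;                         // treat a missing state as UNKNOWN
     const char *last = s + strlen(s) - 1; // safe: s is non-empty here
     if (*last >= '0' && *last <= '3')     // numeric state such as "2"
         return *last - '0';
     return 3;                             // state names would be mapped here
 }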

I've tried various versions of naemon from stable and testing, and I see the same behavior.

A second naemon with the same versions, but not the same generated config, does not crash.

I suspect something strange between the generated config (by Nconf), Thruk, livestatus and naemon, but I don't really know how to investigate further.

Missing statehist table

Hey, just a quick question.
Was the statehist table present in the original livestatus taken out of the naemon-livestatus version, or was it added to livestatus later and never ported to naemon-livestatus?

Tks.

livestatus.so segfault

We're running naemon 1.0.0-1 (I know, not quite the latest).

It's been running fine for a couple of weeks with no issues, then this morning, we got a restart. Investigations show a segfault with livestatus.so.

Apr 6 13:02:02 myhost kernel: [567203.918161] naemon[59627]: segfault at 20 ip 00007f8259609c50 sp 00007f825a967e08 error 4 in livestatus.so[7f82595e7000+6d000]

Performance stats show that we make approx 360 NEB callbacks/second. 453k callbacks since the restart approx 4 hours ago.

The installed rpm is a build from Feb 13:

[ahorn@myhost]$ rpm -qi naemon-livestatus
Name : naemon-livestatus Relocations: (not relocatable)
Version : 1.0.0 Vendor: Naemon Core Development Team
Release : 1.el6 Build Date: Fri 13 Feb 2015 07:06:18 PM UTC
Install Date: Mon 30 Mar 2015 11:23:08 PM UTC Build Host: mon-build-centos6-0-64.virt.consol.de
Group : Applications/System Source RPM: naemon-1.0.0-1.el6.src.rpm
Size : 470136 License: GPLv2
Signature : (none)
Packager : Naemon Core Development Team [email protected]
URL : http://www.naemon.org/
Summary : Naemon Livestatus Eventbroker Module
Description :
contains the naemon livestatus eventbroker module.

I'm happy to gather any other information that may help, let me know.

Build naemon-livestatus on non-GNU environment

To be able to build naemon-livestatus everywhere, there are some blockers:

  • A forgotten header, see #118
  • Remove register keywords deprecated in C++11, removed from C++17, see #119
  • Handle GNU-only glibc pthread_tryjoin_np

The last one is less trivial: commit 1bd7a3b uses pthread_tryjoin_np, an addition to the pthreads functions offered only in glibc and missing from standard pthreads; internally, that function relies on glibc-only helpers such as __pthread_clockjoin_ex.

It would be possible to include them when glibc is missing, as the licenses of glibc and naemon-livestatus are compatible, but that's a lot of code. Could someone share a link to the MON-9123 issue (it seems to be a JIRA one?) so I can look at the original problem this code change wanted to fix?
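
For illustration, a portable fallback for pthread_tryjoin_np could have the
worker set an atomic flag just before returning, so the manager only calls the
blocking pthread_join once it would return almost immediately. A hypothetical
sketch of that idea, not what naemon-livestatus actually ships:

 #include <atomic>
 #include <cerrno>
 #include <pthread.h>

 struct Worker {
     pthread_t tid;
     std::atomic<bool> finished{false};
 };

 static void *worker_main(void *arg)
 {
     Worker *w = static_cast<Worker *>(arg);
     // ... handle the client connection ...
     w->finished.store(true, std::memory_order_release);
     return nullptr;
 }

 // Mirrors the pthread_tryjoin_np contract: 0 on success and the thread is
 // reaped, EBUSY while it is still running.
 static int try_join(Worker *w, void **retval)
 {
     if (!w->finished.load(std::memory_order_acquire))
         return EBUSY;
     return pthread_join(w->tid, retval);
 }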

Missing files when running autoreconf -s

Hi!

[root@sup-dev-rvn0002 ~/naemon-livestatus]# autoreconf -s

configure.ac:47: error: required file './ar-lib' not found
configure.ac:47:   'automake --add-missing' can install 'ar-lib'
configure.ac:46: error: required file './compile' not found
configure.ac:46:   'automake --add-missing' can install 'compile'
configure.ac:49: error: required file './config.guess' not found
configure.ac:49:   'automake --add-missing' can install 'config.guess'
configure.ac:49: error: required file './config.sub' not found
configure.ac:49:   'automake --add-missing' can install 'config.sub'
configure.ac:30: error: required file './install-sh' not found
configure.ac:30:   'automake --add-missing' can install 'install-sh'
configure.ac:49: error: required file './ltmain.sh' not found
configure.ac:30: error: required file './missing' not found
configure.ac:30:   'automake --add-missing' can install 'missing'
Makefile.am:51: warning: shell grep ^VERSION version.sh | awk -F = '{ print $$2}': non-POSIX variable name
Makefile.am:51: (probably a GNU make extension)
src/Makefile.am: error: required file './depcomp' not found
src/Makefile.am:   'automake --add-missing' can install 'depcomp'
parallel-tests: error: required file './test-driver' not found
parallel-tests:   'automake --add-missing' can install 'test-driver'
autoreconf: automake failed with exit status: 1

Fix: use autoreconf -is

[Feature] Export as status.dat or retention.dat format

Hi, I'm running a distributed naemon core using merlin. It works well, but before starting a new naemon peer I have to sync status.dat -> retention.dat from a webserver, where I previously fetched a status.dat file from the cluster. A homemade solution, but it also works well.

My request is to query all the status info from livestatus and format it in a retention.dat/status.dat-compatible syntax. Do you think that is possible?

Thanks in advance.

g_tree_foreach: assertion `tree != NULL' failed and Naemon fails with segfault

We're experiencing a crash with the latest versions of naemon and thruk when using the /thruk/cgi-bin/showlog.cgi CGI, by clicking on the Event Log button in the Thruk interface.

Our configuration holds:

Checking objects...
        Checked 5212 services.
        Checked 1548 hosts.
        Checked 79 contacts.
        Checked 302 host groups.
        Checked 310 service groups.
        Checked 85 contact groups.
        Checked 124 commands.
        Checked 4 time periods.
        Checked 0 host escalations.
        Checked 0 service escalations.

After investigations, the crash is caused by this LiveStatus query:

GET log                                          
Columns: class time type state host_name service_description plugin_output message options state_type contact_name
Filter: time >= 1540854000
Filter: time <= 1540940400
And: 2
Limit: 100
Sort: time desc
OutputFormat: wrapped_json
ResponseHeader: fixed16

with log files that have the following numbers of lines:

root@naemon-server ~ $ wc -l /var/log/naemon/archives/*
  1235420 /var/log/naemon/archives/naemon.log-20181005
   593846 /var/log/naemon/archives/naemon.log-20181027
  1468552 /var/log/naemon/archives/naemon.log-20181028
  1151668 /var/log/naemon/archives/naemon.log-20181029
   582285 /var/log/naemon/archives/naemon.log-20181030
   959542 /var/log/naemon/archives/naemon.log-20181031
  5991313 total

When we issue the Livestatus query above, Livestatus outputs messages like

0;0;;;;0;;;;;;;;;;0;;;;;;0;0;0;0;0;;;;;0;0;0;0;;;0;;;;0;0;0.0000000000e+00;0;;;0;0;;;;;;0;0;;;;;;;;0;0.0000000000e+00;;0.0000000000e+00;0;;0;0;0.0000000000e+00;0;;;;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0.0000000000e+00;;0.0000000000e+00;0;0;;;0;0;0;;;;;0.0000000000e+00;;0;0;0;0;0;0;0;0;0;0;0;0;0;0;;0;0.0000000000e+00;;;0;0;0.0000000000e+00;0;;;;0;0.0000000000e+00;0;0;;0;0;0;0.0000000000e+00;0.0000000000e+00;0.0000000000e+00;0;0;0;;;0;;0;0.0000000000e+00;0;;;0;0;;;;;0;0;;;;;;;;;0;0.0000000000e+00;0.0000000000e+00;0;;0;0.0000000000e+00;0;;;;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0.0000000000e+00;;0.0000000000e+00;0;0;;0;0;0;;;;;0.0000000000e+00;;0;0;0;0;0.0000000000e+00;;;0;0;0.0000000000e+00;0;0;0.0000000000e+00;0;0;;739891;[1540921434] g_tree_foreach: assertion 'tree != NULL' failed;assertion 'tree != NULL' failed;;;0;;1540921434;g_tree_foreach: assertion 'tree != NULL' failed

and very often (every time?) naemon dies.

After applying the patch specified in #29, the g_tree_foreach: assertion 'tree != NULL' failed message still appears, but this time naemon seems to stay alive. We can see the g_tree_foreach: assertion 'tree != NULL' failed messages in the Thruk interface, as shown in the following screenshot:

(screenshot: naemon_logs)

This issue is related to #28, and the patch proposed there fixes the naemon crash (which is a good thing 👍), but the problem is only mitigated, since the g_tree_foreach: assertion 'tree != NULL' failed message still always shows up.

This issue may also be related to #39 (since the query also contains a Sort header), or maybe it's not.

Missing statehist table

It seems this fork of livestatus is missing the statehist table, useful for reporting (available since mk-livestatus 1.2.1i2).
Are there any plans to include it?

unsorted contacts host/service column

With the current release of OMD 4.40 (naemon 1.0.3), the contacts column in the host and service overview is unsorted. This worked with the previous version (OMD 4.30 and naemon 1.0.2); do we need a final sort after the duplicate detection (#91)?

best regards,
Martin.

Segmentation fault

On Debian 11.4, OMD 5.10.

naemon segfaults with the following stacktrace (read from core dump):

#0 0x00007f28205f48a7 in Logfile::freeMessages(unsigned int) () from /omd/sites/xpertvision/lib/naemon/livestatus.o
#1 0x00007f28205f408a in LogCache::handleNewMessage(Logfile*, long, long, unsigned int) () from /omd/sites/xpertvision/lib/naemon/livestatus.o
#2 0x00007f28205f4c27 in Logfile::loadRange(_IO_FILE*, unsigned int, LogCache*, long, long, unsigned int) () from /omd/sites/xpertvision/lib/naemon/livestatus.o
#3 0x00007f28205f4e3d in Logfile::load(LogCache*, long, long, unsigned int) () from /omd/sites/xpertvision/lib/naemon/livestatus.o
#4 0x00007f28205f4fb7 in Logfile::answerQueryReverse(Query*, LogCache*, long, long, unsigned int) () from /omd/sites/xpertvision/lib/naemon/livestatus.o
#5 0x00007f28205fb0f5 in TableLog::answerQuery(Query*) () from /omd/sites/xpertvision/lib/naemon/livestatus.o
#6 0x00007f28205cfec0 in Store::answerGetRequest(InputBuffer*, OutputBuffer*, char const*) () from /omd/sites/xpertvision/lib/naemon/livestatus.o
#7 0x00007f28205d015a in Store::answerRequest(InputBuffer*, OutputBuffer*) () from /omd/sites/xpertvision/lib/naemon/livestatus.o
#8 0x00007f28205cf2b9 in store_answer_request () from /omd/sites/xpertvision/lib/naemon/livestatus.o
#9 0x00007f2820600de5 in client_thread () from /omd/sites/xpertvision/lib/naemon/livestatus.o
#10 0x00007f2821756ea7 in start_thread (arg=) at pthread_create.c:477
#11 0x00007f28218e2a2f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The segfault is highly random.

jfr

Syntax error during ./configure

Hi,

On CentOS 6.9, I type ./configure CPPFLAGS=-I$

It fails at:

checking for ICU... ./configure: line 15954: syntax error near unexpected token `4.2,'
./configure: line 15954: `AX_CHECK_ICU(4.2, have_icu=yes, have_icu=no)'

I tried to install libicu:

Package libicu-4.2.1-14.el6.x86_64 already installed and latest version

EL8: further crash dumps triggered

Another crash dump hit me; backtrace:

#0  0x00007fe86ac44ac5 in __strlen_avx2 () from /lib64/libc.so.6
#1  0x00007fe868c47931 in Query::outputString(char const*) () from /tmp/livestatus.so
#2  0x00007fe868c4753b in Query::printRow(void*) () from /tmp/livestatus.so
#3  0x00007fe868c4b37a in Query::finish() () from /tmp/livestatus.so
#4  0x00007fe868c4d995 in Store::answerGetRequest(InputBuffer*, OutputBuffer*, char const*) () from /tmp/livestatus.so
#5  0x00007fe868c4dc93 in Store::answerRequest(InputBuffer*, OutputBuffer*) ()
   from /tmp/livestatus.so
#6  0x00007fe868c4d03d in store_answer_request () from /tmp/livestatus.so
#7  0x00007fe868c83e47 in client_thread () from /tmp/livestatus.so
#8  0x00007fe86a2a22de in start_thread () from /lib64/libpthread.so.0
#9  0x00007fe86abe32a3 in clone () from /lib64/libc.so.6

Compiling with clang results in a different "corner case" (customvars) crash:

2020-06-23 15:05:04 Ignoring invalid UTF-8 sequence in string '▒'
2020-06-23 15:05:04 Ignoring invalid UTF-8 sequence in string '▒'
2020-06-23 15:05:04 Ignoring invalid UTF-8 sequence in string '▒'
2020-06-23 15:05:04 Ignoring invalid UTF-8 sequence in string '▒'
2020-06-23 15:05:04 Ignoring invalid UTF-8 sequence in string '▒'
2020-06-23 15:05:04 Ignoring invalid UTF-8 sequence in string '▒'
2020-06-23 15:05:04 Ignoring invalid UTF-8 sequence in string '▒'

Backtrace:

#0  0x00007f46170383c0 in Query::outputString(char const*) () from /tmp/livestatus.so
#1  0x00007f461702f1f2 in CustomVarsColumn::output(void*, Query*) ()
   from /tmp/livestatus.so
#2  0x00007f4617038f12 in Query::printRow(void*) () from /tmp/livestatus.so
#3  0x00007f4617039226 in Query::finish() () from /tmp/livestatus.so

I've tried placing several tracing debug statements, but it's still unclear to me why it crashes. I also can't really check the cause, because trying to access the "char const*" already causes the crash, even if I only try to display the first char, e.g. "s[0]" - something really strange.

[Help] socat query broken pipe

Hi, I would like to make queries to livestatus without using shell pipes or redirections. I'm playing with socat and found some weird behaviour.

If I run

# echo -e "GET status\nColumns: livestatus_version" | socat STDIO UNIX:/run/naemon/live
1.0.6_20160914

works well, but if I run:

# socat EXEC:'echo -e "GET status\nColumns: program_version"' UNIX:/run/naemon/live
2016/09/19 15:43:51 socat[234] E write(5, 0xa29850, 15): Broken pipe

some debug

# socat -vvv EXEC:'echo -e "GET status\nColumns: program_version"' UNIX:/run/naemon/live
GET status
Columns: program_version
< 2016/09/19 15:44:55.457792  length=15 from=0 to=14
1.0.6_20160914
2016/09/19 15:44:55 socat[345] E write(5, 0xe34320, 15): Broken pipe

Do you have any hint about the difference between the two commands?

Livestatus shows a user twice on the contacts field

This issue arises when you have a user assigned to two different contact groups, and these contact groups assigned to the same host.
Livestatus will list the user twice in the contacts column, and we ended up with problems when running logcache updates on the Thruk side.

Naemon Version
naemon-core 1.2.4
naemon-livestatus 1.2.4.1
On Debian Buster

To Reproduce
Steps to reproduce the behavior:

  1. Have two different contact groups that contain the same user assigned to the same host.
  2. Query livestatus for the host and verify that the contacts column contains the given username twice (see the sketch below).
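
For illustration, step 2 could look like this (hypothetical host name "myhost";
the unixcat client and socket path vary per installation):

 echo -e "GET hosts\nColumns: name contacts\nFilter: name = myhost" | unixcat /var/cache/naemon/live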

Expected behavior
The username shows up only once in the contacts column.

Additional context
I can't really precise when this became an issue, because unfortunately, the failure on Thruk side when running authupdate went under our radar for a while.

Feel free to ask for more details if needed.

Thank you

g_tree_foreach: assertion `tree != NULL' failed

When I query the log table, I get this error in naemon.log.
Any ideas on how to go about this problem? In some cases, naemon crashes when querying the log table.
Tks.

[1517501818] g_tree_foreach: assertion `tree != NULL' failed
[1517501818] g_tree_foreach: assertion `tree != NULL' failed
[1517501818] g_tree_foreach: assertion `tree != NULL' failed
[1517501818] g_tree_foreach: assertion `tree != NULL' failed
[1517501818] g_tree_foreach: assertion `tree != NULL' failed
[1517501818] g_tree_foreach: assertion `tree != NULL' failed

Fails to build - error: 'RegexMatcher' does not name a type

This once built fine for me, but trying to rebuild recently, it now fails, maybe due to newer software packages, for example gcc 7.3.1. I'm running Arch Linux. Let me know what other information you might need. Thanks.

Making all in src
make[3]: Entering directory '/build/naemon/src/naemon-1.0.6/naemon-livestatus/src'
  CXX      livestatus_la-AndingFilter.lo
  CXX      livestatus_la-Column.lo
  CXX      livestatus_la-ColumnsColumn.lo
  CXX      livestatus_la-CustomVarsExplicitColumn.lo
  CXX      livestatus_la-ContactsColumn.lo
  CXX      livestatus_la-CustomVarsColumn.lo
In file included from CustomVarsColumn.cc:28:0:
CustomVarsFilter.h:45:5: error: 'RegexMatcher' does not name a type; did you mean 'uregex_matches'?
     RegexMatcher * _regex_matcher;
     ^~~~~~~~~~~~
     uregex_matches
make[3]: *** [Makefile:798: livestatus_la-CustomVarsColumn.lo] Error 1
make[3]: Leaving directory '/build/naemon/src/naemon-1.0.6/naemon-livestatus/src'
make[2]: *** [Makefile:475: all-recursive] Error 1
make[2]: Leaving directory '/build/naemon/src/naemon-1.0.6/naemon-livestatus'
make[1]: *** [Makefile:385: all] Error 2
make[1]: Leaving directory '/build/naemon/src/naemon-1.0.6/naemon-livestatus'
make: *** [Makefile:24: naemon-livestatus] Error 2

Timeperiod Transition in naemon logfile for SLA Reporting

We are using naemon-1.0.6 and SLA reporting by Thruk. Since the upgrade from naemon-1.0.3 to naemon-1.0.6 we noticed that no timeperiod transitions were logged in the naemon logfile.
Before the upgrade we had entries like this:

[1293066460] TIMEPERIOD TRANSITION: workhours;1;0

and after a restart of naemon, like this:

[1293030953] TIMEPERIOD TRANSITION: 24x7;-1;1
[1293030953] TIMEPERIOD TRANSITION: none;-1;0
[1293030953] TIMEPERIOD TRANSITION: workhours;-1;1

In naemon-1.0.6, transitions are only written into the logfile after a restart of naemon.
Now SLA reporting isn't working correctly because timeperiods are not taken into account.

Are there any changes in the new version, or what can I do to correct my SLA reporting?

Thank you for your help.

best regards

rpm build error: attempt to use unversioned python, define %__python to /usr/bin/python2 or /usr/bin/python3 explicitly

Since Fedora 33 and RHEL 8, %__python is no longer valid and produces an error if you try to use it.
Please update the rpm spec file to avoid using %__python. Instead use %__python3 on recent distros and %__python2 on old ones (rhel-6 and rhel-7); see the sketch after the reproduction steps below.

Error message:
error: attempt to use unversioned python, define %__python to /usr/bin/python2 or /usr/bin/python3 explicitly
error: line 54: install -d %buildroot%{python_sitelib}/livestatus

How to reproduce (on AlmaLinux 8, for example):
rpmbuild -tb naemon-livestatus-1.3.1.tar.gz
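
For illustration, one common way to pin the macro in the spec file. This is a
sketch, not the project's actual spec; the conditional depends on the distros
being targeted:

 # Use a versioned interpreter explicitly; bare %__python is an error on
 # Fedora >= 33 and RHEL >= 8.
 %if 0%{?rhel} && 0%{?rhel} <= 7
 %global __python %{__python2}
 %else
 %global __python %{__python3}
 %endif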
