jonschipp / nagios-plugins
A collection of Nagios Plugins I've written
License: GNU General Public License v2.0
HI!
According to the code snippet:
STATUS_MSG=$(eval "$SERVICETOOL" 2>&1)
EXIT_CODE=$?
The variable EXIT_CODE receives the exit status of whatever eval ran. Because SERVICETOOL is a pipeline, that is the exit status of grep, not of systemctl. That is, with
SERVICETOOL="systemctl status service1 | grep -i Active"
STATUS_MSG=$(eval "$SERVICETOOL" 2>&1)
EXIT_CODE=$?
the exit code is not the same as the one obtained when we execute systemctl on its own:
systemctl status service1
EXIT_CODE=$?
This causes cases like mine, in which a service that is in a failed state:
Active: failed (Result: start-limit) since vie 2021-02-12 14:11:24 CET; 3min 59s ago
give an OK in the plugin, exiting its execution in the code:
[ $TRUST_EXIT_CODE -eq 1 ] && [ $EXIT_CODE -eq 0 ] && echo "$STATUS_MSG" && exit $OK
thanks!!
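The behaviour reported above can be reproduced without systemd. The following is only a sketch: `fake_status` is a hypothetical stand-in for `systemctl status service1`, and its exit code 3 is an assumption modelled on what `systemctl status` reports for a unit that is not running. It shows that the status captured after `eval` of a pipeline belongs to the last command (`grep`), so a failed service still yields 0.

```shell
#!/bin/sh
# fake_status is a hypothetical stand-in for `systemctl status service1`.
# It prints the status line from the report and exits non-zero.
fake_status() {
    echo "Active: failed (Result: start-limit) since vie 2021-02-12 14:11:24 CET; 3min 59s ago"
    return 3
}

# Same shape as the plugin's SERVICETOOL: a pipeline ending in grep.
SERVICETOOL='fake_status | grep -i Active'
STATUS_MSG=$(eval "$SERVICETOOL" 2>&1)
PIPELINE_EXIT=$?    # status of grep, the last command in the pipeline

# Without the trailing grep, the status is the tool's own.
SERVICETOOL='fake_status'
FULL_MSG=$(eval "$SERVICETOOL" 2>&1)
TOOL_EXIT=$?

echo "with grep:    exit $PIPELINE_EXIT"
echo "without grep: exit $TOOL_EXIT"
```

With a trusted exit code, only the second form lets the plugin tell a failed service from a running one.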
There should be an option to check the age of the log file.
It should be critical/warning if it's too old. This will help identify if the module isn't loaded, or if rsyslog was stopped, or died.
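A minimal sketch of what such an age check could look like, not an existing option of any plugin in this repository. The helper name `check_log_age` and its arguments are hypothetical, and it relies on `find -mmin`, which GNU, BSD and BusyBox `find` provide but POSIX does not require.

```shell
#!/bin/sh
OK=0
WARNING=1
CRITICAL=2

# check_log_age FILE WARN_MINUTES CRIT_MINUTES  (hypothetical helper)
check_log_age() {
    file=$1
    warn=$2
    crit=$3

    if [ ! -f "$file" ]; then
        echo "CRITICAL: $file does not exist"
        return $CRITICAL
    fi

    # find prints the path only when the file was last modified more than
    # the given number of minutes ago.
    if [ -n "$(find "$file" -mmin +"$crit")" ]; then
        echo "CRITICAL: $file not modified for more than $crit minutes"
        return $CRITICAL
    fi
    if [ -n "$(find "$file" -mmin +"$warn")" ]; then
        echo "WARNING: $file not modified for more than $warn minutes"
        return $WARNING
    fi
    echo "OK: $file was modified recently"
    return $OK
}
```

It could be called as `check_log_age /var/log/messages 30 60`, where the path and thresholds are examples only.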
I have no perfdata with your plugin :(
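For context, Nagios reads performance data from whatever follows a `|` in the plugin output, as `label=value;warn;crit` entries. The report does not say which plugin is meant, so the sketch below is generic: `perfdata` is a hypothetical helper and the load value is made up.

```shell
#!/bin/sh
# perfdata LABEL VALUE [WARN] [CRIT]  (hypothetical helper)
# Prints one metric in the Nagios plugin format: label=value;warn;crit
perfdata() {
    printf '%s=%s;%s;%s' "$1" "$2" "${3:-}" "${4:-}"
}

# Example with a made-up 1-minute load value and thresholds.
load1=0.42
echo "OK - load average: $load1 | $(perfdata load1 "$load1" 3 5)"
```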
Hi,
It is not clear to me where the mistake in the check_service script is: some running services return exit 0 (OK), while others that are also running return exit 2.
Evidence of exit 2 with a running service on Linux:
root@emonpi(rw):nagios# ./libexec/check_service -o linux -t "service emonhub status"
● emonhub.service - LSB: Start/stop emonHub
Loaded: loaded (/etc/init.d/emonhub)
Active: active (running) since Sun 2017-12-24 18:10:15 CET; 1 day 23h ago
Process: 31020 ExecStop=/etc/init.d/emonhub stop (code=exited, status=0/SUCCESS)
Process: 10504 ExecStart=/etc/init.d/emonhub start (code=exited, status=0/SUCCESS)
CGroup: /system.slice/emonhub.service
└─10551 python /usr/share/emonhub/emonhub.py --config-file /home/pi/data/emonhub.conf --logfile /var/log/emonhub/emonhub.log
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
root@emonpi(rw):nagios# echo $?
2
Evidence of the expected result for a running service:
root@emonpi(rw):nagios# ./libexec/check_service -o linux -t "service mosquitto status"
● mosquitto.service - Mosquitto MQTT Broker
Loaded: loaded (/lib/systemd/system/mosquitto.service; enabled)
Active: active (running) since Mon 2017-12-25 17:23:18 CET; 24h ago
Docs: man:mosquitto(8)
https://mosquitto.org/
Main PID: 25504 (mosquitto)
CGroup: /system.slice/mosquitto.service
└─25504 /usr/sbin/mosquitto -c /etc/mosquitto/mosquitto.conf
Dec 25 17:23:18 emonpi systemd[1]: Started Mosquitto MQTT Broker.
root@emonpi(rw):nagios# echo $?
0
I added the splunk services to the commands.conf and service.conf file with the goal of being able to monitor if/when the splunk services stopped. Here are the config files:
command.conf:
object CheckCommand "splunk" {
  import "plugin-check-command"
  command = [
    PluginDir + "/check_service.sh"
  ]
  arguments = {
    "-s" = "splunk"
    "-o" = "linux"
    "-t" = "service splunk status"
  }
}
service.conf:
apply Service "splunk" {
  import "generic-service"
  check_command = "splunk"
  assign where host.address
}
I was able to add syslog-ng and filebeat successfully with the same configuration, and they work fine. Are the Splunk services even supported by this plugin? Thanks in advance for any help.
Hi,
I am trying to set up a Tomcat service check using this plugin.
Can you please help me out with this issue?
NRPE COMMAND on Tomcat server
The above command runs on the local Tomcat server,
but when I try to run the command from the Nagios server it gives the error:
Unknown status: /usr/lib64/nagios/plugins/check_service: line 229: systemctl: command not found
SERVICE ON NAGIOS SERVER
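A `command not found` that appears only when the plugin is run through NRPE often means the daemon starts it with a shorter PATH than an interactive shell has. That is an assumption here, since the NRPE configuration is not shown. A minimal sketch of one workaround is to resolve the tool against a fixed list of directories; `find_tool` and the directory list are hypothetical and not part of the plugin.

```shell
#!/bin/sh
# find_tool NAME  (hypothetical helper)
# Prints the first executable called NAME found in a fixed list of
# directories, independent of the caller's PATH.
find_tool() {
    tool=$1
    for dir in /usr/bin /bin /usr/sbin /sbin /usr/local/bin; do
        if [ -x "$dir/$tool" ]; then
            printf '%s\n' "$dir/$tool"
            return 0
        fi
    done
    return 1
}

# Harmless demonstration with a tool that exists on any system.
echo "sh found at: $(find_tool sh)"
```

In a plugin, the result of `find_tool systemctl` could be stored in a variable and used in place of the bare command name, with an UNKNOWN exit when nothing is found.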
I'm using fedora 21 and the command:
./check_load.sh -a -c 5 -w 3
returns the following output:
Load average:
When I execute the plugin, I get:
[root@server plugins]# ./check_service.sh -o linux -s sshd
● sshd.service - OpenSSH server daemon
Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled)
Active: active (running) since vie 2017-12-29 16:55:07 CET; 2 weeks 5 days ago
Docs: man:sshd(8)
man:sshd_config(5)
Main PID: 1132 (sshd)
CGroup: /system.slice/sshd.service
└─1132 /usr/sbin/sshd -D
ene 18 00:02:01 Server sshd[7445]: Accepted publickey for root from 66.66.66.66 port 6666 ssh2: RSA SHA256:87fWB8mk
[root@server plugins]#
When a service is down, the plugin still returns a status code of OK (zero).
root@vps3:~/icinga/nagios-plugins-master# /usr/lib/nagios/plugins/check_service.sh -s postfix -o linux
Active: inactive (dead) since Thu 2018-05-24 12:46:25 BST; 8s ago
root@vps3:~/icinga/nagios-plugins-master# echo $?
0
I think this is incorrect, and the bug is due to the grep on line 79:
SERVICETOOL="systemctl status $SERVICE | grep -i Active"
I can fix this by removing the grep on this line as follows:
SERVICETOOL="systemctl status $SERVICE"
This means that the exit status comes from systemctl itself rather than from grep, which matches the text 'Active' in both 'active' and 'inactive' status lines and therefore returns zero.
Unfortunately, the status text returned without this grep is a bit verbose, but at least the status code (critical, OK, etc.) is correct. Maybe the grep for '-i Active' could be moved later in the script, after the actual status code check has been performed.
OS is Ubuntu Xenial (16.04)
Last update in the header of the script is 2018-04-25.
Thanks for the script.
Andy.
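The idea of moving the grep after the status check could look roughly like the sketch below. It is not the plugin's code: `run_status` is a hypothetical stand-in for `systemctl status $SERVICE` on a stopped service, and its exit code 3 is an assumption modelled on what `systemctl status` returns for a unit that is not running.

```shell
#!/bin/sh
OK=0
CRITICAL=2

# Hypothetical stand-in for `systemctl status postfix` on a stopped service.
run_status() {
    echo "postfix.service - Postfix Mail Transport Agent"
    echo "   Active: inactive (dead) since Thu 2018-05-24 12:46:25 BST; 8s ago"
    return 3
}

check_service() {
    # 1. Run the tool on its own so that $? is its exit status, not grep's.
    full_output=$(run_status 2>&1)
    tool_exit=$?

    # 2. Only now shorten the text for display; fall back to the full text
    #    if no "Active:" line is present.
    status_msg=$(printf '%s\n' "$full_output" | grep -i 'Active:' | sed 's/^ *//')
    [ -n "$status_msg" ] || status_msg=$full_output

    echo "$status_msg"
    if [ "$tool_exit" -eq 0 ]; then
        return $OK
    fi
    return $CRITICAL
}
```

This keeps the one-line status text in the Nagios output while the return code follows the tool's own exit status.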
I use check_service.sh with SLES11 SP3 and check_mk, but I only get the output:
Unknown status: service: only root can use service
I added these lines with visudo, but without success:
Defaults:mysite !requiretty
mysite ALL=(ALL) NOPASSWD:ALL
what's going wrong?
Carsten
Two problems:
Good riddance with upstart ... but still, Trusty is an LTS-release that has not yet gone EOL, so it ought to be supported by the script.
I'll try to produce a pull request for this if I get time. This is not so urgent anymore for me, customer decided to do things without our help due to delays and problems from my side - but since I'm already working on it I may as well try to complete it.
Hello,
I've got /etc/nrpe.d/service-iptables.cfg file with the following command:
>cat /etc/nrpe.d/service-iptables.cfg
command[check_service_iptables]=/usr/lib64/nagios/plugins/check_service.sh -u nrpe -o linux -t "systemctl status iptables"
/usr/lib64/nagios/plugins/check_service.sh in this case is the latest version of the script.
If I run the command locally under 'nrpe' user, it returns correct result. However, if I try to run the command via nrpe, it returns an odd result:
>/usr/lib64/nagios/plugins/check_nrpe -H XXX -c check_service_iptables
Failed to determine peer security context: Protocol not available
Showing one /org/freedesktop/systemd1/unit/iptables_2eservice
Sent message type=method_call sender=n/a destination=org.freedesktop.DBus object=/org/freedesktop/DBus interface=org.freedesktop.DBus member=Hello cookie=1 reply_cookie=0 error=n/a
Got message type=method_return sender=org.freedesktop.DBus destination=:1.296 object=n/a interface=n/a member=n/a cookie=1 reply_cookie=1 error=n/a
Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 object=/org/freedesktop/systemd1/unit/iptables_2eservice interface=org.freedesktop.DBus.Properties member=GetAll cookie=2 reply_cookie=0 error=n/a
AssertPathExists 0 0 /etc/sysconfig/iptables 0
Root directory /run/log/journal added.
Considering /run/log/journal/31e871a0d760404d91649aecb7986bb3.
Failed to open /run/log/journal/31e871a0d760404d91649aecb7986bb3: Permission denied
Failed to add directory /run/log/journal/31e871a0d760404d91649aecb7986bb3: Permission denied
Journal filter:
The only way I was able to get around this is to write the command like the following:
>cat /etc/nrpe.d/service-iptables.cfg
command[check_service_iptables]=/usr/bin/sudo /usr/lib64/nagios/plugins/check_service.sh -t "systemctl status iptables"
This is a bit ugly, as I should not be required to sudo each time I want to check a service is running. Is there anything that could be done here?
Hello guys,
I've installed the script on my server. When I run it manually, I get the status of the service.
But when I call it from the script that manages all the monitoring on my server, it doesn't work. Nagios reports "Connection refused", although everything else is all right.
Is there a way to make it check a specific service without writing the options -o and -s?
Something like check_service that checks httpd, sshd or mysqld.
Thank you
Hi,
I have found a big problem!
I noticed a problem with the script on Fedora and CentOS.
It's quite strange, because on Debian / Ubuntu there are no issues.
Three problems are visible:
The first one is easy to fix: download the plugin again and run chmod +x on it, even if that had already been done!
As for the other two, I do not know where they come from...
I tried more commands and tests but have found no solution so far.
Best Regards
Hi,
I have tried your plugin and it works!
But it only works locally.
For example, I have EON = master / Debian 9 = client.
Only information from the master is reported.
I ran this command:
Do you have any idea why it does not work for my Debian client?
This works only on localhost. How do I get it to work with a hostname?
My current options are:
define command {
    command_name check_service.sh
    command_line $USER1$/check_service.sh -l -o linux
}
and
define service {
    host_name Lejeplads
    service_description List Services
    check_command check_service.sh
}
Hi!
First of all, thanks for this great work on the plugins.
I am using check_service to monitor the nfs service on CentOS 6.5. It only shows one process, whereas "service nfs status" on the OS shows 4 processes.
So if the 4th is up and one of the others is down, Nagios does not show CRITICAL.
Reviewing your code, I saw:
elif [ -f /etc/init.d/$SERVICE ] || [ -d /etc/init.d ]; then
    SERVICETOOL="/etc/init.d/$SERVICE status | tail -1"
    LISTTOOL="ls -1 /etc/init.d/"
    if [ $USERNAME ]; then
        SERVICETOOL="sudo -u $USERNAME /etc/init.d/$SERVICE status | tail -1"
        LISTTOOL="sudo -u $USERNAME ls -1 /etc/init.d/"
    fi
If I set the tail to 10 (for instance), it works.
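Rather than widening `tail` to a fixed number of lines, every line of the status output could be inspected. The sketch below is an illustration only: `fake_nfs_status` is a hypothetical stand-in for `service nfs status` with made-up PIDs, and the patterns treated as "down" are assumptions.

```shell
#!/bin/sh
OK=0
CRITICAL=2

# Hypothetical stand-in for `service nfs status` with one daemon down.
# The last line still reports a running process, which is all that
# `tail -1` would see.
fake_nfs_status() {
    echo "rpc.svcgssd is stopped"
    echo "rpc.mountd (pid 1840) is running..."
    echo "nfsd (pid 1905 1904 1903) is running..."
    echo "rpc.rquotad (pid 1836) is running..."
}

# check_all_lines COMMAND [ARGS...]  (hypothetical helper)
# Runs the command and returns CRITICAL if any output line reports a
# stopped process, instead of looking at the last line only.
check_all_lines() {
    status_output=$("$@" 2>&1)
    echo "$status_output"
    if printf '%s\n' "$status_output" | grep -Eiq 'stopped|not running|dead'; then
        return $CRITICAL
    fi
    return $OK
}
```

Calling `check_all_lines fake_nfs_status` prints all four lines and returns CRITICAL because of the first one.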
root@emonpi(rw):nagios# ./libexec/check_service -o linux -s emonhub
Active: active (exited) since Wed 2019-03-27 08:50:07 CET; 4 weeks 2 days ago
root@emonpi(rw):nagios# echo $?
0
>>> STATUS_MSG: Active: active (exited) since Wed 2019-03-27 08:50:07 CET; 4 weeks 2 days ago
STATUS_MSG=$(eval "$SERVICETOOL" 2>&1)
EXIT_CODE=$?
>>> SERVICETOOL: systemctl status emonhub | grep -i Active
>>> STATUS_MSG: Active: active (exited) since Wed 2019-03-27 08:50:07 CET; 4 weeks 2 days ago
>>> EXIT_CODE: 0
#[ $TRUST_EXIT_CODE -eq 1 ] && [ $EXIT_CODE -eq 0 ] && echo "$STATUS_MSG" && exit $OK
running|activerunning*)
    echo "$STATUS_MSG"
    exit $OK
    ;;
activeexited*)
    echo "$STATUS_MSG"
    exit $CRITICAL
    ;;
root@emonpi(rw):nagios# ./libexec/check_service_test2 -o linux -s emonhub
Active: active (exited) since Wed 2019-03-27 08:50:07 CET; 4 weeks 2 days ago
root@emonpi(rw):nagios# echo $?
2
If you agree with my solution, could you please release a new version with the proposed corrections and add a comment in the header similar to the following:
When I use this script on CentOS 6.6, it uses 'status' as command.
Unfortunately, this does not see postfix as a valid job:
[root@hostname tmp] (@unknown.foo.com.) # status postfix status
status: Unknown job: postfix
[root@hostname tmp] (@unknown.foo.com.) # status --system postfix status
status: Unknown job: postfix
This is the 'RedHat' way to query status on CentOS 6:
[root@hostname tmp] (@unknown.foo.com.) # service postfix status
master (pid 1670) is running...
This is the 'RedHat' way to query status on more recent versions:
[user@bar ~]$ systemctl status postfix.service
postfix.service - Postfix Mail Transport Agent
Loaded: loaded (/usr/lib/systemd/system/postfix.service; enabled)
Active: active (running) since Sun 2015-03-22 13:20:46 CET; 2 months 9 days ago
Main PID: 1200 (master)
CGroup: name=systemd:/system/postfix.service
├─ 1200 /usr/libexec/postfix/master -w
├─ 1203 qmgr -l -t unix -u
├─ 1212 tlsmgr -l -t unix -u
└─31388 pickup -l -t unix -u
Aug 31 17:12:50 hp-box.unknown.org postfix/smtp[31381]: connect to smtp.gmail.com[2a00:1450:4013:c01::6d]:587: Network...able
Hi,
My name is Poonam Davari. I am trying to set up a MySQL service check using this plugin, and previously it was working fine.
But our testing team did some OS hardening on the MySQL server, and now it gives the error I pasted in the title.
Could you please help me out with this?
Pasting here required info.
NRPE COMMAND on mysql server -
command runs locally on mysql server
-image is not getting uploaded
SERVICE ON NAGIOS SERVER
command from nagios server fails
-image is not getting uploaded
Hello,
I was testing out your check_service.sh script and I noticed a bug when a service has "not" in its name.
I'm running this on Ubuntu Server 16.04; services are run using upstart.
e.g. when the regex fails:
$ ./check_service.sh -o linux -t "service cplt_marin_push_notifications status"
./check_service.sh: line 250: [: -eq: unary operator expected <<< not sure what this msg means
3 <<< exit code $CRITICAL when *not*running in $STATUS_MSG
cplt_marin_push_notifications start/running, process 30290 <<< $STATUS_MSG
e.g. when regex is ok with different service
$ ./check_service.sh -o linux -t "service cplt_analyzer status"
./check_service.sh: line 250: [: -eq: unary operator expected
6 <<< exit code $OK when *running* in $STATUS_MSG
cplt_analyzer start/running, process 3356 <<< $STATUS_MSG
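On the message the reporter was unsure about: `[: -eq: unary operator expected` is what `[` prints when an unquoted variable expands to nothing, so that `[ $VAR -eq 1 ]` becomes `[ -eq 1 ]`. Which variable sits on line 250 is not shown here; `TRUST_EXIT_CODE` is used below as an assumption. The second part is a sketch of matching the phrase "not running" rather than the bare substring "not", so that a job name such as `push_notifications` is not misread. `classify` is a hypothetical helper, and `stop/waiting` is the wording upstart normally uses for a stopped job.

```shell
#!/bin/sh
# Part 1: an empty, unquoted variable turns `[ $VAR -eq 1 ]` into `[ -eq 1 ]`.
unset TRUST_EXIT_CODE
# [ $TRUST_EXIT_CODE -eq 1 ]    # this form triggers "unary operator expected"
# Quoting plus a default value keeps the test well-formed.
if [ "${TRUST_EXIT_CODE:-0}" -eq 1 ]; then
    trust="yes"
else
    trust="no"
fi
echo "trust exit code: $trust"

# Part 2: classify a status line without tripping over "not" in a job name.
classify() {
    case $1 in
        *"not running"*|*stop/waiting*) echo "CRITICAL" ;;
        *start/running*|*"is running"*) echo "OK" ;;
        *)                              echo "UNKNOWN" ;;
    esac
}

classify "cplt_marin_push_notifications start/running, process 30290"
classify "cplt_marin_push_notifications stop/waiting"
```

The order of the patterns matters: the failure phrases are tested first, so a line containing "is not running" is not caught by a looser match on "running".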
Hi, first of all thanks for your simple to understand nagios checks.
On Debian Jessie with systemctl, I had to remove two cases because they would respond with a wrong exit code.
These are the two I had to remove:
*SUCCESS*)
*[eE]nable*)
The script responds with an exit code of 0 for both cases. But as you can see below, stopping a service with systemctl stop $service may generate (code=exited, status=0/SUCCESS), and enable* (enabled) in systemctl only means that the service should start automatically.
active service
● abc.service - Key Management Service Emulator in C
Loaded: loaded (/etc/systemd/system/abc.service; enabled)
Active: active (running) since Tue 2016-11-22 12:54:38 CET; 2s ago
Docs: man:abc(8)
Main PID: 23983 (abc)
CGroup: /system.slice/abc.service
└─23983 /usr/local/bin/abc -l syslog -c1 -M1 -D
dead service
● abc.service - Key Management Service Emulator in C
Loaded: loaded (/etc/systemd/system/abc.service; enabled)
Active: inactive (dead) since Tue 2016-11-22 12:55:45 CET; 2s ago
Docs: man:abc(8)
Process: 23983 ExecStart=/usr/local/bin/abc $abc_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 23983 (code=exited, status=0/SUCCESS)
Best Regards
skirschner
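One way to avoid false matches on `SUCCESS` or `enabled` is to look only at the `Active:` line of the status text. The sketch below assumes that approach; `classify_active` is a hypothetical helper, not the plugin's code, and it is deliberately strict in treating anything other than `active (running)` as CRITICAL.

```shell
#!/bin/sh
OK=0
CRITICAL=2
UNKNOWN=3

# classify_active  (hypothetical helper)
# Reads `systemctl status` text on stdin and returns a Nagios code based
# only on the first "Active:" line, ignoring the rest of the output.
classify_active() {
    active_line=$(sed -n 's/^[[:space:]]*\(Active: .*\)$/\1/p' | head -n 1)
    echo "$active_line"
    case $active_line in
        "Active: active (running)"*) return $OK ;;
        "Active: "*)                 return $CRITICAL ;;
        *)                           return $UNKNOWN ;;
    esac
}

# The dead service from the report: "enabled" and "SUCCESS" both appear,
# but only the Active line decides the result.
classify_active <<'EOF'
Loaded: loaded (/etc/systemd/system/abc.service; enabled)
Active: inactive (dead) since Tue 2016-11-22 12:55:45 CET; 2s ago
Process: 23983 ExecStart=/usr/local/bin/abc $abc_OPTIONS (code=exited, status=0/SUCCESS)
EOF
echo "exit code: $?"
```

Units that legitimately end up as `active (exited)`, such as one-shot services, would need their own case if they should count as OK.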