jonschipp / nagios-plugins


A collection of Nagios Plugins I've written

License: GNU General Public License v2.0

Shell 78.10% Python 21.90%


nagios-plugins's Issues

Bug in check_service.sh: EXIT_CODE comes from eval

HI!

According to the code snippet:

STATUS_MSG=$(eval "$SERVICETOOL" 2>&1)
EXIT_CODE=$?

The variable EXIT_CODE receives the exit code of eval, not that of the command held in SERVICETOOL. In other words, the exit code obtained this way:

Where SERVICETOOL is: systemctl status service1 | grep -i Active

STATUS_MSG=$(eval "$SERVICETOOL" 2>&1)
EXIT_CODE=$?

is not the same as the exit code obtained when we execute the command directly:

`systemctl status service1 | grep -i Active`
EXIT_CODE=$?

This leads to cases like mine, in which a service that is in a failed state:

Active: failed (Result: start-limit) since vie 2021-02-12 14:11:24 CET; 3min 59s ago

is reported as OK by the plugin, which exits at this line:

[ $TRUST_EXIT_CODE -eq 1 ] && [ $EXIT_CODE -eq 0 ] && echo "$STATUS_MSG" && exit $OK

thanks!!
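One mechanism that would produce the behaviour reported above: the exit status of a shell pipeline is that of its last command, so a pipeline ending in grep reports whether grep matched, not whether the service is healthy. The sketch below illustrates this with `fake_status`, a hypothetical stand-in for `systemctl status` on a failed unit (its return value of 3 is an assumption made for the demo), and shows one possible alternative: run the status command alone, keep its exit code, and filter the text afterwards.

```shell
# Hypothetical stand-in for `systemctl status <unit>` on a failed unit.
# The return value 3 is an assumption chosen for this demo.
fake_status() {
    echo "Loaded: loaded"
    echo "Active: failed (Result: start-limit)"
    return 3
}

# As in the report: the pipeline ends in grep, so $? reflects grep's match.
SERVICETOOL="fake_status | grep -i Active"
STATUS_MSG=$(eval "$SERVICETOOL" 2>&1)
EXIT_PIPELINE=$?

# Alternative: run the status command on its own, keep its exit code,
# and only then reduce the output to the "Active" line.
RAW_MSG=$(fake_status 2>&1)
EXIT_COMMAND=$?
FILTERED_MSG=$(printf '%s\n' "$RAW_MSG" | grep -i Active)

echo "pipeline exit: $EXIT_PIPELINE, command exit: $EXIT_COMMAND"
echo "$FILTERED_MSG"
```

With this arrangement the short one-line message is kept, while the exit code used for the OK/CRITICAL decision comes from the status command itself.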

nagios-plugins/check_rsyslog.sh logfile time stamp check

There should be an option to check the age of the log file.

It should be critical/warning if it's too old. This will help identify if the module isn't loaded, or if rsyslog was stopped, or died.
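A minimal sketch of what such an age check could look like. `check_log_age` is a hypothetical helper, not an existing option of check_rsyslog.sh, and the thresholds are given in minutes as an assumption. It relies on the `-mmin` test of find, which GNU and BSD find provide but POSIX does not require.

```shell
# check_log_age is a hypothetical helper, not part of check_rsyslog.sh.
# Usage: check_log_age FILE WARN_MINUTES CRIT_MINUTES
check_log_age() {
    file=$1
    warn_min=$2
    crit_min=$3

    if [ ! -f "$file" ]; then
        echo "UNKNOWN: $file does not exist"
        return 3
    fi
    # find prints the path only when the file was last modified
    # more than the given number of minutes ago.
    if [ -n "$(find "$file" -mmin +"$crit_min")" ]; then
        echo "CRITICAL: $file not modified for more than $crit_min minutes"
        return 2
    fi
    if [ -n "$(find "$file" -mmin +"$warn_min")" ]; then
        echo "WARNING: $file not modified for more than $warn_min minutes"
        return 1
    fi
    echo "OK: $file was modified recently"
    return 0
}
```

It could be called as `check_log_age /path/to/logfile 10 30`, where the path and thresholds are placeholders.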

Perfdata ?

I have no perfdata with your plugin :(
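The report does not say which plugin is meant. As a general illustration, Nagios reads performance data from the part of the plugin output that follows a `|`, in the form `label=value;warn;crit`. The helper `with_perfdata` below is hypothetical and only sketches how a plugin could append that part to its status line.

```shell
# with_perfdata is a hypothetical helper, not part of any plugin in this repo.
# It appends Nagios-style performance data after the "|" separator.
# Usage: with_perfdata MESSAGE LABEL VALUE WARN CRIT
with_perfdata() {
    message=$1
    label=$2
    value=$3
    warn=$4
    crit=$5
    printf '%s | %s=%s;%s;%s\n' "$message" "$label" "$value" "$warn" "$crit"
}

# Example values are made up for illustration.
with_perfdata "Load average: 0.42" load1 0.42 3 5
```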

Unexpected exit code 2 on Raspberry Pi

Hi,

It is not clear to me where the mistake in the check_service script is: some running services return exit code 0 (OK), while others that are also running return exit code 2.

Evidence of exit 2 with running service on linux

root@emonpi(rw):nagios# ./libexec/check_service -o linux -t "service emonhub status"
● emonhub.service - LSB: Start/stop emonHub
Loaded: loaded (/etc/init.d/emonhub)
Active: active (running) since Sun 2017-12-24 18:10:15 CET; 1 day 23h ago
Process: 31020 ExecStop=/etc/init.d/emonhub stop (code=exited, status=0/SUCCESS)
Process: 10504 ExecStart=/etc/init.d/emonhub start (code=exited, status=0/SUCCESS)
CGroup: /system.slice/emonhub.service
└─10551 python /usr/share/emonhub/emonhub.py --config-file /home/pi/data/emonhub.conf --logfile /var/log/emonhub/emonhub.log

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
root@emonpi(rw):nagios# echo $?
2

Evidence of the expected result for a running service

root@emonpi(rw):nagios# ./libexec/check_service -o linux -t "service mosquitto status"
● mosquitto.service - Mosquitto MQTT Broker
Loaded: loaded (/lib/systemd/system/mosquitto.service; enabled)
Active: active (running) since Mon 2017-12-25 17:23:18 CET; 24h ago
Docs: man:mosquitto(8)
https://mosquitto.org/
Main PID: 25504 (mosquitto)
CGroup: /system.slice/mosquitto.service
└─25504 /usr/sbin/mosquitto -c /etc/mosquitto/mosquitto.conf

Dec 25 17:23:18 emonpi systemd[1]: Started Mosquitto MQTT Broker.
root@emonpi(rw):nagios# echo $?
0

Getting "Unknown status: splunk: unrecognized service" when adding splunk services.

I added the splunk services to the commands.conf and service.conf files so that I can monitor if/when the splunk services stop. Here are the config files:
command.conf:
object CheckCommand "splunk" {
    import "plugin-check-command"

    command = [
        PluginDir + "/check_service.sh"
    ]

    arguments = {
        "-s" = "splunk"
        "-o" = "linux"
        "-t" = "service splunk status"
    }
}

service.conf:
apply Service "splunk" {
    import "generic-service"

    check_command = "splunk"

    assign where host.address
}

I am able to add syslog-ng and filebeat successfully with the same configuration, and they work fine. Are the splunk services even supported by this plugin? Thanks in advance for any help.

Getting an Error:Unknown status: /usr/lib64/nagios/plugins/check_service: line 229: systemctl: command not found

Hi,
I am trying to set up a Tomcat service check using this plugin.
Can you please help me with this issue?
NRPE COMMAND on Tomcat server

The above command runs on the local tomcat server

But when I try to run the command from the Nagios server, it gives the error:
Unknown status: /usr/lib64/nagios/plugins/check_service: line 229: systemctl: command not found

SERVICE ON NAGIOS SERVER

check_load.sh on linux returns no uptime

I'm using fedora 21 and the command:

./check_load.sh -a -c 5 -w 3

returns the following output:

Load average:

mysql 5.5 - WARNING on running service

Hey!

Any idea why this happens?

Check_service not working as expected

When I execute it, I get:

[root@server plugins]# ./check_service.sh -o linux -s sshd
● sshd.service - OpenSSH server daemon
   Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled)
   Active: active (running) since vie 2017-12-29 16:55:07 CET; 2 weeks 5 days ago
     Docs: man:sshd(8)
           man:sshd_config(5)
 Main PID: 1132 (sshd)
   CGroup: /system.slice/sshd.service
           └─1132 /usr/sbin/sshd -D

ene 18 00:02:01 Server sshd[7445]: Accepted publickey for root from 66.66.66.66 port 6666 ssh2: RSA SHA256:87fWB8mk
[root@server plugins]#

Inactive service reported as active

When a service is down, the plugin still returns a status code of OK (zero).

root@vps3:~/icinga/nagios-plugins-master# /usr/lib/nagios/plugins/check_service.sh -s postfix -o linux
   Active: inactive (dead) since Thu 2018-05-24 12:46:25 BST; 8s ago
root@vps3:~/icinga/nagios-plugins-master# echo $?
0

I think this is incorrect, and the bug is due to the grep on line 79:

SERVICETOOL="systemctl status $SERVICE | grep -i Active"

I can fix this by removing the grep on this line as follows:

SERVICETOOL="systemctl status $SERVICE"

This means that SERVICETOOL yields the exit status of systemctl itself rather than that of grep (which matches both 'Active' and 'inactive' in the status line and therefore returns zero).
Unfortunately, the status text returned without this grep is a bit verbose, but at least the status code (i.e. critical, okay, etc.) is correct. Maybe the grep for '-i Active' could be moved to later in the script, once the actual status code check has been performed.

OS is Ubuntu Xenial (16.04)
Last update in the header of the script is 2018-04-25.

Thanks for the script.
Andy.
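The effect described in this report can be reproduced without systemd: `grep -i Active` succeeds on an "inactive" line because the pattern is a case-insensitive substring match. The sketch below shows that, and one possible stricter approach that extracts the state word following `Active:`. The helper name `active_state` is hypothetical.

```shell
# Status line copied from the report above.
LINE="   Active: inactive (dead) since Thu 2018-05-24 12:46:25 BST; 8s ago"

# grep -i Active matches both the "Active:" label and the word "inactive",
# so its exit status is 0 regardless of the real state.
printf '%s\n' "$LINE" | grep -qi Active
GREP_STATUS=$?

# active_state is a hypothetical helper: it prints only the state word
# that follows "Active:" (e.g. active, inactive, failed).
active_state() {
    printf '%s\n' "$1" |
        sed -n 's/^[[:space:]]*Active:[[:space:]]*\([a-z-]*\).*/\1/p'
}

STATE=$(active_state "$LINE")
if [ "$STATE" = "active" ]; then
    RESULT=0    # OK
else
    RESULT=2    # CRITICAL
fi
echo "grep exit: $GREP_STATUS, state: $STATE, plugin exit would be: $RESULT"
```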

check_service.sh -> Unknown status: service: only root can use service

I use
check_service.sh
with SLES11 SP3 and check_mk
but I get only the output:
Unknown status: service: only root can use service

With visudo I added these lines, but with no success:

Defaults:mysite !requiretty
mysite ALL=(ALL) NOPASSWD:ALL

what's going wrong?

Carsten

Ubuntu Trusty makes problems for check_service

Two problems:

"status $foo" doesn't seem to work, but the check_service.sh-script relies on it.
"service $foo status" yields the output "$foo is NOT running" (and exit code 0, of course), the script expects to see "$foo is not running".

Good riddance to upstart ... but still, Trusty is an LTS release that has not yet gone EOL, so it ought to be supported by the script.

I'll try to produce a pull request for this if I get time. This is not so urgent anymore for me, customer decided to do things without our help due to delays and problems from my side - but since I'm already working on it I may as well try to complete it.
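For the second problem, one possible approach is to compare case-insensitively by lowering the output before matching. This is a sketch only; `is_not_running` is a hypothetical helper, and the phrase it looks for is taken from the output quoted in this report.

```shell
# is_not_running is a hypothetical helper, not part of check_service.sh.
# It succeeds (returns 0) when the message says the service is not running,
# whatever the capitalisation of "NOT".
is_not_running() {
    lowered=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case $lowered in
        *"is not running"*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_not_running "foo is NOT running"; then
    echo "CRITICAL: foo is NOT running"
fi
```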

check_service.sh fails on centos7 if running over NRPE

Hello,

I've got /etc/nrpe.d/service-iptables.cfg file with the following command:

>cat /etc/nrpe.d/service-iptables.cfg
command[check_service_iptables]=/usr/lib64/nagios/plugins/check_service.sh -u nrpe -o linux -t "systemctl status iptables"

/usr/lib64/nagios/plugins/check_service.sh in this case is the latest version of the script.

If I run the command locally under 'nrpe' user, it returns correct result. However, if I try to run the command via nrpe, it returns an odd result:

>/usr/lib64/nagios/plugins/check_nrpe -H XXX -c check_service_iptables
Failed to determine peer security context: Protocol not available
Showing one /org/freedesktop/systemd1/unit/iptables_2eservice
Sent message type=method_call sender=n/a destination=org.freedesktop.DBus object=/org/freedesktop/DBus interface=org.freedesktop.DBus member=Hello cookie=1 reply_cookie=0 error=n/a
Got message type=method_return sender=org.freedesktop.DBus destination=:1.296 object=n/a interface=n/a member=n/a cookie=1 reply_cookie=1 error=n/a
Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 object=/org/freedesktop/systemd1/unit/iptables_2eservice interface=org.freedesktop.DBus.Properties member=GetAll cookie=2 reply_cookie=0 error=n/a
AssertPathExists 0 0 /etc/sysconfig/iptables 0
Root directory /run/log/journal added.
Considering /run/log/journal/31e871a0d760404d91649aecb7986bb3.
Failed to open /run/log/journal/31e871a0d760404d91649aecb7986bb3: Permission denied
Failed to add directory /run/log/journal/31e871a0d760404d91649aecb7986bb3: Permission denied
Journal filter:

The only way I was able to get around this is to write the command like the following:

>cat /etc/nrpe.d/service-iptables.cfg
command[check_service_iptables]=/usr/bin/sudo /usr/lib64/nagios/plugins/check_service.sh -t "systemctl status iptables"

This is a bit ugly, as I should not be required to sudo each time I want to check a service is running. Is there anything that could be done here?

Check_service not working as expected

Hello guys,

I've installed the script on my server. When I run it manually, I get the status of the service.
But when I put it in the script that manages all the monitoring on my server, it doesn't work. Nagios says "Connection refused", even though everything is all right.
Is there a way to make it check a specific service without writing the options -o and -s?

Something like check_service that checks httpd, sshd, or mysqld.

Thank you

check_service not work on centos / fedora

Hi,

I have found a big problem!
I noticed a problem with the script on Fedora and CentOS.
It's quite strange, because on Debian / Ubuntu there are no issues.
Three problems are visible:

Unable to read output
/usr/lib64/nagios/plugins/check_service.sh: line 248: systemctl: command not found
/usr/sbin/service: line 87: exec: /bin/systemctl: cannot execute: Permission denied

The first one can be fixed easily by downloading the plugin again and running chmod +x, even if that had already been done!

As for the other two, I do not know where they come from ...
I tried more commands and tests but have not found a solution yet.

Best Regards

How to check service from client ?

Hi,

I have tried your plugin and it works!
But it only works locally.
For example, I have EON = Master / Debian 9 = Client.
Only information from the master is reported.
I used this command: $USER1$/check_service -o linux -t "service status glances"

Do you have any idea why it does not work for my Debian client?

HOSTNAME ERROR

This works only on localhost. How do I get it to work with a hostname?

My options now are:

define command {
    command_name    check_service.sh
    command_line    $USER1$/check_service.sh -l -o linux
}

and

define service {
    host_name             Lejeplads
    service_description   List Services
    check_command         check_service.sh
}

check_service not show more than one line

Hi!

First of all, thanks for the great work on these plugins.

When I use check_service to monitor the nfs service on CentOS 6.5, it shows only one service, whereas "service nfs status" on the OS shows 4 processes.
So if the 4th is up and one of the others is down, Nagios does not show CRITICAL.
Reviewing your code, I found:

elif [ -f /etc/init.d/$SERVICE ] || [ -d /etc/init.d ]; then
    SERVICETOOL="/etc/init.d/$SERVICE status | tail -1"
    LISTTOOL="ls -1 /etc/init.d/"
    if [ $USERNAME ]; then
        SERVICETOOL="sudo -u $USERNAME /etc/init.d/$SERVICE status | tail -1"
        LISTTOOL="sudo -u $USERNAME ls -1 /etc/init.d/"
    fi

If I change the tail to 10 (for instance), it works.
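Instead of widening the tail to a fixed number of lines, every line of the status output could be inspected. The following is a minimal sketch under stated assumptions: `any_line_down` is a hypothetical helper, the phrases "is stopped" and "not running" are assumed markers of a down process, and the sample output is made up to resemble a four-process nfs status.

```shell
# any_line_down is a hypothetical helper, not part of check_service.sh.
# It succeeds (returns 0) if any line of the given output reports a stopped process.
any_line_down() {
    while IFS= read -r line; do
        case $line in
            *"is stopped"*|*"not running"*) return 0 ;;
        esac
    done <<EOF
$1
EOF
    return 1
}

# Made-up sample output for illustration: the first process is down.
SAMPLE='rpc.svcgssd is stopped
rpc.mountd (pid 1201) is running...
nfsd (pid 1266 1265) is running...
rpc.rquotad (pid 1197) is running...'

# What "tail -1" would see: only the last, healthy line.
LAST_LINE=$(printf '%s\n' "$SAMPLE" | tail -n 1)
echo "last line: $LAST_LINE"

if any_line_down "$SAMPLE"; then
    echo "CRITICAL: at least one process is down"
fi
```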

Detected wrong situation for service down. Added exited service state in case loop.

The finding concerns a service that is not running and is in the state "Active: active (exited)": the current version of the script returns "exit 0". So it does not detect a real service-down condition and reports the service as active (exit 0).

root@emonpi(rw):nagios# ./libexec/check_service -o linux -s emonhub
Active: active (exited) since Wed 2019-03-27 08:50:07 CET; 4 weeks 2 days ago
root@emonpi(rw):nagios# echo $?
0

>>>		STATUS_MSG:    Active: active (exited) since Wed 2019-03-27 08:50:07 CET; 4 weeks 2 days ago

Some of the variable assignments found are:

Check the status of a service

STATUS_MSG=$(eval "$SERVICETOOL" 2>&1)
EXIT_CODE=$?

>>>		SERVICETOOL: systemctl status emonhub | grep -i Active
>>>		STATUS_MSG:    Active: active (exited) since Wed 2019-03-27 08:50:07 CET; 4 weeks 2 days ago
>>>		EXIT_CODE: 0

The mistake is that this line runs 'exit 0' without evaluating the real service-down status. Comment out the following line:

#[ $TRUST_EXIT_CODE -eq 1 ] && [ $EXIT_CODE -eq 0 ] && echo "$STATUS_MSG" && exit $OK

And inside the case statement (case $STATUS_MSG in), include these states:

running|activerunning*)
echo "$STATUS_MSG"
exit $OK
;;

activeexited*)
echo "$STATUS_MSG"
exit $CRITICAL
;;

After implementing the changes and running the command again, it now shows the service as down with CRITICAL status (exit 2):

root@emonpi(rw):nagios# ./libexec/check_service_test2 -o linux -s emonhub
Active: active (exited) since Wed 2019-03-27 08:50:07 CET; 4 weeks 2 days ago
root@emonpi(rw):nagios# echo $?
2

If you agree with my solution, could you please release a new version with the proposed corrections, with a comment in the header similar to the following:

2019-04-26 [Dan Oliva M.] - Detected wrong situation for service down. Added exited service state in case loop.
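The proposal above can be sketched as a self-contained case statement that looks at the sub-state in parentheses. `classify_status` is a hypothetical name, and the patterns are written against the status lines quoted in this report rather than copied from the script. Note that some systemd units (for example one-shot units that remain active after exiting) legitimately sit in "active (exited)", so treating that state as CRITICAL is a policy choice, not a universal rule.

```shell
# classify_status is a hypothetical helper, not the script's own code.
# It maps a systemctl "Active:" line to a Nagios-style result.
classify_status() {
    case $1 in
        *"active (running)"*)
            echo "OK"
            return 0
            ;;
        *"active (exited)"*)
            # Policy assumed by this report: exited means the service is down.
            echo "CRITICAL"
            return 2
            ;;
        *inactive*|*failed*)
            echo "CRITICAL"
            return 2
            ;;
        *)
            echo "UNKNOWN"
            return 3
            ;;
    esac
}

classify_status "Active: active (running) since Sun 2017-12-24 18:10:15 CET"
```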

check_service.sh Uses wrong command in CentOS 6.6

When I use this script on CentOS 6.6, it uses 'status' as command.
Unfortunately, this does not see postfix as a valid job:

[root@hostname tmp] (@unknown.foo.com.) # status postfix status
status: Unknown job: postfix
[root@hostname tmp] (@unknown.foo.com.) # status --system postfix status
status: Unknown job: postfix

This is the 'RedHat' way to query status on CentOS 6:

[root@hostname tmp] (@unknown.foo.com.) # service postfix status
master (pid  1670) is running...

This is the 'RedHat' way to query status on more recent versions:

[user@bar ~]$ systemctl status postfix.service
postfix.service - Postfix Mail Transport Agent
   Loaded: loaded (/usr/lib/systemd/system/postfix.service; enabled)
   Active: active (running) since Sun 2015-03-22 13:20:46 CET; 2 months 9 days ago
 Main PID: 1200 (master)
   CGroup: name=systemd:/system/postfix.service
           ├─ 1200 /usr/libexec/postfix/master -w
           ├─ 1203 qmgr -l -t unix -u
           ├─ 1212 tlsmgr -l -t unix -u
           └─31388 pickup -l -t unix -u

Aug 31 17:12:50 hp-box.unknown.org postfix/smtp[31381]: connect to smtp.gmail.com[2a00:1450:4013:c01::6d]:587: Network...able
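One way to avoid picking upstart's `status` on such systems is to probe for tools in an explicit order of preference. The sketch below assumes the order systemctl, then service, then the init script; `first_available` is a hypothetical helper and the resulting command strings are illustrative, not the script's actual logic.

```shell
# first_available is a hypothetical helper: it prints the first command
# from its arguments that exists on this system.
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}

SERVICE="postfix"
TOOL=$(first_available systemctl service) || TOOL=""

# Assumed preference order: systemd first, then the service wrapper,
# then the init script as a last resort.
case $TOOL in
    systemctl) SERVICETOOL="systemctl status $SERVICE" ;;
    service)   SERVICETOOL="service $SERVICE status" ;;
    *)         SERVICETOOL="/etc/init.d/$SERVICE status" ;;
esac
echo "would run: $SERVICETOOL"
```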

Getting Error - Unknown status: /usr/lib64/nagios/plugins/check_service.sh: line 223: sudo: command not found

Hi,
My name is Poonam Davari. I am trying to set up a mysql service check using this plugin, and previously it was working fine.
But our testing team did some OS hardening on the mysql server, and now it gives the error I pasted in the title.
Could you please help me with this?
The required info is pasted below.

NRPE COMMAND on mysql server -

command runs locally on mysql server
-image is not getting uploaded

SERVICE ON NAGIOS SERVER

command from nagios server fails
-image is not getting uploaded

Check_service regex problems

Hello,

I was testing your check_service.sh script and noticed a bug when a service has "not" in its name.

I'm running this on Ubuntu server 16.04, services are run using upstart.

Example where the regex fails:

$ ./check_service.sh -o linux -t "service cplt_marin_push_notifications status"
./check_service.sh: line 250: [: -eq: unary operator expected   <<< not sure what this msg means
3                                                              <<< exit code $CRITICAL when *not*running in $STATUS_MSG
cplt_marin_push_notifications start/running, process 30290    <<< $STATUS_MSG

Example where the regex is OK, with a different service:

$ ./check_service.sh -o linux -t "service cplt_analyzer status"
./check_service.sh: line 250: [: -eq: unary operator expected  
6                                           <<< exit code $OK when *running* in $STATUS_MSG 
cplt_analyzer start/running, process 3356  <<< $STATUS_MSG
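Two separate effects appear in the output above. The message `[: -eq: unary operator expected` is what the shell prints when an unquoted variable in a `[ ... -eq ... ]` test expands to nothing. The misclassification is consistent with a glob such as `*not*running*` (inferred from the annotation in this report, not copied from the script) matching "notifications ... running". The sketch below shows both, with one possible mitigation for each.

```shell
SERVICE="cplt_marin_push_notifications"
STATUS_MSG="cplt_marin_push_notifications start/running, process 30290"

# Loose match: "not" inside the service name followed by "running" is enough.
case $STATUS_MSG in
    *not*running*) LOOSE="match" ;;
    *)             LOOSE="nomatch" ;;
esac

# Possible mitigation: strip the service name before matching.
REST=${STATUS_MSG#"$SERVICE"}
case $REST in
    *not*running*) STRICT="match" ;;
    *)             STRICT="nomatch" ;;
esac

# Quoting with a default avoids the "unary operator expected" error
# when the variable is unset or empty.
unset TRUST_EXIT_CODE
if [ "${TRUST_EXIT_CODE:-0}" -eq 1 ]; then
    TRUSTED="yes"
else
    TRUSTED="no"
fi

echo "loose: $LOOSE, strict: $STRICT, trusted: $TRUSTED"
```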

check_service.sh wrong $STATUS_MSG cases

Hi, first of all thanks for your simple to understand nagios checks.

On Debian Jessie with systemctl, I had to remove 2 cases, because they would respond with a wrong exit code.

These are the two I had to remove:

*SUCCESS*) *[eE]nable*)

The script responds with an exit code of 0 in both cases. But, as you can see below, stopping a service with systemctl stop $service may generate (code=exited, status=0/SUCCESS), and enable* (enabled) in systemctl only means that the service should start automatically.

active service

● abc.service - Key Management Service Emulator in C
   Loaded: loaded (/etc/systemd/system/abc.service; enabled)
   Active: active (running) since Tue 2016-11-22 12:54:38 CET; 2s ago
     Docs: man:abc(8)
 Main PID: 23983 (abc)
   CGroup: /system.slice/abc.service
           └─23983 /usr/local/bin/abc -l syslog -c1 -M1 -D

dead service

● abc.service - Key Management Service Emulator in C
   Loaded: loaded (/etc/systemd/system/abc.service; enabled)
   Active: inactive (dead) since Tue 2016-11-22 12:55:45 CET; 2s ago
     Docs: man:abc(8)
  Process: 23983 ExecStart=/usr/local/bin/abc $abc_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 23983 (code=exited, status=0/SUCCESS)

Best Regards
skirschner
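The dead-service output quoted above makes the problem easy to reproduce: matching the two patterns against the whole output succeeds, while matching them against the `Active:` line alone does not. This is a sketch of that comparison, not the script's own code.

```shell
# Dead-service output, abridged from the report above.
DEAD_OUTPUT='● abc.service - Key Management Service Emulator in C
   Loaded: loaded (/etc/systemd/system/abc.service; enabled)
   Active: inactive (dead) since Tue 2016-11-22 12:55:45 CET; 2s ago
 Main PID: 23983 (code=exited, status=0/SUCCESS)'

# Matching against the whole output: "enabled" and "SUCCESS" both appear.
case $DEAD_OUTPUT in
    *SUCCESS*|*[eE]nable*) WHOLE="match" ;;
    *)                     WHOLE="nomatch" ;;
esac

# Matching against the Active: line only.
ACTIVE_LINE=$(printf '%s\n' "$DEAD_OUTPUT" | grep '^ *Active:')
case $ACTIVE_LINE in
    *SUCCESS*|*[eE]nable*) LINE_ONLY="match" ;;
    *)                     LINE_ONLY="nomatch" ;;
esac

echo "whole output: $WHOLE, Active line only: $LINE_ONLY"
```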
