
dvarounis / perfservmon


Nagios Plugin for IBM WebSphere Application Server using perfservlet

License: MIT License

Python 100.00%
nagios nagios-plugin jdbc-connection-pools jms-sib-destinations monitoring websphere nagios-plugins orb-thread-pool heap

perfservmon's Introduction

Perfservmon

Perfservmon is a Nagios Plugin for IBM WebSphere Application Server (WAS) that uses the perfservlet web application shipped with each WAS installation. It has minimal library dependencies, so it can be used easily in most environments.

The plugin can monitor the following WAS metrics of a WebSphere Cell:

  • Web authentication time
  • Web authorization time
  • Heap Usage
  • Web Container Thread Pool Usage
  • Web Container Declared Threads Hung
  • ORB Thread Pool Usage
  • JDBC Data Source Connection Pool Usage
  • JDBC Data Source Connection Pool Use Time
  • JDBC Data Source Connection Pool Wait Time
  • JDBC Data Source Connection Pool Waiting Threads
  • Live HTTP Sessions
  • JMS SIB Destination (Queue, Topic) Metrics

Prerequisites

  1. Perfservlet App: Install the PerfServletApp.ear in one WAS server of your WebSphere Cell. It is located in <WAS_ROOT>/installableApps, e.g. /opt/IBM/WebSphere/AppServer/installableApps on a Unix system.
  2. Python version 2.7 or version 3.X installed on the Nagios host

The plugin is tested to work with WAS Traditional versions 8.5 and 9.0.

Setup

  1. Copy the perfservmon.py file into the $USER1$ path, which is the plugins path. You will probably find the value of this variable in the Nagios resource.cfg file (usually it is a libexec directory).

  2. Add the following lines to the Nagios command.cfg file:

#Check_perfservlet commands
#The -H -u -p parameters are optional
#depending on whether you use https and/or Basic Auth credentials to access the perfservlet

define command{
        command_name    check_perfserv_retriever
        command_line    $USER1$/perfservmon.py -C $ARG1$ retrieve -N $ARG2$ -P $ARG3$ -H $ARG4$ -u $ARG5$ -p $ARG6$
        }

define command{
        command_name    check_perfserv_show
        command_line    $USER1$/perfservmon.py -C $ARG1$ show -n $ARG2$ -s $ARG3$ -M $ARG4$ -c $ARG5$ -w $ARG6$
        }

define command{
        command_name    check_perfserv_show_dcp
        command_line    $USER1$/perfservmon.py -C $ARG1$ show -n $ARG2$ -s $ARG3$ -M DBConnectionPoolPercentUsed -j $ARG4$ -c $ARG5$ -w $ARG6$
        }

define command{
        command_name    check_perfserv_show_sib
        command_line    $USER1$/perfservmon.py -C $ARG1$ show -n $ARG2$ -s $ARG3$ -M SIBDestinations -d $ARG4$ -c $ARG5$ -w $ARG6$
        }

Usage

Define Collector Service

Before defining a service that uses check_perfserv_show, add the following service definition to the Nagios config file of the WAS Server or, for an ND architecture, of the DMgr Server:

define service{
        use                             local-service        
        host_name                       <WAS_Host>
        service_description             Collect PerfServlet data from Cell
        check_command                   check_perfserv_retriever!<WAS_Cell_Name>!<PerfServ_hostname>!<PerfServ_Port>![http|https]!userid!passwd![--ignorecert]
        }

Where:

  • WAS_Cell_Name = The name of the WebSphere Cell
  • PerfServ_hostname = The IP address or hostname of the host where the perfservlet application runs
  • PerfServ_Port = The port on which the perfservlet application listens

Optionally set the HTTP protocol (http or https) and/or the Basic Authentication credentials for accessing the PerfServlet application. For an https connection you may use the --ignorecert option to ignore any TLS certificate issues, although this is not recommended.

This is the check that collects the relevant perfserv data of all nodes/servers from perfservlet and stores it locally as a Python shelve file.

If you want the WAS data to be refreshed more frequently, for example, you can change the check interval of the above service by adding the following lines to the Nagios template.cfg:
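As a rough illustration of how the protocol, credentials and --ignorecert options fit together, the sketch below builds a perfservlet URL, a Basic Authentication header and a TLS context. The function names are hypothetical, and the servlet path is an assumption taken from the URL quoted in one of the issue reports further down; this is not the plugin's actual code.

```python
import base64
import ssl

# Assumption: servlet path as quoted in an issue report, not confirmed by the README.
PERFSERVLET_PATH = "/wasPerfTool/servlet/perfservlet"


def build_perfservlet_url(host, port, protocol="http"):
    """Return the perfservlet URL for a bare hostname, a port and a protocol."""
    if protocol not in ("http", "https"):
        raise ValueError("protocol must be 'http' or 'https'")
    return "%s://%s:%s%s" % (protocol, host, port, PERFSERVLET_PATH)


def basic_auth_header(username, password):
    """Return the value of an HTTP Basic 'Authorization' header."""
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def tls_context(ignore_cert=False):
    """Return a TLS context; certificate checks are disabled only on request."""
    context = ssl.create_default_context()
    if ignore_cert:
        # check_hostname must be switched off before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

Keeping certificate verification on by default mirrors the README's advice that --ignorecert is not recommended.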

define service{
        name                            collector-service           ; The name of this service template
        use                             local-service         ; Inherit default values from the local-service definition
        max_check_attempts              2                       ; Re-check the service up to 2 times in order to determine its final (hard) state
        normal_check_interval           3                       ; Check the service every 3 minutes under normal conditions
        retry_check_interval            1                       ; Re-check the service every minute until a hard state can be determined
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }

Then the collector service definition should be like the following:

define service{
        use                             collector-service        
        host_name                       <WAS_Host>
        service_description             Collect PerfServlet data from Cell
        check_command                   check_perfserv_retriever!<WAS_Cell_Name>!<PerfServ_hostname>!<PerfServ_Port>![http|https]!userid!passwd
        }

Sample Service Definitions for WAS Metrics

  • Heap Usage
define service{
        use                             local-service
        host_name                       <WAS_Host>
        service_description             WAS Heap usage
        check_command                   check_perfserv_show!<WAS_Cell_Name>!<WAS_Node_Name>!<WAS_server_name>!Heap!<Critical Percentage>!<Warning Percentage>
        }
  • Web Container Thread Pool
define service{
        use                             local-service
        host_name                       <WAS_Host>
        service_description             WAS WebContainer ThreadPool Usage
        check_command                   check_perfserv_show!<WAS_Cell_Name>!<WAS_Node_Name>!<WAS_server_name>!WebContainer!<Critical Percentage>!<Warning Percentage>
        }
  • JDBC Connection Pools

Shows all the available connection pools of the WAS Server and raises an alert when any of them exceeds the percentage limits.

define service{
        use                             local-service
        host_name                       <WAS_Host>
        service_description             WAS ConnectionPool Usage
        check_command                   check_perfserv_show!<WAS_Cell_Name>!<WAS_Node_Name>!<WAS_server_name>!DBConnectionPoolPercentUsed!<Critical Percentage>!<Warning Percentage>
        }

Shows a specific connection pool of the WAS Server (specified with <JNDI_name>) and raises an alert when its usage exceeds the percentage limits.

define service{
        use                             local-service
        host_name                       <WAS_Host>
        service_description             WAS ConnectionPool JNDI_name Usage
        check_command                   check_perfserv_show_dcp!<WAS_Cell_Name>!<WAS_Node_Name>!<WAS_server_name>!<JNDI_name>!<Critical Percentage>!<Warning Percentage>
        }
  • Total Live HTTP Sessions

Shows the Total Live HTTP Sessions together with the individual per-module HTTP Sessions. Raises an alert when the Total Sessions exceed the limits.

define service{
        use                             local-service
        host_name                       <WAS_Host>
        service_description             WAS Http Live Sessions
        check_command                   check_perfserv_show!<WAS_Cell_Name>!<WAS_Node_Name>!<WAS_server_name>!LiveSessions!<Critical No of Sessions>!<Warning No of Sessions>
        }
  • ORB Thread Pool Usage
define service{
        use                             local-service
        host_name                       <WAS_Host>
        service_description             WAS ORB ThreadPool Usage
        check_command                   check_perfserv_show!<WAS_Cell_Name>!<WAS_Node_Name>!<WAS_server_name>!ORB!<Critical Percentage>!<Warning Percentage>
        }
  • JMS SIB Destinations (Queue, Topic)
define service{
        use                             local-service
        host_name                       <WAS_Host>
        service_description             My Topic Space
        check_command                   check_perfserv_show_sib!<WAS_Cell_Name>!<WAS_Node_Name>!<WAS_server_name>!<MyTopicSpaceName>!<No_Messages_Critical>!<No_Messages_Warning>
        }

define service{
        use                             local-service
        host_name                       <WAS_Host>
        service_description             My Exception Destination
        check_command                   check_perfserv_show_sib!<WAS_Cell_Name>!<WAS_Node_Name>!<WAS_server_name>!_SYSTEM.Exception.Destination.<WAS_Node_Name>.<WAS_server_name>-<SIBus_Name>!<No_Messages_Critical>!<No_Messages_Warning>
        }
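Every sample definition above passes a critical and a warning threshold. A minimal sketch of how such thresholds could map to Nagios states and exit codes follows. The function names and the "greater than or equal" comparison are assumptions for illustration; the plugin's own comparison logic may differ.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def evaluate(value, warning, critical):
    """Return the Nagios exit code for a value (assumes higher is worse)."""
    if value >= critical:
        return CRITICAL
    if value >= warning:
        return WARNING
    return OK


def format_output(label, value, warning, critical, unit=""):
    """Return (exit_code, output line with performance data after the pipe)."""
    code = evaluate(value, warning, critical)
    line = "%s - %s: %s%s | %s=%s%s;%s;%s" % (
        STATE_NAMES[code], label, value, unit,
        label, value, unit, warning, critical)
    return code, line
```

For example, format_output("heap", 85, 70, 90, "%") yields exit code 1 and a line starting with "WARNING".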

perfservmon's People

Contributors

atterdag, dvarounis


perfservmon's Issues

Parsed Arguments for Critical/Warning Methods work only for percentage Metrics

The following lines work only for percentage Metrics (e.g. Heap Usage):

    show_parser.add_argument("-c", type=int, action="store", dest='Critical', choices=xrange(1, 100),
                             help="Critical Value for Metric", required=False)
    show_parser.add_argument("-w", type=int, action="store", dest='Warning', choices=xrange(1, 100),
                             help="Warning Value for Metric", required=False)

Should be corrected to work with raw values, e.g. Warning Number of Available Messages in SIB Topic.
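One way this could be addressed is to accept any non-negative integer for -c and -w and apply the percentage range only to percentage metrics. The sketch below is an assumption-laden illustration, not the fix that was applied: the set of percentage metrics, the 1 to 100 range and the helper names are all hypothetical.

```python
import argparse

# Assumption: metric names from the README that are expressed as percentages.
PERCENT_METRICS = {"Heap", "WebContainer", "ORB", "DBConnectionPoolPercentUsed"}


def non_negative_int(text):
    """argparse type: accept integers greater than or equal to zero."""
    value = int(text)
    if value < 0:
        raise argparse.ArgumentTypeError("must be zero or greater")
    return value


def parse_show_args(argv):
    """Parse -M/-c/-w and range-check thresholds only for percentage metrics."""
    parser = argparse.ArgumentParser(prog="show-sketch")
    parser.add_argument("-M", dest="Metric", required=True)
    parser.add_argument("-c", type=non_negative_int, dest="Critical")
    parser.add_argument("-w", type=non_negative_int, dest="Warning")
    args = parser.parse_args(argv)
    if args.Metric in PERCENT_METRICS:
        for name in ("Critical", "Warning"):
            value = getattr(args, name)
            if value is not None and not 1 <= value <= 100:
                parser.error("%s must be between 1 and 100 for %s"
                             % (name, args.Metric))
    return args
```

With this shape, a raw message count such as -c 5000 is accepted for SIBDestinations, while -c 150 is still rejected for Heap.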

Syntax issue

I am receiving this error when running the retrieval command:
except urllib2.HTTPError as error:
^SyntaxError: invalid syntax

Is this due to the python version?
Version 2.4.3 is running on this machine
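The "except ... as ..." syntax was introduced in Python 2.6, so Python 2.4.3 rejects it when compiling the file, and the README lists Python 2.7 or 3.X as a prerequisite. A small version check such as the hypothetical one below could report this more clearly; note that it would have to live in a separate wrapper script, because a file containing the newer syntax fails before any of its code runs.

```python
import sys

# Minimum interpreter version according to the README prerequisites.
MINIMUM_VERSION = (2, 7)


def is_supported(version_info=None):
    """Return True if the given (or running) interpreter is 2.7 or newer."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MINIMUM_VERSION


if not is_supported():
    sys.stderr.write("UNKNOWN - Python 2.7 or 3.X is required\n")
    sys.exit(3)
```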

issue with perfservermon.py when executed

I need to use perfservermon.py for monitoring my WAS parameters.

What I have done: installed and configured perfservlet.

Perfservlet is accessible.

But while running the perfservmon.py command from the Nagios server, I'm getting the error below:

/usr/local/nagios/libexec/perfservermon.py -C xxx retrieve -N server.example.com/wasPerfTool/servlet/perfservlet -P 9080 -u xxx -p xxx --ignorecert -H http

Traceback (most recent call last):
  File "/usr/local/nagios/libexec/perfservermon.py", line 801, in <module>
    username=arguments.Username, password=arguments.Password)
  File "/usr/local/nagios/libexec/perfservermon.py", line 658, in retrieveperfxml
    tree = parse(xmlfilename)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1182, in parse
    tree.parse(source, parser)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 656, in parse
    parser.feed(data)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1659, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1523, in _raiseerror
    raise err
xml.etree.ElementTree.ParseError: mismatched tag: line 35, column 2

python -V

Python 2.7.15+

Can you please tell me what I'm doing wrong?
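The report does not confirm a cause, but the README describes -N as an IP address or hostname, while the command above passes a hostname followed by a URL path. If the server then answers with an HTML page instead of the expected XML, a "mismatched tag" error is a plausible result. The sketch below, with hypothetical helper names, shows one way to reject such a -N value early and to turn a parse failure into a readable message.

```python
import xml.etree.ElementTree as ET


def validate_hostname(value):
    """Reject -N values that look like a URL or carry a path."""
    if "/" in value:
        raise ValueError("-N expects a bare hostname or IP address, got %r" % value)
    return value


def parse_perf_xml(body):
    """Parse a response body, replacing ParseError with a clearer ValueError."""
    try:
        return ET.fromstring(body)
    except ET.ParseError as error:
        raise ValueError(
            "Response is not well-formed XML (%s); "
            "check the host, port and protocol" % error)
```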

Handle Raised Exceptions

Raised Exceptions from parseperfxml() and other possible methods should be printed in Nagios Check Output.
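A minimal sketch of what this could look like: a wrapper runs a check function and converts any exception into a one-line UNKNOWN result with exit code 3, the standard Nagios code for UNKNOWN. The names run_check and main are hypothetical and are not part of the plugin.

```python
import sys

UNKNOWN = 3  # standard Nagios exit code for UNKNOWN


def run_check(check, *args, **kwargs):
    """Run check(); it must return (exit_code, message). Exceptions become UNKNOWN."""
    try:
        return check(*args, **kwargs)
    except Exception as error:
        return UNKNOWN, "UNKNOWN - %s: %s" % (type(error).__name__, error)


def main(check, *args, **kwargs):
    """Print the check message on one line and exit with the Nagios code."""
    code, message = run_check(check, *args, **kwargs)
    print(message)
    sys.exit(code)
```

Printing a single line matters because Nagios shows the first line of plugin output as the status text, whereas a raw traceback goes to stderr.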

SSL/Basic Auth support

I'm working in a DMZ where the Security and Compliance team has demanded that all traffic is encrypted, especially if credentials are passed.

So this plugin is amazing, and I will work on adding servlet monitoring def's as well. But the lack of SSL kinda blocks me from using the plugin :(

SIB Destinations in clusters

Hi Dimos

If you have a SIB with a cluster as member, then only one server in the cluster runs the engine at a time (except if choosing the "Scalability with HA" policy).

So as the script works currently, if you have 4 servers in a cluster, then you must configure a SIB destination check for each server, thus resulting in 3 UNKNOWN, and one OK message.

I suggest that we add a "cluster" switch to the SIB destination check, so that perfservmon doesn't return an UNKNOWN but "OK: inactive SIB engine" for cluster nodes that don't run the engine, as a way of saying "Yeah, yeah, I know there's no data, but it's expected" to Nagios.

Or do you have a better idea?
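The proposed switch could be sketched as below. This only illustrates the suggestion in this issue: the function name and the cluster parameter are hypothetical, and the message for the non-cluster case is an assumption.

```python
OK, UNKNOWN = 0, 3  # standard Nagios exit codes


def missing_sib_data_status(cluster=False):
    """Return (exit_code, message) for a server that reports no SIB data.

    With cluster=True the absence of data is treated as expected, because
    only one cluster member runs the messaging engine at a time.
    """
    if cluster:
        return OK, "OK: inactive SIB engine"
    return UNKNOWN, "UNKNOWN - no data for SIB destination"
```

With four cluster members this would give three OK results for the inactive engines and a regular threshold result for the active one.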

DbfilenameShelf error

Hi,
I am trying to test your plugin and get this error (parameter values are masked):

python perfservmon.py -C OPFCell retrieve -N xxxx -P xxxx -u xxxx -p xxxx
Traceback (most recent call last):
  File "perfservmon.py", line 804, in <module>
    parseperfxml(path=startingpath, cellname=arguments.CellName)
  File "perfservmon.py", line 466, in parseperfxml
    with shelve.open(shelvefilename, flag='c') as pfile:
AttributeError: DbfilenameShelf instance has no attribute '__exit__'

python --version

Python 2.7.5
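Shelf objects only gained context manager support in Python 3.4, so "with shelve.open(...)" fails on Python 2.7 with this AttributeError. A common workaround is contextlib.closing, which calls close() on exit and works on both Python 2.7 and 3.X. The sketch below illustrates that workaround; the helper names and the data layout are hypothetical, not the plugin's actual storage format.

```python
import contextlib
import os
import shelve
import tempfile


def save_metrics(shelve_path, metrics):
    """Store a dict of metrics in a shelve file, creating it if needed."""
    # closing() only needs a close() method, so it does not depend on
    # Shelf implementing __enter__/__exit__.
    with contextlib.closing(shelve.open(shelve_path, flag="c")) as pfile:
        for key, value in metrics.items():
            pfile[key] = value


def load_metric(shelve_path, key):
    """Read one entry back from the shelve file."""
    with contextlib.closing(shelve.open(shelve_path, flag="r")) as pfile:
        return pfile[key]


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo_cell")
    save_metrics(demo_path, {"server1": {"Heap": 42}})
    print(load_metric(demo_path, "server1"))
```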

I Have a problem -

I am running the script and getting the following error. I tried to find the problem, but to no avail.

python perfservmon.py -C MYCELL retrieve -N MYHOST -P 9080

Traceback (most recent call last):
  File "perfservmon.py", line 804, in <module>
    parseperfxml(path=startingpath, cellname=arguments.CellName)
  File "perfservmon.py", line 466, in parseperfxml
    with shelve.open(shelvefilename, flag='c') as pfile:
AttributeError: DbfilenameShelf instance has no attribute '__exit__'

python perfservmon.py -C wasbase show -n cell -s trnp300 -M Heap -c 90 -w 70
UNKNOWN - Error opening cached metrics file

AttributeError when trying to retrieve perf data from WAS

Hello Dimos,

When I run this command:
./perfservmon.py -C HP-ProBook-G5Node01Cell retrieve -N localhost -P 9443 -H https

I receive this message:
Traceback (most recent call last):
  File "/usr/lib/nagios/plugins/perfservmon.py", line 804, in <module>
    parseperfxml(path=startingpath, cellname=arguments.CellName)
  File "/usr/lib/nagios/plugins/perfservmon.py", line 466, in parseperfxml
    with shelve.open(shelvefilename, flag='c') as pfile:
AttributeError: DbfilenameShelf instance has no attribute '__exit__'

Afterwards I run this command:
./perfservmon.py -C HP-ProBook-G5Node01Cell show -n HP-ProBook-G5Node01 -s server1 -M WebContainer -c 90 -w 70

and the machine answers:
UNKNOWN - Error opening cached metrics file

Could you help me find what's wrong?
Thank you very much,
with best regards,
Mikhael
