Deprecation Announcement

This feature is deprecated and has been replaced by the JobScheduler Monitoring Interface (see JobScheduler Monitoring Interface - Overview). The JobScheduler Monitoring Interface integrates better with Nagios and requires no modifications to your jobs and job chains; for example, it provides recovery messages, performance checks and individual routing of job error messages to specific Nagios services. For the use of active checks, see How to perform active checks with a System Monitor such as Nagios/op5.

FEATURE AVAILABILITY ENDING WITH RELEASE 1.10

Related issue: JITL-143

Nagios is an Open Source network monitor that is available at http://www.nagios.org.

Installation

The Nagios integration has two parts:

  1. The Log Analyser. This job must run periodically in JobScheduler. It examines the JobScheduler main log and stores any error messages or warnings it finds in the JobScheduler database (table SCHEDULER_MESSAGES).
  2. The Nagios plugin. This Perl script queries the JobScheduler database for error messages and warnings.
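
The Log Analyser side of this data flow can be sketched as follows. This is a minimal illustration, not the actual JobSchedulerLogAnalyser implementation: the log line layout, the column names and the use of SQLite are all assumptions made for the example, and the real SCHEDULER_MESSAGES table may look different.

```python
import re
import sqlite3

# Hypothetical log line layout: "<timestamp> [LEVEL] (Job <name>) <text>".
LOG_LINE = re.compile(
    r"\[(?P<level>ERROR|WARN)\]\s+\(Job (?P<job>[^)]+)\)\s+(?P<text>.*)"
)


def create_table(conn):
    """Create a simplified stand-in for SCHEDULER_MESSAGES (assumed columns)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS SCHEDULER_MESSAGES ("
        " job TEXT, level TEXT, message TEXT, cnt INTEGER,"
        " PRIMARY KEY (job, level, message))"
    )


def analyse_log(conn, lines):
    """Store each error or warning; increment its counter if already known."""
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # not an error or warning line
        key = (match["job"], match["level"], match["text"].strip())
        updated = conn.execute(
            "UPDATE SCHEDULER_MESSAGES SET cnt = cnt + 1"
            " WHERE job = ? AND level = ? AND message = ?",
            key,
        ).rowcount
        if updated == 0:
            conn.execute(
                "INSERT INTO SCHEDULER_MESSAGES VALUES (?, ?, ?, 1)", key
            )
    conn.commit()


def read_messages(conn):
    """Return all stored messages, as the plugin side would query them."""
    return conn.execute(
        "SELECT job, level, message, cnt FROM SCHEDULER_MESSAGES"
        " ORDER BY job, level, message"
    ).fetchall()
```

Running `analyse_log` twice over the same error line leaves one row with a counter of 2, which is the value the plugin can later compare against a maximum.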

Installation of the nagios plugin

You need Perl > 5.8 and the Perl packages Net::HTTP and DBI. You can install these packages from http://www.cpan.

  1. Unzip JobScheduler_nagios.tar.gz to any folder.
  2. gzip -d nagios.tar.gz
  3. tar -xvf nagios.tar
  4. Copy the files ./nagios/bin/plugin sos.check_scheduler.pl and ./nagios/bin/SOSScheduler.pm to the plugin directory of your Nagios installation.
  5. Copy the config folder to the plugin directory of your Nagios installation.
  6. Create a file config/sos_settings.ini. You can use the example files in the config folder.
  7. Configure Nagios to use this plugin. To do so, add a service for each group of job chains or jobs that you want to monitor, and add the command definition for the plugin. You can use the file jobscheduler.cfg, which contains an example configuration. To include this file, add the line cfg_file=/usr/local/nagios/etc/jobscheduler.cfg to your nagios.cfg configuration file.
     
 define service {
 use                             generic-service
 host_name                       localhost
 service_description             SchedulerLog
 is_volatile                     0
 check_period                    24x7
 max_check_attempts              1
 normal_check_interval           1
 retry_check_interval            1
 contact_groups                  admins
 notification_options            w,u,c,r
 notification_interval           960
 notification_period             24x7
 check_command                   sos_check_scheduler!prodscheduler!4444!0!blacklist,test/job3!!
 active_checks_enabled           1
 passive_checks_enabled          1
 }
 # 'check_scheduler' command definition
 define command {
 command_name    sos_check_scheduler
 command_line    /home/nagios/sos_check_scheduler.pl -i $ARG1$ -H $HOSTADDRESS$ -p $ARG2$ -m $ARG3$ -j $ARG4$ -c $ARG5$
 }

Before restarting nagios, check your configuration with

 /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Installation of the Log Analyser Job

To install the Log Analyser, you have to copy the folder config/live/Nagios to your JobScheduler configuration directory. You can adjust the runtime of the job JobSchedulerLogAnalyser. The default runtime for analysing the logfile is every 5 minutes. The default for resetting all messages is every day at 11:00 pm and for deleting messages from database every Monday at 7:00 am.

 <?xml version="1.0" encoding="ISO-8859-1"?>
 <job title="Analyse Job Logfile">
     <script java_class="sos.scheduler.logMessage.JobSchedulerLogAnalyser"
             language="java"/>
     <run_time>
         <period absolute_repeat="00:05"
                 begin="00:00"
                 end="24:00"/>
     </run_time>
 </job> 

Parameter Description: http://www.sos-berlin.com/doc/JITL/JobSchedulerLogAnalyser.xml

Installation of the table SCHEDULER_MESSAGES

If you are running a JobScheduler version later than 1.3.10, the table SCHEDULER_MESSAGES is already installed. Otherwise, you will find the CREATE TABLE statement in the directory ./nagios/db/yourdb. Please create this table using your database client.

Testing your installation

  1. Execute the plugin in a shell, for example:
 perl ./sos_check_scheduler.pl -ischeduler_139 -Hur.sos -p4139 -m0 -j test/job1
  2. Make sure that the job Nagios/JobSchedulerLogAnalyser is running. You should see the job in JOC when opening host:port.
  3. Open your Nagios console. You should see the configured services.
  4. Check your Nagios configuration with
    /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
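
When testing the plugin by hand, it can help to look at its exit code as well as its output. The following sketch assumes that the plugin follows the standard Nagios plugin convention (exit code 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN); this page does not state which codes sos_check_scheduler.pl actually returns. The helper name `run_check` is hypothetical.

```python
import subprocess
import sys

# Standard Nagios plugin return codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(argv):
    """Run a plugin command and return (state name, first line of output)."""
    result = subprocess.run(argv, capture_output=True, text=True)
    # Any exit code outside 0..3 is treated as UNKNOWN here.
    state = STATES.get(result.returncode, "UNKNOWN")
    lines = result.stdout.splitlines()
    return state, (lines[0] if lines else "")


# Example call with the arguments used in step 1 (requires perl and the plugin):
# run_check(["perl", "./sos_check_scheduler.pl", "-ischeduler_139",
#            "-Hur.sos", "-p4139", "-m0", "-j", "test/job1"])
```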

How it works

You define the parameters for monitoring in the Nagios configuration file jobscheduler.cfg.

If you run several JobScheduler instances, or want to monitor different groups of jobs and job chains, you have to define one service for each JobScheduler instance or group. You cannot mix jobs and job chains in one group: configure one service for job chains and a separate one for jobs.

| Parameter | Long | Default | Description |
|---|---|---|---|
| -H | --hostname | | Name or IP address of the host on which JobScheduler is running |
| -p | --port | | Port that JobScheduler listens to |
| -m | --max | | optional: When the Log Analyser finds an entry in the log file, that message is stored in the database. If the Log Analyser finds the same message again, a counter for this message is incremented. With this parameter you can specify that only messages with a counter less than the given value are monitored. |
| -i | --scheduler_id | | optional: ID of the JobScheduler instance to monitor |
| -f | --file_configuration | | optional: You can specify all parameters in a configuration file. The name of the file is specified with this parameter. |
| -c | --job_chain | | optional: Defines a filter for the job chains to be monitored. The names of the job chains are given as a comma-separated list. |
| -j | --job | | optional: Defines a filter for the jobs to be monitored. The names of the jobs are given as a comma-separated list. |
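
The comma-separated job filter and the counter limit described in the parameter list can be illustrated with a short sketch. It is not taken from the plugin source: the `Message` type and both function names are hypothetical, and treating an empty filter as "all jobs" is an assumption. How the plugin interprets a limit of 0 is not documented on this page, so the sketch uses `None` to mean "no limit".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    job: str
    level: str   # "ERROR" or "WARN"
    text: str
    count: int   # how often the Log Analyser has seen this message


def parse_name_list(value: str) -> List[str]:
    """Split a comma-separated -j or -c value; empty input gives an empty list."""
    return [name.strip() for name in value.split(",") if name.strip()]


def select_messages(
    messages: List[Message],
    jobs: List[str],
    max_count: Optional[int] = None,
) -> List[Message]:
    """Keep messages of the listed jobs whose counter is less than max_count.

    Assumptions: an empty job list keeps all jobs, and max_count=None
    disables the counter filter.
    """
    selected = []
    for msg in messages:
        if jobs and msg.job not in jobs:
            continue
        if max_count is not None and msg.count >= max_count:
            continue
        selected.append(msg)
    return selected
```

For example, `select_messages(msgs, parse_name_list("blacklist,test/job3"), max_count=3)` keeps only messages of those two jobs that have been seen fewer than three times.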

Example

Test the jobs blacklist and test/job3

 define service {
 use                             generic-service                 
 host_name                       localhost
 service_description             SchedulerLog
 is_volatile                     0
 check_period                    24x7
 max_check_attempts              1
 normal_check_interval           1
 retry_check_interval            1
 contact_groups                  admins
 notification_options    w,u,c,r
 notification_interval           960
 notification_period             24x7
 check_command     sos_check_scheduler!prodscheduler!4444!0!blacklist,test/job3!!
 active_checks_enabled           1          
 passive_checks_enabled          1
}

Test job chain test/print_chain

 define service { 
 use                             generic-service                 
 host_name                       localhost
 service_description             SchedulerLog
 is_volatile                     0
 check_period                    24x7
 max_check_attempts              1
 normal_check_interval           1
 retry_check_interval            1
 contact_groups                  admins
 notification_options    w,u,c,r
 notification_interval           960
 notification_period             24x7
 check_command     sos_check_scheduler!prodscheduler!4444!0!!print_chain!
 active_checks_enabled           1          
 passive_checks_enabled          1
 }
 # 'check_scheduler' command definition
 define command {
 command_name    sos_check_scheduler
 command_line    /home/nagios/sos_check_scheduler.pl -i $ARG1$ -H $HOSTADDRESS$ -p $ARG2$ -m $ARG3$ -j $ARG4$ -c $ARG5$
}        
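
To see how a check_command value turns into the plugin call, it helps to expand the macros by hand. Nagios splits check_command at each "!" and substitutes the pieces for $ARG1$, $ARG2$ and so on in command_line. The sketch below mimics that substitution for illustration only: the function name is hypothetical, the host address 127.0.0.1 is an assumed value for host_name localhost, and escaped "\!" characters are not handled.

```python
def expand_check_command(check_command: str, command_line: str,
                         host_address: str) -> str:
    """Substitute $HOSTADDRESS$ and $ARGn$ roughly the way Nagios does."""
    # The first element is the command name, the rest are the arguments.
    _command_name, *args = check_command.split("!")
    expanded = command_line.replace("$HOSTADDRESS$", host_address)
    # Nagios supports $ARG1$ .. $ARG32$; missing arguments become empty.
    for n in range(1, 33):
        value = args[n - 1] if n <= len(args) else ""
        expanded = expanded.replace(f"$ARG{n}$", value)
    return expanded


COMMAND_LINE = (
    "/home/nagios/sos_check_scheduler.pl "
    "-i $ARG1$ -H $HOSTADDRESS$ -p $ARG2$ -m $ARG3$ -j $ARG4$ -c $ARG5$"
)

jobs_call = expand_check_command(
    "sos_check_scheduler!prodscheduler!4444!0!blacklist,test/job3!!",
    COMMAND_LINE,
    "127.0.0.1",
)
print(jobs_call)
```

In the job example the fifth argument is empty, so the call ends with "-c" followed by nothing; in the job chain example the fourth argument is empty and "-j" receives no value.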

Implementation

  • Nagios plugin: Reading error messages and warnings from the database. You can start the plugin in your shell, for example as follows:
     
 perl sos_check_scheduler.pl -H localhost -p 4444 -j jobname 
  • Job JobSchedulerLogAnalyser: Analysing log files and writing the error messages and warnings found into the database.
  • Job JobSchedulerLogAnalyserReset: Resetting all messages.
  • Job JobSchedulerLogAnalyserDelete: Deleting all messages.