
Scope

  • Nagios® is an Open Source System Monitor that is available at http://www.nagios.org. An ecosystem of more or less compatible System Monitors, e.g. op5®, has evolved that uses a similar design and configuration.
  • JobScheduler can be monitored with Plugins that check the availability of the daemon/service.

Active Checks and Passive Checks

  • Active Checks are initiated from the System Monitor server and are performed on a regular basis, e.g. every 5 minutes.
    • The only purpose of Active Checks is to verify the availability of the JobScheduler application. 
    • This type of check does not provide information about the execution status of individual jobs.
    • The System Monitor Plugins explained in this article are intended for Active Checks only.
  • Passive Checks are initiated from JobScheduler and report individual execution results to a System Monitor.
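
Active Checks rely on the Nagios plugin convention that a Plugin reports its result through its exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN) and a line of output. The following is a minimal sketch of how such a result could be interpreted. The helper names state_name and run_check are hypothetical, and the sh -c call is only a stand-in for a real Plugin.

```shell
# state_name <exit-code>: translate a plugin exit code into the Nagios state name
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "INVALID" ;;   # not a state defined by the plugin convention
    esac
}

# run_check <command> [args...]: run a plugin, then print its state and output
run_check() {
    code=0
    output=$("$@" 2>&1) || code=$?
    printf '%s - %s\n' "$(state_name "$code")" "$output"
    return "$code"
}

# Stand-in for a real plugin call such as check_jobscheduler.pl (assumption)
run_check sh -c 'echo "JobScheduler is running"; exit 0'
```

In a real setup the System Monitor performs this evaluation itself; the sketch only illustrates what the exit code of a Plugin means.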

Download and Installation of the Plugins

Download

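Four Plugins are available for download:

  • check_jobscheduler.pl
  • check_jobscheduler_with_joc.pl
  • check_jobscheduler_agent.pl
  • check_jobscheduler_agent_with_joc.pl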

Installation

Copy the files to the plugin directory of your System Monitor installation.

Configuration of the Plugins

Configuring the JobScheduler Master and Agent Plugins requires defining

  • a System Monitor Command that specifies the parameterization of the Plugin and
  • a System Monitor Service which is the visible point of control in the System Monitor console.

JobScheduler Master Plugin

Sample Configuration

Sample JobScheduler Master Command configuration
# 'check_jobscheduler' command definition
# Append "-u $ARG3$" to the command_line only if HTTP authentication is used.
define command {
    command_name                check_jobscheduler
    command_line                /opt/plugins/check_jobscheduler.pl -H $HOSTADDRESS$ -p $ARG1$ -t $ARG2$
}
Sample JobScheduler Master Service configuration
 define service {
 use                             generic-service                 
 host_name                       localhost
 service_description             JobScheduler
 is_volatile                     0
 check_period                    24x7
 max_check_attempts              1
 normal_check_interval           1
 retry_check_interval            1
 contact_groups                  admins
 notification_options            w,u,c,r
 notification_interval           960
 notification_period             24x7
 check_command     check_jobscheduler!homer.sos!4444!0!!
 active_checks_enabled           1          
 passive_checks_enabled          0  
}

Parameterization

Parameter          Default   Description
-H, --hostname               Name or IP address of the host (homer.sos) on which JobScheduler is running
-p, --port                   Port that JobScheduler listens to (4444)
-t, --timeout      30s       Timeout for establishing the connection to JobScheduler
-u, --user                   User and password for HTTP authentication

Example:

Example Active Check for JobScheduler Master
#JobScheduler running on a server 'test' using port 4444 and HTTP authentication (user and password 'test')
 
<path_to_plugins>/check_jobscheduler.pl -H test -p 4444 -t 30 -u test:test

Notes

  • Perl > 5.8 and the Perl package Net::HTTP are required. You can download and install Perl packages from e.g. http://www.cpan.

JobScheduler Cluster/Master Plugin via JOC Cockpit with HTTP(S)

Sample Configuration

Sample JobScheduler Master Command configuration
# 'check_jobscheduler_with_joc' command definition
define command {
    command_name                check_jobscheduler_with_joc
    command_line                /opt/plugins/check_jobscheduler_with_joc.pl -j $ARG1$ -i $ARG2$ -a $ARG3$ -t $ARG4$
}
Sample JobScheduler Master Service configuration
 define service {
 use                             generic-service                 
 host_name                       localhost
 service_description             JobSchedulerCluster
 is_volatile                     0
 check_period                    24x7
 max_check_attempts              1
 normal_check_interval           1
 retry_check_interval            1
 contact_groups                  admins
 notification_options            w,u,c,r
 notification_interval           960
 notification_period             24x7
 check_command     check_jobscheduler_with_joc!http://localhost:4446!myJobSchedulerId!test:test!5!
 active_checks_enabled           1          
 passive_checks_enabled          0  
}

Parameterization

Parameter          Default   Description
-j, --joc-url                URL of JOC Cockpit (http and https are supported)
-i, --id                     ID of a JobScheduler Cluster
-a, --account                Account for HTTP authentication to JOC Cockpit (=<user:password>)
-H, --hostname               Name or IP address of a JobScheduler Master host. Only required to check a specific cluster member
-p, --port                   HTTP port that JobScheduler listens to (40444). Only required to check a specific cluster member
-t, --timeout      30s       Timeout for establishing the connection to JOC Cockpit
-d, --detailed               If set and the Cluster has more than one Master and not all Masters are running, the message contains host:port of each Master

Example:

Example Active Check for JobScheduler Cluster with JOC Cockpit
#JobScheduler running with the id 'test' and JOC Cockpit has the url http://localhost:4446 where the account (user and password 'test') has access
 
<path_to_plugins>/check_jobscheduler_with_joc.pl -j http://localhost:4446 -i test -a test:test
Example Active Check for JobScheduler Master with JOC Cockpit
#JobScheduler Master running on host 'galadriel' and port 4444 with the id 'test' and JOC Cockpit has the url http://localhost:4446 where the account (user and password 'test') has access
 
<path_to_plugins>/check_jobscheduler_with_joc.pl -j http://localhost:4446 -i test -a test:test -H galadriel -p 4444

Notes

  • JOC Cockpit release 1.11.4 or later is required.
  • Perl > 5.8 and the following Perl packages are required. You can download and install Perl packages from e.g. http://www.cpan.

    • HTTP::Request

    • LWP::UserAgent

    • LWP::Protocol::https (if https is used to connect to JOC Cockpit)
    • JSON
    • MIME::Base64

JobScheduler Agent Plugin

A service Command has to be declared before configuring the System Monitor Service that makes use of this Command.

Sample Configuration

Sample JobScheduler Agent command configuration
# 'check_jobscheduler_agent' command definition
define command {
    command_name                check_jobscheduler_agent
    command_line                /opt/plugins/check_jobscheduler_agent.pl -u $ARG1$ -a $ARG2$ -o $ARG3$ -t $ARG4$
}

When configuring the System Monitor Service, the Command parameters have to be specified in the check_command setting, separated by exclamation marks, e.g.

Sample JobScheduler Agent Command parameterization
http://galadriel.sos:4445/jobscheduler/agent/api/overview!'{totalTaskCount},{currentTaskCount}'!'{startedAt},{totalTaskCount},{currentTaskCount},{isTerminating}'!20
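
The way the System Monitor maps such a parameter string to the $ARG1$ ... $ARGn$ macros of the Command can be sketched as follows. The function name expand_args is hypothetical, and the sketch only mimics the splitting at exclamation marks; it is not how the System Monitor is implemented.

```shell
# expand_args <check_command value>: print the command name and each $ARGn$ value
expand_args() {
    old_ifs=$IFS
    IFS='!'
    set -f                 # disable globbing while the value is split
    set -- $1              # split at exclamation marks into positional parameters
    set +f
    IFS=$old_ifs
    printf 'command_name=%s\n' "$1"
    shift
    n=1
    for arg in "$@"; do
        printf '$ARG%s$=%s\n' "$n" "$arg"
        n=$((n + 1))
    done
}

# The sample parameterization from above, prefixed with the command name
expand_args "check_jobscheduler_agent!http://galadriel.sos:4445/jobscheduler/agent/api/overview!'{totalTaskCount},{currentTaskCount}'!'{startedAt},{totalTaskCount},{currentTaskCount},{isTerminating}'!20"
```

With this input the URL ends up in $ARG1$, the two attribute lists in $ARG2$ and $ARG3$, and the timeout of 20 in $ARG4$, which matches the order -u, -a, -o, -t of the command_line.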

Parameterization

Parameter            Default   Description
-u, --url                      URL for the HTTP connection that addresses the Agent: host (galadriel.sos) and port (4445) that the Agent is listening to, followed by a fixed path (/jobscheduler/agent/api/overview)
-a, --attributes               List of attributes that are used to check the Agent availability. The specified attributes are recommended to check if the Agent answer is formally correct
-o, --outputvars               List of attributes that are used for output of the script and that will be displayed in the System Monitor
-t, --timeout        15s       Timeout for establishing the connection to the Agent

Attributes used in the sample configuration:

  • startedAt: point in time when the Agent was started.
  • totalTaskCount: total number of jobs that have been executed during the lifetime of the Agent.
  • currentTaskCount: number of jobs that are currently executed by the Agent.
  • isTerminating: indicates whether the Agent is currently shutting down. Depending on the shutdown command that has been used, an Agent will let currently running jobs complete before terminating.
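
The idea behind the --attributes check can be sketched as follows, assuming that the Agent answers with a JSON object that contains the attribute names as keys. The function name has_attributes and the sample answer are hypothetical, and the textual test is a deliberate simplification, not the Plugin's actual implementation.

```shell
# has_attributes <json-text> <attribute>...: succeed only if every attribute occurs as a key
has_attributes() {
    json=$1
    shift
    for attr in "$@"; do
        # crude textual test for "attr": this is not a real JSON parser
        printf '%s' "$json" | grep -q "\"$attr\"[[:space:]]*:" || return 1
    done
    return 0
}

# Hypothetical Agent answer, for illustration only
sample='{"startedAt":"2017-05-01T10:00:00Z","totalTaskCount":42,"currentTaskCount":1,"isTerminating":false}'

if has_attributes "$sample" totalTaskCount currentTaskCount; then
    echo "OK - Agent answer contains the expected attributes"
else
    echo "CRITICAL - Agent answer is incomplete"
fi
```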

Notes

  • Use HTTPS instead of HTTP in the URL if the Agent is configured for secure HTTPS communication.
  • Perl > 5.8 and the following Perl packages are required. You can download and install Perl packages from e.g. http://www.cpan

    • HTTP::Request::Common

    • LWP::UserAgent

    • JSON
    • Data::Dumper

       

JobScheduler Agent Plugin via JOC Cockpit with HTTP(S)

A service Command has to be declared before configuring the System Monitor Service that makes use of this Command.

Sample Configuration

Sample JobScheduler Agent command configuration with JOC Cockpit
# 'check_jobscheduler_agent_with_joc' command definition
define command {
    command_name                check_jobscheduler_agent_with_joc
    command_line                /opt/plugins/check_jobscheduler_agent_with_joc.pl -j $ARG1$ -i $ARG2$ -a $ARG3$ -t $ARG4$
}
define service {
 use                             generic-service                 
 host_name                       localhost
 service_description             JobSchedulerAgents
 is_volatile                     0
 check_period                    24x7
 max_check_attempts              1
 normal_check_interval           1
 retry_check_interval            1
 contact_groups                  admins
 notification_options            w,u,c,r
 notification_interval           960
 notification_period             24x7
 check_command     check_jobscheduler_agent_with_joc!http://localhost:4446!myJobSchedulerId!test:test!5!
 active_checks_enabled           1          
 passive_checks_enabled          0  
}

 

Parameterization

Parameter          Default   Description
-j, --joc-url                URL of JOC Cockpit (http and https are supported)
-i, --id                     ID of a JobScheduler Cluster
-a, --account                Account for HTTP authentication to JOC Cockpit (=<user:password>)
-A, --agent                  URL of an Agent, optional, can be specified several times
-t, --timeout      30s       Timeout for establishing the connection to JOC Cockpit
-d, --detailed               If set and the Cluster has more than one Master and not all Masters are running, the message contains host:port of each Master

Example:

Example Active Check for all JobScheduler Agents with JOC Cockpit
#JobScheduler running with the id 'test' and JOC Cockpit has the url http://localhost:4446 where the account (user and password 'test') has access
 
<path_to_plugins>/check_jobscheduler_agent_with_joc.pl -j http://localhost:4446 -i test -a test:test
Example Active Check for some JobScheduler Agents with JOC Cockpit
#JobScheduler running with the id 'test' and JOC Cockpit has the url http://localhost:4446 where the account (user and password 'test') has access
 
<path_to_plugins>/check_jobscheduler_agent_with_joc.pl -j http://localhost:4446 -i test -a test:test -A http://galadriel:4445 -A http://galadriel:4455 -d
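
Because -A can be specified several times, the option list for a larger number of Agents can be assembled from a list of URLs. The following sketch assumes a hypothetical helper build_agent_options and only prints the resulting command line instead of executing it.

```shell
# build_agent_options <agent-url>...: print one "-A <url>" option per Agent URL
build_agent_options() {
    opts=""
    for url in "$@"; do
        opts="$opts -A $url"
    done
    printf '%s\n' "${opts# }"   # strip the leading blank
}

agent_opts=$(build_agent_options http://galadriel:4445 http://galadriel:4455)
echo "check_jobscheduler_agent_with_joc.pl -j http://localhost:4446 -i test -a test:test $agent_opts -d"
```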

Notes

  • JOC Cockpit release 1.11.4 or later is required.
  • Perl > 5.8 and the following Perl packages are required. You can download and install Perl packages from e.g. http://www.cpan.

    • HTTP::Request

    • LWP::UserAgent

    • LWP::Protocol::https (if https is used to connect to JOC Cockpit)
    • JSON
    • MIME::Base64
    • Cwd

Report Script per Agent

  • check_jobscheduler_agent_with_joc.pl calls an optional script report_jobscheduler_agent.pl per Agent.
  • This script does not connect to JOC Cockpit or perform any checks but simply serves to create individual notifications per Agent to the System Monitor.
  • The script is executed only if it is located in the same directory as the calling script and is executable.
  • The script is parameterized to transfer the message type and notification to the System Monitor.
    • report_jobscheduler_agent.pl <joc-cockpit-url> <scheduler-id> <agent-url> <agent-status>
    • <joc-cockpit-url> is the URL that has been specified as a parameter to the script check_jobscheduler_agent_with_joc.pl
    • <scheduler-id> is the JobScheduler Master ID that has been specified as a parameter to the script check_jobscheduler_agent_with_joc.pl
    • <agent-url> is the URL identifying the Agent
    • <agent-status> is one of "RUNNING", "UNREACHABLE", "TERMINATING", "UNKNOWN_AGENT"

      Example report_jobscheduler_agent.pl
      #!/usr/bin/env perl
      use strict;
      use warnings;
       
      my ($jocUrl, $schedulerId, $agentUrl, $agentStateText) = @ARGV;
      # do something with $jocUrl, $schedulerId, $agentUrl, $agentStateText
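
What a report script might do with these parameters can be sketched as follows: the Agent status is mapped to a Nagios return code and formatted as a passive service check result in the Nagios external command syntax (PROCESS_SERVICE_CHECK_RESULT). The mapping of status values to return codes, the service description "Agent <agent-url>" and the function names are assumptions for illustration and are not prescribed by the Plugin.

```shell
# status_to_code <agent-status>: map an Agent status to a Nagios return code (assumed mapping)
status_to_code() {
    case "$1" in
        RUNNING)     echo 0 ;;   # OK
        TERMINATING) echo 1 ;;   # WARNING
        UNREACHABLE) echo 2 ;;   # CRITICAL
        *)           echo 3 ;;   # UNKNOWN, includes UNKNOWN_AGENT
    esac
}

# passive_result <epoch-seconds> <host> <agent-url> <agent-status>: print one external command line
passive_result() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;Agent %s;%s;Agent is %s\n' \
        "$1" "$2" "$3" "$(status_to_code "$4")" "$4"
}

passive_result "$(date +%s)" localhost http://galadriel:4445 UNREACHABLE
```

To submit the result, the printed line would have to be appended to the external command file of the System Monitor, whose location depends on the installation.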

See also

  • JS-684
  • JS-1480
  • JS-1715
  • JOC-579
