Scope

  • Nagios® is an Open Source System Monitor available from http://www.nagios.org. An ecosystem of largely compatible System Monitors with a similar design and configuration, e.g. op5®, has evolved around it.
  • JobScheduler can be monitored with Plugins that check the availability of the daemon/service.

...

Code Block
language: bash
title: Sample JobScheduler Master Service configuration
define service {
 use                             generic-service
 host_name                       localhost
 service_description             JobScheduler
 is_volatile                     0
 check_period                    24x7
 max_check_attempts              1
 normal_check_interval           1
 retry_check_interval            1
 contact_groups                  admins
 notification_options            w,u,c,r
 notification_interval           960
 notification_period             24x7
 check_command                   check_jobscheduler!homer.sos!4444!0!!
 active_checks_enabled           1
 passive_checks_enabled          0
}
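
Nagios separates the command name and its arguments in check_command with '!' and hands the arguments to the command definition as $ARG1$, $ARG2$ and so on. The following is a minimal sketch of that split, written only to make the sample value easier to read. The function name split_check_command is hypothetical, and how each position maps to a plugin option depends on the command definition, which is not shown here.

```shell
# Split a check_command value into the command name and its numbered arguments.
split_check_command() {
  # Append a trailing '!' so that the last field is terminated like the others.
  rest="$1!"
  index=0
  while [ -n "$rest" ]; do
    part="${rest%%!*}"   # text before the first '!'
    rest="${rest#*!}"    # text after the first '!'
    if [ "$index" -eq 0 ]; then
      printf 'command=%s\n' "$part"
    else
      printf 'ARG%d=%s\n' "$index" "$part"
    fi
    index=$((index + 1))
  done
}

# Value taken from the sample service configuration.
split_check_command 'check_jobscheduler!homer.sos!4444!0!!'
```

Empty positions such as the trailing ones in the sample are listed as empty arguments.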

Parameterization

Parameter         Default   Description
-H, --hostname              Name or IP address of the host (homer.sos) on which JobScheduler is running
-p, --port                  Port that JobScheduler listens to (4444)
-t, --timeout     30s       Timeout for establishing the connection to JobScheduler
-u, --user                  User and password for HTTP authentication

Example:

Code Block
language: bash
title: Example Active Check for JobScheduler Master
# JobScheduler running on a server 'test' using port 4444 and HTTP authentication (user and password 'test')

<path_to_plugins>/check_jobscheduler.pl -H test -p 4444 -t 30 -u test:test
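
Nagios interprets the exit code of a plugin as the service state: 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN. When testing a check by hand it can help to see that state next to the plugin output. The sketch below assumes that check_jobscheduler.pl follows this convention; the helper names nagios_state_name and run_check are hypothetical.

```shell
# Translate a plugin exit code into the Nagios state name.
nagios_state_name() {
  case "$1" in
    0) echo "OK" ;;
    1) echo "WARNING" ;;
    2) echo "CRITICAL" ;;
    *) echo "UNKNOWN" ;;   # 3 and any unexpected code
  esac
}

# Run a check command, print its state and first output line,
# and return the original exit code.
run_check() {
  output="$("$@" 2>&1)"
  status=$?
  first_line="$(printf '%s\n' "$output" | head -n 1)"
  printf '%s: %s\n' "$(nagios_state_name "$status")" "$first_line"
  return "$status"
}

# Manual test, with the plugin path filled in for the local installation:
# run_check <path_to_plugins>/check_jobscheduler.pl -H test -p 4444 -t 30 -u test:test
```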

...

Code Block
language: bash
title: Sample JobScheduler Master Service configuration
define service {
 use                             generic-service
 host_name                       localhost
 service_description             JobSchedulerCluster
 is_volatile                     0
 check_period                    24x7
 max_check_attempts              1
 normal_check_interval           1
 retry_check_interval            1
 contact_groups                  admins
 notification_options            w,u,c,r
 notification_interval           960
 notification_period             24x7
 check_command                   check_jobscheduler_with_joc!http://localhost:4446!myJobSchedulerId!test:test!5!
 active_checks_enabled           1
 passive_checks_enabled          0
}

Parameterization

Parameter         Default   Description
-j, --joc-url               URL of JOC Cockpit (http and https are supported)
-i, --id                    ID of a JobScheduler Cluster
-a, --account               Account for HTTP authentication to JOC Cockpit (=<user:password>)
-H, --hostname              Name or IP address of a JobScheduler Master host. Only required to check a specific cluster member
-p, --port                  HTTP port that JobScheduler listens to (40444). Only required to check a specific cluster member
-t, --timeout     30s       Timeout for establishing the connection to JOC Cockpit
-d, --detailed              If set, and the Cluster has more than one Master of which not all are running, the message contains host:port of each Master

Example:

Code Block
language: bash
title: Example Active Check for JobScheduler Cluster with JOC Cockpit
# JobScheduler running with the ID 'test'; JOC Cockpit has the URL http://localhost:4446 and the account (user and password 'test') has access

<path_to_plugins>/check_jobscheduler_with_joc.pl -j http://localhost:4446 -i test -a test:test

...

Code Block
language: bash
title: Sample JobScheduler Agent Command parameterization
http://galadriel.sos:4445/jobscheduler/agent/api/overview!'{totalTaskCount},{currentTaskCount}'!'{startedAt},{totalTaskCount},{currentTaskCount},{isTerminating}'!20

Parameterization

Parameter            Default   Description
-u, --url                      URL for the HTTP connection to the Agent, made up of the host (galadriel.sos) and the port (4445) that the Agent listens to, followed by the fixed path /jobscheduler/agent/api/overview
-a, --attributes               List of attributes that are used to check the Agent availability. The attributes specified in the sample are recommended for checking that the Agent's answer is formally correct
-o, --outputvars               List of attributes that are used for the output of the script and that are displayed in the System Monitor
-t, --timeout        15s       Timeout for establishing the connection to the Agent

The attributes used in the sample have the following meaning:

  • startedAt: point in time when the Agent was started.
  • totalTaskCount: total number of jobs that have been executed during the lifetime of the Agent.
  • currentTaskCount: number of jobs that are currently being executed by the Agent.
  • isTerminating: indicates whether the Agent is currently shutting down. Depending on the shutdown command used, an Agent lets currently running jobs complete before terminating.
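
The role of the --attributes list can be sketched as a test that every listed attribute is present in the Agent's answer. The example below is a rough shell stand-in, not the plugin's Perl implementation. It assumes that the overview URL returns JSON; the sample answer is invented for illustration, and fetch_overview and has_attributes are hypothetical helper names.

```shell
# Fetch the Agent overview. Defined for illustration only and not called here.
# $1: Agent URL, $2: connection timeout in seconds
fetch_overview() {
  curl --silent --fail --connect-timeout "$2" "$1"
}

# Succeed only if every attribute in the list occurs as a key in the JSON text.
# $1: JSON text, $2: attribute list such as '{totalTaskCount},{currentTaskCount}'
has_attributes() {
  json="$1"
  # Turn '{a},{b}' into 'a b'.
  names="$(printf '%s' "$2" | tr -d '{}' | tr ',' ' ')"
  for name in $names; do
    printf '%s' "$json" | grep -q "\"$name\"[[:space:]]*:" || return 1
  done
  return 0
}

# Invented sample answer; the real Agent answer may look different.
sample='{"startedAt":"2016-01-01T00:00:00Z","totalTaskCount":12,"currentTaskCount":1,"isTerminating":false}'

if has_attributes "$sample" '{totalTaskCount},{currentTaskCount}'; then
  echo "OK: Agent answer contains the expected attributes"
else
  echo "CRITICAL: Agent answer is incomplete"
fi
```

Matching key names with grep is deliberately crude; a JSON parser would be the more robust choice for anything beyond a quick manual test.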

Notes

  • Consider using the HTTPS protocol instead of HTTP if the Agent is configured for secure HTTPS communication.
  • Perl > 5.8 and the following Perl packages are required. Perl packages can be downloaded and installed from e.g. http://www.cpan
    • HTTP::Request::Common
    • LWP::UserAgent
    • JSON
    • Data::Dumper
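
Whether the listed Perl packages are installed can be checked before the plugin is deployed. The sketch below relies on the standard perl -M switch, which fails if a module cannot be loaded; the helper name has_perl_module is hypothetical.

```shell
# Succeed if the given Perl module can be loaded by the perl found in PATH.
has_perl_module() {
  perl -M"$1" -e 1 >/dev/null 2>&1
}

# Report the status of each package named in the notes above.
for module in HTTP::Request::Common LWP::UserAgent JSON Data::Dumper; do
  if has_perl_module "$module"; then
    echo "found:   $module"
  else
    echo "missing: $module"
  fi
done
```

A module reported as missing can also mean that perl itself is not available in PATH.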

       


JobScheduler Agent Plugin via JOC Cockpit with HTTP(S)

...

Code Block
define service {
 use                             generic-service
 host_name                       localhost
 service_description             JobSchedulerAgents
 is_volatile                     0
 check_period                    24x7
 max_check_attempts              1
 normal_check_interval           1
 retry_check_interval            1
 contact_groups                  admins
 notification_options            w,u,c,r
 notification_interval           960
 notification_period             24x7
 check_command                   check_jobscheduler_agent_with_joc!http://localhost:4446!myJobSchedulerId!test:test!5!
 active_checks_enabled           1
 passive_checks_enabled          0
}

...


Parameterization

Parameter         Default   Description
-j, --joc-url               URL of JOC Cockpit (http and https are supported)
-i, --id                    ID of a JobScheduler Cluster
-a, --account               Account for HTTP authentication to JOC Cockpit (=<user:password>)
-A, --agent                 URL of an Agent; optional, can be specified several times
-t, --timeout     30s       Timeout for establishing the connection to JOC Cockpit
-d, --detailed              If set, and the Cluster has more than one Master of which not all are running, the message contains host:port of each Master

Example:

Code Block
language: bash
title: Example Active Check for all JobScheduler Agents with JOC Cockpit
# JobScheduler running with the ID 'test'; JOC Cockpit has the URL http://localhost:4446 and the account (user and password 'test') has access

<path_to_plugins>/check_jobscheduler_agent_with_joc.pl -j http://localhost:4446 -i test -a test:test
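
Because --agent can be specified several times, a check limited to selected Agents repeats the -A option once per Agent URL. The following sketch builds such a command line from a list of URLs. The wrapper name agent_check is hypothetical, and the Agent URLs in the usage comment are assumed values, not taken from a real installation.

```shell
# Call the plugin with one -A option per Agent URL.
# $1: plugin path, $2: JOC Cockpit URL, $3: JobScheduler ID, $4: account,
# remaining arguments: Agent URLs (may be empty)
agent_check() {
  plugin="$1"; joc_url="$2"; scheduler_id="$3"; account="$4"
  shift 4
  # Rewrite the remaining arguments from 'url1 url2' to '-A url1 -A url2'.
  count=$#
  while [ "$count" -gt 0 ]; do
    set -- "$@" -A "$1"
    shift
    count=$((count - 1))
  done
  "$plugin" -j "$joc_url" -i "$scheduler_id" -a "$account" "$@"
}

# Usage, with the plugin path filled in for the local installation:
# agent_check <path_to_plugins>/check_jobscheduler_agent_with_joc.pl \
#   http://localhost:4446 test test:test http://agent1:4445 http://agent2:4445
```

Without any Agent URL the wrapper issues the same call as the example above.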

...

  • check_jobscheduler_agent_with_joc.pl calls an optional script report_jobscheduler_agent.pl once per Agent.
  • This script does not connect to JOC Cockpit or perform any checks; it simply serves to create individual notifications per Agent for the System Monitor.
  • The script is executed if it is located in the same directory as the calling script. It has to be executable.
  • The script is parameterized to transfer the message type and notification to the System Monitor.
    • report_jobscheduler_agent.pl <joc-cockpit-url> <scheduler-id> <agent-url> <agent-status>
    • <joc-cockpit-url> is the URL that has been specified as a parameter to the script check_jobscheduler_agent_with_joc.pl
    • <scheduler-id> is the JobScheduler Master ID that has been specified as a parameter to the script check_jobscheduler_agent_with_joc.pl
    • <agent-url> is the URL identifying the Agent
    • <agent-status> is one of "RUNNING", "UNREACHABLE", "TERMINATING", "UNKNOWN_AGENT"

      Code Block
      language: perl
      title: Example report_jobscheduler_agent.pl
      #!/usr/bin/env perl
      use strict;
      use warnings;

      # Arguments arrive in the order <joc-cockpit-url> <scheduler-id> <agent-url> <agent-status>
      my ($jocUrl, $schedulerId, $agentUrl, $agentStateText) = @ARGV;
      # do something with $jocUrl, $schedulerId, $agentUrl, $agentStateText
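
What "do something" could look like is sketched below in shell, to stay with the language of the other command examples; the optional script itself has to be the Perl file named above. The mapping from Agent status to a Nagios state code is an assumption made for this sketch, and agent_status_to_state and format_agent_report are hypothetical names.

```shell
# Map the documented <agent-status> values to Nagios state codes.
# The mapping itself is an assumption, not taken from the plugin.
agent_status_to_state() {
  case "$1" in
    RUNNING)     echo 0 ;;  # OK
    TERMINATING) echo 1 ;;  # WARNING
    UNREACHABLE) echo 2 ;;  # CRITICAL
    *)           echo 3 ;;  # UNKNOWN, includes UNKNOWN_AGENT
  esac
}

# Build one report line from the four documented arguments.
# $1: JOC Cockpit URL, $2: JobScheduler ID, $3: Agent URL, $4: Agent status
format_agent_report() {
  printf 'state=%s scheduler=%s agent=%s status=%s joc=%s\n' \
    "$(agent_status_to_state "$4")" "$2" "$3" "$4" "$1"
}

format_agent_report http://localhost:4446 test http://galadriel.sos:4445 UNREACHABLE
```

How such a line reaches the System Monitor depends on the local setup and is left open here.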

...
