Configuration

JobScheduler - SystemMonitorNotification files

Location: <scheduler_install>/config/notification

Files and their descriptions:

SystemMonitorNotification_v1.0.xsd

The XML Schema file that defines which elements and values are allowed in the XML configuration files for JobScheduler monitoring.

You do not need to modify the XML Schema. To configure the JobScheduler objects you want to monitor and the System Monitor you want to use, edit only your SystemMonitorNotification_<MonitorSystem>.xml files.

SystemMonitorNotification_<MonitorSystem>.xml

Configuration file for each System Monitor.

  • Specifies how notifications are delivered to the System Monitor.
  • Specifies notifications for error or success conditions.
  • Specifies notifications that measure the performance of JobScheduler objects.

SystemMonitorNotificationTimers.xml

Configuration file for all System Monitors.

  • Specifies notifications that measure the performance of JobScheduler objects.

This file is optional and contains the definitions of the SystemMonitorNotification / Timer elements.

 

SystemMonitorNotification Elements

The configuration element descriptions are organized into the following major categories:

Element | Element description | Description
SystemMonitorNotification | Top-level element | Configuration for notifications to a System Monitor
Notification | One or more inside a SystemMonitorNotification element | Specifies a System Monitor notification that includes a command line invocation and the JobScheduler objects
Timer | Optional, one or more inside a SystemMonitorNotification element | Performance measurement definition

SystemMonitorNotification

SystemMonitorNotification supports the following attributes:

...

Example:
<SystemMonitorNotification system_id="OP5">
...


SystemMonitorNotification / Notification

The following elements may be nested inside a Notification element:

Element | Element description | Description
NotificationMonitor | Once inside a Notification element | Specifies the System Monitor interface used for messages: either a plug-in interface or a command line invocation
NotificationObjects | Once inside a Notification element | Specifies the Job Chain and the Timer definitions

SystemMonitorNotification / Notification / NotificationMonitor

NotificationMonitor supports the following attributes:

...

Element | Element description | Description
NotificationInterface | Optional, once inside a NotificationMonitor element | Plug-in interface to be executed for System Monitor notification
NotificationCommand | Optional, once inside a NotificationMonitor element | Command line to be executed for System Monitor notification

 

SystemMonitorNotification / Notification / NotificationMonitor / NotificationInterface

NotificationInterface supports the following attributes:

...

Example:
...
<NotificationInterface monitor_host="monitor_host" monitor_port="5667" monitor_encryption="XOR" service_host="service_host"><![CDATA[
scheduler id=%MON_N_SCHEDULER_ID%, history id=%MON_N_ORDER_HISTORY_ID%, job_chain=%MON_N_JOB_CHAIN_NAME%(%MON_N_ORDER_ID%), step =%MON_N_ORDER_STEP_STATE%, error=%MON_N_ERROR_TEXT%
]]></NotificationInterface>
...
SystemMonitorNotification / Notification / NotificationMonitor / NotificationCommand

NotificationCommand supports the following attributes:

...

Example:
...
<NotificationCommand><![CDATA[
echo scheduler id=%MON_N_SCHEDULER_ID%, history id=%MON_N_ORDER_HISTORY_ID%, job_chain=%MON_N_JOB_CHAIN_NAME%(%MON_N_ORDER_ID%), step =%MON_N_ORDER_STEP_STATE%, error=%MON_N_ERROR_TEXT% > D://errors.txt
]]></NotificationCommand>
...
SystemMonitorNotification / Notification / NotificationObjects

One of the following elements must be nested inside a NotificationObjects element:

...

Example:
<SystemMonitorNotification system_id="OP5">
  <Notification>
    <NotificationMonitor service_name_on_error="JobScheduler Monitoring Errors">
      ...
    </NotificationMonitor>
    <NotificationObjects>
      <!-- Send errors that occur in the "test/my_jobchain" job chain to the "JobScheduler Monitoring Errors" service. -->
      <JobChain name="test/my_jobchain" />
    </NotificationObjects>
  </Notification>
</SystemMonitorNotification>

 

SystemMonitorNotification / Notification / NotificationObjects / JobChain

JobChain supports the following attributes:

...

Example:
...
<JobChain notifications="2" name="test/my_jobchain"/>
...
<JobChain scheduler_id="scheduler_4444" />
...
<JobChain scheduler_id="scheduler_4444" name="^(test/my)" />
...
<JobChain name="test/my_jobchain" step_from="200"/>
...
<JobChain name="test/my_jobchain" step_to="500"/>
...
<JobChain name="test/my_jobchain" step_from="300" step_to="300"/>
...
<JobChain name="test/my_jobchain" excluded_steps="200;300"/>
...

 

SystemMonitorNotification / Notification / NotificationObjects / Timer

Timer supports the following attributes:

...

Example:
<SystemMonitorNotification system_id="OP5">
  <Notification>
    <NotificationMonitor service_name_on_error="JobScheduler Monitoring Errors">
      ...
    </NotificationMonitor>
    <NotificationObjects>
      <!--
      Send errors that occur in the "test/my_jobchain" job chain to the "JobScheduler Monitoring Errors" service.
      -->
      <JobChain name="test/my_jobchain" />
    </NotificationObjects>
  </Notification>

  <Notification>
    <NotificationMonitor service_name_on_error="JobScheduler Monitoring Performance">
      ...
    </NotificationMonitor>
    <NotificationObjects>
      <!--
      Send performance check errors for the "test/my_jobchain" job chain to the "JobScheduler Monitoring Performance" service.
      The performance check error is not sent to the "JobScheduler Monitoring Performance" service
      when the "test/my_jobchain" job chain itself has a job chain error (default notify_on_error = false).
      -->
      <Timer name="my_timer" />
    </NotificationObjects>
  </Notification>

  <Timer name="my_timer">
    <JobChain name="test/my_jobchain" />
  </Timer>
</SystemMonitorNotification>

 

SystemMonitorNotification / Timer 

The following elements must be nested inside a Timer element:

...

Example:
...
<Timer name="my_timer">
... 

 

SystemMonitorNotification / Timer / JobChain

JobChain supports the following attributes:

...

Example:
...
<JobChain scheduler_id="scheduler_4444" /> 
... 
<JobChain scheduler_id="scheduler_4444" name="^(test/my)" /> 
... 
<JobChain name="test/my_jobchain" step_from="200"/> 
... 
<JobChain name="test/my_jobchain" step_to="500"/> 
... 
<JobChain name="test/my_jobchain" step_from="300" step_to="300"/>
...

 

SystemMonitorNotification / Timer / Minimum

The following elements must be nested inside a Minimum element:

...

Example:
...
<Timer name="my_timer">
  ...
  <Minimum><Script language="javascript"><![CDATA[1000]]></Script></Minimum>
</Timer>
... 

 

SystemMonitorNotification / Timer / Maximum

The following elements must be nested inside a Maximum element:

...

Example:
...
<Timer name="my_timer">
  ...
  <Maximum><Script language="javascript"><![CDATA[1000]]></Script></Maximum>
</Timer>
... 

 

SystemMonitorNotification / Timer / Minimum|Maximum / Script

Script supports the following attributes:

...

  • fixed value
  • calculation based on the job/order parameters

Fixed value

The fixed value is the duration in seconds for the specific Minimum or Maximum definition.

Example (fixed value):
...
  <Script language="javascript"><![CDATA[1000]]></Script>
...
Calculation

The calculation result is the time in seconds for the specific Minimum or Maximum definition.

...

Example (job):
<?xml version="1.0" encoding="ISO-8859-1"?>
<job title="Sample Job with Store Result Monitor" order="yes" stop_on_error="no" tasks="1">
  <params>
    <!-- set the scheduler_notification_result_parameters parameter -->
    <param name="scheduler_notification_result_parameters" value="file_size"/>
  </params>

  <!-- calculate and create the new order parameter if necessary -->
  <script language="javascript"><![CDATA[
    function spooler_process() {
      var order  = spooler_task.order;
      var params = spooler.create_variable_set();
      params.merge(spooler_task.params);
      params.merge(order.params);

      // parameter scheduler_file_path was set in the previous job chain step
      var file     = new java.io.File(params.value("scheduler_file_path"));
      var fileSize = file.length() / 1024; // file size in KB
      order.params.set_var("file_size", fileSize.toString());
      return true;
    }
  ]]></script>

  <!-- set the com.sos.scheduler.notification.jobs.result.StoreResultsJobJSAdapterClass as monitor -->
  <monitor name="notification_monitor" ordering="1">
    <script java_class="com.sos.scheduler.notification.jobs.result.StoreResultsJobJSAdapterClass" language="java"/>
  </monitor>

  <run_time />
</job>

Message

Usage

The Message can be configured as a CDATA element on the following parent nodes:

...

Example: <![CDATA[ scheduler id = %MON_N_SCHEDULER_ID%  ]]>

Variables

All variables must be referenced using the %<variable name>% syntax.

...

  1. Table variables.
  2. Service variables.
  3. OS environment variables. 

Table variables

Variables: table SCHEDULER_MON_NOTIFICATIONS

Table of the history of steps of processed orders.

Name | Description
%MON_N_ID% | Unique notification ID
%MON_N_SCHEDULER_ID% | ID of the JobScheduler
%MON_N_TASK_ID% | ID of the JobScheduler task
%MON_N_STEP% | Consecutive number of the order step
%MON_N_ORDER_HISTORY_ID% | ID of the JobScheduler order
%MON_N_JOB_CHAIN_NAME% | Name of the job chain of the order
%MON_N_JOB_CHAIN_TITLE% | Title of the job chain of the order
%MON_N_ORDER_ID% | Unique (within the job chain) ID of the order
%MON_N_ORDER_TITLE% | Title of the order
%MON_N_ORDER_START_TIME% | Timestamp of the start of the order
%MON_N_ORDER_END_TIME% | Timestamp of the end of the order
%MON_N_ORDER_TIME_ELAPSED% | Elapsed time in seconds between the start and the end of the order
%MON_N_ORDER_STEP_STATE% | State of the order inside the job chain
%MON_N_ORDER_STEP_START_TIME% | Timestamp of the start of the order step
%MON_N_ORDER_STEP_END_TIME% | Timestamp of the end of the order step
%MON_N_ORDER_STEP_TIME_ELAPSED% | Elapsed time in seconds between the start and the end of the order step
%MON_N_JOB_NAME% | Name of the job
%MON_N_JOB_TITLE% | Title of the job
%MON_N_TASK_START_TIME% | Timestamp of the job task start
%MON_N_TASK_END_TIME% | Timestamp of the job task end
%MON_N_TASK_TIME_ELAPSED% | Elapsed time in seconds between the start and the end of the job task
%MON_N_RECOVERED% | 0 = depends on %MON_N_ERROR%: either OK, or the error was not recovered; 1 = the error was recovered
%MON_N_ERROR% | 0 = OK; 1 = error
%MON_N_ERROR_CODE% | Exception code of the job error
%MON_N_ERROR_TEXT% | Exception message of the job (that processed the order)
%MON_N_CREATED% | Timestamp of the initial notification record
%MON_N_MODIFIED% | Timestamp of the latest change to this notification record

...

Example:
 timer name = %MON_C_NAME%, text = %MON_C_CHECK_TEXT%

Service variables

%SERVICE_NAME%

Current service name. One of the following element attributes:

  • SystemMonitorNotification / Notification / NotificationMonitor / @service_name_on_error
  • SystemMonitorNotification / Notification / NotificationMonitor / @service_name_on_success

%SERVICE_STATUS%

Current service status. One of the following element attributes, or the default:

  • SystemMonitorNotification / Notification / NotificationMonitor / @service_status_on_error
  • SystemMonitorNotification / Notification / NotificationMonitor / @service_status_on_success
  • default on error: CRITICAL
  • default on success: OK

%SERVICE_MESSAGE_PREFIX%

Message prefix:

  • ERROR: error
  • RECOVERED: error recovery
  • TIMER: performance check

Example:
 service name = %SERVICE_NAME%

 

OS environment variables 

 

Any existing OS environment variable can be referenced in a message using the %<variable name>% syntax (on both Windows and Unix).

Example:
 %TEMP%/test.exe
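
As an illustration, an OS environment variable can be combined with table variables in the same message. The following is a minimal sketch, assuming a Windows host on which TEMP is set; the output file name errors.txt is a placeholder, and the echo redirect only stands in for the call to a real notification client.

```xml
<NotificationCommand><![CDATA[
echo scheduler id=%MON_N_SCHEDULER_ID%, error=%MON_N_ERROR_TEXT% > %TEMP%/errors.txt
]]></NotificationCommand>
```

Here %MON_N_SCHEDULER_ID% and %MON_N_ERROR_TEXT% are table variables, while %TEMP% is taken from the operating system environment.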

 

Examples 
Message on error:
scheduler id=%MON_N_SCHEDULER_ID%, history id=%MON_N_ORDER_HISTORY_ID%, job_chain=%MON_N_JOB_CHAIN_NAME%(%MON_N_ORDER_ID%), step=%MON_N_ORDER_STEP_STATE%, error=%MON_N_ERROR_TEXT%            

...

Message on timer:
name = %MON_C_NAME%, scheduler id=%MON_N_SCHEDULER_ID%, history id=%MON_N_ORDER_HISTORY_ID%, job_chain=%MON_N_JOB_CHAIN_NAME%(%MON_N_ORDER_ID%), steps(%MON_C_STEP_FROM% to %MON_C_STEP_TO%), check = %MON_C_CHECK_TEXT%            

Notification environment variables

The default com.sos.scheduler.notification.plugins.notifier.SystemNotifierProcessBuilderPlugin plugin used by the SystemMonitorNotification / Notification / NotificationCommand element sets the following variables as environment variables:

...

These variables are useful when the NotificationCommand does not call the notification client directly but calls a shell script that implements the logic for sending the notification messages.

Table variables

All table variables (see the Table variables explanation) are set as environment variables with the prefix SCHEDULER_MON_TABLE_, e.g.:

  • SCHEDULER_MON_TABLE_MON_N_ID
  • SCHEDULER_MON_TABLE_MON_N_SCHEDULER_ID
  • ...

Service variables

SCHEDULER_MON_SERVICE_NAME

Current service name. One of the following element attributes:

  • SystemMonitorNotification / Notification / NotificationMonitor / @service_name_on_error
  • SystemMonitorNotification / Notification / NotificationMonitor / @service_name_on_success

SCHEDULER_MON_SERVICE_STATUS

Current service status. One of the following element attributes, or the default:

  • SystemMonitorNotification / Notification / NotificationMonitor / @service_status_on_error
  • SystemMonitorNotification / Notification / NotificationMonitor / @service_status_on_success
  • default on error: CRITICAL
  • default on success: OK

SCHEDULER_MON_SERVICE_MESSAGE_PREFIX

Message prefix:

  • ERROR: error
  • RECOVERED: error recovery
  • TIMER: performance check

SCHEDULER_MON_SERVICE_COMMAND

Content of the SystemMonitorNotification / Notification / NotificationCommand element after substitution

...

Sample NotificationCommand on Windows, using a script file (C:/Temp/command.cmd):
1) Command configured in the SystemMonitorNotification_<MonitorSystem>.xml file
<NotificationCommand><![CDATA[C:/Temp/command.cmd]]></NotificationCommand>
 
2) Content of the C:/Temp/command.cmd file
rem Note: "> C:/Temp/command_output.txt" is used to simulate starting the notification client
rem
echo %SCHEDULER_MON_SERVICE_NAME%:%SCHEDULER_MON_SERVICE_STATUS%:%SCHEDULER_MON_SERVICE_MESSAGE_PREFIX% history id = %SCHEDULER_MON_TABLE_MON_N_ORDER_HISTORY_ID% > C:/Temp/command_output.txt
 

Examples

Examples OP5
NotificationInterface 

Here is an excerpt of an XML file used for notifying a specific System Monitor (OP5 Monitor) using NotificationInterface:

SystemMonitorNotification_OP5.xml:
 ...
<!--
monitor_host            The hostname or IP address of the System Monitor host
monitor_port            The TCP port that the System Monitor listens on
monitor_encryption      Encryption algorithm
service_host            The host that executes the passive check. The name must match the corresponding setting in the System Monitor
%MON_N_SCHEDULER_ID%    See explanation "Table variables"
...
-->
<NotificationInterface monitor_host="monitor_host" monitor_port="5667" monitor_encryption="XOR" service_host="service_host"><![CDATA[
scheduler id=%MON_N_SCHEDULER_ID%, history id=%MON_N_ORDER_HISTORY_ID%, job_chain=%MON_N_JOB_CHAIN_NAME%(%MON_N_ORDER_ID%), step =%MON_N_ORDER_STEP_STATE%, error=%MON_N_ERROR_TEXT%
]]></NotificationInterface>
...
NotificationCommand

Here is an excerpt of an XML file used for notifying a specific System Monitor (OP5 Monitor) using NotificationCommand on Windows:

SystemMonitorNotification_OP5.xml:
... 
<!--
service_host               The host that executes the passive check. The name must match the corresponding setting in the System Monitor.
monitor_host               The hostname or IP address of the System Monitor host.
%SERVICE_NAME%             See explanation "Service variables"
%SERVICE_STATUS%           See explanation "Service variables"
%SERVICE_MESSAGE_PREFIX%   See explanation "Service variables"
%MON_N_SCHEDULER_ID%       See explanation "Table variables"
...
NotificationCommand after substitution (error case):
<![CDATA[echo service_host:JobScheduler Monitoring Errors:2:ERROR scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error=error occurred | D:\nsca\send_nsca.exe -H monitor_host -c D:\nsca\send_nsca.cfg -d : ]]>
 
NotificationCommand after substitution (recovery case): 
<![CDATA[echo service_host:JobScheduler Monitoring Errors:0:RECOVERED scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error=error occurred | D:\nsca\send_nsca.exe -H monitor_host -c D:\nsca\send_nsca.cfg -d : ]]> 
 
NotificationCommand after substitution (success case):  
<![CDATA[echo service_host:JobScheduler Monitoring Success:0:scheduler id=scheduler_4444, history id=123, job_chain=test/my_jobchain(order_id), step=100, error= | D:\nsca\send_nsca.exe -H monitor_host -c D:\nsca\send_nsca.cfg -d : ]]>  
 
-->
<NotificationMonitor service_name_on_error="JobScheduler Monitoring Errors" service_name_on_success="JobScheduler Monitoring Success">
  <NotificationCommand><![CDATA[echo service_host:%SERVICE_NAME%:%SERVICE_STATUS%:%SERVICE_MESSAGE_PREFIX%scheduler id=%MON_N_SCHEDULER_ID%, history id=%MON_N_ORDER_HISTORY_ID%, job_chain=%MON_N_JOB_CHAIN_NAME%(%MON_N_ORDER_ID%), step=%MON_N_ORDER_STEP_STATE%, error=%MON_N_ERROR_TEXT% | D:\nsca\send_nsca.exe -H monitor_host -c D:\nsca\send_nsca.cfg -d : ]]>
  </NotificationCommand>  
</NotificationMonitor>

...
Examples Zabbix
NotificationCommand

Here is an excerpt of an XML file used for notifying a specific System Monitor (Zabbix) using NotificationCommand:

SystemMonitorNotification_zabbix.xml:
... 
<!--
zabbix_sender            Zabbix sender installed on the JobScheduler host
localhost                Hostname of the Zabbix server
zabbix_server            JobScheduler Agent name (host name) as registered in Zabbix
samples.job1             Zabbix item key (the job name with "/" replaced by ".")
%MON_N_ERROR_TEXT%       See explanation "Table variables"
-->
<NotificationCommand>
<![CDATA[zabbix_sender -z localhost -s zabbix_server -k samples.job1 -o %MON_N_ERROR_TEXT%]]>
</NotificationCommand>
...

  

JobScheduler - Job Chains

See https://kb.sos-berlin.com/display/PKB/JobScheduler+Monitoring+Interface+-+Prerequisites+and+Installation#JobSchedulerMonitoringInterface-PrerequisitesandInstallation-JobChainConfiguration

...

Status: work in progress

Use Cases

Recoverable Errors

Initial Situation: A Job Chain is triggered by directory monitoring. That is, when a certain file arrives in a monitored folder, the Job Chain starts.

...

  • XML CheckConfigurationHistory.xml: Specify the ID of the JobScheduler and the name of the Job Chain you want to monitor.
  • XML SystemMonitorNotification.xml: Specify the name of the Service (in the System Monitor) using service_name_on_error, since you want to be notified when the Job Chain ends in an error.
  • System Monitor: Services in the System Monitor have to be configured with the same names as in the SystemMonitorNotification.xml file above.
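
The notification part of this setup can be sketched as follows. This is only a minimal sketch assembled from the elements and attributes shown in the Configuration section: the system_id, the service name and the job chain path are placeholder values, and the echo command merely stands in for the call to a real notification client.

```xml
<SystemMonitorNotification system_id="OP5">
  <Notification>
    <!-- Placeholder service name: it must match a service configured in the System Monitor -->
    <NotificationMonitor service_name_on_error="JobScheduler Monitoring Errors">
      <NotificationCommand><![CDATA[echo %SERVICE_NAME%:%SERVICE_STATUS%:%SERVICE_MESSAGE_PREFIX% job_chain=%MON_N_JOB_CHAIN_NAME%(%MON_N_ORDER_ID%), error=%MON_N_ERROR_TEXT%]]></NotificationCommand>
    </NotificationMonitor>
    <NotificationObjects>
      <!-- Placeholder path of the job chain that is started by directory monitoring -->
      <JobChain name="test/my_jobchain" />
    </NotificationObjects>
  </Notification>
</SystemMonitorNotification>
```

Including %SERVICE_MESSAGE_PREFIX% in the message lets the System Monitor distinguish an ERROR message from the later RECOVERED message.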

Workflow Execution takes too long

Initial Situation: A Job Chain is triggered but does not complete: it hangs in a step and therefore takes longer than expected.

...

  • XML CheckConfigurationHistory.xml: As in the example above, specify the ID of the JobScheduler and the name of the Job Chain you want to monitor. In addition, specify the timer for this specific job chain and the function that calculates the expiration time for the timer.
  • XML SystemMonitorNotification.xml: As in the example above, specify the name of the Service (in the System Monitor) using service_name_on_error, since you want to be notified when the Job Chain ends in an error. In addition, and essential for this particular case, specify how many times the timer should notify your System Monitor about the expiration of a timer.
  • System Monitor: As in the example above, Services in the System Monitor have to be configured with the same names as in the SystemMonitorNotification.xml file above.
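
A timer-based notification for this case might look like the sketch below. It reuses only elements and attributes shown in the Configuration section. The service name, timer name, job chain path and the limit of 1000 seconds are placeholder values, and the order of JobChain and Maximum inside Timer is an assumption. The setting that controls how often the timer notifies is left out because the Timer attribute list is not reproduced here.

```xml
<SystemMonitorNotification system_id="OP5">
  <Notification>
    <NotificationMonitor service_name_on_error="JobScheduler Monitoring Performance">
      <!-- echo stands in for the call to a real notification client -->
      <NotificationCommand><![CDATA[echo %SERVICE_NAME%:%SERVICE_STATUS%:%SERVICE_MESSAGE_PREFIX% timer=%MON_C_NAME%, check=%MON_C_CHECK_TEXT%]]></NotificationCommand>
    </NotificationMonitor>
    <NotificationObjects>
      <!-- Reference to the timer defined below -->
      <Timer name="my_timer" />
    </NotificationObjects>
  </Notification>

  <Timer name="my_timer">
    <JobChain name="test/my_jobchain" />
    <!-- Fixed value: maximum duration in seconds (placeholder) -->
    <Maximum><Script language="javascript"><![CDATA[1000]]></Script></Maximum>
  </Timer>
</SystemMonitorNotification>
```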

SFTP connection refused

Initial Situation: There is a Job Chain that uses SFTP for transferring files. You have a setback configured in this step of the Job Chain, so that if the connection to the SFTP server fails, this step is retried after some time.

...

  • XML CheckConfigurationHistory.xml: As in the example above, indicate the ID of the JobScheduler and the name of the Job Chain you want to monitor.
  • XML SystemMonitorNotification.xml: As in the example above, specify the name of the Service (in the System Monitor) using service_name_on_error, since you want to be notified when the Job Chain ends in an error. In addition, and very important in this case, specify how many times this Job Chain should notify your System Monitor about the error connecting to the SFTP Server. You can use step_from and step_to to reduce the number of notifications for this specific step.
  • System Monitor: As in the example above, Services in the System Monitor have to be configured and named the same way as in the XML file above SystemMonitorNotification.xml.
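
As a rough sketch, the JobChain element for this case could combine the attributes from the JobChain examples in the Configuration section. It assumes that the notifications attribute is what limits the number of notifications, which should be checked against the JobChain attribute list; the job chain path and step 300 are placeholders for the real chain and its SFTP step.

```xml
<NotificationObjects>
  <!-- Assumption: notifications="2" limits how often the error is reported -->
  <!-- step_from and step_to restrict the check to the (placeholder) SFTP step 300 -->
  <JobChain name="test/my_jobchain" notifications="2" step_from="300" step_to="300" />
</NotificationObjects>
```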

Thresholds

Initial Situation: For example, a specific number of Workflow Executions have to be executed successfully by a specific time. That is, a specific value has to be monitored in order to determine whether this quota has been reached.

...

  • XML CheckConfigurationHistory.xml: As in the example above, indicate the ID of the JobScheduler and the name of the Job Chain you want to monitor.
  • XML SystemMonitorNotification.xml: Specify the name of the Service (in the System Monitor), but this time use service_name_on_success, since you want to be notified when the Job Chain ends successfully and not only when it ends in an error.
  • System Monitor: As in the example above, Services in the System Monitor have to be configured and named the same way as in the XML file above SystemMonitorNotification.xml.
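
A NotificationMonitor for this case might be sketched as follows, modeled on the OP5 NotificationCommand example. The service names and the job chain path are placeholders, the echo command stands in for the call to a real notification client, and any further attributes needed to enable success notifications for the job chain are not shown here.

```xml
<Notification>
  <!-- Both services must exist under the same names in the System Monitor -->
  <NotificationMonitor service_name_on_error="JobScheduler Monitoring Errors" service_name_on_success="JobScheduler Monitoring Success">
    <NotificationCommand><![CDATA[echo %SERVICE_NAME%:%SERVICE_STATUS%:%SERVICE_MESSAGE_PREFIX% job_chain=%MON_N_JOB_CHAIN_NAME%(%MON_N_ORDER_ID%), step=%MON_N_ORDER_STEP_STATE%]]></NotificationCommand>
  </NotificationMonitor>
  <NotificationObjects>
    <JobChain name="test/my_jobchain" />
  </NotificationObjects>
</Notification>
```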

Acknowledgement

Initial Situation: An alert for a Service has been sent to the System Monitor and a Mail has been sent to the Service Desk (Support Team) notifying about it.

...