...

  • <scheduler_install>/config/notification/SystemMonitorNotification_MonitorSystem.xml
    • rename this file to SystemMonitorNotification_OP5.xml
    • set the system_id attribute to OP5
      • e.g. <SystemMonitorNotification system_id="OP5">
  • <scheduler_install>/config/live/sos/notification/SystemNotifier,MonitorSystem.order.xml
    • rename this file to SystemNotifier,OP5.order.xml
    • set the system_configuration_file parameter to the path of SystemMonitorNotification_OP5.xml
      • e.g. <param name="system_configuration_file" value="config/notification/SystemMonitorNotification_OP5.xml"/>
  • <scheduler_install>/config/live/sos/notification/ResetNotifications,AcknowledgeMonitorSystem.order.xml
    • rename this file to ResetNotifications,AcknowledgeOP5.order.xml
    • set the system_id parameter to OP5
      • e.g. <param name="system_id" value="OP5"/>
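The renaming steps above could be scripted roughly as follows. This is only a sketch under assumptions: the function names `set_param` and `rename_for_monitor` are hypothetical, the `system_id` attribute is assumed to sit on the root element of the notification file, and the order files are assumed to hold their settings in `param` elements with `name` and `value` attributes, as in the examples above. Rewriting the files with `xml.etree.ElementTree` drops XML comments, so keep a backup before trying anything like this on a real installation.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def set_param(order_file: Path, name: str, value: str) -> None:
    """Set the value attribute of every <param name="..."> element in an order file."""
    tree = ET.parse(order_file)
    for param in tree.getroot().iter("param"):
        if param.get("name") == name:
            param.set("value", value)
    tree.write(order_file, encoding="utf-8", xml_declaration=True)


def rename_for_monitor(install_dir: Path, system_id: str) -> None:
    """Rename the three template files for one System Monitor and adjust their IDs."""
    notif_dir = install_dir / "config" / "notification"
    live_dir = install_dir / "config" / "live" / "sos" / "notification"

    # 1. Notification configuration: rename, then set system_id on the root element
    #    (assumption: the root element carries the system_id attribute).
    cfg = notif_dir / "SystemMonitorNotification_MonitorSystem.xml"
    new_cfg = cfg.rename(notif_dir / f"SystemMonitorNotification_{system_id}.xml")
    tree = ET.parse(new_cfg)
    tree.getroot().set("system_id", system_id)
    tree.write(new_cfg, encoding="utf-8", xml_declaration=True)

    # 2. Notifier order: rename, then point it at the renamed configuration file.
    notifier = live_dir / "SystemNotifier,MonitorSystem.order.xml"
    new_notifier = notifier.rename(live_dir / f"SystemNotifier,{system_id}.order.xml")
    set_param(new_notifier, "system_configuration_file",
              f"config/notification/{new_cfg.name}")

    # 3. Acknowledge order: rename, then set its system_id parameter.
    reset = live_dir / "ResetNotifications,AcknowledgeMonitorSystem.order.xml"
    new_reset = reset.rename(
        live_dir / f"ResetNotifications,Acknowledge{system_id}.order.xml")
    set_param(new_reset, "system_id", system_id)
```

A call such as `rename_for_monitor(Path("/path/to/scheduler_install"), "OP5")` would then perform all three steps for OP5; the path is a placeholder.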

 

Status: work in progress

Use Cases

...

Workflow Execution takes too long

Initial Situation: A Job Chain is triggered by directory monitoring, i.e. the Job Chain starts when a certain file arrives in a monitored folder.

Problem: The Job Chain has ended with an error.

Handling: The System Monitor is notified of the error message via the service specified for the Job Chain. If the Job Chain is then restarted by the arrival of a new file and ends without an error, this does not mean that the original error has been recovered, since the second run processed a different file. Instead, the error message at the System Monitor should remain unchanged until the original file has been re-added to the monitored directory and the Job Chain has ended without an error.

Configuration:

  • XML CheckConfigurationHistory.xml: Specifies the ID of the JobScheduler and the name of the Job Chain to be monitored.
  • XML SystemMonitorNotification.xml: Specifies the name of the Service (in the System Monitor) and sets it as a service_name_on_error, since the Job Chain is to be monitored for ending with an error.
  • System Monitor: Services in the System Monitor have to be configured and named in the same way as in the SystemMonitorNotification.xml file above.
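The recovery rule described under Handling can be illustrated with a small conceptual model. It is not how JobScheduler implements the rule; the class `FileErrorTracker` and its return values are hypothetical and only show that an error stays open until the same file is processed successfully.

```python
class FileErrorTracker:
    """Keeps an error open per (job chain, file) until that same file succeeds."""

    def __init__(self):
        self._open_errors = {}  # (job_chain, file_name) -> error message

    def record_run(self, job_chain, file_name, error=None):
        """Record one Job Chain run and return the state to report."""
        key = (job_chain, file_name)
        if error is not None:
            self._open_errors[key] = error
            return "ERROR"
        if key in self._open_errors:
            # Only a successful run for the original file clears the error.
            del self._open_errors[key]
            return "RECOVERY"
        return "OK"

    def open_errors(self, job_chain):
        """Return the files of a Job Chain that still have an open error."""
        return sorted(f for (chain, f) in self._open_errors if chain == job_chain)
```

In this model a successful run for a different file returns `"OK"` and leaves the earlier error untouched, which mirrors the behaviour described above.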

Workflow Execution takes too long

Initial Situation: A Job Chain is triggered but does not end: it hangs in a step and takes longer than expected.

Problem: Execution time was too long.

Handling: A timer has been set for this Job Chain and the System Monitor is notified when it expires. The expiration times for Job Chains are configured to leave enough time for processing. This is typically used for cases where a Job Chain could hang in a specific step.

Configuration:

  • SystemMonitorNotification_<MonitorSystem>.xml
  • System Monitor
  • XML CheckConfigurationHistory.xml: As in the example above, specifies the ID of the JobScheduler and the name of the Job Chain to be monitored. In addition, the timer for this specific Job Chain and the function for calculating the timer's expiration time should be specified.
  • XML SystemMonitorNotification.xml: As in the example above, specifies the name of the Service (in the System Monitor) and sets it as a service_name_on_error, since the Job Chain is to be monitored for ending with an error. In this case it is essential to specify how many times the System Monitor should be notified about the expiration of a timer.
  • System Monitor: As in the example above, Services in the System Monitor have to be configured and named in the same way as in the SystemMonitorNotification.xml file.
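The interplay of an expiration time and a limited number of notifications can be sketched as below. The class `ChainTimer` and its fields are hypothetical names chosen for illustration; how JobScheduler actually evaluates timers is defined in its configuration files, not by this code.

```python
from dataclasses import dataclass


@dataclass
class ChainTimer:
    """Notifies at most max_notifications times once a Job Chain overruns its limit."""
    limit_seconds: float
    max_notifications: int
    sent: int = 0

    def should_notify(self, elapsed_seconds: float) -> bool:
        if elapsed_seconds <= self.limit_seconds:
            return False  # still within the configured processing time
        if self.sent >= self.max_notifications:
            return False  # the System Monitor has been told often enough
        self.sent += 1
        return True
```

With `ChainTimer(limit_seconds=600, max_notifications=2)`, checks at 700 and 800 seconds would each trigger a notification, while a further check at 900 seconds would not.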

    SFTP connection refused

    Initial Situation: Consider a Job Chain that uses SFTP for transferring files. A setback is configured for this step of the Job Chain, so that if the connection to the SFTP server fails, the step is retried after a specified time.

    Problem: The SFTP server is no longer available.

    Handling: The System Monitor is notified of the error message via the service related to the Job Chain. However, you don't want repeated notifications for a Job Chain when an external factor, here the connection to the SFTP server, is producing the error.

    Configuration:

    • SystemMonitorNotification_<MonitorSystem>.xml 
    • System Monitor
    • XML CheckConfigurationHistory.xml: As in the example above, specifies the ID of the JobScheduler and the name of the Job Chain to be monitored.
    • XML SystemMonitorNotification.xml: As in the example above, specifies the name of the Service (in the System Monitor) and sets it as a service_name_on_error, since the Job Chain is to be monitored for ending with an error. In this case it is very important to specify how many times this Job Chain should notify the System Monitor about the error connecting to the SFTP server. You can use step_from and step_to to reduce the number of notifications for this specific step.
    • System Monitor: As in the example above, Services in the System Monitor have to be configured and named in the same way as in the SystemMonitorNotification.xml file.
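One way to picture a notification limit that applies only to a range of steps is the sketch below. The exact semantics of step_from and step_to are not spelled out here, so the inclusive range and the per-rule counter are assumptions, and `StepRule` as well as the step names in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StepRule:
    """Limits notifications for failures between step_from and step_to (inclusive)."""
    steps: list            # ordered step names of the Job Chain
    step_from: str
    step_to: str
    max_notifications: int
    sent: int = 0

    def covers(self, step: str) -> bool:
        position = self.steps.index(step)
        first = self.steps.index(self.step_from)
        last = self.steps.index(self.step_to)
        return first <= position <= last

    def should_notify(self, failed_step: str) -> bool:
        # Failures outside the range are not handled by this rule.
        if not self.covers(failed_step) or self.sent >= self.max_notifications:
            return False
        self.sent += 1
        return True
```

For a chain with the steps `fetch`, `sftp_transfer` and `archive`, a rule covering only `sftp_transfer` with `max_notifications=1` would report the first connection failure and stay silent while the setback keeps retrying.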

    Thresholds

    Initial Situation: Consider a workflow that has to be executed successfully a specific number of times before a specific point in time. This means that a value has to be monitored in order to determine whether this quota has been reached.

    Handling: A new History service is configured, so that the workflow executions (Job Chains in the JobScheduler vocabulary) report to the System Monitor that they have been executed successfully.

    Configuration:

    • XML CheckConfigurationHistory.xml: As in the example above, specifies the ID of the JobScheduler and the name of the Job Chain to be monitored.
    • XML SystemMonitorNotification.xml: Specifies the name of the Service (in the System Monitor). Note that here it is set as a service_name_on_success, since the Job Chain is to be monitored when it ends successfully and not only when it ends with an error.
    • System Monitor: As in the example above, Services in the System Monitor have to be configured and named in the same way as in the SystemMonitorNotification.xml file above.
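The threshold check itself amounts to counting successful runs up to a deadline. The following sketch assumes the success timestamps are available as `datetime` values; the function name `quota_reached` is hypothetical, and in practice the System Monitor would perform this evaluation on the success messages it receives.

```python
from datetime import datetime


def quota_reached(success_times, deadline, required):
    """Return True if at least `required` successful runs finished by `deadline`."""
    completed = sum(1 for finished in success_times if finished <= deadline)
    return completed >= required
```

Runs that finish after the deadline are simply not counted, so three successes with only two before the deadline do not satisfy a quota of three.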

    Acknowledgment

    Initial Situation: An alert for a Service has been sent to the System Monitor, which has sent a mail to the Service Desk (Support Team) notifying them about the alert.

    Handling: The problem is known to the Service Desk and they "acknowledge" the problem. The acknowledgment causes the JobScheduler to be notified not to send any more notifications for this Service to the System Monitor until the Service has been recovered.
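The effect of an acknowledgment can be modelled as a gate that suppresses error notifications per Service until a recovery arrives. This is a conceptual sketch only; `NotificationGate` and its methods are hypothetical and do not correspond to a JobScheduler API.

```python
class NotificationGate:
    """Suppresses error notifications for acknowledged Services until they recover."""

    def __init__(self):
        self._acknowledged = set()

    def acknowledge(self, service):
        """Called when the Service Desk acknowledges the problem."""
        self._acknowledged.add(service)

    def on_error(self, service):
        """Return True if the error should be forwarded to the System Monitor."""
        return service not in self._acknowledged

    def on_recovery(self, service):
        """A recovery lifts the acknowledgment, so later errors are sent again."""
        self._acknowledged.discard(service)
```

After `acknowledge("svc")`, calls to `on_error("svc")` return `False` until `on_recovery("svc")` has been called; other Services are unaffected.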

    Configuration

     

    Recoverable Errors

    Initial Situation: A setback is configured for a step of the Job Chain, so that if the step execution fails, the step is retried after a specified time.

    Problem: The step has ended with an error, but recovered after the setback.

    Handling: In case of error recovery, JobScheduler will automatically send the recovery message to the same service, using the same error message with the prefix RECOVERY.

    Configuration