Scope

  • JobScheduler Master and Agents can maintain network connections by regularly sending keep-alive packets. This prevents connections across firewalls from being broken when a long-running job exceeds the idle timeouts allowed by firewalls or proxy servers.
  • FEATURE AVAILABILITY STARTING FROM RELEASE 1.9

Related Features

JS-1506

JS-1454

JS-1456

Use Case

  • Growing awareness of security issues means that more and more firewalls and proxy servers are found within company intranets, where they can, for example, form part of the infrastructure used to create so-called network islands.
  • As a result, the interception of idle network connections between JobSchedulers, in particular between JobScheduler Masters and Agents, by firewalls or proxies can become a significant problem.
  • Idle connections occur when, for example, a long-running job is being processed by an Agent and the job does not write any log output for an extended period, so that no data is sent over the network.
  • The feature described here uses so-called keep-alive packets that are sent across network connections at regular intervals.
  • If interception takes place then this can happen either politely or impolitely:
    • Polite interception means that the firewall sends a connection reset packet to the JobScheduler Master.
      • This means that the Master is able to rebuild the connection.
    • Impolite interception means that the firewall silently drops subsequent packets, so that information from an Agent, for example about job completion, is lost.

Implementation

JobScheduler Master - Classic Agent

  • The JobScheduler Master and Classic Agent can be configured to prevent connections from timing out by adding a scheduler.agent.keep_alive parameter to the <params> section of the Master's scheduler.xml file. This file is located in the $SCHEDULER_DATA/config folder, where $SCHEDULER_DATA is the directory used for JobScheduler's configuration and log files.

    keep-alive parameter
    <params>
        <param name="scheduler.agent.keep_alive" value="300"/>
    </params>

    Changes to the scheduler.xml file only take effect after the JobScheduler has been restarted.

  • The value attribute sets the interval in seconds between keep-alive packets.

    • A duration lower than 30s will be silently replaced by 30s.
  • Keep-alive packets will not be sent if the parameter is not set or if the value attribute is empty.

  • The keep-alive parameter will be forwarded to the Agent along with other task configuration parameters for use when the Agent initiates a connection.
  • Keep-alive packets will be sent across the network by the JobScheduler (either Master or Agent) that initiates a task.
  • The Master sends keep-alive commands to Classic Agents (up to and including Classic Agent release 1.9) via TCP connections.
  • The Master log will show a SCHEDULER-727 Keep-alive package sent to Agent message at the info level stating that a keep-alive command has been successfully sent.
  • See the scheduler.agent.keep_alive parameter reference article for more information.
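
The rules above for the value attribute (no packets when the value is missing or empty, a minimum of 30 seconds otherwise) can be sketched as a small shell function. This only illustrates the documented behavior and is not how JobScheduler implements it; the function name effective_keep_alive and the output "disabled" are assumptions made for this example, and the value is assumed to be a whole number of seconds.

```shell
# Illustrative helper (hypothetical, not part of JobScheduler).
# effective_keep_alive VALUE
#   Prints the interval in seconds that would be used for VALUE,
#   or "disabled" when no keep-alive packets would be sent.
effective_keep_alive() {
    value="$1"
    if [ -z "$value" ]; then
        # parameter not set or value attribute empty: no keep-alive packets
        echo "disabled"
    elif [ "$value" -lt 30 ]; then
        # values below 30 seconds are silently replaced by 30 seconds
        echo 30
    else
        echo "$value"
    fi
}
```

For example, effective_keep_alive 10 prints 30, and effective_keep_alive 300 prints 300.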

JobScheduler Master - Universal Agent

  • The JobScheduler Master and Universal Agent do not use the TCP connection described above for the Classic Agent. Instead, they communicate over a single HTTP or HTTPS connection.
  • The Master does not need to send keep-alive packets to the Universal Agent as the Agent sends its own keep-alive packets in the form of HTTP Heartbeats.

  • The HTTP Heartbeats are configured in <remote_scheduler> XML elements, which means that:

    • The JobScheduler does not have to be restarted after a configuration change.

    • <remote_scheduler> XML elements are descendants of <process_class> elements which can be configured separately from jobs, job chains and orders and thereby reused.
    • Heartbeats can be configured within JOE.
  • The following code block shows a typical configuration:

    The remote scheduler HTTP Heartbeat parameters
    <process_class  max_processes="10">
        <remote_schedulers>
            <remote_scheduler remote_scheduler="http://127.0.0.2:5000"
                   http_heartbeat_period="10"
                   http_heartbeat_timeout="15"/>
        </remote_schedulers>
    </process_class>
  • The default HTTP heartbeat values are:
    • http_heartbeat_period: 10 seconds
    • http_heartbeat_timeout: 60 seconds
  • See the <remote_scheduler> XML element reference article for more information.

Workarounds for Older Versions

Idle connections can also be avoided by:

  • regularly calling API methods (only in jobs using the JobScheduler's API),
  • regularly logging to stdout (every log line to stdout is a log API call as well and makes use of the TCP connection used by the Classic Agent to communicate with the JobScheduler Master).

Heartbeat script for Linux Agents

The following scripts provide an alternative workaround for older versions.

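  • Store the heartbeat.sh script in the Agent's file system and start it in the background at the beginning of the job script as follows:

    Heartbeat script for Linux Agents
    # stop the background heartbeat when the job script exits
    trap 'kill $(jobs -p)' EXIT
    # start the heartbeat in the background
    /path/to/heartbeat.sh &

    # your script code here...
  • Configure long-running job scripts to run the heartbeat script.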
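
The content of heartbeat.sh is not reproduced on this page. The following is a minimal sketch of what such a script could look like, assuming that it only needs to write a line to stdout at regular intervals so that log output travels over the connection to the Master. The function name heartbeat, its two arguments and the text of the log line are assumptions made for this example and do not come from the downloadable script.

```shell
#!/bin/sh
# Hypothetical heartbeat sketch; not the script distributed with JobScheduler.
# heartbeat [INTERVAL] [COUNT]
#   INTERVAL  seconds to wait between log lines (default: 60)
#   COUNT     number of lines to write; 0 or omitted means run until killed
heartbeat() {
    interval="${1:-60}"
    count="${2:-0}"
    sent=0
    while [ "$count" -eq 0 ] || [ "$sent" -lt "$count" ]; do
        # every line written to stdout is log output and therefore
        # causes data to be sent over the connection to the Master
        echo "heartbeat $(date '+%Y-%m-%d %H:%M:%S')"
        sent=$((sent + 1))
        sleep "$interval"
    done
}
```

Adding a final line such as heartbeat 60 would turn the sketch into a standalone script that runs until the job script's trap kills it. The interval should be shorter than the idle timeout of the firewall or proxy server.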

Heartbeat script for Windows Agents

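  • Store the heartbeat.cmd script in the Agent's file system and start it in the background at the beginning of the job script as follows:

    Heartbeat script for Windows Agents
    @rem start the heartbeat in the background
    start /b C:\temp\heartbeat.cmd

    @rem your script code here...
  • Configure long-running job scripts to run the heartbeat script.
  • Do not put the script path in quotes in the start command. In a call such as start /b "c:\my scripts\heartbeat.cmd", the start command treats the first quoted argument as a window title, so the script is not run.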

Heartbeat script download

Delimitation

  • Keep-alive packets are not created if Remote File Watching is performed by the Agent.
  • Keep-alive packets are only sent for running jobs.

References

Change Management References



Documentation

Master - Classic Agent

Master - Universal Agent

General