Introduction

  • Jobs are executed with JS7 Agents which handle termination of jobs.
    • Shell Jobs and JVM Jobs are under control of the Agent which terminates running jobs.
    • Jobs that use an SSH Client or the JS7 - JITL SSHJob cannot guarantee that a job's child processes are terminated, as these processes are controlled by the remote SSHD server.
  • Termination of jobs can be caused by users from the JOC Cockpit and can be performed automatically if jobs exceed a given timeout.
    • As a prerequisite for termination by the JOC Cockpit, the Controller has to be connected to the JOC Cockpit and the Agent has to be accessible to the Controller.
  • See Jira issue JS-1965 (SOS JIRA).

Termination of Jobs

Jobs can be terminated in one of the following ways:

  • The job is configured with a timeout setting: if job execution exceeds the timeout then the job will be killed by the Agent.
  • Jobs can be killed using the GUI operation and by use of the JS7 - REST Web Service API:
    • The Cancel/Kill operation kills a running job and fails the order.
    • The Suspend/Kill operation kills a running job and suspends the order.
    • Failed and suspended orders can be resumed.
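
The timeout behavior from the first bullet can be pictured with a small shell sketch. This is not how the Agent implements timeouts: `run_with_timeout` is a hypothetical helper that only illustrates the principle of a watchdog sending SIGTERM once a time limit is exceeded.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not Agent code):
# run a command and send it SIGTERM if it exceeds the given number of seconds.
run_with_timeout() {
    timeout_seconds=$1
    shift

    # start the "job" in the background and remember its PID
    "$@" &
    job_pid=$!

    # watchdog: after the timeout, terminate the job if it is still running
    ( sleep "$timeout_seconds"; kill -TERM "$job_pid" 2>/dev/null ) >/dev/null 2>&1 &
    watchdog_pid=$!

    # wait for the job; a job killed by SIGTERM reports exit code 143 (128 + 15)
    wait "$job_pid"
    job_rc=$?

    # the job is gone, so the watchdog is no longer needed
    # (its inner sleep may linger until its own time is up)
    kill "$watchdog_pid" 2>/dev/null
    wait "$watchdog_pid" 2>/dev/null

    return "$job_rc"
}
```

For example, `run_with_timeout 1 sleep 10` returns 143 after roughly one second, whereas `run_with_timeout 2 true` returns 0 immediately.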

Terminating Jobs on Unix

In Unix environments, jobs receive the following signals from the Agent:

...

  • By default the OS removes child processes if the parent process is killed. However, this mechanism is not applicable for all situations, depending on the way child processes have been spawned.
  • In order to more reliably kill child processes the Agent uses the kill_task.sh script from its var_<port>/work directory.
    • This script identifies the process tree created by the job script and kills any available child processes.
    • Download: kill_task.sh
  • Although the Agent is platform independent, retrieving a process tree does not necessarily use the same command (ps) and options on all Unix platforms.
    • The Agent therefore allows specification of an individual kill script from a command line option if the built-in kill_task.sh script is not applicable to your Unix platform, see JS7 - Agent Operation.
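
The idea of identifying a process tree and killing its child processes can be sketched as follows. This is not the kill_task.sh script shipped with the Agent: the function names `list_descendants` and `kill_descendants` are hypothetical, and the sketch assumes that `ps -A -o pid= -o ppid=` is supported on your platform, which is exactly the kind of detail that can differ between Unixes.

```shell
#!/bin/sh
# Hypothetical sketch, not the Agent's kill_task.sh.

# Print the PIDs of all descendants of the given PID, one per line.
list_descendants() {
    parent_pid=$1
    # "ps -A -o pid= -o ppid=" prints "PID PPID" for every process without a header
    for child_pid in $(ps -A -o pid= -o ppid= | awk -v p="$parent_pid" '$2 == p { print $1 }'); do
        echo "$child_pid"
        # recurse to pick up grandchildren and deeper levels
        list_descendants "$child_pid"
    done
}

# Send SIGKILL to every descendant of the given PID; the process itself is left alone.
kill_descendants() {
    for descendant_pid in $(list_descendants "$1"); do
        kill -KILL "$descendant_pid" 2>/dev/null
    done
    return 0
}
```

Calling `kill_descendants <pid>` with the PID of a job script's process kills the processes below it. It only works while that process is still alive, because orphaned children are re-parented and no longer appear under the original PID.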

Use of Exit Traps

The Short Version

You can add the following two traps to your Shell Jobs:

...

For explanations see the long version.

The Long Version

If a Shell Job script starts a background process and completes (with or without error) without waiting for the child process to terminate, then the Agent cannot identify the running child process because its parent process is gone. It is therefore recommended to add a trap to the shell script. The trap is triggered when the script terminates, regardless of whether it terminates normally or with an error, and keeps the script from terminating while child processes are still running. In the event of forced termination the script stays alive because its trap waits for child processes, and the Agent executes the kill_task.sh script, which identifies the Shell Job script process and kills the running child processes.

...

  • Lines 3-11: implement the JS7Trap() function including the wait command. This either waits for termination of child processes or continues immediately.
    • The exit code returned from the trap in the event of script termination is reported by the task log and order log.
    • However, job execution will be considered to have failed regardless of the exit code value as the Cancel/Kill or Suspend/Kill operation has been performed.
  • Lines 14-16: define traps calling the JS7Trap() function in the event of the following signals being received:
    • EXIT is a pseudo-signal that covers a number of signals terminating a script. However, this is available for the bash shell only.
    • TERM is the termination signal sent by the Agent if the Cancel/Kill or Suspend/Kill operation is invoked.
    • INT is added in case OS processes external to the JS7 Agent send this signal, which usually corresponds to hitting Ctrl+C in a terminal session.
  • Lines 15-17: start background processes.
  • Line 21: a script should normally wait for child processes. However, if this cannot be guaranteed, for example if set -e is used to abort a script in case of error, then the use of a trap is an appropriate measure.
  • The following sequence of actions is performed:
    • The job script listed above does not wait for child processes and therefore terminates, triggering the EXIT pseudo-signal. The trap function is executed and waits for child processes to complete. During this period the task process for the job remains alive.
    • If subsequently the Cancel/Kill or Suspend/Kill operation is invoked, then the Agent will send a SIGTERM signal which:
      • interrupts the wait command in the currently executed JS7Trap() function,
      • triggers execution of the JS7Trap() function once more and performs the wait operation for child processes.
    • Having applied the Grace Timeout the Agent executes the kill_task.sh script which sends a STOP signal to the task process, kills any child processes and finally sends a SIGKILL signal to abort the task process.
    • The crucial point is that the job script does not terminate with child processes running but remains active due to triggering of a trap which allows the Agent to kill any child processes from the process tree. If the task process for the job script terminates with child processes running then the Agent cannot identify the process tree and cannot kill child processes.
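
The pattern described in this section can be sketched as follows. This is a minimal illustration and not the listing that the line numbers above refer to: the function name JS7Trap() is taken from the text, while its body, the messages and the `demo_job` wrapper are assumptions. The wrapper runs the job body in a subshell so that the sketch can be called as a function.

```shell
#!/bin/bash
# Illustrative sketch only; body and messages are assumptions.

# Trap handler: block until all background children of this shell have finished.
JS7Trap() {
    echo "trap $1: waiting for child processes"
    wait
    echo "trap $1: child processes finished"
}

# Hypothetical job body, executed in a subshell with its own traps.
demo_job() (
    trap 'JS7Trap EXIT' EXIT
    trap 'JS7Trap TERM' TERM
    trap 'JS7Trap INT' INT

    # background child that outlives the main part of the script
    sleep 1 &

    echo "job body done"

    # the script ends without waiting; the EXIT trap keeps the process alive
    # until the background child has completed
    exit 0
)
```

Calling `demo_job` prints the "job body done" message first and the trap messages afterwards. The process stays alive for about one second after the job body has finished, which is the window in which a kill script could still find its child processes.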

Automation of Exit Traps

JS7 provides an option for applying traps such as those described in the example above. These can be applied to a number of Shell Job scripts via JS7 - Script Includes.

  • The trap and the trap function are added to a Script Include like this:




  • The Script Include is embedded into any Shell Job scripts from a single line similar to a shebang:



Terminating Jobs on Windows

For Windows environments the following applies when terminating jobs:

...