...
- Jobs are executed with JS7 Agents which handle termination of jobs.
- Shell Jobs and JVM Jobs are under control of the Agent which terminates running jobs.
- Jobs that use an SSH client or the JS7 - JITL SSHJob cannot guarantee that a job's child processes are terminated, as these processes are controlled by the remote SSHD server. The JS7 - JITL SSHJob provides the means to reliably kill child processes.
- Termination of jobs can be caused by users from the JOC Cockpit and can be performed automatically if jobs exceed a given timeout.
- As a prerequisite for termination by the JOC Cockpit, the Controller has to be connected to the JOC Cockpit and the Agent has to be accessible to the Controller.
- See JS-1965 (SOS JIRA).
...
- When a job is to be killed, the Agent first sends a SIGTERM signal.
  - This signal can be ignored or it can be handled by a job script. For shell script jobs a trap can be defined to, for example, perform cleanup tasks such as disconnecting from a database or removing temporary files.
  - Note that this applies to job scripts that directly include shell code. If instead the job script calls external shell scripts or programs, then the Agent's SIGTERM signal is not forwarded to the child processes running those external scripts or programs. To prevent this situation, external shell scripts or programs should be called like this: exec /tmp/some_script.sh
  - The exec command causes any external script or program to be executed within the process of the current job script (instead of creating a new child process) and guarantees that the SIGTERM signal is received by that process.
- The job configuration includes the Grace Timeout setting:
  - The Grace Timeout duration is applied after a SIGTERM signal (corresponding to kill -15) has been sent by the Agent. This allows the job to terminate on its own, for example after some cleanup has been performed.
  - Should the job still be running after the specified Grace Timeout duration then the Agent will send a SIGKILL signal (corresponding to kill -9) that aborts the OS process.
  - Note that it is essential for job scripts that create child processes not to terminate on receipt of a SIGTERM signal before child processes are terminated.
    - Job scripts can use the wait command to wait for completion of child processes as this command prevents termination of the job script on receipt of SIGTERM.
    - Job scripts including any child processes will then be reliably killed by SIGKILL after the specified Grace Timeout.
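As an illustration of the trap and wait handling described in the list above, the following is a minimal sketch of a job script that cleans up on SIGTERM and keeps running until its child process has finished. The function name on_term, the temporary file and the timings are assumptions made for this example and are not part of JS7. The Agent's signal is simulated by a background subshell. In bash, a trapped signal interrupts the wait command, so the sketch calls wait a second time after the trap has run.

```shell
#!/usr/bin/env bash
# Sketch only: names, timings and the simulated signal are assumptions.

tmp_file=$(mktemp)            # temporary file standing in for job resources
term_handled=no

on_term() {
    # cleanup performed on receipt of SIGTERM
    rm -f "$tmp_file"
    term_handled=yes
}
trap on_term TERM

sleep 3 &                     # child process started by the job script
child_pid=$!

# simulate the Agent: send SIGTERM to this script after one second
( sleep 1; kill -TERM $$ ) &

# the first wait is interrupted by the trapped signal and returns a non-zero
# status; the second wait then blocks until the child has really finished
wait "$child_pid" || wait "$child_pid"
child_status=$?

echo "SIGTERM handled: $term_handled, child exit status: $child_status"
```

The script stays alive until the child process has completed, which is the condition the list above describes as essential.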
The OS commands used by the Agent to send signals include:
...
- If required for your Agent platform, the commands to send signals can be modified - see the JS7 - Agent Configuration Items article.
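The order of signals described earlier (SIGTERM, then the Grace Timeout, then SIGKILL) can be imitated outside JS7 with the plain kill command. The following sketch is not the Agent's implementation and does not show the Agent's configured commands. The variable grace_seconds and the child process that ignores SIGTERM are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: imitates SIGTERM -> grace period -> SIGKILL; not Agent code.

grace_seconds=2               # hypothetical stand-in for the Grace Timeout

# a process that ignores SIGTERM, standing in for a job that does not stop
bash -c 'trap "" TERM; exec sleep 30' &
job_pid=$!
sleep 1                       # give the process time to ignore SIGTERM

kill -15 "$job_pid"           # SIGTERM: ask the job to terminate

# poll until the job has gone or the grace period has expired
waited=0
while kill -0 "$job_pid" 2>/dev/null && [ "$waited" -lt "$grace_seconds" ]; do
    sleep 1
    waited=$((waited + 1))
done

if kill -0 "$job_pid" 2>/dev/null; then
    kill -9 "$job_pid"        # SIGKILL: cannot be trapped or ignored
fi

job_status=0
wait "$job_pid" || job_status=$?   # 137 = 128 + 9 if killed by SIGKILL
echo "job exit status: $job_status"
```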
...
- Line 3 - 11: implements the JS7Trap() function including the wait command. This either waits for termination of child processes or continues immediately.
  - The exit code returned from the trap in the event of script termination is reported by the task log and order log.
  - However, job execution will be considered to have failed regardless of the exit code value as the Cancel/Kill or Suspend/Kill operation has been performed.
- Line 14-16: define traps calling the JS7Trap() function in the event of the following signals being received:
  - EXIT is a summary for a number of signals that terminate a script; however, this is available for the bash shell only.
  - TERM is the termination signal sent by the Agent if the Cancel/Kill or Suspend/Kill operation is invoked.
  - INT is added in case OS processes external to the JS7 Agent send this signal, which usually corresponds to hitting Ctrl+C in a terminal session.
- Line 15-17: starts background processes.
- Line 21: a script should normally wait for child processes. However, if this cannot be guaranteed, for example if set -e is used to abort a script in case of error, then the use of a trap is an appropriate measure.
- The following sequence of actions is performed:
  - The job script listed above does not wait for child processes and therefore terminates, triggering the EXIT pseudo-signal. The trap function is executed and waits for child processes to be completed. During this period the task process for the job remains alive.
  - If subsequently the Cancel/Kill or Suspend/Kill operation is invoked, then the Agent will send a SIGTERM signal which:
    - interrupts the wait command in the currently executed JS7Trap() function,
    - triggers execution of the JS7Trap() function once more and performs the wait operation for child processes.
  - Having applied the Grace Timeout the Agent executes the kill_task.sh script which sends a STOP signal to the task process, kills any child processes and finally sends a SIGKILL signal to abort the task process.
  - The crucial point is that the job script does not terminate with child processes running but remains active due to triggering of a trap which allows the Agent to kill any child processes from the process tree. If the task process for the job script terminates with child processes running then the Agent cannot identify the process tree and cannot kill child processes.
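The first step of the sequence above, in which the EXIT trap keeps the script alive until its child processes have completed, can be tried in isolation. The following is a reduced sketch and not the JS7 example script. The function name WaitTrap, the messages and the timing are assumptions. The inner script is run with bash -c so that its output can be captured.

```shell
#!/usr/bin/env bash
# Sketch only: a reduced stand-in for the pattern, not the JS7 example script.

output=$(bash -c '
WaitTrap() {
    # keep the script process alive until all child processes have finished
    wait
    echo "trap: children finished"
}
trap WaitTrap EXIT

( sleep 1; echo "child: done" ) &   # background child process

echo "script: reached last line"    # the script body ends without calling wait
')

echo "$output"
```

The captured output lists the message from the script body first, then the message from the child process, and the message from the trap last. This shows that the script process remained active while the child process was still running.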
If the job script in the above example is executed from a script file then the exec command should be used to call the script file like this:

    #!/usr/bin/env bash
    exec /tmp/some_script.sh
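The effect of exec can be checked by comparing process IDs. The following sketch prints the process ID of a calling shell and of the command it starts, first without and then with exec. The inner bash -c commands are placeholders standing in for an external script such as /tmp/some_script.sh. The trailing true in the first call is an assumption added so that the inner command is not the last command of the calling shell, as some bash versions run the last command without creating a new process.

```shell
#!/usr/bin/env bash
# Sketch only: the inner commands are placeholders for an external script.

# without exec: the external command runs as a child process with its own PID
without_exec=$(bash -c 'echo $$; bash -c "echo \$\$"; true')

# with exec: the external command replaces the calling shell and keeps its PID
with_exec=$(bash -c 'echo $$; exec bash -c "echo \$\$"')

echo "without exec (caller PID, callee PID):"
echo "$without_exec"
echo "with exec (caller PID, callee PID):"
echo "$with_exec"
```

With exec both lines show the same process ID, so a SIGTERM signal sent to the job's process reaches the external script directly.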
Automation of Exit Traps
JS7 provides an option for applying traps such as those described in the example above. These can be applied to a number of Shell Job scripts via JS7 - Script Includes.
...