Scope

  • The functions for terminating task processes by the JobScheduler Master and Universal Agent have been extended to allow the use of SIGTERM and SIGKILL signals on Unix servers. This allows an orderly termination of task processes to take place over a limited period of time.
  • This article draws together detailed information from a range of issues. It should primarily be of interest to engineers and, to a lesser extent, to operators.

Feature History

This feature has been implemented stepwise between releases 1.9.0 and 1.10.0 (see the list of issues below for more detailed information).

Issues

Support of this feature is subject to the following issues:

  • JOC-10
  • JS-1163
  • JS-1307
  • JS-1382
  • JS-1420
  • JS-1421
  • JS-1463
  • JS-1468
  • JS-1495

Use Case

Users who schedule programs and scripts that are aware of SIGTERM signals can implement clean-up procedures on receipt of the signal. Clean-up includes, for example, removing temporary files, disconnecting from a database and similar tasks.
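As an illustration, a job script might install a trap along the following lines. This is a minimal sketch and is not taken from the JobScheduler distribution: the function names cleanup, on_sigterm and run_job and the variables WORK_FILE and WORK_PID are assumptions chosen for this example, and the sleep command stands in for the real workload.

```shell
#!/bin/sh
# Sketch of a SIGTERM-aware job script. All names are illustrative.

cleanup() {
    # Remove temporary data; a database disconnect would also go here.
    rm -f "$WORK_FILE"
}

on_sigterm() {
    echo "SIGTERM received, cleaning up"
    kill "$WORK_PID" 2>/dev/null    # stop the simulated workload
    cleanup
    exit 143                        # 128 + 15, the usual status for SIGTERM
}

# run_job FILE [SECONDS]: create FILE, work for SECONDS, then clean up.
run_job() {
    WORK_FILE=$1
    : > "$WORK_FILE"
    trap on_sigterm TERM
    # The workload runs in the background and the shell waits for it, so
    # the trap runs as soon as the signal arrives, not after the workload.
    sleep "${2:-30}" &
    WORK_PID=$!
    # The exit status of the workload is not needed in this sketch.
    wait "$WORK_PID" || true
    cleanup
}
```

A script would call run_job with the path of its temporary file as the last step. Running the workload in the background and waiting for it is a design choice: a shell that is blocked in a foreground command only executes its trap after that command has returned.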

  • The use of both SIGTERM and SIGKILL signals on Unix servers has the following advantages:
    • The use of SIGTERM before SIGKILL means that there is a greater chance of data being saved after the kill signal has been issued.

    • The SIGTERM signal can, in contrast with SIGKILL, be monitored, i.e. a pre-/post-processing script can be carried out. This means that the ending of a task by the JobScheduler can be reacted to and the sudo user process itself can be ended.

    • The implementation of SIGTERM allows post-processing methods such as spooler_process_after() to complete within the timeout period.
  • The time allowed between the SIGTERM and the SIGKILL signal can be specified in the command using the timeout attribute (the default is 15 seconds): <kill_task … timeout=".."/>

  • This feature can also be applied for:
    • remote processes, i.e. processes started by SSH and those started by an Agent,
    • child processes started by a process running on an agent (JS-1468).
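The sequence described above, a SIGTERM followed by a SIGKILL once the timeout has expired, can be pictured with a short shell function. This only sketches the general signal handling and is not the code used by the JobScheduler Master or Agent. The function name terminate_with_timeout and its return values are assumptions made for this example; the default of 15 seconds mirrors the default of the timeout attribute.

```shell
#!/bin/sh
# terminate_with_timeout PID [TIMEOUT]
# Sends SIGTERM to PID, waits up to TIMEOUT seconds (default 15) and
# sends SIGKILL if the process is still running after that period.
# Returns 0 if SIGTERM was sufficient, 1 if SIGKILL had to be sent.
terminate_with_timeout() {
    pid=$1
    timeout=${2:-15}

    # Nothing to do if the process has already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # kill -0 sends no signal, it only checks that the process exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        elapsed=$((elapsed + 1))
    done

    if kill -0 "$pid" 2>/dev/null; then
        kill -KILL "$pid" 2>/dev/null
        return 1
    fi
    return 0
}
```

For example, terminate_with_timeout 4711 30 would give the (hypothetical) process 4711 up to 30 seconds to finish its clean-up before it is killed.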

...

Implementation

The following operations can be carried out from the JobScheduler Operating Center interface (JOC) or by use of the command line:

  1. Operation: kill immediately
    • JOC sends <kill_task immediately="yes"/>
    • The process is killed immediately using the SIGKILL signal.
  2. Operation: terminate with timeout
    • JOC sends <kill_task immediately="yes" timeout="15"/>
    • The process receives a SIGTERM signal. Should that process not terminate within the specified timeout period then it will be killed with a SIGKILL signal.
  3. Operation: terminate
    • JOC sends <kill_task immediately="yes" timeout="never"/>
    • The process receives a SIGTERM signal. Monitoring of the process termination as described in Operation 2 above is not carried out.
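The three operations above differ only in the attributes of the <kill_task> command. The following sketch builds the command text for each operation so that the differences can be compared side by side. The function name build_kill_task and the operation keywords are made up for this example. The job and id attributes used to address the task are an assumption, as the commands listed above show only the immediately and timeout attributes. How the command is sent to the JobScheduler is deliberately left out.

```shell
#!/bin/sh
# build_kill_task OPERATION JOB TASK_ID [TIMEOUT]
# Prints the <kill_task> command for one of the three operations.
# The job and id attributes are assumed here to identify the task.
build_kill_task() {
    op=$1
    job=$2
    id=$3
    case $op in
        kill)
            # Operation 1: SIGKILL at once
            attrs='immediately="yes"' ;;
        terminate-timeout)
            # Operation 2: SIGTERM, then SIGKILL after the timeout
            attrs="immediately=\"yes\" timeout=\"${4:-15}\"" ;;
        terminate)
            # Operation 3: SIGTERM only
            attrs='immediately="yes" timeout="never"' ;;
        *)
            echo "unknown operation: $op" >&2
            return 2 ;;
    esac
    printf '<kill_task job="%s" id="%s" %s/>\n' "$job" "$id" "$attrs"
}
```

With a hypothetical job path and task ID, build_kill_task terminate-timeout /test/job1 42 30 prints <kill_task job="/test/job1" id="42" immediately="yes" timeout="30"/>.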

Delimitation

  • This feature is intended for Unix platforms that implement the SIGTERM and SIGKILL signals. It is not intended for Windows platforms, for which only the kill immediately operation applies.
  • When using traps, note that the process created by the <shell> element receives the signal. Subsequent scripts that are called within the <shell> element will not receive the signal.
    You could therefore:
    • configure traps directly within the <shell> element. The shell process will then receive and handle the signal.
    • configure traps in a shell script that is added by an <include> element instead of being stated within the <shell> element. The included shell script will receive and handle the signal.
    • forward signals to subsequent shell scripts that are called within a <shell> element.
  • This feature has been fully implemented on the Universal Agent. It has been implemented for classic JobScheduler Agents using TCP (JS-1420).
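The option of forwarding signals to subsequent shell scripts could look like the following sketch, which would be placed within the <shell> element. It starts the subsequent script in the background, passes a received SIGTERM on to it and returns the exit status of that script. The function name run_with_forwarding and the variables are assumptions made for this example, and rare races (for example a signal that arrives before the trap has been installed) are not handled.

```shell
#!/bin/sh
# run_with_forwarding COMMAND [ARGS...]
# Runs COMMAND as a child process and forwards SIGTERM to it, so that
# traps configured in the called script are executed.
run_with_forwarding() {
    forwarded=0
    "$@" &
    child=$!
    trap 'forwarded=1; kill -TERM "$child" 2>/dev/null' TERM

    rc=0
    wait "$child" || rc=$?
    # wait returns early with a status above 128 when a trapped signal
    # arrives, so wait a second time to collect the child's exit status.
    if [ "$forwarded" -eq 1 ] && [ "$rc" -gt 128 ]; then
        rc=0
        wait "$child" || rc=$?
    fi

    trap - TERM
    return "$rc"
}
```

A call such as run_with_forwarding /path/to/subsequent_script.sh (a placeholder path) would then replace the direct call of the script.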

Workarounds

  • Should job scripts not be able to catch signals by traps, then a monitor (i.e. a pre-/post-processing script) can be used that is called by JobScheduler on receipt of a SIGTERM signal. Such a monitor has to be configured for shell jobs that have a timeout set (JS-1463).
    For example:

    Code Block
    languagexml
    titleWorkaround for shell jobs with a timeout
    <job name="shell_with_javascript_monitor">
        <script language="shell">
            <![CDATA[
    echo hello world!
    sleep 45
            ]]>
        </script>

        <!-- Monitor that only lets the task proceed; its presence is the workaround -->
        <monitor name="process0" ordering="0">
            <script language="java:javascript">
                <![CDATA[
    function spooler_process_before() {
        return true;
    }
                ]]>
            </script>
        </monitor>

        <run_time/>
    </job>

...

Examples

Download the Example

job_trap_sigterm.job.xml.zip

Description

This example contains a job that uses a SIGTERM trap to show the difference between the kill task (<kill_task>) and terminate task (<terminate_task>) commands provided by JOC.

...

  • Start the job
  • Terminate the task in JOC
  • You will see the log message: sigterm will be ignored
  • The task will continue
     

Code Block
languagexml
<?xml version="1.0" encoding="ISO-8859-1"?>

<job title="test test">
    <script language="shell">
        <![CDATA[
# Signal 15 is SIGTERM: log a message instead of terminating
trap 'echo sigterm will be ignored' 15
for i in 1 2 3 4 5 6 7 8 9 0
do
    date
    sleep 10
done
sleep 60
        ]]>
    </script>

    <run_time/>
</job>

References

...