Scope

  • The following question has been asked: 
  • The solution outlined here includes a job chain that is parameterized by orders for individual points in time and by a list of other job chains whose tasks should be terminated.

Solution

  • Download: kill_tasks.zip
    • The files included with the archive should be usable with Windows and Unix systems.
  • Extract the archive to the ./config/live folder of your JobScheduler installation.
  • The archive extracts its files to a folder kill_tasks, so all objects are located in the ./config/live/kill_tasks folder.
  • You can store the sample files in a different folder; in that case you have to adjust the job_chains parameter of the kill_tasks order, see below.

Components

  • The test chain run_infinitely: this chain implements a test job that runs infinitely and can be started in multiple instances.
    • One job run_infinitely is included that executes a number of ping commands to simulate a long running job. The job is configured to run in multiple instances if more than one order is active for the job chain.
    • One order run_infinitely is included that starts the run_infinitely job chain. This order can be started manually by using JOC. Additional orders can be added by use of the Add Order context menu item with JOC.
  • The kill chain kill_tasks_in_job_chain: this chain implements the killing of tasks for any job chains.
    • One job kill_tasks_in_job_chain is included that executes the JobScheduler <kill_task> XML command.
    • One order kill_tasks_in_job_chain is included that starts the kill_tasks_in_job_chain job chain. This order can be started manually by using JOC.
      • This order makes use of two possible parameters:
        • job_chains: This parameter accepts one or multiple job chain paths.
          • Specify the full path including all folder names of the job chain that runs tasks that should be terminated.
          • Separate multiple job chain paths by use of a semicolon ";".
        • timeout: This parameter is available for Unix systems only and specifies a timeout in seconds. 
          • The job kill_tasks_in_job_chain will first send a SIGTERM signal to any tasks of the run_infinitely job. 
          • If the tasks of the run_infinitely job have not terminated when the timeout expires, a SIGKILL signal is sent to them.
          • On Windows a SIGTERM signal is not available, therefore this parameter should be omitted or set to 0.
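
The handling of both parameters can be sketched in plain JavaScript as follows. The helper name parseKillTaskParams and the second job chain path are assumptions made for illustration and are not part of the download archive. Unlike the job itself, the sketch also trims whitespace and skips empty entries.

```javascript
// Hypothetical helper (not included with kill_tasks.zip): interprets the
// values of the job_chains and timeout order parameters.
function parseKillTaskParams(jobChains, timeout) {
  var paths = [];
  var parts = String(jobChains || "").split(";");
  for (var i = 0; i < parts.length; i++) {
    // assumption: surrounding whitespace is not significant
    var path = parts[i].replace(/^\s+|\s+$/g, "");
    if (path.length > 0) {
      paths.push(path);
    }
  }

  // a missing, non-numeric or negative timeout is treated as 0,
  // i.e. no SIGTERM grace period is requested
  var seconds = parseInt(timeout, 10);
  if (isNaN(seconds) || seconds < 0) {
    seconds = 0;
  }

  return { jobChainPaths: paths, timeout: seconds };
}

// "/some/other/chain" is a made-up path for illustration
var parsed = parseKillTaskParams("/issues/kill_tasks/run_infinitely;/some/other/chain", "10");
```

The returned object holds the list of job chain paths to traverse and the timeout in seconds, with 0 meaning that no timeout is applied.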

Usage

  • Sample Usage
    • Start the order run_infinitely for the job chain with the same name. Add an additional order for the test chain, e.g. by using JOC with the Add Order context menu item.
      • The job run_infinitely from the job chain with the same name should be executed in two tasks (visible from JOC by clicking the job).
    • Start the order kill_tasks from the kill_tasks_in_job_chain job chain.
      • The order will cause all tasks from the run_infinitely job chain to be terminated.
      • The order included with the above download archive specifies the path /issues/kill_tasks/run_infinitely for the job chain to be terminated. Modify this parameter to a different job chain path if your test job chain is located elsewhere.
  • Individual Usage
    • Use any job chain path with the job_chains parameter of the kill_tasks order for the kill_tasks_in_job_chain job chain.
    • Specify a value for the timeout parameter to have the tasks receive a SIGTERM signal first and a SIGKILL signal after the timeout expires. This functionality is available for Unix systems only.

Implementation

Job Chain: kill_tasks_in_job_chain

The job chain is straightforward and includes only one job:

job chain configuration: kill_tasks_in_job_chain
<job_chain  title="Kill any tasks in a job chain" name="kill_tasks_in_job_chain">
    <job_chain_node  state="kill" job="kill_tasks_in_job_chain" next_state="success" error_state="error"/>
    <job_chain_node  state="success"/>
    <job_chain_node  state="error"/>
</job_chain>

Job: kill_tasks_in_job_chain

The job uses the JobScheduler API to retrieve information about running tasks and to kill these tasks:

job configuration: kill_tasks_in_job_chain
function spooler_process() {
  var rc = true;

  // merge parameters from task and order
  var params = spooler.create_variable_set();
  params.merge( spooler_task.params );
  params.merge( spooler_task.order.params );

  // accept a list of semicolon separated job chain paths
  var jobChainPaths = params.value( "job_chains" ).split( ";" );
  var timeout = params.value( "timeout" );
  if (!timeout) { timeout = 0; }

  // traverse job chains as specified
  for (var jobChainIndex = 0; jobChainIndex < jobChainPaths.length; jobChainIndex++) {
    var jobChainPath = jobChainPaths[jobChainIndex];
    if (!spooler.job_chain_exists( jobChainPath )) {
      spooler_log.error( ".. specified job chain does not exist: " + jobChainPath );
      rc = false;
      continue;
    }

    spooler_log.info( ".. traversing job chain: " + jobChainPath );
    var response = spooler.execute_xml( "<show_job_chain job_chain='" + jobChainPath + "' what='job_chain_jobs'/>" );
    var jobChainDOM = new Packages.sos.xml.SOSXMLXPath( new java.lang.StringBuffer( response ) );
    var jobChainNodes = jobChainDOM.selectNodeList( "/spooler/answer/job_chain/job_chain_node" );

    // traverse job chain nodes
    for( var nodeIndex=0; nodeIndex<jobChainNodes.getLength(); nodeIndex++ ) {
      var jobChainNode = jobChainNodes.item(nodeIndex);
      var job = jobChainDOM.selectSingleNodeValue( jobChainNode, "@job" );
      if (job == null) {
        continue;
      }

      spooler_log.info( ".... job found: " + job );
      var response = spooler.execute_xml( "<show_job job='" + job + "' job_chain='" + jobChainPath + "' what='all'/>" );
      var jobDOM = new Packages.sos.xml.SOSXMLXPath( new java.lang.StringBuffer( response ) );
      var taskNodes = jobDOM.selectNodeList( "/spooler/answer/job/tasks/task" );

      for( var taskNodeIndex=0; taskNodeIndex<taskNodes.getLength(); taskNodeIndex++ ) {
        var taskNode = taskNodes.item( taskNodeIndex );
        var taskID = jobDOM.selectSingleNodeValue( taskNode, "@id" );
        if (taskID == null) {
          continue;
        }

        spooler_log.info( "...... task found: " + taskID );
        if ( timeout > 0 ) {
          var command = "<kill_task id='" + taskID + "' job='" + job + "' immediately='yes' timeout='" + timeout + "'/>";
        } else {
          var command = "<kill_task id='" + taskID + "' job='" + job + "' immediately='yes'/>";
        }

        spooler_log.info( "........ kill task command: " + command );
        var response = spooler.execute_xml( command );
        var killTaskDOM = new Packages.sos.xml.SOSXMLXPath( new java.lang.StringBuffer(response) );
        var errorCode = killTaskDOM.selectSingleNodeValue( "//ERROR/@code" );
        var errorText = killTaskDOM.selectSingleNodeValue( "//ERROR/@text" );
        if ( errorCode || errorText ) {
          spooler_log.error( "........ kill task response: errorCode=" + errorCode + ", errorText=" + errorText );
          rc = false;
        }
      }
    }  
  }
  return rc;
}
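
To make the effect of the timeout parameter on the generated command more tangible, the following stand-alone sketch reproduces the string concatenation used by the job. The function name buildKillTaskCommand and the task ID 4711 are assumptions for illustration only; the job itself passes the resulting string to spooler.execute_xml.

```javascript
// Hypothetical helper: builds the <kill_task> command string in the same
// attribute order as the job script above.
function buildKillTaskCommand(taskID, job, timeout) {
  var command = "<kill_task id='" + taskID + "' job='" + job + "' immediately='yes'";
  if (timeout > 0) {
    // only add the timeout attribute if a positive timeout was specified
    command += " timeout='" + timeout + "'";
  }
  return command + "/>";
}

// made-up task ID, job path taken from the sample order
var withTimeout = buildKillTaskCommand("4711", "/issues/kill_tasks/run_infinitely", 10);
var withoutTimeout = buildKillTaskCommand("4711", "/issues/kill_tasks/run_infinitely", 0);
```

With a timeout of 10 the command carries the additional attribute timeout='10'; with a timeout of 0 the attribute is omitted and only immediately='yes' is sent.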