Scope

Solution

  • Download split_join.zip
  • Extract the archive to the ./config/live folder of your JobScheduler Master installation.
  • The archive extracts its files into a folder named split_join.
  • The split_join folder can be renamed if required. The solution does not depend on specific folder or Job names.

Pattern

Implementation

Components

  • The Job Chain and the Jobs job1 to job6 provided in the download are not specific to this solution. The Jobs are simple shell scripts.

The Splitter Job

  • The JobChainSplitter Job is the Splitter JITL Job and uses the Java class com.sos.jitl.splitter.JobChainSplitterJSAdapterClass.
    • There is no restriction on the name which can be given to this Job.
    • The JobChainSplitter Job is used with the following parameters:
      • state_names:
        • A list of semicolon separated Job Node states.
          • The Job Node state names correspond to the state names of the first job node of each child Job Chain segment that is to be processed in parallel. 
          • An individual Order is created for each entry in this list.
        • To produce a clearer Job Chain diagram in JOE, each state name is prefixed with the state name of the split Job and a colon, followed by the name of the job associated with the state.
          • Example for state of job3: split:job3
          • Example for state_names parameter value: split:job3;split:job4
      • join_state_name:
        • This parameter is required for the Job Chain Details view in the JOC Cockpit and for the Job Chain diagram shown in JOE. Its value is the state that is associated with the join Job node.
    • Each child Job Chain segment can have any number of Jobs and can also include further split and join nodes.
      (See the How to Nest Parallel Executing Jobs in a Job Chain article for more information.)
    • The implementation of this Job is shown in the following code:

      The Split Job
      <?xml version="1.0" encoding="ISO-8859-1"?>
      <job  title="Start a parallel processing in a job chain" order="yes" stop_on_error="no">
          <description >
              <include  file="jobs/JobSchedulerJobChainSplitter.xml"/>
          </description>
          <params >
              <param  name="state_names" value="split:job3;split:job4"/>
              <param  name="join_state_name" value="join"/>
          </params>
          <script  language="java" java_class="com.sos.jitl.splitter.JobChainSplitterJSAdapterClass"/>
          <run_time />
      </job>
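
    • As an illustrative sketch, a third child Job Chain segment could be started by appending a further entry to state_names. The state split:job5 and its job job5 are assumptions made for this example only. The Job Chain would also need a matching node with that state:

      The Split Job with three parallel segments (sketch)
      <?xml version="1.0" encoding="ISO-8859-1"?>
      <job  title="Start a parallel processing in a job chain" order="yes" stop_on_error="no">
          <params >
              <!-- split:job5 is a hypothetical third entry; one Order is created per entry -->
              <param  name="state_names" value="split:job3;split:job4;split:job5"/>
              <param  name="join_state_name" value="join"/>
          </params>
          <script  language="java" java_class="com.sos.jitl.splitter.JobChainSplitterJSAdapterClass"/>
          <run_time />
      </job>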

The Join Orders Job

  • The JobSchedulerJoinOrders Job is used to join up Orders and uses the Java class com.sos.jitl.join.JobSchedulerJoinOrdersJSAdapterClass.
    • This Job does not require any parameters to be set when it is used with the JobChainSplitter Job as described here.
    • The implementation of this Job is shown in the following code:

      The Join Orders Job
      <?xml version="1.0" encoding="ISO-8859-1"?>
      <job  title="Join Orders coming from a split" order="yes" stop_on_error="no">
          <description >
              <include  file="jobs/JobSchedulerJoinOrder.xml"/>
          </description>
          <params />
          <script  language="java" java_class="com.sos.jitl.join.JobSchedulerJoinOrdersJSAdapterClass"/>
          <run_time />
      </job>

The Split & Join Job Chain and Order

The next code block shows the configuration for the Split & Join Job Chain. Note that the next state for the split node is set to the join node. 

The Split & Join Job Chain
<?xml version="1.0" encoding="ISO-8859-1"?>
<job_chain title="Split &amp; Join Chain" name="job_chain">
        <job_chain_node state="job1" job="job1" next_state="split" error_state="error" />
        <job_chain_node state="split" job="split_job" next_state="join" error_state="error" />
        <job_chain_node state="split:job3" job="job3" next_state="job3a" error_state="error" on_error="suspend" />
        <job_chain_node state="job3a" job="job3a" next_state="join" error_state="error" on_error="suspend" />
        <job_chain_node state="split:job4" job="job4" next_state="job4a" error_state="error" on_error="suspend" />
        <job_chain_node state="job4a" job="job4a" next_state="join" error_state="error" on_error="suspend" />
        <job_chain_node state="join" job="join_job" next_state="job6" error_state="error" />
        <job_chain_node state="job6" job="job6" next_state="success" error_state="error" />
        <job_chain_node state="success" />
        <job_chain_node state="error" />
</job_chain>
The Start Order for the Split & Join Job Chain
<?xml version="1.0" encoding="ISO-8859-1"?>
<order  title="Split &amp; Join Processing">
    <run_time />
</order>
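
Child Job Chain segments do not have to be the same length. The following sketch extends the Job Chain above with a third, single-node segment. The state split:job5 and the job job5 are assumptions made for illustration and are not part of the configuration shown above. The state_names parameter of the split Job would also need the entry split:job5 for this segment to be started.

The Split & Join Job Chain with a third segment (sketch)
<?xml version="1.0" encoding="ISO-8859-1"?>
<job_chain title="Split &amp; Join Chain" name="job_chain">
        <job_chain_node state="job1" job="job1" next_state="split" error_state="error" />
        <job_chain_node state="split" job="split_job" next_state="join" error_state="error" />
        <job_chain_node state="split:job3" job="job3" next_state="job3a" error_state="error" on_error="suspend" />
        <job_chain_node state="job3a" job="job3a" next_state="join" error_state="error" on_error="suspend" />
        <job_chain_node state="split:job4" job="job4" next_state="job4a" error_state="error" on_error="suspend" />
        <job_chain_node state="job4a" job="job4a" next_state="join" error_state="error" on_error="suspend" />
        <!-- Hypothetical third segment: a single node that leads directly to the join node -->
        <job_chain_node state="split:job5" job="job5" next_state="join" error_state="error" on_error="suspend" />
        <job_chain_node state="join" job="join_job" next_state="job6" error_state="error" />
        <job_chain_node state="job6" job="job6" next_state="success" error_state="error" />
        <job_chain_node state="success" />
        <job_chain_node state="error" />
</job_chain>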

Operation

  • Start the start Order in the JOC Cockpit using, for example, the Start Order Now option in the Job Chains Details tab.
  • The split Job will be processed after job1 has completed and will generate Orders for the child Job Chain segments starting with job3 and job4 respectively.
  • The Order for the main Job Chain will move to the join Job, where it will be suspended until the number of child Job Chain segment Orders required by the join Job have completed.
  • The main Order will then be resumed and processing of the main Job Chain will continue with job6.
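
As an alternative to the JOC Cockpit, the start Order could also be started with a JobScheduler XML command. The following is a minimal sketch. It assumes that the folder has kept the name split_join, so that the Job Chain path is /split_join/job_chain, and that the Order ID is start. Check the attributes against the XML command reference for the JobScheduler release in use.

Starting the main Order with an XML command (sketch)
<!-- The job_chain path is an assumption based on the default folder name -->
<modify_order job_chain="/split_join/job_chain" order="start" at="now"/>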

Error Handling

To allow for efficient error handling, the on_error="suspend" setting should be used for each node in the child Job Chain segments, as shown in the Job Chain code block above. This setting is made in JOE in the Nodes tab for the Job Chain Steps/Nodes elements.

The following decision tree summarizes the procedure to be followed if a Job in a split child Job Chain is suspended:

The error handling process requires some manual work. This is normally done using the Job Chains view of the JOC Cockpit interface.

  • To skip the error node, use the JOC Cockpit Set Order State function, which is available in the Order Additional Options menu. Then resume the Order using Resume Order.
  • To cancel the whole process, delete all suspended split Orders with the JOC Cockpit Delete Order function. Then reset the main Order with the Reset Order function.
  • To add a dummy Order to the join node, so that the condition for the main Order to proceed is satisfied, use the JOC Cockpit Add Order function:
    • Associate the dummy Order with the main Order in one of two ways:
      • Add the parameter join_session_id=main_order_id, where main_order_id is the Order ID of the main Order.
      • Alternatively, name the Order with the ID main-order-id_any, where main-order-id is the Order ID of the main Order.
    • Set both the start step and the end step to the join node.
    • Then delete the suspended Order, if there is one.
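
The dummy Order described in the last option could also be added with a JobScheduler XML command instead of the Add Order function. The following is a minimal sketch under stated assumptions: the Job Chain path /split_join/job_chain assumes the default folder name, the Order ID start_dummy is a hypothetical name chosen for this example, and the main Order ID is start. Check the attributes against the XML command reference for the JobScheduler release in use.

Adding a dummy Order to the join node (sketch)
<!-- state and end_state are both set to the join node, so the Order is only processed by the join Job -->
<add_order job_chain="/split_join/job_chain" id="start_dummy" state="join" end_state="join">
    <params>
        <!-- Associates the dummy Order with the main Order, whose ID is assumed to be start -->
        <param name="join_session_id" value="start"/>
    </params>
</add_order>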

Order IDs

The 'main' Order for the split & join download, that is, the Order that proceeds job1 -> split -> join -> job6, has the ID 'start'.

The Orders for the two parallel child Job Chain segments are given the IDs start_split:job3 and start_split:job4. These IDs are generated from the main Order ID, an underscore and the state name of the first node in each child Job Chain segment.