Solution

  • Download split_join.zip
  • Extract the archive to the ./config/live folder of your JobScheduler Master installation.
  • The archive will extract the files to a folder named split_join.
  • The split_join folder can be renamed if required; the solution does not require the use of specific folder names or Job names.

Pattern

Flowchart
job_chain [label="Job Chain",fillcolor="orange"]
job_1 [label="Job 1",fillcolor="lightskyblue"]
job_2 [label="Job 2",fillcolor="lightskyblue"]
job_split [label="Job Split",fillcolor="lightskyblue"]
job_3 [label="Job 3",fillcolor="lightskyblue"]
job_3a [label="Job 3a",fillcolor="lightskyblue"]
job_4 [label="Job 4",fillcolor="lightskyblue"]
job_4a [label="Job 4a",fillcolor="lightskyblue"]
job_join [label="Job Join",fillcolor="lightskyblue"]
job_6 [label="Job 6",fillcolor="lightskyblue"]

job_chain -> job_1
job_1 -> job_2
job_2 -> job_split
job_split -> job_3
job_split -> job_4
job_3 -> job_3a -> job_join
job_4 -> job_4a -> job_join
job_join -> job_6

Implementation

Components

  • The Job Chain and the Jobs job1 to job6 provided in the sample download are not specific to this solution; they represent simple shell scripts.

The Splitter Job

  • The JobChainSplitter Job is the Splitter JITL Job and uses the Java class com.sos.jitl.splitter.JobChainSplitterJSAdapterClass.
    • There is no restriction on the name that can be given to this Job.
    • The JobChainSplitter Job is used with the following parameters:
      • state_names:
        • A list of semicolon-separated Job Node states.
          • The Job Node state names correspond to the state names of the first Job Node of each child Job Chain segment that is to be processed in parallel.
          • An individual Order is created for each entry in this list.
        • To support better graphical output in the JOE diagram, the state names are prefixed with the name of the state of the split Job and a colon, followed by the name of the Job associated with the state.
          • Example for the state of job3: split:job3
          • Example for the state_names parameter value: split:job3;split:job4
      • join_state_name:
        • This parameter is required for the Job Chain Details view in the JOC Cockpit and for the Job Chain Diagram shown in JOE. It accepts the value of the state that is associated with the join Job node.
    • Each child Job Chain segment can have any number of Jobs and can also include further split and join nodes. (See the How to Nest Parallel Executing Jobs in a Job Chain article for more information.)
    • The implementation of this Job is shown in the following code:

      Code Block (XML): The Split Job
      <?xml version="1.0" encoding="ISO-8859-1"?>
      <job  title="Start a parallel processing in a job chain" order="yes" stop_on_error="no">
          <description >
              <include  file="jobs/JobSchedulerJobChainSplitter.xml"/>
          </description>
          <params >
              <param  name="state_names" value="split:job3;split:job4"/>
              <param  name="join_state_name" value="join"/>
          </params>
          <script  language="java" java_class="com.sos.jitl.splitter.JobChainSplitterJSAdapterClass"/>
          <run_time />
      </job>
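
The state_names list is not limited to two entries. The following fragments are a minimal sketch of how a third parallel segment might be added, assuming a hypothetical Job named job7 that is not part of the sample download. The first fragment shows the adjusted parameters of the split Job, the second shows the additional Job Chain node that the new entry would refer to.

```xml
<!-- Fragment 1: parameters of the split Job, one state_names entry per child segment. -->
<!-- "split:job7" is a hypothetical third entry added for illustration. -->
<params>
    <param name="state_names" value="split:job3;split:job4;split:job7"/>
    <param name="join_state_name" value="join"/>
</params>

<!-- Fragment 2: Job Chain node for the hypothetical third segment. -->
<!-- Its state must match the state_names entry; it hands over to the join node. -->
<job_chain_node state="split:job7" job="job7" next_state="join" error_state="error" on_error="suspend"/>
```

With three entries in state_names, three split Orders would be expected, one for each segment.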

The Join Orders Job

  • The JobSchedulerJoinOrders Job is used to join up Orders and uses the Java class com.sos.jitl.join.JobSchedulerJoinOrdersJSAdapterClass.
    • This Job does not require any parameters to be set when it is used with the JobChainSplitter Job as described here.
    • The implementation of this Job is shown in the following code:

      Code Block (XML): The Join Orders Job
      <?xml version="1.0" encoding="ISO-8859-1"?>
      <job  title="Join Orders coming from a split" order="yes" stop_on_error="no">
          <description >
              <include  file="jobs/JobSchedulerJoinOrder.xml"/>
          </description>
          <params />
          <script  language="java" java_class="com.sos.jitl.join.JobSchedulerJoinOrdersJSAdapterClass"/>
          <run_time />
      </job>

The Split & Join Job Chain and Order

The next code block shows the configuration for the Split & Join Job Chain. Note that the next state for the split node is set to the join node. 

Code Block (XML): The Split & Join Job Chain
<?xml version="1.0" encoding="ISO-8859-1"?>
<job_chain title="Split &amp; Join Chain" name="job_chain">
        <job_chain_node state="job1" job="job1" next_state="split" error_state="error" />
        <job_chain_node state="split" job="split_job" next_state="join" error_state="error" />
        <job_chain_node state="split:job3" job="job3" next_state="job3a" error_state="error" on_error="suspend" />
        <job_chain_node state="job3a" job="job3a" next_state="join" error_state="error" on_error="suspend" />
        <job_chain_node state="split:job4" job="job4" next_state="job4a" error_state="error" on_error="suspend" />
        <job_chain_node state="job4a" job="job4a" next_state="join" error_state="error" on_error="suspend" />
        <job_chain_node state="join" job="join_job" next_state="job6" error_state="error" />
        <job_chain_node state="job6" job="job6" next_state="success" error_state="error" />
        <job_chain_node state="success" />
        <job_chain_node state="error" />
</job_chain>

Code Block (XML): The Start Order for the Split & Join Job Chain
<?xml version="1.0" encoding="ISO-8859-1"?>
<order  title="Split &amp; Join Processing">
    <run_time />
</order>

Operation

  • Start the start Order in the JOC Cockpit using, for example, the Start Order Now option in the Job Chains Details tab.
  • The split Job will be processed after job1 has been completed and will generate Orders for the child Job Chain segments starting with job3 and job4 respectively.
  • The Order for the main Job Chain will move to the join Job, where it will be suspended until all of the child Job Chain segment Orders required by the join Job have been completed.
  • The main Order will then be resumed and processing of the main Job Chain will proceed with job6.

Error Handling

To allow for efficient error handling the on_error="suspend" setting should be used for each Node in the child Job Chains as shown in the Job Chain code block above. This setting is made in JOE in the Nodes Tab for the Job Chain Steps/Nodes elements.

The following decision tree summarizes the procedure to be followed if a Job in a split child Job Chain is suspended:

Flowchart
1 [label="Order is suspended at a split node"]
1 -> 2
2 [shape="diamond", label="Can the error \n situation be resolved?", fillcolor="lightblue"]
2 -> 10 [label="Yes"]
10 [shape="diamond", label="Is rerunning the job \n possible and necessary \n for the actual node?", fillcolor="lightblue"]
10 -> 100 [label="Yes"]
100 [label="Resume the Order"]
10 -> 101 [label="No"]
101 [label="Skip the node with the \n JOC set order state function"]
2 -> 20 [label="No"]
20 [shape="diamond", label="Is rerunning the job \n necessary?", fillcolor="lightblue"]
20 -> 200 [label="Yes"]
200 [label="Cancel the whole process"]
20 -> 201 [label="No"]
201 [label="Skip the node with the \n JOC set order state function"]

The error handling process requires some manual work. This is normally done using the Job Chains view of the JOC Cockpit interface.

  • To skip the error node, use the JOC Cockpit Set Order State function, which is available in the Order Additional Options menu. Then resume the Order using Resume Order.
  • To cancel the whole process, delete all suspended split Orders with the JOC Cockpit Delete Order function. Then reset the main Order with the Reset Order function.
  • To satisfy the join condition so that the main Order can proceed, add a dummy Order to the join node with the JOC Cockpit Add Order function:
    • Either add the parameter join_session_id=main_order_id, where main_order_id is the Order ID of the main Order,
    • or name the Order with the ID main-order-id_any, where main-order-id is the Order ID of the main Order.
    • Set the start step and the end step to the join node.
    • Then delete the suspended Order if there is one.
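
The dummy Order described above can also be written down as XML. The following is only a sketch: it assumes that the Order is submitted with the JobScheduler add_order XML command rather than through the JOC Cockpit dialog, that the Job Chain is located at /split_join/job_chain (the default folder name of the sample download), and that the main Order has the ID start. The Order ID dummy_join is a hypothetical name chosen for illustration.

```xml
<!-- Sketch of a dummy Order that starts and ends at the join node. -->
<!-- Assumptions: job chain path, Order ID "dummy_join" and main Order ID "start". -->
<add_order job_chain="/split_join/job_chain" id="dummy_join" state="join" end_state="join">
    <params>
        <!-- Associates this Order with the main Order that is waiting at the join node. -->
        <param name="join_session_id" value="start"/>
    </params>
</add_order>
```

Setting both state and end_state to join corresponds to choosing the join node as the start step and the end step in the Add Order function.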

Order IDs

The 'main' Order for the split & join download - that is the Order that proceeds job1 -> split -> join -> job6 - has the ID 'start'.

The Orders for the two parallel child Job Chain segments are given the IDs start_split:job3 and start_split:job4. These IDs are generated from the main Order ID, an underscore and the state name of the first node in each child Job Chain segment.