
Scope

  • Use Case
    • Run a specific job within a job chain in a number of parallel instances, i.e. split a job for parallel processing. Each instance is executed on a different Agent.
      • The use case is intended for execution of the same job within a job chain in parallel instances, not for parallel execution of different jobs.
      • We assume an application for end-of-day processing that creates balance sheets for a large number of accounts. The application is installed on a number of servers, each equipped with an Agent. The processing is split in a way that each instance of the application processes a chunk of accounts.
      • A similar use case is explained in the How to split and sync a dynamic number of job instances in a job chain article; however, this use case is about using a different Agent for each parallel process instead of running parallel processes on the same Master or Agent.
    • Synchronize jobs after parallel processing.
  • Solution Outline
    • The number of parallel job instances is dynamically determined from the number of Agents that are available to run the application.
    • A number of orders is created for a specific job that is executed in parallel on different Agents. The number of orders corresponds to the number of Agents, and each order is parameterized to process a chunk of accounts.
    • Finally the parallel job instances are synchronized for further processing.
  • References

Solution

  • Download parallel_job_instances_with_agents.zip
  • Extract the archive to the ./config/live folder of your JobScheduler Master installation.
  • The archive extracts the files to a folder parallel_job_instances_with_agents.
  • You can store the sample files in any folder you like; the solution does not make use of specific folder or job names.

Pattern
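
The solution assumes a job chain along the following lines. This is a minimal sketch: the state names split and process and the error handling are assumptions, the synchronize state corresponds to the default value of the sync_state_name parameter of the end_of_day_split job, and the job names are the ones used throughout this article.

<job_chain>
    <job_chain_node      state="split"       job="end_of_day_split"          next_state="process"     error_state="error"/>
    <job_chain_node      state="process"     job="create_balance_statements" next_state="synchronize" error_state="error"/>
    <job_chain_node      state="synchronize" job="end_of_day_sync"           next_state="success"     error_state="error"/>
    <job_chain_node.end  state="success"/>
    <job_chain_node.end  state="error"/>
</job_chain>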

Implementation

Components

  • The end_of_day_split job reads the process class configuration that is assigned to its successor job, the create_balance_statements job. It creates the number of orders that corresponds to the number of Agents.
    • The following parameters are added to each order:
      • number_of_orders: the number of orders that have been created.
      • <job_chain_name>_required_orders: the number of orders that the end_of_day_sync job waits for. This is the value of the number_of_orders parameter incremented by 1 to account for the main order. The prefix is made up of the name of the job chain to allow parallel use of this job with a number of job chains.
    • The orders are assigned the state that is associated with the next job node, i.e. the create_balance_statements job node, and will therefore be executed starting with that state.
    • The orders are assigned the end state that is associated with the end_of_day_sync job.
  • The create_balance_statements job is configured for a maximum of 10 parallel tasks via the attribute <job tasks="10">. It could be configured for any number of parallel tasks. For the sake of this sample the limitation shows how the job waits for processes to become free before they are assigned to subsequent orders for the same job. A configuration sketch is shown after this list.
  • The end_of_day_sync job is used to synchronize the split orders and is provided by the Sync JITL job with the Java class com.sos.jitl.sync.JobSchedulerSynchronizeJobChainsJSAdapterClass. A configuration sketch follows the end_of_day_split job below.
    • This job is used without parameters.
  • The end_of_day process class configures a number of Agents.
  • Hint: to re-use the end_of_day_split job you can
    • store the job in some central folder and reference the job in individual job chains.
    • move the JavaScript code of the job to some central location and use a corresponding <include> element for individual job scripts, as shown in the second sketch after this list.
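
For illustration, the configuration of the create_balance_statements job could look like the following minimal sketch: the tasks="10" attribute and the job name are taken from this article, whereas the process class reference and the script body are assumptions made for the sketch.

<job  order="yes" stop_on_error="no" tasks="10" process_class="end_of_day" title="create balance statements for a chunk of accounts">
    <script  language="java:javascript">
        <![CDATA[
function spooler_process()
{
    // process the chunk of accounts given by the min_account and max_account order parameters
    var minAccount = parseInt( spooler_task.order.params.value( 'min_account' ) );
    var maxAccount = parseInt( spooler_task.order.params.value( 'max_account' ) );
    spooler_log.info( '.. creating balance statements for accounts ' + minAccount + ' - ' + maxAccount );
    return true;
}
        ]]>
    </script>
    <run_time />
</job>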

 
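If the JavaScript code is moved to a central location, the script of an individual job could reference it like this. This is a minimal sketch; the file name and folder are assumptions and the file is expected to reside within the live folder.

<script  language="java:javascript">
    <include  live_file="common/end_of_day_split.js"/>
</script>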

Job: end_of_day_split
<?xml version="1.0" encoding="ISO-8859-1"?>

<job  order="yes" stop_on_error="no" title="split processing for execution on a number of agents">
    <params >
        <param  name="sync_state_name" value="synchronize"/>
    </params>
    <script  language="java:javascript">
        <![CDATA[
function spooler_process()
{
	var rc = true;
	var syncParameterName = spooler_task.order.job_chain.name + '_required_orders';

    // get job and order parameters
  	var params = spooler.create_variable_set();
  	params.merge( spooler_task.params );
  	params.merge( spooler_task.order.params );

	if ( !params.value( 'min_account' ) )
	{
		spooler_log.error( 'parameter missing: min_account' );
	}

	if ( !params.value( 'max_account' ) )
	{
		spooler_log.error( 'parameter missing: max_account' );
	}

	if ( !params.value( 'sync_state_name' ) )
	{
		spooler_log.error( 'parameter missing: sync_state_name' );
	}

    var minAccount = parseInt(params.value( 'min_account' ));
    var maxAccount = parseInt(params.value( 'max_account' ));
    var syncStateName = params.value( 'sync_state_name' );

    // get Agents from the process class that is assigned to the next job node
	var processClassName = spooler_task.order.job_chain_node.next_node.job.process_class.name;
	spooler_log.debug( '.. looking up process class for next job node: ' + processClassName );

	if ( processClassName.lastIndexOf( '/' ) > -1 )
	{
		var processClassDirectory = processClassName.substring( 0, processClassName.lastIndexOf( '/' ) );
		var processClassPath = processClassName;
		processClassName = processClassName.substring( processClassName.lastIndexOf( '/' ) + 1 );
	} else {
		var processClassDirectory = spooler_task.order.job_chain.path.substring( 0, spooler_task.order.job_chain.path.lastIndexOf( '/' ) );
		var processClassPath = processClassDirectory + '/' + processClassName;
	}

    spooler_log.debug( '.. executing xml: <show_state subsystems="folder process_class" what="source folders no_subfolders" path="' + processClassDirectory + '"/>' );
    var response = spooler.execute_xml( '<show_state subsystems="folder process_class" what="source folders no_subfolders" path="' + processClassDirectory + '"/>' );

    if ( response )
    {
        spooler_log.debug( '.... show state for process class: ' + response );
        var xmlDOM = new Packages.sos.xml.SOSXMLXPath( new java.lang.StringBuffer( response ) );
		var xmlAgents = xmlDOM.selectNodeList( '//process_classes/process_class[@path = "' + processClassPath + '"]//remote_scheduler' );

		if ( xmlAgents.getLength() == 0 )
		{
			spooler_log.error( 'no process class found for expected path: ' + processClassPath );
		}

        var currentOrderNum = 0;
		var currentOrderChunkSize = ( maxAccount-minAccount+1 ) / xmlAgents.getLength();
		var currentOrderChunkSizeInt = parseInt( currentOrderChunkSize );
		var currentOrderChunkSizeRemainder = currentOrderChunkSize % 1;
		var subMinAccount = 0;
		var subMaxAccount = minAccount-1;

		for( var i=0; i < xmlAgents.getLength(); i++ )
		{
			spooler_log.debug( '.... Agent found: ' + xmlAgents.item(i).getAttribute( 'remote_scheduler' ) );
			// sample: calculate range for child order from the number of available agents for the process class
			subMinAccount = subMaxAccount + 1;
			subMaxAccount = subMinAccount + currentOrderChunkSizeInt;

			if ( i == 0 && currentOrderChunkSizeRemainder )
			{
				subMaxAccount += 1; 
			} else if ( i == xmlAgents.getLength()-1 ) {
				subMaxAccount = maxAccount;
			}

			// create child order
		    var subOrder = spooler.create_order();
		    var subParams = spooler.create_variable_set();
			subParams.set_var( syncParameterName, ( xmlAgents.getLength() + 1 ) );
		    subParams.set_var( 'sync_session_id', spooler_task.order.id );
		    subParams.set_var( 'number_of_orders', xmlAgents.getLength() );
		    subParams.set_var( 'min_account', subMinAccount );
		    subParams.set_var( 'max_account', subMaxAccount );
		    subOrder.params = subParams;
		    subOrder.id = spooler_task.order.id + '_child_order_' + ( currentOrderNum + 1 );
		    subOrder.state = spooler_task.order.job_chain_node.next_state;
		    subOrder.end_state = syncStateName;

			// launch child order
		    spooler_task.order.job_chain.add_order( subOrder );
		    spooler_log.info( '.. child order ' + ( currentOrderNum + 1 ) + ' has been added for range: ' + subMinAccount + ' - ' + subMaxAccount + ': ' + subOrder.id );
			currentOrderNum++;
		}
		// parameterize the main order for the sync job and move it to the sync state
		spooler_task.order.params.set_var( syncParameterName, ( xmlAgents.getLength() + 1 ) );
		spooler_task.order.params.set_var( 'sync_session_id', spooler_task.order.id );
		spooler_task.order.state = syncStateName;
    } else {
        spooler_log.error( 'no response from JobScheduler Master for command: <show_state>' );
        rc = false;
    }

  	return rc;
}
        ]]>
    </script>
    <run_time />
</job>  

Explanations

  • The job expects the order parameters min_account and max_account and a sync_state_name parameter that defaults to synchronize.
  • It looks up the process class that is assigned to the next job node and reads the Agents configured for that process class from the response to a <show_state> command.
  • For each Agent a child order is created that is parameterized with a chunk of the account range. The child orders start with the state of the next job node and end with the sync state.
  • Finally the main order is parameterized for the end_of_day_sync job and moved to the sync state.

 
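Job: end_of_day_sync

The configuration of the end_of_day_sync job is not listed in this article. A minimal sketch could look as follows; the title is an assumption, the Java class is the one named in the Components section.

<?xml version="1.0" encoding="ISO-8859-1"?>

<job  order="yes" stop_on_error="no" title="synchronize parallel job instances">
    <script  language="java" java_class="com.sos.jitl.sync.JobSchedulerSynchronizeJobChainsJSAdapterClass"/>
    <run_time />
</job>

The job is used without parameters; the number of orders to wait for is passed at run time by way of the <job_chain_name>_required_orders order parameter.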

Process Class: end_of_day
<process_classes >
    <process_class  max_processes="10">
        <remote_schedulers  select="next">
            <remote_scheduler  remote_scheduler="http://server1:4445"/>
            <remote_scheduler  remote_scheduler="http://server2:4445"/>
            <remote_scheduler  remote_scheduler="http://server3:4445"/>
        </remote_schedulers>
    </process_class>
</process_classes>

Explanations

  • The process class defines an Active Cluster with round-robin scheduling by use of the select="next" attribute: each task for an order gets executed on the next Agent.
  • The process class specifies a number of Agents that are addressed by the http or https protocol.

 

Order: end_of_day
<order>
    <params >
        <param  name="min_account" value="100000"/>
        <param  name="max_account" value="200000"/>
    </params>
    <run_time />
</order>

Explanations

  • The order includes the parameters that specify the range of accounts for which balance sheets are created.
  • Any order configuration, e.g. run-time rule, can be added.

Usage

  • Start the end_of_day order for the end_of_day job chain by use of JOC.
  • The processing will
    • split the execution into 3 child orders that run for the create_balance_statements job, each on a different Agent and each processing roughly one third of the account range.
    • move the main order to the end_of_day_sync job node.
  • The split orders for the create_balance_statements job arrive in the end_of_day_sync job node and wait for all orders to be completed. Once all split orders have completed, processing continues.
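
As an alternative to JOC, the order could be added by submitting an XML command to the JobScheduler Master, e.g. along the lines of the following sketch. The job chain path assumes that the sample has been extracted to the parallel_job_instances_with_agents folder.

<add_order  job_chain="/parallel_job_instances_with_agents/end_of_day">
    <params >
        <param  name="min_account" value="100000"/>
        <param  name="max_account" value="200000"/>
    </params>
</add_order>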

 
