Scenario

  • Frequently, the execution of a job chain depends on an external event, e.g. the arrival of files in a monitored directory. Such events are non-deterministic, i.e. we cannot know whether a file will arrive. From a business perspective, however, we expect a file to arrive, e.g. Mon-Fri not later than 18:00. This scenario applies to any event created by a mechanism outside of JobScheduler's control, e.g. file orders, manual starts of job chains or external applications that start job chains by use of the API.
  • Typically users want to receive a notification if an expected event did not occur.
  • This article suggests a solution that introduces Assertions to manage expectations about what should happen and when it should happen.

Solution Outline

  • The solution works for all job chains, including job chains that are started from file orders and job chains that are started manually by ad hoc orders. It is not intended for standalone jobs.
  • For a job chain that is expected to be executed by a given date and time an Assertion Order is created that triggers the assertion.
    • The Assertion Order is assigned a start-time rule or calendar based rule to start e.g. Mon-Fri at 18:00. This is the point in time when the expectation should be met, i.e. at the given point in time the Assertion Order triggers a check whether the respective job chain has been executed.
    • Should this check be successful, i.e. the job chain has been executed, then the Assertion Order will complete its run successfully and will recalculate its next start-time.
    • Should this check fail, i.e. the job chain has not been executed, then the Assertion Order will fail and create a notification to signal a failed assertion.
  • In addition to expecting a single execution of a job chain the solution can check
    • whether more than one execution of a job chain occurred.
    • whether successful executions of a job chain occurred, ignoring failed executions.

Implementation

  • The sample implementation described below is available for download: assertions.zip
  • Unzip the sample to your JobScheduler Master's live folder. This will create an assertions sub-folder with the below job-related objects.
  • The implementation includes 
    • the Assertion Monitor script that is used by jobs that signal execution of a job chain and that will create a Shadow Order.
    • the Assertion Job Chain that handles pairs of Assertion Orders and Shadow Orders created by the Assertion Monitor.

Assertion Monitor

The assertion_shadow_order Assertion Monitor can be used by any job in a user's job chain to signal execution of that job chain to the Assertion Job Chain:

Implementation of the Assertion Monitor
<monitor  ordering="0">
    <script  language="java:javascript">
        <![CDATA[
function spooler_process_after( spooler_process_result )
{
	// modify the path to the assertion job chain if required
    var job_chain = spooler.job_chain( '/assertions/assertions' );

	// create a new order object
	var order = spooler.create_order();

    // derive the new order's id from the current order: !folder[!sub-folder]#job-chain-name#order-id
    // String() yields a JavaScript string so that the regular expression replaces all forward slashes, not only the first one
    var path = String( spooler_task.order.job_chain.path );
    var pos = path.lastIndexOf( '/' );
    order.id = path.substring( 0, pos ).replace( /\//g, '!' ) + '#' + spooler_task.order.job_chain.name + '#' + spooler_task.order.id;

    // signal success or failure
	order.params = spooler.create_variable_set();
	order.params.set_var( 'spooler_process_result', spooler_process_result );
		
	if ( spooler_process_result )
	{
		order.title = 'successful: ' + spooler_task.order.title;
	} else {
		order.title = 'failed: ' + spooler_task.order.title;
	}

   	// submit the newly created order
    spooler_log.info( '.. scheduling resolving order for assertion: ' + order.id );
   	job_chain.add_or_replace_order( order );

	return spooler_process_result;
}
        ]]>
    </script>
</monitor>

Explanations:

  • The Assertion Monitor creates a Shadow Order for the Assertion Job Chain, see below.
    • The Shadow Order id is made up from the following parts: 
      • the folder of the originating job chain (and optional sub-folders): all forward slashes are replaced by exclamation marks (!).
      • separated by a hash character follows the name of the originating job chain.
      • separated by a hash character follows the originating order id.
    • Example:  !my_folder!my_subfolder#my_job_chain#my_order_id
      • the job chain my_job_chain is located in the /my_folder/my_subfolder directory hierarchy.
      • the originating order id is my_order_id.
  • Line 7: if the name or location of the Assertion Job Chain is changed then the path passed to spooler.job_chain() in the Assertion Monitor has to be adjusted accordingly.
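
The id format explained above can be tried out with two small helper functions. This is a sketch only: buildShadowOrderId and parseOrderId are hypothetical names that are not part of the delivery, and both functions work on plain strings so that they can be run outside of JobScheduler. The sketch assumes that order ids do not contain a hash character themselves.

```javascript
// Hypothetical helpers (not part of assertions.zip) that mirror the id format
// !folder[!sub-folder]#job-chain-name[#order-id]

// build a Shadow Order id from the path of the originating job chain and the originating order id
function buildShadowOrderId( jobChainPath, orderId )
{
    var path = String( jobChainPath );
    var pos = path.lastIndexOf( '/' );
    // all forward slashes of the folder part are replaced by exclamation marks
    var folder = path.substring( 0, pos ).replace( /\//g, '!' );
    var name = path.substring( pos + 1 );
    return folder + '#' + name + '#' + orderId;
}

// split an id into its parts, returns null if the format is not met;
// orderId is null for an id without the third part
function parseOrderId( id )
{
    var parts = String( id ).split( '#' );
    if ( parts.length < 2 )
    {
        return null;
    }
    return {
        folder: parts[0],
        jobChainName: parts[1],
        orderId: parts.length > 2 ? parts[2] : null
    };
}
```

Calling buildShadowOrderId( '/my_folder/my_subfolder/my_job_chain', 'my_order_id' ) yields the id from the example above, and parseOrderId() splits it into the same three parts again.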

Assertion Job Chain

The assertions job chain implements the management of assertions and is provided in the assertions directory of the delivery.

Assertion Job Chain
<job_chain>
    <job_chain_node  state="100" job="assertions" next_state="success" error_state="error" on_error="suspend"/>
    <job_chain_node  state="success"/>
    <job_chain_node  state="error"/>
</job_chain>

Explanations:

  • The job chain includes a single job assertions, see below.
  • The job chain can easily be extended, e.g. by a successor job that is executed in case of failure of the assertions job. The successor job could e.g. implement an e-mail job to send e-mail notifications.
  • By default the assertions job relies on the fact that either JobScheduler is configured to send e-mail in case of any failed jobs or that the JobScheduler Monitoring Interface is used to forward notifications to a System Monitor such as Nagios.

Assertion Job

The assertions job effectively does the work of determining if an expectation has been met, i.e. if for a given Assertion Order one or more Shadow Orders have been generated.

Assertion Job
<job  stop_on_error="no" order="yes" title="Manage Assertions">
    <script  language="java:javascript">
        <![CDATA[
function spooler_process()
{
    var jobChainFolder, jobChainName, orderId = null;
    var candidateOrders = [];

    // optionally the required number of orders (successful or failed) is specified
    var numOfOrders = 0;
    if ( spooler_task.order.params.value( 'num_of_orders' ) )
    {
        numOfOrders = parseInt( spooler_task.order.params.value( 'num_of_orders' ), 10 );
    }

    // optionally the required number of successful orders is specified
    var numOfSuccessfulOrders = 0;
    if ( spooler_task.order.params.value( 'num_of_successful_orders' ) )
    {
        numOfSuccessfulOrders = parseInt( spooler_task.order.params.value( 'num_of_successful_orders' ), 10 );
    }

    // the number of successful orders takes precedence if specified
    if ( numOfSuccessfulOrders > 0 )
    {
        numOfOrders = numOfSuccessfulOrders;
    }

    var parts = spooler_task.order.id.split( '#' );
    if ( parts.length < 2 )
    {
        spooler_log.error( 'wrong format for order id, use: !folder[!sub-folder]#name[#order_id]: ' + spooler_task.order.id );
        return false;
    } else {
        jobChainFolder = parts[0];
        jobChainName = parts[1];

        if ( parts.length > 2 )
        {
            orderId = parts[2];
        }
    }

    // handle asserting order
    if ( !orderId )
    {
        // lookup resolving orders
        var command = "<show_state subsystems='folder order' what='folders job_chain_orders no_subfolders' path='" + spooler_task.job.folder_path + "'/>";
        var xmlResponse = executeXml( command );
        if ( !xmlResponse )
        {
            // executeXml() has already logged the error
            return false;
        }
        var xPath = "//folder[@path='" + spooler_task.job.folder_path + "']/job_chains/job_chain[@path = '" + spooler_task.order.job_chain.path + "']/job_chain_node/order_queue/order[@path = '/' and @suspended = 'yes' and starts-with(@id, '" + jobChainFolder + "#" + jobChainName + "#')]";
        spooler_log.debug( '.. select nodes by xPath: ' + xPath );
        var xmlNodes = xmlResponse.selectNodeList( xPath );
        // traverse node list
        for ( var xmlIndex=0; xmlIndex < xmlNodes.getLength(); xmlIndex++ )
        {
            var xmlNode = xmlNodes.item( xmlIndex );
            var xmlOrderId = xmlResponse.selectSingleNodeValue( xmlNode, "@id" );

            if ( xmlOrderId == null )
            {
                continue;
            }

            // skip resolving orders of failed executions if successful orders are required
            if ( numOfSuccessfulOrders > 0 )
            {
                var xmlOrderTitle = xmlResponse.selectSingleNodeValue( xmlNode, "@title" );
                if ( String( xmlOrderTitle ).indexOf( 'failed:' ) === 0 )
                {
                    spooler_log.info( '.. failed resolving order found: ' + xmlOrderId );
                    continue;
                }
            }

            candidateOrders[candidateOrders.length] = xmlOrderId;
            spooler_log.info( '.. matching resolving order found: ' + xmlOrderId );
        }

        // default: remove all resolving orders for an asserting order
        if ( numOfOrders == 0 )
        {
            spooler_log.info( '.. all resolving orders will be removed' );
            for( var i=0; i<candidateOrders.length; i++ )
            {
                spooler_log.info( '.. resolving suspended order is removed: ' + candidateOrders[i] );
                command = "<remove_order job_chain='" + spooler_task.order.job_chain.path + "' order='" + candidateOrders[i] + "'/>";
                executeXml( command );
            }

            if ( candidateOrders.length < 1 )
            {
                spooler_log.error( 'no resolving orders found for asserting order: ' + spooler_task.order.id );
                spooler_task.order.suspended = false;
                spooler_task.order.state = 'error';
                return false;
            }
        } else {
            // remove the specified number of resolving orders considering use of the num_of_orders parameter
            var k = numOfOrders < candidateOrders.length ? numOfOrders : candidateOrders.length;
            for( var i=0; i<k; i++ )
            {
                spooler_log.info( '.. resolving order is removed: ' + candidateOrders[i] );
                command = "<remove_order job_chain='" + spooler_task.order.job_chain.path + "' order='" + candidateOrders[i] + "'/>";
                executeXml( command );
            }

            if ( numOfOrders > candidateOrders.length )
            {
                spooler_log.error( 'number of required orders [' + numOfOrders + '] exceeds number of resolving orders [' + candidateOrders.length + ']' );
                spooler_task.order.suspended = false;
                spooler_task.order.state = 'error';
                return false;
            } else if ( numOfOrders == candidateOrders.length ) {
                spooler_log.info( 'number of required orders [' + numOfOrders + '] matches number of resolving orders [' + candidateOrders.length + ']' );
            } else {
                spooler_log.info( 'number of required orders [' + numOfOrders + '] is smaller than number of resolving orders [' + candidateOrders.length + ']' );
            }
        }
    } else {
        // handle resolving order
        spooler_log.info( '.. resolving order is suspended: ' + spooler_task.order.id );
        spooler_task.order.suspended = true;
        return false;
    }

    return true;
}

function executeXml( command ) {
    var rc = false;

    spooler_log.debug( '.... executing xml command: ' + command );
    var response = spooler.execute_xml( command );

    spooler_log.debug( '.... receiving xml response: ' + response );
    var xmlDOM = new Packages.sos.xml.SOSXMLXPath( new java.lang.StringBuffer( response ) );

    var errorCode = xmlDOM.selectSingleNodeValue( "//ERROR/@code" );
    var errorText = xmlDOM.selectSingleNodeValue( "//ERROR/@text" );
    if ( errorCode || errorText ) 
    {
      spooler_log.error( 'xml response: errorCode=' + errorCode + ', errorText=' + errorText );
    } else {
      rc = xmlDOM;
    }

    return rc;
}
        ]]>
    </script>
    <run_time />
</job>

Explanations:

  • The job checks if for a given Assertion Order a matching Shadow Order has been created by the Assertion Monitor.
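
The matching step can be illustrated without a running JobScheduler. The following sketch is a simplified model and not the job's code: the function name selectShadowOrders is hypothetical, and the plain array of objects with id, title and suspended properties is an assumed stand-in for the order elements returned by the show_state command. It applies the main criteria used by the job: the order is suspended, its id starts with the id of the Assertion Order followed by a hash character, and optionally its title does not start with failed:.

```javascript
// Hypothetical model of the selection performed by the assertions job.
// 'orders' is a plain array standing in for the <order> elements of the XML response.
function selectShadowOrders( orders, assertionOrderId, successfulOnly )
{
    var prefix = assertionOrderId + '#';
    var candidates = [];

    for ( var i = 0; i < orders.length; i++ )
    {
        var order = orders[i];

        // only suspended orders whose id extends the Assertion Order id are Shadow Orders
        if ( !order.suspended || String( order.id ).indexOf( prefix ) !== 0 )
        {
            continue;
        }

        // the Assertion Monitor prefixes the title with 'failed: ' for failed executions
        if ( successfulOnly && String( order.title ).indexOf( 'failed:' ) === 0 )
        {
            continue;
        }

        candidates.push( order.id );
    }

    return candidates;
}
```

The trailing hash character in the prefix keeps an Assertion Order for job_chain1 from matching Shadow Orders of a job chain named e.g. job_chain10.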

Usage

Use of the Assertion Monitor by Jobs

The Assertion Monitor is used by any jobs that should signal execution of a job chain to the Assertion Job Chain.

Use of the Assertion Monitor with job1
<job  stop_on_error="no" order="yes">
    <script  language="shell">
        <![CDATA[
echo "some job"
exit 0
        ]]>
    </script>
    <monitor.use  monitor="assertion_shadow_order"/>
    <run_time />
</job>

Explanations:

  • The <monitor.use> element references the Assertion Monitor. This monitor ships with the assertion_shadow_order.monitor.xml file in the assertions directory of the delivery.
  • Should the Assertion Monitor be used from jobs outside of the assertions directory then an absolute path to its location can be used like this:
    • <monitor.use monitor="/assertions/assertion_shadow_order"/>
  • The Assertion Monitor is executed when an order completes a job node. It will create a Shadow Order for the Assertions Job Chain that signals that an expectation has been met, see below.
  • Therefore only one job in a job chain should use the Assertion Monitor, e.g. the first job node of the job chain.

Use of an Assertion Order for the assertions Job Chain

For a given job chain job_chain1 users should create an Assertion Order with an order id like this: !<folder>#job_chain1.

Sample Assertion Order for Assertion Job Chain
<order  job_chain="/assertions/assertions">
    <params >
        <param  name="num_of_orders" value="2"/>
    </params>
    <run_time >
        <weekdays >
            <day  day="1 2 3 4 5">
                <period  single_start="18:00"/>
            </day>
        </weekdays>
    </run_time>
</order>

Explanations:

  • The Assertion Order does not have to use parameters at all: if no parameters are specified then by default all matching Shadow Orders for a given Assertion Order will be removed. 
  • If the Assertion Order makes use of the optional parameter num_of_orders then this parameter specifies the number of Shadow Orders that are expected and that will be removed when the Assertion Order starts. Should fewer Shadow Orders be found than specified by this parameter then the assertion is considered failed. This parameter is useful if e.g. more than one incoming file is expected from a directory monitored by file watching.
  • If instead the parameter num_of_successful_orders is used then its value specifies the expected number of Shadow Orders in the same way, but only Shadow Orders for successfully executed job chains are considered. Otherwise successful and failed executions of the originating job chain are both counted as matching events.
  • If the num_of_orders or num_of_successful_orders parameters are used with a value 0 then this causes the default behavior to be applied, i.e. all matching Shadow Orders will be removed.
  • The run-time of the Assertion Order specifies the point in time when the order starts and triggers the check whether expectations have been met.
  • Any number of Assertion Orders can be created for the Assertion Job Chain.
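
The parameter rules can be summarized as a small decision function. This is a sketch under stated assumptions and not the job's code: evaluateAssertion is a hypothetical name, params is a plain object standing in for the order parameters, and shadowCount is assumed to be the number of matching Shadow Orders after failed executions have been filtered out where num_of_successful_orders is used.

```javascript
// Hypothetical model of the decision rules for the Assertion Order parameters.
// shadowCount: number of matching Shadow Orders that have been found
// params: plain object with the optional properties num_of_orders and num_of_successful_orders
function evaluateAssertion( shadowCount, params )
{
    var required = parseInt( params.num_of_orders, 10 ) || 0;
    var requiredSuccessful = parseInt( params.num_of_successful_orders, 10 ) || 0;

    // the number of successful orders takes precedence if specified
    if ( requiredSuccessful > 0 )
    {
        required = requiredSuccessful;
    }

    if ( required === 0 )
    {
        // default: at least one Shadow Order is expected and all of them are removed
        return { passed: shadowCount >= 1, removed: shadowCount };
    }

    // otherwise not more than the required number of Shadow Orders is removed
    return { passed: shadowCount >= required, removed: Math.min( required, shadowCount ) };
}
```

With the sample order above, evaluateAssertion( 1, { num_of_orders: '2' } ) reports a failed assertion with one Shadow Order removed, whereas three matching Shadow Orders would pass and leave one Shadow Order behind for the next check.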