Introduction

  • Users might be interested in specifying jobs and dependencies from Microsoft Excel® spreadsheets.
    • In a situation when business staff is in charge of specifying jobs then Excel® is a frequently used tool.
    • As a no-code approach this allows to specify jobs and dependencies by users without detailed know-how of JS7 - Workflows.
  • The solution explained from the article creates jobs and workflows from Excel® spreadsheets ready for import into JS7.
    • The solution makes use of a PowerShell cmdlet for Linux and Windows and can be operated for PowerShell 5.1, 6, 7.
    • The solution works for .xlsx files. Files from formats such as .xls and .csv have to be imported to Excel®.
    • The solution is both a tool for simple use cases and a programming example how to create workflows from Excel®. More elaborate approaches for processing of customized Excel® input are subject to SOS professional services.

Prerequisites

The following PowerShell modules are used for this purpose:

Download

The following files are used for the solution:

Getting Started

Specifying Jobs and Dependencies with Excel®

Download the jobs.xlsx Excel® spreadsheet file.

  • The worksheets included with the spreadsheet file offer a number of jobs and dependencies from the following columns:
    • NodeID: A unique identifier for the occurrence of a job. If the same job occurs a number of times then the Name column holds the same job name, however, a unique value has to be specified for the NodeID column. NodeIDs have to start with a character and must not include spaces.
    • SuccessorID: Specifies the NodeIDs of jobs that are direct successors of the current job. If more than one job is specified then they will will be executed in parallel branches. A number of NodeIDs in this column are separated by a space.
    • Instruction:  Makes use of the fixed value Job. This column is reserved for later support of further JS7 - Workflow Instructions. This column can be missing or hold empty values.
    • Name: Specifies the job name. This column is required and cannot hold empty values. Job names have to start with a character and must not include spaces. The same job name can be used in a number of rows indicating that the job is re-used for each occurrence with a unique NodeID.
    • Description: Optionally specifies a descriptive title for the job. This column can be missing or hold empty values.
    • Argument: Optionally specifies arguments that are passed to jobs.
    • Agent, Subagent Cluster: Specifies the name of a Standalone Agent that is configured in your JS7 scheduling environment. If an Agent Cluster is used then its name is specified as the Agent and the respective Subagent Cluster is specified, see JS7 - Agent Cluster. If the columns are missing or hold empty values then a default Agent name can be specified from the PowerShell cmdlet's -AgentName parameter, see next chapters.
    • Script: Specifies the job script that will be executed. Multiline input is allowed. This column is required and cannot hold empty values.
    • Job Resource: Specifies one or more JS7 - Job Resources separated by a space that will be assigned the job. This column is not required.
    • Job Class: Specifies one of two values for Shell jobs or JITL jobs from JS7 - JITL Job Templates. If this column is missing, then then job class Shell is assumed.
    • Job Template: If a Job Class with the value JITL is specified then this column holds the Java class name of the job template. This column is required for use with JITL jobs.
    • Error Handling: Optionally specifies one of the following options for JS7 - How to apply error handling. If the column is missing or holds empty values then then STOP error handling will be applied.
      • STOP: If the job fails then the order will be put to the FAILED state and will remain with the job. Failed orders require user intervention to resume, to suspend or to cancel the respective order.
      • IGNORE: Any job error is ignored and order processing continues, see JS7 - Try-Catch Instruction.
      • RETRY: The job will be retried in case of error for 3 times with a delay of 60s, see JS7 - Retry Instruction.
      • LEAVE: The order will leave the workflow with an unsuccessful History outcome, see JS7 - Finish Instruction.
    • Fail on stderr: Optionally specifies that in case that the job will write output to the stderr channel the job will be considered to be failed, see JS7 - Job Instruction. The column holds Boolean values. If the column is missing or holds empty values then the value False will be applied.
  • Users can adjust column names, see JS7 - PowerShell Import from Excel Script - New-JS7WorkflowFromExcel.ps1.



Explanation:

  • Job dependencies are designed along a Directed acyclic graph (DAG). This includes a few rules for input to the Excel® spreadsheet:
    • One or more initial jobs are required as the graph's root node(s). Initial jobs must not be specified as successors in any of the SuccessorID column values.
    • A final job can be specified that does not indicate a successor job in its SuccessorID column.
    • The order of entries to the Excel® spreadsheet is not relevant.
    • Not too much error checking of the DAG is performed, however, missing job nodes will be detected.
  • In a simplified way a directed graph represents job dependencies from two columns:
    • The NodeID column specifies the unique identifier of the given occurrence of a job.
    • The SuccessorID column specifies one or more NodeIDs of successor jobs that are separated by a space.
      • If more than one NodeID is specified then this will fork processing of the indicated successor jobs to be performed in parallel branches.
      • To join parallel branches of jobs the NodeIDs specified from the SuccessorID column of the forking job have to hold empty values in their respective SuccessorID column or hold the same NodeID of a later job.
      • Forking can include any number of parallel branches specified by NodeIDs in a SuccessorID column.
      • Forking can be nested preferably at not more than 15 levels.

Running the PowerShell Cmdlet

Download the PowerShell cmdlet: New-JS7WorkflowFromExcel.ps1

  • Basically two arguments have to be provided:
    • -ExcelPath: Specifies the path to the Excel® spreadsheet file that holds the specification of jobs and dependencies.
    • -OutputDirectory: Specifies the directory to which workflow *.json files will be stored.
  • JS7 workflow files will be added to the js7-import.zip archive file in the current working directory.
    • A different location and file name can be specified using the -OutputArchive parameter.
    • Users can import the archive file into JS7 using the JOC Cockpit's Configuration view that offers the Import button. The archive file does not include a folder hierarchy but holds workflows from its root. When importing archive files with JOC Cockpit users can specify the target folder to which workflows will be added.
    • For details see JS7 - Inventory Export and Import.


Example for use of PowerShell cmdlet for Linux
./New-JS7WorkflowFromExcel.ps1 -ExcelPath /home/sos/excel/input/jobs.xlsx -OutputDirectory /home/sos/excel/output/jobs
Example for use of PowerShell cmdlet for Windows
./New-JS7WorkflowFromExcel.ps1 -ExcelPath "C:\js7\excel\input\jobs.xlsx" -OutputDirectory "C:\js7\excel\output\jobs"


The above examples will create workflows per each worksheet included with the Excel® spreadsheet file. The name of the workflow corresponds to the worksheet. The resulting js7-import.zip file will be created in the current working directory.

Users who prefer to run the PowerShell cmdlet directly from the Unix Shell can add the call to the PowerShell cmdlet and its parameters to an executable shell script like this:

Example for use of PowerShell cmdlet for Linux
#!/usr/bin/env pwsh

$env:SCRIPT_HOME=$(cd "$(dirname "`$0")" >/dev/null && pwd)
Set-Location -Path $env:SCRIPT_HOME

./New-JS7WorkflowFromExcel.ps1 `
    -ExcelPath "$($env:SCRIPT_HOME)/input/jobs.xlsx" `
    -OutputDirectory "$($env:SCRIPT_HOME)/output/jobs"

Managing Job Arguments

Job arguments optionally are specified from the following formats:


SeparatorQuotingData TypeExampleArgument Usage
JSON Stylecommayesnumeric{ "p_calc_id": 365985, "p_calc_ord": 1 }Default: numeric data type

commanonumeric{  p_calc_id : 365985,  p_calc_ord : 1 }Default: numeric data type

commayesstring{ "p_calc_id": "AB321", "p_calc_ord": "RollingOrder" }Default: string data type

commayesBoolean{  is_valid: true,  is_invalid: false }Default: Boolean data type
HashTable Stylecommanostringp_calc_id = AB321, p_calc_ord=RollingOrderDefault separator is the comma

semicolonnonumericp_calc_id = 365985 ; p_calc_ord=RollingOrdersRequires -ArgumentSeparator ';'

newlinenostringp_calc_id = AB321
p_calc_ord=RollingOrders
Default: arguments separated by newline

Explanation:

  • JSON style arguments are specified using { ... }. Otherwise arguments are considered being specified in HashTable style.
  • JSON style arguments require the comma to be used as a separator. Quoting of argument names is optional.
  • HashTable style arguments by default consider the comma  or newline used as a separator.
    • The argument separator can be specified using -ArgumentSeparator '<separator-character>'.
    • Data types for string input and numeric input are dynamically assigned.
  • HashTable style arguments do not suggest quoting.
  • For both JSON style arguments and HashTable style arguments spaces between argument names, values and separators are not relevant.

Automating Import

Users who wish to automate the process of importing workflows to JOC Cockpit can use the JS7 - PowerShell Module that performs operations based on the JS7 - REST Web Service API.


Example for use of PowerShell module for Linux/Windows
Import-Module JS7

Connect-JS7 -Url http://root:root@localhost:4446 -Id controller
# connect using the default root account and root password, check to apply more secure connection methods

Import-JS7InventoryItem -TargetFolder /imported -FilePath /js7/excel/js7-import.zip -Overwrite
# imports the .zip archive to the JOC Cockpit's /imported folder and overwrites existing workflows with the same name

Disconnect-JS7

Examples

Examples for advanced parameterization are provided as follows.

Create Workflows for specific Worksheets and Column Names

Example for use of PowerShell cmdlet for Linux
./New-JS7WorkflowFromExcel.ps1 `
    -ExcelPath /home/sos/excel/input/jobs.xlsx `
    -WorksheetName Workflow01 `
    -WorksheetColumns @{ 'nodeid' = 'NodeID'; 'successorid' = 'SuccessorID'; 'instruction' = 'Instruction'; 'name' = 'Name'; 'description' = 'Description'; 'argument' = 'Argument'; 'agent' = 'Agent'; 'subagentCluster' = 'Subagent Cluster'; 'script' = 'Script'; 'errorHandling' = 'Error Handling'; 'failOnStderr' = 'Fail on stderr' } `
    -OutputDirectory /home/sos/excel/output/jobs `
    -WorkflowName EOD-Workflow `
    -Title "EOD Processing" `
    -AgentName primaryAgent


Explanation:

  • The -WorksheetName parameter is used to limit creation of workflows to the given worksheet within the Excel® spreadsheet file.
  • The call to the PowerShell cmdlet specifies the -WorksheetColumns parameter that holds a hashtable for the mapping of internal names and column names in the given spreadsheet file. This allows to use different column names in a spreadsheet file that will be mapped to internal names. Any additional columns in a spreadsheet are ignored.
  • The -WorkflowName parameter specifies the name of the workflow. If this parameter is omitted then the workflow name will be used from the name of the worksheet.
  • The -Title parameter specifies the title that will be assigned the workflow.
  • The -AgentName parameter specifies the name of the Agent that will be assigned to jobs for which the Excel® spreadsheet file does not specify a value in the related column.

Create Workflows from a number of Excel® Spreadsheet Files

Example for use of PowerShell cmdlet for Linux
Get-ChildItem *.xlsx | Foreach-Object { ./New-JS7WorkflowFromExcel.ps1 `
    -ExcelPath $_.FullName `
    -Title $_.BaseName `
    -OutputDirectory /home/sos/excel/output/jobs
}


Explanation:

  • The Get-ChildItem Cmdlet will select any Excel® spreadsheet files in the current directory. Each Excel® file is pipelined to the PowerShell cmdlet.
  • This allows to create workflows from a larger number of Excel® spreadsheet files in a bulk operation.

Further Resources




  • No labels